In my previous article on HTML5 semantics, I discussed some of the new elements added to the HTML5 specification along with their semantic meaning. In this article I'll look at the differences between HTML4 (or XHTML—the terms are used interchangeably in this article) and the HTML5 document structure, including the addition of new global attributes.
 

 
Changes in the document structure

HTML5 has introduced several changes to the document itself. To my chagrin, HTML5 allows authors to create documents that are not well-formed. In other words, it allows a looser structure where <p> and <li> elements do not need to be closed; the browser still knows how to deal with them. Tag and attribute names are case insensitive, so you can capitalize or lowercase at will. If you're used to writing HTML4, you can continue in that style. If XHTML is your preference, carry on; it's perfectly acceptable. However, even though a loosely formatted document is acceptable, it may not be advisable. Troubleshooting messy code can be problematic, so I recommend continuing to use clean markup.
 
 
The doctype
The most obvious difference between HTML4 and HTML5 is the new, shortened doctype. I don't know about you, but I didn't memorize the HTML4 or XHTML1 doctypes. They were long and clunky. But we've now gone from this long form doctype:
 
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
to a very short, unversioned one:
 
<!DOCTYPE HTML>
Leaving a version number off doesn't mean there will never be advances and further evolution of HTML. Since HTML5 is meant to be backward compatible, the W3C didn't feel it necessary to continue using a numbering system when extending it. A modern browser will render what it's able to render, regardless. So why keep a doctype at all? Internet Explorer (version 5.5 and earlier) used a noncompliant, broken box model. When Microsoft changed to the standard W3C box model rendering, a way was needed to indicate which rendering mode to use for a web page. Browsers began using the doctype as a switch between Standards mode (the W3C version) and Quirks mode (the broken version that many older documents on the web relied on). The new, simplified doctype is the fewest characters needed to let browsers know to render the document in standards mode.
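If you're curious which mode a page ended up in, browsers expose it through document.compatMode. Here's a minimal sketch; the function name renderingMode is my own invention for illustration, and it takes the document as a parameter so it can be tried outside a browser too.

```javascript
// Translate document.compatMode into a readable label.
// "CSS1Compat" is reported in standards (and almost-standards) mode,
// "BackCompat" is reported in quirks mode.
function renderingMode(doc) {
  return doc.compatMode === "CSS1Compat" ? "standards" : "quirks";
}

// In a browser console you would pass the real document:
//   renderingMode(document);
```

A page that starts with the short doctype should report standards; remove the doctype and reload to see the difference.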
 
 
The charset
Another structural change to the document is in the charset or character encoding. Previously you used:
 
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
As with the doctype, you can now use a simplified version:
 
<meta charset="utf-8">
 
Links to style sheets and scripts
In keeping with simplification, the type attribute is no longer required for <link> and <script> elements. Where you previously used this:
 
<link href="assets/css/main.css" rel="stylesheet" type="text/css" />
<script src="assets/js/modernizr.custom.js" type="text/javascript"></script>
You can now optionally use this shortened version:
 
<link href="assets/css/main.css" rel="stylesheet" />
<script src="assets/js/modernizr.custom.js"></script>
 
The full document
Pulling the above information into a single document, your HTML5 page would look like this:
 
<!DOCTYPE HTML>
<html>
<head>
  <meta charset="UTF-8">
  <title>Document Name</title>
  <link href="assets/css/main.css" rel="stylesheet" />
  <script src="assets/js/modernizr.custom.js"></script>
</head>
<body>
  <p>Your content</p>
</body>
</html>

 
Global attributes

Note that there are changes to HTML4 attributes as well. Several existing attributes that you're likely already familiar with have been made global, so they may be applied to any element as needed. They include:
 
  • accesskey
  • class
  • dir
  • id
  • lang
  • style
  • tabindex
  • title
In addition, a set of new global attributes has been added. Let's have a quick look at each of them.
 
 
The contenteditable attribute
The contenteditable attribute allows any HTML element to be made editable. It has three states: true, false, and inherit (the default when the attribute is not specified).
 
You may have seen contenteditable in action already. It's very useful when writing a tutorial where you'd like the user to be able to interact with a demo and change values (see the CSS Tricks demo). Maybe you've seen a presentation at a conference where the presenter has created her slides using HTML5—and then edited them live in the browser during the presentation.
 
You can make anything editable. Think about the possibility for creating an in-page text editor. If you use local storage, your user can come back to the page later and find the changes preserved. Because contenteditable has been supported in IE since version 5.5, the support for it is good (though not yet in the mobile arena). If you're making part of your page editable, you can give the user a visual indication using outline and an attribute selector:
 
[contenteditable]:hover,
[contenteditable]:focus {
  outline: 2px dotted red;
}

<p contenteditable="true">Your content</p>
Attribute selectors have been supported since IE7 and allow you to target an element that has a specific attribute. Notice that I used both the :hover and :focus pseudo-classes? That's so that users navigating with either a mouse or a keyboard can see the visual indication. I chose outline over border because it doesn't add to the element's box dimensions, so the page doesn't appear to jump when it is triggered. Be aware that if it's important to have IE6/7 support, you should use border instead.
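To make the local storage idea a little more concrete, here's a rough sketch of saving and restoring an editable element's content. The function names, the "notes" id, and the "notes-draft" key are all hypothetical; the storage object is passed in as a parameter, and in a browser that would be window.localStorage.

```javascript
// Save the current markup of an editable element under a storage key.
function saveEdits(element, storage, key) {
  storage.setItem(key, element.innerHTML);
}

// Restore previously saved markup, if there is any.
// Returns true when something was restored, false otherwise.
function restoreEdits(element, storage, key) {
  const saved = storage.getItem(key); // getItem returns null for a missing key
  if (saved === null) {
    return false;
  }
  element.innerHTML = saved;
  return true;
}

// Possible browser wiring (ids and keys are assumptions):
//   const notes = document.getElementById("notes");
//   restoreEdits(notes, window.localStorage, "notes-draft");
//   notes.addEventListener("blur", function () {
//     saveEdits(notes, window.localStorage, "notes-draft");
//   });
```

Saving on blur is just one choice; you could also save on a timer or from a Save button.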
 
 
The contextmenu attribute
According to the W3C HTML5 Working Draft:
 
"The contextmenu attribute gives the element's context menu. The value must be the ID of a menu element in the DOM."
 
The menu element itself is simply a list of commands. They could be form elements, list items, or other elements. The menu is hidden until an event like keyup or mouseup is fired, causing it to display a bubble menu of options and actions.
 
This allows you to save UI space in the same way a drop-down menu does, since it is only shown when requested. At this time no modern browser supports it, though the specification is ready for first implementations. The code may look like this:
 
<label for="char">Charter name: </label>
<input id="char" name="char" type="text" contextmenu="boatmenu" required>
<menu type="context" id="boatmenu">
  <!-- menu content elements here -->
</menu>
 
The data-* attribute
"A custom data attribute is an attribute in no namespace whose name starts with the string "data-", and has at least one character after the hyphen..."
 
These custom data attributes allow you to create attributes to share data with scripts run on your own site. They are not to be used, or harvested, by generic software. You are not limited in how many custom data attributes you can specify. According to caniuse.com, "all browsers can already use data-* attributes and access them using getAttribute."
 
Due to good support, there are many examples of custom data attributes that already exist in the wild. If you have Dreamweaver CS5.5, you can create a jQuery Mobile (JQM) application. jQuery Mobile makes extensive use of custom data attributes for identifying roles of elements, themes, and many other things. Here's an example of a JQM page:
 
<div data-role="page" id="page" data-theme="b">
  <div data-role="header">
    <h1>Header</h1>
  </div>
  <div data-role="content">Content</div>
  <div data-role="footer">
    <h4>Footer</h4>
  </div>
</div>
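Reading those values back from a script can be as simple as a getAttribute call. The sketch below is not how jQuery Mobile does it internally; getData and readDataAttributes are hypothetical helper names used only to illustrate the idea.

```javascript
// Read a single custom data attribute, e.g. getData(el, "role").
// Returns null when the attribute is not present.
function getData(element, name) {
  return element.getAttribute("data-" + name);
}

// Collect every data-* attribute of an element into a plain object,
// keyed by the part of the name after "data-".
function readDataAttributes(element) {
  const result = {};
  for (let i = 0; i < element.attributes.length; i++) {
    const attr = element.attributes[i];
    if (attr.name.indexOf("data-") === 0 && attr.name.length > 5) {
      result[attr.name.slice(5)] = attr.value;
    }
  }
  return result;
}

// In a browser, with the jQuery Mobile page markup shown earlier:
//   const page = document.getElementById("page");
//   getData(page, "theme");
//   readDataAttributes(page);
```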
 
The role and aria-* attributes
If you put effort into making your website accessible to users with a variety of browsing habits and physical disabilities, you'll likely recognize the role and aria-* attributes. WAI-ARIA (Accessible Rich Internet Applications) provides a way to describe dynamic web content and applications so that people with disabilities can identify and successfully interact with them. This is done through roles that define the structure of the document or application, or through aria-* attributes that define a widget role, relationship, state, or property.
 
ARIA use is recommended in the specifications to make HTML5 applications more accessible. When using semantic HTML5 elements, you should set their corresponding role. The basic structure may look something like this:
 
<header id="banner" role="banner"> ... </header>
<nav role="navigation"> ... </nav>
<article id="post" role="main"> ... </article>
<footer role="contentinfo"> ... </footer>
There is also a host of aria-* attributes to make your content more navigable and understandable. Things like aria-labelledby, aria-level, aria-describedby, and aria-orientation all make your content more recognizable. You can read more about it on the ARIA states and properties page.
 
In my earlier article on HTML5 semantics, I looked at the new figure and figcaption elements. The code I used looked like this:
 
<figure>
  <img src="virgin-gorda.jpg" alt="The boat as seen through the rocks at the Baths on Virgin Gorda.">
  <figcaption>The Baths at Virgin Gorda</figcaption>
</figure>
If you add the aria-describedby attribute, you can create a relationship between the <figure> and <figcaption> elements that doesn't yet exist semantically for assistive technology. It might look like this:
 
<figure>
  <img src="virgin-gorda.jpg" alt="The boat as seen through the rocks at the Baths on Virgin Gorda." aria-describedby="capt1">
  <figcaption id="capt1">The Baths at Virgin Gorda</figcaption>
</figure>
To learn more, check out Derek Featherstone's tutorial on ARIA and accessibility in the wild at SitePoint.
 
 
The draggable and dropzone attributes
These two attributes are covered together since they're part of the new drag and drop API (DnD API). The draggable attribute has three states: true, false, and auto (auto is not a keyword; it's simply the missing value default). According to the W3C HTML5 Working Draft:
 
"The true state means the element is draggable; the false state means that it is not. The auto state uses the default behavior of the user agent."
 
If you're going to drag something, you need to be able to drop it too. That's what the dropzone attribute does. Three values can currently be specified: copy, move, and link. copy creates a copy of the dragged element; move actually moves the element to the new location; link makes a link to the dragged data. The DnD API is starting to gain traction, with Gmail using it as the basis for a file upload feature that lets you drag files directly onto the browser. Ryan Seddon has created a way to test custom fonts without uploading them to the server (called Font Dragr). It uses the DnD API and allows you to drag the font file right onto the browser to preview.
 
Support for these attributes is good (in all browsers except Opera—including Android), though with dropzone, you will need to get into a bit of JavaScript. For an excellent tutorial to get you started on the ins and outs, read Remy Sharp's HTML5 Doctor article, Native Drag and Drop. Be aware there are aria-dropeffect and aria-grabbed (state) you should use to make your content more accessible as well.
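To give you a taste of the JavaScript involved, here's a bare-bones sketch of the three event handlers a simple drag and drop usually needs. The handler names and the element ids in the comments are hypothetical, and a real page would do something with the dropped id (moving or copying the element, for example) rather than just returning it.

```javascript
// dragstart: remember which element is being dragged by storing its id.
function handleDragStart(event) {
  event.dataTransfer.setData("text/plain", event.target.id);
}

// dragover: the default action must be cancelled, or the drop is not allowed.
function handleDragOver(event) {
  event.preventDefault();
}

// drop: read the id back out of the drag data and return it.
function handleDrop(event) {
  event.preventDefault();
  return event.dataTransfer.getData("text/plain");
}

// Possible browser wiring (ids are assumptions):
//   const boat = document.getElementById("boat");   // has draggable="true"
//   const dock = document.getElementById("dock");
//   boat.addEventListener("dragstart", handleDragStart);
//   dock.addEventListener("dragover", handleDragOver);
//   dock.addEventListener("drop", handleDrop);
```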
 
 
The hidden attribute
Here's the W3C HTML5 Working Draft on the hidden attribute:
 
"The hidden attribute is a boolean attribute. When specified on an element, it indicates that the element is not yet, or is no longer, relevant. User agents should not render elements that have the hidden attribute specified."
 
Of course, you're going to have to manipulate this attribute with JavaScript. An example might be using the hidden attribute on the login screen of a web game. Initially the user would see the login screen with the game hidden. On verifying credentials, the user would see the game with the login screen hidden.
 
When an element has the hidden attribute applied, it is hidden from all user agents, including screen readers; however, scripts and form controls can still execute. It is merely a change in presentation, the same as display:none. (HTML5 Accessibility states all supporting browsers—which basically excludes IE only—use display:none.) You'll recall that display:none causes an element not to display a box at all—so everything around it collapses into its place. The hidden attribute does the same. You may need to carefully consider whether it's better to use the hidden attribute, display:none, or the aria-hidden attribute.
 
<fieldset id="login" hidden>
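Here's a small sketch of the login-then-game swap described above. It assumes a browser that reflects the attribute through the element's hidden property; the function name and the "game" id are hypothetical, and the credential check itself is left out.

```javascript
// Swap which of two elements is hidden, depending on whether
// the user's credentials were accepted.
function showGameIfVerified(credentialsAreValid, loginScreen, game) {
  loginScreen.hidden = credentialsAreValid; // hide the login screen on success
  game.hidden = !credentialsAreValid;       // reveal the game on success
}

// Possible browser wiring (the "game" id is an assumption):
//   const login = document.getElementById("login");
//   const game = document.getElementById("game");
//   showGameIfVerified(true, login, game);
```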
 
The spellcheck attribute
According to the W3C HTML5 Working Draft:
 
"User agents can support the checking of spelling and grammar of editable text, either in form controls (such as the value of textarea elements), or in elements in an editing host (using contenteditable)."
 
Like contenteditable, the spellcheck attribute has three states: true, false, and default. true means the text will be checked, false means it will not, and default (the state used when the attribute is not specified) follows the browser's default behavior, which may be based on the parent element's spellcheck state.
 
You can try the live demo on Wufoo to see the spellcheck in action. Browser support is good in modern browsers (not available in IE or Safari Mobile). Figure 1 shows a screenshot of spellcheck in action in the Chrome browser. Note the red underline that looks much like it would in any text editor or mail program.
 
<input type="text" spellcheck="true">
Figure 1. Spellcheck in action in the Chrome browser.
Spellcheck doesn't really require a polyfill for fallback if it isn't available. It's an attribute that is nice to have but not required in most cases and it fails silently, so there's no reason not to add it where appropriate.
 

 
The super duper link element

And in case I've made your head spin with all these new global attributes, I'll leave you with the most exciting thing about HTML5 semantics. It's the biggest reason you'll want to start using the doctype now, even if you don't use the new elements yet.
 
 
You can put a hyperlink around anything. Seriously.
Gone are the days of wrapping a link around an image and the same link around its caption. Or placing the same but separate links around a heading and the paragraph of content below it. As stated in the W3C HTML5 Working Draft:
 
"The a element may be wrapped around entire paragraphs, lists, tables, and so forth, even entire sections, so long as there is no interactive content within (e.g. buttons or other links)."
 
I mean, how freakin' cool is that? The first thing I did when I learned this was run to my own site where the homepage had the heading/paragraphs grouped, but separately wrapped. I changed the code to this:
 
<ul>
  <li>...</li>
  <li id="training">
    <a href="services.html#train">
      <h1>Corporate Training</h1>
      <p>Bring your team up to speed on web standards, HTML5, CSS3, accessibility or Dreamweaver with a top expert in the industry.</p>
    </a>
  </li>
  <li>...</li>
</ul>
This has good browser support. In my own work, I have occasionally run into an odd little rendering glitch, which has always been fixed by setting the hyperlink to display: block. Derek Featherstone has done some fairly extensive testing for accessibility with this technique. He found it to be acceptable for assistive technology as well.
 
 
An a element can be a placeholder
When using consistent navigation throughout a site, you typically don't want the page your user is currently on to link to itself. But many times, the styling of the navigation is as integrated with the a element as it is with the li or p that contain it. Removing the entire a element so the link isn't active can make the design fall apart. All that's fixed now too—you can simply remove the href altogether. Here's the W3C on the topic:
 
"If the a element has no href attribute, then the element represents a placeholder for where a link might otherwise have been placed, if it had been relevant."
 
So for list-based navigation, if someone is on the services.html page, the code would look like this:
 
<ul>
  <li>...</li>
  <li id="services"><a>Services</a></li>
  <li id="resources"><a href="resources.html">Resources</a></li>
  <li id="clients"><a href="clients.html">Clients</a></li>
</ul>
Using an a element as a placeholder has great support as well. If you have underlines on your navigation (fairly unlikely in a menu), the underline will not be present since it is not a hyperlink. The rest of your styling will remain. Win!
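You'd normally leave the href out when the page is generated, but if your navigation is a shared include, a script can do the same thing. This is only a sketch under that assumption; markCurrentPage is a hypothetical helper, and it compares the href attribute exactly as written in the markup.

```javascript
// Turn the link that points at the current page into a placeholder
// by removing its href. Returns how many links were changed.
function markCurrentPage(links, currentPage) {
  let changed = 0;
  for (let i = 0; i < links.length; i++) {
    if (links[i].getAttribute("href") === currentPage) {
      links[i].removeAttribute("href");
      changed++;
    }
  }
  return changed;
}

// Possible browser wiring on the services.html page:
//   markCurrentPage(document.querySelectorAll("ul a"), "services.html");
```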
 
In Understanding HTML5 semantics – Part 3, I discuss some of the changes to older elements: some are obsolete, some have changed in semantic meaning, and a few have been reintroduced.
 
Also, don't be afraid to venture over to the spec to check on all the shiny new HTML5 things happening. If the overly technical version is just too much, Ben Schwarz has cleaned out the portions for browser vendors and left just the stuff you and I need to know: an extra readable version of the HTML5 spec!
 
Happy coding...