HTML5 Legalizes Tag Soup


HTML5 is a lot more forgiving in its syntax than XHTML: you can write tags
in uppercase, lowercase or a mixture of the two. You don’t need to self-
close tags such as img, so the following are both legal:


<img src="nice.jpg" />
<img src="nice.jpg">

You don’t need to wrap attributes in quotation marks, so the following are
both legal:


<img src="nice.jpg">
<img src=nice.jpg>

You can use uppercase or lowercase (or mix them), so all of these are legal:


<IMG SRC=nice.jpg>
<img src=nice.jpg>
<iMg SrC=nice.jpg>
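
As a rough illustration of why these variants are interchangeable, the sketch below feeds each one to Python’s standard html.parser module and records what comes out. This module is a lenient tokenizer, not a full HTML5 parser and not what any browser runs, so treat it only as a demonstration that case, quoting and the trailing slash are normalized away. The TagCollector class and the tokenize helper are names made up for this example.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Records each start tag as (name, attributes) after tokenizing."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # html.parser hands over lowercased tag and attribute names,
        # with any surrounding quotes already removed from the values.
        self.tags.append((tag, dict(attrs)))


def tokenize(markup):
    """Hypothetical helper: return the start tags found in a markup string."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


variants = [
    '<img src="nice.jpg" />',
    '<img src="nice.jpg">',
    '<img src=nice.jpg>',
    '<IMG SRC=nice.jpg>',
    '<iMg SrC=nice.jpg>',
]

results = [tokenize(v) for v in variants]
for variant, result in zip(variants, results):
    print(variant, "->", result)
```

Every variant should yield the same single tag, img, with the same src attribute, which is the sense in which the differences between them are purely cosmetic.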

This isn’t any different from HTML4, but it probably comes as quite a shock
if you’re used to XHTML. In reality, if you were serving your pages as
text/html rather than as XML (and you probably were, because Internet
Explorer 8 and below couldn’t render true XHTML), then it never mattered
anyway: the browser never cared about trailing slashes, quoted attributes
or case; only the validator did.


So, while the syntax appears to be looser, the actual parsing rules are much
tighter. The difference is that there is no more tag soup; the specification
describes exactly what to do with invalid mark-up so that all conforming
browsers produce the same DOM. If you’ve ever written JavaScript that has
to walk the DOM, then you’re aware of the horrors that inconsistent DOMs
can bring.
