Chapter 4
Optional Attributes
There are no optional attributes for the br or wbr elements.
Older versions of HTML featured a clear attribute for the br element, giving browsers
instruction on how text and other elements should flow around the line break. This
presentational attribute is obsolete now, replaced by the equivalent clear property in
CSS.
Special Characters
You know by now that an HTML document is simply plain text. There’s nothing special at all about the file
format; it’s just written in a language that web devices are programmed to understand. Tags within that
plain-text document are enclosed by angle brackets (< and >) to distinguish the tags from ordinary text.
When a browser encounters those symbols, it can assume it’s dealing with markup and behave
accordingly. This raises one issue, of course: what if you need to use angle brackets in your text? If the
browser treats them as part of a tag, the entire page might fall apart.
HTML includes a large number of character references, which offer a way to encode special characters
that aren’t part of the regular English alphanumeric set of characters (A-Z, a-z, 0-9, and most common
punctuation). A character reference begins with an ampersand (&) and ends with a semicolon (;). Between
those symbols there are two different ways to invoke the special character you desire: with a character
entity name or a numeric character reference.
A character entity name is simply a predefined name referring to a particular symbol, like a nickname. The
entity for the “less than” symbol (<) is &lt; and its counterpart, the “greater than” symbol (>), is &gt;. You
can use these entities to render the symbols in your content and prevent them from being treated as tags.
Your other option, the numeric character reference, refers to a character by its assigned Unicode number,
and is specified by an octothorpe (#) after the ampersand. The numeric character reference for the “less
than” symbol is &#60; and “greater than” is &#62;. Most of the time, the much-easier-to-remember entity
names are sufficient, but some more obscure symbols may not have entity names.
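As a brief illustration of both forms, here is a minimal sketch; the sentences and the company name are made up for this example. The first two paragraphs display a literal tag name, once with entity names and once with numeric character references, and a browser should render them identically:

```html
<!-- Entity names: the browser shows <em> as text instead of starting a tag -->
<p>Use the &lt;em&gt; element to add emphasis.</p>

<!-- Numeric character references: decimal code points 60 and 62 -->
<p>Use the &#60;em&#62; element to add emphasis.</p>

<!-- Many symbols have both forms: the copyright sign is &copy; or &#169; -->
<p>&copy; Example Company</p>
```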
In XML or XHTML, encoding special characters in this manner is known as escaping,
because these embedded codes are excluded from the regular XML parsing. If you’re
authoring XHTML, one character you must be careful to escape is the ampersand itself;
a non-escaped ampersand in your XHTML markup will be treated as the beginning of a
character reference. In order to display an ampersand in an XHTML document, encode
it with the entity &amp; or the numeric reference &#38;. This also goes for ampersands
in URLs within an attribute (such as cite, src, or href). HTML5 is much more forgiving
and tolerates most unescaped ampersands, though escaping them remains the safest habit.
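To make the ampersand rule concrete, here is a small sketch using a hypothetical URL and link text (neither comes from a real site). Both the ampersand separating the query parameters and the one in the visible text are written as &amp;:

```html
<!-- example.com and its query parameters are placeholders for illustration -->
<p>
  <a href="http://example.com/search?category=books&amp;sort=title">Books &amp; magazines</a>
</p>
```

The browser decodes each &amp; back to a plain ampersand when it reads the attribute, so the address it requests contains category=books&sort=title.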