Sams Teach Yourself HTML, CSS & JavaScript Web Publishing in One Hour a Day


LESSON 7: Formatting Text with HTML and CSS


Special Characters


As you’ve already learned, HTML files are ASCII text and should contain no formatting
or fancy characters. In fact, the only characters you should put in your HTML files are
the characters that are actually printed on your keyboard. If you have to hold down any
key other than Shift or type an arcane combination of keys to produce a single character,
you can’t use that character in your HTML file. This includes characters you might use
every day, such as em dashes and curly quotes. (If you are using a word processor that
automatically inserts curly quotes, you should switch to an editor that writes plain text
files instead.)
“But wait a minute,” you say. “If I can type a character like a bullet or an accented a on
my keyboard using a special key sequence, and I can include it in an HTML file, and my
browser can display it just fine when I look at that file, what’s the problem?”
The problem is that the internal encoding your computer uses to produce that character
(which enables it to show up properly in your HTML file and in your browser’s display)
probably won’t translate to other computers. Someone on the Internet who’s reading your
HTML file with that funny character in it might end up with some other character or just
plain garbage.
So, what can you do? HTML provides a reasonable solution. It defines a special set of
codes, called character entities, that you can include in your HTML files to represent the
characters you want to use. When interpreted by a browser, these character entities
display as the appropriate special characters for the given platform and font.
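
As a rough illustration of the idea, here is how a few common special characters might be written with named entities. The sample sentences are invented for this sketch and are not taken from the lesson; the numeric forms in the comments are equivalent alternatives.

```html
<!-- Each entity starts with an ampersand and ends with a semicolon -->
<p>caf&eacute;</p>                               <!-- accented e, same as &#233; -->
<p>Wait &mdash; there is more</p>                <!-- em dash, same as &#8212; -->
<p>&ldquo;Hello,&rdquo; she said.</p>            <!-- curly quotes, same as &#8220; and &#8221; -->
<p>&bull; First item</p>                         <!-- bullet, same as &#8226; -->
<p>&copy; 2016 Example Company</p>               <!-- copyright sign, same as &#169; -->
```

Because every character in this file is one you can type directly on the keyboard, the file itself stays plain ASCII, and the browser substitutes the special characters when it displays the page.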
Some special characters don’t come from the set of extended ASCII characters. For
example, quotation marks and ampersands can be presented on a page using character
entities even though they’re found within the standard ASCII character set. These
characters have a special meaning in HTML documents within certain contexts, so they can
be represented with character entities to avoid confusing web browsers. Modern browsers
generally don’t have a problem with these characters, but it’s not a bad idea to use the
entities anyway.
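
A minimal sketch of escaping these ASCII characters follows. The text and the file name `search.html` are hypothetical, chosen only for illustration. It also escapes angle brackets, which are not named in the paragraph above but likewise have special meaning in HTML.

```html
<!-- An ampersand in ordinary text -->
<p>Tom &amp; Jerry</p>

<!-- Angle brackets shown as text rather than read as a tag -->
<p>Use &lt;em&gt; to emphasize text.</p>

<!-- Double quotes inside a double-quoted attribute value -->
<p title="She said &quot;hello&quot;">Hover over this paragraph.</p>

<!-- An ampersand separating parameters in a link URL -->
<a href="search.html?q=html&amp;page=2">Next page</a>
```

In each case the browser shows the literal character (`&`, `<`, `>`, or `"`) to the reader, while the markup itself stays unambiguous.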

CAUTION

HTML validators will complain when they encounter ampersands
that are not part of entities, so you always want to encode them
using entities on your pages.