A Complete Guide to Web Design

Chapter 27 – Internationalization



Web Design in a Nutshell, eMatter Edition

HTTP header). The meta tag that corresponds to the above header message would
look like this:
<META http-equiv="Content-Type" content="text/html; charset=ISO-8859-8">
Note that the browser must support your chosen character set in order for the
page to display properly.
Browsers may one day send an accept-charset value, specifying their preferred
character encoding when requesting a document (currently only Lynx supports
this function). The server would then serve the document with the appropriate
encoding, if the preferred version is available.
The accept-charset attribute is already a part of the HTML 4.0 specification for
form elements (although it is not yet supported by major browsers). With the
accept-charset attribute, the document can specify which character sets the
server can receive from the user in text input fields.
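As a minimal sketch of the attribute described above, a form might declare the character set its server accepts as shown here. The action URL and the field name are hypothetical placeholders, and the character set simply reuses the one from the meta tag example.

```html
<!-- Hypothetical form: the action URL and field name are placeholders -->
<FORM action="/cgi-bin/comments" method="post" accept-charset="ISO-8859-8">
  <!-- Text typed here is what the server expects in the listed character set -->
  <INPUT type="text" name="comment">
  <INPUT type="submit" value="Send">
</FORM>
```

The attribute value may list more than one character set, separated by spaces or commas.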

HTML 4.0 Language Tags


Coordinating character sets is only the first part of the challenge. Even languages
that share a character set may have different rules for hyphenation, spacing,
quotation marks, punctuation, and so on. In addition to character shapes (glyphs),
issues such as directionality (whether the text reads left-to-right or right-to-left) and
cursive joining behavior must be taken into account as well.
This prompted a need for a system of language identification. The W3C responded
by incorporating the language tags put forth in the RFC 2070 standard on
internationalization.

The “LANG” Attribute


The lang attribute can be added within any tag to specify the language of the
contained element. It can also be added within the <html> tag to specify a
language for an entire document. The following example specifies the document’s
language as French:
<HTML LANG="fr">
It can also be used within text elements to switch to other languages within a
document. For example, you can “turn on” Norwegian for just one element:
<BLOCKQUOTE lang="no">...</BLOCKQUOTE>
The value for the lang attribute is a two-letter language code (not the same as
country codes). Table 27-1 lists the currently available language codes.
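To show the document-level and element-level uses together, here is a minimal sketch of a page that sets French for the whole document and switches language for individual elements. The title and sample phrases are made up for illustration. The codes fr, no, and ro are the ones used in this section and in Table 27-1.

```html
<HTML lang="fr">
<HEAD>
<TITLE>Exemple</TITLE>
</HEAD>
<BODY>
<!-- No lang attribute here: the paragraph inherits French from the HTML tag -->
<P>Bonjour tout le monde.</P>
<!-- Norwegian applies only to this block -->
<BLOCKQUOTE lang="no">God morgen</BLOCKQUOTE>
<!-- An inline switch to Romanian inside a French paragraph -->
<P>En roumain, on dit <SPAN lang="ro">Salut</SPAN>.</P>
</BODY>
</HTML>
```

Elements without their own lang attribute take the language of the nearest enclosing element that has one, so only the exceptions need to be marked.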

Table 27-1: Code for the Representation of Names of Languages

Code  Language     Code  Language                  Code  Language
aa    Afar         ia    Interlingua               rn    Kirundi
ab    Abkhazian    id    Indonesian (formerly in)  ro    Romanian