262 Chapter 11—Microdata
This could be the beginning of a fictitious review of a concert in an equally fic-
titious blog: two paragraphs full of information, filtered and combined auto-
matically by the reader whilst reading. The event is defined in terms of time and
location; objects, instruments, and events on stage are recognized and people
mentioned in the text are identified as a matter of course as musicians with their
respective instruments. The human brain is trained to filter information
efficiently. Computers are not; they need help, which essentially comes down to
marking the relevant information and relating the marked pieces to one another.
Which information is relevant depends entirely on what we want to filter out of
the text. For a diary it would be the name of the event, its time, and place; for an
address book, the contact details of the musicians; and for searching for new CDs
to add to your music collection, you need the names of the artists and bands.
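To make the idea of context-dependent relevance concrete, here is a minimal sketch in Python. It assumes the facts have already been marked in the text; every value, field name, and consumer name below is an invented placeholder and not part of any microdata vocabulary.

```python
# Facts a reader might pick out of the review. All values are invented
# placeholders for illustration, not taken from the text.
facts = {
    "event": "Example Concert",
    "date": "2010-05-01",
    "venue": "Example Hall",
    "artists": ["Example Artist"],
    "contact": {"Example Artist": "artist@example.org"},
}

# Each consumer declares which fields it cares about (hypothetical names).
RELEVANT_FIELDS = {
    "diary": ["event", "date", "venue"],
    "address_book": ["contact"],
    "cd_search": ["artists"],
}


def extract(facts, consumer):
    """Return only the facts that matter to the given consumer."""
    return {field: facts[field] for field in RELEVANT_FIELDS[consumer]}


print(extract(facts, "diary"))
print(extract(facts, "cd_search"))
```

The same marked-up text serves all three applications; only the selection of fields differs.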
One option for offering the essence of a text in the relevant context and
in machine-readable form is microdata, a very young and hotly debated
feature of HTML5.
In the eyes of many critics, microdata stands in direct competition with RDFa
(Resource Description Framework in Attributes), another option for embedding
metadata. RDFa's close connection to XHTML makes it especially difficult to fit
in with the concept of HTML5, which lacks the namespaces used abundantly in
RDFa. The result of the tug-of-war between the two approaches is, not
surprisingly, two specifications. Microdata exists both as an integrated WHATWG
version and as a stand-alone W3C version, whereas RDFa can only be found at the
W3C. The links to the specifications are:
• http://www.w3.org/TR/microdata
• http://www.whatwg.org/specs/web-apps/current-work/multipage/links.html#microdata
• http://www.w3.org/TR/rdfa-in-html
The a in RDFa stands for attributes, which brings us to the feature both
techniques have in common: both RDFa and microdata use a set of attributes to
define metadata. In RDFa, this metadata is present as a triple of subject,
predicate, and object. As explained in Wikipedia with regard to the Resource
Description Framework, the subject denotes the resource (Pat Metheny), the
predicate denotes a trait or aspect of the resource and expresses the
relationship between the subject and the object (for example, is a or plays),
and the object is the value that the predicate points to (musician or
Orchestrion). With microdata, the
information ends up as name-value pairs, such as Pat Metheny : musician or Pat
Metheny : Orchestrion. Which of the two approaches will ultimately prevail is un-
certain. Both techniques have advantages and disadvantages, and could also co-
exist. But because microdata can already be integrated seamlessly into HTML5,
we will concentrate on microdata in this chapter.
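The difference between the two data models can be sketched in a few lines of Python. This is only an illustration of triples versus name-value pairs, not of the RDFa or microdata syntax itself; the predicate names ("is a", "plays") and the property names ("name", "role", "instrument") are assumptions made up for this sketch.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# RDF-style: every fact is a standalone three-part statement.
# The predicate names are invented for this sketch.
triples = [
    Triple("Pat Metheny", "is a", "musician"),
    Triple("Pat Metheny", "plays", "Orchestrion"),
]

# Microdata-style: one item holds a set of name-value pairs.
# The property names are hypothetical, not from a real vocabulary.
item = {
    "name": "Pat Metheny",
    "role": "musician",
    "instrument": "Orchestrion",
}


def item_to_triples(item: dict[str, str], subject_key: str = "name") -> list[Triple]:
    """Turn an item's name-value pairs into triples about that item."""
    subject = item[subject_key]
    return [
        Triple(subject, name, value)
        for name, value in item.items()
        if name != subject_key
    ]


for t in item_to_triples(item):
    print(f"{t.subject} -- {t.predicate} --> {t.obj}")
```

Because each name-value pair of an item can be read as a statement about that item, the two models can carry the same facts, which is one reason the techniques could coexist.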