- XSD does not change the semantics of XML documents.
- An XML DTD can be converted to XSD using dtd2xsd.pl.
2.4 XML Data
An XML DTD has a rich language for specifying the kinds of elements that
can be contained within each element. By contrast, there is very little one
can say about attributes and text content with an XML DTD. In practice,
nearly all attributes are defined to be of type CDATA, which allows any kind
of text data except for XML elements. When an element has text, its content
is defined to be of type #PCDATA, which allows any kind of text data. Many
important types of data are routinely used by computer and database
systems, such as numbers, times, dates, telephone numbers, product codes,
and so on. The limitations of XML DTDs have prevented XML processors from
properly validating these types. As a result, individual application
writers have had to implement type checking in an ad hoc manner.
The XSD datatype recommendation addresses the need of both document
authors and application writers for a robust, extensible datatype system
for XML. This standard has been very successful, and it has now been
incorporated into other XML-related standards such as RDF and OWL, to be
discussed in chapter 4.
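To make the cost of ad hoc type checking concrete, the following is a minimal
sketch in Python of the kind of check an application writer might have to
hand-code when the DTD can say no more than #PCDATA. The element names Record
and Published and the function read_published are hypothetical, chosen only
for illustration; the XML parser itself hands back plain text and reports
nothing wrong with an impossible date.

```python
import xml.etree.ElementTree as ET
from datetime import date


def read_published(xml_text):
    """Return the <Published> child as a date, or None if it is not a real date.

    The element names are hypothetical. The parser only checks that the
    document is well formed, so the date check must be written by hand.
    """
    root = ET.fromstring(xml_text)
    text = root.findtext("Published", default="")
    try:
        # date.fromisoformat raises ValueError for text that is not a
        # calendar date, such as "2004-02-30" or "last Tuesday".
        return date.fromisoformat(text.strip())
    except ValueError:
        return None


good = "<Record><Published>2004-02-29</Published></Record>"
bad = "<Record><Published>2004-02-30</Published></Record>"
print(read_published(good))
print(read_published(bad))
```

Every application that reads such documents would need its own version of
this check, which is the duplication of effort that a shared datatype system
is meant to remove.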
For example, in the Medline schema, part of which was shown in
section 2.3, the Day element specifies the day of the month, but this schema
allows one to use any text whatsoever as a day. At the very least, one should
limit the values to positive integers. To do this, one should change the
specification for the Day element to the following:
<element name='Day' type='xsd:positiveInteger'/>
An even better specification would further restrict the possible values to
positive integers no larger than 31, as in
<element name='Day'>
  <simpleType>
    <xsd:restriction base='xsd:positiveInteger'>
      <xsd:maxInclusive value='31'/>
    </xsd:restriction>
  </simpleType>
</element>
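To see which Day values the restricted type admits, the rule can be spelled
out by hand. The following Python sketch mimics the check for
xsd:positiveInteger with a maxInclusive facet of 31; it is not an XSD
validator, and the function name is_valid_day is hypothetical. It assumes the
lexical rules of the XSD datatype recommendation: surrounding whitespace is
collapsed, and the value is an optional plus sign followed by decimal digits.

```python
import re

# Whitespace characters recognized by XML: space, tab, line feed, carriage return.
_XML_WHITESPACE = " \t\n\r"

# Lexical form of xsd:positiveInteger: optional '+' followed by decimal digits.
_POSITIVE_INTEGER = re.compile(r"\+?[0-9]+")


def is_valid_day(text, max_inclusive=31):
    """Hand-written stand-in for the restricted Day type (hypothetical helper)."""
    collapsed = text.strip(_XML_WHITESPACE)
    if not _POSITIVE_INTEGER.fullmatch(collapsed):
        return False
    # positiveInteger excludes zero; maxInclusive sets the upper bound.
    return 1 <= int(collapsed) <= max_inclusive


for candidate in ["7", "+07", " 31 ", "0", "32", "-3", "3.0", "seven"]:
    print(repr(candidate), is_valid_day(candidate))
```

In practice one would give the schema to a validating XML processor rather
than reimplement the check; the sketch only makes the accepted values
explicit. Note that even the restricted type still accepts 31 for every
month, so it narrows the set of errors without eliminating them.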