10.2 Transforming XML


the name of the parameter and the value of the parameter. The => symbols
are suggestive of this way of using parameters. This style for designing
procedures is analogous to the attributes in an XML element: one first gives
the attribute name and then the attribute value. In XML the attribute name
and attribute value are separated by an equals sign. In Perl they are
separated by the => symbol.
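
The name/value style can be illustrated with a minimal sketch. The procedure
describe_patient and the parameter names name, weight, and units are
hypothetical, chosen only for illustration; they do not come from any of the
numbered programs. The corresponding XML element would carry the same
information as attributes, for example name="Patient 1" weight="72".

```perl
use strict;
use warnings;

# Hypothetical procedure that takes its arguments as name => value pairs.
sub describe_patient {
    # The default for units comes first, so a pair supplied by the caller
    # with the same name overrides it.
    my %args = (units => 'kg', @_);
    return "$args{name} weighs $args{weight} $args{units}";
}

# The order of the pairs does not matter, because each value is labeled.
my $text = describe_patient(weight => 72, name => 'Patient 1');
print "$text\n";
```

Because every value is labeled by its name, a caller may give the parameters
in any order and may omit those that have defaults, much as XML attributes
may appear in any order within a start tag.
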
Program 10.14 can only process information that is in XML attributes. XML
content requires additional handlers. Consider the task of parsing the output
of Program 10.19 of subsection 10.2.4. The XML document in this case has no
XML attributes at all, and all of the data are in XML content. Program 10.15
will accomplish the task. Just as a story has a beginning, a middle, and an
end, there are now three handlers: one for when an element starts, one for
the content, and one for when an element ends. The weightElement variable is
nonzero exactly when one is parsing a Weight element. This ensures that the
char procedure will print the content only for Weight elements. In general,
the char procedure will be invoked several times within a single element. It
will usually be called once for each line of the content.
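
The three-handler pattern can be sketched as follows. This is not Program
10.15 itself but a small variation written under several assumptions: it uses
the XML::Parser module from CPAN, the input document and its Patients,
Patient, and Name elements are hypothetical, and instead of printing inside
the char procedure it collects the pieces of content in a variable so that
the text of one element is reassembled even if char is invoked several times.

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical input: all data are in element content, none in attributes.
my $xml = <<'END_XML';
<Patients>
  <Patient>
    <Name>Patient 1</Name>
    <Weight>72.5</Weight>
  </Patient>
  <Patient>
    <Name>Patient 2</Name>
    <Weight>68.0</Weight>
  </Patient>
</Patients>
END_XML

my $weightElement = 0;   # nonzero exactly while inside a Weight element
my $content = '';        # content gathered for the current Weight element
my @weights;             # one entry per completed Weight element

# Called when an element starts.
sub start {
    my ($expat, $name, %attributes) = @_;
    if ($name eq 'Weight') {
        $weightElement = 1;
        $content = '';
    }
}

# Called for the content; may be called several times for one element,
# so the pieces are appended rather than assigned.
sub char {
    my ($expat, $string) = @_;
    $content .= $string if $weightElement;
}

# Called when an element ends.
sub end {
    my ($expat, $name) = @_;
    if ($name eq 'Weight') {
        push @weights, $content;
        $weightElement = 0;
    }
}

my $parser = XML::Parser->new(
    Handlers => { Start => \&start, Char => \&char, End => \&end },
);
$parser->parse($xml);
print "$_\n" for @weights;
```

Note that the handlers are themselves passed to the parser as name/value
pairs separated by => symbols. The content of the Name elements is ignored
because weightElement is zero while those elements are being parsed.
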
One of the most useful resources for general biomedical information is
PubMed, a repository of citations to biomedical publications. More than half
of the citations include abstracts. Over 15 million citations are available
online through PubMed, and they can be obtained as XML documents. The
following is what part of a typical PubMed citation looks like. The actual
citation is over 130 lines long.



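<MedlineID>99405456</MedlineID>
<PMID>10476541</PMID>
<DateCreated>
  <Year>1999</Year>
  <Month>10</Month>
  <Day>21</Day>
</DateCreated>
<DateCompleted>
  <Year>1999</Year>
  <Month>10</Month>
  <Day>21</Day>
</DateCompleted>
<DateRevised>
  <Year>2001</Year>
  <Month>11</Month>
  <Day>02</Day>
</DateRevised>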