244 10 Transforming with Traditional Programming Languages
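use XML::DOM;

# Parse the whole document into a DOM tree (file name from the command line).
$p = new XML::DOM::Parser;
$doc = $p->parsefile($ARGV[0]);

# Collect every Weight element and print its text content.
$weights = $doc->getElementsByTagName("Weight");
for ($i = 0; $i < $weights->getLength; $i++) {
    $weight =
        $weights->item($i)->getFirstChild->getNodeValue;
    print("Weight $weight\n");
}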
Program 10.17 Converting an entire XML document using a Perl data structure
10.2.3 The Document Object Model
Although transforming XML element by element is capable of accomplishing
any transformation task, it gets very complicated very quickly. The problem
is that the information one needs at a given time might not be in the element
being processed. It may, for example, be in the parent element. To deal with
this, one can parse the entire document and then extract the parts that are
needed. For example, suppose that one would like to extract the date, height,
and weight of each interview. Program 10.17 uses the “whole document” approach
to this task. This program uses the XML::DOM package. DOM stands
for document object model. It reads an entire XML document into a single
document object. Just as in the XML::Parser package, one constructs a parser,
but no handlers need to be defined. The parser is then invoked to parse a
document, and the object containing the document is assigned to
the $doc variable. After that, one extracts information about the document
by using DOM methods.
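The point about needing information from the parent element can be illustrated with a short sketch. It assumes a hypothetical document in which each Interview element carries a Date attribute and contains Height and Weight child elements; those names, the sample values, and the helper interview_lines are assumptions made for illustration, not the book's data set. The string is parsed with the parse method, which XML::DOM::Parser inherits from XML::Parser.

```perl
use strict;
use warnings;
use XML::DOM;

# Hypothetical sample data: the element and attribute names below are
# assumptions made for this sketch.
my $xml = <<'END_XML';
<Interviews>
  <Interview Date="2000-01-15"><Height>180</Height><Weight>78.5</Weight></Interview>
  <Interview Date="2000-06-20"><Height>180</Height><Weight>80.1</Weight></Interview>
</Interviews>
END_XML

my $parser = XML::DOM::Parser->new;
my $doc    = $parser->parse($xml);    # parse a string instead of a file

# Build one "date height weight" line per Weight element in the document.
sub interview_lines {
    my ($doc) = @_;
    my @lines;
    my $weights = $doc->getElementsByTagName('Weight');
    for my $i (0 .. $weights->getLength() - 1) {
        my $weight_el = $weights->item($i);
        # The date is not in the Weight element, so step up to its parent.
        my $interview = $weight_el->getParentNode;
        my $date      = $interview->getAttribute('Date');
        # The height is in a sibling element, reached through the parent.
        my $height_el = $interview->getElementsByTagName('Height')->item(0);
        my $height    = $height_el->getFirstChild->getNodeValue;
        my $weight    = $weight_el->getFirstChild->getNodeValue;
        push @lines, "$date $height $weight";
    }
    return @lines;
}

print "$_\n" for interview_lines($doc);
```

Because the whole tree is in memory, the code can move from a Weight element to its parent and on to a sibling without any bookkeeping, which is what the element-by-element approach makes awkward.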
There are many DOM methods. One of the most popular methods is
getElementsByTagName, which extracts all of the elements within the current
element that have a particular tag. Most DOM methods return an
object, so one uses the -> operator to extract information from it. The item
method gets one of the elements extracted by getElementsByTagName.
The item method is yet another way to extract one of the items in a list. One
uses brackets to get one item in an array, and braces to get one item in a hash.
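For comparison with the item method, the following minimal sketch shows the two built-in ways of selecting one item in Perl. The variable names and values are hypothetical and chosen only to illustrate the syntax.

```perl
use strict;
use warnings;

# Hypothetical data, used only to show the two access syntaxes.
my @weights   = (78.5, 80.1, 79.3);          # array: items ordered by position
my %weight_on = ('2000-01-15' => 78.5,       # hash: items looked up by key
                 '2000-06-20' => 80.1);

my $second = $weights[1];                    # brackets select by position (from 0)
my $june   = $weight_on{'2000-06-20'};       # braces select by key

print "Second weight: $second\n";
print "June weight: $june\n";
```

A DOM node list is an object rather than a Perl array, so the position is passed to a method, as in $weights->item(1), instead of being written in brackets.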