256 10 Transforming with Traditional Programming Languages
<HealthStudyUS>
<Interview Date='2000-1-15' Weight='101.794' Height='24.18'/>
<Interview Date='2000-1-15' Weight='151.69' Height='24.57'/>
<Interview Date='2000-2-1' Weight='203.566' Height='25.35'/>
<Interview Date='2000-2-1' Weight='110.77' Height='26.13'/>
</HealthStudyUS>
Program 10.24 deals only with information in attributes. When information
is in XML content, one must use additional handlers. Suppose that one
has the same task as in program 10.15, but the output must be in XML. The
output should look like this:
<WeightList>
<Weight>46.27</Weight>
<Weight>68.95</Weight>
<Weight>92.53</Weight>
<Weight>50.35</Weight>
</WeightList>
The solution is shown in program 10.25.
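As a rough illustration of how a handler-based program can emit XML output
like the list above, here is a minimal sketch in Python rather than the
book's Perl, using the standard xml.parsers.expat module. It is not program
10.25. The input layout (a HealthStudy root whose Interview elements carry a
Weight attribute in kilograms) and the function name weight_list are
assumptions made only for this example.

```python
import xml.parsers.expat
from xml.sax.saxutils import escape

# Assumed input: weights in kilograms stored as attributes.
SAMPLE = """<HealthStudy>
<Interview Date='2000-1-15' Weight='46.27'/>
<Interview Date='2000-1-15' Weight='68.95'/>
<Interview Date='2000-2-1' Weight='92.53'/>
<Interview Date='2000-2-1' Weight='50.35'/>
</HealthStudy>"""


def weight_list(xml_text):
    """Return a WeightList document built from Interview Weight attributes."""
    lines = []

    def start(name, attrs):
        # expat hands the attributes to the start handler as a dict of strings.
        if name == "HealthStudy":
            lines.append("<WeightList>")
        elif name == "Interview" and "Weight" in attrs:
            # escape() guards against '<' or '&' in the value.
            lines.append("<Weight>%s</Weight>" % escape(attrs["Weight"]))

    def end(name):
        # The closing tag of the output is written when the input root closes.
        if name == "HealthStudy":
            lines.append("</WeightList>")

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.Parse(xml_text, True)
    return "\n".join(lines)


print(weight_list(SAMPLE))
```

The start handler does most of the work here because attribute values arrive
together with the element name. Only the wrapper element needs the end
handler.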
XML allows data to be in either attributes or content. Attributes are much
simpler to process, but they are more limited than content: content can
contain markup while attributes cannot. Generally speaking, one should use
attributes for simple data values and content for more complex data
values.
One common transformation task is to convert from one of these two formats
to the other. Consider the task of converting the health study from
content attributes to ordinary attributes. In program 10.26 the
$printContent variable is used by the start handler to inform the char and
end handlers that the content information is to be printed. The end
handler turns this variable “off.”
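The flag technique just described can be sketched as follows. This is a
Python analogue built on xml.parsers.expat, not the Perl of program 10.26.
The print_content variable plays the role of $printContent. The input layout
(Date and Weight as child elements of Interview), the FIELDS set, and the
function name content_to_attributes are assumptions for illustration only.

```python
import xml.parsers.expat
from xml.sax.saxutils import escape

# Assumed input: each simple value is held in the content of a child element.
SAMPLE = """<HealthStudy>
<Interview><Date>2000-1-15</Date><Weight>46.27</Weight></Interview>
<Interview><Date>2000-2-1</Date><Weight>92.53</Weight></Interview>
</HealthStudy>"""

# Child elements whose content becomes an attribute value (assumed names).
FIELDS = {"Date", "Weight", "Height"}


def content_to_attributes(xml_text):
    """Rewrite child-element content as attributes of the Interview element."""
    out = []
    print_content = False  # analogue of $printContent

    def start(name, attrs):
        nonlocal print_content
        if name in FIELDS:
            # Open the attribute and tell the char handler to print content.
            out.append(" %s='" % name)
            print_content = True
        elif name == "Interview":
            out.append("<Interview")  # left open until its end tag arrives
        else:
            out.append("<%s>\n" % name)

    def char(data):
        # Whitespace between elements arrives here too; the flag filters it.
        if print_content:
            out.append(escape(data, {"'": "&apos;"}))

    def end(name):
        nonlocal print_content
        if name in FIELDS:
            out.append("'")        # close the attribute value
            print_content = False  # turn the flag "off"
        elif name == "Interview":
            out.append("/>\n")
        else:
            out.append("</%s>" % name)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.CharacterDataHandler = char
    parser.EndElementHandler = end
    parser.Parse(xml_text, True)
    return "".join(out)


print(content_to_attributes(SAMPLE))
```

Because the char handler may be called several times for one piece of text,
it appends each chunk as it arrives instead of assuming a single call per
element.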
While the handler style of parsing and processing XML documents is
efficient, programs can get very complicated when the transformation task
involves data and attributes on more than one level. As an exercise, try to
modify program 10.26 so that it converts the weight and height from
kilograms and centimeters to pounds and inches. To do this exercise, one
must introduce one or more variables that allow the start handler to inform
the char handler about which attribute is being printed so that the
appropriate conversion can be performed. The problem with this style is
that the handling of each element is spread over the three handlers. It
would be better if all the processing for each type of element were handled
in one place. Other