11 The XML Transformation Language


The XML Transformation Language, formally named XSL Transformations (XSLT)
(W3C 2001d), is one of the most popular and most widely available
transformation languages for XML documents. Although XSLT was originally
intended for use within the Extensible Stylesheet Language (XSL), it can be
used for many other useful transformations, including data transformations
for bioinformatics. In fact, XSLT is now used mostly for general-purpose
transformation rather than for styling. While there are many XML
transformation languages, XSLT has the advantage of being rule-based and
of being itself written in XML. This chapter introduces this style of
programming.

11.1 Transformation as Digestion


XSLT is very different from the procedural style of programming that
dominates mainstream programming languages. XSLT is rule-based. An XSLT
rule is called a template, and an XSLT program is just a set of templates.
The templates are separate from one another (i.e., one template can never
contain another), and the order in which they appear in the program does
not matter. The whole XSLT program is called a transformation program or a
transform.
Consider the document in figure 11.1, which shows some protein interaction
data from a microarray experiment. Suppose that one would like to change
the names (tags) of some of the elements. Specifically, suppose that
instead of Protein we want to use P, and instead of Substrate we want to
use S. Transform 11.1 shows the XSLT program for doing this task. To
understand how this program functions, consider how enzymes digest
molecules such as proteins. Proteins are long chains of amino acids, and
each enzyme is capable of splitting the chain at one or more specific
points, namely those that match the active site of the enzyme. This
process is shown symbolically in figure 11.2.
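
Transform 11.1 itself is not reproduced in this excerpt. As a minimal
sketch of what a renaming transform of this kind could look like, assuming
that the Protein and Substrate elements are in no namespace and that all
other content should pass through unchanged, one might write the
following. The catch-all copying template is an assumption added for
illustration and is not taken from the text.

```xml
<?xml version="1.0"?>
<!-- Illustrative sketch only; not the book's Transform 11.1. -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Rule: wherever a Protein element is matched, emit a P element
       and keep processing its attributes and children. -->
  <xsl:template match="Protein">
    <P>
      <xsl:apply-templates select="@*|node()"/>
    </P>
  </xsl:template>

  <!-- Rule: wherever a Substrate element is matched, emit an S element
       and keep processing its attributes and children. -->
  <xsl:template match="Substrate">
    <S>
      <xsl:apply-templates select="@*|node()"/>
    </S>
  </xsl:template>

  <!-- Assumed fallback rule: copy every other node and attribute as is.
       Its default priority is lower than that of the two rules above,
       so it applies only where neither of them matches. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

In the terms of the digestion analogy, each template acts like an enzyme
whose match pattern plays the role of the active site: it acts only on the
parts of the document that fit the pattern. The three templates could be
listed in any order with the same result. With an XSLT 1.0 processor such
as xsltproc, a sketch like this could be run as
`xsltproc rename.xsl interactions.xml`, where both file names are
hypothetical.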