Java 7 for Absolute Beginners


CHAPTER 9



Writing and Reading XML


XML stands for Extensible Markup Language. You might think it would be “eXtensible Markup
Language,” but it's not. Odd acronym aside, XML rates inclusion in a book for beginning programmers
because, as your software-development career (whether as a hobbyist or a professional) continues,
you'll inevitably run into XML in all sorts of places. It's used to store documents, from the contents of a
single web page to the contents of entire sets of encyclopedias. It's also used to transmit data between
applications, whether the servers running those applications are halfway around the world or in the
same room. It's even used (with Cascading Style Sheets) to display information in web browsers. Every
company I've worked for over the last dozen years, and every application I've written (at least those
applications more serious and substantial than Minesweeper), has made at least some use of XML.
Although a specialized language called XSLT (Extensible Stylesheet Language Transformations) exists
specifically for processing XML, Java is also a very popular language for dealing with XML. Also, one of
the best and most popular XSLT processors, called Saxon, is coded in Java. Java is especially handy for
working with XML because it includes a number of packages intended specifically for processing
(reading, writing, and transforming) XML. The two most common packages (largely because they are
included in Java) are DOM (Document Object Model) and SAX (Simple API for XML). You can use DOM
to read and write XML. SAX only reads (or, more properly, parses) XML. For writing XML, though, you
can also create a String object and write that to your file. Done correctly, writing String objects offers
the lowest overhead (in both memory and processing time) for producing XML. This chapter will cover writing
XML from DOM and from String objects, and reading XML with DOM and SAX.
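To make those three approaches a little more concrete, here is a minimal sketch (not the code this chapter develops later). It assembles a tiny XML document as a String, then reads it back once with DOM and once with SAX. The class name XmlOverviewSketch, its method names, and the poem and line elements are all made up for illustration; the parser classes come from the javax.xml.parsers, org.w3c.dom, and org.xml.sax packages that ship with Java.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class used only for illustration.
public class XmlOverviewSketch {

    // Writing: assemble the XML text by hand as an ordinary String.
    public static String buildXml() {
        StringBuilder sb = new StringBuilder();
        sb.append("<poem title=\"Sample\">");
        sb.append("<line>First line</line>");
        sb.append("<line>Second line</line>");
        sb.append("</poem>");
        return sb.toString();
    }

    // Reading with DOM: the whole document is loaded into memory as a tree.
    public static String rootNameWithDom(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTagName();
    }

    // Reading with SAX: the parser calls back once for each element it meets.
    public static int countElementsWithSax(String xml) throws Exception {
        final int[] count = {0};
        SAXParserFactory.newInstance().newSAXParser().parse(
                new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String localName,
                                             String qName, Attributes attrs) {
                        count[0]++;
                    }
                });
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        String xml = buildXml();
        System.out.println("Root element (DOM): " + rootNameWithDom(xml));
        System.out.println("Element count (SAX): " + countElementsWithSax(xml));
    }
}
```

Notice the difference in style: DOM hands you a finished tree to query, while SAX tells you about each element as it goes by and leaves the bookkeeping to you.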


The Structure of XML


Before you get to processing the stuff, you should see what XML looks like and learn a bit about its
nature. First off, know that XML, while called a language, isn't a language in the same sense as Java. XML
is a storage format, and it offers no processing capabilities of its own. It has no looping structures and
no way to declare variables or data types (a program might treat a bit of XML as a variable or data type,
but that's not the same as what Java does). So, XML is really just text that has been organized in a
particular way.
The root of any XML document is a single element. That element can have any number of other
elements as children, and each of these children can have any number of children, and so on, resulting
in a hierarchical structure of arbitrary complexity and depth (which is to say that an XML document can
be of any size and have elements nested to any depth). Also, each element can have any number of
attributes. However, attributes cannot have children, so most of the content, in most XML documents,
comes from the elements.
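If you would like to see that hierarchy from Java's point of view, the following sketch parses a small document with DOM and prints one line per element, indented by nesting depth, with each element's attributes listed beside it. Treat it as an illustration under stated assumptions rather than the chapter's own code: the XmlStructureSketch class, the outline method, and the library, shelf, and book elements are all invented for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class used only for illustration.
public class XmlStructureSketch {

    // Parses the XML text and returns an indented outline of its elements.
    public static String outline(String xml) throws Exception {
        Element root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        StringBuilder out = new StringBuilder();
        describe(root, 0, out);
        return out.toString();
    }

    // Writes one line for this element, then recurses into its child elements.
    private static void describe(Element element, int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(element.getTagName());

        // Attributes belong to the element; they have no children of their own.
        NamedNodeMap attrs = element.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Node attr = attrs.item(i);
            out.append(" @").append(attr.getNodeName())
               .append('=').append(attr.getNodeValue());
        }
        out.append('\n');

        // Child nodes can also be text, so skip anything that is not an element.
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                describe((Element) child, depth + 1, out);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<library name=\"home\">"
                + "<shelf id=\"1\"><book>Java</book><book>XML</book></shelf>"
                + "<shelf id=\"2\"/>"
                + "</library>";
        System.out.print(outline(xml));
    }
}
```

The single root element (library) sits at the top, its children are indented beneath it, and the attributes appear only alongside the elements that own them.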
Before going any further, take a look at the smallest possible XML file.


© Jay Bryant 2012
