THE Java™ Programming Language, Fourth Edition


  • TT_WORD: A word was scanned. The String field sval contains the word that was found.

  • TT_NUMBER: A number was scanned. The double field nval contains the value of the number.
    Only decimal floating-point numbers (with or without a decimal point) are recognized. The tokenizer
    does not understand 3.4e79 as a floating-point number, nor 0xffff as a hexadecimal number.




  • TT_EOL: An end-of-line was found.

  • TT_EOF: The end-of-file was reached.
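
To see these token types in practice, a small dump routine can help. What follows is only a sketch: the class name TokenDump and the method describe are invented for illustration, and the sketch turns on eolIsSignificant so that TT_EOL is reported (by default line ends are treated as plain whitespace).

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of the original text): lists the tokens in a string.
final class TokenDump {
    static List<String> describe(String text) {
        StreamTokenizer in = new StreamTokenizer(new StringReader(text));
        in.eolIsSignificant(true);      // off by default; needed to see TT_EOL
        List<String> tokens = new ArrayList<>();
        try {
            while (in.nextToken() != StreamTokenizer.TT_EOF) {
                switch (in.ttype) {
                    case StreamTokenizer.TT_WORD:
                        tokens.add("WORD:" + in.sval);
                        break;
                    case StreamTokenizer.TT_NUMBER:
                        tokens.add("NUMBER:" + in.nval);
                        break;
                    case StreamTokenizer.TT_EOL:
                        tokens.add("EOL");
                        break;
                    default:            // ordinary (or quote) character
                        tokens.add("CHAR:" + (char) in.ttype);
                }
            }
        } catch (IOException e) {
            // Reading from a StringReader is not expected to fail.
            throw new UncheckedIOException(e);
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(describe("3.4e79"));
        System.out.println(describe("0xffff"));
    }
}
```

With this sketch, the input 3.4e79 comes back as the number 3.4 followed by the word e79, and 0xffff as the number 0.0 followed by the word xffff, which is the limitation described for TT_NUMBER above.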


The input text is assumed to consist of bytes in the range \u0000 to \u00FF; Unicode characters outside this
range are not handled correctly. Input is composed of both special and ordinary characters. Special characters
are those that the tokenizer treats specially, namely whitespace, characters that make up numbers, characters
that make up words, and so on. Any other character is considered ordinary. When an ordinary character is the
next in the input, its token type is itself. For example, if the character '¿' is encountered in the input and is
not special, the token return type (and the ttype field) is the int value of the character '¿'.
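
The rule for ordinary characters can be checked with a short experiment. This is a sketch under stated assumptions: the class OrdinaryCharDemo and the method tokenTypes are hypothetical names, and the example uses '=' because it is neither a word, number, whitespace, quote, nor comment character in a freshly constructed StreamTokenizer.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of the original text).
final class OrdinaryCharDemo {
    // Collects the ttype value of every token in the text.
    static List<Integer> tokenTypes(String text) {
        StreamTokenizer in = new StreamTokenizer(new StringReader(text));
        List<Integer> types = new ArrayList<>();
        try {
            while (in.nextToken() != StreamTokenizer.TT_EOF)
                types.add(in.ttype);
        } catch (IOException e) {
            // Reading from a StringReader is not expected to fail.
            throw new UncheckedIOException(e);
        }
        return types;
    }

    public static void main(String[] args) {
        List<Integer> types = tokenTypes("x = y");
        int second = types.get(1);      // the token type of the ordinary char
        System.out.println(second == '=');
        System.out.println((char) second);
    }
}
```

For the input x = y the second token type equals the int value of '=', so code can compare ttype directly against a character literal.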


As one example, let's look at a method that sums the numeric values in a character stream it is given:


static double sumStream(Reader source) throws IOException {
    StreamTokenizer in = new StreamTokenizer(source);
    double result = 0.0;
    while (in.nextToken() != StreamTokenizer.TT_EOF) {
        if (in.ttype == StreamTokenizer.TT_NUMBER)
            result += in.nval;
    }
    return result;
}


We create a StreamTokenizer object from the reader and then loop, reading tokens from the stream,
adding all the numbers found into the burgeoning result. When we get to the end of the input, we return the
final sum.
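
One way to try sumStream is to feed it a StringReader. The sketch below repeats the method so that it compiles on its own; the wrapper class SumStreamDemo, the helper sumOf, and the sample input are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical wrapper class (not part of the original text).
final class SumStreamDemo {
    // sumStream as shown in the text: adds up every number token.
    static double sumStream(Reader source) throws IOException {
        StreamTokenizer in = new StreamTokenizer(source);
        double result = 0.0;
        while (in.nextToken() != StreamTokenizer.TT_EOF) {
            if (in.ttype == StreamTokenizer.TT_NUMBER)
                result += in.nval;
        }
        return result;
    }

    // Hypothetical convenience method: sums the numbers in a string.
    static double sumOf(String text) {
        try {
            return sumStream(new StringReader(text));
        } catch (IOException e) {
            // Reading from a StringReader is not expected to fail.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Words are skipped; only 10 and 2.5 contribute to the sum.
        System.out.println(sumOf("10 apples and 2.5 pears")); // prints 12.5
    }
}
```

Because sumStream accepts any Reader, the same method works unchanged on a FileReader or an InputStreamReader.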


Here is another example that reads an input source, looking for attributes of the form name=value, and
stores them as attributes in AttributedImpl objects, described in "Implementing Interfaces" on page 127:


public static Attributed readAttrs(Reader source)
    throws IOException
{
    StreamTokenizer in = new StreamTokenizer(source);
    AttributedImpl attrs = new AttributedImpl();
    Attr attr = null;
    in.commentChar('#');    // '#' is ignore-to-end comment
    in.ordinaryChar('/');   // was original comment char
    while (in.nextToken() != StreamTokenizer.TT_EOF) {
        if (in.ttype == StreamTokenizer.TT_WORD) {
            if (attr != null) {
                attr.setValue(in.sval);
                attr = null;        // used this one up
            } else {
                attr = new Attr(in.sval);
                attrs.add(attr);
            }
        } else if (in.ttype == '=') {
            if (attr == null)
                throw new IOException("misplaced '='");
        } else {
            if (attr == null)       // expected a word
                throw new IOException("bad Attr name");
            attr.setValue(new Double(in.nval));
            attr = null;
