20.10. A Taste of New I/O
The java.nio package ("New I/O") and its subpackages give you access to high-performance I/O, albeit
with more complexity. Instead of a simple stream model, you have control over buffers, channels, and other
abstractions that let you get maximum speed for your I/O needs. This approach is recommended only for those
who have a demonstrated need.
The model for rapid I/O is to use buffers to move data of primitive types through channels. Buffers are
containers for data and are associated with channels that connect to external data sources. There are buffer
types for every primitive type except boolean: A FloatBuffer works with float values, for example. The
ByteBuffer is more general; it can handle any of those primitive types with methods such as getFloat and putLong.
MappedByteBuffer helps you map a large file into memory for quick access. You can use character set
decoders and encoders to translate buffers of bytes to and from Unicode.
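As a rough illustration of the buffer operations just described, the following sketch writes a long and a float into a ByteBuffer and reads them back, then round-trips a string through a character set. The class name BufferSketch and its method names are made up for this example, and UTF-8 is an arbitrary choice of encoding.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;

// Hypothetical demo class; not part of java.nio.
public class BufferSketch {
    // Writes a long and a float into a ByteBuffer, then reads them back.
    public static String typedRoundTrip() {
        ByteBuffer bb = ByteBuffer.allocate(12); // 8 bytes for the long + 4 for the float
        bb.putLong(42L);
        bb.putFloat(1.5f);
        bb.flip();                  // switch the buffer from writing to reading
        long l = bb.getLong();
        float f = bb.getFloat();
        return l + " " + f;
    }

    // Encodes text to bytes and decodes it back using the same charset.
    public static String charsetRoundTrip(String text) {
        Charset utf8 = Charset.forName("UTF-8"); // assumed encoding for the example
        ByteBuffer bytes = utf8.encode(text);
        CharBuffer chars = utf8.decode(bytes);
        return chars.toString();
    }

    public static void main(String[] args) {
        System.out.println(typedRoundTrip());           // prints "42 1.5"
        System.out.println(charsetRoundTrip("caf\u00e9"));
    }
}
```

The call to flip is the step that is easiest to forget: it sets the limit to the current position and resets the position to zero, so that the values just written can be read.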
Channels come from objects that access external data, namely files and sockets. FileInputStream has a
getChannel method that returns a channel for that stream, as do RandomAccessFile,
java.net.Socket, and others.
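A minimal sketch of obtaining a channel from a stream and reading through it might look like the following. The class ChannelSketch, the method countBytes, and the 4096-byte buffer size are illustrative assumptions, and the try-with-resources statement assumes Java 7 or later.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical demo class; not part of java.nio.
public class ChannelSketch {
    // Counts the bytes in a file by reading it through the stream's channel.
    public static long countBytes(String path) throws IOException {
        try (FileInputStream fis = new FileInputStream(path);
             FileChannel fc = fis.getChannel()) {
            ByteBuffer buf = ByteBuffer.allocate(4096); // arbitrary buffer size
            long total = 0;
            int n;
            while ((n = fc.read(buf)) != -1) { // read returns -1 at end of file
                total += n;
                buf.clear(); // reset position and limit so the buffer can be refilled
            }
            return total;
        }
    }
}
```

A real program would process the contents of buf (after calling flip) before clearing it; here the bytes are only counted to keep the sketch short.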
Here is some code that will let you efficiently access a large text file in a specified encoding:
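import java.io.*;
import java.nio.*;
import java.nio.channels.FileChannel;
import java.nio.charset.*;

public static int count(File file, String charSet, char ch)
    throws IOException
{
    Charset charset = Charset.forName(charSet);
    CharsetDecoder decoder = charset.newDecoder();
    FileInputStream fis = new FileInputStream(file);
    FileChannel fc = fis.getChannel();
    try {
        // Get the file's size and then map it into memory
        long size = fc.size();
        MappedByteBuffer bb =
            fc.map(FileChannel.MapMode.READ_ONLY, 0, size);
        // Decode the bytes into characters. The result can hold
        // fewer chars than the file has bytes, so the loop is
        // bounded by the buffer's length, not the file's size.
        CharBuffer cb = decoder.decode(bb);
        int count = 0;
        for (int i = 0; i < cb.length(); i++)
            if (cb.charAt(i) == ch)
                count++;
        return count;
    } finally {
        // Release the channel even if mapping or decoding fails
        fc.close();
    }
}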
We use a FileInputStream to get a channel for the file. Then we create a mapped buffer for the entire
file. What a "mapped buffer" does may vary with the platform, but for large files (greater than a few tens of
kilobytes) you can assume that it will be at least as efficient as streaming through the data, and almost
certainly much more efficient. We then use the decoder for the specified character set to decode the mapped
bytes, which gives us a CharBuffer from which to read.[4]
[4] Note that there is an unfortunate discrepancy between the ability to map huge files and the
fact that the returned buffer has a capacity that is limited to Integer.MAX_VALUE.
The CharBuffer not only lets you read (decoded) characters from the file, it also acts as a
CharSequence and, therefore, can be used with the regular expression mechanism.
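Because a CharBuffer is a CharSequence, it can be handed directly to a regular expression Matcher. The following sketch counts pattern matches in a buffer; the class CharBufferRegexSketch and the method countMatches are hypothetical names, and the buffer is wrapped around a literal string rather than decoded from a file to keep the example self-contained.

```java
import java.nio.CharBuffer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of java.nio.
public class CharBufferRegexSketch {
    // Counts non-overlapping matches of regex in any CharSequence,
    // including a CharBuffer produced by a CharsetDecoder.
    public static int countMatches(CharSequence text, String regex) {
        Matcher m = Pattern.compile(regex).matcher(text);
        int n = 0;
        while (m.find())
            n++;
        return n;
    }

    public static void main(String[] args) {
        CharBuffer cb = CharBuffer.wrap("one two  three");
        System.out.println(countMatches(cb, "\\w+")); // prints 3
    }
}
```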
In addition to high-performance I/O, the new I/O package provides a different programming model that
supports non-blocking I/O operations. This is an advanced topic well beyond the scope of
this book, but suffice it to say that this allows a small number of threads to efficiently manage a large number