Pitfall - Small reads writes on unbuffered streams are inefficient

Consider the following code to copy one file to another:

import java.io.*;

public class FileCopy {

    public static void main(String[] args) throws Exception {
        try (InputStream is = new FileInputStream(args[0]);
             OutputStream os = new FileOutputStream(args[1])) {
           int octet;
           while ((octet = is.read()) != -1) {
               os.write(octet);
           }
        }
    }
}

(We have deliberated omitted normal argument checking, error reporting and so on because they are not relevant to point of this example.)

If you compile the above code and use it to copy a huge file, you will notice that it is very slow. In fact, it will be at least a couple of orders of magnitude slower than the standard OS file copy utilities.

(Add actual performance measurements here!)

The primary reason that the example above is slow (in the large file case) is that it is performing one-byte reads and one-byte writes on unbuffered byte streams. The simple way to improve performance is to wrap the streams with buffered streams. For example:

import java.io.*;

public class FileCopy {

    public static void main(String[] args) throws Exception {
        try (InputStream is = new BufferedInputStream(
                     new FileInputStream(args[0]));
             OutputStream os = new BufferedOutputStream(
                     new FileOutputStream(args[1]))) {
           int octet;
           while ((octet = is.read()) != -1) {
               os.write(octet);
           }
        }
    }
}

These small changes will improve data copy rate by at least a couple of orders of magnitude, depending on various platform-related factors. The buffered stream wrappers cause the data to be read and written in larger chunks. The instances both have buffers implemented as byte arrays.

With is, data is read from the file into the buffer a few kilobytes at a time. When read() is called, the implementation will typically return a byte from the buffer. It will only read from the underlying input stream if the buffer has been emptied.
The behavior for os is analogous. Calls to os.write(int) write single bytes into the buffer. Data is only written to the output stream when the buffer is full, or when os is flushed or closed.

What about character-based streams?

As you should be aware, Java I/O provides different APIs for reading and writing binary and text data.

InputStream and OutputStream are the base APIs for stream-based binary I/O
Reader and Writer are the base APIs for stream-based text I/O.

For text I/O, BufferedReader and BufferedWriter are the equivalents for BufferedInputStream and BufferedOutputStream.

Why do buffered streams make this much difference?

The real reason that buffered streams help performance is to do with the way that an application talks to the operating system:

Java method in a Java application, or native procedure calls in the JVM’s native runtime libraries are fast. They typically take a couple of machine instructions and have minimal performance impact.
By contrast, JVM runtime calls to the operating system are not fast. They involve something known as a “syscall”. The typical pattern for a syscall is as follows:

1. Put the syscall arguments into registers.
2. Execute a SYSENTER trap instruction.
3. The trap handler switched to privileged state and changes the virtual memory mappings.  Then it dispatches to the code to handle the specific syscall.
4. The syscall handler checks the arguments, taking care that it isn't being told to access memory that the user process should not see.
5. The syscall specific work is performed.  In the case of a `read` syscall, this may involve:
   1. checking that there is data to be read at the file descriptor's current position
   2. calling the file system handler to fetch the required data from disk (or wherever it is stored) into the buffer cache,
   3. copying data from the buffer cache to the JVM-supplied address
   4. adjusting thstream pointerse file descriptor position
6. Return from the syscall.  This entails changing VM mappings again and switching out of privileged state.

As you can imagine, performing a single syscall can thousands of machine instructions. Conservatively, at least two orders of magnitude longer than a regular method call. (Probably three or more.)

Given this, the reason that buffered streams make a big difference is that they drastically reduce the number of syscalls. Instead of doing a syscall for each read() call, the buffered input stream reads a large amount of data into a buffer as required. Most read() calls on the buffered stream do some simple bounds checking and return a byte that was read previously. Similar reasoning applies in the output stream case, and also the character stream cases.

(Some people think that buffered I/O performance comes from the mismatch between the read request size and the size of a disk block, disk rotational latency and things like that. In fact, a modern OS uses a number of strategies to ensure that the application typically doesn’t need to wait for the disk. This is not the real explanation.)

Are buffered streams always a win?

Not always. Buffered streams are definitely a win if your application is going to do lots of “small” reads or writes. However, if your application only needs to perform large reads or writes to / from a large byte[] or char[], then buffered streams will give you no real benefits. Indeed there might even be a (tiny) performance penalty.

Is this the fastest way to copy a file in Java?

No it isn’t. When you use Java’s stream-based APIs to copy a file, you incur the cost of at least one extra memory-to-memory copy of the data. It is possible to avoid this if your use the NIO ByteBuffer and Channel APIs. (Add a link to a separate example here.)

Found a mistake? Have a question or improvement idea? Let me know.

Table Of Contents