Character Streams: Readers and Writers – Java I/O: Part I

20.3 Character Streams: Readers and Writers

A character encoding is a scheme for representing characters. Java programs represent values of the char type internally in the 16-bit Unicode character encoding (UTF-16), but the host platform might use another character encoding to represent and store characters externally. For example, the ASCII (American Standard Code for Information Interchange) character encoding is widely used to represent characters on many platforms. However, ASCII covers only a small subset of the characters in the Unicode standard.

The abstract classes Reader and Writer are the roots of the inheritance hierarchies for streams that read and write Unicode characters using a specific character encoding (Figure 20.3). A reader is an input character stream that implements the Readable interface and reads a sequence of Unicode characters, and a writer is an output character stream that implements the Appendable interface and writes a sequence of Unicode characters. Readers and writers use character encodings (usually called charsets) to convert between external bytes and internal Unicode characters. The same character encoding that was used to write the characters must be used to read those characters. The java.nio.charset.Charset class represents charsets. See the Charset class API documentation for more details.

Figure 20.3 Selected Character Streams in the java.io Package

static Charset forName(String charsetName)

Returns a charset object for the named charset. Selected common charset names are “UTF-8”, “UTF-16”, “US-ASCII”, and “ISO-8859-1”.

static Charset defaultCharset()

Returns the default charset of this Java virtual machine.
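The effect of choosing a charset can be seen by encoding the same string with different charsets. The following is a minimal sketch: the class name CharsetDemo and the method encodedLength() are illustrative names only, and the sketch uses String.getBytes(Charset) to do the encoding.

```java
import java.nio.charset.Charset;

class CharsetDemo {
  // Returns the number of bytes needed to encode the text in the named charset.
  static int encodedLength(String text, String charsetName) {
    Charset charset = Charset.forName(charsetName);    // Look up the charset by name.
    return text.getBytes(charset).length;
  }

  public static void main(String[] args) {
    String text = "caf\u00e9";                         // 4 characters; the last one is outside ASCII.
    System.out.println("Default charset: " + Charset.defaultCharset());
    for (String name : new String[] {"ISO-8859-1", "UTF-8", "UTF-16"}) {
      System.out.println(name + ": " + encodedLength(text, name) + " bytes");
    }
  }
}
```

The same four characters occupy a different number of bytes in each charset, which is why a reader must be told which charset the bytes were written in.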

Table 20.4 and Table 20.5 give an overview of some selected character streams found in the java.io package.

Table 20.4 Selected Readers

Reader              Description
BufferedReader      A buffered reader is a high-level input stream that buffers the characters read from an underlying reader. The underlying reader must be specified, and an optional buffer size can be given.
InputStreamReader   Characters are read from a byte input stream, which must be specified. The default character encoding is used if no character encoding is explicitly specified in the constructor. This class provides the bridge from byte streams to character streams.
FileReader          Characters are read from a file, using the default character encoding, unless an encoding is explicitly specified in the constructor. The file can be specified by a String file name. It automatically creates a FileInputStream that is associated with the file.

Table 20.5 Selected Writers

Writer              Description
BufferedWriter      A buffered writer is a high-level output stream that buffers the characters before writing them to an underlying writer. The underlying writer must be specified, and an optional buffer size can be specified.
OutputStreamWriter  Characters are written to a byte output stream, which must be specified. The default character encoding is used if no explicit character encoding is specified in the constructor. This class provides the bridge from character streams to byte streams.
FileWriter          Characters are written to a file, using the default character encoding, unless an encoding is explicitly specified in the constructor. The file can be specified by a String file name. It automatically creates a FileOutputStream that is associated with the file. A boolean parameter can be specified to indicate whether the file should be overwritten or appended with new content.
PrintWriter         A print writer is a high-level output stream that allows text representation of Java objects and Java primitive values to be written to an underlying output stream or writer. The underlying output stream or writer must be specified. An explicit encoding can be specified in the constructor, and also whether the print writer should do automatic line flushing.
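The bridging role of OutputStreamWriter and InputStreamReader can be sketched by layering them between buffered character streams and in-memory byte streams. The class name BridgeDemo and its methods encode() and decode() are illustrative names only, and the byte array streams stand in for files so that the sketch needs no file system access.

```java
import java.io.*;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class BridgeDemo {
  // Writes the text through a buffered writer that is bridged to a byte stream.
  static byte[] encode(String text, Charset charset) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (BufferedWriter writer =
             new BufferedWriter(new OutputStreamWriter(bytes, charset))) {
      writer.write(text);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }                                        // Closing the writer flushes it.
    return bytes.toByteArray();
  }

  // Reads all characters through a buffered reader that is bridged to a byte stream.
  static String decode(byte[] data, Charset charset) {
    StringBuilder text = new StringBuilder();
    try (BufferedReader reader = new BufferedReader(
             new InputStreamReader(new ByteArrayInputStream(data), charset))) {
      char[] buffer = new char[64];
      int length;
      while ((length = reader.read(buffer)) != -1) {
        text.append(buffer, 0, length);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return text.toString();
  }

  public static void main(String[] args) {
    byte[] data = encode("caf\u00e9", StandardCharsets.UTF_8);
    System.out.println(decode(data, StandardCharsets.UTF_8));        // Same charset
    System.out.println(decode(data, StandardCharsets.ISO_8859_1));   // Mismatched charset
  }
}
```

Decoding with the charset that was used for encoding recovers the original text, whereas decoding with a different charset yields other characters for the bytes outside the ASCII range.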

Readers use the following methods for reading Unicode characters:

int read() throws IOException
int read(char cbuf[]) throws IOException
int read(char cbuf[], int off, int len) throws IOException

Note that the read() methods read the character as an int in the range 0 to 65,535 (0x0000–0xFFFF).

The first method returns the character as an int value. The last two methods store the characters in the specified array and return the number of characters read. The value -1 is returned if the end of the stream has been reached.
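The two styles of reading can be sketched as follows. The class name ReadDemo and its methods are illustrative names only, and a StringReader (a reader in the java.io package whose source is a string) is used in place of a file.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class ReadDemo {
  // Reads one character at a time until read() returns -1.
  static String readOneByOne(Reader reader) {
    StringBuilder result = new StringBuilder();
    try {
      int c;
      while ((c = reader.read()) != -1) {
        result.append((char) c);           // Cast is needed, as read() returns an int.
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return result.toString();
  }

  // Reads into a character array and records how many characters each call stored.
  static List<Integer> chunkSizes(Reader reader, int bufferSize) {
    List<Integer> sizes = new ArrayList<>();
    char[] cbuf = new char[bufferSize];
    try {
      int length;
      while ((length = reader.read(cbuf)) != -1) {
        sizes.add(length);                 // Only cbuf[0] to cbuf[length-1] are valid.
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return sizes;
  }

  public static void main(String[] args) {
    System.out.println(readOneByOne(new StringReader("abc")));
    System.out.println(chunkSizes(new StringReader("abcdefg"), 3));
  }
}
```

Note that a call to read(char[]) need not fill the array, so the returned count must always be used to determine how many characters are valid.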

long skip(long n) throws IOException

A reader can skip over characters using the skip() method, which returns the number of characters actually skipped.
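As a small sketch of skipping, the following code skips a number of characters and reads the rest. The class name SkipDemo and the method readAfterSkipping() are illustrative names only, and a StringReader is used as the source.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class SkipDemo {
  // Skips up to n characters and returns the characters that remain.
  static String readAfterSkipping(String text, long n) {
    StringBuilder rest = new StringBuilder();
    try (Reader reader = new StringReader(text)) {
      long skipped = reader.skip(n);       // May be less than n near the end of the stream.
      System.out.println("Skipped: " + skipped);
      int c;
      while ((c = reader.read()) != -1) {
        rest.append((char) c);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return rest.toString();
  }

  public static void main(String[] args) {
    System.out.println(readAfterSkipping("Hello, world", 7));
  }
}
```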

void close() throws IOException

Like byte streams, a character stream should be closed when no longer needed in order to free system resources.

boolean ready() throws IOException

When called, this method returns true if the next read operation is guaranteed not to block. Returning false does not guarantee that the next read operation will block.

long transferTo(Writer out) throws IOException

Reads all characters from this reader and writes the characters to the specified writer in the order they are read, returning the number of characters transferred. Neither the reader nor the writer is closed after the operation.
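A minimal sketch of transferTo() is shown below, assuming Java 10 or later, where this method is available. The class name TransferDemo and the method copy() are illustrative names only, and string-based streams stand in for files.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

class TransferDemo {
  // Transfers all remaining characters and returns how many were transferred.
  static long copy(Reader in, Writer out) {
    try {
      return in.transferTo(out);           // Neither stream is closed by this call.
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    StringWriter out = new StringWriter();
    long count = copy(new StringReader("hello"), out);
    System.out.println(count + " characters: " + out);
  }
}
```

Since the streams are left open, the caller remains responsible for closing them, for example, by declaring them in a try-with-resources statement.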

Writers use the following methods for writing Unicode characters:

void write(int c) throws IOException

The write() method takes an int as an argument, but writes only the least significant 16 bits.

void write(char[] cbuf) throws IOException
void write(String str) throws IOException
void write(char[] cbuf, int off, int length) throws IOException
void write(String str, int off, int length) throws IOException

Write the characters from an array of characters or a string.
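The write() methods can be sketched as follows. The class name WriteDemo and the method writeAll() are illustrative names only, and a StringWriter (a writer in the java.io package that collects characters in a string buffer) is used as the destination.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

class WriteDemo {
  // Calls each write() method once and returns everything that was written.
  static String writeAll() {
    Writer writer = new StringWriter();
    try {
      writer.write(0x10041);                              // Only the low 16 bits (0x0041) are written.
      writer.write(new char[] {'B', 'C'});                // Whole array
      writer.write("DEFGH", 1, 3);                        // 3 characters from index 1 of the string
      writer.write(new char[] {'x', 'I', 'J', 'x'}, 1, 2); // 2 characters from index 1 of the array
      writer.write("K");                                  // Whole string
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return writer.toString();
  }

  public static void main(String[] args) {
    System.out.println(writeAll());
  }
}
```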

void close() throws IOException
void flush() throws IOException

Like byte streams, a character stream should be closed when no longer needed in order to free system resources. Closing a character output stream automatically flushes the stream. A character output stream can also be manually flushed.
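The effect of flushing and closing can be sketched with a buffered writer whose destination is an in-memory byte stream. The class name FlushDemo and its methods are illustrative names only, and the sketch assumes that the text is much shorter than the buffer of the buffered writer.

```java
import java.io.*;
import java.nio.charset.StandardCharsets;

class FlushDemo {
  // Returns the number of bytes in the destination before and after flush().
  static int[] sizesBeforeAndAfterFlush(String text) {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    try (BufferedWriter writer = new BufferedWriter(
             new OutputStreamWriter(sink, StandardCharsets.US_ASCII))) {
      writer.write(text);                  // Characters are held in the buffer.
      int before = sink.size();
      writer.flush();                      // Forces the buffered characters out.
      int after = sink.size();
      return new int[] {before, after};
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Returns the number of bytes in the destination after the writer is closed.
  static int sizeAfterClose(String text) {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    try (Writer writer = new BufferedWriter(
             new OutputStreamWriter(sink, StandardCharsets.US_ASCII))) {
      writer.write(text);                  // No explicit flush
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return sink.size();                    // Closing has flushed the writer.
  }

  public static void main(String[] args) {
    int[] sizes = sizesBeforeAndAfterFlush("hello");
    System.out.println("Before flush: " + sizes[0] + ", after flush: " + sizes[1]);
    System.out.println("After close: " + sizeAfterClose("hello"));
  }
}
```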

Like byte streams, many methods of the character stream classes throw a checked IOException that a calling method must either catch explicitly or specify in a throws clause. They also implement the AutoCloseable interface, and can thus be declared in a try-with-resources statement (§7.7, p. 407) that will ensure they are automatically closed after use at runtime.

Analogous to Example 20.1, which demonstrates using a byte buffer for writing and reading bytes to and from file streams, Example 20.3 demonstrates using a character buffer for reading characters from one file and writing them to another. Later in this section, we will use buffered writers (p. 1250) and buffered readers (p. 1251) for writing characters to files and reading characters from files, respectively.

Example 20.3 Copying a File Using a Character Buffer

/* Copy a file using a character buffer.
   Command syntax: java CopyCharacterFile <from_file> <to_file> */
import java.io.*;
class CopyCharacterFile {
  public static void main(String[] args) {
    try (// Assign the files:
        FileReader fromFile = new FileReader(args[0]);                 // (1)
        FileWriter toFile = new FileWriter(args[1]))  {                // (2)
      // Copy characters using buffer:                                 // (3a)
      char[] buffer = new char[1024];
      int length = 0;
      while((length = fromFile.read(buffer)) != -1) {
        toFile.write(buffer, 0, length);
      }
      // Transfer characters:
//    fromFile.transferTo(toFile);                                     // (3b)
    } catch(ArrayIndexOutOfBoundsException e) {
      System.err.println("Usage: java CopyCharacterFile <from_file> <to_file>");
    } catch(FileNotFoundException e) {
      System.err.println("File could not be copied: " + e);
    } catch(IOException e) {
      System.err.println("I/O error.");
    }
  }
}
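Example 20.3 relies on the default character encoding for both files. As a variation, the following sketch specifies the encodings explicitly, so that a file can be copied from one encoding to another. It assumes Java 11 or later, where the FileReader and FileWriter constructors accept a Charset. The class name TranscodeFile and the method copy() are illustrative names only.

```java
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.nio.charset.Charset;

class TranscodeFile {
  // Copies a text file, decoding with one charset and encoding with another.
  static void copy(String fromName, Charset fromCharset,
                   String toName, Charset toCharset) throws IOException {
    try (FileReader fromFile = new FileReader(fromName, fromCharset);
         FileWriter toFile = new FileWriter(toName, toCharset)) {
      char[] buffer = new char[1024];
      int length;
      while ((length = fromFile.read(buffer)) != -1) {
        toFile.write(buffer, 0, length);
      }
    }
  }

  // Command syntax:
  //   java TranscodeFile <from_file> <from_charset> <to_file> <to_charset>
  public static void main(String[] args) {
    if (args.length != 4) {
      System.err.println(
          "Usage: java TranscodeFile <from_file> <from_charset> <to_file> <to_charset>");
      return;
    }
    try {
      copy(args[0], Charset.forName(args[1]), args[2], Charset.forName(args[3]));
    } catch (IOException e) {
      System.err.println("File could not be copied: " + e);
    }
  }
}
```

Characters that cannot be represented in the destination charset are not preserved by such a copy, so the destination charset should be chosen to cover the characters in the source file.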