Reading a file line by line in Java with BufferedReader

Reading files in Java is the cause of a lot of confusion. There are multiple ways of accomplishing the same task and it's often not clear which file reading method is best to use. Something that's quick and dirty for a small example file might not be the best method to use when you need to read a very large file. Something that worked in an earlier Java version might not be the preferred method anymore.

This article aims to be the definitive guide for reading files in Java 7, 8 and 9. I'm going to cover all the ways you can read files in Java. Too often, you'll read an article that tells you one way to read a file, only to discover later that there are other ways to do that. I'm actually going to cover 15 different ways to read a file in Java. I'm going to cover reading files in multiple ways with the core Java libraries as well as two third party libraries.

But that's not all – what good is knowing how to do something in multiple ways if you don't know which way is best for your situation?

I also put each of these methods to a real performance test and document the results. That way, you will have some hard data on the performance of each method.

Methodology

JDK Versions

Java code samples don't live in isolation, especially when it comes to Java I/O, as the API keeps evolving. All code for this article has been tested on:

  • Java SE 7 (jdk1.7.0_80)
  • Java SE 8 (jdk1.8.0_162)
  • Java SE 9 (jdk-9.0.4)

When there is an incompatibility, it will be stated in that section. Otherwise, the code works unaltered for different Java versions. The main incompatibility is the use of lambda expressions, which were introduced in Java 8.

Java File Reading Libraries

There are multiple ways of reading from files in Java. This article aims to be a comprehensive collection of all the different methods. I will cover:

  • java.io.FileReader.read()
  • java.io.BufferedReader.readLine()
  • java.io.FileInputStream.read()
  • java.io.BufferedInputStream.read()
  • java.nio.file.Files.readAllBytes()
  • java.nio.file.Files.readAllLines()
  • java.nio.file.Files.lines()
  • java.util.Scanner.nextLine()
  • org.apache.commons.io.FileUtils.readLines() – Apache Commons
  • com.google.common.io.Files.readLines() – Google Guava

Closing File Resources

Prior to JDK7, when opening a file in Java, all file resources had to be closed manually using a try-catch-finally block. JDK7 introduced the try-with-resources statement, which simplifies the process of closing streams. You no longer need to write explicit code to close streams because the stream is closed automatically for you, whether an exception occurred or not. All examples in this article use the try-with-resources statement for importing, loading, parsing and closing files.
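To make the difference concrete, here is a minimal sketch that reads the first line of a file both ways. The class name CloseResourcesSketch and its two helper methods are made up for this illustration and are not part of the downloadable code.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

// Hypothetical class, for illustration only.
class CloseResourcesSketch {

    // Pre-JDK7 style: the reader has to be closed by hand in a finally block.
    static String firstLineManualClose(String fileName) throws IOException {
        BufferedReader reader = null;
        try {
            reader = new BufferedReader(new FileReader(fileName));
            return reader.readLine();
        } finally {
            if (reader != null) {
                reader.close();
            }
        }
    }

    // JDK7+ style: the reader is closed automatically when the try block exits.
    static String firstLineAutoClose(String fileName) throws IOException {
        try (BufferedReader reader = new BufferedReader(new FileReader(fileName))) {
            return reader.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        String fileName = "c:\\temp\\sample-10KB.txt"; // same test file as the other examples
        System.out.println(firstLineManualClose(fileName));
        System.out.println(firstLineAutoClose(fileName));
    }
}
```

Both methods return the same result; the second one simply has less code that can go wrong.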

File Location

All examples will read test files from C:\temp.

Encoding

Character encoding is not explicitly saved with text files, so Java makes assumptions about the encoding when reading files. Usually, the assumption is correct, but sometimes you want to be explicit when instructing your programs to read from files. When the encoding isn't correct, you'll see funny characters appear when reading files.

All examples for reading text files use two encoding variations: the default system encoding, where no encoding is specified, and explicitly setting the encoding to UTF-8.
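As a rough illustration of where the "funny characters" come from, the sketch below encodes a short string as UTF-8 and then decodes the same bytes with two different charsets. The class and method names are hypothetical, and the sample word is just an arbitrary string containing one non-ASCII character.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical class, for illustration only.
class EncodingMismatchSketch {

    // Encodes the text as UTF-8 bytes, then decodes those bytes with the given charset.
    static String decodeUtf8BytesAs(String text, Charset decodeWith) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, decodeWith);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // "cafe" with an accented e, written as an escape

        // Decoding with the charset that was used for encoding gives the text back.
        String correct = decodeUtf8BytesAs(original, StandardCharsets.UTF_8);

        // Decoding with a different charset turns the two-byte accented e
        // into two unrelated characters.
        String garbled = decodeUtf8BytesAs(original, StandardCharsets.ISO_8859_1);

        System.out.println(original.equals(correct));
        System.out.println(original.equals(garbled));
        System.out.println(original.length() + " vs " + garbled.length());
    }
}
```

The same thing happens when a file saved as UTF-8 is read with a default system encoding that is not UTF-8, which is why the explicit variations are included.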

Download Code

All code files are available from Github.

Code Quality and Code Encapsulation

There is a difference between writing code for your personal or work project and writing code to explain and teach concepts.

If I were writing this code for my own project, I would use proper object-oriented principles like encapsulation, abstraction, polymorphism, etc. But I wanted to make each example stand alone and be easily understood, which meant that some of the code has been copied from one example to the next. I did this on purpose because I didn't want the reader to have to figure out all the encapsulation and object structures I so cleverly created. That would take away from the examples.

For the same reason, I chose NOT to write these examples with a unit testing framework like JUnit or TestNG because that's not the purpose of this article. That would add another library for the reader to understand that has nothing to do with reading files in Java. That's why all the examples are written inline inside the main method, without extra methods or classes.

My main purpose is to make the examples as easy to understand as possible and I believe that having extra unit testing and encapsulation code will not help with this. That doesn't mean that's how I would encourage you to write your own personal code. It's just the way I chose to write the examples in this article to make them easier to understand.

Exception Handling

All examples declare any checked exceptions in the throws clause of the method declaration.

The purpose of this article is to show all the different ways to read from files in Java – it's not meant to show how to handle exceptions, which will be very specific to your situation.

So instead of creating unhelpful try-catch blocks that just print exception stack traces and clutter up the code, all examples declare any checked exception in the calling method. This makes the code cleaner and easier to understand without sacrificing any functionality.

Future Updates

As Java file reading evolves, I will be updating this article with any required changes.

File Reading Methods

I organized the file reading methods into 3 groups:

  • Classic I/O classes that have been part of Java since before JDK 1.7. This includes the java.io and java.util packages.
  • New Java I/O classes that have been part of Java since JDK 1.7. This covers the java.nio.file.Files class.
  • Third party I/O classes from the Apache Commons and Google Guava projects.

Classic I/O – Reading Text

1a) FileReader – Default Encoding

FileReader reads in one character at a time, without any buffering. It's meant for reading text files. It uses the default character encoding on your system, so I have provided examples for both the default case and for specifying the encoding explicitly.

          

import java.io.FileReader;
import java.io.IOException;

public class ReadFile_FileReader_Read {
    public static void main(String[] pArgs) throws IOException {
        String fileName = "c:\\temp\\sample-10KB.txt";

        try (FileReader fileReader = new FileReader(fileName)) {
            int singleCharInt;
            char singleChar;
            while ((singleCharInt = fileReader.read()) != -1) {
                singleChar = (char) singleCharInt;

                //display one character at a time
                System.out.print(singleChar);
            }
        }
    }
}

1b) FileReader – Explicit Encoding (InputStreamReader)

It's actually not possible to set the encoding explicitly on a FileReader, so you have to use the parent class, InputStreamReader, and wrap it around a FileInputStream:

          

import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;

public class ReadFile_FileReader_Read_Encoding {
    public static void main(String[] pArgs) throws IOException {
        String fileName = "c:\\temp\\sample-10KB.txt";
        FileInputStream fileInputStream = new FileInputStream(fileName);

        //specify UTF-8 encoding explicitly
        try (InputStreamReader inputStreamReader =
                new InputStreamReader(fileInputStream, "UTF-8")) {

            int singleCharInt;
            char singleChar;
            while ((singleCharInt = inputStreamReader.read()) != -1) {
                singleChar = (char) singleCharInt;
                System.out.print(singleChar); //display one character at a time
            }
        }
    }
}

2a) BufferedReader – Default Encoding

BufferedReader reads an entire line at a time, instead of one character at a time like FileReader. It's meant for reading text files.

          

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class ReadFile_BufferedReader_ReadLine {
    public static void main(String[] args) throws IOException {
        String fileName = "c:\\temp\\sample-10KB.txt";
        FileReader fileReader = new FileReader(fileName);

        try (BufferedReader bufferedReader = new BufferedReader(fileReader)) {
            String line;
            //readLine() returns null at the end of the file
            while ((line = bufferedReader.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}

2b) BufferedReader – Explicit Encoding

In a similar way to how we set the encoding explicitly for FileReader, we need to create a FileInputStream, wrap it inside an InputStreamReader with an explicit encoding and pass that to BufferedReader:

          

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;

public class ReadFile_BufferedReader_ReadLine_Encoding {
    public static void main(String[] args) throws IOException {
        String fileName = "c:\\temp\\sample-10KB.txt";

        FileInputStream fileInputStream = new FileInputStream(fileName);

        //specify UTF-8 encoding explicitly
        InputStreamReader inputStreamReader = new InputStreamReader(fileInputStream, "UTF-8");

        try (BufferedReader bufferedReader = new BufferedReader(inputStreamReader)) {
            String line;
            while ((line = bufferedReader.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}

Classic I/O – Reading Bytes

1) FileInputStream

FileInputStream reads in one byte at a time, without any buffering. While it's meant for reading binary files such as images or sound files, it can still be used to read text files. It's similar to reading with FileReader in that each read() returns an integer that you need to cast to a char to see the character. The difference is that the integer is a raw byte, not a decoded character.

Because FileInputStream works on raw bytes, it applies no character encoding at all. Casting each byte to a char only gives the right text for single-byte characters such as ASCII, so there is no separate explicit-encoding example for this method.

          

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;

public class ReadFile_FileInputStream_Read {
    public static void main(String[] pArgs) throws FileNotFoundException, IOException {
        String fileName = "c:\\temp\\sample-10KB.txt";
        File file = new File(fileName);

        try (FileInputStream fileInputStream = new FileInputStream(file)) {
            int singleCharInt;
            char singleChar;

            //read() returns one byte (0-255) or -1 at the end of the file
            while ((singleCharInt = fileInputStream.read()) != -1) {
                singleChar = (char) singleCharInt;
                System.out.print(singleChar);
            }
        }
    }
}

2) BufferedInputStream

BufferedInputStream reads a set of bytes all at once into an internal byte array buffer. The buffer size can be set explicitly or left at the default, which is what we'll demonstrate in our example. The default buffer size appears to be 8KB but I have not explicitly verified this. All performance tests used the default buffer size so it will automatically re-size the buffer when it needs to.

          

import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;

public class ReadFile_BufferedInputStream_Read {
    public static void main(String[] pArgs) throws FileNotFoundException, IOException {
        String fileName = "c:\\temp\\sample-10KB.txt";
        File file = new File(fileName);
        FileInputStream fileInputStream = new FileInputStream(file);

        try (BufferedInputStream bufferedInputStream = new BufferedInputStream(fileInputStream)) {
            int singleCharInt;
            char singleChar;
            //each read() is served from the internal buffer, not from the disk
            while ((singleCharInt = bufferedInputStream.read()) != -1) {
                singleChar = (char) singleCharInt;
                System.out.print(singleChar);
            }
        }
    }
}
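The example above relies on the default buffer size. If you want to set it explicitly, BufferedInputStream has a second constructor that takes the size in bytes. The sketch below is only an illustration: the class name, the countBytes helper and the 64KB value are arbitrary choices, and this variant was not part of the performance tests.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;

// Hypothetical class, for illustration only.
class BufferSizeSketch {

    // Counts the bytes in a file, reading through a buffer of the given size.
    static long countBytes(String fileName, int bufferSizeInBytes) throws IOException {
        long count = 0;
        try (BufferedInputStream in =
                new BufferedInputStream(new FileInputStream(fileName), bufferSizeInBytes)) {
            while (in.read() != -1) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        String fileName = "c:\\temp\\sample-10KB.txt"; // same test file as the other examples
        int bufferSize = 64 * 1024;                    // arbitrary example value
        System.out.println(countBytes(fileName, bufferSize) + " bytes read");
    }
}
```

The result is the same whatever size you pick; only the number of reads against the underlying file changes.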

New I/O – Reading Text

1a) Files.readAllLines() – Default Encoding

The Files class is part of the new Java I/O classes introduced in JDK 1.7. It only has static utility methods for working with files and directories.

The readAllLines(Path) overload that takes no charset argument was introduced in JDK 1.8, so this example will not work in Java 7. Note that this overload decodes the file as UTF-8 rather than using the system default encoding.

          

import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.List;

public class ReadFile_Files_ReadAllLines {
    public static void main(String[] pArgs) throws IOException {
        String fileName = "c:\\temp\\sample-10KB.txt";
        File file = new File(fileName);

        //loads every line of the file into memory at once
        List<String> fileLinesList = Files.readAllLines(file.toPath());

        for (String line : fileLinesList) {
            System.out.println(line);
        }
    }
}

1b) Files.readAllLines() – Explicit Encoding

          

import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.List;

public class ReadFile_Files_ReadAllLines_Encoding {
    public static void main(String[] pArgs) throws IOException {
        String fileName = "c:\\temp\\sample-10KB.txt";
        File file = new File(fileName);

        //use UTF-8 encoding
        List<String> fileLinesList = Files.readAllLines(file.toPath(), StandardCharsets.UTF_8);

        for (String line : fileLinesList) {
            System.out.println(line);
        }
    }
}

2a) Files.lines() – Default Encoding

This code was tested to work in Java 8 and 9. It does not run on Java 7 because lambda expressions, and Files.lines() itself, were only added in Java 8.

          

import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.stream.Stream;

public class ReadFile_Files_Lines {
    public static void main(String[] pArgs) throws IOException {
        String fileName = "c:\\temp\\sample-10KB.txt";
        File file = new File(fileName);

        //the stream holds the file open, so it is closed by try-with-resources
        try (Stream<String> linesStream = Files.lines(file.toPath())) {
            linesStream.forEach(line -> {
                System.out.println(line);
            });
        }
    }
}

2b) Files.lines() – Explicit Encoding

Just like the previous example, this code was tested and works in Java 8 and 9 but not in Java 7.

          

import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.stream.Stream;

public class ReadFile_Files_Lines_Encoding {
    public static void main(String[] pArgs) throws IOException {
        String fileName = "c:\\temp\\sample-10KB.txt";
        File file = new File(fileName);

        //use UTF-8 encoding
        try (Stream<String> linesStream = Files.lines(file.toPath(), StandardCharsets.UTF_8)) {
            linesStream.forEach(line -> {
                System.out.println(line);
            });
        }
    }
}

3a) Scanner – Default Encoding

The Scanner class was introduced in JDK 1.5 and can be used to read from files or from the console (user input).

          

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

public class ReadFile_Scanner_NextLine {
    public static void main(String[] pArgs) throws FileNotFoundException {
        String fileName = "c:\\temp\\sample-10KB.txt";
        File file = new File(fileName);

        try (Scanner scanner = new Scanner(file)) {
            String line;
            //hasNextLine() returns false once the end of the file is reached
            while (scanner.hasNextLine()) {
                line = scanner.nextLine();
                System.out.println(line);
            }
        }
    }
}

3b) Scanner – Explicit Encoding

          

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

public class ReadFile_Scanner_NextLine_Encoding {
    public static void main(String[] pArgs) throws FileNotFoundException {
        String fileName = "c:\\temp\\sample-10KB.txt";
        File file = new File(fileName);

        //use UTF-8 encoding
        try (Scanner scanner = new Scanner(file, "UTF-8")) {
            String line;
            while (scanner.hasNextLine()) {
                line = scanner.nextLine();
                System.out.println(line);
            }
        }
    }
}

New I/O – Reading Bytes

Files.readAllBytes()

Even though the documentation for this method states that "it is not intended for reading in large files", I found this to be the absolute best performing file reading method, even on files as big as 1GB.

          

import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class ReadFile_Files_ReadAllBytes {
    public static void main(String[] pArgs) throws IOException {
        String fileName = "c:\\temp\\sample-10KB.txt";
        File file = new File(fileName);

        //reads the whole file into a single byte array
        byte[] fileBytes = Files.readAllBytes(file.toPath());
        char singleChar;
        for (byte b : fileBytes) {
            singleChar = (char) b;
            System.out.print(singleChar);
        }
    }
}
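Casting each byte to a char, as the example above does, only gives correct text for single-byte characters. If the file is text, one alternative is to decode the whole byte array in one step with an explicit charset. This is a minimal sketch; the class and method names are hypothetical and this variant was not part of the performance tests.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical class, for illustration only.
class ReadAllBytesAsTextSketch {

    // Reads the whole file into memory and decodes it as UTF-8 text.
    static String readUtf8(Path path) throws IOException {
        byte[] fileBytes = Files.readAllBytes(path);
        return new String(fileBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path path = Paths.get("c:\\temp\\sample-10KB.txt"); // same test file as the other examples
        System.out.print(readUtf8(path));
    }
}
```

Like readAllBytes() itself, this keeps the entire file in memory, so it only suits files that comfortably fit in the heap.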

Third Party I/O – Reading Text

Commons – FileUtils.readLines()

Apache Commons IO is an open source Java library that comes with utility classes for reading and writing text and binary files. I listed it in this article because it can be used instead of the built-in Java libraries. The class we're using is FileUtils.

For this article, version 2.6 was used, which is compatible with JDK 1.7+.

Note that you need to specify the encoding explicitly, because the method that uses the default encoding has been deprecated.

          

import java.io.File;
import java.io.IOException;
import java.util.List;

import org.apache.commons.io.FileUtils;

public class ReadFile_Commons_FileUtils_ReadLines {
    public static void main(String[] pArgs) throws IOException {
        String fileName = "c:\\temp\\sample-10KB.txt";
        File file = new File(fileName);

        //the encoding has to be passed explicitly
        List<String> fileLinesList = FileUtils.readLines(file, "UTF-8");

        for (String line : fileLinesList) {
            System.out.println(line);
        }
    }
}

Guava – Files.readLines()

Google Guava is an open source library that comes with utility classes for common tasks like collections handling, cache management, IO operations and string processing.

I listed it in this article because it can be used instead of the built-in Java libraries and I wanted to compare its performance with the built-in Java libraries.

For this article, version 23.0 was used.

I'm not going to examine all the different ways to read files with Guava, since this article is not meant for that. For a more detailed look at all the different ways to read and write files with Guava, have a look at Baeldung's in depth article.

When reading a file, Guava requires that the character encoding be set explicitly, just like Apache Commons.

Compatibility note: This code was tested successfully on Java 8 and 9. I couldn't get it to work on Java 7 and kept getting an "Unsupported major.minor version 52.0" error. Class file version 52.0 corresponds to Java 8, so the error means the Guava jar I used was compiled for Java 8. Guava has a separate API doc for Java 7 which uses a slightly different version of the Files.readLines() method. I thought I could get it to work but I kept getting that error.

          

import java.io.File;
import java.io.IOException;
import java.util.List;

import com.google.common.base.Charsets;
import com.google.common.io.Files;

public class ReadFile_Guava_Files_ReadLines {
    public static void main(String[] args) throws IOException {
        String fileName = "c:\\temp\\sample-10KB.txt";
        File file = new File(fileName);

        //this is Guava's Files class, not java.nio.file.Files
        List<String> fileLinesList = Files.readLines(file, Charsets.UTF_8);

        for (String line : fileLinesList) {
            System.out.println(line);
        }
    }
}

Performance Testing

Since there are so many ways to read from a file in Java, a natural question is "What file reading method is the best for my situation?" So I decided to test each of these methods against each other using sample data files of different sizes and timing the results.

Each code sample from this article displays the contents of the file to a string and then to the console (System.out). However, during the performance tests the System.out line was commented out since it would seriously slow down the performance of each method.

Each performance test measures the time it takes to read in the file – line by line, character by character, or byte by byte – without displaying anything to the console. I ran each test 5-10 times and took the average so as not to let any outliers influence each test. I also ran only the default encoding version of each file reading method – i.e. I didn't specify the encoding explicitly.
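The timing code itself isn't shown in this article, so the following is only a sketch of how such a measurement could look, using BufferedReader.readLine() as the method under test. The class name, the helper names and the use of System.nanoTime() are assumptions made for this illustration.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

// Hypothetical timing harness, for illustration only.
class ReadTimingSketch {

    // Reads the file line by line without printing anything; returns the line count.
    static long countLines(String fileName) throws IOException {
        long lines = 0;
        try (BufferedReader reader = new BufferedReader(new FileReader(fileName))) {
            while (reader.readLine() != null) {
                lines++;
            }
        }
        return lines;
    }

    // Runs the read several times and returns the average duration in milliseconds.
    static double averageMillis(String fileName, int runs) throws IOException {
        long totalNanos = 0;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            countLines(fileName);
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / (double) runs / 1_000_000.0;
    }

    public static void main(String[] args) throws IOException {
        String fileName = "c:\\temp\\sample-10KB.txt"; // same test file as the other examples
        System.out.println("average ms: " + averageMillis(fileName, 10));
    }
}
```

A simple loop like this doesn't account for JVM warm-up or file system caching, so treat any numbers it produces as rough comparisons only.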

Dev Setup

The dev environment used for these tests:

  • Intel Core i7-3615QM @ 2.3 GHz, 8GB RAM
  • Windows 8 x64
  • Eclipse IDE for Java Developers, Oxygen.2 Release (4.7.2)
  • Java SE 9 (jdk-9.0.4)

Data Files

GitHub doesn't allow pushing files larger than 100 MB, so I couldn't find a practical way to store my large test files to allow others to replicate my tests. So instead of storing them, I'm providing the tools I used to generate them so you can create test files that are similar in size to mine. Obviously they won't be identical, but you'll generate files that are similar in size to the ones I used in my performance tests.

Random String Generator was used to generate sample text and then I simply copy-pasted to create larger versions of the file. When the file started getting too large to manage inside a text editor, I had to use the command line to merge multiple text files into a larger text file:

copy *.txt sample-1GB.txt
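If you would rather generate a test file from code, a small program can write random text until a target size is reached. This is not the tool used for the tests in this article; the class name, the 80-character line length and the fixed seed are all arbitrary choices for this sketch.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Random;

// Hypothetical generator, for illustration only.
class SampleFileGeneratorSketch {

    // Writes lines of random lowercase letters until at least minBytes have been written.
    // Each line is 79 letters plus a newline, i.e. 80 bytes in ASCII.
    static void generate(Path target, long minBytes, long seed) throws IOException {
        Random random = new Random(seed);
        long written = 0;
        try (BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.US_ASCII)) {
            while (written < minBytes) {
                StringBuilder line = new StringBuilder(80);
                for (int i = 0; i < 79; i++) {
                    line.append((char) ('a' + random.nextInt(26)));
                }
                line.append('\n');
                writer.write(line.toString());
                written += line.length();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // writes roughly 1MB of random text; the file name is only an example
        generate(Paths.get("c:\\temp\\sample-1MB.txt"), 1024L * 1024L, 42L);
    }
}
```

The resulting file is rounded up to a whole number of 80-byte lines, so it will be slightly larger than the requested size.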

I created the following 7 data file sizes to test each file reading method across a range of file sizes:

  • 1KB
  • 10KB
  • 100KB
  • 1MB
  • 10MB
  • 100MB
  • 1GB

Performance Summary

There were some surprises and some expected results from the performance tests.

As expected, the worst performers were the methods that read in a file character by character or byte by byte. But what surprised me was that the native Java IO libraries outperformed both third party libraries – Apache Commons IO and Google Guava.

What's more – both Google Guava and Apache Commons IO threw a java.lang.OutOfMemoryError when trying to read in the 1GB test file. This also happened with the Files.readAllLines(Path) method, but the remaining seven methods were able to read in all test files, including the 1GB test file.
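The three methods that ran out of memory all return the complete file as a List of lines, so every line has to fit in the heap at the same time. Files.lines(), by contrast, returns a stream that is populated lazily as it is consumed. The sketch below shows that streaming style on a simple task; the class name, the method name and the length threshold are made up for the example, and it needs Java 8 or later.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

// Hypothetical class, for illustration only.
class LazyLineCountSketch {

    // Counts the lines that are at least minLength characters long,
    // without collecting the lines into a list first.
    static long countLinesAtLeast(Path path, int minLength) throws IOException {
        try (Stream<String> lines = Files.lines(path, StandardCharsets.UTF_8)) {
            return lines.filter(line -> line.length() >= minLength).count();
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Paths.get("c:\\temp\\sample-10KB.txt"); // same test file as the other examples
        System.out.println(countLinesAtLeast(path, 40) + " lines with 40+ characters");
    }
}
```

Whether this avoids an OutOfMemoryError in your case still depends on what you do with each line; collecting the stream into a list would bring the memory problem back.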

The following table summarizes the average time (in milliseconds) each file reading method took to complete. I highlighted the top 3 methods in green, the average performing methods in yellow and the worst performing methods in red:

The following chart summarizes the above table but with the following changes:

  • I removed java.io.FileInputStream.read() from the chart because its performance was so bad it would skew the entire chart and you wouldn't see the other lines properly
  • I summarized the data from 1KB to 1MB because after that, the chart would get too skewed with so many underperformers and also some methods threw a java.lang.OutOfMemoryError at 1GB

The Winners

The new Java I/O libraries (java.nio) had the best overall winner (java.nio.file.Files.readAllBytes()) but it was followed closely by BufferedReader.readLine(), which was also a proven top performer across the board. The other excellent performer was java.nio.file.Files.lines(Path), which had slightly worse numbers for smaller test files but really excelled with the larger test files.

The absolute fastest file reader across all data tests was java.nio.file.Files.readAllBytes(Path). It was consistently the fastest, and even reading a 1GB file only took about 1 second.

The following chart compares performance for a 100KB test file:

You can see that the lowest times were for Files.readAllBytes(), BufferedInputStream.read() and BufferedReader.readLine().

The following chart compares performance for reading a 10MB file. I didn't bother including the bar for FileInputStream.read() because the performance was so bad it would skew the entire chart and you couldn't tell how the other methods performed relative to each other:

Files.readAllBytes() really outperforms all other methods and BufferedReader.readLine() is a distant second.

The Losers

As expected, the absolute worst performer was java.io.FileInputStream.read(), which was orders of magnitude slower than its rivals for most tests. FileReader.read() was also a poor performer for the same reason – reading files byte by byte (or character by character) instead of with buffers drastically degrades performance.

Both the Apache Commons IO FileUtils.readLines() and Guava Files.readLines() crashed with an OutOfMemoryError when trying to read the 1GB test file and they were about average in performance for the remaining test files.

java.nio.file.Files.readAllLines() also crashed when trying to read the 1GB test file but it performed quite well for smaller file sizes.

Performance Rankings

Here's a ranked list of how well each file reading method did, in terms of speed and handling of large files, as well as compatibility with different Java versions.

Rank  File Reading Method
1     java.nio.file.Files.readAllBytes()
2     java.io.BufferedReader.readLine()
3     java.nio.file.Files.lines()
4     java.io.BufferedInputStream.read()
5     java.util.Scanner.nextLine()
6     java.nio.file.Files.readAllLines()
7     org.apache.commons.io.FileUtils.readLines()
8     com.google.common.io.Files.readLines()
9     java.io.FileReader.read()
10    java.io.FileInputStream.read()

Conclusion

I tried to present a comprehensive set of methods for reading files in Java, both text and binary. We looked at 15 different ways of reading files in Java and we ran performance tests to see which methods are the fastest.

The new Java IO library (java.nio) proved to be a great performer but so was the classic BufferedReader.