Read from a text file

Let’s discuss how to read information from a text file in C++. For simplicity, we assume that the end of a stream’s buffer is marked with a special character named end-of-file (EOF). This special character cannot be read, i.e. an attempt to read it puts the stream in bad-state.

The following code skeleton illustrates how to read information from a text file. Let inFil be a stream connected to the text file.

 ...;  // read first item

while (inFil) {  // test if stream inFil is in good-state
     ...;   // do something with the item read

     ...;  // read next item
}

Using the stream inFil as the test condition of the while-loop checks whether the last read operation from the file has succeeded. If it has, we can do whatever computations are needed with the data item just read and then attempt to read the next data item from the file.

In the previous section I/O streams: Test a stream state, we discussed how to test a stream state and gave some examples of how to read from stream std::cin. Note that reading from std::cin is not conceptually different from reading from a file stream.

Example 1: we show a program that reads a list of numbers from a text file and stores them in a vector. The list of values stored in the vector is then displayed.

#include <fstream>
#include <iostream>
#include <string>
#include <vector>
int main() {
    std::string file_name;
    std::cout << "Enter file name: ";
    std::cin >> file_name;

    std::ifstream inFil(file_name);  // create a stream and connect it to a file for reading

    if (inFil) {  // test if stream is in good-state
        std::vector<double> numbers_list;

        double x;
        inFil >> x;  // read first number

        while (inFil) { // test if stream is in good-state
            numbers_list.push_back(x);  // store number in the vector

            inFil >> x;  // read next number
        }

        //display the list of numbers read
        std::cout << "List of numbers: ";

        for (double d : numbers_list) {
            std::cout << d << " ";
        }
        std::cout << "\n";
    } else {
        std::cout << "File could not be opened!!";
    }
}

The code from line $16$ to line $22$ (from the first inFil >> x; to the closing brace of the while-loop) follows the code skeleton presented above. Alternatively, these lines of code could be rewritten as follows, just as when reading from std::cin.

while (inFil >> x) { // read + test if stream is in good-state
     // store x in the vector
     numbers_list.push_back(x);
}

Recall that an attempt to read with inFil >> x; also sets the stream to bad-state if the next sequence of characters in the stream’s buffer cannot be converted to a value of variable x’s type, i.e. double. For instance, assume that the text file has the following contents.

[Figure: contents of the text file]

The file contents are transferred to the buffer of stream inFil. Thus, the stream looks as depicted below.

[Figure: stream inFil with the file contents in its buffer]

After the first execution of inFil >> x, the characters '1''2''.''5' are extracted from the stream’s buffer and converted to the double $12.5$ which is stored in variable x. After this first reading, the stream is still in good-state and it looks as follows.

[Figure: stream inFil after the first reading]

After the second execution of inFil >> x, the characters '\n''6''6' are extracted from the stream’s buffer. Leading white spaces (like the new line '\n') are simply ignored, and the characters “66” are converted to the double $66.0$, which is stored in variable x. After this second reading, the stream is still in good-state and it looks as follows.

[Figure: stream inFil after the second reading]

When attempting the third execution of inFil >> x, it’s not possible to convert the next characters in the stream’s buffer (i.e. “\nOhoops!!”) into a double. The stream is set to bad-state and it looks as follows. Note that at least one of the error bits is set to $1$ (let’s imagine that it is the second bit).

[Figure: stream inFil in bad-state after the third reading attempt]

Therefore, if somewhere in a text file with a list of numbers there is something that cannot be converted to a number, the reading loop stops executing even though the end of the file has not been reached.
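The behavior described above can be checked with a small sketch. It is only an illustration under stated assumptions: the function name read_numbers is hypothetical, a std::istringstream stands in for the file stream, and the trailing 7 in the sample input is made up to show that nothing after the invalid text is read.

```cpp
#include <istream>
#include <sstream>  // std::istringstream, used in the usage sketch below
#include <vector>

// Hypothetical helper: reads doubles from `in` until a read fails,
// either because EOF was reached or because the input is invalid.
std::vector<double> read_numbers(std::istream& in) {
    std::vector<double> numbers;
    double x;
    while (in >> x) {  // read + test if stream is in good-state
        numbers.push_back(x);
    }
    return numbers;
}

// Usage sketch (the string stream plays the role of inFil):
//   std::istringstream in("12.5\n66\nOhoops!!\n7\n");
//   std::vector<double> v = read_numbers(in);  // v holds 12.5 and 66
//   // in.fail() is true and in.eof() is false: the loop stopped
//   // at the invalid text, not at the end of the input.
```

Because the function takes a std::istream&, the same code should work unchanged with a std::ifstream or with std::cin.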

To summarize: any of the loops below can be used to read a text file. All of them effectively test both for EOF and for whether the input in the file is valid. If there is an attempt to read EOF or to read an invalid value (i.e. a value that cannot have type T), the reading loop stops executing.

// Loop 1: T is a type such as int, double, std::string
T x;
// read + test if stream is in good-state
while (inFil >> x) {
    ...;  // Do something with x
}

// Loop 2: T is a type such as int, double, std::string
T x;
inFil >> x;  // read first value

while (inFil) { // test if stream is in good-state
    ...; // Do something with x

    inFil >> x;  // read next value
}

// Loop 3: T is a user-defined type
struct T {
    ...; // T's data fields
};

// Read an item of type T from file
// Return the item read
T read_T(std::ifstream& file);
...
T x = read_T(inFil); // read first item
while (inFil) { // test if stream is in good-state
    ...; // Do something with x

    x = read_T(inFil);  // read next item
}
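As an illustration of the last loop, here is a minimal sketch for one possible user-defined type. The struct Point and the functions read_point and read_points are hypothetical names chosen for this example, and the functions take a std::istream& (rather than std::ifstream&) so that they can also be tried with a string stream.

```cpp
#include <istream>
#include <sstream>  // std::istringstream, used in the usage sketch below
#include <vector>

// Hypothetical record type: each item in the file consists of two numbers.
struct Point {
    double x = 0;
    double y = 0;
};

// Reads one Point from `in` and returns it.
// If reading fails, `in` is left in bad-state and the caller
// should ignore the returned value.
Point read_point(std::istream& in) {
    Point p;
    in >> p.x >> p.y;
    return p;
}

// Reads Points until a read fails (EOF or invalid input).
std::vector<Point> read_points(std::istream& in) {
    std::vector<Point> points;
    Point p = read_point(in);  // read first item
    while (in) {               // test if stream is in good-state
        points.push_back(p);
        p = read_point(in);    // read next item
    }
    return points;
}

// Usage sketch:
//   std::istringstream in("1 2\n3 4\n5");
//   std::vector<Point> pts = read_points(in);
//   // pts holds (1, 2) and (3, 4); the incomplete last item is not stored.
```

Note that the stream is tested after read_point returns and before the item is stored, so an incomplete or invalid item never reaches the vector.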

Functions for reading

The table below summarizes some of the most used operations for reading from a stream, whether std::cin or a file stream. Assume that

  • variable x has a type such as int, double, char, or std::string;
  • variable w has type std::string; and
  • variable c has type char.

| Operation | Description |
|---|---|
| inFil >> x | To read a number, a word, or a character. Leading white spaces are read and discarded. Reading stops at the first character that cannot be converted to a value conformant with the type of x, or at a white space. |
| std::getline(inFil, w) | To read several words until a new line ('\n'). White spaces are also read and stored in w. The new line is read but not stored in w. |
| std::getline(inFil, w, c) | To read several words until character c. White spaces are also read and stored in w. The character c is read but not stored in w. |
| inFil.get(c) | To read the next character from the stream's buffer and store it in c. |
| inFil >> std::ws | To extract leading white spaces from the stream's buffer, if any. |
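The operations listed above can be tried one after another on a small input. The sketch below is only an illustration: the struct ReadDemo, the function read_demo, and the sample text are all hypothetical, and a std::istringstream is assumed to stand in for the file stream.

```cpp
#include <istream>
#include <sstream>  // std::istringstream, used in the usage sketch below
#include <string>

// Hypothetical container for the values read in this demonstration.
struct ReadDemo {
    int number = 0;
    std::string word;
    std::string line;
    std::string key;
    char c = '\0';
};

// Applies each operation of the table once, in sequence.
ReadDemo read_demo(std::istream& in) {
    ReadDemo r;
    in >> r.number;                // reads a number, skipping leading white spaces
    in >> r.word;                  // reads one word, stops at the next white space
    in >> std::ws;                 // extracts the white spaces before the next text
    std::getline(in, r.line);      // reads up to '\n'; the '\n' is extracted, not stored
    std::getline(in, r.key, ':');  // reads up to ':'; the ':' is extracted, not stored
    in.get(r.c);                   // reads the single next character
    return r;
}

// Usage sketch:
//   std::istringstream in("42 apples\n   first line of text\nkey:value");
//   ReadDemo r = read_demo(in);
//   // r.number == 42, r.word == "apples", r.line == "first line of text",
//   // r.key == "key", r.c == 'v'
```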

All stream input operations above return the stream after reading. Thus, any of these operations can be used as a test condition in a loop, as we have seen previously. However, the operation in the last line of the table is usually combined with a getline. For example, the following piece of code reads a list of person names from a file. The file has one name per line and, as usual, a person’s name consists of several words (like first name and last name).

std::string name;

// Read a list of person names from a text file
while (inFil >> std::ws && std::getline(inFil, name)) {
     // do something with name
}

To account for the case where there are white spaces preceding a name (i.e. leading white spaces), one can use inFil >> std::ws to extract them. Note that, for example, the string "James Hetfield" and the string "   James Hetfield" are two different strings, i.e. the comparison (std::string("James Hetfield") == "   James Hetfield") returns false. If there are no leading white spaces preceding the name to be read, then inFil >> std::ws has no effect on the stream.

Good programming practice: Reading with std::getline should be preceded by inFil >> std::ws to extract any leading white spaces.
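The effect of combining inFil >> std::ws with std::getline can be seen in the following sketch. The function name read_names and the sample names other than James Hetfield are made up for illustration, and a std::istringstream is assumed in place of the file stream.

```cpp
#include <istream>
#include <sstream>  // std::istringstream, used in the usage sketch below
#include <string>
#include <vector>

// Hypothetical helper: reads one name per line, discarding leading white spaces.
std::vector<std::string> read_names(std::istream& in) {
    std::vector<std::string> names;
    std::string name;
    while (in >> std::ws && std::getline(in, name)) {
        names.push_back(name);
    }
    return names;
}

// Usage sketch:
//   std::istringstream in("   James Hetfield\nLars Ulrich\n\n  Kirk Hammett\n");
//   std::vector<std::string> names = read_names(in);
//   // names holds "James Hetfield", "Lars Ulrich", and "Kirk Hammett"
```

Since a new line is also a white space, a side effect of this loop is that empty lines are skipped. Trailing white spaces at the end of a line are not removed.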

What you should avoid

In this section, we describe two common bad practices that should be avoided when reading a text file. Because we see them rather often, we wrote this section to make you aware of their pitfalls.

Case 1

Consider again Example 1, given in the previous section, about reading a list of numbers from a text file and storing it into a vector. It may be tempting to use the following code for the loop reading the list of numbers from the file, though the loop is not correct and leads to problems.

std::vector<double> numbers_list;
double x = 0;

// This reading loop is problematic!!
while (inFil) {  // test if stream is in good-state
    inFil >> x;  // read number
    numbers_list.push_back(x);  // store number in the vector  
}

So, why is this reading loop a problem? There are two main issues.

  • First, the last value read from the file is added twice to the end of the vector. For example, if the text file contains only the value $12.5$ then, after the reading loop, the vector stores the value $12.5$ twice.
  • Second, when the text file is empty (i.e. has no values), the loop above adds the value zero to the vector (zero is the value used to initialize x in line $2$).

We strongly encourage you to try the code above and check that these odd situations occur. Let us look at the first case above. Assume the text file has only one number, $12.5$. Before the reading loop starts executing, the stream’s buffer contains '1''2''.''5'EOF (recall that EOF is just a character marking the end of the stream’s buffer). Then, the loop above iterates twice.

$1^{st}$ iteration of the loop:

  • In line $5$, the stream inFil is tested. Since the stream is in good-state then the body of the loop starts executing.
  • In line $6$, the first (and only) value of the file is read into variable x, i.e. x stores $12.5$. Reading has succeeded and the stream is, therefore, in good-state. Note that, after reading $12.5$, the buffer of the stream stores EOF.
  • In line $7$, the value stored in x is added to the vector. So far so good.

$2^{nd}$ iteration of the loop:

  • Execution proceeds with the code in line $5$ and the stream is tested again. Since the stream is still in good-state then the body of the loop starts executing again.
  • Execution of the code in line $6$ leads to the attempt to read EOF and the stream is set to bad-state. In this case, the value stored in variable x is not modified, i.e. x still stores $12.5$.
  • In line $7$, the value stored in x is added to the vector, so $12.5$ is added again.

Since the stream is now in bad-state, testing the stream state again in line $5$ stops the reading loop. So, after reading the file, the vector stores the last value read twice!!

For the case when the text file is empty, the loop above iterates once because the stream is in good-state. The body of the loop is executed: it sets the stream to bad-state (line $6$ leads to the attempt to read EOF) and then adds the value stored in variable x (zero) to the vector.

Good programming practice: When reading from a file, the code should also work correctly in the case the file is empty.
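Both odd situations of Case 1 can be reproduced without creating files. The sketch below is only an illustration: the function names read_problematic and read_correct are hypothetical, and a std::istringstream is assumed to behave like the file stream inFil.

```cpp
#include <istream>
#include <sstream>  // std::istringstream, used in the usage sketch below
#include <vector>

// The problematic loop of Case 1: the stream is tested BEFORE reading,
// so the result of a failed read is still stored.
std::vector<double> read_problematic(std::istream& in) {
    std::vector<double> numbers_list;
    double x = 0;
    while (in) {
        in >> x;
        numbers_list.push_back(x);
    }
    return numbers_list;
}

// The recommended loop: read, then test, then store.
std::vector<double> read_correct(std::istream& in) {
    std::vector<double> numbers_list;
    double x = 0;
    while (in >> x) {
        numbers_list.push_back(x);
    }
    return numbers_list;
}

// Usage sketch:
//   std::istringstream one("12.5");
//   read_problematic(one);    // returns {12.5, 12.5}
//   std::istringstream empty("");
//   read_problematic(empty);  // returns {0}
//   // read_correct returns {12.5} and {} for the same two inputs.
```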

Case 2

We continue using the example of reading a list of numbers from a text file. Rather often one sees code that uses the boolean function eof() as the test condition of the loop reading the file. This member function is defined in the standard library, and inFil.eof() returns true if stream inFil has reached end-of-file (EOF).

double x;
inFil >> x;  // read first number

// This reading loop is problematic!!
while (!inFil.eof()) {  // test if EOF has been reached
    std::cout << x << '\n'; 

    inFil >> x;  // read next number
}

(!inFil.eof() is equivalent to the comparison inFil.eof() == false)

The problem with the code above is that if the text file being read contains an invalid value, i.e. a value not conformant with the type of variable x, then the loop iterates forever (it never stops). Just try the loop above with a text file with the following contents. Note that “Ohoops!!” cannot be converted to a double, the type of x.

[Figure: contents of a text file that includes the invalid value “Ohoops!!”]

The reason for this behavior is that the while-loop above explicitly says that it only ends when EOF is reached. So what should be done with “Ohoops!!”? It cannot be converted to a double, but the end-of-file (EOF) hasn’t been reached yet. The failed read sets the stream to bad-state and leaves “Ohoops!!” in the buffer, so every subsequent inFil >> x fails as well without ever reaching EOF, and the loop never ends.
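To observe this behavior safely, the loop can be given an artificial iteration limit. The sketch below is an illustration only: the function count_eof_loop_iterations and its max_iterations parameter are hypothetical additions, and a std::istringstream is assumed in place of the file stream.

```cpp
#include <istream>
#include <sstream>  // std::istringstream, used in the usage sketch below

// Runs the eof()-controlled loop of Case 2, but gives up after
// `max_iterations` iterations so that the demonstration terminates.
// Returns the number of times the loop body was executed.
int count_eof_loop_iterations(std::istream& in, int max_iterations) {
    int iterations = 0;
    double x = 0;
    in >> x;  // read first number
    while (!in.eof() && iterations < max_iterations) {
        ++iterations;
        in >> x;  // read next number
    }
    return iterations;
}

// Usage sketch:
//   std::istringstream in("12.5\n66\nOhoops!!\n");
//   count_eof_loop_iterations(in, 1000);  // returns 1000: only the limit stops the loop
//   // afterwards in.fail() is true but in.eof() is still false
```

Without the limit, the call on this input would never return, which is exactly the problem with testing only for EOF.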

Good programming practice: The loop reading a text file should effectively test for EOF and whether input in the file consists of valid values.

Recall that when there is an attempt to read EOF or to read an invalid value, the stream enters bad-state. The recommended reading loops presented earlier stop executing when either of these situations occurs.