COMS W4995 C++ Deep Dive for C Programmers

I/O Streams

In addition to the C-style FILE*-based I/O functions, C++ also provides an object-oriented I/O streams library. C++23 introduced a new family of print functions, std::print() and std::println(), that can work with both FILE* and I/O stream objects.

In this chapter, we’ll take a brief tour of the I/O stream library’s design and essential features. With the knowledge of C++ inheritance under our belts, we’ll be able to appreciate how the library’s design enables polymorphism.

Using I/O streams

The 09/io1 program reads a sequence of whitespace-separated integers one by one from standard input (stdin) and prints a running sum. When the user closes stdin, the program prints the final sum and terminates.

The shell session below demonstrates how io1 reads in one integer at a time and terminates when the user presses Ctrl-D:

$ ./io1
10
current sum:   10        fail()==0  bad()==0  eof()==0
2
current sum:   12        fail()==0  bad()==0  eof()==0
3
current sum:   15        fail()==0  bad()==0  eof()==0
4
current sum:   19        fail()==0  bad()==0  eof()==0
[Ctrl-D]
  final sum:   19        fail()==1  bad()==0  eof()==1

We can also redirect io1’s stdin to read from the output of another program, echo, as shown below:

$ echo "10 2 3 4" | ./io1
current sum:   10        fail()==0  bad()==0  eof()==0
current sum:   12        fail()==0  bad()==0  eof()==0
current sum:   15        fail()==0  bad()==0  eof()==0
current sum:   19        fail()==0  bad()==0  eof()==0
  final sum:   19        fail()==1  bad()==0  eof()==1

Each line also shows the current state of the I/O stream object that represents stdin. We’ll explain I/O stream states shortly. Here’s the program code:

#include <iomanip>   // std::setw
#include <iostream>  // std::cin, std::cout

void print_io_states(std::ios& io) {
    std::cout << '\t';
    std::cout << " fail()==" << io.fail();
    std::cout << "  bad()==" << io.bad();
    std::cout << "  eof()==" << io.eof();
    std::cout << '\n';
}

int sum_ints(std::istream& is) {
    int i, sum = 0;
    while (is >> i) {
        sum += i;
        std::cout << "current sum: " << std::setw(4) << sum;
        print_io_states(is);
    }
    return sum;
}

int main() {
    int sum = sum_ints(std::cin);
    std::cout << "  final sum: " << std::setw(4) << sum;
    print_io_states(std::cin);
}

The main() function calls sum_ints(), passing the global object std::cin as an argument. sum_ints() takes its istream parameter is by reference because istream has copy operations deleted. The while loop in sum_ints() invokes istream::operator>>(int&) to read a whitespace-separated integer from is. Each iteration of the loop prints the current sum and the state of the is object.

We use std::setw(4) to print sum right-justified in a field four characters wide. It has the same effect as calling std::cout.width(4) before printing the sum, but is more convenient because we can write it as part of the put-to chain. std::setw() works by returning an object of some type, say WidthManipulator, which has an operator put-to overload defined something like this:

ostream& operator<<(ostream& os, const WidthManipulator& wm) {
    os.width(wm.get_width());
    return os;
}

The I/O stream library defines many helper functions like std::setw() that manipulate I/O stream objects through operator>>() and operator<<(). They are collectively known as I/O manipulators.

The while loop in sum_ints() checks the condition is >> i. But istream::operator>>() doesn’t return a bool. It returns an istream& to enable chaining, as we’ve seen before. The while loop works because istream provides a type conversion operator, operator bool(), that returns false if the last read failed. We’ll revisit operator bool() and what failure means in this context shortly. When there is no more input, is >> i evaluates to false, and the program prints the final sum and exits.

I/O stream class hierarchy

The C++ I/O stream library is organized into a class hierarchy as follows:

[Figure: I/O stream class hierarchy (09-io-stream-hierarchy)]

std::cout and std::cin are instances of the ostream and istream classes, respectively. Both classes derive from the ios class, which in turn derives from ios_base. The base classes hold data members (e.g., the stream state information that we will cover shortly) and provide a common interface to access them.

iostream showcases multiple inheritance. It’s a stream that supports both operator<<() and operator>>() because it inherits from both ostream and istream. To make this possible, istream and ostream virtually inherit from ios.

At the bottom of the hierarchy, there are stream classes for interacting with strings and files. We’ll use them later in this chapter.

I/O stream states

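The std::ios_base base class defines three status bits, referred to as iostates, that indicate the current state of the stream:

badbit: an unrecoverable error occurred on the stream (e.g., a hardware error)
failbit: an I/O operation failed (e.g., a formatting or parsing error)
eofbit: the end of the input was reached

The std::ios base class then provides three accessor functions to examine the iostates:

bad(): returns true if badbit is set
fail(): returns true if failbit or badbit is set
eof(): returns true if eofbit is set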

Note the asymmetry of the fail() accessor function – it not only checks failbit, but also checks badbit. The justification is that fail() is meant to be a blanket check for any kind of error, whether it be a formatting/parsing or hardware error. In fact, the class also defines a type conversion operator to bool as follows:

explicit ios::operator bool() const { return !fail(); }

This type conversion operator makes it possible to use a stream object wherever a bool type is expected, like while (is >> i) in sum_ints(). The explicit keyword prevents unsafe implicit conversions (e.g., to an int).

Let’s revisit our previous sample run of io1 and understand the iostates at each step:

$ echo "10 2 3 4" | ./io1
current sum:   10        fail()==0  bad()==0  eof()==0
current sum:   12        fail()==0  bad()==0  eof()==0
current sum:   15        fail()==0  bad()==0  eof()==0
current sum:   19        fail()==0  bad()==0  eof()==0
  final sum:   19        fail()==1  bad()==0  eof()==1

Consider what happens when the io1 program reads in the last number 4. The echo command adds a newline character to the end of the output, which serves as whitespace that delimits the number 4. The second line from the bottom of the output, printed right after the number 4 was read, shows that the eofbit wasn’t set because the io1 program stopped reading when it reached that newline, leaving it in the stream. When the next iteration of the loop tries to read from is again, it skips the newline and hits EOF, setting the eofbit, and fails to parse anything, setting the failbit.

In the next example, we run echo with the -n flag, which doesn’t add a newline to the end of the output:

$ echo -n "10 2 3 4" | ./io1
current sum:   10        fail()==0  bad()==0  eof()==0
current sum:   12        fail()==0  bad()==0  eof()==0
current sum:   15        fail()==0  bad()==0  eof()==0
current sum:   19        fail()==0  bad()==0  eof()==1
  final sum:   19        fail()==1  bad()==0  eof()==1

The second to last line of the output shows that the eofbit is set right after parsing the last number 4. Note that the program successfully parsed the number 4 even though there was no whitespace at the end of the input. Hitting EOF effectively serves as a delimiter here. It’s only when we attempt to read another number from is that the failbit is set.

In this last example, we insert “abc” into the input, which will fail to parse as an integer:

$ echo "10 abc 3 4" | ./io1 
current sum:   10        fail()==0  bad()==0  eof()==0
  final sum:   10        fail()==1  bad()==0  eof()==0

The failbit is set after failing to parse “abc”, which exits the while-loop and terminates the program. We’ll show how to recover from such failures in the next section.

Error Recovery

The 09/io2 program sums integers from stdin like 09/io1, but can recover from parsing errors, as shown below:

$ echo "10 abc 3 4" | ./io2
current sum:   10        fail()==0  bad()==0  eof()==0
  skipped: a
  skipped: b
  skipped: c
current sum:   13        fail()==0  bad()==0  eof()==0
current sum:   17        fail()==0  bad()==0  eof()==0
  final sum:   17        fail()==1  bad()==0  eof()==1

We’ve rewritten the sum_ints() function to consider each status bit instead of simply evaluating is >> i as a boolean:

int sum_ints(std::istream& is) {
    int i, sum = 0;
    while (!is.eof()) {
        is >> i;
        if (is.bad()) {
            throw std::runtime_error{"bad istream"};
        } else if (is.fail()) {
            // fail(), but not bad()
            if (is.eof()) {
                // failbit && eofbit: this is normal (trailing whitespace)
                break;
            }
            // failbit && !eofbit: probably non-digit; just skip it
            is.clear();
            char c;
            is.get(c);
            std::cout << "  skipped: " << c << '\n';
        } else {
            // read i successfully;
            // we could have hit eof or not (depends on trailing whitespace)
            sum += i;
            std::cout << "current sum: " << std::setw(4) << sum;
            print_io_states(is);
        }
    }
    return sum;
}

As we loop until is.eof(), we individually check the badbit and the failbit. If the badbit is set, we throw an std::runtime_error because the error is unrecoverable. We then call is.fail() to check the failbit, knowing that badbit isn’t set. If the failbit is set, this means that we were unable to parse an integer. There are two subcases: encountering a non-digit or hitting EOF. If we encounter a non-digit, we skip it by resetting the iostate by calling ios::clear() and then consuming the problematic character by invoking istream::get(). Consuming the problematic character is necessary because the failed is >> i leaves that character in the input stream, so retrying the read would fail on it again. If we hit EOF, this means that there was no more input to parse, aside from possible whitespace that is >> i skipped, so we break out of the loop. If neither the badbit nor the failbit is set, then we’ve successfully parsed an integer.

C++ I/O streams vs. C standard I/O

By default, C++’s std::cin, std::cout, and std::cerr are synchronized with C’s stdin, stdout, and stderr. This means that the C++ streams do not perform buffering on their own; instead, each C++ I/O operation is delegated to the corresponding C stream. This synchronization allows you to interleave C++ and C I/O. It also ensures that C++ I/O operations are thread-safe, in the sense that using these stream objects from multiple threads doesn’t cause a data race (although characters written by different threads may still interleave).

You can turn off this synchronization by detaching C++ streams from C streams as follows:

std::ios_base::sync_with_stdio(false);

Unsynchronized C++ streams can independently buffer their I/O, which may improve performance. However, you can no longer simply mix C++ I/O with C I/O, or assume that C++ streams are thread-safe.

Recall that std::endl is an I/O manipulator that writes a newline character, '\n', and then flushes the stream. C’s stdout is line-buffered when connected to the terminal, so writing '\n' already causes the output to be flushed. This means that when C++ I/O is synchronized with C I/O and the output goes to a terminal, std::endl performs a redundant flush. Thus, simply writing '\n' is preferred to using std::endl, unless we detach C++ I/O from C I/O.

Generally speaking, you should keep C++ and C I/O synchronized unless you measure a significant performance improvement by detaching the streams.

Example: Polymorphic I/O streams

In this section, we present an example that demonstrates the polymorphic nature of I/O streams. We start with 09/io3, as shown below:

#include <iostream>
#include <stdexcept>
#include <string>
#include <utility>

using namespace std;

static ostream& operator<<(ostream& os, const pair<string,double>& e) {
    return os << "[" << e.first << "] (" << e.second << ")";
}

static istream& operator>>(istream& is, pair<string,double>& e) {
    char c;
    // read opening quote, skipping whitespace
    if (is >> c && c == '"') {
        string student;
        // read all chars into student, stop at closing quote, then discard it
        if (std::getline(is, student, '"')) {
            // read a comma, skipping whitespace
            if (is >> c && c == ',') {
                double grade;
                if (is >> grade) {
                    e = make_pair(student, grade);
                    return is;
                }
            }
        }
    }
    // if we are here, we could not read "student name", grade
    is.setstate(ios_base::failbit);
    return is;
}

void f1(istream& is) {
    while (!is.eof()) {
        is >> std::ws;          // discard leading whitespace
        if (!is || is.eof()) {  // break if nothing else
            break;
        }
        pair<string,double> e;
        is >> e;
        if (is.fail()) {
            break;
        }
        cout << e << '\n';
    }

    if (is.bad()) {
        throw runtime_error{"bad istream"};
    } else if (is.fail()) {
        throw invalid_argument{"bad grade format"};
    }
}

int main(int argc, char **argv) {
    try {
        f1(cin);
    } catch (const exception& x) {
        cerr << x.what() << '\n';
    }
}

The io3 program reads "student name", grade pairs from stdin and prints them in a slightly different format, [student name] (grade), as demonstrated below:

$ echo ' "Jae Lee",76.5  "Hans Montero",67.8 ' | ./io3
[Jae Lee] (76.5)
[Hans Montero] (67.8)

The main() function invokes f1() and passes in the global istream object cin, which corresponds to stdin. main() catches any exception thrown and prints out its message.

f1() keeps reading "student name", grade pairs from its istream object, is, printing each pair until it hits EOF or fails to parse a pair.

We represent the student name & grade pair using std::pair<std::string, double>. std::pair is a simple class template declared as template<typename T1, typename T2> struct pair; i.e., it’s a struct that holds a T1 object and a T2 object. We’ve implemented the put-to and get-from operators for the pair:

ostream& operator<<(ostream& os, const pair<string,double>& e);
istream& operator>>(istream& is, pair<string,double>& e);

f1() calls our get-from and put-to operators in a loop until we hit EOF. At each iteration, we invoke is >> std::ws to discard leading whitespace and then check again if we hit EOF before attempting to read in another pair. This way, if get-from fails, we know it’s because of a parsing or system error, and not a lack of input.

The get-from operator attempts to read an opening quote, a student name, a closing quote, a comma, and then a double, in that order. If it fails at any point in that sequence, the function sets the failbit on the istream object.

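For the student name, we want to capture all characters between the double quotes, including whitespace. We use the std::getline() function to accomplish this. The function takes an istream, a string to write into, and a delimiter character to read up to. It returns a reference to the istream object, which we test to see if the failbit has been set. Note that std::getline() sets the failbit only when it extracts no characters at all (e.g., if the input ends right after the opening quote). If the closing quote is missing but some characters were read, std::getline() reads up to EOF without setting the failbit; in that case, the subsequent attempt to read the comma fails instead.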

In the shell session below, we verify our implementation using two text files:

$ cat grades.txt 

   "Edgar Allan Poe" , 88.8

"Mark Twain",     66.6

  "Herman Melville",77.7


$ ./io3 < grades.txt 
[Edgar Allan Poe] (88.8)
[Mark Twain] (66.6)
[Herman Melville] (77.7)

$ cat grades-malformed.txt 

   "Edgar Allan Poe" , 88.8

"Mark Twain",,    66.6

  "Herman Melville",77.7


$ ./io3 < grades-malformed.txt 
[Edgar Allan Poe] (88.8)
bad grade format

The end of the session shows that our current implementation gives up when it encounters the malformed line in 09/grades-malformed.txt:

"Mark Twain",,    66.6

There is an extra comma in the record. The get-from operator returns early with the failbit set when it sees the extra comma instead of a double. f1() then throws an std::invalid_argument exception. We’ll improve our handling of malformed lines in the next section.

String streams

If we assume that input files have only one record per line, as was the case in the previous two examples, we can change the code such that it skips over malformed lines and continues parsing the input file. We could modify f1() to do this, but for the sake of demonstrating polymorphism, we’re going to wrap it in a new function f2(), shown below:

...

void f2(istream& is) {
    string str;

    while (std::getline(is, str)) {
        istringstream iss(str);

        try {
            f1(iss);
        } catch (const invalid_argument& x) {
            cerr << x.what() << ": " << str << '\n';
        }
    }
}

int main(int argc, char **argv) {
    try {
        // f1(cin);
        f2(cin);
    } catch (const exception& x) {
        cerr << x.what() << '\n';
    }
}

The f2() function reads the input stream line by line and passes each line to f1() to parse. We use std::getline() to read a single line. Here, we do not specify a delimiter, so it defaults to the newline character. We then construct an istringstream object out of the line. The istringstream class derives from istream and it provides a stream of characters to read from its underlying string. Since f1() takes an istream& argument, we can pass this istringstream object to f1() without modifying f1().

If a parsing error is encountered in f1(), it’ll throw std::invalid_argument. f2() will catch that exception and move on to the next line. However, f2() will not catch the std::runtime_error that f1() could throw, allowing it to percolate up to the main() function.

As shown below, the io3 program now skips over malformed lines:

$ ./io3 < grades-malformed.txt 
[Edgar Allan Poe] (88.8)
bad grade format: "Mark Twain",,    66.6
[Herman Melville] (77.7)

File streams

Up until now, our io3 program only read from stdin. Let’s modify the program to take a file to read as a command line argument. We’re going to wrap f2() in a new function f3() which will pass an ifstream object to f2() instead of std::cin. The ifstream class derives from istream and it provides a stream of bytes to read from the underlying file. Since f2() takes an istream& argument, we can pass this ifstream object to f2() without modifying f2(). The changes to io3 are shown below:

...

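void f3(const char* filename) {
    // check for a missing argument first: the ifstream constructor
    // requires a valid C string, so we must not pass it nullptr
    if (filename == nullptr) {
        throw runtime_error{"no file name provided"};
    }
    ifstream ifs { filename };
    if (!ifs) {
        throw runtime_error{"can't open file: "s + filename};
    }
    f2(ifs);
}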

int main(int argc, char** argv) {
    try {
        // f1(cin);
        // f2(cin);
        f3(argv[1]);
    } catch (const exception& x) {
        cerr << x.what() << '\n';
    }
}

The main() function now calls f3() with the first command line argument, argv[1]. In f3(), we create an ifstream object, ifs, with the file name. The ifstream constructor will attempt to open the file. If it fails, the failbit will be set, causing operator bool() to evaluate to false.

Note the suffix s after the string literal "can't open file: ". The suffix converts a C-style string literal into an std::string object. This is an example of a literal operator, which we’ll revisit in a later chapter. The conversion to std::string was necessary because we can’t invoke operator+() with two const char*s.

Also note that we didn’t have to explicitly close the ifstream. In C, we’d have to manually fclose() after calling fopen(). In C++, the ifstream class follows the RAII paradigm: the destructor will close the file for you.


Last updated: 2025-10-07