COMS W4995 C++ Deep Dive for C Programmers

Associative Containers

Containers that hold key-value pairs are known as associative containers. In this chapter, we’ll study the main four associative containers offered in STL: std::map, std::unordered_map, std::multimap, and std::unordered_multimap. We’ll also briefly cover the corresponding set containers in STL.

std::map

Consider the following program, 12/words1, that reads in strings from cin and prints each word with its number of occurances:

#include <map>
#include <unordered_map>
#include <set>
#include <unordered_set>
#include <string>
#include <iostream>
#include <iomanip>
using namespace std;

int main() {
    map<string,int> m;

    for (string s; cin >> s; ) {
        ++m[s];
    }

    for (auto i = m.begin(); i != m.end(); ++i) {
        cout << setw(20) << (*i).first << '|';
        cout << i->second << '\n';
    }
}

Here’s the output of the program when we feed it the source code:

$ ./words < words.cpp
                  !=|1
            #include|7
               '\n';|1
                '|';|1
          (*i).first|1
               (auto|1
             (string|1
                   )|1
                ++i)|1
             ++m[s];|1
                  <<|5
           <iomanip>|1
          <iostream>|1
               <map>|1
               <set>|1
            <string>|1
     <unordered_map>|1
     <unordered_set>|1
                   =|1
                  >>|1
                 cin|1
                cout|2
                 for|2
                   i|2
           i->second|1
                 int|1
          m.begin();|1
            m.end();|1
                  m;|1
              main()|1
     map<string,int>|1
           namespace|1
                  s;|2
            setw(20)|1
                std;|1
               using|1
                   {|3
                   }|3

The program uses std::map, which is a sorted associative container of key-value pairs of unique keys. Each element is an std::pair of the key and value, which is an std::pair<string,int> in our case. The map uses a balanced binary tree to keep the pairs sorted, usually implemented as a Red-Black Tree, providing O(log N) insertion, removal, and search operations. No references, pointers, or iterators are invalidated as a result of insertion or removal (except the element removed, of course).

Given two keys, the map must be able to determine if one key is “less than” the other. It suffices to have the key type support operator<(). If not, you must provide a custom comparator as the third template parameter after the key and value types.

For every string s we read from cin, we access the corresponding count stored in our map using operator[]() and then increment it. If operator[]() is invoked with a non-existent key, it will first insert a new key-value pair. In that case, the new value is value-initialized, which means default-constructed for class types and zero-initialized for primitive types. std::map also offers at(), which throws std::out_of_range if the key is not in the map instead of creating an entry for you.

After populating the map, we perform an in-order traversal of the binary search tree using std::map::iterator. Dereferencing the iterator will yield a std::pair<string,int>. The expression (*i).first gives us the key, the first element of the pair. We show an another way to access the pair through the iterator to retrieve the value: i->second. The iterator implements operator->() to return a pointer to the key-value pair that it refers to.

As we discussed previously, C++ provides the range-for syntactic sugar to replace iterator-based loops, as follows:

for (auto& e : m) {
    cout << setw(20) << e.first << '|' << e.second << '\n';
}

Structured binding

C++17 introduced structured binding, which allows you to bind a name to each subcomponent of an object. In our case, we can use structured binding to bind two references, s and n, to each pair’s first and second data member, respectively:

for (auto& [s, n] : m) {
    cout << setw(20) << s << '|' << n << '\n';
}

Searching for a key

The map::find() member function looks up a key and returns an iterator to the key-value pair, or the “past-the-end” iterator (i.e., what map::end() returns) if the key isn’t found. Here’s an example where we look up how many header files are included in the program:

auto it = m.find( "#include"s );
if (it != m.end()) {
    auto& [s, n] = *it;
    cout << '\n' << s << " appears " << n << " times." << '\n';
}

std::unordered_map

std::unordered_map is an associative container implemented using a hash table, instead of a binary search tree. As such, it provides average O(1) insertion, removal, and search operations. No references or pointers are invalidated as a result of insertion or removal (except the element removed, of course). Iterators, on the other hand, can be invalidated by insertions, if it results in rehashing. (An iterator to an erased element is of course invalid.)

The API is the same as std::map, so 12/words1 works without any modifications if we swap out std::map for std::unordered_map. However, running the program reveals that the output is no longer sorted by the key, which makes sense given the underlying structure of the hash table.

Instead of std::map’s requirement of “less-than” operation for the key, std::unordered_map has two requirements:

STL provides hash functions for various types by specializing the std::hash<T> class template, which implements operator() – i.e., it’s a function object. By default, std::unordered_map is parameterized using std::hash<KeyType>. If STL provides a specialization for KeyType, there’s nothing more the user has to do to use std::unordered_map with that KeyType. Otherwise, the user can provide a custom specialization for the std::hash class template.

std::multimap and std::unordered_multimap

std::multimap is like std::map but it can store multiple entries with the same key. Consider the following program, 12/words2, that creates a std::multimap that is the inverse map of the frequency table we created in 12/words1:

#include <map>
#include <unordered_map>
#include <set>
#include <unordered_set>
#include <string>
#include <iostream>
#include <iomanip>
using namespace std;

int main() {
    // Build a map of words to their frequencies
    unordered_map<string,int> word_to_freq;
    for (string s; cin >> s; ) {
        ++word_to_freq[s];
    }

    // Build a reverse map of frequencies to words
    multimap<int,string> freq_to_words;
    for (const auto& [word, freq] : word_to_freq) {
        freq_to_words.insert( {freq, word} );
    }

    // Output the reverse map
    for (const auto& [freq, word] : freq_to_words) {
        cout << setw(4) << freq << '|' << word << '\n';
    }

    // Find out how many words occur three times
    auto [b, e] = freq_to_words.equal_range(3);
    if (b != e) {
        auto& freq = b->first;
        auto num_words = distance(b, e);
        cout << '\n' << num_words << " words occur " << freq << " times.\n";
    }
}

Here is the output:

$ ./words2 < words2.cpp
   1|times.\n";
   1|!=
   ...
   2|cout
   2|(const
   3|"
   3|=
   3|for
   3|map
   3|auto&
   3|freq
   4|//
   4|words
   5|{
   5|}
   7|#include
  10|<<

6 words occur 3 times.

After creating the frequency table word_to_freq, we insert a word and its freq into freq_to_words as a reverse pair (freq, word). Here, the insert() member function takes an std::pair<int,string> to insert into the multimap. The expression {freq, word} serves as the initializer for the insert() function’s parameter. This is more concise than writing freq_to_words.insert( make_pair(freq, word) ).

Searching for a key

Let’s say we wanted to count all words with frequency 3. The find() member function we used previously for std::map doesn’t work here because it only returns one arbitrary match.

Instead, we use std::multimap::equal_range(), which returns a pair of iterators that represents the range of key-value pairs that match the given key. We bind those iterators to b and e. b points to the first element that matches the key and e points to one past the last element that matches the key. If there were no matches, b would be equal to e.

We pass the two iterators to the std::distance() function template to compute how far apart they are, which tells us the number of words with the given frequency. The algorithm performed by std::distance() depends on the type of iterator passed to it. If b and e are random access iterators, it could simply return e - b. Otherwise, std::distance() will have to walk the b iterator until it reaches e, i.e., it would execute a loop like this:

int count = 0;
while (b != e) {
    ++b;
    ++count;
}
return count;

In the next chapter, we’ll explain how std::distance() can switch between these two algorithms at compile-time.

std::unordered_multimap

STL also provides an unordered variant of multimap, std::unordered_multimap, which uses a hash table instead of a binary search tree. The API remains the same and we can swap out the std::multimap in 12/words2 for a std::unordered_multimap without any additional modifications.

Sets

Finally, STL provides ordered, unordered, and multi variants of a set:

The difference between a map and a set is that the latter only stores keys instead of key-value pairs. Sets are useful for querying presence or absence of a particular element.


Last updated: 2025-10-24