COMS W4995 C++ Deep Dive for C Programmers

Function Template

Defining a function template

Consider the following program:

#include <iostream>
#include <string>

int Max(int x, int y) {
    return x > y ? x : y;
}

std::string Max(std::string x, std::string y) {
    return x > y ? x : y;
}

int main() {
    using namespace std;

    cout << Max(3, 4) << ";" << Max( string{"abc"}, string{"xyz"} ) << endl;
}

It generates the following output:

4;xyz

The Max() function is overloaded to support comparing ints and std::strings. Because operator>() is defined for both types, the two implementations are identical apart from the type. In C++, we can parameterize the implementation on its type with a template and remove the duplication:

template <typename T>
T Max(T x, T y) {
    return x > y ? x : y;
}

The template definition isn’t real code; by itself it doesn’t produce any machine code. When the compiler sees the callsites Max(3, 4) and Max( string{"abc"}, string{"xyz"} ), it looks for functions called Max() whose parameter types match the arguments. Finding no ordinary overloads, it turns to the template definition, deduces T from the argument types, and generates concrete function implementations from the template. This step is called template instantiation. Two versions of Max() are generated: one takes two ints, the other takes two std::strings. A function template thus defines a family of functions that the compiler generates on demand, based on the callsites that exist in the program.
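
As a minimal sketch of the instantiation process described above, the following self-check calls the by-value Max() template with different argument types. The function name check_instantiations() is hypothetical and exists only for illustration; the asserts document what each call is expected to return.

```cpp
#include <cassert>
#include <string>

template <typename T>
T Max(T x, T y) {
    return x > y ? x : y;
}

// Hypothetical self-check documenting which instantiation each call uses.
void check_instantiations() {
    // T is deduced as int: the compiler generates Max<int>(int, int).
    assert(Max(3, 4) == 4);

    // T is deduced as std::string: a second, separate function is generated.
    assert(Max(std::string{"abc"}, std::string{"xyz"}) == "xyz");

    // The template argument can also be spelled out explicitly; here both
    // arguments are converted to double before the comparison.
    assert(Max<double>(3, 4.5) == 4.5);

    // Max(3, 4.5) without <double> would not compile: deduction finds
    // T = int for the first argument and T = double for the second.
}
```

Each distinct T used in the program yields its own generated function; a T that is never used generates nothing.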

By the way, notice that we’re taking the arguments x and y and returning the result by value. If T is a class type, the copy constructor may be invoked for each of the three. It would be better to use const T& for all three because we’re not mutating anything and we’re not creating a new object to return:

template <typename T>
const T& Max(const T& x, const T& y) {
    return x > y ? x : y;
}
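
To make the cost of passing by value concrete, here is a small sketch that counts copy-constructor calls. The Tracked type and the names MaxByValue(), MaxByRef(), CopiesByValue(), and CopiesByRef() are hypothetical and introduced only for this illustration; they are the two versions of Max() from above under different names so that both can coexist in one file.

```cpp
#include <cassert>

// Hypothetical type that counts how many times it is copy-constructed.
struct Tracked {
    int value;
    static int copies;

    explicit Tracked(int v) : value(v) {}
    Tracked(const Tracked& other) : value(other.value) { ++copies; }

    bool operator>(const Tracked& rhs) const { return value > rhs.value; }
};
int Tracked::copies = 0;

// The by-value version of Max() from the text.
template <typename T>
T MaxByValue(T x, T y) {
    return x > y ? x : y;
}

// The const-reference version of Max() from the text.
template <typename T>
const T& MaxByRef(const T& x, const T& y) {
    return x > y ? x : y;
}

// Returns the number of copies made by one call to MaxByValue():
// one per parameter, plus one for the returned object.
int CopiesByValue() {
    Tracked a{1}, b{2};
    Tracked::copies = 0;
    Tracked result = MaxByValue(a, b);
    (void)result;
    return Tracked::copies;
}

// Returns the number of copies made by one call to MaxByRef().
int CopiesByRef() {
    Tracked a{1}, b{2};
    Tracked::copies = 0;
    const Tracked& result = MaxByRef(a, b);
    assert(&result == &b);  // the result refers to b itself, not to a copy
    return Tracked::copies;
}
```

With these definitions, the by-value call should report at least three copies, while the by-reference call reports none.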

Template specialization

Next, let’s consider what happens when we add the following invocation of the Max() function to our program:

cout << Max("AAA", "BBB") << endl;

This is not the same as the std::string invocation. The string literals have type const char[4], so the compiler instantiates a new Max() with T = char[4], which takes two references to character arrays. When we try to compile the new program, the compiler emits a warning:

In instantiation of ‘const T& Max(const T&, const T&) [with T = char [4]]’:
...
warning: comparison between two arrays is deprecated in C++20 [-Warray-compare]
     |     return x > y ? x : y;
     |            ~~^~~
...

The compiler is warning you about trying to compare two arrays. As was the case in C, the comparison decays into comparing the memory addresses of the first element in each array. The result of such a relational comparison is unspecified in C++ (and undefined in C) when the two pointers don’t refer to elements of the same array; where the compiler places each array in memory is arbitrary.

Sure enough, our output is incorrect:

4;xyz
AAA

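It seems that our templatized implementation that simply invokes operator>() won’t work for C strings. What do we do if we have a function template implementation but need to adjust the definition for a particular type? We can provide a dedicated version of the function for that type and have the compiler use it instead of the generic one. The simplest way to do so is an ordinary, non-template overload, which the compiler prefers over the template whenever both match equally well. (Strictly speaking, the function below is an overload rather than an explicit specialization, which would be written with template <>.)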

// strcmp() is declared in <cstring>
const char* Max(const char* x, const char* y) {
    return strcmp(x, y) > 0 ? x : y;
}

The program now computes Max("AAA", "BBB") correctly:

4;xyz
BBB
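
The following sketch puts the generic template and the C-string version side by side to show which one each call selects. It assumes the const T& form of Max() from earlier; the function check_c_string_max() is a hypothetical self-check added for illustration.

```cpp
#include <cassert>
#include <cstring>
#include <string>

template <typename T>
const T& Max(const T& x, const T& y) {
    return x > y ? x : y;
}

// Ordinary (non-template) function for C strings: compares the
// characters with strcmp() instead of comparing addresses.
const char* Max(const char* x, const char* y) {
    return std::strcmp(x, y) > 0 ? x : y;
}

// Hypothetical self-check documenting which version each call selects.
void check_c_string_max() {
    // String literals decay to const char* and select the C-string version.
    assert(std::strcmp(Max("AAA", "BBB"), "BBB") == 0);

    // Literals of different lengths work too; the template alone could
    // not deduce a single T from char[2] and char[4].
    assert(std::strcmp(Max("A", "BBB"), "BBB") == 0);

    // Other types still go through the template.
    assert(Max(3, 4) == 4);
    assert(Max(std::string{"abc"}, std::string{"xyz"}) == "xyz");
}
```

Because the C-string version is a plain function, it needs no template syntax and can be declared in a header and defined in a source file like any other function.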

Compilation Model

So far, we’ve been putting the template definition in the same source file as our main() function. To organize our code properly, we might think that we should break out our Max() implementation into a header and a source file like this:

// max.h

#ifndef MAX_H  // names containing a double underscore are reserved for the implementation
#define MAX_H

template <typename T>
const T& Max(const T& x, const T& y);

#endif

// max.cpp

template <typename T>
const T& Max(const T& x, const T& y) {
    return x > y ? x : y;
}

After all, we’ve always split our source code by placing prototypes in header files and implementations in source files. Anyone who wants to use our Max() template could just include our max.h to see the prototype in order to compile, and then link with our max.o to get the implementation.

Recall from earlier, however, that we said that a template definition is not real code. It’s just an outline used by the compiler to generate real code on-the-fly as it sees callsites of the Max() function. If we compile our hypothetical max.cpp by itself, it’ll basically be an empty object file; the compiler won’t generate template code unless it sees the template being used.

As such, the usual solution is to define our template directly in the header file max.h:

// max.h

#ifndef MAX_H  // names containing a double underscore are reserved for the implementation
#define MAX_H

template <typename T>
const T& Max(const T& x, const T& y) {
    return x > y ? x : y;
}

#endif

Any source code files that call the Max() function need to include our max.h header file to bring in the template definition in order to compile their code.
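
For completeness, there is one way to keep the definition out of the header that these notes don’t otherwise cover: explicit instantiation. The sketch below is written as a single file so that it compiles on its own; the comments mark which parts would live in a declaration-only max.h, in max.cpp, and in a client file. The names LargerInt() and check_explicit_instantiation() are hypothetical, and the choice of int and double is an assumption about which types clients need.

```cpp
#include <cassert>

// --- what a declaration-only max.h would contain ---
template <typename T>
const T& Max(const T& x, const T& y);

// --- what max.cpp would contain ---
template <typename T>
const T& Max(const T& x, const T& y) {
    return x > y ? x : y;
}

// Explicit instantiation definitions: ask the compiler to generate
// these two functions here even though nothing in max.cpp calls them.
template const int& Max<int>(const int& x, const int& y);
template const double& Max<double>(const double& x, const double& y);

// --- what a client file that only saw the declaration would contain ---
int LargerInt(int a, int b) {
    return Max(a, b);  // resolved at link time against the code in max.o
}

// Hypothetical self-check for this sketch.
void check_explicit_instantiation() {
    assert(LargerInt(5, 6) == 6);
}
```

The trade-off is that max.cpp must list every type in advance; a client that calls Max() with any other type would fail to link. That is why defining the template in the header remains the common approach.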

What happens when you have multiple source files that use the same template definition? Let’s consider func1.cpp and func2.cpp, two source files that both use the Max() template:

// func1.cpp

#include "max.h"

int func1(int x, int y) {
    return Max(x, y); // will instantiate Max(int,int)
}

// func2.cpp

#include "max.h"

int func2(int x, int y) {
    return Max(x, y); // will instantiate Max(int,int)
}

We’ll now call func1() and func2() from our max-test.cpp program:

// max-test.cpp

#include <iostream>

int main() {
    using namespace std;

    int func1(int, int); // defined in func1.cpp
    int func2(int, int); // defined in func2.cpp

    cout << func1(5, 6) << ";" << func2(7, 8) << endl;
}

When func1.cpp and func2.cpp are separately compiled, each object file gets its own copy of Max(int, int). Let’s take a look inside of func1.o and func2.o to verify this. The objdump -d command disassembles the machine code in the given object file:

$ objdump -d func1.o

func1.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <_Z5func1ii>:
   0:   f3 0f 1e fa             endbr64
   4:   55                      push   %rbp
   5:   48 89 e5                mov    %rsp,%rbp
   8:   48 83 ec 10             sub    $0x10,%rsp
   c:   89 7d fc                mov    %edi,-0x4(%rbp)
   f:   89 75 f8                mov    %esi,-0x8(%rbp)
  12:   48 8d 55 f8             lea    -0x8(%rbp),%rdx
  16:   48 8d 45 fc             lea    -0x4(%rbp),%rax
  1a:   48 89 d6                mov    %rdx,%rsi
  1d:   48 89 c7                mov    %rax,%rdi
  20:   e8 00 00 00 00          call   25 <_Z5func1ii+0x25>
  25:   8b 00                   mov    (%rax),%eax
  27:   c9                      leave
  28:   c3                      ret

Disassembly of section .text._Z3MaxIiERKT_S2_S2_:

0000000000000000 <_Z3MaxIiERKT_S2_S2_>:
   0:   f3 0f 1e fa             endbr64
   4:   55                      push   %rbp
   5:   48 89 e5                mov    %rsp,%rbp
   8:   48 89 7d f8             mov    %rdi,-0x8(%rbp)
   c:   48 89 75 f0             mov    %rsi,-0x10(%rbp)
  10:   48 8b 45 f8             mov    -0x8(%rbp),%rax
  14:   8b 10                   mov    (%rax),%edx
  16:   48 8b 45 f0             mov    -0x10(%rbp),%rax
  1a:   8b 00                   mov    (%rax),%eax
  1c:   39 c2                   cmp    %eax,%edx
  1e:   7e 06                   jle    26 <_Z3MaxIiERKT_S2_S2_+0x26>
  20:   48 8b 45 f8             mov    -0x8(%rbp),%rax
  24:   eb 04                   jmp    2a <_Z3MaxIiERKT_S2_S2_+0x2a>
  26:   48 8b 45 f0             mov    -0x10(%rbp),%rax
  2a:   5d                      pop    %rbp
  2b:   c3                      ret

$ objdump -d func2.o

func2.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <_Z5func2ii>:
   0:   f3 0f 1e fa             endbr64
   4:   55                      push   %rbp
   5:   48 89 e5                mov    %rsp,%rbp
   8:   48 83 ec 10             sub    $0x10,%rsp
   c:   89 7d fc                mov    %edi,-0x4(%rbp)
   f:   89 75 f8                mov    %esi,-0x8(%rbp)
  12:   48 8d 55 f8             lea    -0x8(%rbp),%rdx
  16:   48 8d 45 fc             lea    -0x4(%rbp),%rax
  1a:   48 89 d6                mov    %rdx,%rsi
  1d:   48 89 c7                mov    %rax,%rdi
  20:   e8 00 00 00 00          call   25 <_Z5func2ii+0x25>
  25:   8b 00                   mov    (%rax),%eax
  27:   c9                      leave
  28:   c3                      ret

Disassembly of section .text._Z3MaxIiERKT_S2_S2_:

0000000000000000 <_Z3MaxIiERKT_S2_S2_>:
   0:   f3 0f 1e fa             endbr64
   4:   55                      push   %rbp
   5:   48 89 e5                mov    %rsp,%rbp
   8:   48 89 7d f8             mov    %rdi,-0x8(%rbp)
   c:   48 89 75 f0             mov    %rsi,-0x10(%rbp)
  10:   48 8b 45 f8             mov    -0x8(%rbp),%rax
  14:   8b 10                   mov    (%rax),%edx
  16:   48 8b 45 f0             mov    -0x10(%rbp),%rax
  1a:   8b 00                   mov    (%rax),%eax
  1c:   39 c2                   cmp    %eax,%edx
  1e:   7e 06                   jle    26 <_Z3MaxIiERKT_S2_S2_+0x26>
  20:   48 8b 45 f8             mov    -0x8(%rbp),%rax
  24:   eb 04                   jmp    2a <_Z3MaxIiERKT_S2_S2_+0x2a>
  26:   48 8b 45 f0             mov    -0x10(%rbp),%rax
  2a:   5d                      pop    %rbp
  2b:   c3                      ret

First, notice the names of the functions: _Z5func1ii, _Z5func2ii, and _Z3MaxIiERKT_S2_S2_. Because C++ allows function overloading, a function name alone is no longer unique: there can be several functions with the same name that take different parameter types. To disambiguate them, the C++ compiler performs name mangling, encoding each function’s parameter types (and, for template instantiations, its template arguments) into the symbol name so that every overload gets a unique symbol.
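
Mangled names can be translated back into readable signatures. The sketch below assumes GCC or Clang, whose <cxxabi.h> header provides abi::__cxa_demangle(); the helper names Demangle() and check_demangle() are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>  // GCC/Clang-specific header (an assumption of this sketch)
#include <string>

// Returns the demangled form of a mangled symbol name,
// or the input unchanged if demangling fails.
std::string Demangle(const char* mangled) {
    int status = 0;
    char* demangled = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || demangled == nullptr) {
        return mangled;
    }
    std::string result{demangled};
    std::free(demangled);  // the returned buffer is allocated with malloc()
    return result;
}

// Hypothetical self-check using the symbol names from the listings above.
void check_demangle() {
    assert(Demangle("_Z5func1ii") == "func1(int, int)");
    assert(Demangle("_Z5func2ii") == "func2(int, int)");
}
```

Passing _Z3MaxIiERKT_S2_S2_ to the same helper reveals the Max<int> instantiation. The c++filt command-line tool performs the same translation without writing any code.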

Second, we can see that both object files indeed contain the same implementation of Max(int, int), named _Z3MaxIiERKT_S2_S2_. When we link func1.o and func2.o together, shouldn’t the linker complain about the duplicate definitions of _Z3MaxIiERKT_S2_S2_? It would for ordinary functions, but functions instantiated from templates are special. The compiler makes a note for the linker that it may see multiple definitions of the instantiated function and that it should just keep one of them in the final executable.

Weak Binding in ELF Object Files (Optional)

If you’re familiar with the ELF binary format, we can dive deeper and see how the compiler and linker handle duplicate template instantiations. Let’s take a look at the symbol table for func1.o:

$ readelf --symbols func1.o

Symbol table '.symtab' contains 12 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS func1.cpp
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    2 .text
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    6 .text._Z3MaxIiER[...]
     4: 0000000000000000     0 SECTION LOCAL  DEFAULT    7 .debug_info
     5: 0000000000000000     0 SECTION LOCAL  DEFAULT    9 .debug_abbrev
     6: 0000000000000000     0 SECTION LOCAL  DEFAULT   12 .debug_rnglists
     7: 0000000000000000     0 SECTION LOCAL  DEFAULT   14 .debug_line
     8: 0000000000000000     0 SECTION LOCAL  DEFAULT   16 .debug_str
     9: 0000000000000000     0 SECTION LOCAL  DEFAULT   17 .debug_line_str
    10: 0000000000000000    41 FUNC    GLOBAL DEFAULT    2 _Z5func1ii
    11: 0000000000000000    44 FUNC    WEAK   DEFAULT    6 _Z3MaxIiERKT_S2_S2_

The Max(int, int) instantiation of the Max() function template, _Z3MaxIiERKT_S2_S2_, has WEAK binding, whereas the ordinary function _Z5func1ii has GLOBAL binding. Multiple weak definitions of the same symbol do not cause a link error: when the linker creates the final executable and sees only weak definitions of Max(int, int), it keeps one of them and discards the rest.

Modules, introduced in C++20, changed this model of template compilation, but compiler support is not widely available at the time of this writing.

Last updated: 2025-09-03