Pointers in C++

Introduction

Pointers are a fundamental concept in C++ programming, offering a powerful mechanism for managing memory and facilitating advanced operations. In this chapter we will explore pointers from their basic syntax to advanced use cases, covering everything you need to know to master this essential aspect of C++.

Introduction to Pointers:

At its core, a pointer is a variable that stores the memory address of another variable.

Consider a normal variable, as given follows:

char z {}; // chars use 1 byte of memory

Simplifying a bit, when the code is generated for this definition is executed, a piece of memory from RAM will be assigned to this object. For the sake of example, let's say that the variable z is assigned memory address 100. Whenever we use variable z is an expression or statement, the program will go to memory address 100 to access the value stored there.

The nice thing about variables is that we don't need to worry about what specific memory address are assigned, or how many bytes are required to store the object's value. We just refer to the variable by its given identifier, and the compiler translated this name into the appropriately assigned memory address. The compiler takes care of all the addressing.

This is also true with references:

int main()
{
    char x {}; // assume this is assigned memory address 140
    char& ref { x }; // ref is an lvalue reference to x (when used with a type, & means lvalue reference)

    return 0;
}

Because ref acts as an alias for x, whenever we use ref, the program will go to memory address 100 to access the value. Again the compiler takes care of the addressing, so that we don't have to think about it.

The address-of operator (&)

Although the memory used by variables aren't exposed to us by default, we do have access to this information. The address-of operator (&) returns the memory address of its operand. This is pretty straightforward:

#include <iostream>

int main()
{
    int x{ 7 };
    std::cout << x << '\n';  // print the value of variable x
    std::cout << &x << '\n'; // print the memory address of variable x

    return 0;
}

On my machine, the above program printed:

7
0056DED

In the above example, we use the address-of-operator (&) to retrieve the address assigned to variable x and print that address to the console. Memory addresses are typically printed as hexadecimal values often without the 0x prefix.

The dereference operator (*)

Getting the address of a variable isn't very useful by itself.

The most useful thing we can do with an address is access the value stored at that address. The dereference operator (*) (also occasionally called the indirection operator) returns the value at a given memory address as an lvalue:

#include <iostream>

int main()
{
    int z{ 5 };
    std::cout << z << '\n';  // print the value of variable z
    std::cout << &z << '\n'; // print the memory address of variable z

    std::cout << *(&z) << '\n'; // print the value at the memory address of variable z (parentheses not required, but make it easier to read)

    return 0;
}

On author's machine, the above program printed:

5
0045FDE5
5

In this program, first we declare a variable z and prints its value. Then we print the address of variable z. Finally, we use the dereference operator to get the value at the memory address of variable z (which is just the value of z), which is printed to the console.

Pointers

A pointer is an object that holds a memory address (typically of another variable) as its value.

Much like reference type are declared using an ampersand (&) character, pointer types are declared using an asterisk (*).

int;  // a normal int
int&; // an lvalue reference to an int value

int*; // a pointer to an int value (holds the address of an integer value)

To create a pointer variable, we simply define a variable with a pointer type:

int main()
{
    int x { 5 };    // normal variable
    int& ref { x }; // a reference to an integer (bound to x)

    int* ptr;       // a pointer to an integer

    return 0;
}

Best Practice: When declaring a pointer type, place the asterisk next to the type name.

Although you generally should not declare multiple variables on a single line, if you do, the asterisk has to be included with each variable.

For example:

int* ptr1, ptr2;   // incorrect: ptr1 is a pointer to an int, but ptr2 is just a plain int!
int* ptr3, * ptr4; // correct: ptr3 and ptr4 are both pointers to an int

Pointer Initialization

Like normal variables, pointers are not initialized by default. A pointer that has not been initialized is sometimes called a wild pointer. Wild pointer contain a garbage address, and dereferencing a wild pointer will result in undefined behavior. Because of this, you should always initialize your pointers to a known value.

int main()
{
    int x{ 5 };

    int* ptr;        // an uninitialized pointer (holds a garbage address)
    int* ptr2{};     // a null pointer (we'll discuss these in the next lesson)
    int* ptr3{ &x }; // a pointer initialized with the address of variable x

    return 0;
}

Since pointers hold addresses, when we initialize or assign a value to a pointer, that value has to be an address. Typically, pointers are used to hold the address of another variable (which we can get using the address-of operator (&).

Once we have a pointer holding the address of another object, we can then use the dereference operator (*) to access the value at that address. For example:

#include <iostream>

int main()
{
    int x{ 5 };
    std::cout << x << '\n'; // print the value of variable x

    int* ptr{ &x }; // ptr holds the address of x
    std::cout << *ptr << '\n'; // use dereference operator to print the value at the address that ptr is holding (which is x's address)

    return 0;
}

// This prints:
5
5

Pointers and Assignment

We can use assignment with pointers in two different ways:

  1. To change what the pointer pointing at (by assigning the pointer a new address)
  2. To change the value being pointed at (by assigning the dereferenced pointer a new value)

First, let's look at a case where a pointer is changed to point at a different object:

#include <iostream>

int main()
{
    int x{ 5 };
    int* ptr{ &x }; // ptr initialized to point at x

    std::cout << *ptr << '\n'; // print the value at the address being pointed to (x's address)

    int y{ 6 };
    ptr = &y; // // change ptr to point at y

    std::cout << *ptr << '\n'; // print the value at the address being pointed to (y's address)

    return 0;
}

// Output
5
6

In the above example, we define pointer ptr, initialize it with the address of x, and dereference the pointer to print the value being pointed to (5). We then use the assignment operator to change the address that ptr is holding to the address of y. We then dereference the pointer again to print the value being pointed to (which is now 6).

Now let's look at how we can also use a pointer to change the value being pointed at:

#include <iostream>

int main()
{
    int x{ 5 };
    int* ptr{ &x }; // initialize ptr with address of variable x

    std::cout << x << '\n';    // print x's value
    std::cout << *ptr << '\n'; // print the value at the address that ptr is holding (x's address)

    *ptr = 6; // The object at the address held by ptr (x) assigned value 6 (note that ptr is dereferenced here)

    std::cout << x << '\n';
    std::cout << *ptr << '\n'; // print the value at the address that ptr is holding (x's address)

    return 0;
}

// Output
5
5
6
6

In this example, we define pointer ptr, initialize it with the address of x, and then print the value of both x and *ptr (5). Because *ptr returns an lvalue, we can use this on left hand side of an assignment statement, which we do to change the value being pointed at by ptr to 6. We then print the value of both x and *ptr again to show that the value has been updated as expected.

When we use pointer without a dereference (ptr), we are accessing the address held by the pointer. Modifying this (ptr = &y) changes what the pointer is pointing at.

When we dereference a pointer (*ptr), we are accessing the object being  pointed at. Modifying this (*ptr = 6) changes the value of the object being pointed at.

Pointers behave much like lvalue references

Pointers and lvalue references behave similarly. Consider the following program:

#include <iostream>

int main()
{
    int x{ 5 };
    int& ref { x };  // get a reference to x
    int* ptr { &x }; // get a pointer to x

    std::cout << x;
    std::cout << ref;  // use the reference to print x's value (5)
    std::cout << *ptr << '\n'; // use the pointer to print x's value (5)

    ref = 6; // use the reference to change the value of x
    std::cout << x;
    std::cout << ref;  // use the reference to print x's value (6)
    std::cout << *ptr << '\n'; // use the pointer to print x's value (6)

    *ptr = 7; // use the pointer to change the value of x
    std::cout << x;
    std::cout << ref;  // use the reference to print x's value (7)
    std::cout << *ptr << '\n'; // use the pointer to print x's value (7)

    return 0;
}

// Output
555
666
777

In the above program, we create a normal variable x with value 5, and then create an lvalue reference and a pointer to x. Next, we use the lvalue reference to change the value from 5 to 6, and show that wer can access that updated value via all three methods.

Finally. we use the dereferenced pointer to change the value from 6 to 7, and again show that we can access the updated value via all three methods.

Thus, pointers and references both provide a way to indirectly access another object. The primary difference is that with pointers, we need to explicitly get the address to point at, and we have to explicitly dereference the pointer to get the value. With references, the address-of and dereference happens implicitly.

There are some other differences between pointers and references worth mentioning:

  • References must be initialized, pointers are not required to be initialized (but should be).
  • References are not objects, pointers are.
  • References can not be reseated (changed to reference something else), pointers can change what they are pointing at.
  • References must always be bound to an object, pointers can point to nothing.
  • References are “safe” (outside of dangling references), pointers are inherently dangerous.

The address-of operator returns a pointer

It's worth noting that the address-of operator (&) doesn't the address of its operand as a literal. Instead, it returns a pointer containing the address of the operand, whose type is derived from the argument (e.g., taking the address of an int will return the address in a int pointer).

We can see this in the following example:

#include <iostream>
#include <typeinfo>

int main()
{
	int x{ 4 };
	std::cout << typeid(&x).name() << '\n'; // print the type of &x

	return 0;
}

With gcc, this prints “pi” (pointer to int) instead.

The size of pointers

The size of a pointer is dependent upon the architecture the executable is compiler for – 32-bit  executable used 32-bit memory addresses – consequently, a pointer on a 32-bit machine is 32 bits (4 bytes). With a 64-bit executable, a pointer would be 64 bits (8 bytes). Note that this true regardless of the size of the object being pointed to:

#include <iostream>

int main() // assume a 32-bit application
{
    char* chPtr{};        // chars are 1 byte
    int* iPtr{};          // ints are usually 4 bytes
    long double* ldPtr{}; // long doubles are usually 8 or 12 bytes

    std::cout << sizeof(chPtr) << '\n'; // prints 4
    std::cout << sizeof(iPtr) << '\n';  // prints 4
    std::cout << sizeof(ldPtr) << '\n'; // prints 4

    return 0;
}

The size of the pointer is always the same. This is because a pointer is just a memory address, and the number of bits needed to access a memory address is constant.

Dangling pointers

Much like a dangling reference, a dangling pointer is a pointer that is holding the address of an object that is no longer valid (e.g., because it has been destroyed).

Dereferencing a dangling pointer (e.g., in order to print the value being pointed at) will lead to undefined behavior, as you are trying to access an object that is no longer valid

#include <iostream>

int main()
{
    int x{ 5 };
    int* ptr{ &x };

    std::cout << *ptr << '\n'; // valid

    {
        int y{ 6 };
        ptr = &y;

        std::cout << *ptr << '\n'; // valid
    } // y goes out of scope, and ptr is now dangling

    std::cout << *ptr << '\n'; // undefined behavior from dereferencing a dangling pointer

    return 0;
}

Null pointers

Besides a memory address, there is one additional value that a pointer can hold: a null value. A null value (often shortened to null) is a special value that means something has no value. When a pointer is holding a null value, it means the pointer is not pointing to anything. Such a pointer is called a null pointer.

The easiest way to create a null pointer is to use value initialization:

int main()
{
    int* ptr {}; // ptr is now a null pointer, and is not holding an address

    return 0;
}

Because we can use assignment change what a pointer is pointing at, a pointer that is initially set to null can later be changed to point at a valid object:

#include <iostream>

int main()
{
    int* ptr {}; // ptr is a null pointer, and is not holding an address

    int x { 5 };
    ptr = &x; // ptr now pointing at object x (no longer a null pointer)

    std::cout << *ptr << '\n'; // print value of x through dereferenced ptr

    return 0;
}

The nullptr keyword

Much like keywords true and false represent Boolean literal values, the nullptr keyword represents a null pointer literal. We can use nullptr to explicitly initialize or assign a pointer a null value.

int main()
{
    int* ptr { nullptr }; // can use nullptr to initialize a pointer to be a null pointer

    int value { 5 };
    int* ptr2 { &value }; // ptr2 is a valid pointer
    ptr2 = nullptr; // Can assign nullptr to make the pointer a null pointer

    someFunction(nullptr); // we can also pass nullptr to a function that has a pointer parameter

    return 0;
}

In the above example, we use assignment to set the value of ptr2 to nullptr, making ptr2 a null pointer.

Dereferencing a null pointer results in undefined behavior

Much like dereferencing a dangling (or wild) pointer leads to undefined behavior, dereferencing a null pointer also leads to undefined behavior.

The following program illustrates this, and will probably crash or terminate your application abnormally when you run it.

#include <iostream>

int main()
{
    int* ptr {}; // Create a null pointer
    std::cout << *ptr << '\n'; // Dereference the null pointer

    return 0;
}

Conceptually, this makes sense. Dereferencing a pointer means “go to the address the pointer is pointing at and access the value there”. A null pointer holds a null value, which semantically means the pointer is not pointing at anything.

Accidentally dereferencing null and dangling pointers is one of the most common mistakes C++ programmers make.

Checking for null pointers

Much like we can use a conditional to test Boolean values for true or false, we can use a conditional to test whether a pointer has value nullptr or not:

#include <iostream>

int main()
{
    int x { 5 };
    int* ptr { &x };

    if (ptr == nullptr) // explicit test for equivalence
        std::cout << "ptr is null\n";
    else
        std::cout << "ptr is non-null\n";

    int* nullPtr {};
    std::cout << "nullPtr is " << (nullPtr==nullptr ? "null\n" : "non-null\n"); // explicit test for equivalence

    return 0;
}

// Output
ptr is non-null
nullPtr is null

Boolean values, we noted that integral values will implicitly convert into Boolean values: an integral value of 0 converts to Boolean value false, and any other integral value converts to Boolean value true.

Similarly, pointers will also implicitly convert to Boolean values: a null pointer converts to Boolean value false, and a non-null pointer converts to Boolean value true. This allows us to skip explicitly testing for nullptr and just use the implicit conversion to Boolean to test whether a pointer is a null pointer. The following program is equivalent to prior one:

#include <iostream>

int main()
{
    int x { 5 };
    int* ptr { &x };

    // pointers convert to Boolean false if they are null, and Boolean true if they are non-null
    if (ptr) // implicit conversion to Boolean
        std::cout << "ptr is non-null\n";
    else
        std::cout << "ptr is null\n";

    int* nullPtr {};
    std::cout << "nullPtr is " << (nullPtr ? "non-null\n" : "null\n"); // implicit conversion to Boolean

    return 0;
}

Legacy null pointer literals: 0 and NULL

In older code, you may see two other literal values used instead of nullptr.

The first is the literal 0. In the context of a pointer, the literal 0 is specially defined to mean a null value, and is the only time you can assign an integral literal to a pointer.

int main()
{
    float* ptr { 0 };  // ptr is now a null pointer (for example only, don't do this)

    float* ptr2; // ptr2 is uninitialized
    ptr2 = 0; // ptr2 is now a null pointer (for example only, don't do this)

    return 0;
}

Additionally, there is a preprocessor macro name NULL (defined in the <cstddef> header). This macro is inherited from C, where it is commonly used to indicate a null pointer.

#include <cstddef> // for NULL

int main()
{
    double* ptr { NULL }; // ptr is a null pointer

    double* ptr2; // ptr2 is uninitialized
    ptr2 = NULL; // ptr2 is now a null pointer

    return 0;
}

Both 0 and NULL should be avoided in modern C++ (use nullptr instead)

Favor references over pointers whenever possible

Pointers and references both give us the ability to access some other object indirectly.

Pointers have the additional abilities of being able to change what they are pointing at, and to be pointed at null. However, these pointer abilities are also inherently dangerous: A null pointer runs the risk of being dereferenced, and the ability to change what a pointer is pointing at can make creating dangling pointers easier:

int main()
{
    int* ptr { };

    {
        int x{ 5 };
        ptr = &x; // set the pointer to an object that will be destroyed (not possible with a reference)
    } // ptr is now dangling

    return 0;
}

Since references can't be bound to null, we don't have to worry about null references. And because references must be bound to a valid object upon creation and then can not be reseated, dangling references are harder to create.

Pointers and const

int main()
{
    int x { 5 };
    int* ptr { &x }; // ptr is a normal (non-const) pointer

    int y { 6 };
    ptr = &y; // we can point at another value

    *ptr = 7; // we can change the value at the address being held

    return 0;
}

With normal (non-const) pointers, we can change both what the pointer is pointing at (by assigning the pointer a new address to hold) or change the value at the address being held (by assigning a new value to the dereferenced pointer).

int main()
{
    const int x { 5 }; // x is now const
    int* ptr { &x };   // compile error: cannot convert from const int* to int*

    return 0;
}

The above snipped won't compile – we can't set a normal pointer to point at a const variable. This makes sense: a const variable is one whose value cannot be changed. Allowing the programmers to set a non-const pointer to a const value would allow the programmer to dereference the pointer and change the value. That would violate the const-ness of the variable.

Pointer to const value

A pointer to a const value (sometimes called a pointer to const for short) is a (non-const) pointer that points to a constant value.

To declare a pointer to a const value, use the const keyword before the pointer's data type:

 int main()
{
    const int x{ 5 };
    const int* ptr { &x }; // okay: ptr is pointing to a "const int"

    *ptr = 6; // not allowed: we can't change a const value

    return 0;
}

In the above example, ptr points to a const int. Because the data type being pointed to is const, the value being pointed to can't be changed.

However, because a pointer to const is not const itself (it just points to a const value), we can change what the pointer is pointing at by assigning the pointer a new address:

int main()
{
    const int x{ 5 };
    const int* ptr { &x }; // ptr points to const int x

    const int y{ 6 };
    ptr = &y; // okay: ptr now points at const int y

    return 0;
}

Just like a reference to const, a pointer to const can point to non-const variable too. A pointer to const treats the value being pointed to as constant, regardless of whether the object at that address was initially defined as const or not:

int main()
{
    int x{ 5 }; // non-const
    const int* ptr { &x }; // ptr points to a "const int"

    *ptr = 6;  // not allowed: ptr points to a "const int" so we can't change the value through ptr
    x = 6; // allowed: the value is still non-const when accessed through non-const identifier x

    return 0;
}

Const pointers

We can also make a pointer itself constant. A const pointer is a pointer whose address can not be changed after initialization.

To declare a const pointer, use the const keyword after the asterisk in the pointer declaration:

int main()
{
    int x{ 5 };
    int* const ptr { &x }; // const after the asterisk means this is a const pointer

    return 0;
}

In the above case, ptr is a const pointer to a (a non-const) int value.

Just like a normal const variable, a const pointer must be initialized upon definition and this value can't be changed via assignment:

int main()
{
    int x{ 5 };
    int y{ 6 };

    int* const ptr { &x }; // okay: the const pointer is initialized to the address of x
    ptr = &y; // error: once initialized, a const pointer can not be changed.

    return 0;
}

However, because the value being pointed to is “non-const”, it is possible to change the value being pointed to via dereferencing the const pointer:

int main()
{
    int x{ 5 };
    int* const ptr { &x }; // ptr will always point to x

    *ptr = 6; // okay: the value being pointed to is non-const

    return 0;
}

Const pointer to a const value

Finally, it is possible to declare a const pointer to a const value by using the const keyword both before the type and after the asterisk:

int main()
{
    int value { 5 };
    const int* const ptr { &value }; // a const pointer to a const value

    return 0;
}

A const pointer to a const value can not have its address changed, nor can the value it is pointing to be changed through the pointer. It can only be dereferenced to get the value it is pointing at.

Recap

To summarize, we only need to remember 4 rules:

  • A non-const pointer can be assigned another address to change what it is pointing at.
  • A const pointer always points to the same address, and this address can not be changed.
  • A pointer to a non-const value can change the value it is pointing to. These can not point to a const value.
  • A pointer to a const value treats the value as const when accessed through the pointer, and thus can not change the value it is pointing to. These can be pointed to const or non-const l-values.

Keeping the declaration syntax straight can be a bit challenging:

  • A const before the asterisk is associated with the type being pointed to. Therefore this is a pointer to a const value, and the value cannot be modified through the pointer.
  • A const after the asterisk is associated with the pointer itself. Therefore, this pointer cannot be assigned a new address.
int main()
{
    int v{ 5 };

    int* ptr0 { &v };             // points to an "int" but is not const itself, so this is a normal pointer.
    const int* ptr1 { &v };       // points to a "const int" but is not const itself, so this is a pointer to a const value.
    int* const ptr2 { &v };       // points to an "int" and is const itself, so this is a const pointer (to a non-const value).
    const int* const ptr3 { &v }; // points to an "const int" and is const itself, so this is a const pointer to a const value.

    // if the const is on the left side of the *, the const belongs to the value
    // if the const is on the right side of the *, the const belongs to the pointer

    return 0;
}