Introduction
Pointers are a fundamental concept in C++ programming, offering a powerful mechanism for managing memory and facilitating advanced operations. In this chapter we will explore pointers from their basic syntax to advanced use cases, covering everything you need to know to master this essential aspect of C++.
Introduction to Pointers:
At its core, a pointer is a variable that stores the memory address of another variable.
Consider a normal variable, such as the following:
char z {}; // chars use 1 byte of memory
Simplifying a bit, when the code generated for this definition is executed, a piece of memory from RAM is assigned to this object. For the sake of example, let's say that the variable z is assigned memory address 100. Whenever we use variable z in an expression or statement, the program will go to memory address 100 to access the value stored there.
The nice thing about variables is that we don't need to worry about which specific memory addresses are assigned, or how many bytes are required to store the object's value. We just refer to the variable by its given identifier, and the compiler translates this name into the appropriately assigned memory address. The compiler takes care of all the addressing.
This is also true with references:
int main()
{
char x {}; // assume this is assigned memory address 140
char& ref { x }; // ref is an lvalue reference to x (when used with a type, & means lvalue reference)
return 0;
}
Because ref acts as an alias for x, whenever we use ref, the program will go to memory address 140 to access the value. Again, the compiler takes care of the addressing, so that we don't have to think about it.
The address-of operator (&)
Although the memory addresses used by variables aren't exposed to us by default, we do have access to this information. The address-of operator (&) returns the memory address of its operand. This is pretty straightforward:
#include <iostream>
int main()
{
int x{ 7 };
std::cout << x << '\n'; // print the value of variable x
std::cout << &x << '\n'; // print the memory address of variable x
return 0;
}
On my machine, the above program printed:
7
0056DED
In the above example, we use the address-of operator (&) to retrieve the address assigned to variable x and print that address to the console. Memory addresses are typically printed as hexadecimal values, often without the 0x prefix.
The dereference operator (*)
Getting the address of a variable isn't very useful by itself.
The most useful thing we can do with an address is access the value stored at that address. The dereference operator (*) (also occasionally called the indirection operator) returns the value at a given memory address as an lvalue:
#include <iostream>
int main()
{
int z{ 5 };
std::cout << z << '\n'; // print the value of variable z
std::cout << &z << '\n'; // print the memory address of variable z
std::cout << *(&z) << '\n'; // print the value at the memory address of variable z (parentheses not required, but make it easier to read)
return 0;
}
On the author's machine, the above program printed:
5
0045FDE5
5
In this program, we first declare a variable z and print its value. Then we print the address of variable z. Finally, we use the dereference operator to get the value at the memory address of variable z (which is just the value of z), which is printed to the console.
Pointers
A pointer is an object that holds a memory address (typically of another variable) as its value.
Much like reference types are declared using an ampersand (&) character, pointer types are declared using an asterisk (*).
int; // a normal int
int&; // an lvalue reference to an int value
int*; // a pointer to an int value (holds the address of an integer value)
To create a pointer variable, we simply define a variable with a pointer type:
int main()
{
int x { 5 }; // normal variable
int& ref { x }; // a reference to an integer (bound to x)
int* ptr; // a pointer to an integer
return 0;
}
Best Practice: When declaring a pointer type, place the asterisk next to the type name.
Although you generally should not declare multiple variables on a single line, if you do, the asterisk has to be included with each variable.
For example:
int* ptr1, ptr2; // incorrect: ptr1 is a pointer to an int, but ptr2 is just a plain int!
int* ptr3, * ptr4; // correct: ptr3 and ptr4 are both pointers to an int
Pointer Initialization
Like normal variables, pointers are not initialized by default. A pointer that has not been initialized is sometimes called a wild pointer. Wild pointers contain a garbage address, and dereferencing a wild pointer will result in undefined behavior. Because of this, you should always initialize your pointers to a known value.
int main()
{
int x{ 5 };
int* ptr; // an uninitialized pointer (holds a garbage address)
int* ptr2{}; // a null pointer (discussed later in this chapter)
int* ptr3{ &x }; // a pointer initialized with the address of variable x
return 0;
}
Since pointers hold addresses, when we initialize or assign a value to a pointer, that value has to be an address. Typically, pointers are used to hold the address of another variable (which we can get using the address-of operator (&)).
Once we have a pointer holding the address of another object, we can then use the dereference operator (*) to access the value at that address. For example:
#include <iostream>
int main()
{
int x{ 5 };
std::cout << x << '\n'; // print the value of variable x
int* ptr{ &x }; // ptr holds the address of x
std::cout << *ptr << '\n'; // use dereference operator to print the value at the address that ptr is holding (which is x's address)
return 0;
}
// This prints:
5
5
Pointers and Assignment
We can use assignment with pointers in two different ways:
- To change what the pointer is pointing at (by assigning the pointer a new address)
- To change the value being pointed at (by assigning the dereferenced pointer a new value)
First, let's look at a case where a pointer is changed to point at a different object:
#include <iostream>
int main()
{
int x{ 5 };
int* ptr{ &x }; // ptr initialized to point at x
std::cout << *ptr << '\n'; // print the value at the address being pointed to (x's address)
int y{ 6 };
ptr = &y; // change ptr to point at y
std::cout << *ptr << '\n'; // print the value at the address being pointed to (y's address)
return 0;
}
// Output
5
6
In the above example, we define pointer ptr, initialize it with the address of x, and dereference the pointer to print the value being pointed to (5). We then use the assignment operator to change the address that ptr is holding to the address of y. We then dereference the pointer again to print the value being pointed to (which is now 6).
Now let's look at how we can also use a pointer to change the value being pointed at:
#include <iostream>
int main()
{
int x{ 5 };
int* ptr{ &x }; // initialize ptr with address of variable x
std::cout << x << '\n'; // print x's value
std::cout << *ptr << '\n'; // print the value at the address that ptr is holding (x's address)
*ptr = 6; // the object at the address held by ptr (x) is assigned value 6 (note that ptr is dereferenced here)
std::cout << x << '\n';
std::cout << *ptr << '\n'; // print the value at the address that ptr is holding (x's address)
return 0;
}
// Output
5
5
6
6
In this example, we define pointer ptr, initialize it with the address of x, and then print the value of both x and *ptr (5). Because *ptr returns an lvalue, we can use it on the left-hand side of an assignment statement, which we do to change the value being pointed at by ptr to 6. We then print the value of both x and *ptr again to show that the value has been updated as expected.
When we use a pointer without a dereference (ptr), we are accessing the address held by the pointer. Modifying this (ptr = &y) changes what the pointer is pointing at.
When we dereference a pointer (*ptr), we are accessing the object being pointed at. Modifying this (*ptr = 6) changes the value of the object being pointed at.
Pointers behave much like lvalue references
Pointers and lvalue references behave similarly. Consider the following program:
#include <iostream>
int main()
{
int x{ 5 };
int& ref { x }; // get a reference to x
int* ptr { &x }; // get a pointer to x
std::cout << x;
std::cout << ref; // use the reference to print x's value (5)
std::cout << *ptr << '\n'; // use the pointer to print x's value (5)
ref = 6; // use the reference to change the value of x
std::cout << x;
std::cout << ref; // use the reference to print x's value (6)
std::cout << *ptr << '\n'; // use the pointer to print x's value (6)
*ptr = 7; // use the pointer to change the value of x
std::cout << x;
std::cout << ref; // use the reference to print x's value (7)
std::cout << *ptr << '\n'; // use the pointer to print x's value (7)
return 0;
}
// Output
555
666
777
In the above program, we create a normal variable x with value 5, and then create an lvalue reference and a pointer to x. Next, we use the lvalue reference to change the value from 5 to 6, and show that we can access that updated value via all three methods.
Finally, we use the dereferenced pointer to change the value from 6 to 7, and again show that we can access the updated value via all three methods.
Thus, pointers and references both provide a way to indirectly access another object. The primary difference is that with pointers, we need to explicitly get the address to point at, and we have to explicitly dereference the pointer to get the value. With references, the address-of and dereference happen implicitly.
There are some other differences between pointers and references worth mentioning:
- References must be initialized; pointers are not required to be initialized (but should be).
- References are not objects; pointers are.
- References cannot be reseated (changed to reference something else); pointers can change what they are pointing at.
- References must always be bound to an object; pointers can point to nothing.
- References are “safe” (outside of dangling references); pointers are inherently dangerous.
The address-of operator returns a pointer
It's worth noting that the address-of operator (&) doesn't return the address of its operand as a literal. Instead, it returns a pointer containing the address of the operand, whose type is derived from the argument (e.g., taking the address of an int will return the address in an int pointer).
We can see this in the following example:
#include <iostream>
#include <typeinfo>
int main()
{
int x{ 4 };
std::cout << typeid(&x).name() << '\n'; // print the type of &x
return 0;
}
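With gcc, this prints “Pi” (pointer to int). The exact text returned by name() is implementation-defined, so other compilers may print something different, such as “int *”.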
The size of pointers
The size of a pointer depends on the architecture the executable is compiled for. A 32-bit executable uses 32-bit memory addresses; consequently, a pointer in a 32-bit executable is 32 bits (4 bytes). In a 64-bit executable, a pointer is 64 bits (8 bytes). Note that this is true regardless of the size of the object being pointed to:
#include <iostream>
int main() // assume a 32-bit application
{
char* chPtr{}; // chars are 1 byte
int* iPtr{}; // ints are usually 4 bytes
long double* ldPtr{}; // long doubles are usually 8 or 12 bytes
std::cout << sizeof(chPtr) << '\n'; // prints 4
std::cout << sizeof(iPtr) << '\n'; // prints 4
std::cout << sizeof(ldPtr) << '\n'; // prints 4
return 0;
}
For a given executable, the size of a pointer is the same no matter what type it points to. This is because a pointer is just a memory address, and the number of bits needed to access a memory address is constant.
Dangling pointers
Much like a dangling reference, a dangling pointer is a pointer that is holding the address of an object that is no longer valid (e.g., because it has been destroyed).
Dereferencing a dangling pointer (e.g., in order to print the value being pointed at) will lead to undefined behavior, as you are trying to access an object that is no longer valid:
#include <iostream>
int main()
{
int x{ 5 };
int* ptr{ &x };
std::cout << *ptr << '\n'; // valid
{
int y{ 6 };
ptr = &y;
std::cout << *ptr << '\n'; // valid
} // y goes out of scope, and ptr is now dangling
std::cout << *ptr << '\n'; // undefined behavior from dereferencing a dangling pointer
return 0;
}
Null pointers
Besides a memory address, there is one additional value that a pointer can hold: a null value. A null value (often shortened to null) is a special value that means something has no value. When a pointer is holding a null value, it means the pointer is not pointing to anything. Such a pointer is called a null pointer.
The easiest way to create a null pointer is to use value initialization:
int main()
{
int* ptr {}; // ptr is now a null pointer, and is not holding an address
return 0;
}
Because we can use assignment to change what a pointer is pointing at, a pointer that is initially set to null can later be changed to point at a valid object:
#include <iostream>
int main()
{
int* ptr {}; // ptr is a null pointer, and is not holding an address
int x { 5 };
ptr = &x; // ptr now pointing at object x (no longer a null pointer)
std::cout << *ptr << '\n'; // print value of x through dereferenced ptr
return 0;
}
The nullptr keyword
Much like the keywords true and false represent Boolean literal values, the nullptr keyword represents a null pointer literal. We can use nullptr to explicitly initialize or assign a pointer a null value.
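void someFunction(int*) {} // placeholder function with a pointer parameter (assumed here so the example compiles)
int main()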
{
int* ptr { nullptr }; // can use nullptr to initialize a pointer to be a null pointer
int value { 5 };
int* ptr2 { &value }; // ptr2 is a valid pointer
ptr2 = nullptr; // Can assign nullptr to make the pointer a null pointer
someFunction(nullptr); // we can also pass nullptr to a function that has a pointer parameter
return 0;
}
In the above example, we use assignment to set the value of ptr2 to nullptr, making ptr2 a null pointer.
Dereferencing a null pointer results in undefined behavior
Much like dereferencing a dangling (or wild) pointer leads to undefined behavior, dereferencing a null pointer also leads to undefined behavior.
The following program illustrates this, and will probably crash or terminate your application abnormally when you run it.
#include <iostream>
int main()
{
int* ptr {}; // Create a null pointer
std::cout << *ptr << '\n'; // Dereference the null pointer
return 0;
}
Conceptually, this makes sense. Dereferencing a pointer means “go to the address the pointer is pointing at and access the value there”. A null pointer holds a null value, which semantically means the pointer is not pointing at anything.
Accidentally dereferencing null and dangling pointers is one of the most common mistakes C++ programmers make.
Checking for null pointers
Much like we can use a conditional to test Boolean values for true or false, we can use a conditional to test whether a pointer has value nullptr or not:
#include <iostream>
int main()
{
int x { 5 };
int* ptr { &x };
if (ptr == nullptr) // explicit test for equivalence
std::cout << "ptr is null\n";
else
std::cout << "ptr is non-null\n";
int* nullPtr {};
std::cout << "nullPtr is " << (nullPtr==nullptr ? "null\n" : "non-null\n"); // explicit test for equivalence
return 0;
}
// Output
ptr is non-null
nullPtr is null
Recall that integral values implicitly convert to Boolean values: an integral value of 0 converts to Boolean value false, and any other integral value converts to Boolean value true.
Similarly, pointers will also implicitly convert to Boolean values: a null pointer converts to Boolean value false, and a non-null pointer converts to Boolean value true. This allows us to skip explicitly testing for nullptr and just use the implicit conversion to Boolean to test whether a pointer is a null pointer. The following program is equivalent to the prior one:
#include <iostream>
int main()
{
int x { 5 };
int* ptr { &x };
// pointers convert to Boolean false if they are null, and Boolean true if they are non-null
if (ptr) // implicit conversion to Boolean
std::cout << "ptr is non-null\n";
else
std::cout << "ptr is null\n";
int* nullPtr {};
std::cout << "nullPtr is " << (nullPtr ? "non-null\n" : "null\n"); // implicit conversion to Boolean
return 0;
}
Legacy null pointer literals: 0 and NULL
In older code, you may see two other literal values used instead of nullptr.
The first is the literal 0. In the context of a pointer, the literal 0 is specially defined to mean a null value, and is the only time you can assign an integral literal to a pointer.
int main()
{
float* ptr { 0 }; // ptr is now a null pointer (for example only, don't do this)
float* ptr2; // ptr2 is uninitialized
ptr2 = 0; // ptr2 is now a null pointer (for example only, don't do this)
return 0;
}
Additionally, there is a preprocessor macro named NULL (defined in the <cstddef> header). This macro is inherited from C, where it is commonly used to indicate a null pointer.
#include <cstddef> // for NULL
int main()
{
double* ptr { NULL }; // ptr is a null pointer
double* ptr2; // ptr2 is uninitialized
ptr2 = NULL; // ptr2 is now a null pointer
return 0;
}
Both 0 and NULL should be avoided in modern C++ (use nullptr instead).
Favor references over pointers whenever possible
Pointers and references both give us the ability to access some other object indirectly.
Pointers have the additional abilities of being able to change what they are pointing at, and of holding a null value. However, these pointer abilities are also inherently dangerous: a null pointer runs the risk of being dereferenced, and the ability to change what a pointer is pointing at can make creating dangling pointers easier:
int main()
{
int* ptr { };
{
int x{ 5 };
ptr = &x; // set the pointer to an object that will be destroyed (not possible with a reference)
} // ptr is now dangling
return 0;
}
Since references can't be bound to null, we don't have to worry about null references. And because references must be bound to a valid object upon creation and then can not be reseated, dangling references are harder to create.
Pointers and const
Consider the following snippet:
int main()
{
int x { 5 };
int* ptr { &x }; // ptr is a normal (non-const) pointer
int y { 6 };
ptr = &y; // we can point at another value
*ptr = 7; // we can change the value at the address being held
return 0;
}
With normal (non-const) pointers, we can change both what the pointer is pointing at (by assigning the pointer a new address to hold) and the value at the address being held (by assigning a new value to the dereferenced pointer).
However, what happens if the value we want to point at is const?
int main()
{
const int x { 5 }; // x is now const
int* ptr { &x }; // compile error: cannot convert from const int* to int*
return 0;
}
The above snippet won't compile, because we can't set a normal pointer to point at a const variable. This makes sense: a const variable is one whose value cannot be changed. Allowing the programmer to set a non-const pointer to a const value would allow the programmer to dereference the pointer and change the value. That would violate the const-ness of the variable.
Pointer to const value
A pointer to a const value (sometimes called a pointer to const for short) is a (non-const) pointer that points to a constant value.
To declare a pointer to a const value, use the const keyword before the pointer's data type:
int main()
{
const int x{ 5 };
const int* ptr { &x }; // okay: ptr is pointing to a "const int"
*ptr = 6; // not allowed: we can't change a const value
return 0;
}
In the above example, ptr points to a const int. Because the data type being pointed to is const, the value being pointed to can't be changed.
However, because a pointer to const is not const itself (it just points to a const value), we can change what the pointer is pointing at by assigning the pointer a new address:
int main()
{
const int x{ 5 };
const int* ptr { &x }; // ptr points to const int x
const int y{ 6 };
ptr = &y; // okay: ptr now points at const int y
return 0;
}
Just like a reference to const, a pointer to const can point to a non-const variable too. A pointer to const treats the value being pointed to as constant, regardless of whether the object at that address was initially defined as const or not:
int main()
{
int x{ 5 }; // non-const
const int* ptr { &x }; // ptr points to a "const int"
*ptr = 6; // not allowed: ptr points to a "const int" so we can't change the value through ptr
x = 6; // allowed: the value is still non-const when accessed through non-const identifier x
return 0;
}
Const pointers
We can also make a pointer itself constant. A const pointer is a pointer whose address can not be changed after initialization.
To declare a const pointer, use the const keyword after the asterisk in the pointer declaration:
int main()
{
int x{ 5 };
int* const ptr { &x }; // const after the asterisk means this is a const pointer
return 0;
}
In the above case, ptr is a const pointer to a (non-const) int value.
Just like a normal const variable, a const pointer must be initialized upon definition and this value can't be changed via assignment:
int main()
{
int x{ 5 };
int y{ 6 };
int* const ptr { &x }; // okay: the const pointer is initialized to the address of x
ptr = &y; // error: once initialized, a const pointer can not be changed.
return 0;
}
However, because the value being pointed to is “non-const”, it is possible to change the value being pointed to via dereferencing the const pointer:
int main()
{
int x{ 5 };
int* const ptr { &x }; // ptr will always point to x
*ptr = 6; // okay: the value being pointed to is non-const
return 0;
}
Const pointer to a const value
Finally, it is possible to declare a const pointer to a const value by using the const keyword both before the type and after the asterisk:
int main()
{
int value { 5 };
const int* const ptr { &value }; // a const pointer to a const value
return 0;
}
A const pointer to a const value can not have its address changed, nor can the value it is pointing to be changed through the pointer. It can only be dereferenced to get the value it is pointing at.
Recap
To summarize, we only need to remember 4 rules:
- A non-const pointer can be assigned another address to change what it is pointing at.
- A const pointer always points to the same address, and this address can not be changed.
- A pointer to a non-const value can change the value it is pointing to. These can not point to a const value.
- A pointer to a const value treats the value as const when accessed through the pointer, and thus can not change the value it is pointing to. These can be pointed to const or non-const l-values.
Keeping the declaration syntax straight can be a bit challenging:
- A const before the asterisk is associated with the type being pointed to. Therefore, this is a pointer to a const value, and the value cannot be modified through the pointer.
- A const after the asterisk is associated with the pointer itself. Therefore, this pointer cannot be assigned a new address.
int main()
{
int v{ 5 };
int* ptr0 { &v }; // points to an "int" but is not const itself, so this is a normal pointer.
const int* ptr1 { &v }; // points to a "const int" but is not const itself, so this is a pointer to a const value.
int* const ptr2 { &v }; // points to an "int" and is const itself, so this is a const pointer (to a non-const value).
const int* const ptr3 { &v }; // points to a "const int" and is const itself, so this is a const pointer to a const value.
// if the const is on the left side of the *, the const belongs to the value
// if the const is on the right side of the *, the const belongs to the pointer
return 0;
}