Introduction
References in C++ are powerful and versatile constructs that play a crucial role in manipulating data efficiently. They provide a way to create aliases for variables, enabling us to work with the same data under different names. In this chapter, we will delve into the concept of references in C++, exploring their syntax, usage and benefits.
Introduction to References
A reference in C++ is essentially an alias, or alternative name, for an existing variable. Unlike pointers, references cannot be null and must be initialized upon declaration. They are declared using the & symbol and refer to an already existing variable. For example:
int main() {
    int originalVariable = 42;
    int& referenceVariable = originalVariable;
    // Now, referenceVariable is an alias for originalVariable
    return 0;
}
In this example, referenceVariable is a reference to originalVariable, meaning any changes made to one will directly affect the other.
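One way to see the aliasing in action is to write through each name in turn and check the other. The following is a minimal sketch; aliasDemo() is a hypothetical helper introduced only for illustration:

```cpp
// Hypothetical helper (not part of the original example): returns true if
// writes through either name are visible through the other.
bool aliasDemo()
{
    int originalVariable = 42;
    int& referenceVariable = originalVariable;

    referenceVariable = 7;   // write through the reference
    bool seenViaOriginal = (originalVariable == 7);

    originalVariable = 99;   // write through the original name
    bool seenViaReference = (referenceVariable == 99);

    // Both names designate the same object, so their addresses match.
    bool sameObject = (&referenceVariable == &originalVariable);

    return seenViaOriginal && seenViaReference && sameObject;
}
```

Calling aliasDemo() from main() should return true, since both names refer to a single int.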
Syntax and Declaration:
To declare a reference, the syntax is as follows:
dataType& referenceName = existingVariable;
Here, dataType is the type of the referenced variable, and referenceName is the name of the reference.
Initialization Of References
Much like constants, all references must be initialized.
int main()
{
    int& invalidRef; // error: references must be initialized
    int x { 5 };
    int& ref { x }; // okay: reference to int is bound to int variable
    return 0;
}
When a reference is initialized with an object (or function), we say it is bound to that object (or function). The process by which such a reference is bound is called reference binding. The object (or function) being referenced is sometimes called the referent.
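Binding to a function works the same way as binding to an object. The following is a minimal sketch; square and callThroughReference are hypothetical names used only for illustration:

```cpp
// Hypothetical function used as the referent.
int square(int n)
{
    return n * n;
}

// Binds a reference to the function square and calls it through the alias.
int callThroughReference()
{
    int (&fn)(int) = square;   // fn is bound to the function square
    return fn(4);              // same as calling square(4)
}
```

Here fn is simply another name for square, so calling fn(4) gives the same result as square(4).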
Lvalue references to non-const must be bound to a modifiable lvalue.
int main()
{
    int x { 5 };
    int& ref { x }; // valid: lvalue reference bound to a modifiable lvalue
    const int y { 5 };
    int& invalidRef { y }; // invalid: can't bind to a non-modifiable lvalue
    int& invalidRef2 { 0 }; // invalid: can't bind to an rvalue
    return 0;
}
Lvalue references to non-const can't be bound to non-modifiable lvalues or rvalues. Otherwise, you would be able to change those values through the reference, which would violate their const-ness.
The type of the reference must match the type of the referent (there are some exceptions to this rule):
int main()
{
    int x { 5 };
    int& ref { x }; // okay: reference to int is bound to int variable
    double y { 6.0 };
    int& invalidRef { y }; // invalid: reference to int cannot bind to double variable
    double& invalidRef2 { x }; // invalid: reference to double cannot bind to int variable
    return 0;
}
References can't be reseated (changed to refer to another object)
Once initialized, a reference in C++ cannot be reseated, meaning it cannot be changed to reference another object.
New C++ programmers often try to reseat a reference by using assignment to provide the reference with another variable to reference. This will compile and run – but not function as expected. Consider the following program:
#include <iostream>
int main()
{
    int x { 5 };
    int y { 6 };
    int& ref { x }; // ref is now an alias for x
    ref = y; // assigns 6 (the value of y) to x (the object being referenced by ref)
    // The above line does NOT change ref into a reference to variable y!
    std::cout << x << '\n'; // user is expecting this to print 5
    return 0;
}
Perhaps surprisingly, this prints:
6
When a reference is evaluated in an expression, it resolves to the object it's referencing. So ref = y doesn't change ref to now reference y. Rather, because ref is an alias for x, the expression evaluates as if it were written x = y. Since y evaluates to the value 6, x is assigned the value 6.
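To confirm that the assignment did not rebind the reference, we can compare addresses afterwards. This is a minimal sketch; stillBoundToX() is a hypothetical helper, not part of the program above:

```cpp
// Hypothetical helper: returns true if ref still refers to x after "ref = y".
bool stillBoundToX()
{
    int x { 5 };
    int y { 6 };
    int& ref { x };

    ref = y;    // copies the value of y into x; ref is not rebound

    y = 100;    // changing y afterwards has no effect on ref or x

    return (&ref == &x) && (x == 6) && (ref == 6);
}
```

If ref had been reseated to y, it would now read 100 and its address would match that of y. Instead it still aliases x.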
References and referents have independent lifetimes
- A reference can be destroyed before the object it is referencing.
- The object being referenced can be destroyed before the reference.
When a reference is destroyed before the referent, the referent is not impacted. The following program demonstrates this:
#include <iostream>
int main()
{
    int x { 5 };
    {
        int& ref { x }; // ref is a reference to x
        std::cout << ref << '\n'; // prints value of ref (5)
    } // ref is destroyed here -- x is unaware of this
    std::cout << x << '\n'; // prints value of x (5)
    return 0;
} // x destroyed here
This prints:
5
5
When ref dies, variable x carries on as normal, blissfully unaware that a reference to it has been destroyed.
Dangling references
When an object being referenced is destroyed before a reference to it, the reference is left referencing an object that no longer exists. Such a reference is called a dangling reference. Accessing a dangling reference leads to undefined behavior.
int& getReference() {
    int x = 42;
    return x; // returns a reference to a local variable
} // x is destroyed here, so the returned reference dangles
// Using the returned reference would lead to undefined behavior
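Two common ways to avoid this problem are to return by value, or to return a reference only to an object that outlives the call. The sketch below shows both under those assumptions; getValue() and larger() are hypothetical names chosen for illustration:

```cpp
// Safe: returns a copy, so nothing refers to the destroyed local.
int getValue()
{
    int x = 42;
    return x;
}

// Safe: the returned reference refers to one of the caller's objects,
// which outlive the function call.
int& larger(int& a, int& b)
{
    return (a > b) ? a : b;
}
```

Because larger() returns a reference to one of its arguments, a caller can write larger(a, b) = 0; to reset whichever variable currently holds the greater value.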
Lvalue reference to const
By using the const keyword when declaring an lvalue reference, we tell the reference to treat the object it is referencing as const. Such a reference is called an lvalue reference to a const value (sometimes called a reference to const or a const reference).
References to const can bind to non-modifiable lvalues.
int main()
{
    const int x { 5 }; // x is a non-modifiable lvalue
    const int& ref { x }; // okay: ref is an lvalue reference to a const value
    return 0;
}
Because references to const treat the object they are referencing as const, they can be used to access but not modify the value being referenced:
#include <iostream>
int main()
{
    const int x { 5 }; // x is a non-modifiable lvalue
    const int& ref { x }; // okay: ref is an lvalue reference to a const value
    std::cout << ref << '\n'; // okay: we can access the const object
    ref = 6; // error: we cannot modify an object through a const reference
    return 0;
}
Initializing a reference to const with a modifiable lvalue
A reference to const can also bind to modifiable lvalues. In such a case, the object being referenced is treated as const when accessed through the reference (even though the underlying object is non-const):
#include <iostream>
int main()
{
    int x { 5 }; // x is a modifiable lvalue
    const int& ref { x }; // okay: we can bind a const reference to a modifiable lvalue
    std::cout << ref << '\n'; // okay: we can access the object through our const reference
    ref = 7; // error: we cannot modify an object through a const reference
    x = 6; // okay: x is a modifiable lvalue, we can still modify it through the original identifier
    return 0;
}
In the above program, we bind const reference ref to modifiable lvalue x. We can then use ref to access x, but because ref is const, we cannot modify the value of x through ref. However, we can still modify the value of x directly (using the identifier x).
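A const reference is a read-only view, not a snapshot: changes made through the original identifier remain visible through it. The following is a minimal sketch, with observeThroughConstRef() as a hypothetical helper:

```cpp
// Hypothetical helper: returns the value seen through a const reference
// after the underlying object is changed via its own name.
int observeThroughConstRef()
{
    int x { 5 };
    const int& ref { x };   // read-only view of x

    x = 6;                  // modify x directly

    return ref;             // ref sees the new value: 6
}
```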
Key Characteristics of References
- Initialization: References must be initialized when declared, and once initialized, they cannot be reassigned to refer to another variable. This makes them safer and more straightforward than pointers.
- No Null References: Unlike pointers, references cannot be null. They must always refer to a valid object or variable.
- Syntax: The syntax for references uses the & symbol, but it is essential to distinguish between the declaration of a reference and the address-of operator used with pointers.
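The two meanings of & mentioned in the last point can be seen side by side in a short sketch. ampersandDemo() is a hypothetical helper used only for illustration:

```cpp
// Hypothetical helper illustrating the two meanings of '&'.
bool ampersandDemo()
{
    int x { 5 };

    int& ref { x };   // '&' after a type: declares ref as a reference to x
    int* ptr { &x };  // '&' before an expression: takes the address of x

    *ptr = 8;         // write through the pointer (explicit dereference)
    ref += 1;         // write through the reference (no dereference needed)

    // &ref yields the address of x, because ref is an alias for x.
    return (x == 9) && (&ref == ptr);
}
```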
Passing by Reference
One of the most common use cases for references is in function parameters. Passing by reference allows a function to modify the original data directly, avoiding the overhead of copying large objects.
void modifyValue(int& value) {
    value *= 2;
}
int main() {
    int number = 5;
    modifyValue(number);
    // 'number' is now 10
    return 0;
}
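A classic use of this capability is a function that exchanges two of the caller's variables, which cannot be done with value parameters. This is a minimal sketch; swapValues() is a hypothetical name (the standard library already provides std::swap for real code):

```cpp
// Hypothetical helper: exchanges the values of the caller's two variables.
void swapValues(int& a, int& b)
{
    int temp { a };   // keep the old value of a
    a = b;            // overwrite the caller's first variable
    b = temp;         // overwrite the caller's second variable
}
```

After int p { 1 }; int q { 2 }; swapValues(p, q); the variable p holds 2 and q holds 1.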
When an argument is passed by value, it is copied into the function's parameter:
#include <iostream>
void printValue(int y)
{
    std::cout << y << '\n';
} // y is destroyed here
int main()
{
    int x { 2 };
    printValue(x); // x is passed by value (copied) into parameter y (inexpensive)
    return 0;
}
In the above program, when printValue(x) is called, the value of x (2) is copied into parameter y. Then, at the end of the function, object y is destroyed.
This means that when we called the function, we made a copy of our argument's value, only to use it briefly and then destroy it! Fortunately, because fundamental types are cheap to copy, there isn't a problem.
Some objects are expensive to copy
Most of the types provided by the standard library (such as std::string) are class types. Class types are usually expensive to copy. Whenever possible, we want to avoid making unnecessary copies of objects that are expensive to copy, especially when we will destroy those copies almost immediately.
Consider the following program illustrating this point:
#include <iostream>
#include <string>
void printValue(std::string y)
{
    std::cout << y << '\n';
} // y is destroyed here
int main()
{
    std::string x { "Hello, world!" }; // x is a std::string
    printValue(x); // x is passed by value (copied) into parameter y (expensive)
    return 0;
}
This prints:
Hello, world!
While this program behaves as we expect, it's also inefficient. As in the prior example, when printValue() is called, argument x is copied into printValue() parameter y. However, in this example, the argument is a std::string instead of an int, and std::string is a class type that is expensive to copy. And this expensive copy is made every time printValue() is called.
Pass by reference
One way to avoid making an expensive copy of an argument when calling a function is to use pass by reference instead of pass by value. When using pass by reference, we declare a function parameter as a reference type (or const reference type) rather than as a normal type. When this function is called, each reference parameter is bound to the appropriate argument. Because the reference acts as an alias for the argument, no copy of the argument is made.
Here's the same example as above, using pass by reference instead of pass by value:
#include <iostream>
#include <string>
void printValue(std::string& y) // type changed to std::string&
{
    std::cout << y << '\n';
} // y (the reference) is destroyed here; x is unaffected
int main()
{
    std::string x { "Hello, world!" };
    printValue(x); // x is now passed by reference into reference parameter y (inexpensive)
    return 0;
}
The program is identical to the prior one, except the type of parameter y has been changed from std::string to std::string&. Now, when printValue(x) is called, reference parameter y is bound to argument x. Binding a reference is always inexpensive, and no copy of x needs to be made. Because a reference acts as an alias for the object being referenced, when printValue() uses reference y, it's accessing the actual argument x (rather than a copy of x).
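The difference in copying can be made visible by counting copy-constructor calls. The sketch below is illustrative only: Tracked, copyCount, sizeByValue() and sizeByReference() are all hypothetical names, and a global counter is used purely to keep the example short:

```cpp
#include <cstddef>
#include <string>

int copyCount { 0 };  // hypothetical global counter, for illustration only

// Hypothetical class type that records each time it is copied.
struct Tracked
{
    std::string text;

    explicit Tracked(const std::string& t) : text(t) {}
    Tracked(const Tracked& other) : text(other.text) { ++copyCount; }
};

// Pass by value: the argument is copied into t.
std::size_t sizeByValue(Tracked t)
{
    return t.text.size();
}

// Pass by reference: t is bound to the argument, so no copy is made.
std::size_t sizeByReference(Tracked& t)
{
    return t.text.size();
}
```

Calling sizeByValue() on a Tracked object increments copyCount by one per call, while calling sizeByReference() leaves it unchanged.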
Pass by reference allows us to change the value of an argument
When an object is passed by value, the function parameter receives a copy of the argument. This means that any changes to the value of the parameter are made to the copy of the argument, not the argument itself:
#include <iostream>
void addOne(int y) // y is a copy of x
{
    ++y; // this modifies the copy of x, not the actual object x
}
int main()
{
    int x { 5 };
    std::cout << "value = " << x << '\n';
    addOne(x);
    std::cout << "value = " << x << '\n'; // x has not been modified
    return 0;
}
In the above program, because value parameter y is a copy of x, when we increment y, this only affects y. This program outputs:
value = 5
value = 5
However, since a reference acts identically to the object being referenced, when using pass by reference, any changes made to the reference parameter will affect the argument:
#include <iostream>
void addOne(int& y) // y is bound to the actual object x
{
    ++y; // this modifies the actual object x
}
int main()
{
    int x { 5 };
    std::cout << "value = " << x << '\n';
    addOne(x);
    std::cout << "value = " << x << '\n'; // x has been modified
    return 0;
}
This program outputs:
value = 5
value = 6
In the above example, x initially has value 5. When addOne(x) is called, reference parameter y is bound to argument x. When the addOne() function increments reference y, it's actually incrementing argument x from 5 to 6 (not a copy of x). This changed value persists even after addOne() has finished executing.
Pass by reference to non-const can only accept modifiable lvalue arguments
Because a reference to a non-const value can only bind to a modifiable lvalue, pass by reference to non-const only works with arguments that are modifiable lvalues.
#include <iostream>
void printValue(int& y) // y only accepts modifiable lvalues
{
    std::cout << y << '\n';
}
int main()
{
    int x { 5 };
    printValue(x); // ok: x is a modifiable lvalue
    const int z { 5 };
    printValue(z); // error: z is a non-modifiable lvalue
    printValue(5); // error: 5 is an rvalue
    return 0;
}
Unlike a reference to non-const (which can only bind to modifiable lvalues), a reference to const can bind to modifiable lvalues, non-modifiable lvalues, and rvalues. Therefore, if we make a reference parameter const, it will be able to bind to any of these kinds of argument:
#include <iostream>
void printValue(const int& y) // y is now a const reference
{
    std::cout << y << '\n';
}
int main()
{
    int x { 5 };
    printValue(x); // ok: x is a modifiable lvalue
    const int z { 5 };
    printValue(z); // ok: z is a non-modifiable lvalue
    printValue(5); // ok: 5 is a literal rvalue
    return 0;
}
Passing by const reference offers the same primary benefit as pass by reference (avoiding making a copy of the argument), while also guaranteeing that the function cannot change the value being referenced.
For example, the following is disallowed, because ref is const:
void addOne(const int& ref)
{
    ++ref; // not allowed: ref is const
}
In most cases, we don't want our functions modifying the value of arguments.
Favor passing by const reference over passing by non-const reference unless you have a specific reason to do otherwise (e.g., the function needs to change the value of an argument).
Now we can understand the motivation for allowing references to const to bind to rvalues: without that capability, there would be no way to pass literals (or other rvalues) to functions that use pass by reference.
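The same applies to class types such as std::string. In the sketch below, countChars() is a hypothetical helper whose const reference parameter accepts a modifiable lvalue, a const lvalue, and temporaries alike:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: reads the argument without copying or modifying it.
std::size_t countChars(const std::string& s)
{
    return s.size();
}
```

This function can be called as countChars(name) for a std::string variable name, but also as countChars("Hello"). In the second case a temporary std::string is created from the literal, and s binds to that temporary for the duration of the call.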
References vs Pointers
A comparison of references and pointers is left for a separate chapter.