Primitive data structures

Overview

When it comes to programming, data structures are the foundation upon which complex algorithms and applications are built. In C+

What are Primitive Data Structures?

Primitive data structures, also known as basic data types, are the fundamental building blocks of any programming language. They are predefined by the language and used to represent simple values such as integers, floating-point numbers, characters, and boolean values. In C++, these data types provide a way to store and manipulate basic data effectively.

Type                     Keyword
Boolean                  bool
Character                char
Integer                  int
Floating Point           float
Double Floating Point    double
Valueless                void

Types of Primitive Data Structures in C++

1️⃣ Integer:

The keyword int represents the integer data type. The size of int in C++ can vary depending on the architecture and compiler being used. The C++ standard does not specify the exact size of int, but it mandates that int must be at least 16 bits in size.

Here are common size variations for the int data type on different architectures:

  1. 16-bit Systems = On some older or embedded systems, int may be 16 bits (2 bytes) in size meaning it can represent values from -32768 to 32767.
  2. 32-bit Systems = On many 32-bit architectures, including x86, int is typically 32 bits (4 bytes) in size, allowing it to represent values from -2147483648 to 2147483647.
  3. 64-bit Systems  = On most modern 64-bit architectures, such as x86_64 and ARM64, int remains 32 bits (4 bytes) in size, the same as on 32-bit systems. This is done for compatibility reasons, as changing the size of int could break existing code.

You can check the size on your system with the sizeof operator, as in the following code:

#include <iostream>

int main() {
    std::cout << "Size of int: " << sizeof(int) << " bytes" << std::endl;
    return 0;
}

//Output
Size of int: 4 bytes

In our case, an int takes up 4 bytes of memory, which is 4 * 8 = 32 bits, so its range is -2147483648 to 2147483647.

Some properties of the int data type are:

  • Being a signed data type, it represents whole numbers, both positive and negative.
  • It typically takes 32 bits, where the leftmost bit acts as the sign bit (0 means non-negative, 1 means negative).
  • The maximum value that can be stored in a 32-bit int is 2147483647, which is (2 raised to the power 31) - 1. Because the leftmost bit holds the sign, 31 bits remain for the magnitude, and the largest value that fits in (N-1) bits is 2^(N-1) - 1, so 2^31 - 1 = 2147483647. The subtraction of 1 accounts for the fact that counting starts from 0.
  • The maximum value is available as the constant INT_MAX in the <climits> header file.
  • The minimum value that can be stored in a 32-bit int is -2147483648, which is -(2 raised to the power 31). In general, the minimum value of an N-bit signed integer is -2^(N-1), so -2^31 = -2147483648. It is available as the constant INT_MIN in <climits>.

Overflow and underflow = These issues arise when the result of an operation falls outside the range of the data type. Here's what happens in each case:

Overflow = Overflow occurs when the result of an operation is larger than the maximum value that can be represented by the data type. For signed types such as int, the C++ standard treats this as undefined behavior, so the result is not guaranteed. In practice, on many implementations the value wraps around to the minimum value that can be represented by the data type, so 2147483647 + 1 shows up as -2147483648.

For example:

#include <iostream>
using namespace std;
int main(){
    int data = 2147483647;
    cout << "Initial Data = " << data << endl;
    data++; // Overflow occurs here (undefined behavior for signed int; typically wraps).

    cout << "After Increment Data = " << data << endl;
}

Output: 
Initial Data = 2147483647
After Increment Data = -2147483648

Underflow = It occurs when the result of an operation is smaller than the minimum value an int can represent, for example when you subtract from the minimum value. In the case of a 32-bit int, the minimum value is -2147483648.

If you subtract 1 from this minimum value, the result would mathematically be -2147483649, which is outside the representable range. As with overflow, this is undefined behavior for signed types, but in practice the value often wraps around to the maximum representable value, so the result shows up as 2147483647:

#include <iostream>
using namespace std;
int main(){
    int data = -2147483648;
    cout << "Initial Data = " << data << endl;
    data--; // Underflow occurs here (undefined behavior for signed int; typically wraps).

    cout << "After Decrement Data = " << data << endl;
}

Output:
Initial Data = -2147483648
After Decrement Data = 2147483647

There are four type modifiers in C++:

  • short
  • long
  • signed
  • unsigned

short:

  • The short modifier is used to create integer variables with a smaller range of values compared to regular int variables.
  • It typically takes up less memory compared to int.
  • The exact size of a short variable is platform-dependent but is often 16 bits (2 bytes).
  • The range of values a 16-bit short can represent is -32768 to 32767.
// small integer
short a = 123;
// Here a is a short integer variable.

Note: short is equivalent to short int.

#include <iostream>
using namespace std;
int main(){
    // short
    short shorty = 1234;
    cout<<"Size = "<<sizeof(shorty)<< " Bytes"<<endl;
}

// Output
Size = 2 Bytes

long:

  • The long modifier is used to create integer variables with a range at least as large as that of regular int variables.
  • It takes up at least as much memory as int, and often more.
  • The exact size of a long variable is platform-dependent. It is typically 4 bytes on 32-bit systems and on 64-bit Windows, and 8 bytes on 64-bit Linux and macOS.
long bigNumber = 1000000;

Note: long is equivalent to long int.

 

#include <iostream>
using namespace std;
int main(){
    // long
    long longie = 123456789;
    cout<<"Size = "<<sizeof(longie)<< " Bytes"<<endl;
}

// Output
Size = 8 Bytes

signed:

  • The signed modifier is used to explicitly specify that a variable can represent both positive and negative values.
  • In C++, most integer data types, including int, short and long are signed by default, meaning they can represent both positive and negative numbers.
  • You can use signed to indicate signedness explicitly, although it's usually unnecessary.
signed int temperature = -10; // 'signed' is often implicit

Note:

By default, integers are signed. Hence, instead of signed int, we can simply use int.

signed and unsigned can only be used with integer and character types (int, short, long, and char).

unsigned:

  • The unsigned modifier is used to create variables that can only represent non-negative values (zero and positive numbers).
  • When you declare a variable as unsigned, it effectively doubles the positive range but eliminates the ability to represent negative values.
  • unsigned is often used when you want to ensure that a variable stores only non-negative quantities, such as indices, sizes, or flags.

Example:

unsigned int count = 42; // Only non-negative values are allowed

Below is a summary of the sizes and ranges of the integer types on a typical 64-bit Linux or macOS system (on 64-bit Windows, long int is 4 bytes).

Data-Type             Size (in Bytes)   Range
int or signed int     4 Bytes           -2147483648 to 2147483647
unsigned int          4 Bytes           0 to 4294967295
short int             2 Bytes           -32768 to 32767
long int              8 Bytes           -9223372036854775808 to 9223372036854775807
unsigned short int    2 Bytes           0 to 65535
unsigned long int     8 Bytes           0 to 18446744073709551615

 

2️⃣ Floating Point:

The floating-point data type is used for storing single-precision floating-point values, that is, numbers with a fractional part. The keyword used for it is float. A float variable typically has a size of 4 bytes.

  • Range: The range of a float variable is approximately from -3.4e38 to 3.4e38. These limits are available as -FLT_MAX and FLT_MAX (defined in <cfloat>), which denote the minimum and maximum finite values, respectively.
  • Precision: float provides about 6 to 7 decimal digits of precision. This means that it can accurately represent numbers with up to about 7 significant digits. Beyond that, rounding errors may occur.

It's important to note that while float can represent a wide range of values, it sacrifices precision for that range. Therefore, it may not be suitable for applications that require high precision, such as financial calculations or scientific simulations. In such cases, the double or long double data types, which offer greater precision at the cost of increased memory usage, are often preferred.

#include <iostream>
#include <limits>

int main() {
    float myFloat = 3.14159f; // Note the 'f' suffix to indicate a float literal

    std::cout << "Minimum float value: " << -std::numeric_limits<float>::max() << std::endl;
    std::cout << "Maximum float value: " << std::numeric_limits<float>::max() << std::endl;
    std::cout << "Value of myFloat: " << myFloat << std::endl;

    return 0;
}

// Output
Minimum float value: -3.40282e+38
Maximum float value: 3.40282e+38
Value of myFloat: 3.14159

Note: There is no such thing as an unsigned float.

The concept of unsigned is typically associated with integer data types like unsigned int or unsigned char, and it signifies that the variable can only store non-negative values (zero or positive integers) and does not have a sign bit for negative numbers.

3️⃣ Double Floating Point:

double is the keyword used to hold floating-point numbers (decimals and exponentials) with double precision. The double variable has a size of 8 bytes. It offers higher precision compared to float data type.

  • Precision = double provides approximately 15-16 decimal digits of precision. This means it can accurately represent real numbers with up to 15-16 significant digits.
  • Range = The range of a double variable is vast, typically from approximately -1.7e308 to 1.7e308 (negative and positive). These values are often represented as -DBL_MAX and DBL_MAX to denote the minimum and maximum finite values, respectively.
  • Size = A double variable typically occupies 8 bytes (64 bits) of memory in most C++ implementations. The larger size allows it to store more significant digits and a wider range compared to float.
  • Usage = It is commonly used in applications where high precision is required, such as scientific calculations, engineering simulations, financial modeling, and any scenario where the accuracy of real numbers is crucial.
  • Declaration = To declare a double variable in C++, use the double keyword.

Ex:

#include <iostream>
#include <limits>

int main() {
    double myDouble = 1234.56789;

    std::cout << "Minimum double value: " << -std::numeric_limits<double>::max() << std::endl;
    std::cout << "Maximum double value: " << std::numeric_limits<double>::max() << std::endl;
    std::cout << "Value of myDouble: " << myDouble << std::endl;

    return 0;
}
//Output
Minimum double value: -1.79769e+308
Maximum double value: 1.79769e+308
Value of myDouble: 1234.57

long double:

  • The long double data type is used to represent extended precision floating-point numbers, providing even higher precision compared to double.
  • It typically occupies more than 8 bytes of memory (often 16 bytes = 128 bits, for example with GCC on x86-64 Linux), but this varies with the compiler and platform. With Microsoft Visual C++, long double has the same size as double (8 bytes).
  • Where it is wider than double, long double provides more precision, often 18-19 or more decimal digits.
  • The range of a long double variable is also quite large, though the exact range and representation can vary across different systems.
  • long double is used in applications where extremely high precision is required.

4️⃣ Character Types:

character:

The keyword char represents characters. It is 1 byte in size. Single quotes are used to enclose characters in C++. It is often used to store a single letter, digit, or other character from the character set.

  • Size = The char data type occupies 1 byte of memory, which is 8 bits on virtually all modern platforms. It is designed to store a single character from the basic character set, which includes letters, digits, punctuation, and control characters.
  • Character Set = char is designed to hold characters from the ASCII (American Standard Code for Information Interchange) character set by default. ASCII characters include the English alphabet (both uppercase and lowercase), digits (0-9), punctuation marks, and various control characters like newline and tab.

For example = 

char myChar = 'A'; // Assigns the character 'A' to myChar
  • Character Literals = Characters in C++ are enclosed in single quotes, as shown in the example above. You can also use escape sequences to represent special characters, such as '\n' for a newline or '\t' for a tab.
  • Unsigned or Signed = The char data type can be either signed or unsigned, depending on the compiler and system. It is signed on many systems (for example x86), which means it can represent both positive and negative values, but unsigned on others (for example ARM Linux). If you need a specific signedness, use signed char or unsigned char explicitly.
#include <iostream>
#include <limits>

int main() {
    char charie = 'M';
    std::cout<<"size of char = "<< sizeof(charie) <<" byte"<< std::endl;
    return 0;
}
//Output
size of char = 1 byte

The main character types are:

char:

  • char is an 8-bit (1 byte) character data type.
  • It can store a single character from the ASCII character set.
  • Whether plain char is signed or unsigned depends on the implementation; it is signed on many systems.

unsigned char:

  • unsigned char is also a one-byte (8-bit) integer type, but it is always treated as unsigned, meaning it can only hold non-negative values (0 and positive integers).
  • Unlike a signed char, whose range includes negative and positive values (typically -128 to 127), an unsigned char has a range of 0 to 255.

wchar_t:

  • This is a wide character type used to represent characters from a larger character set, often Unicode.
  • It is commonly used when working with multilingual or internationalized text.
  • Its size is platform-dependent: typically 4 bytes on Linux and macOS, and 2 bytes on Windows.
#include <iostream>
#include <limits>

int main() {
    char charie = 'M';
    wchar_t myWideChar = L'Ω';
    std::cout<<"size of char = "<< sizeof(charie) <<" byte"<< std::endl;
    std::cout<<"size of wchar = "<< sizeof(myWideChar) <<" byte"<< std::endl;
    std::cout<<"myWideChar = "<< static_cast<int>(myWideChar)<<std::endl; // numeric code; a wchar_t cannot be streamed to std::cout directly since C++20
    std::cout<<"charie = " << charie<<std::endl;
    return 0;
}
// Output
size of char = 1 byte
size of wchar = 4 byte
myWideChar = 937
charie = M

5️⃣ Boolean:

Boolean data type is represented by bool keyword. bool is a built-in data type in C++. It is one of the fundamental data types provided by the language.

A bool typically occupies one byte (8 bits) in memory, although it only needs one bit to represent true or false. Its exact size is implementation-defined, but it cannot be smaller than one byte, because a byte is the smallest addressable unit of memory in C++.

Example = 

#include <iostream>
#include <limits>

int main() {
    bool boolie = true;
    std::cout<<"value of boolie = "<< boolie<<std::endl;
    std::cout<<"size of bool = "<<sizeof(bool)<<" byte"<<std::endl;

    std::cout<<"false is = "<<false<<std::endl;
    std::cout<<"true is = "<<true<<std::endl;
    
    return 0;
}
// Output
value of boolie = 1
size of bool = 1 byte
false is = 0
true is = 1

Important Points:

  • The numeric value of true is 1 and of false is 0.
  • We can also use bool variables or the values true and false in mathematical expressions. For instance,
int x = false + true + 6;
// x = 0 + 1 + 6 = 7
  • It is also possible to implicitly convert integer or floating-point values to bool. For instance,
bool x = 0; // false
bool y = 7; // true
bool z = 1.23; // true

6️⃣ Void or Valueless:

The term void means "empty" or "having no value". The void data type represents a valueless entity. Variables of type void cannot be declared. It is mainly used as the return type of functions that do not return any data.