Data Representation

The smallest piece of information that a CPU processes is a “bit”. A bit holds one of two values: 0 (off/false) or 1 (on/true).

Bits:

The smallest and most fundamental unit of data in computing, representing a single binary digit (0 or 1). Bits are the building blocks of all digital information.

Byte:

Consists of 8 bits. Bytes are the basic unit of storage in computer systems and are commonly used to represent characters, numbers, and other data types.

Nibble:

Half of a byte, comprising 4 bits. Nibbles are often used in hexadecimal notation, where one hexadecimal digit represents exactly one nibble.
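
As a small sketch of how one byte splits into two nibbles, the Python snippet below uses a shift and a mask. The function name `split_nibbles` and the sample value 0xA7 are hypothetical, chosen only for illustration.

```python
def split_nibbles(byte):
    """Return (high_nibble, low_nibble) for a value in the range 0..255."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a single byte (0..255)")
    high = (byte >> 4) & 0xF  # upper 4 bits
    low = byte & 0xF          # lower 4 bits
    return high, low


high, low = split_nibbles(0xA7)
print(f"{high:X} {low:X}")  # prints "A 7"
```

Each nibble prints as exactly one hexadecimal digit, which is why hexadecimal is a convenient way to write bytes.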

Data Sizes

  • 16-bit: Represents data that is 16 bits wide. Commonly used in older computer systems and microcontrollers.
  • 32-bit: Represents data that is 32 bits wide. Widely used in modern computing systems.
  • 64-bit: Represents data that is 64 bits wide.
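
To make these widths concrete, the following sketch computes the range of values an unsigned integer of each width can hold. It assumes plain unsigned interpretation (no sign bit); `unsigned_range` is a hypothetical helper name.

```python
def unsigned_range(bits):
    """Return (min, max) for an unsigned integer that is `bits` wide."""
    # n bits can encode 2**n distinct values, from 0 up to 2**n - 1.
    return 0, (1 << bits) - 1


for bits in (16, 32, 64):
    low, high = unsigned_range(bits)
    print(f"{bits}-bit: {low} to {high}")
```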

Notation

  • Binary: Base-2 numeral system, consisting of only two digits: 0 and 1. Each digit position represents a power of 2, with the rightmost digit corresponding to 2^0, the next digit to 2^1, and so on.
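  • Octal: Base-8 numeral system. Consists of digits ranging from 0 to 7. Each octal digit corresponds to 3 bits of binary data.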
  • Decimal: Base-10 numeral system, commonly used in everyday arithmetic. Consists of digits ranging from 0 to 9. Each digit position represents a power of 10.
  • Hexadecimal: Base-16 numeral system. Consists of digits ranging from 0 to 9 and letters A to F, representing values from 0 to 15. Each hexadecimal digit corresponds to a nibble (4 bits) of binary data.
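
The notations above are different ways of writing the same number. As a brief illustration, this Python sketch prints one hypothetical sample value (175) in each base using built-in functions, then parses it back from hexadecimal text.

```python
value = 0b10101111  # sample value written as a binary literal

print(bin(value))  # 0b10101111
print(oct(value))  # 0o257
print(value)       # 175
print(hex(value))  # 0xaf

# Parsing text in a given base returns the same integer.
parsed = int("AF", 16)
print(parsed == value)  # True

# Each hexadecimal digit maps to one 4-bit group (nibble).
bits = format(value, "08b")
print(bits[:4], bits[4:])  # 1010 1111  (A and F)
```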

Endianness

Endianness refers to the order in which the bytes of a multi-byte value are stored in memory or transmitted over a network. It dictates whether the most significant byte (MSB) or the least significant byte (LSB) comes first.

There are two primary types of endianness:

Big-Endian: In a big-endian system, the most significant byte (MSB) occupies the lowest memory address, and the remaining bytes follow in decreasing order of significance. In other words, the MSB comes first in memory. ARM processors can support both little-endian and big-endian modes, but the majority of ARM-based devices use little-endian byte ordering. Many network protocols, such as Ethernet and IP (Internet Protocol), use big-endian byte ordering for consistency across different systems.

For example, consider the 32-bit hexadecimal value 0x12345678.

In a big-endian system, the bytes would be stored in memory as follows:

Address  Content
-------  -------
0x00     12
0x01     34
0x02     56
0x03     78
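
The big-endian layout in the table can be reproduced with a short Python sketch. It uses `int.to_bytes` and, as a cross-check, `struct.pack` with the `!` (network byte order) format prefix; the variable names are illustrative only.

```python
import struct

value = 0x12345678

# Most significant byte first.
big = value.to_bytes(4, byteorder="big")

print("Address  Content")
for address, byte in enumerate(big):
    print(f"0x{address:02X}     {byte:02X}")

# "!" selects network (big-endian) byte order; "I" is a 4-byte unsigned int.
network = struct.pack("!I", value)
print(network == big)  # True
```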

Little-Endian: In contrast, little-endian systems store the least significant byte (LSB) at the lowest memory address, with subsequent bytes following in increasing order of significance. In other words, the LSB comes first in memory. Intel x86 processors use little-endian byte ordering.

Using the same example, 0x12345678, in a little-endian system the bytes would be stored as:

Address  Content
-------  -------
0x00     78
0x01     56
0x02     34
0x03     12
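
The little-endian layout can be checked the same way. This sketch also shows what happens when bytes written in one order are read back in the other, and prints the byte order of the machine running it via `sys.byteorder`; the variable names are illustrative only.

```python
import sys

value = 0x12345678

# Least significant byte first.
little = value.to_bytes(4, byteorder="little")
print(little.hex(" "))  # 78 56 34 12

# Reading with the matching byte order recovers the original value.
same = int.from_bytes(little, byteorder="little")
print(hex(same))  # 0x12345678

# Reading with the wrong byte order yields a different number.
swapped = int.from_bytes(little, byteorder="big")
print(hex(swapped))  # 0x78563412

# Byte order of the host: "little" or "big".
print(sys.byteorder)
```

A mismatch like the `swapped` case is the typical symptom when data crosses between systems or protocols that disagree on byte order.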