
In the last chapter we enabled the A20 gate. Before entering protected mode, one last thing we need to set up is the GDT (Global Descriptor Table). In this chapter we will look at how the GDT works and then set it up.

But before that, let's dig into some background:

Segmentation in protected mode differs significantly from that in real mode. In real mode, segments can overlap, allowing the same memory address to be accessed through different segment-offset combinations. This creates flexibility, but it lacks protection, meaning segments do not restrict access to specific memory areas.

In protected mode, however, segmentation provides memory protection: each segment has defined boundaries and access permissions. Code cannot access memory outside the segments it is permitted to use, and any attempt to do so raises a CPU exception, typically a general protection fault (which operating systems often report to programs as a segmentation fault). This mechanism isolates different areas of memory, such as user applications and system memory, which improves security and stability in multitasking environments.

The GDT helps us achieve that protection between segments.

Recap:

In x86 architecture, segmentation is a memory management technique used to divide the memory space into different segments. These segments have a base address and a limit (or size). This approach helps in organizing memory logically, especially when dealing with the limitations of earlier processors like the Intel 8086, which could only directly address 1 MB of memory.

In real mode, the CPU can access up to 1 MB of memory, divided into segments. The memory address is calculated by combining a segment and an offset. The segment is a 16-bit value stored in one of the segment registers (CS, DS, SS, ES), and the offset is a 16-bit value representing the position within the segment.

Address Calculation:

In real mode, the CPU computes the physical address by combining the value in the segment register and an offset. The formula for calculating the physical address is:

Physical Address = (Segment * 16) + Offset

Here, the segment value is shifted left by 4 bits (multiplied by 16) and added to the offset. This method creates an overlapping memory model, meaning different segment-offset pairs can point to the same physical address. Each segment is 64 KB long, starting at a multiple of 16 bytes. The offset is a 16-bit value (0-65535) added to the segment's base address to calculate the exact memory location.

Example of Address Calculation:

Consider the following segment and offset values:

  • CS (Code Segment) = 0x1000
  • IP (Instruction Pointer) = 0x0020

To calculate the physical address:

Physical Address = (0x1000 * 16) + 0x0020
                 = 0x10000 + 0x0020
                 = 0x10020

So, the CPU will fetch the instruction at physical memory address 0x10020.
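
The same calculation can be checked on a development machine with a short Python sketch. The function name `physical_address` is an illustrative assumption and is not part of TheTaaJ. The 20-bit mask is also a modeling assumption: it mimics an 8086, or a later CPU with the A20 line disabled, where addresses wrap at 1 MB.

```python
def physical_address(segment: int, offset: int) -> int:
    """Real-mode translation: (segment * 16) + offset.

    Both inputs are 16-bit values. The result is masked to 20 bits,
    which models an 8086 (or a later CPU with the A20 line disabled).
    """
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    return ((segment << 4) + offset) & 0xFFFFF


# CS = 0x1000, IP = 0x0020, as in the example above
print(hex(physical_address(0x1000, 0x0020)))  # prints 0x10020
```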

Overlapping Segments

Since segments are defined in 16-byte increments, multiple segment-offset pairs can point to the same physical address. For instance:

  • Segment:Offset = 0x1000:0x0020 translates to physical address 0x10020.
  • Segment:Offset = 0x0FFF:0x0030 also translates to physical address 0x10020 (0xFFF0 + 0x0030).

This overlapping allows flexibility but also introduces complexity when managing memory, as different segment values may refer to the same memory location.
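
To get a feel for how much overlap there is, the following Python sketch enumerates every segment:offset pair that reaches one physical address. The helper name `aliases` is hypothetical, and the sketch ignores wrap-around at the 1 MB boundary.

```python
def aliases(physical: int):
    """Yield every 16-bit segment:offset pair that maps to `physical`."""
    for segment in range(0x10000):
        offset = physical - (segment << 4)
        if 0 <= offset <= 0xFFFF:
            yield segment, offset


pairs = list(aliases(0x10020))
print(len(pairs))                  # prints 4096
print((0x1000, 0x0020) in pairs)   # prints True
print((0x0FFF, 0x0030) in pairs)   # prints True
```

Each step of 1 in the segment value moves the base by 16 bytes, so an address in the middle of memory is reachable through 64 KB / 16 = 4096 different pairs.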

Default Segments for Instructions

In real mode, certain instructions assume specific segment registers by default:

  • CS (Code Segment): Used to fetch instructions.
    • Holds the segment address of the program's code.
  • SS (Stack Segment): Used for stack operations.
    • Holds the segment address of the stack.
  • DS (Data Segment): Used by default for most data accesses, including the source operand of string operations.
    • Holds the segment address for accessing data.
  • ES (Extra Segment): Used for the destination operand of string operations.
    • Also serves as an additional data segment.

Each segment register holds a 16-bit value. Multiplied by 16, this value gives the starting address of a 64 KB memory block (segment).

 

IMG_20171007_080255.png

Since segment registers are 16-bit and the maximum offset is also 16-bit, each segment can span from Segment Base to Segment Base + 64 KB - 1. In real mode:

  • A maximum of 64 KB can be accessed within each segment.
  • Total memory accessible is 1 MB (from address 0x00000 to 0xFFFFF).

Rules of Memory Segmentation:

  • The four segments can overlap for small programs. In a minimal system, all four segments can start at address 00000H.
  • A segment can begin at any memory address that is divisible by 16.

1️⃣ What is the Global Descriptor Table (GDT)❔

The Global Descriptor Table (GDT) is a data structure used by the x86 architecture to define memory segments. It contains segment descriptors that specify the base address, limit, access rights, and other attributes for each segment of memory accessible by the CPU. The GDT is primarily used for managing memory segmentation, which is essential for providing memory protection, multitasking capabilities, and efficient memory access in operating systems.

Purpose of the GDT:

The GDT serves several key purposes in the x86 architecture:

  1. Memory Segmentation: It defines the memory segments that the CPU can address and manage. Each segment descriptor in the GDT specifies the base address and size limit of a memory segment, allowing the CPU to access different parts of memory efficiently.
  2. Memory Protection: Segment descriptors in the GDT include access rights and permissions that control how segments can be accessed. This includes read, write, execute permissions, as well as attributes like privilege level and segment type (code, data, system, etc.). These protections ensure that processes cannot interfere with or access memory outside their designated segments.
  3. Operating System Management: The GDT is managed by the operating system kernel to set up and manage memory segments for user processes and system components. It allows the OS to create separate address spaces for different processes, enforce memory protection policies, and handle context switching between tasks.

Segmentation Faults and Exceptions

The CPU enforces the protection described in the GDT by raising an exception, typically a general protection fault (often reported to programs as a segmentation fault), when a violation occurs. For example:

  • If a process tries to access memory outside its segment limit.
  • If a lower-privilege process attempts to access a higher-privilege segment.

This mechanism ensures that:

  • User-level processes cannot corrupt kernel-level memory.
  • Processes cannot access each other’s memory unless explicitly allowed.

Structure of the GDT:

The GDT in protected mode is an array of 8-byte (64-bit) segment descriptors (entries). The fields of each descriptor define the segment's attributes, such as its base address, limit, and access rights. Each descriptor (entry) has the following structure:

Bits   | Field                            | Description
-------+----------------------------------+-------------------------------------------------------------------
0–15   | Segment Limit (15:0)             | Defines the lower 16 bits of the segment limit (the size of the segment).
16–31  | Base Address (15:0)              | Defines the lower 16 bits of the base address (the starting address).
32–39  | Base Address (23:16)             | Defines the next 8 bits of the base address.
40–43  | Type                             | Describes the type of segment (e.g., code, data, system segment).
44     | S (Descriptor Type)              | 0 = system segment, 1 = code/data segment.
45–46  | Descriptor Privilege Level (DPL) | Defines the privilege level (0 = highest, 3 = lowest).
47     | P (Present)                      | Segment present flag (1 = segment present, 0 = segment not present).
48–51  | Segment Limit (19:16)            | Defines the upper 4 bits of the segment limit.
52     | AVL (Available)                  | Available for use by the system (software).
53     | L (64-bit)                       | 1 for 64-bit code segment (used in x86-64 long mode).
54     | D/B (Default Operation Size)     | 0 = 16-bit, 1 = 32-bit (affects code/data size).
55     | G (Granularity)                  | 0 = limit in bytes, 1 = limit in 4 KB units (used to expand segment size).
56–63  | Base Address (31:24)             | Defines the upper 8 bits of the base address.

  • Segment Base Address (4 bytes, 32 bits total):
    • Base Address: Specifies the starting linear address of the segment. With paging disabled, the linear address is the same as the physical address.
    • It is split across three fields: the first 16 bits (Base (15:0)), the next 8 bits (Base (23:16)), and the last 8 bits (Base (31:24)).
    • Combining these three fields gives a 32-bit base address, allowing segments to be placed anywhere in the 4 GB address space.
  • Segment Limit (20 bits):
    • Limit: Specifies the size limit of the segment. Depending on the Granularity bit, this limit can be interpreted as bytes or pages.
    • Segment Limit (19:16): Upper 4 bits of the segment limit.
    • Segment Limit (15:0): Lower 16 bits of the segment limit.
    • The Segment Limit defines the size of the segment. A segment can be up to 1 MB in size by default, but with granularity (G) enabled, it can be extended to up to 4 GB.
  • Access Rights (1 byte, 8 bits):
    • Defines the type of segment, its privilege level, and whether it is present in memory. It is broken down into:
      • Present (P): 1 bit. If set to 1, the segment is present in memory. If 0, the segment is not present, and access to it will cause a fault.
      • Descriptor Privilege Level (DPL): 2 bits. This specifies the privilege level of the segment (0 = kernel, 3 = user).
      • Segment Type (Type): 4 bits. Describes the type of segment. This field distinguishes between code, data, and system segments and also specifies permissions (read, write, execute). Common values for the Type field:
        • Code Segment: Executable, readable, but not writable.
        • Data Segment: Readable and writable but not executable.
      • Descriptor Type (S): 1 bit. If set to 1, it indicates a code or data segment. If 0, it indicates a system segment (e.g., TSS, LDT).
  • Flags (4 bits):
    • The flags occupy the upper 4 bits of byte 6 of the descriptor (the lower 4 bits of that byte hold Segment Limit (19:16)). They define extra attributes:
      • G (Granularity): 1 bit. If set to 1, the segment limit is multiplied by 4 KB, allowing segment sizes up to 4 GB. If 0, the segment limit is interpreted directly, with a maximum size of 1 MB.
      • D/B (Default Operation Size/Big): 1 bit. If set to 1, it indicates 32-bit segment operations. If 0, it indicates 16-bit segment operations.
      • L (Long Mode): 1 bit. If set to 1, it indicates the segment is a 64-bit code segment (used in 64-bit mode).
      • AVL (Available for Use by System Software): 1 bit. The CPU ignores this bit, so the operating system is free to use it; it is usually set to 0.
image-229.png
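
The way the base and limit are scattered across the 8 bytes is easier to see when packing a descriptor by hand. The Python sketch below does this on a development machine; `encode_descriptor` and its parameter names are illustrative assumptions, not code from TheTaaJ.

```python
import struct


def encode_descriptor(base: int, limit: int, access: int, flags: int) -> bytes:
    """Pack one 8-byte segment descriptor in little-endian memory order.

    base   : 32-bit base address
    limit  : 20-bit segment limit
    access : access byte (P, DPL, S, Type)
    flags  : 4-bit flags nibble (G, D/B, L, AVL)
    """
    if not 0 <= base <= 0xFFFFFFFF:
        raise ValueError("base must fit in 32 bits")
    if not 0 <= limit <= 0xFFFFF:
        raise ValueError("limit must fit in 20 bits")
    if not 0 <= access <= 0xFF or not 0 <= flags <= 0xF:
        raise ValueError("access is 8 bits, flags is 4 bits")
    return struct.pack(
        "<HHBBBB",
        limit & 0xFFFF,                        # limit 15:0
        base & 0xFFFF,                         # base 15:0
        (base >> 16) & 0xFF,                   # base 23:16
        access,                                # access byte
        (flags << 4) | ((limit >> 16) & 0xF),  # flags + limit 19:16
        (base >> 24) & 0xFF,                   # base 31:24
    )


# Flat 32-bit code segment: base 0, limit 0xFFFFF, access 0x9A, flags 0xC
print(encode_descriptor(0, 0xFFFFF, 0x9A, 0xC).hex())  # prints ffff0000009acf00
```

The printed bytes appear in the order they sit in memory, which is also the order of the `dw`/`db` lines in an assembly listing of the table.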

Example of a GDT Entry:

To illustrate, let's consider an example of a GDT entry for a code segment:

; Example GDT Entry for a Code Segment

Limit Low  : 0xFFFF       ; Segment limit (low 16 bits)
Base Low   : 0x0000       ; Base address (low 16 bits)
Base Middle: 0x00         ; Base address (middle 8 bits)
Access Byte: 0x9A         ; P=1, DPL=0, S=1, Type=0xA (Code Segment, Execute/Read)
Flags      : 0xCF         ; G=1 (4 KB granularity), D/B=1 (32-bit), limit (high 4 bits) = 0xF
Base High  : 0x00         ; Base address (high 8 bits)

In this example:

  • The low 16 bits of the segment limit are 0xFFFF. Together with the high 4 bits (0xF, the low nibble of the flags byte), the full 20-bit limit is 0xFFFFF; with 4 KB granularity this covers the entire 4 GB address space.
  • The base address is set to 0x00000000.
  • The access byte (0x9A) indicates that the segment is present (P=1), has privilege level 0 (DPL=0), and is an executable, readable code segment (S=1, Type=0xA).
  • The upper nibble of the flags byte (0xCF) specifies 4 KB granularity (G=1) and a 32-bit segment (D/B=1).
  • The base address high byte is 0x00, completing the 32-bit base address.
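
As a cross-check of the values 0x9A and 0xCF, here is a small Python sketch that splits them into bit fields and computes the size the limit covers. The helper names `decode_access` and `segment_size` are hypothetical and exist only for this illustration.

```python
def decode_access(access: int) -> dict:
    """Split an access byte into its P, DPL, S and Type fields."""
    return {
        "P": (access >> 7) & 0x1,
        "DPL": (access >> 5) & 0x3,
        "S": (access >> 4) & 0x1,
        "Type": access & 0xF,
    }


def segment_size(limit: int, granularity: int) -> int:
    """Size in bytes covered by a 20-bit limit.

    With G=0 the limit counts bytes; with G=1 it counts 4 KB pages.
    The limit is the last valid offset, hence the +1.
    """
    if granularity:
        return (limit + 1) * 4096
    return limit + 1


flags_byte = 0xCF
flags = flags_byte >> 4          # upper nibble: G, D/B, L, AVL
limit_high = flags_byte & 0xF    # lower nibble: limit 19:16
limit = (limit_high << 16) | 0xFFFF

print(decode_access(0x9A))   # prints {'P': 1, 'DPL': 0, 'S': 1, 'Type': 10}
print(hex(limit))            # prints 0xfffff
print(segment_size(limit, flags >> 3) == 4 * 1024**3)  # prints True
```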

Using the GDT:

Once the GDT is populated with entries, it is made known to the processor with the LGDT instruction, which loads the GDT register (GDTR) from a 6-byte structure in memory holding the table's size minus one (16 bits) and its base address (32 bits).

Loading the GDT

lgdt [gdtr]     ; Load GDTR with the base address and limit of the GDT
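
The memory operand that LGDT reads is 6 bytes long: a 16-bit limit followed by a 32-bit base. The Python sketch below builds such an operand to show the layout; `gdtr_operand`, the entry count of 5, and the address 0x7E00 are assumptions chosen purely for illustration.

```python
import struct


def gdtr_operand(gdt_base: int, entry_count: int) -> bytes:
    """Build the 6-byte operand read by LGDT: 16-bit limit, 32-bit base.

    The limit is the table size in bytes minus one, so a table of
    N 8-byte descriptors has limit N * 8 - 1.
    """
    if not 1 <= entry_count <= 8192:
        raise ValueError("a GDT holds between 1 and 8192 descriptors")
    limit = entry_count * 8 - 1
    return struct.pack("<HI", limit, gdt_base & 0xFFFFFFFF)


# Hypothetical table of 5 descriptors placed at linear address 0x7E00
operand = gdtr_operand(0x7E00, 5)
print(operand.hex())  # prints 2700007e0000
```

The limit is stored as size minus one because it names the last valid byte offset inside the table, not the byte count.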

Accessing Segments:

  • The CPU uses segment selectors (stored in segment registers such as CS, DS, SS) to index into the GDT.
  • The selected segment descriptor provides information for address translation, memory protection checks, and segment access rights enforcement.
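
A selector is more than a plain index: the upper 13 bits select the descriptor, bit 2 chooses between the GDT and an LDT, and the low 2 bits carry the requested privilege level. The Python sketch below decodes a few selectors; the function names are hypothetical, and the values 0x08 to 0x20 are the ones defined as CODE_DESC, DATA_DESC, CODE16_DESC and DATA16_DESC in gdt.inc.

```python
def decode_selector(selector: int) -> dict:
    """Split a 16-bit segment selector into its three fields."""
    return {
        "index": selector >> 3,        # descriptor number in the table
        "ti": (selector >> 2) & 0x1,   # table indicator: 0 = GDT, 1 = LDT
        "rpl": selector & 0x3,         # requested privilege level
    }


def descriptor_offset(selector: int) -> int:
    """Byte offset of the selected descriptor from the table base."""
    return (selector >> 3) * 8


for sel in (0x08, 0x10, 0x18, 0x20):
    print(hex(sel), decode_selector(sel), descriptor_offset(sel))
```

For selectors with TI = 0 and RPL = 0, the selector value equals the byte offset of the descriptor inside the GDT, which is why the entries are numbered 0x08, 0x10, 0x18 and so on.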

2️⃣ Implementation in TheTaaJ:

gdt.inc:

%ifndef _GDT_INC_
%define _GDT_INC_

BITS 16

; Definitions
%define 		NULL_DESC 		0
%define 		CODE_DESC 		0x8
%define 		DATA_DESC 		0x10
%define			CODE16_DESC		0x18
%define 		DATA16_DESC 	0x20


; ********************************
; InstallGdt32
; ********************************
InstallGdt32:
	; Save state
	pushad		; pushes all general purpose 32-bit registers onto the stack
			; This saves their current values so they can be restored later.

	mov	si, sGDT32InstallingSentence
	call	PrintString16BIOS
	call	PrintNewline		; \n

	; Clear interrupts
	cli 		; Disable interrupts to prevent any interruptions.

	; Load Gdt
	lgdt 	[GDT32]

	; Restore & Return
	sti		; Restore interrupts.
	
	mov	si, sGDT32InstalledSentence
	call	PrintString16BIOS
	call	PrintNewline		; \n
	
	popad		; Pops the previously saved values of general purpose 32-bit
				; registers.
	ret

;*******************************************
; Global Descriptor Table
;*******************************************
StartOfGdt:		; Beginning of the GDT section
	dd 0            ; null descriptor
	dd 0 

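; gdt code:	        ; 32-bit code descriptor (selector 0x08, CODE_DESC)
	dw 0FFFFh       ; limit low (bits 15:0)
	dw 0		; base low (bits 15:0)
	db 0            ; base middle (bits 23:16)
	db 10011010b    ; access: P=1, DPL=0, S=1, type=1010b (execute/read)
	db 11001111b    ; flags G=1, D/B=1 + limit high (bits 19:16)
	db 0            ; base high (bits 31:24)

; gdt data:	        ; 32-bit data descriptor (selector 0x10, DATA_DESC)
	dw 0FFFFh       ; limit low (same as code)
	dw 0            ; base low (bits 15:0)
	db 0            ; base middle (bits 23:16)
	db 10010010b    ; access: P=1, DPL=0, S=1, type=0010b (read/write)
	db 11001111b    ; flags G=1, D/B=1 + limit high (bits 19:16)
	db 0            ; base high (bits 31:24)

; gdt code 16bit:	; 16-bit code descriptor (selector 0x18, CODE16_DESC)
	dw 0FFFFh       ; limit low (bits 15:0)
	dw 0            ; base low (bits 15:0)
	db 0            ; base middle (bits 23:16)
	db 10011010b    ; access: P=1, DPL=0, S=1, type=1010b (execute/read)
	db 00001111b    ; flags G=0, D/B=0 + limit high (bits 19:16)
	db 0            ; base high (bits 31:24)

; gdt data 16bit:       ; 16-bit data descriptor (selector 0x20, DATA16_DESC)
	dw 0FFFFh       ; limit low (same as code)
	dw 0            ; base low (bits 15:0)
	db 0            ; base middle (bits 23:16)
	db 10010010b    ; access: P=1, DPL=0, S=1, type=0010b (read/write)
	db 00001111b    ; flags G=0, D/B=0 + limit high (bits 19:16)
	db 0            ; base high (bits 31:24)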
EndOfGdt:

; The actual Gdt Header
GDT32:
	dw EndOfGdt - StartOfGdt - 1	; Size
	dd StartOfGdt			; Starting address


sGDT32InstallingSentence db 'Installing the GDT 32...', 0
sGDT32InstalledSentence db 'Installed the GDT 32...', 0

%endif

stage2.asm:

	;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
	call InstallGdt32
	;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

Output:

image-157.png

Complete Source Code:

Get the complete source code here:

https://github.com/The-Jat/TheTaaJ/tree/b0bc6fe85360164c6752af17bd6b2b116502cde4