The A20 Line

1 A Quick Background

1.1 The 8086: A 16-bit Processor

When the Intel 8086 microprocessor was released in 1978, it was a 16-bit processor with a 20-bit address bus. This allowed it to address up to 1 MB of memory, from address 0x00000 to 0xFFFFF. Memory was addressed through segment:offset pairs: the 16-bit segment value was shifted left by 4 bits, and the 16-bit offset was added to the result. This covered the full 1 MB range, but it also meant that a pair could point just past it: an address like 0xFFFF:0x0010 computes to 0x100000, which does not fit in 20 bits and wraps around to 0x00000.

Characteristics and capabilities of the 8086:

  • Data Bus:
    • The Intel 8086 has a 16-bit data bus. This means it can transfer 16 bits of data between the CPU and memory or peripherals in a single bus cycle.
    • A wider data bus allows for faster data transfer compared to processors with narrower buses (e.g., 8-bit data bus).
  • Address Bus:
    • The Intel 8086 has a 20-bit address bus. This allows it to address up to 1 MB (2^20 bytes) of memory.
      • 2^20 = 1,048,576 unique locations, each holding 1 byte, which gives 1,048,576 bytes = 1024 KB (1 MB).
    • The 20-bit address bus determines the maximum addressable memory space. Each address represents a unique location in memory.
  • Registers:
    • The 8086 has 16-bit general-purpose registers (AX, BX, CX, DX, SI, DI, BP, SP) and segment registers (CS, DS, ES, SS).
    • The use of 16-bit registers allows the processor to manipulate data in 16-bit chunks efficiently.

The 8086 has 16-bit registers, so each register can hold a value between 0x0000 and 0xFFFF, or 0 to 65535 in decimal (2^16 - 1). If a single 16-bit register were used to address memory directly, it could reach only 64 KB (2^16 bytes). This is the mismatch between a 20-bit address space and 16-bit registers, and it was overcome with the segmentation model.
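
The sizes quoted above follow directly from the bit widths. As a quick check, here is a minimal Python sketch of the arithmetic; the constant and variable names are chosen for this illustration only.

```python
# Sizes implied by the register and address-bus widths discussed above.
REGISTER_BITS = 16
ADDRESS_BITS = 20

max_register_value = 2**REGISTER_BITS - 1   # largest value a 16-bit register can hold
directly_addressable = 2**REGISTER_BITS     # bytes reachable with a single 16-bit value
address_space = 2**ADDRESS_BITS             # bytes reachable over a 20-bit address bus

print(hex(max_register_value), max_register_value)  # 0xffff 65535
print(directly_addressable // 1024, "KB")           # 64 KB
print(address_space // 1024, "KB")                  # 1024 KB
```

The 1 MB address space is 16 times larger than what one 16-bit register can cover, which is the gap segmentation fills.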

You are probably familiar with the segmented model, which lets the 8086 address 1 MB of memory using 16-bit registers.

Segmentation:

Segmentation is a memory management technique that divides the total addressable memory space into smaller, logical segments (chunks of memory). It allows the CPU to reach more memory than its 16-bit registers could address on their own, which gives a flexible way to manage memory in systems with larger address spaces.

Key Points of Segmentation:

  1. Logical Division: Memory is divided into segments, such as code segment, data segment, stack segment, and extra segment. Each segment can be up to 64 KB in size.
  2. Segment Registers: The processor uses segment registers (CS, DS, SS, ES) to select these segments. Each segment register is 16 bits wide and holds the segment's starting address divided by 16.
  3. Offset: Within each segment, an offset (also 16 bits) specifies the exact location of data or instructions. The combination of a segment base address and an offset forms a complete address.
  4. Address Calculation: The physical address is calculated using the formula:
    Physical Address = (Segment * 16)  + Offset
    This formula shifts the segment value left by 4 bits (or multiplies by 16) and adds the offset to generate a 20-bit physical address.
  5. Address Space Utilization: Segmentation enables the processor to utilize its full 1 MB address space, overcoming the limitations of 16-bit registers that can only address up to 64 KB directly.

Segment and Offset: The Intel 8086 processor used a segmented memory model where a physical address was computed using a combination of a segment register and an offset. The physical address was calculated as:

Physical Address = (Segment x 16)  + Offset

Each Segment was 64 KB in size (2^16 bytes), and offsets were 16-bit values ranging from 0 to 65535 (2^16 - 1).

When a program referenced memory, the processor would use the segment specified in a segment register (such as CS for code segment, DS for data segment, etc.) and the offset to calculate the physical address.

  • For example, if CS (code segment register) contained 0x1000 and IP (instruction pointer, which supplies the offset) contained 0x0010, the physical address would be:
    • Physical Address = (0x1000 * 16) + 0x0010
    • 0x10000 + 0x0010
    • 0x10010, which is the resulting physical address.
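
The formula is easy to verify in code. The following is a minimal Python sketch, not part of any bootloader; the function name `physical_address` is chosen for this illustration. It also shows that different segment:offset pairs can name the same physical byte, because segments start every 16 bytes and overlap.

```python
def physical_address(segment: int, offset: int) -> int:
    """Linear address of a real-mode segment:offset pair (no wraparound applied)."""
    assert 0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF
    return (segment << 4) + offset  # shifting left by 4 is the same as multiplying by 16

print(hex(physical_address(0x1000, 0x0010)))  # 0x10010
print(hex(physical_address(0x1001, 0x0000)))  # 0x10010 (a different pair, same byte)
```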

Address Wrapping in 8086:

Due to the segmented architecture, addresses could theoretically go beyond the 1 MB limit of the 8086's 20-bit address bus.

When a computed address exceeded 0xFFFFF (the last byte of the 1 MB range), the 8086 did not generate an exception. The carry into bit 20 was simply lost because there were only 20 address lines, so the address wrapped around.

For example:

  • Suppose the CS register contained 0xFFFF and IP contained 0xFFF0.
  • The physical address would be:
    • Physical Address = (0xFFFF * 16) + 0xFFF0
    • 0xFFFF0 + 0xFFF0
    • 0x10FFE0
    • However, because of the 20-bit limit, the actual physical address would wrap around:
      • To find the wrapped around address, simply take the initial physical address modulo 0x100000 (1 MB in hexadecimal).
      • Wrapped Physical Address  = Initial Physical Address mod 0x100000
      • Wrapped Physical Address = 0x10FFE0 mod 0x100000
      • Wrapped Physical Address = 0x00FFE0
    • Thus, an attempt to access 0xFFFF:0xFFF0 (linear address 0x10FFE0) would actually access address 0x0FFE0.
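
The wraparound can be modeled by keeping only the low 20 bits of the computed address, which gives the same result as taking it modulo 0x100000. This is a small Python sketch under that assumption; the names are illustrative.

```python
ADDRESS_MASK_20 = 0xFFFFF  # the 8086/8088 drive only 20 address lines

def physical_address_8086(segment: int, offset: int) -> int:
    """Address that actually reaches the bus on a 20-bit machine."""
    linear = (segment << 4) + offset   # can be as large as 0x10FFEF (21 bits)
    return linear & ADDRESS_MASK_20    # same as linear % 0x100000

print(hex(physical_address_8086(0xFFFF, 0xFFF0)))  # 0xffe0
print(hex(physical_address_8086(0xFFFF, 0x0010)))  # 0x0
print(hex(physical_address_8086(0xFFFF, 0xFFFF)))  # 0xffef
```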

1.2 8088 Processor

Data Bus Width:

  • The 8088 has an 8-bit external data bus, compared to the 16-bit external data bus of the 8086.
  • Internally, it still processes 16-bit data and uses 16-bit registers.

Address Bus:

  • Both the 8086 and 8088 have a 20-bit address bus, allowing them to address up to 1 MB of memory.
  • Those 20 address bits can hold a value between 0 and 2^20 - 1 = 1,048,575, so there are 2^20 = 1,048,576 distinct addresses. Memory is byte-addressable: each address selects one byte (8 bits), never an individual bit.
  • Physical address calculation: Physical Address = (Segment * 16 ) + Offset

Wraparound Behavior:

  • On the 8086 and 8088, which have a 20-bit address bus, memory addresses wrap around when they go past the 1 MB limit (the last valid address is 0xFFFFF).
  • Example: If the address calculation results in 0x100000, bit 20 is dropped and the access goes to 0x00000. Some software came to rely on this behavior.

1.3 The 80286 Processor

Introduced in 1982, the Intel 80286 processor had a 24-bit address bus, capable of addressing up to 16 MB of memory.

It had two modes of operating:

  • Real Mode = In this mode, the 80286 maintained backward compatibility with the 8086/8088 by using the same segment:offset address calculation.
    • In real mode the 80286 is supposed to behave exactly like an 8086 or 8088, for compatibility.
  • Protected Mode = In Protected Mode, it could address the full 24-bit address space.

This left a compatibility problem: software designed for the 8086/8088 might rely on the wraparound behavior, which the 80286's extra address lines no longer produced.

So the concept of the A20 gate was introduced.

2 A20 Line:

The A20 address line is the physical representation of the 21st bit (number 20, counting from 0) of any memory access. A gate on this line (Gate A20) was introduced with 80286-based systems to maintain backward compatibility with the 8086/8088.

  • Wrapping Around is the reason for the introduction of A20 Line.

The 8086/8088 wraps any memory access beyond the 1 MB range back to the beginning of memory. The 80286, however, has 24 address lines and a 16 MB address space, so an address past 1 MB no longer wraps. Many programs written for the 8086/8088 relied on the wraparound behavior, and if the 80286 and later processors did not emulate it, those programs could malfunction.

Thus, to preserve this behavior for compatibility with the 8086/8088, the A20 gate is used.

  • When Gate A20 is Disabled:
    • The A20 line is forced to 0, causing the address to wrap around at the 1 MB boundary.
    • This mimics the 8086/8088 behavior, ensuring compatibility with legacy software.
  • When Gate A20 is Enabled:
    • The A20 line functions normally, allowing access to the full addressable memory space beyond 1 MB.
    • This is useful in Protected Mode and for applications needing access to more memory.
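
The effect of the gate can be modeled as a mask on a single address bit. The sketch below is a hypothetical Python model, and `gate_a20` is a name chosen for this illustration. Note that the gate clears only bit 20. For real-mode addresses (at most 0x10FFEF) that is the same as wrapping at 1 MB, but for larger addresses the higher bits are left alone.

```python
A20_BIT = 1 << 20  # bit 20 is the 21st address line

def gate_a20(address: int, a20_enabled: bool) -> int:
    """Address seen by memory after passing through the A20 gate."""
    if a20_enabled:
        return address
    return address & ~A20_BIT  # gate disabled: bit 20 is forced to 0

print(hex(gate_a20(0x100500, a20_enabled=False)))  # 0x500
print(hex(gate_a20(0x100500, a20_enabled=True)))   # 0x100500
print(hex(gate_a20(0x300000, a20_enabled=False)))  # 0x200000
```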

PC-compatible systems built around processors after the 80286 kept the A20 gate for the same reason.

Intel 80386 (386):

  • It was the first 32-bit x86 microprocessor. It has a 32-bit address bus, allowing it to address up to 4 GB of physical memory.
  • It has a 32-bit data bus.
  • It has three operating modes: real mode, protected mode, and virtual 8086 mode, catering to both backward compatibility and newer operating systems.
    • Real Mode: Compatible with earlier x86 processors (8086/8088), supporting 16-bit addressing and compatibility with DOS applications.
      • In Real mode, the 80386 emulated the 8086/8088 behavior for memory addressing including the wraparound at 1 MB.
    • Protected Mode: Enabled 32-bit addressing, memory protection, and multitasking capabilities, crucial for modern operating systems like OS/2 and early versions of Windows.
      • In Protected Mode, the A20 gate is typically enabled to access the full 32-bit address space.
    • Virtual 8086 Mode: Allowed multiple virtual 8086 environments within Protected Mode, supporting DOS applications seamlessly.
  • A20 Gate Control: The A20 gate can be controlled to mimic the behavior of the 8086/8088:
    • A20 Line Enabled: Addresses beyond 1 MB are accessible, and the A20 line functions normally.
    • A20 Line Disabled: Forces the A20 line to 0, causing addresses to wrap around at 1 MB, replicating the behavior of the older processors.

A20 Gate State:

During the Power-On Self Test (POST), the BIOS (Basic Input/Output System) sets up the A20 gate based on its configuration. In most cases it is left disabled so that the processor boots in real mode with 8086-style wraparound, and the bootloader enables it explicitly before switching to protected mode so that all of memory is accessible.

Different Methods of Enabling A20

To get access to the whole of memory, we have to enable the A20 gate. Several methods have historically been used to enable the A20 line:

  1. BIOS Method
    1. Calling a BIOS function to enable A20.
  2. Keyboard Controller Method
    1. Sending a command to the keyboard controller to enable A20.
  3. System Control Port (Port 92h)
    1. Toggling the A20 gate directly through the system control port.
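
Because no single method works on every machine, a common arrangement is to try the methods in order and re-check the gate after each attempt. The following Python sketch shows only that control flow. The `fake_*` functions and the `state` dictionary are stand-ins invented for this example, not real hardware access.

```python
from typing import Callable, Iterable

def enable_a20(is_enabled: Callable[[], bool],
               methods: Iterable[Callable[[], None]]) -> bool:
    """Try each enabling method in turn, re-checking the gate after each attempt."""
    if is_enabled():
        return True                 # nothing to do
    for method in methods:
        method()
        if is_enabled():
            return True
    return False                    # every method failed

# Stand-in hardware for demonstration only.
state = {"a20": False}
tried = []

def fake_bios():
    tried.append("bios")            # pretend the BIOS does not support the call

def fake_keyboard_controller():
    tried.append("keyboard")
    state["a20"] = True             # pretend this method works

def fake_port_92():
    tried.append("port92")
    state["a20"] = True

ok = enable_a20(lambda: state["a20"],
                [fake_bios, fake_keyboard_controller, fake_port_92])
print(ok, tried)  # True ['bios', 'keyboard']
```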

But first we should check its state:

Checking A20 Gate State:

The A20 gate's significance stems from the early days of PC architecture. The 8086 and 8088 were designed with a 20-bit address bus, allowing them to address up to 1 MB of memory directly. When later processors added more address lines, the A20 gate was introduced for compatibility, to control whether memory addresses wrap around at the 1 MB limit.

Sometimes we need to know whether A20 is already enabled. There are various ways to find out, but the most common one is to write to memory just beyond the 1 MB boundary and see whether the write wraps around:

If the write wraps around, A20 is disabled; if it does not, A20 is enabled.

  • Step 1: Store Original Value of Test Memory Location.
    • We will use two memory addresses: 0x0500 (0x0000:0x0500) and 0x100500 (0xFFFF:0x0510, since 0xFFFF0 + 0x0510 = 0x100500).
    • The first thing is to save the original bytes at these addresses on the stack.
  • Step 2: Write test values.
    • Now write test values to these addresses:
      • Write 0x00 to 0x0500 and 0xFF to 0x100500.
      • Why these two addresses?
        • If wraparound is happening, 0x100500 points to the same byte as 0x0500, as explained below:
        • Address range when A20 is disabled = 0x00000 - 0xFFFFF (1 MB)
        • Accessing 0x100000 would wrap around to 0x00000.
        • To find the wrapped-around address, take the address modulo 0x100000.
        • In our case, 0x100500 % 0x100000 = 0x0500, which is the address where we wrote 0x00.
      • Thus, with A20 disabled, writing to 0x100500 effectively overwrites the byte at 0x0500.
  • Step 3: Compare Values
    • Now compare the byte at 0x0500 with the value we wrote to 0x100500, which is 0xFF.
    • If they are the same, the write wrapped around and A20 is disabled. If the byte at 0x0500 is still 0x00, A20 is enabled.
  • Step 4: Restore Original Values From Stack to Test Memory Addresses.
    • Restore the original value from the stack to the test memory addresses which we saved earlier.
  • Step 5: Return the status.
    • Return the status indicating whether the A20 line is enabled or disabled.
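
The five steps can be tried out against a toy memory model. The Python sketch below is a simulation only. The `RealModeMemory` class and its behavior are assumptions made for this example: a sparse byte store behind a gate that clears bit 20 when A20 is disabled.

```python
class RealModeMemory:
    """Toy byte-addressable memory behind an A20 gate (hypothetical model)."""

    def __init__(self, a20_enabled: bool):
        self.a20_enabled = a20_enabled
        self.cells = {}                       # sparse store; unwritten bytes read as 0

    def _physical(self, segment: int, offset: int) -> int:
        linear = (segment << 4) + offset
        if not self.a20_enabled:
            linear &= ~(1 << 20)              # gate forces bit 20 to 0
        return linear

    def read(self, segment: int, offset: int) -> int:
        return self.cells.get(self._physical(segment, offset), 0)

    def write(self, segment: int, offset: int, value: int) -> None:
        self.cells[self._physical(segment, offset)] = value & 0xFF


def check_a20(mem: RealModeMemory) -> bool:
    """Return True if A20 is enabled, following steps 1 to 5."""
    low = (0x0000, 0x0500)                    # physical 0x000500
    high = (0xFFFF, 0x0510)                   # physical 0x100500 when A20 is enabled

    saved_low = mem.read(*low)                # Step 1: save the original bytes
    saved_high = mem.read(*high)

    mem.write(*low, 0x00)                     # Step 2: write the test values
    mem.write(*high, 0xFF)

    wrapped = mem.read(*low) == 0xFF          # Step 3: did the high write hit the low byte?

    mem.write(*high, saved_high)              # Step 4: restore the original bytes
    mem.write(*low, saved_low)

    return not wrapped                        # Step 5: report the status


print(check_a20(RealModeMemory(a20_enabled=True)))   # True
print(check_a20(RealModeMemory(a20_enabled=False)))  # False
```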

Here is the complete code for this check.

; ********************************
; CheckA20
; OUT: 
;   - AX: 0 If A20 is disabled, 1 If enabled
; ******************************** 
CheckA20:
    pushf               ; Push flags register onto stack
    push    ds          ; Push ds register onto stack
    push    es          ; Push es register onto stack
    push    di          ; Push di register onto stack
    push    si          ; Push si register onto stack

    cli                 ; Disable interrupts

    xor     ax, ax      ; Clear ax register (ax = 0)
    mov     es, ax      ; Set es segment register to 0

    not     ax          ; Invert ax (ax = 0xFFFF)
    mov     ds, ax      ; Set ds segment register to 0xFFFF

    mov     di, 0x0500  ; Set di to offset 0x0500
    mov     si, 0x0510  ; Set si to offset 0x0510

    mov     al, byte [es:di]  ; Read byte at es:di into al
    push    ax          ; Push ax onto stack

    mov     al, byte [ds:si]  ; Read byte at ds:si into al
    push    ax          ; Push ax onto stack

    mov     byte [es:di], 0x00  ; Write 0x00 to es:di (0x0000:0x0500 = 0x000500)
    mov     byte [ds:si], 0xFF  ; Write 0xFF to ds:si (0xFFFF0 + 0x0510 = 0x100500);
                                ; with wraparound this is 0x100500 % 0x100000 = 0x0500

    cmp     byte [es:di], 0xFF  ; Compare byte at es:di with 0xFF
                                ; Equal: the write wrapped, A20 is disabled
                                ; Not equal: no wrap, A20 is enabled

    pop     ax          ; Pop ax from stack into al
    mov     byte [ds:si], al    ; Restore original value at ds:si

    pop     ax          ; Pop ax from stack into al
    mov     byte [es:di], al    ; Restore original value at es:di

    mov     ax, 0       ; Set ax to 0 (default value for disabled)

    je      CheckA20__Exit   ; Jump if equal to exit

    mov     ax, 1       ; Set ax to 1 (enabled)

CheckA20__Exit:
    pop     si          ; Pop si register from stack
    pop     di          ; Pop di register from stack
    pop     es          ; Pop es register from stack
    pop     ds          ; Pop ds register from stack
    popf                ; Pop flags register from stack

    ret                 ; Return from subroutine

1 BIOS Method:

This is a straightforward way to enable the A20 line without directly manipulating hardware ports. However, it is not guaranteed to work, because not all BIOSes support it. To support a wide variety of systems, you may want to include additional methods of enabling the A20 line, which are discussed after this one.

Many BIOS implementations provide a function to enable the A20 line: INT 0x15 with AX = 0x2401.

Here is the code for it:

; ********************************
; A20MethodBios
; ******************************** 
A20MethodBios:
	; BIOS function to enable A20
	; AX = 2401h specifies the function to enable A20
	mov 	ax, 0x2401
	int 	0x15		; Call BIOS interrupt 15h to enable A20 line
					; On return, CF set means the call failed or is unsupported
	ret					; Return from the subroutine

2 Keyboard Controller:

The 8088 in the original PC had only 20 address lines, good for 1 MB. The maximum address FFFF:FFFF addresses 0x10ffef, and this would silently wrap to 0x0ffef. When the 286 (with 24 address lines) was introduced, it had a real mode that was intended to be 100% compatible with the 8088. However, it failed to do this address truncation (a bug), and people found that there existed programs that actually depended on this truncation. Trying to achieve perfect compatibility, IBM invented a switch to enable/disable the 0x100000 address bit. Since the 8042 keyboard controller happened to have a spare pin, that was used to control the AND gate that disables this address bit. The signal is called A20, and if it is zero, bit 20 of all addresses is cleared.

In the context of enabling the A20 line, the "keyboard ports" refer to specific I/O ports used to communicate with the keyboard controller (8042). These ports are:

  • Command/Status Port (0x64): Used to send commands to the keyboard controller and read its status.
  • Data Port (0x60): Used to read data from or write data to the keyboard controller.

The keyboard controller (8042) is used to manage the keyboard interface and other system functions, including enabling the A20 line.

These are the steps to enable the A20 line through the keyboard controller:

  1. Clear Interrupts
  2. Check if Input Buffer is Full
  3. Send Command to Keyboard Controller
  4. Wait for Input Buffer to be Empty
  5. Send Data to Enable A20 Line
  6. Re-enable Interrupts

1 Clear Interrupts

Before starting the process, disable interrupts to prevent any interference.

cli ; Clear interrupts

2 Check if Input Buffer is Full

Read the status port (0x64) of the keyboard controller. If bit 1 of the status register is set, the input buffer is full, and you need to wait until it becomes empty.

wait_input_1:
in al, 0x64      ; Read status port
test al, 2       ; Check if input buffer is full (bit 1)
jnz wait_input_1 ; If full, read the status again until it is empty

3 Send Command to Keyboard Controller

Send the command 0xD1 to the command port (0x64). This command tells the keyboard controller that we want to write to its output port.

mov al, 0xD1 ; Command to write to output port
out 0x64, al ; Send command to keyboard controller

4 Wait for Input Buffer to be Empty

Again, check if the input buffer is full and wait until it becomes empty before proceeding.

wait_input_2:
in al, 0x64      ; Read status port again
test al, 2       ; Check if input buffer is full (bit 1)
jnz wait_input_2 ; If full, read the status again until it is empty

5 Send Data to Enable A20 Line

Send the data 0xDF to the data port (0x60). This value sets bit 1 in the output port, which enables the A20 line.

mov al, 0xDF ; Data to enable A20 line (set bit 1)
out 0x60, al ; Send data to keyboard controller

6 Re-enable Interrupts

Finally, re-enable the interrupts to allow the system to resume normal operation.

sti ; Re-enable interrupts

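Complete Code:

The routine below follows the same sequence, with two additions: it disables the keyboard interface while it works, and it reads the current output port value first so that only the A20 bit changes.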

; ********************************
; A20MethodKeyboardController
; ******************************** 
A20MethodKeyboardController:
        cli                     ; Clear interrupts to prevent interference

        call    A20Wait         ; Wait until the input buffer is empty
        mov     al,0xAD         ; Command to disable the keyboard interface
        out     0x64,al         ; Send the command to the keyboard controller

        call    A20Wait         ; Wait until the input buffer is empty
        mov     al,0xD0         ; Command to read the output port
        out     0x64,al         ; Send the command to the keyboard controller

        call    A20Wait2        ; Wait until the output buffer is full
        in      al,0x60         ; Read the output port from the data port
        push    ax              ; Save the value read from the output port on the stack

        call    A20Wait         ; Wait until the input buffer is empty
        mov     al,0xD1         ; Command to write to the output port
        out     0x64,al         ; Send the command to the keyboard controller

        call    A20Wait         ; Wait until the input buffer is empty
        pop     ax              ; Restore the value read from the output port
        or      al,2            ; Set the A20 gate enable bit (bit 1)
        out     0x60,al         ; Write the modified value back to the output port

        call    A20Wait         ; Wait until the input buffer is empty
        mov     al,0xAE         ; Command to enable the keyboard interface
        out     0x64,al         ; Send the command to the keyboard controller

        call    A20Wait         ; Wait until the input buffer is empty
        sti                     ; Re-enable interrupts
        ret                     ; Return from subroutine

; Waits until the input buffer is empty
A20Wait:
        in      al,0x64         ; Read status port of the keyboard controller
        test    al,2            ; Check if the input buffer is full (bit 1)
        jnz     A20Wait         ; If full, wait until it is empty
        ret                     ; Return when the input buffer is empty

; Waits until the output buffer is full
A20Wait2:
        in      al,0x64         ; Read status port of the keyboard controller
        test    al,1            ; Check if the output buffer is full (bit 0)
        jz      A20Wait2        ; If not full, wait until it is full
        ret                     ; Return when the output buffer is full
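
The command sequence in this routine can be traced with a toy model of the controller. The Python sketch below is illustrative only. `Fake8042`, its default output port value of 0xDD, and the method names are assumptions made for this example. The model responds instantly, so the status polling done by A20Wait and A20Wait2 is left out.

```python
class Fake8042:
    """Toy model of the keyboard controller's command protocol (illustrative only)."""

    A20_BIT = 0x02                          # bit 1 of the output port gates A20

    def __init__(self, output_port: int = 0xDD):  # hypothetical value with A20 off
        self.output_port = output_port
        self.keyboard_enabled = True
        self._pending = None                # command waiting for a data byte
        self._out_buffer = None             # byte waiting to be read from port 0x60

    def write_command(self, cmd: int) -> None:    # models: out 0x64, al
        if cmd == 0xAD:
            self.keyboard_enabled = False
        elif cmd == 0xAE:
            self.keyboard_enabled = True
        elif cmd == 0xD0:                   # read output port
            self._out_buffer = self.output_port
        elif cmd == 0xD1:                   # next data byte goes to the output port
            self._pending = 0xD1

    def write_data(self, value: int) -> None:     # models: out 0x60, al
        if self._pending == 0xD1:
            self.output_port = value & 0xFF
            self._pending = None

    def read_data(self) -> int:                   # models: in al, 0x60
        value, self._out_buffer = self._out_buffer, None
        return value

    @property
    def a20_enabled(self) -> bool:
        return bool(self.output_port & self.A20_BIT)


def enable_a20_via_8042(kbc: Fake8042) -> None:
    kbc.write_command(0xAD)                 # disable the keyboard interface
    kbc.write_command(0xD0)                 # ask for the current output port value
    current = kbc.read_data()
    kbc.write_command(0xD1)                 # announce a write to the output port
    kbc.write_data(current | 0x02)          # set only the A20 bit
    kbc.write_command(0xAE)                 # re-enable the keyboard interface


kbc = Fake8042()
enable_a20_via_8042(kbc)
print(kbc.a20_enabled, hex(kbc.output_port))  # True 0xdf
```

Reading the port first and OR-ing in bit 1 leaves the other output port bits as they were, which a fixed data byte such as 0xDF does not guarantee.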

3 System Control Port (Port 92h)

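This method is faster and simpler than the keyboard controller method. It directly accesses the system control port (92h) to enable the A20 gate. Not every system implements this port, so it is best combined with a check of the A20 state afterwards.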

Steps:

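  1. Read the current value from port 92h.
  2. Set the A20 gate enable bit (bit 1) and keep bit 0 clear, since bit 0 triggers a fast system reset on many chipsets.
  3. Write the modified value back to port 92h.

in al, 0x92         ; Read system control port
or al, 2            ; Set A20 enable bit (bit 1)
and al, 0xFE        ; Keep bit 0 clear to avoid triggering a fast reset
out 0x92, al        ; Write back to system control port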
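
The read-modify-write on port 92h is plain bit arithmetic, sketched below in Python. The helper name `port92_value_for_a20` is made up for this example. The treatment of bit 0 as a fast-reset request reflects common chipset behavior and should be taken as an assumption for any particular machine.

```python
A20_ENABLE = 0x02   # bit 1: A20 gate
FAST_RESET = 0x01   # bit 0: requests a system reset on many chipsets (assumption)

def port92_value_for_a20(current: int) -> int:
    """Value to write back to port 0x92 to enable A20 without requesting a reset."""
    return (current | A20_ENABLE) & ~FAST_RESET & 0xFF

print(hex(port92_value_for_a20(0x00)))  # 0x2
print(hex(port92_value_for_a20(0x01)))  # 0x2
print(hex(port92_value_for_a20(0xF1)))  # 0xf2
```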