A stack is a data structure that follows the Last In, First Out (LIFO) principle: the last element pushed onto the stack is the first one popped off.
In assembly language, the stack is a fundamental structure used to store temporary data, such as function parameters, return addresses, and local variables.
Fundamentals of Stack
In assembly language, a stack is typically implemented using a region of memory known as the stack segment. This segment grows and shrinks dynamically as elements are pushed onto or popped off the stack.
Assembly language often relies on the stack for:
- Function calls (storing return addresses).
- Parameter passing (for passing arguments to functions).
- Saving register states before using them in a function (so they can be restored later).
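The register-saving case can be sketched as a small program. This is a minimal illustration, assuming Linux on x86-64 and NASM syntax; the function name double_it is hypothetical and exists only for this example.

```nasm
; Assumes: Linux x86-64, NASM syntax.
; Build: nasm -f elf64 save.asm && ld save.o -o save
        global  _start

        section .text
; double_it (hypothetical helper): returns rdi * 2 in rax.
; It uses rbx as scratch, so it saves and restores the caller's rbx.
double_it:
        push    rbx                 ; save the caller's rbx on the stack
        mov     rbx, rdi
        add     rbx, rbx            ; rbx = rdi * 2
        mov     rax, rbx
        pop     rbx                 ; restore the caller's rbx
        ret

_start:
        mov     rbx, 7              ; value the caller wants to keep
        mov     rdi, 10
        call    double_it           ; rax = 20, rbx is still 7
        lea     rdi, [rax + rbx]    ; exit status = 20 + 7 = 27
        mov     eax, 60             ; sys_exit
        syscall
```

Running the program and then `echo $?` should print 27, which shows that rbx survived the call even though the function overwrote it internally.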
The stack is manipulated primarily through two key operations:
- PUSH: Adds an item to the top of the stack.
- POP: Removes an item from the top of the stack.
Stack Operations
- Push Operation: To add data onto the stack, the push instruction is used. This instruction decrements the stack pointer and then stores the value at the memory location the stack pointer now points to.
- Pop Operation: Conversely, the pop instruction retrieves data from the stack. It reads the value from the memory location pointed to by the stack pointer and then increments the stack pointer.
Stack Frame
The portion of the stack allocated for a single function call is called a stack frame. When a function is called, a new stack frame is typically created, containing the return address, local variables, and other function-specific data. The frame is released when the function returns.
Because the stack grows from higher addresses toward lower addresses, each new frame sits at a lower address than the frame of its caller.
Stack Pointer
The stack pointer (SP) is a register that always points to the top of the stack. In most architectures, the stack grows downward in memory, meaning that as elements are pushed onto the stack, the stack pointer decrements, and as elements are popped off, the stack pointer increments.
- In x86 assembly, this register is called SP in 16-bit mode, ESP (Extended Stack Pointer) in 32-bit mode, and RSP in 64-bit mode.
Example:
PUSH EAX ; Decrease ESP and store the value of EAX on the stack
POP EBX ; Increase ESP and move the top value from the stack into EBX
In the example:
- PUSH EAX: The value in the EAX register is pushed onto the stack, and ESP decreases by 4.
- POP EBX: The top value of the stack is popped into EBX, and ESP increases by 4.
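The same behavior can be observed in 64-bit mode, where each push or pop moves RSP by 8 bytes instead of 4. The following is a minimal sketch, assuming Linux on x86-64 and NASM syntax; it records how far the stack pointer moved and which value comes off first.

```nasm
; Assumes: Linux x86-64, NASM syntax.
; Build: nasm -f elf64 lifo.asm && ld lifo.o -o lifo
        global  _start

        section .text
_start:
        mov     r12, rsp            ; remember the initial top of the stack
        mov     rax, 1
        push    rax                 ; rsp = r12 - 8
        mov     rax, 2
        push    rax                 ; rsp = r12 - 16
        mov     r13, r12
        sub     r13, rsp            ; r13 = 16 bytes currently in use
        pop     rbx                 ; rbx = 2 (pushed last, popped first)
        pop     rcx                 ; rcx = 1
        ; rsp is back at its initial value here
        lea     rdi, [r13 + rbx]    ; exit status = 16 + 2 = 18
        mov     eax, 60             ; sys_exit
        syscall
```

An exit status of 18 confirms both points: two pushes consumed 16 bytes, and the first pop returned the value pushed last.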
Usage of the Stack Pointer:
- Tracks the top of the stack, which is crucial for adding data to and removing data from the stack.
- Used implicitly by instructions like PUSH, POP, CALL, and RET.
Base Pointer (BP) or Frame Pointer (FP)
The base pointer (also known as the frame pointer) is a register that helps keep track of the start of the current function’s stack frame. While the stack pointer constantly changes as values are pushed and popped from the stack, the base pointer remains fixed throughout the function call, providing a stable reference point for accessing function parameters and local variables.
- In 32-bit mode, the base pointer is called EBP (Extended Base Pointer).
- In 64-bit mode, it's called RBP, the 64-bit extension of EBP.
Function Call and Stack Frame:
When a function is called, the stack frame is set up as follows:
- The previous base pointer (from the calling function) is pushed onto the stack.
- The stack pointer is copied into the base pointer (EBP or RBP) to mark the start of the new stack frame.
- Local variables and function arguments are referenced relative to the base pointer, which doesn't change during the function's execution.
This allows for easy access to:
- Stack-passed parameters (which are stored at higher addresses relative to the base pointer, for example [ebp+8] for the first one in 32-bit code).
- Local variables (which are stored at lower addresses relative to the base pointer, for example [ebp-4]).
Example:
Here’s how a typical stack frame looks during a function call:
push ebp ; Save the old base pointer (caller's stack frame)
mov ebp, esp ; Set the current base pointer (start of the new stack frame)
sub esp, 16 ; Allocate 16 bytes of space for local variables
; Function body: use EBP to access parameters and local variables
mov esp, ebp ; Restore the stack pointer to the value of the base pointer
pop ebp ; Restore the old base pointer (caller’s stack frame)
ret ; Return to the caller (pop return address from the stack)
- push ebp: Saves the caller’s base pointer.
- mov ebp, esp: Sets the current stack pointer as the new base pointer for this function, marking the start of the current stack frame.
- sub esp, 16: Allocates space on the stack for local variables (16 bytes in this case).
- mov esp, ebp: Before returning, restores the stack pointer to the value it had right after the old base pointer was saved, which releases the local variables.
- pop ebp: Restores the caller’s base pointer, so the previous stack frame is correctly set up before returning.
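To make the frame layout concrete, here is a small 32-bit sketch that reads two stack-passed arguments and uses one local variable. It assumes a Linux system that can run 32-bit executables, NASM syntax, and a cdecl-style convention in which the caller pushes arguments right to left and removes them afterwards; the function name add_two is hypothetical.

```nasm
; Assumes: Linux with 32-bit support, NASM syntax, cdecl-style stack arguments.
; Build: nasm -f elf32 frame.asm && ld -m elf_i386 frame.o -o frame
        global  _start

        section .text
; add_two (hypothetical): returns a + b in eax.
add_two:
        push    ebp                 ; save the caller's base pointer
        mov     ebp, esp            ; start of this function's frame
        sub     esp, 4              ; one 4-byte local at [ebp-4]
        mov     eax, [ebp+8]        ; first argument (a)
        mov     [ebp-4], eax        ; local = a
        mov     eax, [ebp+12]       ; second argument (b)
        add     eax, [ebp-4]        ; eax = b + local
        mov     esp, ebp            ; release the local
        pop     ebp                 ; restore the caller's base pointer
        ret

_start:
        push    dword 5             ; b (arguments pushed right to left)
        push    dword 3             ; a
        call    add_two
        add     esp, 8              ; caller removes the two arguments
        mov     ebx, eax            ; exit status = 3 + 5 = 8
        mov     eax, 1              ; sys_exit (32-bit int 0x80 interface)
        int     0x80
```

Inside add_two, [ebp] holds the saved base pointer and [ebp+4] the return address, which is why the first argument is found at [ebp+8].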
Instructions
push:
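This instruction puts a 64-bit piece of data onto the stack. It adjusts the stack pointer (rsp) to make space for the new data, and then copies the data to that location in memory. The operand can be a register, a memory location, or an immediate value. An immediate is limited to 32 bits and is sign-extended to 64 bits, so a full 64-bit constant must first be loaded into a register.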
Examples:
- push rax: Pushes the value stored in the rax register onto the stack.
- push qword [qVal]: Pushes the 64-bit value stored at the memory address qVal onto the stack.
- push qVal: Pushes the memory address qVal itself onto the stack.
pop:
This instruction takes a 64-bit piece of data off the stack. It copies the value at the top of the stack into the specified operand (register or memory location) and then adjusts the stack pointer to release that slot.
Examples:
- pop rax: Pops the top value from the stack and stores it in the rax register.
- pop qword [qVal]: Pops the top value from the stack and stores it at the memory address qVal.
- pop rsi: Pops the top value from the stack and stores it in the rsi register.
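The register and memory forms of push and pop can be combined in one short program. This is a sketch under a few assumptions: Linux on x86-64, NASM syntax, and a second variable named qOut that is introduced here purely for illustration.

```nasm
; Assumes: Linux x86-64, NASM syntax.
; Build: nasm -f elf64 pushpop.asm && ld pushpop.o -o pushpop
        default rel                 ; use RIP-relative addressing for [qVal], [qOut]
        global  _start

        section .data
qVal:   dq      40
qOut:   dq      0                   ; hypothetical destination variable

        section .text
_start:
        push    qword [qVal]        ; push the 64-bit value stored at qVal (40)
        pop     qword [qOut]        ; pop it straight into memory at qOut
        mov     rax, 2
        push    rax                 ; push a register
        pop     rsi                 ; rsi = 2
        add     rsi, [qOut]         ; rsi = 2 + 40 = 42
        mov     rdi, rsi            ; exit status = 42
        mov     eax, 60             ; sys_exit
        syscall
```

Note that push and pop with memory operands move data from memory to memory through the stack without needing a scratch register.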
Call:
- The call instruction automatically pushes the return address (the address of the instruction that follows the call) onto the stack and jumps to the target function.
CALL function_label ; Pushes return address and jumps to the function
Ret:
- The ret instruction pops the return address from the stack and jumps back to it, thus returning control to the calling function.
RET ; Pops return address and returns to the calling function
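One way to see the return address on the stack is to read it from inside the called function. The sketch below assumes Linux on x86-64 and NASM syntax; get_return_address and the labels are hypothetical names chosen for this example.

```nasm
; Assumes: Linux x86-64, NASM syntax.
; Build: nasm -f elf64 callret.asm && ld callret.o -o callret
        global  _start

        section .text
; get_return_address (hypothetical): copies its own return address into rax.
get_return_address:
        mov     rax, [rsp]          ; on entry, [rsp] holds the address pushed by call
        ret                         ; pops that address and jumps to it

_start:
        call    get_return_address
after_call:
        lea     rbx, [rel after_call] ; address of the instruction after the call
        xor     edi, edi            ; exit status 0 if the addresses match
        cmp     rax, rbx
        je      done
        mov     edi, 1              ; exit status 1 otherwise
done:
        mov     eax, 60             ; sys_exit
        syscall
```

The program exits with status 0 because the value that call pushed is exactly the address of the instruction following the call.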
Stack Implementation
The rsp register points to the current top of the stack in memory. In this architecture, as in most, the stack grows downward in memory.
Stack Layout:
The general layout for a program is as follows:
- The heap is where dynamically allocated data is placed (if requested), for example items allocated with the C++ new operator or the C malloc() library function. As dynamically allocated data is created at run-time, the heap typically grows upward.
- The stack starts in high memory and grows downward. It is used to temporarily store information such as call frames for function calls. A large program or a recursive function may use a significant amount of stack space.
- As the heap and stack expand, they grow toward each other. This arrangement lets either one use the free memory between them, which makes the most effective overall use of memory.
A program (Process A) that uses a significant amount of stack space and a minimal amount of heap space will function. A program (Process B) that uses a minimal amount of stack space and a very large amount of heap space will also function.
Of course, if the stack and heap meet, no memory is left between them and the program will crash.
Stack Operations
The basic stack operations of push and pop adjust the stack pointer register, rsp, during their operation.
For a push operation:
- The rsp register is decreased by 8 (1 quadword).
- The operand is copied to the stack at [rsp].
The operand is not altered. The order of these operations is important.
For a pop operation:
- The current top of the stack, at [rsp], is copied into the operand.
- The rsp register is increased by 8 (1 quadword).
The order of these operations is the exact reverse of the push. The popped item is not actually erased from memory; it simply lies below the new top of the stack until a later push overwrites it.
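The two-step descriptions above can be written out by hand with sub, mov, and add. This is only an illustrative sketch, assuming Linux on x86-64 and NASM syntax. Unlike push and pop, the sub and add instructions modify the flags, so the manual sequences are not exact replacements. The final read below rsp relies on nothing else having written to the stack in between, which holds in this tiny program but should not be relied on in general.

```nasm
; Assumes: Linux x86-64, NASM syntax.
; Build: nasm -f elf64 manual.asm && ld manual.o -o manual
        global  _start

        section .text
_start:
        mov     rax, 99
        ; Manual equivalent of "push rax":
        sub     rsp, 8              ; 1. decrease rsp by one quadword
        mov     [rsp], rax          ; 2. copy the operand to the new top
        ; Manual equivalent of "pop rbx":
        mov     rbx, [rsp]          ; 1. copy the current top into the operand
        add     rsp, 8              ; 2. increase rsp by one quadword
        ; The old slot is now below rsp but has not been erased.
        mov     rcx, [rsp - 8]      ; still 99 until something overwrites it
        mov     rdi, rcx            ; exit status = 99
        mov     eax, 60             ; sys_exit
        syscall
```

An exit status of 99 shows that the popped value was still present in memory after rsp moved past it.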
Example:
mov rax, 6700 ; 6700 = 00001A2C in hex
push rax
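mov rax, 31 ; 31 = 0000001F in hex
push rax
After both pushes, the stack holds the following bytes, listed from higher to lower addresses relative to the final rsp (the original figure drew one box per byte):
rsp+15 down to rsp+10: 00 (six bytes)
rsp+9: 1A
rsp+8: 2C
rsp+7 down to rsp+1: 00 (seven bytes)
rsp+0: 1F
The layout shows that the architecture is little-endian: the least significant byte is placed at the lowest memory address.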
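The byte order can be checked directly by pushing a value and reading individual bytes back from the stack. The following is a minimal sketch, assuming Linux on x86-64 and NASM syntax.

```nasm
; Assumes: Linux x86-64, NASM syntax.
; Build: nasm -f elf64 endian.asm && ld endian.o -o endian
        global  _start

        section .text
_start:
        mov     rax, 6700           ; 0x0000000000001A2C
        push    rax
        movzx   edi, byte [rsp]     ; lowest address holds 0x2C (44)
        movzx   esi, byte [rsp + 1] ; next address holds 0x1A (26)
        add     edi, esi            ; exit status = 44 + 26 = 70
        add     rsp, 8              ; discard the pushed value
        mov     eax, 60             ; sys_exit
        syscall
```

The exit status of 70 is the sum of the two low-order bytes, read in increasing address order, which matches the little-endian layout described above.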