Calling Convention in X86-64

What is a Calling Convention?

A calling convention is a set of rules that dictates how functions interact with each other during program execution. It defines the protocol for parameter passing, return values, stack management, and register usage, so that separately written or separately compiled parts of a program can call one another correctly.

x86 (32 bit) calling conventions:

1 Cdecl (C declaration)

This convention is commonly used in C and C++. In cdecl, the caller pushes arguments onto the stack in reverse order (right to left), and it is the caller's responsibility to clean up the stack after the call. Return values are typically stored in a register such as EAX.

Parameter Passing:

  • Parameters are pushed onto the stack in right-to-left order.
  • The caller cleans up the stack after the function call.
  • The return value is placed in the EAX register.

Example usage:

push    dword 3   ; Push the second argument
push    dword 2   ; Push the first argument
call    myFunction

add esp, 8	; clean up the stack by adding 2 arguments of 4 bytes
			; 2 * 4 = 8

Whether you call a C function from assembly or an assembly function from C, the cdecl calling convention is used by default.

  • When calling a C function from assembly, we explicitly push the arguments onto the stack and clean up the stack after the function returns.
  • When calling an assembly function from C, the C compiler does this automatically; we don't need to push the parameters or clean up the stack ourselves.

2 Stdcall (Standard call)

Commonly used in Windows API functions. In stdcall, the callee is responsible for cleaning up the stack after the call. Arguments are pushed onto the stack in right-to-left order, similar to cdecl. Return values are often stored in the eax register.

Parameter Passing:

  • Parameters are pushed onto the stack in right-to-left order.
  • The callee cleans up the stack.
  • The return value is placed in the EAX register.

Example usage:

push    dword 3   ; Push the second argument
push    dword 2   ; Push the first argument
call    myFunction ; No cleanup here: the callee removes its arguments

3 Fastcall

This convention attempts to optimize register usage by passing some arguments in registers rather than exclusively on the stack. The first few arguments are passed in registers, while additional arguments are passed on the stack. Return values are often stored in EAX.

Parameter Passing (Microsoft version):

  • The first two parameters are passed in ECX and EDX.
  • Additional parameters are passed on the stack.
  • The callee is responsible for cleaning up the stack.
  • The return value is placed in the EAX register.

Example usage:

mov     ecx, dword 2  ; Load the first argument into ECX
mov     edx, dword 3  ; Load the second argument into EDX
call    myFunction

4 Thiscall (C++ Member Function Call)

Commonly used in object-oriented programming for methods. In thiscall, the first argument (usually the this pointer) is passed via a register (often ecx on x86), while the remaining arguments are passed on the stack. Return values are typically stored in eax.

Parameter Passing:

  • The this pointer is passed in ECX.
  • Parameters are passed on the stack.
  • The callee is responsible for cleaning up the stack.
  • The return value is placed in the EAX register.

Example usage:

mov     ecx, [object]    ; Load the 'this' pointer into ECX
push    dword 2          ; Push the explicit argument
call    myMethod         ; Call the method; the callee cleans up the stack
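
The four conventions above differ mainly in where arguments travel and in who removes the stack arguments. The C sketch below models only that bookkeeping. It assumes every argument is 4 bytes wide, that fastcall is the Microsoft variant, and that the this pointer is not counted among the arguments for thiscall. The enum and function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>

/* Hypothetical model of the four 32-bit conventions described above. */
enum convention { CDECL, STDCALL, FASTCALL, THISCALL };

/* Number of explicit arguments that travel on the stack.
   Assumes Microsoft fastcall (first two arguments in ECX and EDX) and
   that `this` (passed in ECX for thiscall) is not counted in nargs. */
int stack_arg_count(enum convention conv, int nargs)
{
    assert(nargs >= 0);
    if (conv == FASTCALL)
        return nargs > 2 ? nargs - 2 : 0;
    return nargs;
}

/* Bytes the caller must add to ESP after the call returns.
   Only cdecl leaves the cleanup to the caller; assumes 4-byte arguments. */
int caller_cleanup_bytes(enum convention conv, int nargs)
{
    return conv == CDECL ? 4 * stack_arg_count(conv, nargs) : 0;
}

/* Bytes the callee removes itself, for example with `ret N`. */
int callee_cleanup_bytes(enum convention conv, int nargs)
{
    return conv == CDECL ? 0 : 4 * stack_arg_count(conv, nargs);
}
```

For the cdecl example with two arguments, caller_cleanup_bytes(CDECL, 2) evaluates to 8, which matches the add esp, 8 in that example.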

x86-64 64 bit calling convention

When one function, the caller, calls another function, the callee, the caller stores the parameters in certain registers such as rdi and rsi, and the callee reads the parameters from these registers. Therefore, the caller and the callee must agree on which registers are used to pass parameters, and in what order. These specifications are the calling convention.

Register Utilization:

Parameter Passing: The first six integer or pointer arguments are passed in specific registers, in this order: rdi, rsi, rdx, rcx, r8, and r9. If the function has more than six such parameters, the rest are placed on the stack in reverse order (last parameter first). Because the stack grows down, the 7th parameter ends up at the lowest address.

Return Values: Integer or pointer return values find their place in the rax register, while larger return values, like structs, are often returned in memory.
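
The register order for parameter passing can be written down as a small lookup. The following C sketch covers only integer and pointer arguments, as described above; the function names int_arg_location and int_arg_position are hypothetical helpers, not part of any ABI or library.

```c
#include <assert.h>
#include <string.h>

/* Register order for integer and pointer arguments, as listed above. */
static const char *const INT_ARG_REGS[6] = {
    "rdi", "rsi", "rdx", "rcx", "r8", "r9"
};

/* Returns the register that carries the argument at 1-based position
   `pos`, or "stack" for the 7th argument onward. */
const char *int_arg_location(int pos)
{
    assert(pos >= 1);
    return pos <= 6 ? INT_ARG_REGS[pos - 1] : "stack";
}

/* Returns the 1-based argument position carried by register `name`,
   or 0 if `name` is not one of the six argument registers. */
int int_arg_position(const char *name)
{
    for (int i = 0; i < 6; i++) {
        if (strcmp(name, INT_ARG_REGS[i]) == 0)
            return i + 1;
    }
    return 0;
}
```

For example, int_arg_location(4) yields "rcx", while int_arg_location(7) yields "stack".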

Stack Frame

When a function takes more parameters than fit in the registers designated for parameter passing in the AMD64 calling convention (the first six integer or pointer arguments go in rdi, rsi, rdx, rcx, r8, and r9), the additional parameters are passed on the stack.

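  1. Initial Setup:
    1. Before the function is called, the caller prepares the parameters, placing the first six in the designated registers and any additional parameters on the stack.
  2. Function Prologue:
    1. The callee saves the caller's rbp on the stack and sets rbp to the current rsp, establishing its own stack frame.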

Assume that the callee function has 8 parameters, 3 local variables, and a return value as follows:

long callee(long a, long b, long c, long d, long e, long f, long g, long h) {
    long x;
    long y;
    long z;
    return 10;
}

void caller() {
    ...
    long x = callee(1, 2, 3, 4, 5, 6, 7, 8);
    ...
}

The assembly code of the caller may look as follows. Because the stack grows from high addresses to low addresses, the caller first subtracts 16 from rsp to allocate two 8-byte slots, stores the 7th and 8th parameters in them, and then places the first 6 parameters in registers. Then it calls the callee. The call instruction pushes the address of the next instruction (the return address) onto the stack, which subtracts 8 from rsp.

When the callee returns, the caller must release the two slots it allocated on the stack, so it adds 16 to rsp. It then reads the callee's return value from rax and stores it in a local variable.

caller:
    ...
    subq   $16, %rsp      # Make stack space for the 7th and 8th parameters
    movq   $8, 8(%rsp)
    movq   $7, (%rsp)
    movq   $6, %r9
    movq   $5, %r8
    movq   $4, %rcx
    movq   $3, %rdx
    movq   $2, %rsi
    movq   $1, %rdi
    call   callee          # Call callee and push the return address onto the stack
    addq   $16, %rsp       # Clean up the stack
    movq   %rax, -8(%rbp)  # Save the return value to a local variable
    ...

The assembly code of the callee may look as follows. The callee first saves the caller's rbp on the stack and sets its own rbp. Then it subtracts 24 from rsp to allocate three 8-byte slots for its three local variables.

Before the callee returns, it places the return value in rax. The leave instruction then copies rbp to rsp and pops the previous rbp value from the stack. Finally, the ret instruction pops the return address from the stack and jumps to it.

callee:
    pushq   %rbp           # Save previous %rbp to the stack
    movq    %rsp, %rbp     # Move %rsp to %rbp
    subq    $24, %rsp      # Allocate space for the local variables
    ...
    movq    $10, %rax      # Move return value to %rax
    leave                  # Copy %rbp to %rsp, restore previous %rbp from the stack
    ret                    # Return by popping the return address from the stack
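
Once the callee's prologue has run, every stack-passed parameter and every local variable sits at a fixed distance from %rbp. The C sketch below computes those distances for the frame described above, assuming 8-byte parameters and locals that are laid out downward from %rbp in declaration order (in practice the compiler chooses the layout of locals). The function names are hypothetical.

```c
#include <assert.h>

/* Offset from %rbp of the stack-passed parameter at 1-based position
   `pos` (pos >= 7), after `pushq %rbp; movq %rsp, %rbp` has run.
   0(%rbp) holds the saved %rbp and 8(%rbp) holds the return address,
   so the 7th parameter sits at 16(%rbp). Assumes 8-byte parameters. */
long stack_param_offset(int pos)
{
    assert(pos >= 7);
    return 16 + 8L * (pos - 7);
}

/* Offset from %rbp of the n-th 8-byte local variable (1-based).
   Assumption: locals are placed downward from %rbp in declaration
   order; a real compiler is free to pick a different layout. */
long local_offset(int n)
{
    assert(n >= 1);
    return -8L * n;
}
```

With the 8-parameter callee above, stack_param_offset(7) is 16 and stack_param_offset(8) is 24, while the three locals would occupy -8, -16, and -24 relative to %rbp, which is why the prologue subtracts 24 from %rsp.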

When the caller calls the callee, the stack looks as shown below. Walk through the caller and callee code above again with the figure at hand for a better understanding of how the stack changes.

[Figure: calling-convention.webp, the stack layout when the caller calls the callee]

The red zone, defined by the System V AMD64 ABI, is the 128-byte area immediately below rsp, that is, at addresses lower than rsp. Functions can use the red zone to store temporary data. A leaf function, one that calls no other function, can use this area directly without adjusting rsp to allocate space.
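
To make the extent of the red zone concrete, the C sketch below checks whether a memory access falls entirely inside it. It treats rsp and addresses as plain 64-bit numbers; the helper name in_red_zone is hypothetical and is meant as an illustration of the address range, not as something a program would normally call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RED_ZONE_SIZE 128  /* bytes below rsp */

/* True if the `len`-byte access starting at `addr` lies entirely inside
   the red zone, that is, within the range [rsp - 128, rsp). */
bool in_red_zone(uint64_t rsp, uint64_t addr, uint64_t len)
{
    assert(rsp >= RED_ZONE_SIZE);
    return len > 0
        && addr >= rsp - RED_ZONE_SIZE
        && addr < rsp
        && len <= rsp - addr;   /* the access must end at or below rsp */
}
```

For instance, an 8-byte slot at rsp - 8 is inside the red zone, while an 8-byte slot at rsp - 136 is not.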

Pushing Data onto the Stack:

The push instruction decrements the stack pointer (rsp) by 8 bytes (to accommodate a 64-bit value), and then stores the operand at the memory location pointed to by rsp. For example:

push rax

This instruction is equivalent to:

sub rsp, 8
mov [rsp], rax

Popping Data from the Stack:

To pop data off the stack in x86_64, the pop instruction is used. It loads the value from the memory location pointed to by rsp into the operand and then increments rsp by 8 bytes. For example:

pop rax

This instruction is equivalent to:

mov rax, [rsp]
add rsp, 8
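
The two equivalences above can be mimicked in C with a toy stack. This is only a model: memory is an array of 64-bit slots, rsp is a byte offset into that array, and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_SLOTS 16

/* A toy stack: `mem` is the stack memory and `rsp` is a byte offset
   into it. The stack is empty when rsp is at the top. */
struct toy_stack {
    uint64_t mem[STACK_SLOTS];
    uint64_t rsp;
};

void toy_init(struct toy_stack *s)
{
    s->rsp = STACK_SLOTS * 8;          /* start at the highest address */
}

/* push: sub rsp, 8 followed by mov [rsp], value */
void toy_push(struct toy_stack *s, uint64_t value)
{
    assert(s->rsp >= 8);               /* otherwise the toy stack is full */
    s->rsp -= 8;
    s->mem[s->rsp / 8] = value;
}

/* pop: mov value, [rsp] followed by add rsp, 8 */
uint64_t toy_pop(struct toy_stack *s)
{
    assert(s->rsp < STACK_SLOTS * 8);  /* otherwise nothing to pop */
    uint64_t value = s->mem[s->rsp / 8];
    s->rsp += 8;
    return value;
}
```

Pushing two values lowers rsp by 16, and popping returns them in reverse order while moving rsp back to where it started.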

Example:

section .text
global add_numbers

add_numbers:
    push rbp            ; Save the base pointer
    mov rbp, rsp        ; Set up the stack frame

    ; Function logic...

    mov rsp, rbp        ; Restore the stack pointer
    pop rbp             ; Restore the base pointer
    ret                 ; Return from the function

The following steps walk through a call to add_numbers:

  1. Function Call Preparation (Caller):
    1. Before calling the add_numbers function, the caller prepares the parameters according to the AMD64 calling convention.
    2. Parameters are passed in registers (rdi, rsi, rdx, rcx, r8, r9) or pushed onto the stack if there are more than six parameters.
  2. Call Instruction (Caller):
    1. The caller invokes the function using the call instruction specifying the address of the add_numbers function.
    2. The call instruction pushes the return address (the address of the instruction following the call) onto the stack and jumps to the start of the called function.
  3. Function Prologue (Callee):
    1. Upon entry into the add_numbers function, the prologue is executed.
    2. The current value of the base pointer (rbp) is pushed onto the stack to save its value.
    3. The base pointer (rbp) is then set to the current value of the stack pointer (rsp), establishing the stack frame for the function.
  4. Parameter Access (Callee):
    1. Within the function, parameters are accessed based on the AMD64 calling convention.
    2. Parameters passed in registers (rdi, rsi, rdx, rcx, r8 and r9) are accessed directly from those registers.
    3. Additional parameters passed on the stack are accessed by dereferencing memory addresses relative to the base pointer (rbp); for example, the 7th parameter is at [rbp + 16], above the saved rbp and the return address.
  5. Function Execution (Callee):
    1. The function executes its logic, performing the intended operations using the provided parameters and any local variables.
    2. The stack frame is utilized for storing local variables and maintaining the function execution context.
  6. Function Epilogue (Callee):
    1. As the function nears its completion, the epilogue is executed.
    2. The stack frame is cleaned up to restore the previous state before returning control to the caller.
    3. The stack pointer (rsp) is set to the value of the base pointer (rbp), effectively deallocating the space allocated for local variables.
    4. The base pointer (rbp) is then restored by popping its previous value from the stack.
    5. The ret instruction is executed, which pops the return address from the stack and jumps to it, returning control to the caller.
  7. Return from Function (Caller):
    1. After the add_numbers function completes its execution, control returns to the instruction following the call instruction in the caller's code.
    2. The caller continues its execution from where it left off, using the return value (if any) provided by the add_numbers function.
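
Seen from C, all seven steps hide behind an ordinary function call. The sketch below is a hypothetical C counterpart of add_numbers: the original leaves the function logic elided, so the choice of adding two long values is an assumption made only for illustration, and call_add_numbers is likewise an invented caller. The register comments describe what the AMD64 convention prescribes when the call is not inlined by the compiler.

```c
#include <assert.h>

/* Hypothetical body: adding two longs is an assumption, since the
   assembly example above does not show the function logic. */
long add_numbers(long a, long b)   /* a arrives in rdi, b in rsi */
{
    return a + b;                  /* the result is returned in rax */
}

/* Caller side: the compiler emits the register setup, the call
   instruction, and the read of rax on our behalf. */
long call_add_numbers(void)
{
    long sum = add_numbers(2, 3);  /* steps 1 and 2: prepare and call */
    assert(sum == 5);              /* step 7: use the returned value */
    return sum;
}
```

Compiling this with optimizations disabled and inspecting the generated assembly should show a prologue and epilogue similar to the hand-written version, although the exact instructions depend on the compiler and its flags.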