CMPSX Instruction Family in x86

The cmpsX instructions are the x86 string-comparison instructions. Each one compares two sequences of data in memory one element at a time, which makes the family a natural fit for comparing strings or memory blocks. This article covers the members of the family, how a single comparison works, how repeat prefixes extend it to whole buffers, and where the instructions are used in practice.

Overview of cmpsX

The cmpsX instruction family consists of:

  • cmpsb: Compare String Byte
  • cmpsw: Compare String Word
  • cmpsd: Compare String Doubleword
  • cmpsq: Compare String Quadword (available only in 64-bit mode)

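Each instruction compares one element (byte, word, doubleword, or quadword) at DS:SI with the element at ES:DI, then updates the CPU flags based on the result. The address size selects the index registers: SI and DI with 16-bit addressing, ESI and EDI with 32-bit addressing, RSI and RDI with 64-bit addressing. A segment-override prefix can replace DS for the first operand; the ES segment of the second operand cannot be overridden.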

Instruction Syntax

cmpsb   ; Compare byte at DS:SI with byte at ES:DI
cmpsw   ; Compare word at DS:SI with word at ES:DI
cmpsd   ; Compare doubleword at DS:SI with doubleword at ES:DI
cmpsq   ; Compare quadword at [RSI] with quadword at [RDI] (64-bit mode only)
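
As a small illustration of how the mnemonic selects the element size, the sketch below runs a single cmpsw and reports how far SI moved. It assumes NASM and a DOS .COM target, where DS, ES, and CS all point at the same segment on entry; the labels left and right are hypothetical names chosen for this example.

```nasm
; Build (assumed toolchain): nasm -f bin stride.asm -o stride.com
        org  0x100              ; .COM layout: DS = ES = CS on entry

start:
        cld                     ; DF = 0, so SI and DI move forward
        mov  si, left           ; DS:SI -> first word
        mov  di, right          ; ES:DI -> second word
        cmpsw                   ; compare one word; SI and DI each advance by 2

        mov  ax, si
        sub  ax, left           ; AX = distance SI travelled = 2
        mov  ah, 0x4C           ; DOS terminate, return code in AL (2 here)
        int  0x21

left:   dw 0x1234
right:  dw 0x1234
```

With cmpsb in place of cmpsw the reported distance would be 1, and with cmpsd it would be 4.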

How It Works

  1. Comparison: The instruction subtracts the value at the destination index (ES:DI, or RDI in 64-bit mode) from the value at the source index (DS:SI, or RSI in 64-bit mode), exactly as cmp would with the source element as its first operand.
  2. Flag Updates: It updates the status flags (ZF, SF, CF, OF, AF, PF) based on the result of the subtraction, but does not store the result. Neither memory operand is modified.
  3. Pointer Adjustment: Both index registers are adjusted by the element size (1, 2, 4, or 8 bytes) in the direction selected by the direction flag (DF):
    • If DF is clear (0), SI and DI are incremented.
    • If DF is set (1), SI and DI are decremented.
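
The three steps can be spelled out with ordinary instructions. The following is a minimal sketch, not a drop-in replacement: it assumes NASM, a DOS .COM target, and DF = 0, and unlike the real cmpsb it clobbers AL. The labels left and right and the exit codes are hypothetical.

```nasm
; Build (assumed toolchain): nasm -f bin step.asm -o step.com
        org  0x100              ; .COM layout: DS = ES = CS on entry

start:
        mov  si, left           ; DS:SI -> first operand
        mov  di, right          ; ES:DI -> second operand

        ; Hand-written equivalent of one "cmpsb" with DF = 0.
        mov  al, [si]           ; fetch the byte at DS:SI (cmpsb needs no scratch register)
        cmp  al, [es:di]        ; flags <- [DS:SI] - [ES:DI], result discarded
        lea  si, [si + 1]       ; advance by the element size; lea leaves the flags intact
        lea  di, [di + 1]

        ; 'A' (0x41) - 'B' (0x42) borrows, so CF = 1 and ZF = 0 at this point.
        mov  ax, 0x4C00         ; AL = 0: flags are as expected (mov does not change flags)
        jb   .exit
        mov  al, 1              ; hypothetical "unexpected flags" code
.exit:
        int  0x21               ; DOS terminate, return code in AL

left:   db 'A'
right:  db 'B'
```

The pointer updates use lea rather than inc because inc would overwrite ZF, SF, and the other result flags that the comparison just produced.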

Direction Flag (DF)

The direction flag (DF) determines whether the index registers are incremented or decremented. It is controlled using the cld (Clear Direction Flag) and std (Set Direction Flag) instructions:

  • cld: Clears DF, so string instructions increment SI and DI.
  • std: Sets DF, so string instructions decrement SI and DI.
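
To illustrate the effect of DF, the sketch below compares two buffers from their last byte toward their first. It assumes NASM and a DOS .COM target; buf_a, buf_b, LEN, and the exit codes are hypothetical names and values for this example. Because the scan runs backward, CX ends up holding the index of the last differing byte when a mismatch is found.

```nasm
; Build (assumed toolchain): nasm -f bin backward.asm -o backward.com
        org  0x100                  ; .COM layout: DS = ES = CS on entry
LEN     equ  4

start:
        std                         ; DF = 1: SI and DI step downward
        mov  si, buf_a + LEN - 1    ; start at the last byte of each buffer
        mov  di, buf_b + LEN - 1
        mov  cx, LEN
        repe cmpsb                  ; compares index 3, then 2, then 1, ...
        cld                         ; restore DF = 0; cld changes no other flag

        mov  ax, 0x4CFF             ; AL = 0xFF: hypothetical "buffers identical" code
        je   .exit                  ; ZF = 1: every byte matched
        mov  al, cl                 ; mismatch: CX = index of the last differing byte (1 here)
.exit:
        int  0x21                   ; DOS terminate, return code in AL

buf_a:  db 'axcd'
buf_b:  db 'aycd'
```

Clearing DF again after a backward scan is a common precaution, since surrounding code usually expects string instructions to move forward.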

Using cmpsX with Repeat Prefixes

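The cmpsX instructions can be combined with repeat prefixes to compare multiple elements until a condition is met. The count lives in CX (ECX or RCX with 32-bit or 64-bit addressing) and is decremented after each comparison without affecting the flags. If the count is zero on entry, no comparison is performed and the flags are left unchanged.

  • rep: Meant for instructions such as movs and stos. It shares its prefix byte (F3) with repe, so in front of cmpsX the processor treats it as repe.
  • repe / repz: Repeat while the elements compare equal (ZF set) and CX is not zero.
  • repne / repnz: Repeat while the elements differ (ZF clear) and CX is not zero.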
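
The termination rule of repe can be made explicit with a jcxz / loope loop, which has the same effect on CX, SI, DI, and the flags as repe cmpsb. This is a sketch under the assumption of NASM and a DOS .COM target; the labels, LEN, and the exit codes are hypothetical. It also shows how the remaining count yields the position of the first mismatch.

```nasm
; Build (assumed toolchain): nasm -f bin repe.asm -o repe.com
        org  0x100              ; .COM layout: DS = ES = CS on entry
LEN     equ  5

start:
        cld                     ; compare forward
        mov  si, left
        mov  di, right
        mov  cx, LEN

        ; Explicit loop with the same effect as "repe cmpsb".
        jcxz .done              ; CX = 0: nothing is compared, flags stay as they were
.next:
        cmpsb                   ; compare one byte, advance SI and DI
        loope .next             ; CX <- CX - 1 (flags untouched); loop while CX != 0 and ZF = 1
.done:
        mov  ax, 0x4CFF         ; AL = 0xFF: hypothetical "no mismatch" code
        je   .exit
        mov  al, LEN - 1
        sub  al, cl             ; index of first mismatch = LEN - CX - 1 (3 here)
.exit:
        int  0x21               ; DOS terminate, return code in AL

left:   db 'hello'
right:  db 'help!'
```

The index formula follows from the loop order: the mismatching element has already been counted off CX by the time the loop stops, and SI and DI already point one element past it.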

Example: Comparing Strings

Here's an example of using cmpsb to compare two strings:

    org 0x100                   ; DOS .COM program: code and data share one segment

section .text
start:
    ; cmpsb reads its second operand through ES, so make ES match DS
    mov ax, ds
    mov es, ax

    ; Load the offsets of the strings
    mov si, string1             ; DS:SI -> first string
    mov di, string2             ; ES:DI -> second string

    ; Load the number of bytes to compare
    mov cx, 13                  ; Length of 'Hello, World!' without the terminator

    ; Clear the Direction Flag to ensure forward movement
    cld

    ; Compare until a mismatch is found or CX reaches zero
    repe cmpsb

    ; ZF reflects the last comparison performed
    jne strings_not_equal

strings_equal:
    ; Code to handle equal strings
    mov ah, 0x09                ; DOS: print '$'-terminated string at DS:DX
    mov dx, msg_equal
    int 0x21
    jmp done

strings_not_equal:
    ; Code to handle unequal strings
    mov ah, 0x09
    mov dx, msg_not_equal
    int 0x21

done:
    ; Terminate program with return code 0
    mov ax, 0x4C00
    int 0x21

section .data
    string1 db 'Hello, World!', 0
    string2 db 'Hello, World!', 0
    msg_equal db 'Strings are equal.', '$'
    msg_not_equal db 'Strings are not equal.', '$'

Explanation

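  1. Data Segment Setup: mov ax, ds and mov es, ax copy DS into ES, because cmpsb reads its first operand from DS:SI and its second from ES:DI. Both strings live in the same segment, so both segment registers must point to it.
  2. Loading Addresses: mov si, string1 and mov di, string2 load the offsets of the strings into SI and DI.
  3. Setting Up Comparison: mov cx, 13 sets the number of bytes to compare, and cld clears the direction flag so the pointers move forward.
  4. Comparing Strings: repe cmpsb compares the strings byte by byte and stops at the first mismatch or when CX reaches zero, whichever comes first.
  5. Handling Results: If ZF (Zero Flag) is set after the comparison loop, all 13 bytes matched and the strings are equal; otherwise the loop stopped at a mismatch. This check is only meaningful when CX was nonzero on entry, since a zero count leaves the flags unchanged.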
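
Because the flags come from the last comparison performed, they can give an ordering as well as an equality result. The sketch below, assuming NASM and a DOS .COM target, turns the outcome into a three-way result in the style of C's memcmp. The labels, LEN, and the exit codes 0, 1, and 2 are hypothetical choices for this example.

```nasm
; Build (assumed toolchain): nasm -f bin order.asm -o order.com
        org  0x100              ; .COM layout: DS = ES = CS on entry
LEN     equ  5

start:
        cld
        mov  si, left           ; DS:SI -> first string
        mov  di, right          ; ES:DI -> second string
        mov  cx, LEN
        repe cmpsb              ; flags now describe the last pair of bytes compared

        mov  ax, 0x4C00         ; AL = 0: strings equal (mov leaves the flags alone)
        je   .exit
        mov  al, 1              ; AL = 1: left sorts after right
        ja   .exit              ; unsigned "above": mismatching byte of left is larger
        mov  al, 2              ; AL = 2: left sorts before right ('e' < 'y' here)
.exit:
        int  0x21               ; DOS terminate, return code in AL

left:   db 'apple'
right:  db 'apply'
```

The unsigned conditions ja and jb are used because the comparison treats the first operand (DS:SI) as the minuend, and character codes are normally ordered as unsigned bytes.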

Practical Applications

  1. String Comparisons: Used in functions to check if two strings are identical.
  2. Memory Comparisons: Useful in scenarios where blocks of memory need to be compared, such as verifying a copied buffer against its source or checking a computed checksum against an expected value.
  3. Search Algorithms: Helps in implementing search algorithms that need to compare elements within arrays or buffers.
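
As one possible illustration of the search use case, the sketch below performs a naive substring search by running repe cmpsb at each candidate position. It assumes NASM and a DOS .COM target; haystack, needle, the length constants, and the exit codes are hypothetical, and the approach is a simple brute-force scan rather than an optimized search algorithm.

```nasm
; Build (assumed toolchain): nasm -f bin search.asm -o search.com
        org  0x100                          ; .COM layout: DS = ES = CS on entry
HAY_LEN    equ 11
NEEDLE_LEN equ 5

start:
        cld
        mov  bx, haystack                   ; BX = start of the current candidate
        mov  dx, HAY_LEN - NEEDLE_LEN + 1   ; number of candidate positions (7)
.try:
        mov  si, bx                         ; DS:SI -> candidate window
        mov  di, needle                     ; ES:DI -> pattern
        mov  cx, NEEDLE_LEN
        repe cmpsb                          ; stops early at the first differing byte
        je   .found                         ; ZF = 1: the whole pattern matched
        inc  bx                             ; slide the window by one byte
        dec  dx
        jnz  .try

        mov  ax, 0x4CFF                     ; AL = 0xFF: hypothetical "not found" code
        int  0x21
.found:
        mov  ax, bx
        sub  ax, haystack                   ; AX = offset of the match (6 here)
        mov  ah, 0x4C                       ; DOS terminate, return code in AL
        int  0x21

haystack: db 'hello world'
needle:   db 'world'
```

SI and DI are reloaded on every attempt because repe cmpsb leaves them wherever the comparison stopped.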