FAT File System

Well a File System is nothing more than a standard or specification that describe how data is stored, organized, and managed on a storage medium. These standards ensure compatibility, consistency, and interoperability across different systems and devices.

The File Allocation Table (FAT) file system is a legacy file system originally developed by Microsoft for use on small disks and simple directory structures. Despite being considered obsolete for modern computing needs, FAT remains prevalent in various applications due to its simplicity and wide compatibility with multiple operating systems and devices. This article delves into the intricacies of the FAT file system, exploring its architecture, functionality, strengths, and weaknesses.

History and Evolution

The FAT file system has undergone several iterations since its inception in the late 1970s:

  • FAT12 (1977): The original version, FAT12, was designed for use on floppy disks. It uses 12-bit entries in the FAT, supporting up to 4,096 clusters. Each cluster can contain one or more sectors, depending on the disk size.
  • FAT16 (1984): As hard drives became more common, FAT16 was introduced, using 16-bit entries to support up to 65,536 clusters. FAT16 allowed for larger volumes and files, making it suitable for hard drives.
  • FAT32 (1996): With the advent of larger storage devices, FAT32 was developed. It uses 32-bit entries, theoretically supporting up to 4,294,967,296 clusters, allowing for much larger volumes and files up to 4 GB in size.

Key Concepts in File System Theory

1 Storage Devices:

  • File systems reside on various types of storage devices, such as hard drives, solid-state drives (SSDs), optical discs, and removable media (e.g., USB flash drives).

2 Blocks(Sector) and Clusters:

  • Block: The smallest unit of data storage on a disk. Typically 512 bytes or 4 KB.
  • Cluster: A group of one or more contiguous blocks. File systems often allocate space in clusters to reduce fragmentation and manage disk space more efficiently.

3 Logical Structure:

  • Volumes: Logical partitions of a physical storage device. Each volume can have its own file system.
  • Directories: Hierarchical structures that organize files. Directories (or folders) can contain files and subdirectories.
  • Files: Basic units of data storage. A file is a sequence of bytes, and file systems manage metadata associated with each file, such as name, size, timestamps, and permissions.

4 Metadata:

  • Metadata includes information about files and directories, such as:
    • File name
    • File size
    • File type
    • Permissions
    • Timestamps (creation, modification, access)
    • Location on disk (pointers to blocks or clusters)

5 File Allocation Methods:

  • Contiguous Allocation: Files are stored in contiguous blocks. This method is simple and provides fast access but can lead to fragmentation and difficulty in finding contiguous free space.
  • Linked Allocation: Files are stored in linked lists of blocks. Each block points to the next block. This method handles fragmentation better but can be slower due to the need to follow pointers.
  • Indexed Allocation: Uses an index block to keep track of the addresses of all the blocks occupied by a file. This method provides direct access and handles fragmentation well but requires additional space for the index block.

6 Directory Structure:

  • Single-Level Directory: All files are stored in a single directory. Simple but not suitable for large systems.
  • Two-Level Directory: Separate directory for each user. Provides better organization but still limited.
  • Tree-Structured Directory: Hierarchical directory structure. Allows nested directories and provides good organization and flexibility.
  • Acyclic-Graph Directory: Allows directories to share subdirectories and files. Offers more flexibility but introduces complexity.
  • General Graph Directory: Most flexible, allowing arbitrary links among directories and files, but requires mechanisms to avoid cycles and manage complexity.

7 File System Operations:

  • Create: Allocate space for a new file and add an entry to the directory.
  • Delete: Remove a file's directory entry and release its allocated space.
  • Open: Load file metadata into memory for access.
  • Close: Release file metadata from memory.
  • Read: Retrieve data from a file.
  • Write: Store data to a file.
  • Rename: Change the name of a file.
  • Copy: Create a duplicate of a file.

8 Mounting and Unmounting:

  • Mounting: Integrating a file system into the system's directory structure. The mount operation makes a file system accessible at a specific directory (mount point).
  • Unmounting: Detaching a file system from the directory structure, ensuring all pending operations are completed.

9 File System Integrity and Recovery:

  • Journaling: Keeping a log (journal) of changes that will be made to the file system. Helps in recovering from crashes by replaying or rolling back incomplete transactions.
  • Consistency Checking: Scanning the file system to detect and repair inconsistencies. Tools like fsck (file system check) are used for this purpose.

10 Security and Access Control:

  • Permissions: File systems enforce access control through permissions, determining who can read, write, or execute a file.
  • Ownership: Each file has an owner, typically the user who created it. Ownership determines default access permissions.

11 Performance Considerations:

  • Caching: Storing frequently accessed data in memory to speed up access.
  • Buffering: Temporarily storing data in a buffer to reduce the number of read/write operations.
  • Defragmentation: Reorganizing the storage to reduce fragmentation and improve performance.

12 Types of File Systems

  • FAT (File Allocation Table):
    • Simple, widely supported, used in removable media.
    • Variants: FAT12, FAT16, FAT32.
  • NTFS (New Technology File System):
    • Advanced file system used in Windows.
    • Features: Journaling, security descriptors, encryption, compression, large file support.
  • ext (Extended File System):
    • Used in Linux.
    • Variants: ext2, ext3, ext4.
    • Features: Journaling (ext3, ext4), large file support, extended attributes.
  • HFS+ (Hierarchical File System Plus):
    • Used in macOS.
    • Features: Journaling, compression, large file support.
  • APFS (Apple File System):
    • Modern file system used in macOS and iOS.
    • Features: Snapshots, cloning, encryption, space sharing, fast directory sizing.
  • ReFS (Resilient File System):
    • Advanced file system used in Windows Server.
    • Features: Integrity checking, automatic repair, large volume and file support.
  • Btrfs (B-tree File System):
    • Advanced file system used in Linux.
    • Features: Copy-on-write, snapshots, compression, integrated RAID support.

FAT System Structure

image-197.png

1. Boot Sector:

  • Definition: The first sector of a storage device formatted with the FAT file system.
    • It is located at the beginning of the volume.
    • It is also known as the Volume Boot Record.
  • Explanation: The boot sector contains essential information about the volume, such as its type (e.g., FAT32), the size of the clusters, the number of reserved sectors, and other parameters necessary for the operating system to mount and manage the file system.
  • Below is the Information that a VBR contains:
    • Jump Instruction: The first 3 bytes are a jump instruction to the bootstrap code.
    • OEM Name: An 8-byte string identifying the system that formatted the volume.
    • BIOS Parameter Block (BPB): This contains key data about the disk layout, such as:
      • Bytes per sector (usually 512)
      • Sectors per cluster (power of 2, typically 1, 2, 4, 8, etc.)
      • Number of reserved sectors (including the boot sector)
      • Number of FAT copies (typically 2)
      • Maximum number of root directory entries (always 0 for FAT32)
      • Total sectors in the volume
      • Media descriptor (identifies the media type)
      • Sectors per FAT
      • Sectors per track and number of heads (for CHS addressing)
      • Hidden sectors (if the volume is part of a larger disk)
    • Extended BPB (FAT32-specific):
      • Sectors per FAT (32-bit value)
      • Flags
      • FAT version number
      • Cluster number of the root directory's starting cluster
      • Sector number of the FS Information Sector
      • Sector number of the Backup Boot Sector
      • Reserved
    • Bootstrap Code: Executable code for booting the system.
    • End of Sector Marker: 0x55AA.

VBR:

The Volume Boot Record (VBR) is a crucial part of the FAT32 file system, located at the beginning of the volume. It contains essential information for bootstrapping the operating system and organizing the file system on the disk. Below is a visual representation and explanation of the VBR structure:

Offset (bytes)Size (bytes)Description
03Jump Instruction
38OEM Name
1125This is the BIOS Parameter Block (BPB). Below is its structure.
112Bytes per Sector (e.g., 512)
131Sectors per Cluster (power of 2, e.g., 8)
142Reserved Sectors (e.g., 32)
161Number of FATs (typically 2)
172Max Root Dir Entries (always 0 for FAT32)
192Total Sectors (16-bit, if zero, use 32-bit)
211Media Descriptor (e.g., 0xF8)
222Sectors per FAT (16-bit, always 0 for FAT32)
242Sectors per Track (CHS value)
262Number of Heads (CHS value)
284Hidden Sectors (LBA value)
324Total Sectors (32-bit)
3626Extended BIOS Parameter Block (BPB). Below is the structure of it.
364Sectors per FAT (32-bit)
402Flags
422FAT Version
444Root Dir Start Cluster
482FS Info Sector
502Backup Boot Sector
5212Reserved
64448Bootstrap Code
5102Signature (0x55AA)

Jump Instruction (Offset 0, 3 bytes):

  • This field contains a jump instruction to the bootstrap code. It allows the BIOS to skip past the header and execute the bootstrap code.

OEM Name (Offset 3, 8 bytes):

  • An 8-byte string typically containing the name of the system that formatted the volume (e.g., "MSDOS5.0").

BIOS Parameter Block (BPB) (Offset 11, 25 bytes):

  • Bytes per Sector (2 bytes): Defines the size of a sector, usually 512 bytes.
  • Sectors per Cluster (1 byte): Number of sectors in a cluster, which must be a power of 2.
  • Reserved Sectors (2 bytes): Number of reserved sectors, starting at the beginning of the volume. This includes the boot sector.
  • Number of FATs (1 byte): Typically set to 2, indicating two copies of the FAT.
  • Max Root Dir Entries (2 bytes): Always 0 for FAT32, as the root directory is not fixed.
  • Total Sectors (16-bit) (2 bytes): Total number of sectors on the volume. If zero, use the 32-bit value.
  • Media Descriptor (1 byte): Specifies the media type (e.g., 0xF8 for hard disks).
  • Sectors per FAT (16-bit) (2 bytes): Always 0 for FAT32.
  • Sectors per Track (2 bytes): Used for CHS addressing.
  • Number of Heads (2 bytes): Used for CHS addressing.
  • Hidden Sectors (4 bytes): Number of hidden sectors preceding the partition.
  • Total Sectors (32-bit) (4 bytes): Total number of sectors on the volume if the 16-bit value is zero.

Extended BPB (Offset 36, 26 bytes):

  • Sectors per FAT (32-bit) (4 bytes): Number of sectors per FAT.
  • Flags (2 bytes): File system flags.
  • FAT Version (2 bytes): Version number (typically 0).
  • Root Dir Start Cluster (4 bytes): Cluster number of the start of the root directory.
  • FS Info Sector (2 bytes): Sector number of the FS Information Sector.
  • Backup Boot Sector (2 bytes): Sector number of the Backup Boot Sector.
  • Reserved (12 bytes): Reserved for future use.

Bootstrap Code (Offset 62, 448 bytes):

  • This field contains the bootstrap code that is executed if the volume is bootable. It initializes the operating system.

Signature (Offset 510, 2 bytes):

  • A signature marker (0x55AA) indicating the end of the boot sector.

2. Reserved Sectors:

  • Definition: Sectors reserved for system use, immediately following the boot sector.
  • Explanation: These sectors are set aside for critical data structures and future expansion. They ensure that important system data is not overwritten.

3. File Allocation Table (FAT) Region:

The File Allocation Table (FAT) is a list of entries (linked list). Each entry in the FAT corresponds to a cluster and can indicate whether the cluster is free, reserved, bad, or part of a file.

  • Definition: A table (array of 32-bit entries) that maps the clusters of the data area, keeping track of which clusters are occupied and how they are linked together.

Visual Representation of a FAT32 Entry:

+--------+----------------------------+
| Offset | Description                |
+--------+----------------------------+
| 0      | Cluster 0 Entry (Reserved) |
+--------+----------------------------+
| 4      | Cluster 1 Entry (Reserved) |
+--------+----------------------------+
| 8      | Cluster 2 Entry            |
+--------+----------------------------+
| ...    | ...                        |
+--------+----------------------------+
| 4n     | Cluster n Entry            |
+--------+----------------------------+

Detailed Explanation of Each Entry

  • Cluster 0 Entry:
    • Reserved and typically contains a special value that identifies the file system type (e.g., 0x0FFFFFF8 for FAT32).
  • Cluster 1 Entry:
    • Reserved and typically contains the same special value as Cluster 0 or a different identifier.
  • Cluster 2 Entry and Beyond:
    • Each entry from Cluster 2 onwards can represent one of the following:
      • Free Cluster: Indicates that the cluster is available for allocation (e.g., 0x00000000).
      • Reserved Cluster: Indicates that the cluster is reserved (e.g., 0x0FFFFFFF).
      • Bad Cluster: Indicates that the cluster is bad and should not be used (e.g., 0x0FFFFFF7).
      • Cluster in Use: Indicates the next cluster in the file's chain. For example, if Cluster 2 is part of a file and points to Cluster 3, the value would be 0x00000003.
      • End-of-File (EOF): Indicates the end of a file's cluster chain (e.g., 0x0FFFFFFF).

Example Layout of FAT Entries:

Below is an example of how FAT entries might look for a simple file allocation scenario:

+-----------+----------------+
| Cluster # | FAT Entry Value|
+-----------+----------------+
| 0         | 0x0FFFFFF8     | // Reserved
+-----------+----------------+
| 1         | 0x0FFFFFF8     | // Reserved
+-----------+----------------+
| 2         | 0x00000003     | // Points to Cluster 3
+-----------+----------------+
| 3         | 0x00000004     | // Points to Cluster 4
+-----------+----------------+
| 4         | 0x0FFFFFFF     | // End of File (EOF)
+-----------+----------------+
| 5         | 0x00000000     | // Free Cluster
+-----------+----------------+
| ...       | ...            |
+-----------+----------------+

In this example:

  • Clusters 0 and 1 are reserved.
  • Clusters 2, 3, and 4 are part of a file, with Cluster 2 pointing to Cluster 3, Cluster 3 pointing to Cluster 4, and Cluster 4 indicating the end of the file.
  • Cluster 5 is free and available for allocation.

4. Cluster:

A cluster in the context of file systems refers to a contiguous group of sectors on a disk.

  • Definition: The smallest unit of disk space allocation in the FAT file system, consisting of one or more blocks (sectors).
  • Explanation: Clusters are the basic units of storage allocation. A file is stored in one or more clusters, and the FAT keeps track of which clusters belong to which file.

Characteristics of Clusters:

  • Size:
    • The size of a cluster varies depending on the file system and the size of the disk. Common cluster sizes range from 512 bytes to several kilobytes.
    • For example, in FAT file systems (such as FAT16 and FAT32), cluster sizes can range from 512 bytes up to 32 KB, depending on the volume size and the file system's configuration.
  • Allocation:
    • When a file is created, the file system allocates one or more clusters to store the file's data.
    • The number of clusters allocated depends on the file size and the cluster size. Larger files require more clusters.
  • Contiguity:
    • Ideally, clusters allocated to a file are contiguous (adjacent) on the disk. Contiguous allocation can improve read/write performance because the disk head does not need to seek between non-contiguous locations.
    • Fragmentation occurs when contiguous clusters are not available, leading to non-contiguous allocation of clusters and potentially reducing disk performance.
  • Cluster Chain:
    • Files larger than one cluster are stored as a chain of clusters. Each cluster in the chain contains a pointer to the next cluster in the sequence, forming a linked list structure known as a file's cluster chain.
    • The file system uses this cluster chain to retrieve and store file data efficiently.

5. Root Directory Region:

The root directory is the top-most directory on a disk or volume, serving as the starting point for navigating the file system hierarchy. In FAT32, unlike more modern file systems like NTFS or ext4, the root directory is not stored as a cluster chain but rather as a fixed-size area immediately following the File Allocation Table (FAT) entries.

  • Definition: A special directory at the root level of the file system hierarchy.
  • Explanation: The root directory contains entries for files and subdirectories. In FAT32, the root directory can expand as needed, unlike in FAT12 and FAT16 where it has a fixed size.
  • Location: The root directory in FAT32 is located in a fixed position immediately following the FAT(s) and FS Information sector.
  • Entries: Each entry in the root directory is 32 bytes long and can represent either a file or subdirectory.
    • Directory Entries: Each entry in the root directory contains metadata about a file or subdirectory, including:
      • File Name: An 8.3 format (8 characters for the file name and 3 for the extension).
      • Attributes: Flags indicating file properties (e.g., read-only, hidden, system, volume label, directory, archive).
      • Creation Date/Time: Timestamp indicating when the file was created.
      • Last Access Date: Timestamp of the last access to the file.
      • Last Modified Date/Time: Timestamp of the last modification to the file.
      • Starting Cluster Number: For files, this points to the first cluster in the cluster chain storing the file's data. For directories, it points to the first cluster of the directory entries.

Structure of a Root Directory Entry in FAT32

Each root directory entry consists of the following fields, organized in a fixed-size 32-byte structure:

Offset (bytes)Size (bytes)Description
011File Name
111Attributes
121Reserved
131Creation Time (Tenths of a second)
142Creation Time (Hours, Minutes, Seconds)
162Creation Date
182Last Access Date
202High Word of First Cluster
222Last Modified Time
242Last Modified Date
262Low Word of First Cluster
284File Size

Detailed Explanation of Each Field:

  • File Name (Offset 0, 11 bytes):
    • The file name stored in ASCII characters. For short file names (8.3 format), it consists of eight characters for the file name followed by three characters for the file extension. Unused characters are padded with spaces (0x20).
  • Attributes (Offset 11, 1 byte):
    • Flags that describe various attributes of the file or directory, including:
      • Read-only (0x01)
      • Hidden (0x02)
      • System (0x04)
      • Volume Label (0x08)
      • Directory (0x10)
      • Archive (0x20)
    • Multiple attributes can be combined using bitwise OR.
  • Reserved (Offset 12, 1 byte):
    • Reserved for future use. Typically set to 0x00.
  • Creation Time (Offset 13-15, 3 bytes):
    • Creation time stored in a compacted format:
      • Bits 0-4: Tenths of a second (0-199)
      • Bits 5-10: Seconds (0-59)
      • Bits 11-15: Minutes (0-59)
      • Bits 16-20: Hours (0-23)
  • Creation Date (Offset 16-17, 2 bytes):
    • Creation date encoded with:
      • Bits 0-4: Day (1-31)
      • Bits 5-8: Month (1-12)
      • Bits 9-15: Year (relative to 1980, so add 1980 to get the actual year)
  • Last Access Date (Offset 18-19, 2 bytes):
    • Last access date stored similarly to the creation date.
  • High Word of First Cluster (Offset 20-21, 2 bytes):
    • High 16 bits of the first cluster number for files. For directories, this field is typically set to 0x0000.
  • Last Modified Time (Offset 22-23, 2 bytes):
    • Last modified time stored in the same compacted format as creation time.
  • Last Modified Date (Offset 24-25, 2 bytes):
    • Last modified date encoded in the same format as the creation date.
  • Low Word of First Cluster (Offset 26-27, 2 bytes):
    • Low 16 bits of the first cluster number for files. For directories, this field is typically set to 0x0000.
  • File Size (Offset 28-31, 4 bytes):
    • Size of the file in bytes.

Example Usage:

  • Suppose a root directory entry has the following values:
    • File Name: README.TXT
    • Attributes: 0x20 (Archive, indicating a regular file)
    • Creation Date/Time: 2023-01-15 14:30:45
    • Last Access Date: 2023-06-10
    • Last Modified Date/Time: 2023-06-09 09:15:30
    • First Cluster: 0x000A (low) 0x0000 (high)
    • File Size: 1024 bytes

This entry indicates that README.TXT is a regular file with archive attribute set, created on January 15, 2023, at 14:30:45, last accessed on June 10, 2023, and last modified on June 9, 2023, at 09:15:30. It starts at cluster 0x000A and has a size of 1024 bytes.

6. Data Region

  • Definition: The part of the volume where actual file and directory data is stored.
  • Explanation: The data region is composed of clusters. The FAT entries point to these clusters to form a chain, representing the storage layout of files and directories.

7. Directory Entry:

  • Definition: An entry in a directory that holds metadata about a file or subdirectory.
  • Explanation: Directory entries contain the file name, attributes, timestamps, the starting cluster number, and the file size. These entries are essential for locating and managing files.

8. Cluster Chain:

  • Definition: A sequence of clusters used to store a file, linked together via entries in the FAT.
  • Explanation: When a file is larger than one cluster, it spans multiple clusters. The FAT links these clusters in a chain, allowing the file system to piece together the entire file.

9. Fragmentation:

  • Definition: A condition where a file’s data is not stored in contiguous clusters but is spread across different areas of the disk.
  • Explanation: Fragmentation occurs over time as files are created, deleted, and resized. It can lead to slower access times because the system must read from multiple locations to access a single file.

10. End of File (EOF) Marker:

  • Definition: A special value in a FAT entry indicating the end of a file.
  • Explanation: The EOF marker signals that there are no more clusters to read for a given file. This helps the file system know where a file’s data terminates.

Working of FAT FS

Structure of a FAT32 Volume

Volume Layout:

Boot Sector and Reserved Sectors:
+-------------------+  <-- Sector 0 (Boot Sector)
| Boot Sector       |  (Contains volume information)
+-------------------+
| Reserved Sectors  |  (System reserved space)
+-------------------+

File Allocation Table (FAT):
+-------------------+  <-- FAT1
| FAT1              |  (First copy of the cluster map)
+-------------------+
| FAT2              |  (Second copy for redundancy)
+-------------------+

Root Directory and Data Region:
+-------------------+  <-- Root Directory
| Root Directory    |  (Contains file and directory
|					|		entries)
+-------------------+
| Data Region       |  (Clusters where data is stored)
| (Clusters)        |
+-------------------+


Imagine the storage device as a book. Here's how the book is organized:

+-------------------+  <-- Sector 0 (Boot Sector)
| Boot Sector       |  (Table of Contents)
+-------------------+
| Reserved Sectors  |  (Reserved Pages)
+-------------------+
| File Allocation   |  (First copy of the index of the 
| Table (FAT1)      |		book's pages)
+-------------------+
| File Allocation   |  (Second copy of the index for 
| Table (FAT2)      |		redundancy)
+-------------------+
| Root Directory    |  (The main directory where file 
|					|		information starts)
+-------------------+
| Data Region       |  (The actual content of the 
| (Clusters)        |	book, spread across many 
|					|	pages)
+-------------------+

File Allocation Table (FAT)

FAT Entries:

The FAT works like an index in the book, indicating which pages (clusters) belong to which file.

+---------+---------+---------+---------+---------+---------+
| Cluster |  Entry  |  Entry  |  Entry  |  Entry  |  Entry  |
|  Index  |    0    |    1    |    2    |    3    |    4    |
+---------+---------+---------+---------+---------+---------+
|  Value  |  0xFFF8 |  0x0003 |  0x0004 |  0xFFFF |  0x0005 |
+---------+---------+---------+---------+---------+---------+

In this example:

  • Clusters 0 and 1 are reserved.
  • Cluster 2 points to Cluster 3 (0x0003).
  • Cluster 3 points to Cluster 4 (0x0004).
  • Cluster 4 marks the end of the file (0xFFFF).
  • Cluster 5 points to the next cluster (not shown).

Storing a File

Example: Storing "File.txt" in Clusters

Imagine "File.txt" is 3 clusters long and stored across clusters 2, 3, and 4. The FAT entries and clusters would be visualized as follows:

+-------------+      +----------+      +----------+      +----------+
| Directory   | ---> | Cluster 2 | --->| Cluster 3 | --->| Cluster 4 |
| Entry       |      +----------+      +----------+      +----------+
| (File.txt)  |        (Data)            (Data)            (Data)
+-------------+

The directory entry for "File.txt" might look like this:

+-------------+----------------+----------------+----------------+--------------+
| File Name   | Attributes     | Start Cluster  | File Size      | ...          |
+-------------+----------------+----------------+----------------+--------------+
| File.txt    | -              | 2              | 3 Clusters     | ...          |
+-------------+----------------+----------------+----------------+--------------+

Accessing a File

Steps to Access "File.txt"

  1. Directory Lookup: Locate the directory entry for "File.txt".
  2. Read Start Cluster: Note that the entry indicates the start cluster (Cluster 2).
  3. Follow FAT Chain:
    • Read data from Cluster 2.
    • Check FAT entry for Cluster 2: it points to Cluster 3.
    • Read data from Cluster 3.
    • Check FAT entry for Cluster 3: it points to Cluster 4.
    • Read data from Cluster 4.
    • Check FAT entry for Cluster 4: it indicates the end of the file (0xFFFF).

Visualization of this process:

+-------------+      +----------+      +----------+      +----------+
| Directory   | ---> | Cluster 2 | --->| Cluster 3 | --->| Cluster 4 |
| Entry       |      +----------+      +----------+      +----------+
| (File.txt)  |        (Data)            (Data)            (Data)
+-------------+

Fragmentation Example

Files can become fragmented over time, meaning their data clusters are not stored contiguously. Here’s an example of how "File.txt" might be fragmented:

+-------------+      +----------+      +----------+      +----------+      +----------+
| Directory   | ---> | Cluster 2 | --->| Cluster 5 | --->| Cluster 8 | --->| Cluster 12|
| Entry       |      +----------+      +----------+      +----------+      +----------+
| (File.txt)  |        (Data)            (Data)            (Data)            (Data)
+-------------+

FAT entries for this fragmented file might look like:

+---------+---------+---------+---------+---------+---------+---------+---------+
| Cluster |  Entry  |  Entry  |  Entry  |  Entry  |  Entry  |  Entry  |  Entry  |
|  Index  |    0    |    1    |    2    |    3    |    4    |    5    |    6    |
+---------+---------+---------+---------+---------+---------+---------+---------+
|  Value  |  0xFFF8 |  0x0003 |  0x0005 |  0x0006 |  0x0007 |  0x0008 |  0x000C |
+---------+---------+---------+---------+---------+---------+---------+---------+
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#include <dirent.h>
#include <sys/stat.h>
#include <unistd.h>

#define DISK_SIZE (10 * 1024 * 1024)  // 10 MB
#define SECTOR_SIZE 512
#define CLUSTER_SIZE (8 * SECTOR_SIZE)  // 8 sectors per cluster
#define RESERVED_SECTORS 32
#define NUM_FATS 2
#define ROOT_DIR_CLUSTER 2
#define FAT_SIZE_SECTORS 64

#pragma pack(push, 1)
typedef struct {
    uint8_t  jmpBoot[3];
    uint8_t  oemName[8];
    uint16_t bytesPerSector;
    uint8_t  sectorsPerCluster;
    uint16_t reservedSectors;
    uint8_t  numFATs;
    uint16_t rootEntryCount;
    uint16_t totalSectors16;
    uint8_t  media;
    uint16_t fatSize16;
    uint16_t sectorsPerTrack;
    uint16_t numHeads;
    uint32_t hiddenSectors;
    uint32_t totalSectors32;

    // FAT32 specific
    uint32_t fatSize32;
    uint16_t extFlags;
    uint16_t fsVersion;
    uint32_t rootCluster;
    uint16_t fsInfo;
    uint16_t backupBootSector;
    uint8_t  reserved[12];
    uint8_t  driveNumber;
    uint8_t  reserved1;
    uint8_t  bootSignature;
    uint32_t volumeID;
    uint8_t  volumeLabel[11];
    uint8_t  fsType[8];
    uint8_t  bootCode[420];
    uint16_t bootSectorSignature;
} FAT32BootSector;

typedef struct {
    uint32_t leadSignature;
    uint8_t  reserved1[480];
    uint32_t structSignature;
    uint32_t freeCount;
    uint32_t nextFree;
    uint8_t  reserved2[12];
    uint32_t trailSignature;
} FAT32FSInfo;
#pragma pack(pop)

void create_disk_image(const char *filename) {
    FILE *f = fopen(filename, "wb");
    if (!f) {
        perror("Failed to create disk image");
        exit(1);
    }
    
    // Set the size to 10 MB
    fseek(f, DISK_SIZE - 1, SEEK_SET);
    fputc('\0', f);
    fclose(f);
}

void write_boot_sector(FILE *f) {
    FAT32BootSector bs = {0};

    // Jump code and OEM name
    bs.jmpBoot[0] = 0xEB;
    bs.jmpBoot[1] = 0x58;
    bs.jmpBoot[2] = 0x90;
    memcpy(bs.oemName, "ValiOS  ", 8);

    // BIOS Parameter Block (BPB)
    bs.bytesPerSector = 512;
    bs.sectorsPerCluster = 8;
    bs.reservedSectors = RESERVED_SECTORS;
    bs.numFATs = NUM_FATS;
    bs.media = 0xF8;
    bs.sectorsPerTrack = 63;
    bs.numHeads = 255;
    bs.totalSectors32 = DISK_SIZE / SECTOR_SIZE;
    bs.fatSize32 = FAT_SIZE_SECTORS;
    bs.rootCluster = ROOT_DIR_CLUSTER;
    bs.fsInfo = 1;
    bs.backupBootSector = 6;

    // Volume information
    bs.driveNumber = 0x80;
    bs.bootSignature = 0x29;
    bs.volumeID = 0x12345678;
    memcpy(bs.volumeLabel, "NO NAME    ", 11);
    memcpy(bs.fsType, "FAT32   ", 8);

    // Boot signature
    bs.bootSectorSignature = 0xAA55;

    // Write to disk image
    fwrite(&bs, sizeof(bs), 1, f);
}

void write_fsinfo_sector(FILE *f) {
    FAT32FSInfo fsinfo = {0};

    fsinfo.leadSignature = 0x41615252;
    fsinfo.structSignature = 0x61417272;
    fsinfo.freeCount = 0xFFFFFFFF;
    fsinfo.nextFree = 0xFFFFFFFF;
    fsinfo.trailSignature = 0xAA550000;

    // Move to FSInfo sector
    fseek(f, SECTOR_SIZE, SEEK_SET);

    // Write FSInfo sector
    fwrite(&fsinfo, sizeof(fsinfo), 1, f);
}

void write_fat_tables(FILE *f) {
    // Create FAT table
    uint32_t fat[128] = {0};
    fat[0] = 0x0FFFFFF8;  // Media descriptor and end of cluster chain
    fat[1] = 0xFFFFFFFF;  // End of cluster chain for root directory

    // Move to the start of the first FAT
    fseek(f, RESERVED_SECTORS * SECTOR_SIZE, SEEK_SET);  // Reserved sectors

    // Write FAT1
    fwrite(fat, sizeof(fat), 1, f);

    // Move to the start of the second FAT
    fseek(f, (RESERVED_SECTORS + FAT_SIZE_SECTORS) * SECTOR_SIZE, SEEK_SET);  // Reserved sectors + FAT1

    // Write FAT2
    fwrite(fat, sizeof(fat), 1, f);
}

void write_root_directory(FILE *f) {
    uint8_t root_dir[CLUSTER_SIZE] = {0};  // 1 cluster for root directory

    // Move to the start of the root directory
    fseek(f, (RESERVED_SECTORS + FAT_SIZE_SECTORS * NUM_FATS) * SECTOR_SIZE, SEEK_SET);  // Reserved sectors + 2 FATs

    // Write root directory
    fwrite(root_dir, sizeof(root_dir), 1, f);
}

void write_stage1_bootloader(FILE *f, const char *bootloader_path) {
    FILE *bootloader = fopen(bootloader_path, "rb");
    if (!bootloader) {
        perror("Failed to open bootloader file");
        exit(1);
    }

    uint8_t bootloader_code[SECTOR_SIZE] = {0};
    fread(bootloader_code, 1, SECTOR_SIZE, bootloader);
    fclose(bootloader);

    fseek(f, 0, SEEK_SET);
    fwrite(bootloader_code, 1, SECTOR_SIZE, f);
}

void copy_directory_to_image(FILE *f, const char *src_dir, uint32_t start_cluster) {
    struct dirent *entry;
    DIR *dp = opendir(src_dir);
    if (!dp) {
        perror("Failed to open source directory");
        exit(1);
    }

    while ((entry = readdir(dp))) {
        if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0) {
            continue;
        }

        char path[1024];
        snprintf(path, sizeof(path), "%s/%s", src_dir, entry->d_name);

        struct stat st;
        if (stat(path, &st) == -1) {
            perror("Failed to stat file");
            continue;
        }

        if (S_ISDIR(st.st_mode)) {
            // Handle directory (recursively)
            copy_directory_to_image(f, path, start_cluster + 1);
        } else if (S_ISREG(st.st_mode)) {
            // Handle regular file
            FILE *src_file = fopen(path, "rb");
            if (!src_file) {
                perror("Failed to open source file");
                continue;
            }

            uint8_t buffer[CLUSTER_SIZE] = {0};
            size_t bytes_read = fread(buffer, 1, CLUSTER_SIZE, src_file);
            fclose(src_file);

            // Write file content to image
            fseek(f, (RESERVED_SECTORS + FAT_SIZE_SECTORS * NUM_FATS + (start_cluster - 2) * CLUSTER_SIZE), SEEK_SET);
            fwrite(buffer, 1, bytes_read, f);
        }
    }

    closedir(dp);
}

int main() {
    const char *disk_image = "disk.img";
    const char *bootloader_path = "stage1.bin";
    const char *src_dir = "src";

    create_disk_image(disk_image);

    FILE *f = fopen(disk_image, "r+b");
    if (!f) {
        perror("Failed to open disk image");
        exit(1);
    }

    // Write the boot sector
    write_boot_sector(f);

    // Write the FSInfo sector
    write_fsinfo_sector(f);

    // Write the FAT tables
    write_fat_tables(f);

    // Write the root directory
    write_root_directory(f);

    // Write the stage1 bootloader
    write_stage1_bootloader(f, bootloader_path);

    // Copy the directory to the image
    copy_directory_to_image(f, src_dir, ROOT_DIR_CLUSTER);

    fclose(f);
    return 0;
}