A file system is a fundamental component of an operating system that ensures data is organized, stored, and accessed efficiently and securely. Without a file system, the data on storage devices would be an unmanageable collection of bits and bytes, making it impossible to isolate or retrieve meaningful information.
Understanding File System
A file system is a method and data structure that an operating system uses to control how data is stored and retrieved on storage devices such as hard drives, SSDs, and USB devices. Without a file system, data placed in a storage medium would be one large block of data with no way to tell where one piece of data stops and the next begins. By separating the data into pieces and giving each piece a name, the data is easily isolated and identified. Taking its name from the way paper-based data management systems are named, each piece of data is called a "file." The structure and logic rules used to manage the groups of data and their names are called a "file system."
A file system is a critical component of an operating system that defines how files are named, stored, and retrieved from a storage device. This includes various mechanisms and data structures to manage the organization, access, and manipulation of data on storage media such as hard drives, SSDs, USB drives, and optical discs.
- It defines the rules and conventions for naming files and directories, organizing them in a hierarchical structure, and managing their storage locations.
- Essentially, a file system provides a way to separate data into distinct files, each with a unique name and a set of attributes, facilitating easy data retrieval and management.
Why Do We Need a File System?
The Chaos Without a File System:
Without a file system, a storage device would be nothing more than a large, undifferentiated mass of data. The operating system wouldn't be able to distinguish one piece of information from another, rendering the storage medium essentially useless.
Example:
Imagine walking into a room filled with piles of papers scattered all over the floor. There's no order, no structure, just a chaotic mess of documents that makes finding any specific piece of information nearly impossible. This is what a storage device would be like without a file system: a massive, disorganized chunk of data where the operating system wouldn't be able to differentiate between individual pieces of information.
Origin of the Term File System
The term "file system" is derived from old paper-based data management systems. In those systems, documents were kept as files and organized into directories, much like folders in a filing cabinet. This organized structure made it easy to locate, manage, and retrieve documents.
The Order with a File System
A file system transforms a storage device from chaos to a well-organized structure.
It introduces a methodical way to organize, store, and manage data, turning the chaotic pile into a neatly organized filing cabinet. Here’s why a file system is not just a bookkeeping feature, but an essential component of any operating system:
Key Responsibilities of a File System:
1 Space Management
- Efficient Storage: Divides the storage medium into manageable blocks and keeps track of which blocks are used and which are free, ensuring efficient allocation and retrieval of data.
- Fragmentation Handling: Manages fragmentation to optimize storage space and maintain performance.
2 Metadata Management
- File Information: Stores metadata such as file size, creation date, permissions, and location. This metadata helps the operating system quickly access and manage files.
3 Data Encryption
- Security: Some file systems support encryption, protecting data from unauthorized access and ensuring privacy.
4 File Access Control
- Permissions: Manages permissions to control who can read, write, or execute a file, maintaining security and preventing unauthorized access.
- User Authentication: Integrates with user authentication mechanisms to ensure that only authorized users can access certain files or directories.
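As a rough illustration of the permission model described above, the sketch below decodes Unix-style permission bits with Python's standard `stat` module. The mode values and the helper names `describe_mode` and `owner_can_write` are chosen for illustration only; other file systems (for example NTFS with ACLs) model permissions differently.

```python
import stat

def describe_mode(mode: int) -> str:
    """Return the ls-style permission string for a raw mode value."""
    return stat.filemode(mode)

def owner_can_write(mode: int) -> bool:
    """True if the owner's write bit is set in the mode."""
    return bool(mode & stat.S_IWUSR)

# Example modes (assumed values): file type bits combined with permission bits.
regular_file = stat.S_IFREG | 0o644   # regular file, owner read/write
directory = stat.S_IFDIR | 0o755      # directory, owner full access
read_only = stat.S_IFREG | 0o444      # regular file, read-only for everyone

print(describe_mode(regular_file))
print(describe_mode(directory))
print(owner_can_write(read_only))
```

Each group of three bits covers read, write, and execute for the owner, the group, and everyone else, which is why a single integer is enough to answer "who can do what" for a file.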
5 Data Integrity
- Error Detection and Correction: Includes mechanisms to detect and correct errors, ensuring data integrity. Techniques like journaling help recover from crashes and data corruption.
- Backup and Recovery: Supports backup and recovery processes, allowing users to restore data in case of hardware failure, accidental deletion, or corruption.
Partitioning in File Systems
Partitioning is splitting a single physical storage device, such as a hard drive or SSD, into multiple smaller logical units called partitions. Each partition can function as an independent unit and can be formatted with its own file system.

Why Partition a Storage Device?
1 Organization and Management
- Data Separation: Different types of data (e.g., system files, applications, user data) can be kept on separate partitions, improving organization and making it easier to manage and back up data.
- Multiple Operating Systems: Partitions allow multiple operating systems to be installed on a single physical drive, enabling dual-boot or multi-boot configurations.
2 Performance Optimization
- Swap Space: On Unix-like systems, a dedicated swap partition can be created to manage memory more efficiently.
- File System Performance: Different file systems can be used for different partitions to optimize performance based on the type of data stored (e.g., using ext4 for general storage and XFS for handling large files).
3 Data Security and Recovery
- Isolation: Separating system and user data reduces the risk of data corruption. If the operating system needs to be reinstalled, user data on a separate partition remains intact.
- Easier Backup and Recovery: With data organized into partitions, creating backups and restoring specific partitions is simpler and more efficient.
4 Flexibility and Scalability
- Resizing Partitions: Modern file systems and tools allow partitions to be resized to accommodate growing data needs without affecting other partitions.
- Different File Systems: Different partitions can use different file systems tailored to specific requirements (e.g., NTFS for Windows compatibility, ext4 for Linux).
5 Example:
Some operating systems, like Windows, assign a drive letter (A, B, C, or D) to the partitions. For instance, the primary partition on Windows (on which Windows is installed) is known as C:, or drive C.
In Unix-like operating systems, however, partitions are mounted at directories (mount points) within a single tree under the root directory, so they appear as ordinary directories.
Partitioning Schemes
Partitioning schemes are methods used to divide a physical storage device into separate, manageable sections called partitions. Each partition can function independently, having its own file system and operating system, if needed. The two most common partitioning schemes are the Master Boot Record (MBR) and the GUID Partition Table (GPT). Each scheme has its own structure, limitations, and use cases.
Regardless of what partitioning scheme you choose, the first few blocks on the storage device will always contain critical data about your partitions. The system's firmware uses these data structures to boot up the operating system on a partition.
What is System Firmware?
Firmware is low-level software embedded into electronic devices to operate the device or bootstrap another program to do so. It acts as a bridge between the hardware and higher-level software, providing essential instructions for how the device communicates with other hardware.
Examples of Firmware:
- Computers: The firmware provides a standard interface for the operating system to boot up and work with hardware components.
- Peripherals: Devices like keyboards, mice, and printers also contain firmware to manage their basic functions.
- Electronic Appliances: Even home appliances such as microwaves and washing machines have firmware to control their operations.
Role in Computers:
In computers, the firmware is crucial for the initial hardware checks and boot process. It prepares the system to load the operating system by initializing hardware components and providing a consistent interface for booting.
Types of System Firmware in Computers
Hardware manufacturers make firmware based on two specifications:
- Basic Input/Output System (BIOS)
- Unified Extensible Firmware Interface (UEFI)
Basic Input / Output System (BIOS):
- History: The traditional firmware used in early personal computers.
- Function: Conducts POST (Power-On Self-Test) and initializes hardware before loading the bootloader code from the MBR of the boot disk.
- Interface: Provides a simple text-based interface for configuration.
- Limitations: Traditionally tied to MBR partitioning, which cannot address disks larger than about 2 TB when using 512-byte sectors. It also starts in 16-bit real mode, which restricts the memory and capabilities available during boot.
Unified Extensible Firmware Interface (UEFI):
- History: A modern replacement for BIOS, developed to overcome its limitations.
- Function: Initializes hardware, performs POST, and reads the GPT partition table to load the bootloader or OS loader directly from the EFI System Partition (ESP).
- Interface: Offers a more advanced graphical interface and supports mouse navigation.
- Advantages:
- Support for Large Disks: Can handle disks larger than 2 TB.
- More Partitions: Supports GPT, allowing for more partitions.
- Enhanced Security: Includes features like Secure Boot to prevent unauthorized code from running before the OS loads.
- Faster Boot Times: UEFI can initialize hardware and start the OS more quickly than BIOS.
Role of Firmware in the Boot Process
Initialization:
- BIOS: The BIOS firmware runs POST, initializes the hardware, and then loads the MBR from the first sector of the boot disk. The bootloader code in the MBR locates the bootable partition and continues the boot process from there.
- UEFI: The UEFI firmware performs similar hardware initialization but reads the GPT headers and partition entries directly. It locates and loads the bootloader from the EFI System Partition (ESP).
Bootloader Execution:
- BIOS: Transfers control to the bootloader found in the MBR, which then loads the operating system.
- UEFI: Directly loads the bootloader or OS loader from the ESP, allowing for more complex and secure booting mechanisms.
Importance of the First Few Blocks
Regardless of the partitioning scheme (MBR or GPT), the first few blocks of the storage device hold essential data that the system firmware reads to start the boot process:
- MBR: Contains the bootloader and the partition table.
- GPT: Contains the protective MBR, primary GPT header, and partition entry array, along with a backup header at the end of the disk.
1 Master Boot Record (MBR):

Structure:
1 Master Boot Routine
- Size: 446 bytes
- Description: Contains the bootloader code, a small program that the BIOS loads and executes to begin the boot process. This code is responsible for locating the active partition and transferring control to its Volume Boot Record (VBR).
2 Partition Table
- Size: 64 bytes
- Description: Contains entries for up to four partitions, each 16 bytes long. These entries describe the partitions on the storage device, including their starting and ending locations, type, and other details.
Detailed Partition Table Entry
Each partition table entry within the MBR is 16 bytes long and includes the following fields:
Boot Indicator (1 byte)
- Description: Specifies whether the partition is bootable. 0x80 indicates a bootable partition, while 0x00 indicates a non-bootable partition.
Start Head (1 byte)
- Description: The head number where the partition starts.
Start Sector (1 byte)
- Description: The sector number where the partition starts (6 bits for the sector number, 2 bits for the high part of the cylinder number).
Start Cylinder (1 byte)
- Description: The cylinder number where the partition starts (8 bits for the low part of the cylinder number).
Partition Type (1 byte)
- Description: Indicates the file system type or the intended use of the partition (e.g., NTFS, FAT32).
End Head (1 byte)
- Description: The head number where the partition ends.
End Sector (1 byte)
- Description: The sector number where the partition ends (6 bits for the sector number, 2 bits for the high part of the cylinder number).
End Cylinder (1 byte)
- Description: The cylinder number where the partition ends (8 bits for the low part of the cylinder number).
Starting LBA (4 bytes)
- Description: The starting sector (Logical Block Addressing) of the partition.
Number of Sectors (4 bytes)
- Description: The total number of sectors in the partition.
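The field layout above can be decoded with Python's `struct` module. This is a minimal sketch: the function names and the sample entry bytes are made up for illustration, and the dictionary keys are not part of any standard.

```python
import struct

# One 16-byte entry, little-endian: boot indicator, start CHS (3 bytes),
# partition type, end CHS (3 bytes), starting LBA, number of sectors.
ENTRY_FORMAT = "<B3sB3sII"

def decode_chs(raw: bytes) -> tuple:
    """Unpack the 3-byte head/sector/cylinder field into (cylinder, head, sector)."""
    head = raw[0]
    sector = raw[1] & 0x3F                       # low 6 bits of the second byte
    cylinder = ((raw[1] & 0xC0) << 2) | raw[2]   # 2 high bits plus 8 low bits
    return cylinder, head, sector

def parse_entry(entry: bytes) -> dict:
    """Decode a single MBR partition table entry."""
    if len(entry) != 16:
        raise ValueError("a partition entry is exactly 16 bytes")
    boot, start_chs, ptype, end_chs, start_lba, count = struct.unpack(ENTRY_FORMAT, entry)
    return {
        "bootable": boot == 0x80,
        "start_chs": decode_chs(start_chs),
        "type": ptype,
        "end_chs": decode_chs(end_chs),
        "start_lba": start_lba,
        "sector_count": count,
    }

# Hypothetical entry: bootable, type 0x83 (Linux), starting at LBA 2048.
sample = bytes([0x80, 0x20, 0x21, 0x00, 0x83, 0xFE, 0xFF, 0xFF]) + struct.pack("<II", 2048, 204800)
parsed = parse_entry(sample)
print(parsed)
```

Note how the cylinder number is split across two bytes: its two high bits share a byte with the 6-bit sector number, which is why the CHS fields need bit masking rather than a plain integer read.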
3 Identification Code (Boot Signature)
- Size: 2 bytes
- Description: The boot signature (the bytes 0x55 and 0xAA, stored at offsets 510 and 511) marks the end of the MBR and indicates that it is valid and executable.
Features:
- Partition Limit: Supports up to four primary partitions or three primary partitions and one extended partition (which can hold multiple logical partitions).
- Size Limit: Each partition is limited to about 2 TB because the partition table stores sector addresses and counts as 32-bit values (2^32 sectors of 512 bytes each).
- Compatibility: Widely supported across various operating systems, making it a good choice for legacy systems.
Use Cases:
- Legacy Systems: Ideal for older systems and BIOS-based boot systems.
- Small Disks: Suitable for disks smaller than 2 TB.
Limitations:
- Partition Number Limit: Restricted to four primary partitions or three primary and one extended partition.
- Size Restriction: Cannot handle partitions larger than about 2 TB when the disk uses 512-byte sectors.
MBR Booting Process:
Let's recall what we have learnt in the booting process:
1. Power-On Self-Test (POST)
When the computer is powered on or restarted, the BIOS (Basic Input/Output System) firmware performs a Power-On Self-Test (POST) to check the system's hardware components. This includes:
- Checking the CPU, memory, and other critical hardware components.
- Initializing the system’s hardware and setting up the necessary interfaces.
2. BIOS Initialization
After completing the POST successfully, the BIOS initializes system hardware, such as:
- Initializing hardware interfaces like keyboard, display, and storage controllers.
- Configuring system settings stored in the CMOS (Complementary Metal-Oxide-Semiconductor) memory.
3. BIOS Boot Sequence
The BIOS follows a pre-configured boot sequence to determine the bootable device. The sequence typically includes:
- Checking for bootable devices in a predefined order (e.g., hard drive, USB drive, CD/DVD, network).
- Trying each device in that order until a bootable one is found.
4. Reading the Master Boot Record (MBR)
The BIOS reads the first 512-byte sector (sector 0, the MBR) of the selected drive into memory at address 0x7C00.
The MBR consists of:
- Bootloader Code (446 bytes): Contains the machine code that the BIOS executes to begin the boot process.
- Partition Table (64 bytes): Contains four partition entries, each 16 bytes long, describing the partitions on the storage device.
- Boot Signature (2 bytes): Contains the boot signature (0x55AA) indicating a valid MBR.
5. Checking the Boot Signature
Before executing the bootloader code, the BIOS verifies that the sector contains a valid MBR by checking for the boot signature.
- Boot Signature (2 bytes): The last two bytes of the 512-byte MBR sector should contain the values 0x55 and 0xAA, in that order. This signature indicates that the sector is a valid MBR.
If the Boot Signature is Valid:
- The BIOS executes the loaded bootloader code.
If the Boot Signature is Invalid or the Drive is Not Bootable:
- Error Message: If the drive is non-bootable, the BIOS may display an error message, such as "Non-System Disk" or "Disk Error."
- Proceeds to the Next Device: The BIOS continues to the next device in the boot sequence.
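The signature check and the fall-through to the next device can be sketched as below. This is a simplified model, not how any particular BIOS is implemented: the device names and the `first_sectors` mapping are hypothetical, and real firmware also handles read errors and user-configured boot menus.

```python
from typing import Dict, Optional

SECTOR_SIZE = 512

def has_boot_signature(sector: bytes) -> bool:
    """True if bytes 510 and 511 of the sector are 0x55 and 0xAA."""
    return len(sector) == SECTOR_SIZE and sector[510:512] == b"\x55\xaa"

def pick_boot_device(first_sectors: Dict[str, bytes]) -> Optional[str]:
    """Return the first device, in boot order, whose sector 0 carries a valid signature."""
    for name, sector in first_sectors.items():
        if has_boot_signature(sector):
            return name
    return None   # nothing bootable: the firmware would report an error

blank = bytes(SECTOR_SIZE)             # all zeros, no signature
bootable = bytes(510) + b"\x55\xaa"    # empty code and table, valid signature

# Dictionary order stands in for the configured boot sequence.
devices = {"usb0": blank, "disk0": bootable, "disk1": bootable}
chosen = pick_boot_device(devices)
print(chosen)
```

Here `usb0` is skipped because its first sector lacks the signature, and the search stops at `disk0` without ever looking at `disk1`.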
6. Executing the Bootloader Code
At this point a valid MBR has been loaded at 0x7C00, and the BIOS transfers control to its bootloader code. This small program (also known as the initial bootloader) is responsible for:
- Locating the active (bootable) partition as indicated by the partition table.
- Loading the Volume Boot Record (VBR) of the active partition into memory.
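What the MBR code does to find the active partition can be approximated in a few lines of Python. This is a sketch under stated assumptions: the helper `make_entry` and the sample table are invented for the demonstration, and the CHS fields are left as zeros for brevity.

```python
import struct

PARTITION_TABLE_OFFSET = 446   # the table follows the 446 bytes of boot code
ENTRY_SIZE = 16

def find_active_partition(mbr: bytes):
    """Return (index, starting_lba) of the first entry flagged 0x80, or None."""
    if len(mbr) < 512:
        raise ValueError("an MBR sector is 512 bytes")
    for index in range(4):
        offset = PARTITION_TABLE_OFFSET + index * ENTRY_SIZE
        boot_flag = mbr[offset]
        start_lba = struct.unpack_from("<I", mbr, offset + 8)[0]
        if boot_flag == 0x80:
            return index, start_lba
    return None

def make_entry(bootable: bool, ptype: int, start_lba: int, sectors: int) -> bytes:
    """Build a demo entry; CHS fields are zeroed."""
    flag = 0x80 if bootable else 0x00
    return struct.pack("<B3sB3sII", flag, bytes(3), ptype, bytes(3), start_lba, sectors)

# Two used entries (the second one active) followed by two empty entries.
table = make_entry(False, 0x0B, 2048, 1000) + make_entry(True, 0x83, 4096, 2000) + bytes(32)
mbr = bytes(446) + table + b"\x55\xaa"
active = find_active_partition(mbr)
print(active)
```

The starting LBA returned here is the sector the boot code would read next, since the VBR is the first sector of the active partition.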
7. Loading the Volume Boot Record (VBR)
The VBR, also known as the partition boot sector, is the first sector of the active partition. It contains:
- Additional bootloader code that is specific to the operating system installed on that partition.
- Code that will further load the operating system’s kernel or another secondary bootloader.
8. Transferring Control to the OS Bootloader
The bootloader code in the VBR is executed, which typically performs the following tasks:
- Initializing the file system of the partition to locate the operating system kernel or secondary bootloader.
- Loading the kernel or secondary bootloader into memory.
- Transferring control to the operating system’s bootloader.
9. Loading the Operating System
The operating system’s bootloader then takes over the boot process, which involves:
- Initializing core operating system components.
- Loading device drivers and system services.
- Transitioning from real mode (used by BIOS) to protected mode (used by modern operating systems) or long mode for 64-bit OS.
10. Kernel Initialization and OS Start
Finally, the operating system kernel initializes its subsystems and services, mounts the root file system, and starts the init process (or its equivalent):
- The init process completes the operating system boot process.
- The operating system becomes fully operational, ready for user interaction.
2 GUID Partition Table (GPT)
The GUID Partition Table (GPT) is a standard for the layout of the partition table on a physical storage device such as a hard disk drive (HDD) or solid-state drive (SSD). GPT is part of the Unified Extensible Firmware Interface (UEFI) specification, which replaces the older Basic Input/Output System (BIOS) firmware interface. GPT provides several advantages over the older Master Boot Record (MBR) partitioning scheme, including support for larger disks and more partitions.
Structure:
- Protective MBR: The first sector, which prevents disk utilities from misrecognizing the GPT disk as unpartitioned.
- Primary GPT Header: Contains information about the disk and its partitions.
- Partition Entry Array: Stores information for each partition.
- Secondary GPT Header: Located at the end of the disk for redundancy.
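To make the header's role more concrete, the sketch below builds a synthetic 92-byte GPT header and verifies its checksum. The field layout follows the UEFI specification, but every field value is made up for a hypothetical 2048-sector disk, and a real tool would take the header size from the header itself and also verify the partition entry array checksum.

```python
import struct
import zlib

# Signature, revision, header size, header CRC32, reserved, this LBA, backup LBA,
# first usable LBA, last usable LBA, disk GUID, entry array LBA,
# number of entries, entry size, entry array CRC32.
HEADER_FORMAT = "<8sIIIIQQQQ16sQIII"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)

def build_header(crc: int = 0) -> bytes:
    """Pack a demo header; all values are assumptions for illustration."""
    return struct.pack(
        HEADER_FORMAT,
        b"EFI PART", 0x00010000, HEADER_SIZE, crc, 0,
        1, 2047,        # primary header at LBA 1, backup at the last LBA
        34, 2014,       # usable range between the two copies of the entry array
        bytes(16),      # disk GUID, zeroed for the demo
        2, 128, 128,    # entry array location, entry count, entry size
        0,              # entry array CRC32, not checked in this sketch
    )

def header_crc_ok(header: bytes) -> bool:
    """Recompute CRC32 over the header with its CRC field zeroed and compare."""
    fields = struct.unpack(HEADER_FORMAT, header[:HEADER_SIZE])
    if fields[0] != b"EFI PART":
        return False
    stored_crc = fields[3]
    zeroed = header[:16] + bytes(4) + header[20:HEADER_SIZE]
    return (zlib.crc32(zeroed) & 0xFFFFFFFF) == stored_crc

good = build_header(zlib.crc32(build_header()) & 0xFFFFFFFF)
corrupted = good[:40] + b"\xff" + good[41:]   # flip a byte in "first usable LBA"
print(header_crc_ok(good), header_crc_ok(corrupted))
```

When the primary header fails this check, a GPT-aware tool can fall back to the secondary header at the end of the disk, which is the redundancy the structure is designed to provide.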
Features:
- Partition Limit: Can support up to 128 partitions by default, though it can be extended further.
- Size Limit: Each partition can be up to 9.4 zettabytes (theoretical limit, assuming 64-bit sector addresses and 512-byte sectors).
- Redundancy: GPT stores a backup copy of the header and partition entries at the end of the disk for increased reliability and recovery.
Use Cases:
- Modern Systems: Preferred for UEFI-based systems and new installations.
- Large Disks: Suitable for disks larger than 2 TB.
- Advanced Features: Supports advanced features like CRC protection for integrity checking.
Advantages:
- More Partitions: Supports a much higher number of partitions compared to MBR.
- Larger Partitions: Handles very large disks and partitions beyond 2 TB.
- Resilience: Offers better data integrity and redundancy.
Comparison: MBR vs GPT
Feature | MBR | GPT |
---|---|---|
Maximum Partitions | 4 primary or 3 primary + 1 extended | 128 (default, can be extended) |
Maximum Partition Size | 2 TB | 9.4 zettabytes |
Redundancy | No | Yes (Primary and Secondary Headers) |
Compatibility | Widely supported, especially in older systems | Supported by most modern systems |
Integrity Checking | No | Yes (CRC protection) |
Suitable for Booting | BIOS-based systems | UEFI-based systems |
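
The size limits in the table follow from the width of the sector address fields. A quick check, assuming 512-byte sectors (disks with 4096-byte sectors raise both limits accordingly):

```python
SECTOR_SIZE = 512  # bytes per sector; an assumption, some disks use 4096

# MBR stores sector addresses and counts in 32-bit fields; GPT uses 64-bit LBAs.
mbr_max_bytes = (2 ** 32) * SECTOR_SIZE
gpt_max_bytes = (2 ** 64) * SECTOR_SIZE

mbr_max_tib = mbr_max_bytes / 2 ** 40    # binary terabytes (TiB)
mbr_max_tb = mbr_max_bytes / 10 ** 12    # decimal terabytes (TB)
gpt_max_zb = gpt_max_bytes / 10 ** 21    # decimal zettabytes (ZB)

print(f"MBR: {mbr_max_tib:.1f} TiB ({mbr_max_tb:.2f} TB)")
print(f"GPT: {gpt_max_zb:.2f} ZB")
```

The MBR figure works out to exactly 2 TiB (about 2.2 TB in decimal units), and the GPT figure to roughly 9.44 ZB, which is where the "2 TB" and "9.4 zettabytes" entries come from.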
Structure of File System
The structure of a file system is designed to manage how data is stored and retrieved on a storage device, such as a hard disk drive (HDD) or solid-state drive (SSD). This structure typically includes several key components, each with specific roles and functions.
1. Boot Block (or Boot Sector)
- Purpose: Contains the necessary code to boot the operating system. If the file system is on a bootable disk, the boot block will include the bootstrap loader, which is executed to load the operating system into memory.
- Location: The very first block of the disk or partition that holds the file system.
2. Superblock
- Purpose: Stores metadata about the file system, including its size, the number of data blocks, the location of key structures, and file system version.
- Contents:
- Total number of blocks and inodes
- Block size
- Information about free and allocated blocks
- Details about the file system layout and version
- Location: Typically found right after the boot block.
3. Inode Table
- Purpose: A data structure used to represent files and directories. Each inode stores metadata about a file or directory but does not contain the actual data.
- Contents:
- File size
- File type and permissions
- Timestamps (last access, last modification, last inode change; some file systems also record creation time)
- Pointers to data blocks
- Owner and group ID
- Location: Usually located right after the superblock.
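Much of this inode metadata is visible from user space. The sketch below uses Python's `os.stat` on a temporary file; the file name and contents are arbitrary, and the exact values reported (inode number, timestamps) depend on the operating system and file system in use.

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")   # arbitrary demo file
    with open(path, "wb") as handle:
        handle.write(b"hello")

    info = os.stat(path)                      # metadata only, not file contents
    file_size = info.st_size
    is_regular = stat.S_ISREG(info.st_mode)
    inode_number = info.st_ino
    permissions = stat.filemode(info.st_mode)
    modified_at = info.st_mtime               # seconds since the epoch

print(file_size, is_regular, inode_number, permissions, modified_at)
```

Notice that the file name is not part of the result: the name lives in the directory that references the inode, not in the inode itself.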
4. Data Blocks
- Purpose: Where the actual file data is stored. Data blocks can be contiguous or scattered across the disk.
- Contents: The actual content of files and directories.
- Location: The main area of the disk, managed by the inode table which points to the specific data blocks.
5. Free Space Management
- Purpose: Keeps track of unused blocks and inodes to allocate space for new files.
- Methods:
- Bitmap: A map of bits where each bit represents a block (1 means used, 0 means free).
- Free List: A linked list of free blocks.
- Location: Typically located in the superblock or in a dedicated area of the disk.
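The bitmap method can be sketched as a small class. This is a toy first-fit allocator for illustration, with a hypothetical class name; real file systems add block groups, locking, and on-disk persistence.

```python
class BlockBitmap:
    """One bit per block: 1 means used, 0 means free."""

    def __init__(self, total_blocks: int):
        self.total_blocks = total_blocks
        self.bits = bytearray((total_blocks + 7) // 8)

    def is_used(self, block: int) -> bool:
        return bool(self.bits[block // 8] & (1 << (block % 8)))

    def allocate(self) -> int:
        """Mark the lowest-numbered free block as used and return its number."""
        for block in range(self.total_blocks):
            if not self.is_used(block):
                self.bits[block // 8] |= 1 << (block % 8)
                return block
        raise OSError("no free blocks left")

    def free(self, block: int) -> None:
        """Clear the bit so the block can be reused."""
        self.bits[block // 8] &= ~(1 << (block % 8)) & 0xFF

bitmap = BlockBitmap(10)
first = bitmap.allocate()
second = bitmap.allocate()
third = bitmap.allocate()
bitmap.free(second)
reused = bitmap.allocate()   # first-fit picks the block that was just freed
print(first, second, third, reused)
```

A bitmap costs one bit per block regardless of how full the disk is, whereas a free list shrinks as the disk fills up but makes it harder to find runs of adjacent free blocks.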
6. Directory Structure
- Purpose: Organizes files into a hierarchy, allowing directories to contain other files and directories.
- Contents: Maps filenames to inodes.
- Location: Inodes and data blocks for directories are scattered throughout the disk.
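Resolving a path is then a walk through these name-to-inode mappings, one directory at a time. The in-memory model below is purely illustrative: the inode numbers, the names, and the choice of 2 for the root are assumptions of this sketch.

```python
ROOT_INODE = 2   # arbitrary choice for this sketch

# Directory inode number -> {entry name: inode number}
directories = {
    2: {"home": 11, "etc": 12},
    11: {"alice": 21},
    21: {"notes.txt": 31},
    12: {"hosts": 32},
}

def resolve(path: str) -> int:
    """Walk the path component by component and return the final inode number."""
    inode = ROOT_INODE
    for name in path.strip("/").split("/"):
        if not name:
            continue                      # handles "/" and doubled slashes
        entries = directories.get(inode)
        if entries is None:
            raise NotADirectoryError(path)
        if name not in entries:
            raise FileNotFoundError(path)
        inode = entries[name]
    return inode

print(resolve("/home/alice/notes.txt"))
print(resolve("/"))
```

Each step needs the directory's own contents to be read first, which is why deep paths cost more lookups and why operating systems cache directory entries.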
7. Journal (for Journaling File Systems)
- Purpose: Improves reliability by recording changes to the file system in a journal before they are actually written to the main file system. This helps recover from crashes and power failures.
- Contents: Records of pending changes, ensuring atomicity.
- Location: A dedicated area on the disk, often near the superblock or inode table.
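The idea of logging changes before applying them can be shown with a toy in-memory model. The class and method names here are hypothetical, and real journaling file systems differ in what they log (metadata only or data too) and in how they order writes to disk.

```python
class JournaledStore:
    """Toy write-ahead journal: changes are logged, committed, then applied."""

    def __init__(self):
        self.blocks = {}    # the "main file system": block number -> data
        self.journal = []   # logged transactions not yet applied

    def begin(self, writes: dict) -> int:
        """Record the intended writes in the journal; main blocks are untouched."""
        self.journal.append({"writes": dict(writes), "committed": False})
        return len(self.journal) - 1

    def commit(self, txid: int) -> None:
        """Mark the transaction as complete in the journal."""
        self.journal[txid]["committed"] = True

    def recover(self) -> None:
        """Replay committed transactions and discard incomplete ones."""
        for tx in self.journal:
            if tx["committed"]:
                self.blocks.update(tx["writes"])
        self.journal.clear()

store = JournaledStore()
tx1 = store.begin({1: b"inode update", 2: b"file data"})
store.commit(tx1)
tx2 = store.begin({3: b"half-written"})   # simulated crash before commit
store.recover()
print(store.blocks)
```

After recovery the first transaction is applied in full and the second leaves no trace, which is the all-or-nothing behavior the journal is meant to guarantee.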
Example of a Simple File System Layout
Here's a simplified layout of a typical file system:
- Boot Block (Block 0): Contains bootstrapping code.
- Superblock (Block 1): Contains metadata about the file system.
- Inode Table (Blocks 2-9): Array of inodes, each representing a file or directory.
- Free Space Management (Blocks 10-13): Bitmap or free list tracking free blocks.
- Data Blocks (Blocks 14 and beyond): Storage for actual file data and directory contents.
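
Given this layout, mapping a block number to its region and byte offset is simple arithmetic. The sketch below assumes a 4096-byte block size, which the layout above does not specify, and the function names are illustrative.

```python
BLOCK_SIZE = 4096   # assumed block size in bytes

# (region name, first block, last block) taken from the layout above
LAYOUT = [
    ("boot block", 0, 0),
    ("superblock", 1, 1),
    ("inode table", 2, 9),
    ("free space management", 10, 13),
]

def region_of(block: int) -> str:
    """Return the name of the region that contains the given block number."""
    if block < 0:
        raise ValueError("block numbers start at 0")
    for name, first, last in LAYOUT:
        if first <= block <= last:
            return name
    return "data blocks"          # block 14 and everything after it

def byte_offset(block: int) -> int:
    """Byte position of the start of a block within the partition."""
    return block * BLOCK_SIZE

for block in (0, 1, 5, 12, 14):
    print(block, region_of(block), byte_offset(block))
```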