Harman Patil (Editor)

Design of the FAT file system

Updated on
Edit
Like
Comment
Share on FacebookTweet on TwitterShare on LinkedInShare on Reddit
Directory contents
  
Table

File allocation
  
Design of the FAT file system

Developer(s)
  
Full name
  
File Allocation Table:FAT12 (12-bit version),FAT16 (16-bit versions),FAT32 (32-bit version with 28 bits used),exFAT (64-bit versions)

Introduced
  
1977 (Standalone Disk BASIC-80)FAT12: August 1980 (SCP QDOS)FAT16: August 1984 (IBM PC DOS 3.0)FAT16B: November 1987 (Compaq MS-DOS 3.31)FAT32: August 1996 (Windows 95 OSR2)

Partition identifier
  
MBR/EBR:FAT12: 0x01 e.a.FAT16: 0x04 0x06 0x0E e.a.FAT32: 0x0B 0x0C e.a.BDP:EBD0A0A2-B9E5-4433-87C0-68B6B72699C7

A FAT file system is a specific type of computer file system architecture and a family of industry-standard file systems utilizing it.

Contents

The FAT file system is a legacy file system which is simple and robust. It offers good performance even in very light-weight implementations, but cannot deliver the same performance, reliability and scalability as some modern file systems. It is, however, supported for compatibility reasons by nearly all currently developed operating systems for personal computers and many home computers, mobile devices and embedded systems, and thus is a well-suited format for data exchange between computers and devices of almost any type and age from 1981 up to the present.

Originally designed in 1977 for use on floppy disks, FAT was soon adapted and used almost universally on hard disks throughout the DOS and Windows 9x eras for two decades. Today, FAT file systems are still commonly found on floppy disks, USB sticks, flash and other solid-state memory cards and modules, and many portable and embedded devices. DCF implements FAT as the standard file system for digital cameras since 1998. FAT is also utilized for the EFI system partition (partition type 0xEF) in the boot stage of EFI-compliant computers.

For floppy disks, FAT has been standardized as ECMA-107 and ISO/IEC 9293:1994 (superseding ISO 9293:1987). These standards cover FAT12 and FAT16 with only short 8.3 filename support; long filenames with VFAT are partially patented.

Technical overview

The name of the file system originates from the file system's prominent usage of an index table, the File Allocation Table, statically allocated at the time of formatting. The table contains entries for each cluster, a contiguous area of disk storage. Each entry contains either the number of the next cluster in the file, or else a marker indicating end of file, unused disk space, or special reserved areas of the disk. The root directory of the disk contains the number of the first cluster of each file in that directory; the operating system can then traverse the FAT table, looking up the cluster number of each successive part of the disk file as a cluster chain until the end of the file is reached. In much the same way, sub-directories are implemented as special files containing the directory entries of their respective files.

Originally designed as an 8-bit file system, the maximum number of clusters has been significantly increased as disk drives have evolved, and so the number of bits used to identify each cluster has grown. The successive major versions of the FAT format are named after the number of table element bits: 12 (FAT12), 16 (FAT16), and 32 (FAT32). Except for the original 8-bit FAT precursor, each of these variants is still in use. The FAT standard has also been expanded in other ways while generally preserving backward compatibility with existing software.

Layout

A FAT file system is composed of four different sections:

  • The Reserved sectors, located at the very beginning.
  • The first reserved sector (logical sector 0) is the Boot Sector (also called Volume Boot Record or simply VBR). It includes an area called the BIOS Parameter Block (BPB) which contains some basic file system information, in particular its type and pointers to the location of the other sections, and usually contains the operating system's boot loader code.Important information from the Boot Sector is accessible through an operating system structure called the Drive Parameter Block (DPB) in DOS and OS/2.The total count of reserved sectors is indicated by a field inside the Boot Sector, and is usually 32 on FAT32 file systems.For FAT32 file systems, the reserved sectors include a File System Information Sector at logical sector 1 and a Backup Boot Sector at logical sector 6.While many other vendors have continued to utilize a single-sector setup (logical sector 0 only) for the bootstrap loader, Microsoft's boot sector code has grown to span over logical sectors 0 and 2 since the introduction of FAT32, with logical sector 0 depending on sub-routines in logical sector 2. The Backup Boot Sector area consists of three logical sectors 6, 7, and 8 as well. In some cases, Microsoft also uses sector 12 of the reserved sectors area for an extended boot loader.
  • The FAT Region.
  • This typically contains two copies (may vary) of the File Allocation Table for the sake of redundancy checking, although rarely used, even by disk repair utilities.These are maps of the Data Region, indicating which clusters are used by files and directories. In FAT12 and FAT16 they immediately follow the reserved sectors.Typically the extra copies are kept in tight synchronization on writes, and on reads they are only used when errors occur in the first FAT. In FAT32, it is possible to switch from the default behaviour and select a single FAT out of the available ones to be used for diagnosis purposes.The first two clusters (cluster 0 and 1) in the map contain special values.
  • The Root Directory Region.
  • This is a Directory Table that stores information about the files and directories located in the root directory. It is only used with FAT12 and FAT16, and imposes on the root directory a fixed maximum size which is pre-allocated at creation of this volume. FAT32 stores the root directory in the Data Region, along with files and other directories, allowing it to grow without such a constraint. Thus, for FAT32, the Data Region starts here.
  • The Data Region.
  • This is where the actual file and directory data is stored and takes up most of the partition. Traditionally, the unused parts of the data region are initialized with a filler value of 0xF6 as per the INT 1Eh's Disk Parameter Table (DPT) during format on IBM compatible machines, but also used on the Atari Portfolio. 8-inch CP/M floppies typically came pre-formatted with a value of 0xE5; by way of Digital Research this value was also used on Atari ST formatted floppies. Amstrad used 0xF4 instead. Some modern formatters wipe hard disks with a value of 0x00, whereas a value of 0xFF, the default value of a non-programmed flash block, is used on flash disks to reduce wear. The latter value is typically also used on ROM disks. (Some advanced formatting tools allow to configure the format filler byte.)The size of files and subdirectories can be increased arbitrarily (as long as there are free clusters) by simply adding more links to the file's chain in the FAT. Note however, that files are allocated in units of clusters, so if a 1 KB file resides in a 32 KB cluster, 31 KB are wasted.FAT32 typically commences the Root Directory Table in cluster number 2: the first cluster of the Data Region.

    FAT uses little-endian format for all entries in the header (except for, where explicitly mentioned, for some entries on Atari ST boot sectors) and the FAT(s). It is possible to allocate more FAT sectors than necessary for the number of clusters. The end of the last sector of each FAT copy can be unused if there are no corresponding clusters. The total number of sectors (as noted in the boot record) can be larger than the number of sectors used by data (clusters × sectors per cluster), FATs (number of FATs × sectors per FAT), the root directory (n/a for FAT32), and hidden sectors including the boot sector: this would result in unused sectors at the end of the volume. If a partition contains more sectors than the total number of sectors occupied by the file system it would also result in unused sectors, at the end of the partition, after the volume.

    Boot Sector

    On non-partitioned devices, such as floppy disks, the Boot Sector (VBR) is the first sector (logical sector 0 with physical CHS address 0/0/1 or LBA address 0). For partitioned devices such as hard drives, the first sector is the Master Boot Record defining partitions, while the first sector of partitions formatted with a FAT file system is again the Boot Sector.

    Common structure of the first 11 bytes used by most FAT versions for IBM compatible x86-machines since DOS 2.0 are:

    FAT-formatted Atari ST floppies have a very similar boot sector layout:

    FAT12-formatted MSX-DOS volumes have a very similar boot sector layout:

    BIOS Parameter Block

    Common structure of the first 25 bytes of the BIOS Parameter Block (BPB) used by FAT versions since DOS 2.0 (bytes at sector offset 0x00B to 0x017 are stored since DOS 2.0, but not always used before DOS 3.2, values at 0x018 to 0x01B are used since DOS 3.0):

    DOS 3.0 BPB:

    The following extensions were documented since DOS 3.0, however, they were already supported by some issues of DOS 2.11. MS-DOS 3.10 still supported the DOS 2.0 format, but could use the DOS 3.0 format as well.

    DOS 3.2 BPB:

    Officially, MS-DOS 3.20 still used the DOS 3.0 format, but SYS and FORMAT were adapted to support a 6 bytes longer format already (of which not all entries were used).

    DOS 3.31 BPB:

    Officially introduced with DOS 3.31 and not used by DOS 3.2, some DOS 3.2 utilities were designed to be aware of this new format already. Official documentation recommends to trust these values only if the logical sectors entry at offset 0x013 is zero.

    A simple formula translates a volume's given cluster number CN to a logical sector number LSN:

    1. Determine (once) SSA=RSC+FN×SF+ceil((32×RDE)/SS), where the reserved sector count RSC is stored at offset 0x00E, the FAT number FN at offset 0x010, the sectors per FAT SF at offset 0x016 (FAT12/FAT16) or 0x024 (FAT32), the root directory entries RDE at offset 0x011, the sector size SS at offset 0x00B, and ceil(x) rounds up to a whole number.
    2. Determine LSN=SSA+(CN−2)×SC, where the sectors per cluster SC are stored at offset 0x00D.

    On unpartitioned media the volume's number of hidden sectors is zero and therefore LSN and LBA addresses become the same for as long as a volume's logical sector size is identical to the underlying medium's physical sector size. Under these conditions, it is also simple to translate between CHS addresses and LSNs as well:

    LSN=SPT×(HN+(NOS×TN))+SN−1, where the sectors per track SPT are stored at offset 0x018, and the number of sides NOS at offset 0x01A. Track number TN, head number HN, and sector number SN correspond to Cylinder-head-sector: the formula gives the known CHS to LBA translation.

    Extended BIOS Parameter Block

    Further structure used by FAT12 and FAT16 since OS/2 1.0 and DOS 4.0, also known as Extended BIOS Parameter Block (EBPB) (bytes below sector offset 0x024 are the same as for the DOS 3.31 BPB):

    FAT32 Extended BIOS Parameter Block

    In essence FAT32 inserts 28 bytes into the EBPB, followed by the remaining 26 (or sometimes only 7) EBPB bytes as shown above for FAT12 and FAT16. Microsoft and IBM operating systems determine the type of FAT file system used on a volume solely by the number of clusters, not by the used BPB format or the indicated file system type, that is, it is technically possibly to use a "FAT32 EBPB" also for FAT12 and FAT16 volumes as well as a DOS 4.0 EBPB for small FAT32 volumes. Since such volumes were found to be created by Windows operating systems under some odd conditions, operating systems should be prepared to cope with these hybrid forms.

    Exceptions

    Versions of DOS before 3.2 totally or partially relied on the media descriptor byte in the BPB or the FAT ID byte in cluster 0 of the first FAT in order to determine FAT12 diskette formats even if a BPB is present. Depending on the FAT ID found and the drive type detected they default to use one of the following BPB prototypes instead of using the values actually stored in the BPB.

    Originally, the FAT ID was meant to be a bit flag with all bits set except for bit 2 cleared to indicate an 80 track (vs. 40 track) format, bit 1 cleared to indicate a 9 sector (vs. 8 sector) format, and bit 0 cleared to indicate a single-sided (vs. double-sided) format, but this scheme was not followed by all OEMs and became obsolete with the introduction of hard disks and high-density formats. Also, the various 8-inch formats supported by 86-DOS and MS-DOS do not fit this scheme.

    Microsoft recommends to distinguish between the two 8-inch formats for FAT ID 0xFE by trying to read of a single-density address mark. If this results in an error, the medium must be double-density.

    The table does not list a number of incompatible 8-inch and 5.25-inch FAT12 floppy formats supported by 86-DOS, which differ either in the size of the directory entries (16 bytes vs. 32 bytes) or in the extent of the reserved sectors area (several whole tracks vs. one logical sector only).

    The implementation of a single-sided 315 KB FAT12 format used in MS-DOS for the Apricot PC and F1e had a different boot sector layout, to accommodate that computer's non-IBM compatible BIOS. The jump instruction and OEM name were omitted, and the MS-DOS BPB parameters (offsets 0x00B-0x017 in the standard boot sector) were located at offset 0x050. The Portable, F1, PC duo and Xi FD supported a non-standard double-sided 720 KB FAT12 format instead. The differences in the boot sector layout and media IDs made these formats incompatible with many other operating systems. The geometry parameters for these formats are:

  • 315 KB: Bytes per logical sector: 512 bytes, logical sectors per cluster: 1, reserved logical sectors: 1, number of FATs: 2, root directory entries: 128, total logical sectors: 630, FAT ID: 0xFC, logical sectors per FAT: 2, physical sectors per track: 9, number of heads: 1.
  • 720 KB: Bytes per logical sector: 512 bytes, logical sectors per cluster: 2, reserved logical sectors: 1, number of FATs: 2, root directory entries: 176, total logical sectors: 1440, FAT ID: 0xFE, logical sectors per FAT: 3, physical sectors per track: 9, number of heads: 2.
  • Later versions of Apricot MS-DOS gained the ability to read and write disks with the standard boot sector in addition to those with the Apricot one. These formats were also supported by DOS Plus 2.1e/g for the Apricot ACT series.

    The DOS Plus adaptation for the BBC Master 512 supported two FAT12 formats on 80-track, double-sided, double-density 5.25" drives, which did not use conventional boot sectors at all. 800 KB data disks omitted a boot sector and began with a single copy of the FAT. The first byte of the relocated FAT in logical sector 0 was used to determine the disk's capacity. 640 KB boot disks began with a miniature ADFS file system containing the boot loader, followed by a single FAT. Also, the 640 KB format differed by using physical CHS sector numbers starting with 0 (not 1, as common) and incrementing sectors in the order sector-track-head (not sector-head-track, as common). The FAT started at the beginning of the next track. These differences make these formats unrecognizable by other operating systems. The geometry parameters for these formats are:

  • 800 KB: Bytes per logical sector: 1024 bytes, logical sectors per cluster: 1, reserved logical sectors: 0, number of FATs: 1, root directory entries: 192, total logical sectors: 800, FAT ID: 0xFD, logical sectors per FAT: 2, physical sectors per track: 5, number of heads: 2.
  • 640 KB: Bytes per logical sector: 256 bytes, logical sectors per cluster: 8, reserved logical sectors: 16, number of FATs: 1, root directory entries: 112, total logical sectors: 2560, FAT ID: 0xFF, logical sectors per FAT: 2, physical sectors per track: 16, number of heads: 2.
  • DOS Plus for the Master 512 could also access standard PC disks formatted to 180 KB or 360 KB, using the first byte of the FAT in logical sector 1 to determine the capacity.

    FS Information Sector

    The "FS Information Sector" was introduced in FAT32 for speeding up access times of certain operations (in particular, getting the amount of free space). It is located at a logical sector number specified in the FAT32 EBPB boot record at position 0x030 (usually logical sector 1, immediately after the boot record itself).

    The sector's data may be outdated and not reflect the current media contents, because not all operating systems update or use this sector, and even if they do, the contents is not valid when the medium has been ejected without properly unmounting the volume or after a power-failure. Therefore, operating systems should first inspect a volume's optional shutdown status bitflags residing in the FAT entry of cluster 1 or the FAT32 EBPB at offset 0x041 and ignore the data stored in the FS information sector, if these bitflags indicate that the volume was not properly unmounted before. This does not cause any problems other than a possible speed penalty for the first free space query or data cluster allocation; see fragmentation.

    If this sector is present on a FAT32 volume, the minimum allowed logical sector size is 512 bytes, whereas otherwise it would be 128 bytes. Some FAT32 implementations support a slight variation of Microsoft's specification by making the FS information sector optional by specifying a value of 0xFFFF (or 0x0000) in the entry at offset 0x030.

    Cluster map

    A volume's data area is divided up into identically sized clusters, small blocks of contiguous space. Cluster sizes vary depending on the type of FAT file system being used and the size of the partition; typically cluster sizes lie somewhere between 2 KB and 32 KB.

    Each file may occupy one or more of these clusters depending on its size; thus, a file is represented by a chain of these clusters (referred to as a singly linked list). However these clusters are not necessarily stored adjacent to one another on the disk's surface but are often instead fragmented throughout the Data Region.

    Each version of the FAT file system uses a different size for FAT entries. Smaller numbers result in a smaller FAT, but waste space in large partitions by needing to allocate in large clusters.

    The FAT12 file system uses 12 bits per FAT entry, thus two entries span 3 bytes. It is consistently little-endian: if those three bytes are considered as one little-endian 24-bit number, the 12 least significant bits represent the first entry (f.e. cluster 0) and the 12 most significant bits the second (f.e cluster 1). In other words, while the low eight bits of the first cluster in the row are stored in the first byte, the top four bits are stored in the low nibble of the second byte, whereas the low four bits of the subsequent cluster in the row are stored in the high nibble of the second byte and its higher eight bits in the third byte.

  • FAT ID / endianess marker (in reserved cluster #0), with 0xF0 indicating a volume on a non-partitioned superfloppy drive (must be 0xF8 for partitioned disks)
  • End of chain indicator / maintenance flags (in reserved cluster #1)
  • Second chain (7 clusters) for a non-fragmented file (here: #2, #3, #4, #5, #6, #7, #8)
  • Third chain (7 clusters) for a fragmented, possibly grown file (here: #9, #A, #14, #15, #16, #19, #1A)
  • Fourth chain (7 clusters) for a non-fragmented, possibly truncated file (here: #B, #C, #D, #E, #F, #10, #11)
  • Empty clusters
  • Fifth chain (1 cluster) for a sub-directory (here: #23)
  • Bad clusters (3 clusters) (here: #27, #28, #2D)
  • The FAT16 file system uses 16 bits per FAT entry, thus one entry spans two bytes in little-endian byte order:

    The FAT32 file system uses 32 bits per FAT entry, thus one entry spans four bytes in little-endian byte order. The four top bits of each entry are reserved for other purposes, cleared during format and should not be changed otherwise. They must be masked off before interpreting the entry as 28-bit cluster address.

  • First chain (1 cluster) for the root directory, pointed to by an entry in the FAT32 BPB (here: #2)
  • Second chain (6 clusters) for a non-fragmented file (here: #3, #4, #5, #6, #7, #8)
  • The File Allocation Table (FAT) is a contiguous number of sectors immediately following the area of reserved sectors. It represents a list of entries that map to each cluster on the volume. Each entry records one of five things:

  • the cluster number of the next cluster in a chain
  • a special end of cluster-chain (EOC) entry that indicates the end of a chain
  • a special entry to mark a bad cluster
  • a zero to note that the cluster is unused
  • For very early versions of DOS to recognize the file system, the system must have been booted from the volume or the volume's FAT must start with the volume's second sector (logical sector 1 with physical CHS address 0/0/2 or LBA address 1), that is, immediately following the boot sector. Operating systems assume this hard-wired location of the FAT in order to find the FAT ID in the FAT's cluster 0 entry on DOS 1.0-1.1 FAT diskettes, where no valid BPB is found.

    Special entries

    The first two entries in a FAT store special values:

    The first entry (cluster 0 in the FAT) holds the FAT ID since MS-DOS 1.20 and PC DOS 1.1 (allowed values 0xF0-0xFF with 0xF1-0xF7 reserved) in bits 7-0, which is also copied into the BPB of the boot sector, offset 0x015 since DOS 2.0. The remaining 4 bits (if FAT12), 8 bits (if FAT16) or 20 bits (if FAT32) of this entry are always 1. These values were arranged so that the entry would also function as an "trap-all" end-of-chain marker for all data clusters holding a value of zero. Additionally, for FAT IDs other than 0xFF (and 0x00) it is possible to determine the correct nibble and byte order (to be) used by the file system driver, however, the FAT file system officially uses a little-endian representation only and there are no known implementations of variants using big-endian values instead. 86-DOS 0.42 up to MS-DOS 1.14 used hard-wired drive profiles instead of a FAT ID, but used this byte to distinguish between media formatted with 32-byte or 16-byte directory entries, as they were used prior to 86-DOS 0.42.

    The second entry (cluster 1 in the FAT) nominally stores the end-of-cluster-chain marker as used by the formater, but typically always holds 0xFFF / 0xFFFF / 0x0FFFFFFF, that is, with the exception of bits 31-28 on FAT32 volumes these bits are normally always set. Some Microsoft operating systems, however, set these bits if the volume is not the volume holding the running operating system (that is, use 0xFFFFFFFF instead of 0x0FFFFFFF here). (In conjunction with alternative end-of-chain markers the lowest bits 2-0 can become zero for the lowest allowed end-of-chain marker 0xFF8 / 0xFFF8 / 0x?FFFFFF8; bit 3 should be reserved as well given that clusters 0xFF0 / 0xFFF0 / 0x?FFFFFF0 and higher are officially reserved. Some operating systems may not be able to mount some volumes if any of these bits are not set, therefore the default end-of-chain marker should not be changed.) For DOS 1 and 2, the entry was documented as reserved for future use.

    Since DOS 7.1 the two most-significant bits of this cluster entry may hold two optional bitflags representing the current volume status on FAT16 and FAT32, but not on FAT12 volumes. These bitflags are not supported by all operating systems, but operating systems supporting this feature would set these bits on shutdown and clear the most significant bit on startup:
    If bit 15 (on FAT16) or bit 27 (on FAT32) is not set when mounting the volume, the volume was not properly unmounted before shutdown or ejection and thus is in an unknown and possibly "dirty" state. On FAT32 volumes, the FS Information Sector may hold outdated data and thus should not be used. The operating system would then typically run SCANDISK or CHKDSK on the next startup (but not on insertion of removable media) to ensure and possibly reestablish the volume's integrity.
    If bit 14 (on FAT16) or bit 26 (on FAT32) is cleared, the operating system has encountered disk I/O errors on startup, a possible indication for bad sectors. Operating systems aware of this extension will interpret this as a recommendation to carry out a surface scan (SCANDISK) on the next boot. (A similar set of bitflags exists in the FAT12/FAT16 EBPB at offset 0x1A or the FAT32 EBPB at offset 0x36. While the cluster 1 entry can be accessed by file system drivers once they have mounted the volume, the EBPB entry is available even when the volume is not mounted and thus easier to use by disk block device drivers or partitioning tools.)

    If the number of FATs in the BPB is not set to 2, the second cluster entry in the first FAT (cluster 1) may also reflect the status of a TFAT volume for TFAT-aware operating systems. If the cluster 1 entry in that FAT holds the value 0, this may indicate that the second FAT represents the last known valid transaction state and should be copied over the first FAT, whereas the first FAT should be copied over the second FAT if all bits are set.

    Some non-standard FAT12/FAT16 implementations utilize the cluster 1 entry to store the starting cluster of a variable-sized root directory (typically 2). This may occur when the number of root directory entries in the BPB holds a value of 0 and no FAT32 EBPB is found (no signature 0x29 or 0x28 at offset 0x042). This extension, however, is not supported by mainstream operating systems, as it is conflictive with other possible uses of the cluster 1 entry. Most conflicts can be ruled out if this extension is only allowed for FAT12 with less than 0xFEF and FAT16 volumes with less than 0x3FEF clusters and 2 FATs.

    Because these first two FAT entries store special values, there are no data clusters 0 or 1. The first data cluster (after the root directory if FAT12/FAT16) is cluster 2, and cluster 2 is by definition the first cluster in the data area.

    Cluster values

    FAT entry values:

    Despite its name FAT32 uses only 28 bits of the 32 possible bits. The upper 4 bits are usually zero, but are reserved and should be left untouched. A standard conformant FAT32 file system driver or maintenance tool must not rely on the upper 4 bits to be zero and it must strip them off before evaluating the cluster number in order to cope with possible future expansions where these bits may be used for other purposes. They must not be cleared by the file system driver when allocating new clusters, but should be cleared during a reformat.

    Size limits

    The FAT12, FAT16, FAT16B, and FAT32 variants of the FAT file systems have clear limits based on the number of clusters and the number of sectors per cluster (1, 2, 4, ..., 128). For the typical value of 512 bytes per sector:

    FAT12 requirements : 3 sectors on each copy of FAT for every 1,024 clustersFAT16 requirements : 1 sector on each copy of FAT for every 256 clustersFAT32 requirements : 1 sector on each copy of FAT for every 128 clustersFAT12 range : 1 to 4,084 clusters : 1 to 12 sectors per copy of FATFAT16 range : 4,085 to 65,524 clusters : 16 to 256 sectors per copy of FATFAT32 range : 65,525 to 268,435,444 clusters : 512 to 2,097,152 sectors per copy of FATFAT12 minimum : 1 sector per cluster × 1 clusters = 512 bytes (0.5 KB)FAT16 minimum : 1 sector per cluster × 4,085 clusters = 2,091,520 bytes (2,042.5 KB)FAT32 minimum : 1 sector per cluster × 65,525 clusters = 33,548,800 bytes (32,762.5 KB)FAT12 maximum : 64 sectors per cluster × 4,084 clusters = 133,824,512 bytes (≈ 127 MB)[FAT12 maximum : 128 sectors per cluster × 4,084 clusters = 267,694,024 bytes (≈ 255 MB)]FAT16 maximum : 64 sectors per cluster × 65,524 clusters = 2,147,090,432 bytes (≈2,047 MB)[FAT16 maximum : 128 sectors per cluster × 65,524 clusters = 4,294,180,864 bytes (≈4,095 MB)]FAT32 maximum : 8 sectors per cluster × 268,435,444 clusters = 1,099,511,578,624 bytes (≈1,024 GB)FAT32 maximum : 16 sectors per cluster × 268,173,557 clusters = 2,196,877,778,944 bytes (≈2,046 GB)[FAT32 maximum : 32 sectors per cluster × 134,152,181 clusters = 2,197,949,333,504 bytes (≈2,047 GB)][FAT32 maximum : 64 sectors per cluster × 67,092,469 clusters = 2,198,486,024,192 bytes (≈2,047 GB)][FAT32 maximum : 128 sectors per cluster × 33,550,325 clusters = 2,198,754,099,200 bytes (≈2,047 GB)]Legend: 268435444+3 is 0x0FFFFFF7, because FAT32 version 0 uses only 28 bits in the 32-bit cluster numbers, cluster numbers 0x0FFFFFF7 up to 0x0FFFFFFF flag bad clusters or the end of a file, cluster number 0 flags a free cluster, and cluster number 1 is not used. Likewise 65524+3 is 0xFFF7 for FAT16, and 4084+3 is 0xFF7 for FAT12. The number of sectors per cluster is a power of 2 fitting in a single byte, the smallest value is 1 (0x01), the biggest value is 128 (0x80). Lines in square brackets indicate the unusual cluster size 128, and for FAT32 the bigger than necessary cluster sizes 32 or 64.

    Because each FAT32 entry occupies 32 bits (4 bytes) the maximal number of clusters (268435444) requires 2097152 FAT sectors for a sector size of 512 bytes. 2097152 is 0x200000, and storing this value needs more than two bytes. Therefore, FAT32 introduced a new 32-bit value in the FAT32 boot sector immediately following the 32-bit value for the total number of sectors introduced in the FAT16B variant.

    The boot record extensions introduced with DOS 4.0 start with a magic 40 (0x28) or 41 (0x29). Typically FAT drivers look only at the number of clusters to distinguish FAT12, FAT16, and FAT32: the human readable strings identifying the FAT variant in the boot record are ignored, because they exist only for media formatted with DOS 4.0 or later.

    Determining the number of directory entries per cluster is straight forward, each entry occupies 32 bytes, this results in 16 entries per sector for a sector size of 512 bytes. The DOS 5 RMDIR/RD command removes the initial "." (this directory) and ".." (parent directory) entries in subdirectories directly, therefore sector size 32 on a RAM disk is possible for FAT12, but requires 2 or more sectors per cluster. A FAT12 boot sector without the DOS 4 extensions needs 29 bytes before the first unnecessary FAT16B 32-bit number of hidden sectors, this leaves three bytes for the (on a RAM disk unused) boot code and the magic 0x55 0xAA at the end of all boot sectors. On Windows NT the smallest supported sector size is 128.

    On Windows NT operating systems the FORMAT command options /A:128K and /A:256K correspond to the maximal cluster size 0x80 (128) with a sector size 1024 and 2048, respectively. For the common sector size 512 /A:64K yields 128 sectors per cluster.

    Both editions of each ECMA-107 and ISO/IEC 9293 specify a Max Cluster Number MAX determined by the formula MAX=1+trunc((TS-SSA)/SC), and reserve cluster numbers MAX+1 up to 4086 (0xFF6, FAT12) and later 65526 (0xFFF6, FAT16) for future standardization.

    Microsoft's EFI FAT32 specification states that any FAT file system with less than 4085 clusters is FAT12, else any FAT file system with less than 65525 clusters is FAT16, and otherwise it is FAT32. The entry for cluster 0 at the beginning of the FAT must be identical to the media descriptor byte found in the BPB, whereas the entry for cluster 1 reflects the end-of-chain value used by the formatter for cluster chains (0xFFF, 0xFFFF or 0x0FFFFFFF). The entries for cluster numbers 0 and 1 end at a byte boundary even for FAT12, e.g., 0xF9FFFF for media descriptor 0xF9.

    The first data cluster is 2, and consequently the last cluster MAX gets number MAX+1. This results in data cluster numbers 2...4085 (0xFF5) for FAT12, 2...65525 (0xFFF5) for FAT16, and 2...268435445 (0x0FFFFFF5) for FAT32.

    The only available values reserved for future standardization are therefore 0xFF6 (FAT12) and 0xFFF6 (FAT16). As noted below "less than 4085" is also used for Linux implementations, or as Microsoft's FAT specification puts it:

    when it says <, it does not mean <=. Note also that the numbers are correct. The first number for FAT12 is 4085; the second number for FAT16 is 65525. These numbers and the ‘<’ signs are not wrong.

    Fragmentation

    The FAT file system does not contain built-in mechanisms which prevent newly written files from becoming scattered across the partition. On volumes where files are created and deleted frequently or their lengths often changed, the medium will become increasingly fragmented over time.

    While the design of the FAT file system does not cause any organizational overhead in disk structures or reduce the amount of free storage space with increased amounts of fragmentation, as it occurs with external fragmentation, the time required to read and write fragmented files will increase as the operating system will have to follow the cluster chains in the FAT (with parts having to be loaded into memory first in particular on large volumes) and read the corresponding data physically scattered over the whole medium reducing chances for the low-level block device driver to perform multi-sector disk I/O or initiate larger DMA transfers, thereby effectively increasing I/O protocol overhead as well as arm movement and head settle times inside the disk drive. Also, file operations will become slower with growing fragmentation as it takes increasingly longer for the operating system to find files or free clusters.

    Other file systems, e.g., HPFS or exFAT, use free space bitmaps that indicate used and available clusters, which could then be quickly looked up in order to find free contiguous areas. Another solution is the linkage of all free clusters into one or more lists (as is done in Unix file systems). Instead, the FAT has to be scanned as an array to find free clusters, which can lead to performance penalties with large disks.

    In fact, seeking for files in large subdirectories or computing the free disk space on FAT volumes is one of the most resource intensive operations, as it requires reading the directory tables or even the entire FAT linearly. Since the total amount of clusters and the size of their entries in the FAT was still small on FAT12 and FAT16 volumes, this could still be tolerated on FAT12 and FAT16 volumes most of the time, considering that the introduction of more sophisticated disk structures would have also increased the complexity and memory footprint of real-mode operating systems with their minimum total memory requirements of 128 KB or less (such as with DOS) for which FAT has been designed and optimized originally.

    With the introduction of FAT32, long seek and scan times became more apparent, particularly on very large volumes. A possible justification suggested by Microsoft's Raymond Chen for limiting the maximum size of FAT32 partitions created on Windows was the time required to perform a "DIR" operation, which always displays the free disk space as the last line. Displaying this line took longer and longer as the number of clusters increased. FAT32 therefore introduced a special file system information sector where the previously computed amount of free space is preserved over power cycles, so that the free space counter needs to be recalculated only when a removable FAT32 formatted medium gets ejected without first unmounting it or if the system is switched off without properly shutting down the operating system, a problem mostly visible with pre-ATX-style PCs, on plain DOS systems and some battery-powered consumer products.

    With the huge cluster sizes (16 KB, 32 KB, 64 KB) forced by larger FAT partitions, internal fragmentation in form of disk space waste by file slack due to cluster overhang (as files are rarely exact multiples of cluster size) starts to be a problem as well, especially when there are a great many small files.

    Various optimizations and tweaks to the implementation of FAT file system drivers, block device drivers and disk tools have been devised to overcome most of the performance bottlenecks in the file system's inherit design without having to change the layout of the on-disk structures. They can be divided into on-line and off-line methods and work by trying to avoid fragmentation in the file system in the first place, deploying methods to better cope with existing fragmentation, and by reordering and optimizing the on-disk structures. With optimizations in place, the performance on FAT volumes can often reach that of more sophisticated file systems in practical scenarios, while at the same time retaining the advantage of being accessible even on very small or old systems.

    DOS 3.0 and higher will not immediately reuse disk space of deleted files for new allocations but instead seek for previously unused space before starting to use disk space of previously deleted files as well. This not only helps to maintain the integrity of deleted files for as long as possible but also speeds up file allocations and avoids fragmentation, since never before allocated disk space is always unfragmented. DOS accomplishes this by keeping a pointer to the last allocated cluster on each mounted volume in memory and starts searching for free space from this location upwards instead of at the beginning of the FAT, as it was still done by DOS 2.x. If the end of the FAT is reached, it would wrap around to continue the search at the beginning of the FAT until either free space has been found or the original position has been reached again without having found free space. These pointers are initialized to point to the start of the FATs after bootup, but on FAT32 volumes, DOS 7.1 and higher will attempt to retrieve the last position from the FS Information Sector. This mechanism is defeated, however, if an application often deletes and recreates temporary files as the operating system would then try to maintain the integrity of void data effectively causing more fragmentation in the end. In some DOS versions, the usage of a special API function to create temporary files can be used to avoid this problem.

    Additionally, directory entries of deleted files will be marked 0xE5 since DOS 3.0. DOS 5.0 and higher will start to reuse these entries only when previously unused directory entries have been used up in the table and the system would otherwise have to expand the table itself.

    Since DOS 3.3 the operating system provides means to improve the performance of file operations with FASTOPEN by keeping track of the position of recently opened files or directories in various forms of lists (MS-DOS/PC DOS) or hash tables (DR-DOS), which can reduce file seek and open times significantly. Before DOS 5.0 special care must be taken when using such mechanisms in conjunction with disk defragmentation software bypassing the file system or disk drivers.

    Windows NT will allocate disk space to files on FAT in advance, selecting large contiguous areas, but in case of a failure, files which were being appended will appear larger than they were ever written into, with a lot of random data at the end.

    Other high-level mechanisms may read in and process larger parts or the complete FAT on startup or on demand when needed and dynamically build up in-memory tree representations of the volume's file structures different from the on-disk structures. This may, on volumes with many free clusters, occupy even less memory than an image of the FAT itself. In particular on highly fragmented or filled volumes, seeks become much faster than with linear scans over the actual FAT, even if an image of the FAT would be stored in memory. Also, operating on the logically high level of files and cluster-chains instead of on sector or track level, it becomes possible to avoid some degree of file fragmentation in the first place or to carry out local file defragmentation and reordering of directory entries based on their names or access patterns in the background.

    Some of the perceived problems with fragmentation of FAT file systems also result from performance limitations of the underlying block device drivers, which becomes more visible the lesser memory is available for sector buffering and track blocking/deblocking:

    While the single-tasking DOS had provisions for multi-sector reads and track blocking/deblocking, the operating system and the traditional PC hard disk architecture (only one outstanding input/output request at a time and no DMA transfers) originally did not contain mechanisms which could alleviate fragmentation by asynchronously prefetching next data while the application was processing the previous chunks. Such features became available later. Later DOS versions also provided built-in support for look-ahead sector buffering and came with dynamically loadable disk caching programs working on physical or logical sector level, often utilizing EMS or XMS memory and sometimes providing adaptive caching strategies or even run in protected mode through DPMS or Cloaking to increase performance by gaining direct access to the cached data in linear memory rather than through conventional DOS APIs.

    Write-behind caching was often not enabled by default with Microsoft software (if present) given the problem of data loss in case of a power failure or crash, made easier by the lack of hardware protection between applications and the system.

    Directory table

    A directory table is a special type of file that represents a directory (also known as a folder). Since 86-DOS 0.42, each file or (since MS-DOS 1.40 and PC DOS 2.0) subdirectory stored within it is represented by a 32-byte entry in the table. Each entry records the name, extension, attributes (archive, directory, hidden, read-only, system and volume), the address of the first cluster of the file/directory's data, the size of the file/directory, and the date and (since PC DOS 1.1) also the time of last modification. Earlier versions of 86-DOS used 16-byte directory entries only, supporting no files larger than 16 MB and no time of last modification.

    Aside from the root directory table in FAT12 and FAT16 file systems, which occupies the special Root Directory Region location, all directory tables are stored in the data region. The actual number of entries in a directory stored in the data region can grow by adding another cluster to the chain in the FAT.

    The FAT file system itself does not impose any limits on the depth of a subdirectory tree for as long as there are free clusters available to allocate the subdirectories, however, the internal Current Directory Structure (CDS) under MS-DOS/PC DOS limits the absolute path of a directory to 66 characters (including the drive letter, but excluding the NUL byte delimiter), thereby limiting the maximum supported depth of subdirectories to 32, whatever occurs earlier. Concurrent DOS, Multiuser DOS and DR DOS 3.31 to 6.0 (up to including the 1992-11 updates) do not store absolute paths to working directories internally and therefore do not show this limitation. The same applies to Atari GEMDOS, but the Atari Desktop does not support more than 8 sub-directory levels. Most applications aware of this extension support paths up to at least 127 bytes. FlexOS, 4680 OS and 4690 OS support a length of up to 127 bytes as well, allowing depths down to 60 levels. PalmDOS, DR DOS 6.0 (since BDOS 7.1) and higher, Novell DOS, and OpenDOS sport a MS-DOS-compatible CDS and therefore have the same length limits as MS-DOS/PC DOS.

    Each entry can be preceded by "fake entries" to support a VFAT long filename (LFN); see further below.

    Legal characters for DOS short filenames include the following:

  • Upper case letters AZ
  • Numbers 09
  • Space (though trailing spaces in either the base name or the extension are considered to be padding and not a part of the file name; also filenames with space in them could not easily be used on the DOS command line prior to Windows 95 because of the lack of a suitable escaping system). Another exception are the internal commands MKDIR/MD and RMDIR/RD under DR-DOS which accept single arguments and therefore allow spaces to be entered.
  • ! # $ % & ' ( ) - @ ^ _ ` { } ~
  • Characters 128–228
  • Characters 230–255
  • This excludes the following ASCII characters:

  • " * / : < > ? |
    Windows/MS-DOS has no shell escape character
  • + , . ; = [ ]
    Allowed in long file names only
  • Lower case letters az
    Stored as AZ; allowed in long file names
  • Control characters 0–31
  • Character 127 (DEL)
  • Character 229 (0xE5) was not allowed as first character in a filename in DOS 1 and 2 due to its use as free entry marker. A special case was added to circumvent this limitation with DOS 3.0 and higher.

    The following additional characters are allowed on Atari's GEMDOS, but should be avoided for compatibility with MS-DOS/PC DOS:

  • " + , ; < = > [ ] |
  • The semicolon (;) should be avoided in filenames under DR DOS 3.31 and higher, PalmDOS, Novell DOS, OpenDOS, Concurrent DOS, Multiuser DOS, System Manager and REAL/32, because it may conflict with the syntax to specify file and directory passwords: "...DIRSPEC.EXT;DIRPWDFILESPEC.EXT;FILEPWD". The operating system will strip off one (and also two—since DR-DOS 7.02) semicolons and pending passwords from the filenames before storing them on disk. (The command processor 4DOS uses semicolons for include lists and requires the semicolon to be doubled for password protected files with any commands supporting wildcards.)

    The at-sign character (@) is used for filelists by many DR-DOS, PalmDOS, Novell DOS, OpenDOS and Multiuser DOS, System Manager and REAL/32 commands, as well as by 4DOS and may therefore sometimes be difficult to use in filenames.

    Under Multiuser DOS and REAL/32, the exclamation mark (!) is not a valid filename character since it is used to separate multiple commands in a single command line.

    Under IBM 4680 OS and 4690 OS, the following characters are not allowed in filenames:

  • ? * : . ; , [ ] ! + = < > " - / |
  • Additionally, the following special characters are not allowed in the first, fourth, fifth and eight character of a filename, as they conflict with the host command processor (HCP) and input sequence table build file names:

  • @ # ( ) { } $ &
  • The DOS file names are in the current OEM character set: this can have surprising effects if characters handled in one way for a given code page are interpreted differently for another code page (DOS command CHCP) with respect to lower and upper case, sorting, or validity as file name character.

    Directory entry

    Before Microsoft added support for long filenames and creation/access time stamps, bytes 0x0C0x15 of the directory entry were used by other operating systems to store additional metadata, most notably the operating systems of the Digital Research family stored file passwords, access rights, owner IDs, and file deletion data there. While Microsoft's newer extensions are not fully compatible with these extensions by default, most of them can coexist in third-party FAT implementations (at least on FAT12 and FAT16 volumes).

    32-byte directory entries, both in the Root Directory Region and in subdirectories, are of the following format (see also 8.3 filename):

    The FlexOS-based operating systems IBM 4680 OS and IBM 4690 OS support unique distribution attributes stored in some bits of the previously reserved areas in the directory entries:

    1. Local: Don't distribute file but keep on local controller only.
    2. Mirror file on update: Distribute file to server only when file is updated.
    3. Mirror file on close: Distribute file to server only when file is closed.
    4. Compound file on update: Distribute file to all controllers when file is updated.
    5. Compound file on close: Distribute file to all controllers when file is closed.

    Some incompatible extensions found in some operating systems include:

    VFAT long file names

    VFAT Long File Names (LFNs) are stored on a FAT file system using a trick—adding (possibly multiple) additional entries into the directory before the normal file entry. The additional entries are marked with the Volume Label, System, Hidden, and Read Only attributes (yielding 0x0F), which is a combination that is not expected in the MS-DOS environment, and therefore ignored by MS-DOS programs and third-party utilities. Notably, a directory containing only volume labels is considered as empty and is allowed to be deleted; such a situation appears if files created with long names are deleted from plain DOS. This method is very similar to the DELWATCH method to utilize the volume attribute to hide pending delete files for possible future undeletion since DR DOS 6.0 (1991) and higher. It is also similar to a method publicly discussed to store long filenames on Ataris and under Linux in 1992.

    Because older versions of DOS could mistake LFN names in the root directory for the volume label, VFAT was designed to create a blank volume label in the root directory before adding any LFN name entries (if a volume label did not already exist).

    Each phony entry can contain up to 13 UCS-2 characters (26 bytes) by using fields in the record which contain file size or time stamps (but not the starting cluster field, for compatibility with disk utilities, the starting cluster field is set to a value of 0. See 8.3 filename for additional explanations). Up to 20 of these 13-character entries may be chained, supporting a maximum length of 255 UCS-2 characters.

    After the last UCS-2 character, a 0x0000 is added. The remaining unused characters are filled with 0xFFFF.

    LFN entries use the following format:

    If there are multiple LFN entries, required to represent a file name, firstly comes the last LFN entry (the last part of the filename). The sequence number also has bit 6 (0x40) set (this means the last LFN entry, however it's the first entry seen when reading the directory file). The last LFN entry has the largest sequence number which decreases in following entries. The first LFN entry has sequence number 1. A value of 0xE5 is used to indicate that the entry is deleted.

    On FAT12 and FAT16 volumes, testing for the values at 0x1A to be zero and at 0x1C to be non-zero can be used to distinguish between VFAT LFNs and pending delete files under DELWATCH.

    For example, a filename like "File with very long filename.ext" would be formatted like this:

    A checksum also allows verification of whether a long file name matches the 8.3 name; such a mismatch could occur if a file was deleted and re-created using DOS in the same directory position. The checksum is calculated using the algorithm below. (Note that pFCBName is a pointer to the name as it appears in a regular directory entry, i.e. the first eight characters are the filename, and the last three are the extension. The dot is implicit. Any unused space in the filename is padded with space characters (ASCII 0x20). For example, "Readme.txt" would be "README␠␠TXT".)

    If a filename contains only lowercase letters, or is a combination of a lowercase basename with an uppercase extension, or vice versa; and has no special characters, and fits within the 8.3 limits, a VFAT entry is not created on Windows NT and later versions of Windows such as XP. Instead, two bits in byte 0x0C of the directory entry are used to indicate that the filename should be considered as entirely or partially lowercase. Specifically, bit 4 means lowercase extension and bit 3 lowercase basename, which allows for combinations such as "example.TXT" or "HELLO.txt" but not "Mixed.txt". Few other operating systems support it. This creates a backwards-compatibility problem with older Windows versions (Windows 95 / 98 / 98 SE / ME) that see all-uppercase filenames if this extension has been used, and therefore can change the name of a file when it is transported between operating systems, such as on a USB flash drive. Current 2.6.x versions of Linux will recognize this extension when reading (source: kernel 2.6.18 /fs/fat/dir.c and fs/vfat/namei.c); the mount option shortname determines whether this feature is used when writing.

    References

    Design of the FAT file system Wikipedia