FAT Filesystem

Creating a FAT12 sample file

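On Ubuntu, the following commands (taken from https://superuser.com/questions/668485/creating-a-fat-file-system-and-save-it-into-a-file-in-gnu-linux) create a FAT formatted image file on your hard drive. Replace the SIZE placeholder with, for example, 2048 for a 2 MB file. The of= argument sets the name of the output file. mkfs.vfat picks the FAT variant based on the size of the image, so pass -F 12 if you want to force FAT12. The last command mounts the image; it is optional and usually requires root privileges.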

dd if=/dev/zero of=fat.fs bs=1024 count=SIZE

mkfs.vfat fat.fs

mount -o loop fat.fs /mnt

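On macOS, which is a BSD-derived Unix system, filesystems are created with commands named newfs_ followed by the filesystem type, rather than with mkfs. The type can be one of hfs, msdos, exfat or udf.

To create a FAT12 image, first create an empty file the size of a 1.44 MB floppy disk: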

dd if=/dev/zero of=floppy.img bs=1024 count=1440

Now attach floppy.img to a device node (without mounting it: the image does not contain a filesystem yet, and only a filesystem can be mounted)

hdiutil attach -nomount floppy.img
The command above prints the device node that the image was attached to, e.g. /dev/disk2

Now, using the device node (e.g. /dev/disk2), you can call newfs_msdos to create a filesystem on the attached image.

newfs_msdos -F 12 -v vollabel /dev/disk2


Detach the image from the device node again

hdiutil detach /dev/disk2

Check the image

hdiutil attach -readonly floppy.img

Now mount the image

diskutil list
mount -t msdos /dev/disk2 ./mnt

On my machine, the mount command failed with this error:

mount_msdos: /dev/disk2 on /Users/bischowg/dev/osdev/fat/resources/mnt: Invalid argument

The file contains data that matches the description for FAT12 given in

The filesystem in the image is initially empty. To store a file, mount the image and copy the file into it.

FAT12 structure

FAT12 is meant for small filesystems and was mainly used on floppy disks. On a standard 1.44 MB floppy, FAT12 has a sector to cluster ratio of one: a cluster contains a single sector, so on such a volume sectors and clusters can be used interchangeably. This is not guaranteed in general. The sectors per cluster field of the BPB (see below) gives the actual ratio, and larger FAT12 volumes as well as FAT16 and FAT32 volumes usually group several sectors into one cluster.

A FAT12 file system is made up of four parts.

  1. Reserved Sectors – the first sector is the boot sector and contains the BIOS parameter block (BPB) (see below)
  2. File Allocation Table (FAT) – the BIOS parameter block describes how many copies of the FAT are stored. FATs are stored redundantly so that the disk stays readable if one FAT gets corrupted.
  3. Root Directory – the top level directory of the volume
  4. Data Area – stores the raw data of the files and directories

The Boot Sector and the BIOS Parameter Block

The FAT12 filesystem starts off with reserved sectors. There is usually only a single reserved sector, which is the boot sector. The boot sector stores the BIOS Parameter Block (BPB), which contains the general information you need to navigate the FAT12 volume.

The first three bytes contain a jump instruction that makes the CPU skip over the BPB to the boot code, should it ever be told to execute the contents of the first sector.

The next eight bytes contain the OEM Name, a label that is usually padded with spaces if the content is shorter than eight bytes. As far as I can tell, the content is not relevant and can be ignored when reading a FAT12 image.

The following bytes contain the BIOS Parameter Block (BPB). A very good visualization of the BPB is given in https://thestarman.pcministry.com/asm/mbr/GRUBbpb.htm

https://jdebp.eu/FGA/bios-parameter-block.html says:

Because they were originally designed for use on IBM PC compatible machines with Intel CPUs, all of the (integer) fields in BPBs are little-endian.

https://en.wikipedia.org/wiki/Endianness says:

A little-endian ordering places the least significant byte first and the most significant byte last, while a big-endian ordering does the opposite.

Integers in this structure are stored little-endian on the disk. That means that if you read a 16-bit word whose bytes on disk are 0x00 0x02, the value is 0x0200 = 512 (decimal): the first byte, 0x00, is the least significant byte and the second byte, 0x02, is the most significant byte.

The macros bswap_16() and bswap_32() from byteswap.h (glibc) can be used to convert endianness if needed. On little-endian CPUs such as those from Intel and AMD, there is no need to convert, because the CPU already reads multi-byte integers in this byte order.
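If the code should not depend on the byte order of the host CPU, the values can be assembled byte by byte. The following is a minimal sketch of that idea; the helper names read_le16 and read_le32 are my own and do not come from any library.

```c
#include <stdint.h>

// Assemble a 16-bit value from two bytes stored least significant byte first.
static uint16_t read_le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

// Assemble a 32-bit value from four bytes stored least significant byte first.
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

For the bytes 0x00 0x02 from the example above, read_le16 returns 512 on any host, little-endian or big-endian.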

A packed structure that describes the Jump, the OEM Name and the BPB is:

#include <stdint.h>

typedef struct __attribute__((packed))
{
    unsigned char jmpBoot[3]; // Jump instruction to the boot code
    unsigned char oemName[8]; // OEM name
    uint16_t bytesPerSec; // Bytes per logical sector
    uint8_t secPerClus; // Logical sectors per cluster
    uint16_t rsvdSecCnt; // Reserved logical sectors
    uint8_t numFats; // Number of FATs
    uint16_t rootEntCnt; // Root directory entries
    uint16_t totSec16; // Total logical sectors
    uint8_t media; // Media descriptor (unsigned, e.g. 0xF0)
    uint16_t fatSz16; // Logical sectors per FAT
    uint16_t secPerTrk; // Sectors per track
    uint16_t numHeads; // Number of heads
    uint32_t hiddSec; // Hidden sectors
    uint32_t totSec32; // Total logical sectors if totSec16 is 0
} bios_parameter_block;
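Casting a buffer to a packed structure relies on a compiler extension and on a little-endian host. As an alternative, the fields can be decoded from their byte offsets in the boot sector. The sketch below assumes the standard offsets (bytes per sector at offset 11, and so on up to byte 35). The type bpb_info and the function bpb_parse are hypothetical names chosen for this example.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical unpacked view of the BPB fields needed to navigate the volume.
typedef struct {
    uint16_t bytesPerSec;
    uint8_t  secPerClus;
    uint16_t rsvdSecCnt;
    uint8_t  numFats;
    uint16_t rootEntCnt;
    uint32_t totSec;   // totSec16, or totSec32 if totSec16 is zero
    uint8_t  media;
    uint16_t fatSz16;
} bpb_info;

static uint16_t le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

// Decode the BPB from the first 36 bytes of the boot sector.
// Returns 0 on success, -1 if the buffer is too short or a field is zero
// that must not be zero.
static int bpb_parse(const unsigned char *sec, size_t len, bpb_info *out)
{
    if (len < 36)
        return -1;
    out->bytesPerSec = le16(sec + 11);
    out->secPerClus  = sec[13];
    out->rsvdSecCnt  = le16(sec + 14);
    out->numFats     = sec[16];
    out->rootEntCnt  = le16(sec + 17);
    uint16_t tot16   = le16(sec + 19);
    out->media       = sec[21];
    out->fatSz16     = le16(sec + 22);
    out->totSec      = tot16 != 0 ? tot16 : le32(sec + 32);
    if (out->bytesPerSec == 0 || out->secPerClus == 0 || out->numFats == 0)
        return -1;
    return 0;
}
```

The buffer would typically be filled by reading the first 512 bytes of the image file, for example with fread.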

Data Area, Clusters, Sectors, FAT

Files and directories are stored the same way in FAT: their contents live in clusters, and clusters are made up of sectors. The content of a directory is a table of directory entries, one for each file or subdirectory it contains. In a directory entry, a directory flag distinguishes a subdirectory from a regular file.
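To make the directory flag concrete, here is a minimal sketch that classifies a raw 32-byte directory entry. It assumes the short-name entry layout of the Microsoft FAT specification (attribute byte at offset 11, first cluster at offset 26, file size at offset 28), which this article has not introduced. The names classify_entry, entry_first_cluster and entry_file_size are hypothetical.

```c
#include <stdint.h>

enum {
    ATTR_VOLUME_ID = 0x08, // entry holds the volume label
    ATTR_DIRECTORY = 0x10, // entry describes a subdirectory
    ATTR_LONG_NAME = 0x0F  // entry is part of a long file name
};

typedef enum {
    ENTRY_END,          // first byte 0x00: no more entries follow
    ENTRY_FREE,         // first byte 0xE5: deleted entry
    ENTRY_LONG_NAME,
    ENTRY_VOLUME_LABEL,
    ENTRY_DIRECTORY,
    ENTRY_FILE
} entry_kind;

// Classify one 32-byte directory entry.
static entry_kind classify_entry(const unsigned char *e)
{
    if (e[0] == 0x00)
        return ENTRY_END;
    if (e[0] == 0xE5)
        return ENTRY_FREE;
    unsigned attr = e[11];
    if ((attr & 0x3F) == ATTR_LONG_NAME)
        return ENTRY_LONG_NAME;
    if (attr & ATTR_VOLUME_ID)
        return ENTRY_VOLUME_LABEL;
    if (attr & ATTR_DIRECTORY)
        return ENTRY_DIRECTORY;
    return ENTRY_FILE;
}

// Index of the first cluster of the file or directory (little-endian).
static uint16_t entry_first_cluster(const unsigned char *e)
{
    return (uint16_t)(e[26] | (e[27] << 8));
}

// File size in bytes (little-endian); zero for directories.
static uint32_t entry_file_size(const unsigned char *e)
{
    return (uint32_t)e[28] | ((uint32_t)e[29] << 8)
         | ((uint32_t)e[30] << 16) | ((uint32_t)e[31] << 24);
}
```

A directory reader would step through the table in units of 32 bytes and stop at the first entry classified as ENTRY_END.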

Files and folders are organized in one or more clusters connected to each other (a cluster chain). Clusters contain sectors; how many is given by the sectors per cluster field of the BPB (one on a standard 1.44 MB floppy). If a file or folder fits into one cluster, one cluster suffices. If it is larger than one cluster, several clusters are chained together. The pointer to the next cluster is not stored in the cluster itself but in the FAT: indexed with a cluster id, the FAT tells you whether there is a next cluster, whether the cluster is faulty or whether it is the last cluster in a cluster chain.

A File Allocation Table (FAT) maintains a list of all the clusters that belong to files and directories. The FAT is a map from logical cluster indexes to logical cluster indexes. If you put a logical cluster index into the FAT, the FAT gives you the next logical cluster index in the chain. That means the FAT describes chains of clusters. If a file or a folder is too large for one cluster, it is split up and stored in several clusters. The FAT does not store the contents of a file or folder; it stores the cluster chain that tells you where the contents are located in the data area.
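The lookup can be sketched as follows. The sketch assumes the FAT12 entry encoding from the Microsoft FAT specification, which is not described above: each entry is 12 bits wide, two entries share three bytes, the value 0xFF7 marks a bad cluster and values from 0xFF8 upward mark the end of a chain. The function names are hypothetical, and the caller is assumed to pass a buffer that holds one complete copy of the FAT.

```c
#include <stddef.h>
#include <stdint.h>

enum {
    FAT12_BAD     = 0xFF7, // cluster is marked as faulty
    FAT12_EOC_MIN = 0xFF8  // 0xFF8..0xFFF: last cluster of a chain
};

// Return the 12-bit FAT entry for the given cluster.
static uint16_t fat12_entry(const unsigned char *fat, uint16_t cluster)
{
    size_t off = (size_t)cluster + cluster / 2; // cluster * 1.5, rounded down
    uint16_t pair = (uint16_t)(fat[off] | (fat[off + 1] << 8));
    // Odd entries use the upper 12 bits, even entries the lower 12 bits.
    return (cluster & 1) ? (uint16_t)(pair >> 4) : (uint16_t)(pair & 0x0FFF);
}

// Follow the chain that starts at `first` and store the visited clusters
// in `out`. Returns the chain length, or -1 if the chain runs into a free,
// reserved or bad entry or is longer than `max` (possibly a loop).
static int fat12_chain(const unsigned char *fat, uint16_t first,
                       uint16_t *out, int max)
{
    int n = 0;
    uint16_t c = first;
    while (c >= 2 && c < FAT12_BAD) {
        if (n == max)
            return -1;
        out[n++] = c;
        c = fat12_entry(fat, c);
    }
    return c >= FAT12_EOC_MIN ? n : -1;
}
```

The limit `max` guards against corrupted tables in which a chain points back to itself.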

The File Allocation Table is stored redundantly (in several copies) in order to keep the files accessible even if one of the copies of the FAT gets corrupted. If files are created, deleted or resized, all copies of the FAT have to be updated.
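Updating all copies could look like the sketch below. It assumes that the copies have been loaded into one contiguous buffer in the same order as on disk, each copy fat_bytes long, and that entries are 12 bits wide with two entries sharing three bytes, as defined in the Microsoft FAT specification. The function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

// Write a 12-bit value into one FAT copy without disturbing the
// neighbouring entry that shares a byte with it.
static void fat12_set_entry(unsigned char *fat, uint16_t cluster, uint16_t value)
{
    size_t off = (size_t)cluster + cluster / 2;
    if (cluster & 1) {
        // Odd entry: upper nibble of the first byte plus the whole second byte.
        fat[off]     = (unsigned char)((fat[off] & 0x0F) | ((value << 4) & 0xF0));
        fat[off + 1] = (unsigned char)((value >> 4) & 0xFF);
    } else {
        // Even entry: the whole first byte plus the lower nibble of the second.
        fat[off]     = (unsigned char)(value & 0xFF);
        fat[off + 1] = (unsigned char)((fat[off + 1] & 0xF0) | ((value >> 8) & 0x0F));
    }
}

// Apply the same change to every copy of the FAT.
static void fat12_set_entry_all(unsigned char *fats, size_t fat_bytes,
                                unsigned num_fats, uint16_t cluster,
                                uint16_t value)
{
    for (unsigned i = 0; i < num_fats; i++)
        fat12_set_entry(fats + (size_t)i * fat_bytes, cluster, value);
}
```

After the in-memory update, the buffer still has to be written back to the FAT region of the image.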

Reading a file from FAT12

The strategy for reading a file from a FAT12 file system is as follows:

  1. Read the boot sector and the BIOS parameter block therein to get general information about the file system
  2. Make sure that the file system is FAT12 and not FAT16 or FAT32
  3. Compute the offset to the root directory using the count of reserved sectors, the number of FAT copies, the size of a FAT in sectors and the size of a sector in bytes: offset = (reserved sectors + number of FATs * sectors per FAT) * bytes per sector. All this information is contained in the BPB
  4. Read the top level entries from the root directory. The root directory is one of the four major parts of a FAT12 volume (boot sector, FATs, root directory, data area). The root directory contains several directory entries. A directory entry points to a file, a folder or a volume label. After reading one of the directory entries, you get the index of the first cluster of the cluster chain that stores the file or folder that the entry points to. Using the first cluster's index (which is a logical index), you can do two things. First, you can index the FAT to follow the cluster chain. Second, you can read clusters and sectors from the data area after converting the logical index to a physical index. Reading from the data area allows you to access a file’s raw data or the directory entries of a sub-directory.
  5. Index the FAT to follow the cluster chain that starts with the cluster referenced by the root directory entry.
  6. Read the data from the data area. The data belongs to either a file or a folder. In order to read from the data area, you have to convert the logical cluster index into a physical cluster index. The first two FAT entries are reserved, so logical cluster 2 is the first cluster of the data area and the physical index is the logical index minus two. Given that physical cluster index, you can compute an offset in bytes from the start of the volume and read the bytes from that cluster. For a file, the clusters contain the raw data stored in the file. For a directory, the clusters store an array of directory entries.
  7. If the data is a directory, it contains the same kind of directory entries that are also stored in the root directory. You can use the directory entries to dive deeper into the directory tree, to move up the directory tree (by changing to the entry called .. which denotes the parent folder) or to access files stored in the current directory. The root directory does not have a .. entry. For folders located in the root directory, the .. directory entry stores a logical cluster index of zero, with zero being a placeholder for the fact that the root directory is not stored in the data area and hence there is no physical cluster index to compute.
  8. For a file, visit all the clusters in the file’s cluster chain and read the bytes into a buffer. Return the buffer to the caller.
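Steps 2, 3 and 6 come down to a few lines of arithmetic on the BPB values. The sketch below shows one way to write them down. The FAT12 threshold of fewer than 4085 clusters and the 32-byte size of a directory entry are taken from the Microsoft FAT specification and are not stated above. The type fat_layout and the function names are hypothetical.

```c
#include <stdint.h>

// Hypothetical summary of where the parts of the volume start, in sectors.
typedef struct {
    uint32_t fatStartSec;     // first sector of the first FAT
    uint32_t rootDirStartSec; // first sector of the root directory
    uint32_t rootDirSectors;  // size of the root directory in sectors
    uint32_t dataStartSec;    // first sector of the data area
    uint32_t clusterCount;    // number of clusters in the data area
} fat_layout;

static fat_layout fat_compute_layout(uint16_t bytesPerSec, uint8_t secPerClus,
                                     uint16_t rsvdSecCnt, uint8_t numFats,
                                     uint16_t rootEntCnt, uint16_t fatSz16,
                                     uint32_t totSec)
{
    fat_layout l;
    l.fatStartSec = rsvdSecCnt;
    l.rootDirStartSec = rsvdSecCnt + (uint32_t)numFats * fatSz16;
    // Each directory entry is 32 bytes; round up to whole sectors.
    l.rootDirSectors =
        ((uint32_t)rootEntCnt * 32 + bytesPerSec - 1) / bytesPerSec;
    l.dataStartSec = l.rootDirStartSec + l.rootDirSectors;
    l.clusterCount = (totSec - l.dataStartSec) / secPerClus;
    return l;
}

// Step 2: the FAT type follows from the cluster count alone.
static int fat_is_fat12(const fat_layout *l)
{
    return l->clusterCount < 4085;
}

// Step 6: byte offset of a logical cluster from the start of the volume.
// Logical cluster 2 is the first cluster of the data area.
static uint32_t cluster_to_byte_offset(const fat_layout *l,
                                       uint16_t bytesPerSec,
                                       uint8_t secPerClus, uint16_t cluster)
{
    uint32_t sector = l->dataStartSec + (uint32_t)(cluster - 2) * secPerClus;
    return sector * bytesPerSec;
}
```

For a 1.44 MB floppy (512 bytes per sector, 1 reserved sector, 2 FATs of 9 sectors each, 224 root entries, 2880 sectors), the root directory starts at sector 19, which is byte offset 9728, and the data area starts at sector 33.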
