Filesystem Management

View slides

Lecture objectives:

  • Filesystem abstractions
  • Filesystem operations
  • Linux VFS
  • Overview of Linux I/O Management

Filesystem Abstractions

A filesystem is a way to organize files and directories on storage devices such as hard disks, SSDs or flash memory. There are many types of filesystems (e.g. FAT, ext4, btrfs, NTFS) and on a running system we can have multiple instances of the same filesystem type in use.

While filesystems use different data structures to organize the files, directories, user data and meta (internal) data on storage devices, there are a few common abstractions used in almost all filesystems:

  • superblock
  • file
  • inode
  • dentry

Some of these abstractions are present both on disk and in memory while some are only present in memory.

The superblock abstraction contains information about a filesystem instance, such as the block size, the root inode and the filesystem size. It is present both on storage and in memory (for caching purposes).

The file abstraction contains information about an opened file such as the current file pointer. It only exists in memory.

The inode identifies a file on storage. It exists both on storage and in memory (for caching purposes). An inode identifies a file in a unique way and has various properties such as the file size, access rights, file type, etc.


The file name is not a property of the file.

The dentry associates a name with an inode. It exists both on storage and in memory (for caching purposes).

The following diagram shows the relationship between the various filesystem abstractions as they are used in memory:


Note that not all of the one to many relationships between the various abstractions are depicted.

Multiple file descriptors can point to the same file because we can use the dup() system call to duplicate a file descriptor.

Multiple file abstractions can point to the same dentry if we open the same path multiple times.

Multiple dentries can point to the same inode when hard links are used.
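These sharing relationships can be observed from userspace. Below is a minimal sketch (the temporary file path is caller-supplied and arbitrary): dup() duplicates a descriptor so both entries share one file abstraction, and thus one file pointer, while a second open() of the same path creates an independent file abstraction with its own offset.

```c
#include <fcntl.h>
#include <unistd.h>

/* Returns 0 if the observed offsets match the description above. */
int offset_demo(const char *path)
{
    int fd1, fd2, dupfd;
    char buf[4];

    fd1 = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd1 < 0)
        return -1;
    if (write(fd1, "abcdef", 6) != 6)
        return -1;
    lseek(fd1, 0, SEEK_SET);

    /* dup(): both descriptors point to the same file abstraction */
    dupfd = dup(fd1);
    read(fd1, buf, 2);                   /* advances the shared offset */
    if (lseek(dupfd, 0, SEEK_CUR) != 2)  /* visible via the duplicate  */
        return -1;

    /* a second open(): a new file abstraction with its own offset */
    fd2 = open(path, O_RDONLY);
    if (lseek(fd2, 0, SEEK_CUR) != 0)
        return -1;

    close(fd1); close(dupfd); close(fd2);
    unlink(path);
    return 0;
}
```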

The following diagram shows the relationship of the filesystem abstraction on storage:


The diagram shows that the superblock is typically stored at the beginning of the filesystem and that various blocks are used with different purposes: some to store dentries, some to store inodes and some to store user data blocks. There are also blocks used to manage the available free blocks (e.g. bitmaps for the simple filesystems).

The next diagram shows a very simple filesystem where blocks are grouped together by function:

  • the superblock contains information about the block size as well as the IMAP, DMAP, IZONE and DZONE areas.
  • the IMAP area is comprised of multiple blocks which contain a bitmap for inode allocation; it maintains the allocated/free state for all inodes in the IZONE area
  • the DMAP area is comprised of multiple blocks which contain a bitmap for data block allocation; it maintains the allocated/free state for all blocks in the DZONE area
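The IMAP and DMAP areas can be managed with straightforward bit operations. A minimal sketch (helper names are illustrative, not taken from any particular filesystem):

```c
#include <stdint.h>

/* Allocate the first free object tracked by an IMAP/DMAP-style bitmap:
 * scan for a clear bit, set it, and return its index (-1 when full).
 * 'nbits' is the number of inodes/blocks the bitmap covers. */
int bitmap_alloc(uint8_t *map, int nbits)
{
    for (int i = 0; i < nbits; i++) {
        if (!(map[i / 8] & (1 << (i % 8)))) {
            map[i / 8] |= 1 << (i % 8);
            return i;
        }
    }
    return -1;
}

/* Free a previously allocated index by clearing its bit. */
void bitmap_free(uint8_t *map, int idx)
{
    map[idx / 8] &= ~(1 << (idx % 8));
}
```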



Filesystem Operations

The following diagram shows a high level overview of how the filesystem drivers interact with the rest of the filesystem "stack". In order to support multiple filesystem types and instances, Linux implements a large and complex subsystem that deals with filesystem management. This is called the Virtual File System (or sometimes Virtual File Switch) and is abbreviated VFS.


VFS translates the complex file management related system calls into simpler operations that are implemented by the filesystem device drivers. These are some of the operations that a filesystem must implement:

  • Mount
  • Open a file
  • Querying file attributes
  • Reading data from a file
  • Writing data to a file
  • Creating a file
  • Deleting a file

The next sections will look in-depth at some of these operations.

Mounting a filesystem

A summary of a typical implementation is presented below:

  • Input: a storage device (partition)
  • Output: dentry pointing to the root directory
  • Steps: check device, determine filesystem parameters, locate the root inode
  • Example: check magic, determine block size, read the root inode and create dentry

Opening a file

A summary of a typical implementation is presented below:

  • Input: path
  • Output: file descriptor
  • Steps:
    • Determine the filesystem type
    • For each name in the path: lookup parent dentry, load inode, load data, find dentry
    • Create a new file that points to the last dentry
    • Find a free entry in the file descriptor table and point it to the new file
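The per-component lookup loop above can be sketched in isolation. In a real implementation each visited name would trigger a dentry lookup in the parent directory and an inode load; the sketch below only walks the names (the helper name is illustrative):

```c
#include <string.h>

/* Walk a path component by component, as the open() lookup loop does.
 * In a real filesystem each name would be looked up as a dentry in its
 * parent directory and the matching inode loaded; here we only count
 * the visited names. */
int count_components(const char *path)
{
    char copy[256];
    int n = 0;

    strncpy(copy, path, sizeof(copy) - 1);
    copy[sizeof(copy) - 1] = '\0';

    for (char *name = strtok(copy, "/"); name; name = strtok(NULL, "/"))
        n++;    /* lookup(parent, name) would happen here */
    return n;
}
```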

Querying file attributes

A summary of a typical implementation is presented below:

  • Input: path
  • Output: file attributes
  • Steps:
    • Access file->dentry->inode
    • Read file attributes from the inode
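From userspace this corresponds to the stat() family of system calls, which return attributes read from the file's inode. A minimal sketch (the path and helper name are arbitrary):

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create a small file, then read its attributes from its inode
 * via stat(). Returns the file size, or -1 on error. */
long stat_size_demo(const char *path)
{
    struct stat st;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;
    write(fd, "hello", 5);
    close(fd);

    if (stat(path, &st) < 0)   /* fills st from the file's inode */
        return -1;
    unlink(path);
    return (long)st.st_size;
}
```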

Reading data from a file

A summary of a typical implementation is presented below:

  • Input: file descriptor, offset, length
  • Output: data
  • Steps:
    • Access file->dentry->inode
    • Determine data blocks
    • Copy data blocks to memory
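The (file descriptor, offset, length) interface above maps directly onto pread(), which reads at an explicit offset without moving the file pointer kept in the file abstraction. A minimal sketch:

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Read 3 bytes at offset 2 with pread(); matches the
 * (file descriptor, offset, length) -> data summary above. */
int read_at_demo(const char *path)
{
    char buf[4] = {0};
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;
    write(fd, "abcdef", 6);
    if (pread(fd, buf, 3, 2) != 3)   /* bytes 2..4: "cde" */
        return -1;
    close(fd);
    unlink(path);
    return strncmp(buf, "cde", 3);
}
```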

Writing data to a file

A summary of a typical implementation is presented below:

  • Input: file descriptor, offset, length, data
  • Output:
  • Steps:
    • Allocate one or more data blocks
    • Add the allocated blocks to the inode and update file size
    • Copy data from userspace to internal buffers and write them to storage
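Writing past the current end of file exercises exactly the steps above: block allocation plus an inode size update, both visible afterwards through fstat(). A minimal sketch:

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Write past the current end of file; the filesystem allocates blocks
 * and updates the inode's size, which fstat() then reflects. */
long write_extend_demo(const char *path)
{
    struct stat st;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;
    pwrite(fd, "xyz", 3, 100);   /* offset 100 -> file grows to 103 */
    if (fstat(fd, &st) < 0)
        return -1;
    close(fd);
    unlink(path);
    return (long)st.st_size;
}
```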

Closing a file

A summary of a typical implementation is presented below:

  • Input: file descriptor
  • Output:
  • Steps:
    • Set the file descriptor table entry to NULL
    • Decrement the file reference counter
    • When the counter reaches 0, free the file
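The close steps above can be sketched with a toy descriptor table and reference counter (the structure and table are illustrative and far simpler than the kernel's):

```c
#include <stdlib.h>

/* Minimal sketch of close(): the descriptor table slot is cleared, the
 * file's reference count drops, and the file object is freed only when
 * the last reference (e.g. one created by dup()) goes away. */
struct file { int refcount; };

struct file *fd_table[16];

int file_close(int fd)
{
    struct file *f = fd_table[fd];

    if (!f)
        return -1;
    fd_table[fd] = NULL;          /* clear the descriptor entry */
    if (--f->refcount == 0)       /* last reference: free the file */
        free(f);
    return 0;
}
```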


Directories are special files which contain one or more dentries.

Creating a file

A summary of a typical implementation is presented below:

  • Input: path
  • Output:
  • Steps:
    • Determine the parent directory inode
    • Read its data blocks and find space for a new dentry
    • Write back the modified directory data blocks

Deleting a file

A summary of a typical implementation is presented below:

  • Input: path
  • Output:
  • Steps:
    • Determine the parent directory inode
    • Read the parent inode's data blocks
    • Find and erase the dentry (checking for hard links)
    • When the last file referencing the inode is closed: deallocate the data and inode blocks
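The "check for links" step matters because several dentries may name one inode. The userspace view is sketched below with link() and unlink() (paths are arbitrary): erasing one dentry decrements the inode's link count, and the inode remains reachable through the other name.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 0 when the link count behaves as described above. */
int link_demo(const char *a, const char *b)
{
    struct stat st;
    int fd;

    unlink(b);                 /* ignore errors; make sure b is free */
    fd = open(a, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    close(fd);

    if (link(a, b) < 0)        /* second dentry, same inode */
        return -1;
    if (stat(a, &st) < 0 || st.st_nlink != 2)
        return -1;

    unlink(a);                 /* erase one dentry */
    if (stat(b, &st) < 0 || st.st_nlink != 1)
        return -1;             /* inode still reachable via b */
    unlink(b);
    return 0;
}
```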

Linux Virtual File System

Although the main purpose of the original introduction of VFS in UNIX kernels was to support multiple filesystem types and instances, a side effect was that it simplified filesystem driver development, since the common parts are now implemented in the VFS. Almost all of the caching and buffer management is dealt with by the VFS, leaving just efficient data storage management to the filesystem driver.

In order to deal with multiple filesystem types, VFS introduced the common filesystem abstractions previously presented. Note that the filesystem driver can also use its own particular filesystem abstractions in memory (e.g. ext4 inode or dentry) and that there might be a different abstraction on storage as well. Thus we may end up with three slightly different filesystem abstractions: one for VFS, always in memory, and two for a particular filesystem: one in memory used by the filesystem driver, and one on storage.
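A common way a driver ties its own in-memory inode to the VFS one is to embed the generic structure inside the filesystem-specific one and recover the outer object with the kernel's container_of() pattern. A minimal userspace sketch (structure and helper names are illustrative):

```c
#include <stddef.h>

/* The driver's inode embeds the generic VFS part; given a pointer to
 * the embedded member, the outer structure is recovered by subtracting
 * the member's offset (the container_of() idiom). */
struct vfs_inode { unsigned long i_ino; long i_size; };

struct myfs_inode {
    unsigned int data_block;     /* filesystem-specific field */
    struct vfs_inode vfs;        /* embedded generic part */
};

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

static inline struct myfs_inode *MYFS_I(struct vfs_inode *inode)
{
    return container_of(inode, struct myfs_inode, vfs);
}
```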


Superblock Operations

VFS requires that all filesystems implement a set of "superblock operations".

They deal with initializing, updating and freeing the VFS superblock:

  • fill_super() - reads the filesystem statistics (e.g. total number of inodes, number of free inodes, total number of blocks, number of free blocks)
  • write_super() - updates the superblock information on storage (e.g. updating the number of free inodes or data blocks)
  • put_super() - frees any data associated with the filesystem instance; called when unmounting a filesystem

The next class of operations deals with manipulating filesystem inodes. These operations receive VFS inodes as parameters, but the filesystem driver may use its own inode structures internally and, if so, will convert between them as necessary.

A summary of the superblock operations is presented below:

  • fill_super
  • put_super
  • write_super
  • read_inode
  • write_inode
  • evict_inode
  • statfs
  • remount_fs
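Mechanically, these operations form a table of function pointers that the driver hands to VFS at mount time. A trimmed-down, illustrative sketch (names modeled on the list above, signatures heavily simplified compared to the kernel's):

```c
/* VFS calls the driver only through this table, so each filesystem can
 * plug in its own implementations. */
struct super_block;   /* forward declaration */

struct super_operations {
    int  (*fill_super)(struct super_block *sb);
    void (*put_super)(struct super_block *sb);
    int  (*write_super)(struct super_block *sb);
};

struct super_block {
    unsigned long block_size;
    const struct super_operations *s_op;
};

/* A hypothetical driver fills in its own implementation. */
static int myfs_fill_super(struct super_block *sb)
{
    sb->block_size = 4096;    /* read from storage in a real driver */
    return 0;
}

static const struct super_operations myfs_sops = {
    .fill_super = myfs_fill_super,
};
```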

Inode Operations

The next set of operations that VFS calls when interacting with filesystem device drivers are the "inode operations". Perhaps non-intuitively, these mostly deal with manipulating dentries - looking up a file name, creating, linking and removing files, dealing with symbolic links, creating and removing directories.

This is the list of the most important inode operations:

  • create
  • lookup
  • link
  • unlink
  • symlink
  • mkdir
  • rmdir
  • rename
  • readlink
  • follow_link
  • put_link
  • ...

The Inode Cache

The inode cache is used to avoid reading and writing inodes to and from storage every time we need to read or update them. The cache uses a hash table and inodes are indexed with a hash function which takes as parameters the superblock (of a particular filesystem instance) and the inode number associated with an inode.

inodes are cached until either the filesystem is unmounted, the inode is deleted, or the system enters a memory pressure state. When this happens, the Linux memory management system will (among other things) free inodes from the inode cache based on how often they were accessed.

  • Caches inodes into memory to avoid costly storage operations
  • An inode is cached until low memory conditions are triggered
  • inodes are indexed with a hash table
  • The inode hash function takes the superblock and inode number as inputs
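A hash function of that shape can be sketched as below; it mixes the superblock pointer with the inode number. The bucket count and mixing constant are illustrative, not the kernel's:

```c
#include <stdint.h>

#define IHASH_BUCKETS 64

/* Index an inode cache bucket by (superblock, inode number), so that
 * the same inode of the same filesystem instance always lands in the
 * same bucket. */
static unsigned int inode_hash(const void *sb, uint64_t ino)
{
    uint64_t h = (uint64_t)(uintptr_t)sb ^ (ino * 2654435761u);
    return (unsigned int)(h % IHASH_BUCKETS);
}
```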

The Dentry Cache

  • State:
    • Used – d_inode is valid and the dentry object is in use
    • Unused – d_inode is valid but the dentry object is not in use
    • Negative – d_inode is not valid; the inode was not yet loaded or the file was erased
  • Dentry cache
    • List of used dentries (dentry->d_state == used)
    • List of the most recently used dentries (sorted by access time)
    • Hash table to avoid searching the tree

The Page Cache

  • Caches file data and not block device data
  • Uses the struct address_space to translate file offsets to block offsets
  • Used for both read / write and mmap
  • Uses a radix tree
/**
 * struct address_space - Contents of a cacheable, mappable object.
 * @host: Owner, either the inode or the block_device.
 * @i_pages: Cached pages.
 * @gfp_mask: Memory allocation flags to use for allocating pages.
 * @i_mmap_writable: Number of VM_SHARED mappings.
 * @nr_thps: Number of THPs in the pagecache (non-shmem only).
 * @i_mmap: Tree of private and shared mappings.
 * @i_mmap_rwsem: Protects @i_mmap and @i_mmap_writable.
 * @nrpages: Number of page entries, protected by the i_pages lock.
 * @nrexceptional: Shadow or DAX entries, protected by the i_pages lock.
 * @writeback_index: Writeback starts here.
 * @a_ops: Methods.
 * @flags: Error bits and flags (AS_*).
 * @wb_err: The most recent error which has occurred.
 * @private_lock: For use by the owner of the address_space.
 * @private_list: For use by the owner of the address_space.
 * @private_data: For use by the owner of the address_space.
 */
struct address_space {
  struct inode            *host;
  struct xarray           i_pages;
  gfp_t                   gfp_mask;
  atomic_t                i_mmap_writable;
  /* number of thp, only for non-shmem files */
  atomic_t                nr_thps;
  struct rb_root_cached   i_mmap;
  struct rw_semaphore     i_mmap_rwsem;
  unsigned long           nrpages;
  unsigned long           nrexceptional;
  pgoff_t                 writeback_index;
  const struct address_space_operations *a_ops;
  unsigned long           flags;
  errseq_t                wb_err;
  spinlock_t              private_lock;
  struct list_head        private_list;
  void                    *private_data;
} __attribute__((aligned(sizeof(long)))) __randomize_layout;

struct address_space_operations {
  int (*writepage)(struct page *page, struct writeback_control *wbc);
  int (*readpage)(struct file *, struct page *);

  /* Write back some dirty pages from this mapping. */
  int (*writepages)(struct address_space *, struct writeback_control *);

  /* Set a page dirty.  Return true if this dirtied it */
  int (*set_page_dirty)(struct page *page);

  /*
   * Reads in the requested pages. Unlike ->readpage(), this is
   * PURELY used for read-ahead!.
   */
  int (*readpages)(struct file *filp, struct address_space *mapping,
                  struct list_head *pages, unsigned nr_pages);
  void (*readahead)(struct readahead_control *);

  int (*write_begin)(struct file *, struct address_space *mapping,
                          loff_t pos, unsigned len, unsigned flags,
                          struct page **pagep, void **fsdata);
  int (*write_end)(struct file *, struct address_space *mapping,
                          loff_t pos, unsigned len, unsigned copied,
                          struct page *page, void *fsdata);

  /* Unfortunately this kludge is needed for FIBMAP. Don't use it */
  sector_t (*bmap)(struct address_space *, sector_t);
  void (*invalidatepage) (struct page *, unsigned int, unsigned int);
  int (*releasepage) (struct page *, gfp_t);
  void (*freepage)(struct page *);
  ssize_t (*direct_IO)(struct kiocb *, struct iov_iter *iter);
  /*
   * migrate the contents of a page to the specified target. If
   * migrate_mode is MIGRATE_ASYNC, it must not block.
   */
  int (*migratepage) (struct address_space *,
                  struct page *, struct page *, enum migrate_mode);
  bool (*isolate_page)(struct page *, isolate_mode_t);
  void (*putback_page)(struct page *);
  int (*launder_page) (struct page *);
  int (*is_partially_uptodate) (struct page *, unsigned long,
                                  unsigned long);
  void (*is_dirty_writeback) (struct page *, bool *, bool *);
  int (*error_remove_page)(struct address_space *, struct page *);

  /* swapfile support */
  int (*swap_activate)(struct swap_info_struct *sis, struct file *file,
                          sector_t *span);
  void (*swap_deactivate)(struct file *file);
};
/**
 * generic_file_read_iter - generic filesystem read routine
 * @iocb: kernel I/O control block
 * @iter: destination for the data read
 *
 * This is the "read_iter()" routine for all filesystems
 * that can use the page cache directly.
 *
 * The IOCB_NOWAIT flag in iocb->ki_flags indicates that -EAGAIN shall
 * be returned when no data can be read without waiting for I/O requests
 * to complete; it doesn't prevent readahead.
 *
 * The IOCB_NOIO flag in iocb->ki_flags indicates that no new I/O
 * requests shall be made for the read or for readahead.  When no data
 * can be read, -EAGAIN shall be returned.  When readahead would be
 * triggered, a partial, possibly empty read shall be returned.
 *
 * Return:
 * * number of bytes copied, even for partial reads
 * * negative error code (or 0 if IOCB_NOIO) if nothing was read
 */
ssize_t generic_file_read_iter(struct kiocb *iocb, struct iov_iter *iter)

/*
 * Generic "read page" function for block devices that have the normal
 * get_block functionality. This is most of the block device filesystems.
 * Reads the page asynchronously --- the unlock_buffer() and
 * set/clear_buffer_uptodate() functions propagate buffer state into the
 * page struct once IO has completed.
 */
int block_read_full_page(struct page *page, get_block_t *get_block)