A filesystem is the structure that organizes data on a storage device. It allows files to be stored hierarchically (in folders) and keeps track of their metadata (timestamps, owners, etc.).
The mechanics behind filesystems are often misunderstood. As a consequence, installing an operating system is often perceived as a complex operation. A few explanations can go a long way toward alleviating this fear.
The most important thing to grasp is that every computer storage device, from a hardware point of view, is a single contiguous range of addressable storage. How data gets organized into partitions, folders, metadata, etc. is defined logically by tools and operating systems.
Boot sectors and partition tables
There are two partitioning schemes: the legacy Master Boot Record (MBR) and the newer GUID Partition Table (GPT). GPT has fewer limitations regarding the number and the size of partitions.
The boot sector and the partition table typically reside at the very beginning
of the disk. They are not on any partition, which would not make sense since
the partition table defines the partition layout. On Linux, hard disk drives
are typically referenced by the path /dev/sdX, where X is a letter, and their
partitions by the path /dev/sdXN, where N is a number.
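To make the layout concrete, here is a minimal sketch that builds a fake first sector in a regular file and reads back the starting LBA of its first partition entry. It assumes the classic MBR layout (four 16-byte entries starting at byte 446, with the start LBA stored little-endian at offset 8 of each entry). The file name disk.img and the function name first_partition_start are made up for illustration, and no real disk is touched.

```shell
#!/bin/sh
# Work on a scratch image file, never on a real /dev/sdX device.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
img="$workdir/disk.img"

# One empty 512-byte sector standing in for the first sector of a disk.
dd if=/dev/zero of="$img" bs=512 count=1 2>/dev/null

# Entry 1 starts at byte 446; its start LBA sits 8 bytes further, at byte 454.
# Store LBA 2048 (0x00000800) as the little-endian bytes 00 08 00 00.
printf '\000\010\000\000' | dd of="$img" bs=1 seek=454 conv=notrunc 2>/dev/null

# Print the start LBA of the first entry of the MBR image given as $1.
first_partition_start() {
    # od prints the four bytes as decimal numbers; combine them little-endian.
    set -- $(od -An -tu1 -j454 -N4 "$1")
    echo $(( $1 + ($2 << 8) + ($3 << 16) + ($4 << 24) ))
}

first_partition_start "$img"
```

Tools such as fdisk do essentially the same decoding, for every entry and every field, when they list a partition table.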
The OS and the programs running on it locate partitions by the sector address (or logical block address, a.k.a. LBA) stored in the partition table. The standard starting position for the first partition is sector 2048, that is, 1 MiB into the disk with 512-byte sectors. In the past, the first partition used to start at sector 63, which may cause performance issues since it does not align with the physical sector size of modern drives. See the references for more explanations.
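The alignment argument boils down to a bit of arithmetic, sketched below. The starting values are treated as sector numbers (LBAs), and the sizes are assumptions: 512-byte logical sectors and 4096-byte physical sectors. The helper names sector_to_byte and is_aligned are illustrative only.

```shell
#!/bin/sh
# Assumed sizes: 512-byte logical sectors, 4096-byte physical sectors.
LOGICAL=512
PHYSICAL=4096

# Convert a starting sector (LBA) to a byte offset on the disk.
sector_to_byte() {
    echo $(( $1 * LOGICAL ))
}

# Succeed if the sector's byte offset is a multiple of the physical sector size.
is_aligned() {
    [ $(( $(sector_to_byte "$1") % PHYSICAL )) -eq 0 ]
}

for start in 63 2048; do
    if is_aligned "$start"; then
        echo "sector $start = byte $(sector_to_byte "$start"): aligned"
    else
        echo "sector $start = byte $(sector_to_byte "$start"): misaligned"
    fi
done
```

Sector 63 lands at byte 32256, which is not a multiple of 4096, while sector 2048 lands at byte 1048576, which is.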
Tools for creating and manipulating the boot sector and the partition table
include fdisk, gdisk and more. A tool like fdisk will manipulate the partition
table found at the beginning of the designated storage media. As such, it
usually only makes sense to call fdisk on a whole disk, such as fdisk /dev/sdX,
and not on a partition.
The MBR has different partition types: primary, extended and logical. See this Arch Wiki article for more details.
GPT has only one partition type.
Partitions need to be initialized before they can be used by the OS, that is, the filesystem header must be created. This header has different names depending on the filesystem type (e.g. table of contents, superblock). It usually occupies the first sectors of the partition.
Tools such as mkfs can be used to initialize partitions.
As mentioned before, partitions are purely logical: it is possible to write data
across partitions with dd, although that would probably destroy the logical
integrity of some partitions.
If you remove the partition entry N, then partition N won’t exist in the eyes of
the OS. But dd can force reading data at any position on disk, and thus recover
data from lost partitions. If you re-add the partition entry with the same start
and end LBAs, then the partition will be accessible just like before.
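The following sketch illustrates the point on a scratch image file rather than a real disk. The image has no partition table at all, yet dd can still read whatever sits at a given sector. The 512-byte sector size, the start sector 2048, and the marker text are assumptions chosen for the example.

```shell
#!/bin/sh
# Scratch image standing in for a disk; removed again on exit.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
img="$workdir/disk.img"

SECTOR=512   # assumed logical sector size
START=2048   # hypothetical first sector of the "lost" partition

# Write a marker where the partition used to start.
# No partition entry describes this area of the image.
printf 'lost data' | dd of="$img" bs="$SECTOR" seek="$START" 2>/dev/null

# Read 9 bytes back from the same absolute position, ignoring any partitioning.
recovered=$(dd if="$img" bs=1 skip=$((START * SECTOR)) count=9 2>/dev/null)
echo "$recovered"
```

On a real disk the same idea applies with if=/dev/sdX, which requires root privileges and great care with the of= argument.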
The bootloader is a program that resides partly in the boot sector and partly on a partition.
When the computer starts, it boots from the designated medium. It looks for an MBR or a GPT in the first sectors and runs the executable code of the bootloader. This code can be configured to boot an OS located on a specific partition.
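For the MBR case, one detail can be checked by hand: a first sector is conventionally marked as bootable by the two bytes 0x55 0xAA at offsets 510 and 511. The sketch below tests for that signature on a scratch image. It assumes BIOS-style MBR booting (UEFI firmware follows a different path), and the function name has_boot_signature is made up for the example.

```shell
#!/bin/sh
# Scratch image standing in for the first sector of a disk.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
img="$workdir/disk.img"

# Succeed if the first sector of $1 ends with the bytes 0x55 0xAA.
has_boot_signature() {
    sig=$(od -An -tx1 -j510 -N2 "$1" | tr -d ' \n')
    [ "$sig" = "55aa" ]
}

# A blank sector carries no signature...
dd if=/dev/zero of="$img" bs=512 count=1 2>/dev/null
has_boot_signature "$img" && blank=yes || blank=no

# ...until the two bytes are written at offset 510 (octal 125 252 = hex 55 aa).
printf '\125\252' | dd of="$img" bs=1 seek=510 conv=notrunc 2>/dev/null
has_boot_signature "$img" && signed=yes || signed=no

echo "blank sector: $blank, signed sector: $signed"
```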
Disk usage and apparent size
Every file on the system has 2 “size” properties: the disk usage and the apparent size.
The apparent size is the number of bytes contained in a file. It represents the
information held by the file, and as such it is the same across different
filesystems. It can be queried with ls -l or du -b (GNU) / du -A (BSD).
Disk usage is highly dependent on the filesystem. It can be queried with du. Disk usage accounts for several properties of the file:
- A file has attributes on the filesystem (e.g. timestamp, owner, etc.). Thus it usually requires some additional bytes.
- A file can be fragmented, have indirect blocks, have unused space in some blocks, and the like.
- A file can be sparse, meaning that it contains big chunks of zeros. Modern filesystems make use of this property to save space: they do not write the zeros to the disk and only record in the file's metadata that bytes M to N are zeros.
The disk usage is usually higher than the apparent size because of metadata and fragmentation, but it can also be smaller if the file is sparse.
$ dd of=sparse-file bs=1k seek=5120 count=0
0+0 records in
0+0 records out
0 bytes (0 B) copied, 5.668e-05 s, 0.0 kB/s
$ du sparse-file
0
$ du -b sparse-file
5242880
Alternatively we can also use
$ truncate -s 5M sparse-file
The file reads as all zeros and takes up almost no space on the filesystem (du reports 0 above), although its apparent size is 5242880 bytes.
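To see the contrast, the sketch below creates two files with the same apparent size, one as a hole and one with the zeros actually written out, and compares what du reports for each. The file names sparse and dense are arbitrary, and the exact disk usage figures depend on the filesystem (block size, compression, delayed allocation), so treat the numbers as indicative only.

```shell
#!/bin/sh
# Scratch directory, removed again on exit.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Same apparent size (5 MiB), created two different ways.
dd if=/dev/null of="$workdir/sparse" bs=1k seek=5120 count=0 2>/dev/null  # hole only
dd if=/dev/zero of="$workdir/dense" bs=1k count=5120 2>/dev/null          # zeros written out

# Apparent size: number of bytes of content.
sparse_bytes=$(wc -c < "$workdir/sparse")
dense_bytes=$(wc -c < "$workdir/dense")

# Disk usage: KiB actually allocated, as reported by du.
sparse_kib=$(du -k "$workdir/sparse" | cut -f1)
dense_kib=$(du -k "$workdir/dense" | cut -f1)

echo "sparse: apparent=$((sparse_bytes)) bytes, disk usage=${sparse_kib} KiB"
echo "dense:  apparent=$((dense_bytes)) bytes, disk usage=${dense_kib} KiB"
```

On a filesystem that supports sparse files, the first line typically shows a disk usage close to zero while the second shows roughly 5120 KiB.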
If the kernel and the filesystem support it, it is possible to resize online partitions, e.g. the system partition. Note that while extending a partition is not problematic, shrinking a partition can cause data loss.
Let’s see how this works on an ext4 filesystem.
Warning: The whole process should not be interrupted. Back up your partition table and the data if possible. Make sure the computer is powered by a battery or a UPS.
- Delete the partition N from the partition table (e.g. with fdisk), and recreate it immediately with the same starting sector and the desired new size.
- Run resize2fs /dev/sdXN on partition N of disk X.
And there is no need to restart the computer!
See resize2fs(8) for more options.