1 ext4: import inode data fork chapter from wiki page
3 From: Darrick J. Wong <darrick.wong@oracle.com>
5 Import the chapter about inode data fork from the on-disk format wiki
6 page into the kernel documentation.
8 Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
9 Signed-off-by: Theodore Ts'o <tytso@mit.edu>
11 v2: disable blockmap.rst on latex builds because LaTeX doesn't support nested tables
14 Documentation/conf.py | 2
15 Documentation/filesystems/ext4/ondisk/blockmap.rst | 49 +++++
16 Documentation/filesystems/ext4/ondisk/dynamic.rst | 1
17 Documentation/filesystems/ext4/ondisk/ifork.rst | 194 ++++++++++++++++++++
18 4 files changed, 245 insertions(+), 1 deletion(-)
19 create mode 100644 Documentation/filesystems/ext4/ondisk/blockmap.rst
20 create mode 100644 Documentation/filesystems/ext4/ondisk/ifork.rst
22 diff --git a/Documentation/conf.py b/Documentation/conf.py
23 index 62ac5a9f3a9f..b691af4831fa 100644
24 --- a/Documentation/conf.py
25 +++ b/Documentation/conf.py
26 @@ -34,7 +34,7 @@ needs_sphinx = '1.3'
27 # Add any Sphinx extension module names here, as strings. They can be
28 # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
30 -extensions = ['kerneldoc', 'rstFlatTable', 'kernel_include', 'cdomain', 'kfigure']
31 +extensions = ['kerneldoc', 'rstFlatTable', 'kernel_include', 'cdomain', 'kfigure', 'sphinx.ext.ifconfig']
33 # The name of the math extension changed on Sphinx 1.4
34 if major == 1 and minor > 3:
35 diff --git a/Documentation/filesystems/ext4/ondisk/blockmap.rst b/Documentation/filesystems/ext4/ondisk/blockmap.rst
37 index 000000000000..30e25750d88a
39 +++ b/Documentation/filesystems/ext4/ondisk/blockmap.rst
41 +.. SPDX-License-Identifier: GPL-2.0
43 ++---------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
44 +| i.i\_block Offset | Where It Points |
45 ++=====================+==============================================================================================================================================================================================================================+
46 +| 0 to 11 | Direct map to file blocks 0 to 11. |
47 ++---------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
48 +| 12 | Indirect block: (file blocks 12 to (``$block_size`` / 4) + 11, or 12 to 1035 if 4KiB blocks) |
50 +| | +------------------------------+--------------------------------------------------------------------+ |
51 +| | | Indirect Block Offset | Where It Points | |
52 +| | +==============================+====================================================================+ |
53 +| | | 0 to (``$block_size`` / 4) | Direct map to (``$block_size`` / 4) blocks (1024 if 4KiB blocks) | |
54 +| | +------------------------------+--------------------------------------------------------------------+ |
55 ++---------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
56 +| 13 | Double-indirect block: (file blocks ``$block_size``/4 + 12 to (``$block_size`` / 4) ^ 2 + (``$block_size`` / 4) + 11, or 1036 to 1049611 if 4KiB blocks) |
58 +| | +--------------------------------+---------------------------------------------------------------------------------------------------------+ |
59 +| | | Double Indirect Block Offset | Where It Points | |
60 +| | +================================+=========================================================================================================+ |
61 +| | | 0 to (``$block_size`` / 4) | Map to (``$block_size`` / 4) indirect blocks (1024 if 4KiB blocks) | |
63 +| | | | +------------------------------+--------------------------------------------------------------------+ | |
64 +| | | | | Indirect Block Offset | Where It Points | | |
65 +| | | | +==============================+====================================================================+ | |
66 +| | | | | 0 to (``$block_size`` / 4) | Direct map to (``$block_size`` / 4) blocks (1024 if 4KiB blocks) | | |
67 +| | | | +------------------------------+--------------------------------------------------------------------+ | |
68 +| | +--------------------------------+---------------------------------------------------------------------------------------------------------+ |
69 ++---------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
70 +| 14 | Triple-indirect block: (file blocks (``$block_size`` / 4) ^ 2 + (``$block_size`` / 4) + 12 to (``$block_size`` / 4) ^ 3 + (``$block_size`` / 4) ^ 2 + (``$block_size`` / 4) + 11, or 1049612 to 1074791435 if 4KiB blocks) |
72 +| | +--------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------+ |
73 +| | | Triple Indirect Block Offset | Where It Points | |
74 +| | +================================+================================================================================================================================================+ |
75 +| | | 0 to (``$block_size`` / 4) | Map to (``$block_size`` / 4) double indirect blocks (1024 if 4KiB blocks) | |
77 +| | | | +--------------------------------+---------------------------------------------------------------------------------------------------------+ | |
78 +| | | | | Double Indirect Block Offset | Where It Points | | |
79 +| | | | +================================+=========================================================================================================+ | |
80 +| | | | | 0 to (``$block_size`` / 4) | Map to (``$block_size`` / 4) indirect blocks (1024 if 4KiB blocks) | | |
82 +| | | | | | +------------------------------+--------------------------------------------------------------------+ | | |
83 +| | | | | | | Indirect Block Offset | Where It Points | | | |
84 +| | | | | | +==============================+====================================================================+ | | |
85 +| | | | | | | 0 to (``$block_size`` / 4) | Direct map to (``$block_size`` / 4) blocks (1024 if 4KiB blocks) | | | |
86 +| | | | | | +------------------------------+--------------------------------------------------------------------+ | | |
87 +| | | | +--------------------------------+---------------------------------------------------------------------------------------------------------+ | |
88 +| | +--------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------+ |
89 ++---------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
90 diff --git a/Documentation/filesystems/ext4/ondisk/dynamic.rst b/Documentation/filesystems/ext4/ondisk/dynamic.rst
91 index 7c5f5019b9d6..f090de8dd1c1 100644
92 --- a/Documentation/filesystems/ext4/ondisk/dynamic.rst
93 +++ b/Documentation/filesystems/ext4/ondisk/dynamic.rst
94 @@ -7,3 +7,4 @@ Dynamic metadata are created on the fly when files and blocks are
97 .. include:: inodes.rst
98 +.. include:: ifork.rst
99 diff --git a/Documentation/filesystems/ext4/ondisk/ifork.rst b/Documentation/filesystems/ext4/ondisk/ifork.rst
101 index 000000000000..5dbe3b2b121a
103 +++ b/Documentation/filesystems/ext4/ondisk/ifork.rst
105 +.. SPDX-License-Identifier: GPL-2.0
107 +The Contents of inode.i\_block
108 +------------------------------
110 +Depending on the type of file an inode describes, the 60 bytes of
111 +storage in ``inode.i_block`` can be used in different ways. In general,
112 +regular files and directories will use it for file block indexing
113 +information, and special files will use it for special purposes.
114 +
115 +Symbolic Links
116 +~~~~~~~~~~~~~~
117 +
118 +The target of a symbolic link will be stored in this field if the target
119 +string is less than 60 bytes long. Otherwise, either extents or block
120 +maps will be used to allocate data blocks to store the link target.
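
The rule above can be sketched as a one-line check. This is only an illustration: the helper name `symlink_target_fits_inline` and the constant `I_BLOCK_BYTES` are hypothetical, and the only fact taken from the text is the 60-byte size of ``inode.i_block``.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Size of inode.i_block in bytes, as stated in the text. */
#define I_BLOCK_BYTES 60

/*
 * Hypothetical helper: returns true when the link target is short enough
 * (strictly less than 60 bytes) to be stored directly in i_block.
 * Longer targets would need data blocks mapped by extents or a block map.
 */
bool symlink_target_fits_inline(const char *target)
{
    assert(target != NULL);
    return strlen(target) < I_BLOCK_BYTES;
}
```

A 59-byte target fits in the inode; a 60-byte target does not.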
122 +Direct/Indirect Block Addressing
123 +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
125 +In ext2/3, file block numbers were mapped to logical block numbers by
126 +means of an (up to) three-level 1-1 block map. To find the logical block
127 +that stores a particular file block, the code would navigate through
128 +this increasingly complicated structure. Notice that the map blocks carry
129 +neither a magic number nor a checksum to provide any level of confidence
130 +that a map block isn't full of garbage.
132 +.. ifconfig:: builder != 'latex'
134 + .. include:: blockmap.rst
136 +.. ifconfig:: builder == 'latex'
138 + [Table omitted because LaTeX doesn't support nested tables.]
140 +Note that with this block mapping scheme, it is necessary to fill out a
141 +lot of mapping data even for a large contiguous file! This inefficiency
142 +led to the creation of the extent mapping scheme, discussed below.
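
To make the ranges in the block map table concrete, here is a minimal sketch that reports how many levels of indirection are needed to reach a given file block. The function name `blockmap_depth` is hypothetical. It assumes 12 direct slots and 32-bit map entries (so ``$block_size`` / 4 entries per map block), as the table implies.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of map blocks that must be read to resolve
 * file_block under the ext2/3 block map.
 *   0 = direct (i_block slots 0 to 11)
 *   1 = indirect, 2 = double-indirect, 3 = triple-indirect
 *  -1 = beyond what the triple-indirect block can map
 */
int blockmap_depth(uint64_t file_block, uint32_t block_size)
{
    uint64_t n = block_size / 4;              /* 32-bit entries per map block */
    uint64_t direct_end = 12;                 /* first block past the direct slots */
    uint64_t ind_end = direct_end + n;        /* first block past the indirect range */
    uint64_t dind_end = ind_end + n * n;      /* first block past the double range */
    uint64_t tind_end = dind_end + n * n * n; /* first block past the triple range */

    assert(block_size >= 1024);

    if (file_block < direct_end)
        return 0;
    if (file_block < ind_end)
        return 1;
    if (file_block < dind_end)
        return 2;
    if (file_block < tind_end)
        return 3;
    return -1;
}
```

With 4KiB blocks, file block 1035 is the last one reachable through the indirect block, and file block 1036 already needs the double-indirect block.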
144 +Notice also that, because each map entry is a 32-bit block number, a file
145 +using this mapping scheme cannot use blocks at or beyond block 2^32.
146 +
147 +Extent Tree
148 +~~~~~~~~~~~
149 +
150 +In ext4, the file to logical block map has been replaced with an extent
151 +tree. Under the old scheme, allocating a contiguous run of 1,000 blocks
152 +requires an indirect block to map all 1,000 entries; with extents, the
153 +mapping is reduced to a single ``struct ext4_extent`` with
154 +``ee_len = 1000``. If flex\_bg is enabled, it is possible to allocate
155 +very large files with a single extent, at a considerable reduction in
156 +metadata block use, and some improvement in disk efficiency. The inode
157 +must have the extents flag (0x80000) set for this feature to be in use.
160 +Extents are arranged as a tree. Each node of the tree begins with a
161 +``struct ext4_extent_header``. If the node is an interior node
162 +(``eh.eh_depth`` > 0), the header is followed by ``eh.eh_entries``
163 +instances of ``struct ext4_extent_idx``; each of these index entries
164 +points to a block containing more nodes in the extent tree. If the node
165 +is a leaf node (``eh.eh_depth == 0``), then the header is followed by
166 +``eh.eh_entries`` instances of ``struct ext4_extent``; these instances
167 +point to the file's data blocks. The root node of the extent tree is
168 +stored in ``inode.i_block``, which allows for the first four extents to
169 +be recorded without the use of extra metadata blocks.
171 +The extent tree header is recorded in ``struct ext4_extent_header``,
172 +which is 12 bytes long:
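173 +
174 +.. list-table::
176 +   :header-rows: 1
177 +
178 +   * - Offset
179 +     - Size
180 +     - Name
181 +     - Description
182 +   * - 0x0
183 +     - \_\_le16
184 +     - eh\_magic
185 +     - Magic number, 0xF30A.
186 +   * - 0x2
187 +     - \_\_le16
188 +     - eh\_entries
189 +     - Number of valid entries following the header.
190 +   * - 0x4
191 +     - \_\_le16
192 +     - eh\_max
193 +     - Maximum number of entries that could follow the header.
194 +   * - 0x6
195 +     - \_\_le16
196 +     - eh\_depth
197 +     - Depth of this extent node in the extent tree. 0 = this extent node points
198 +       to data blocks; otherwise, it points to other extent nodes. The extent tree
199 +       can be at most 5 levels deep: a logical block number can be at most
200 +       ``2^32``, and with 1KiB blocks (the worst case) the smallest ``n`` that
201 +       satisfies ``4*(((blocksize - 12)/12)^n) >= 2^32`` is 5.
202 +   * - 0x8
203 +     - \_\_le32
204 +     - eh\_generation
205 +     - Generation of the tree. (Used by Lustre, but not standard ext4).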
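
As an illustration of the 12-byte header and of the depth bound, the sketch below lays the header out as a C struct and evaluates the inequality ``4*(((blocksize - 12)/12)^n) >= 2^32`` by brute force. The field names are given as I recall them from the kernel's ``fs/ext4/ext4_extents.h`` and should be checked against your tree; the function `extent_tree_min_levels` is hypothetical. On disk the fields are little-endian, and the struct below ignores byte order.

```c
#include <assert.h>
#include <stdint.h>

/* Layout sketch of the extent header; byte order handling is omitted. */
struct ext4_extent_header {
    uint16_t eh_magic;      /* magic number, 0xF30A */
    uint16_t eh_entries;    /* number of valid entries following the header */
    uint16_t eh_max;        /* capacity of entries following the header */
    uint16_t eh_depth;      /* 0 = leaf node, > 0 = interior node */
    uint32_t eh_generation; /* generation of the tree */
};
_Static_assert(sizeof(struct ext4_extent_header) == 12,
               "extent header must be 12 bytes");

/*
 * Hypothetical helper: smallest n with 4 * per_block^n >= 2^32, where
 * per_block = (block_size - 12) / 12 entries fit in one tree block and
 * the root in i_block holds 4 entries.
 */
unsigned int extent_tree_min_levels(uint32_t block_size)
{
    uint64_t per_block = (block_size - 12) / 12;
    uint64_t reach = 4;
    unsigned int n = 0;

    assert(block_size >= 1024);

    while (reach < (1ULL << 32)) {
        reach *= per_block;
        n++;
    }
    return n;
}
```

For 1KiB blocks this evaluates to 5, which is the bound quoted in the text; larger block sizes need fewer levels (4 for 4KiB blocks).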
207 +Internal nodes of the extent tree, also known as index nodes, are
208 +recorded as ``struct ext4_extent_idx``, and are 12 bytes long:
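209 +
210 +.. list-table::
212 +   :header-rows: 1
213 +
214 +   * - Offset
215 +     - Size
216 +     - Name
217 +     - Description
218 +   * - 0x0
219 +     - \_\_le32
220 +     - ei\_block
221 +     - This index node covers file blocks from 'block' onward.
222 +   * - 0x4
223 +     - \_\_le32
224 +     - ei\_leaf\_lo
225 +     - Lower 32-bits of the block number of the extent node that is the next
226 +       level lower in the tree. The tree node pointed to can be either another
227 +       internal node or a leaf node, described below.
228 +   * - 0x8
229 +     - \_\_le16
230 +     - ei\_leaf\_hi
231 +     - Upper 16-bits of the previous field.
232 +   * - 0xA
233 +     - \_\_u16
234 +     - ei\_unused
235 +     -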
237 +Leaf nodes of the extent tree are recorded as ``struct ext4_extent``,
238 +and are also 12 bytes long:
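239 +
240 +.. list-table::
242 +   :header-rows: 1
243 +
244 +   * - Offset
245 +     - Size
246 +     - Name
247 +     - Description
248 +   * - 0x0
249 +     - \_\_le32
250 +     - ee\_block
251 +     - First file block number that this extent covers.
252 +   * - 0x4
253 +     - \_\_le16
254 +     - ee\_len
255 +     - Number of blocks covered by extent. If the value of this field is <=
256 +       32768, the extent is initialized. If the value of the field is > 32768,
257 +       the extent is uninitialized and the actual extent length is ``ee_len`` -
258 +       32768. Therefore, the maximum length of an initialized extent is 32768
259 +       blocks, and the maximum length of an uninitialized extent is 32767.
260 +   * - 0x6
261 +     - \_\_le16
262 +     - ee\_start\_hi
263 +     - Upper 16-bits of the block number to which this extent points.
264 +   * - 0x8
265 +     - \_\_le32
266 +     - ee\_start\_lo
267 +     - Lower 32-bits of the block number to which this extent points.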
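
The two 12-byte entry types and the ``ee_len`` encoding can be sketched as follows. Field names are given as I recall them from the kernel's ``fs/ext4/ext4_extents.h``; the helper functions and the constant `EXT_INIT_MAX_LEN_SKETCH` are hypothetical. The sketch assumes the fields have already been converted from little-endian to host byte order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Index entry (interior node); layout sketch only. */
struct ext4_extent_idx {
    uint32_t ei_block;   /* first file block covered by this index */
    uint32_t ei_leaf_lo; /* lower 32 bits of the child node's block number */
    uint16_t ei_leaf_hi; /* upper 16 bits of the child node's block number */
    uint16_t ei_unused;
};

/* Leaf entry; layout sketch only. */
struct ext4_extent {
    uint32_t ee_block;    /* first file block covered by this extent */
    uint16_t ee_len;      /* length, with the uninitialized encoding */
    uint16_t ee_start_hi; /* upper 16 bits of the starting block number */
    uint32_t ee_start_lo; /* lower 32 bits of the starting block number */
};

_Static_assert(sizeof(struct ext4_extent_idx) == 12, "index entry is 12 bytes");
_Static_assert(sizeof(struct ext4_extent) == 12, "leaf entry is 12 bytes");

/* Largest ee_len value that still denotes an initialized extent. */
#define EXT_INIT_MAX_LEN_SKETCH 32768u

/* Combine the hi/lo halves into the 48-bit starting block number. */
uint64_t extent_start(const struct ext4_extent *ex)
{
    return ((uint64_t)ex->ee_start_hi << 32) | ex->ee_start_lo;
}

/* ee_len > 32768 marks the extent as uninitialized. */
bool extent_is_uninitialized(const struct ext4_extent *ex)
{
    return ex->ee_len > EXT_INIT_MAX_LEN_SKETCH;
}

/* Actual number of blocks covered, with the encoding removed. */
uint32_t extent_length(const struct ext4_extent *ex)
{
    if (extent_is_uninitialized(ex))
        return ex->ee_len - EXT_INIT_MAX_LEN_SKETCH;
    return ex->ee_len;
}
```

An ``ee_len`` of 32768 decodes to an initialized extent of 32768 blocks, while 32769 decodes to an uninitialized extent of 1 block.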
269 +Prior to the introduction of metadata checksums, the extent header +
270 +extent entries always left at least 4 bytes of unallocated space at the
271 +end of each extent tree data block (because (2^x % 12) >= 4). Therefore,
272 +the 32-bit checksum is inserted into this space. The 4 extents in the
273 +inode do not need checksumming, since the inode is already checksummed.
274 +The checksum is calculated against the FS UUID, the inode number, the
275 +inode generation, and the entire extent block leading up to (but not
276 +including) the checksum itself.
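
The slack-space argument can be checked with a little arithmetic. The sketch below assumes a tree block filled to capacity with 12-byte entries after the 12-byte header, and assumes the checksum sits immediately after the last possible entry; both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define EXTENT_HEADER_BYTES 12u
#define EXTENT_ENTRY_BYTES  12u

/* Maximum number of 12-byte entries that fit after the header. */
uint32_t extent_max_entries(uint32_t block_size)
{
    assert(block_size >= 1024);
    return (block_size - EXTENT_HEADER_BYTES) / EXTENT_ENTRY_BYTES;
}

/* Byte offset of the 4-byte checksum: right after the last possible entry. */
uint32_t extent_tail_offset(uint32_t block_size)
{
    return EXTENT_HEADER_BYTES +
           EXTENT_ENTRY_BYTES * extent_max_entries(block_size);
}
```

For a 4KiB block this gives 340 entries and a tail offset of 4092, leaving exactly 4 bytes for the checksum; for a 2KiB block the offset is 2040, leaving 8 bytes.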
278 +``struct ext4_extent_tail`` is 4 bytes long:
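279 +
280 +.. list-table::
282 +   :header-rows: 1
283 +
284 +   * - Offset
285 +     - Size
286 +     - Name
287 +     - Description
288 +   * - 0x0
289 +     - \_\_le32
290 +     - eb\_checksum
291 +     - Checksum of the extent block, crc32c(uuid+inum+igeneration+extentblock)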
292 +
293 +Inline Data
294 +~~~~~~~~~~~
295 +
296 +If the inline data feature is enabled for the filesystem and the flag is
297 +set for the inode, it is possible that the first 60 bytes of the file
298 +data are stored here.