1 ext4: import inode data fork chapter from wiki page
3 From: Darrick J. Wong <darrick.wong@oracle.com>
5 Import the chapter about inode data fork from the on-disk format wiki
6 page into the kernel documentation.
8 Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
9 Signed-off-by: Theodore Ts'o <tytso@mit.edu>
10 ---
11 v2: disable blockmap.rst on latex builds because it doesn't support
12 nested tables
13 ---
14  Documentation/conf.py                              |    2 
15  Documentation/filesystems/ext4/ondisk/blockmap.rst |   49 +++++
16  Documentation/filesystems/ext4/ondisk/dynamic.rst  |    1 
17  Documentation/filesystems/ext4/ondisk/ifork.rst    |  194 ++++++++++++++++++++
18  4 files changed, 245 insertions(+), 1 deletion(-)
19  create mode 100644 Documentation/filesystems/ext4/ondisk/blockmap.rst
20  create mode 100644 Documentation/filesystems/ext4/ondisk/ifork.rst
22 diff --git a/Documentation/conf.py b/Documentation/conf.py
23 index 62ac5a9f3a9f..b691af4831fa 100644
24 --- a/Documentation/conf.py
25 +++ b/Documentation/conf.py
26 @@ -34,7 +34,7 @@ needs_sphinx = '1.3'
27  # Add any Sphinx extension module names here, as strings. They can be
28  # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
29  # ones.
30 -extensions = ['kerneldoc', 'rstFlatTable', 'kernel_include', 'cdomain', 'kfigure']
31 +extensions = ['kerneldoc', 'rstFlatTable', 'kernel_include', 'cdomain', 'kfigure', 'sphinx.ext.ifconfig']
33  # The name of the math extension changed on Sphinx 1.4
34  if major == 1 and minor > 3:
35 diff --git a/Documentation/filesystems/ext4/ondisk/blockmap.rst b/Documentation/filesystems/ext4/ondisk/blockmap.rst
36 new file mode 100644
37 index 000000000000..30e25750d88a
38 --- /dev/null
39 +++ b/Documentation/filesystems/ext4/ondisk/blockmap.rst
40 @@ -0,0 +1,49 @@
41 +.. SPDX-License-Identifier: GPL-2.0
43 ++---------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
44 +| i.i\_block Offset   | Where It Points                                                                                                                                                                                                              |
45 ++=====================+==============================================================================================================================================================================================================================+
46 +| 0 to 11             | Direct map to file blocks 0 to 11.                                                                                                                                                                                           |
47 ++---------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
48 +| 12                  | Indirect block: (file blocks 12 to (``$block_size`` / 4) + 11, or 12 to 1035 if 4KiB blocks)                                                                                                                                 |
49 +|                     |                                                                                                                                                                                                                              |
50 +|                     | +------------------------------+--------------------------------------------------------------------+                                                                                                                        |
51 +|                     | | Indirect Block Offset        | Where It Points                                                    |                                                                                                                        |
52 +|                     | +==============================+====================================================================+                                                                                                                        |
53 +|                     | | 0 to (``$block_size`` / 4)   | Direct map to (``$block_size`` / 4) blocks (1024 if 4KiB blocks)   |                                                                                                                        |
54 +|                     | +------------------------------+--------------------------------------------------------------------+                                                                                                                        |
55 ++---------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
56 +| 13                  | Double-indirect block: (file blocks ``$block_size``/4 + 12 to (``$block_size`` / 4) ^ 2 + (``$block_size`` / 4) + 11, or 1036 to 1049611 if 4KiB blocks)                                                                     |
57 +|                     |                                                                                                                                                                                                                              |
58 +|                     | +--------------------------------+---------------------------------------------------------------------------------------------------------+                                                                                 |
59 +|                     | | Double Indirect Block Offset   | Where It Points                                                                                         |                                                                                 |
60 +|                     | +================================+=========================================================================================================+                                                                                 |
61 +|                     | | 0 to (``$block_size`` / 4)     | Map to (``$block_size`` / 4) indirect blocks (1024 if 4KiB blocks)                                      |                                                                                 |
62 +|                     | |                                |                                                                                                         |                                                                                 |
63 +|                     | |                                | +------------------------------+--------------------------------------------------------------------+   |                                                                                 |
64 +|                     | |                                | | Indirect Block Offset        | Where It Points                                                    |   |                                                                                 |
65 +|                     | |                                | +==============================+====================================================================+   |                                                                                 |
66 +|                     | |                                | | 0 to (``$block_size`` / 4)   | Direct map to (``$block_size`` / 4) blocks (1024 if 4KiB blocks)   |   |                                                                                 |
67 +|                     | |                                | +------------------------------+--------------------------------------------------------------------+   |                                                                                 |
68 +|                     | +--------------------------------+---------------------------------------------------------------------------------------------------------+                                                                                 |
69 ++---------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
70 +| 14                  | Triple-indirect block: (file blocks (``$block_size`` / 4) ^ 2 + (``$block_size`` / 4) + 12 to (``$block_size`` / 4) ^ 3 + (``$block_size`` / 4) ^ 2 + (``$block_size`` / 4) + 11, or 1049612 to 1074791435 if 4KiB blocks)   |
71 +|                     |                                                                                                                                                                                                                              |
72 +|                     | +--------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------+                                          |
73 +|                     | | Triple Indirect Block Offset   | Where It Points                                                                                                                                |                                          |
74 +|                     | +================================+================================================================================================================================================+                                          |
75 +|                     | | 0 to (``$block_size`` / 4)     | Map to (``$block_size`` / 4) double indirect blocks (1024 if 4KiB blocks)                                                                      |                                          |
76 +|                     | |                                |                                                                                                                                                |                                          |
77 +|                     | |                                | +--------------------------------+---------------------------------------------------------------------------------------------------------+   |                                          |
78 +|                     | |                                | | Double Indirect Block Offset   | Where It Points                                                                                         |   |                                          |
79 +|                     | |                                | +================================+=========================================================================================================+   |                                          |
80 +|                     | |                                | | 0 to (``$block_size`` / 4)     | Map to (``$block_size`` / 4) indirect blocks (1024 if 4KiB blocks)                                      |   |                                          |
81 +|                     | |                                | |                                |                                                                                                         |   |                                          |
82 +|                     | |                                | |                                | +------------------------------+--------------------------------------------------------------------+   |   |                                          |
83 +|                     | |                                | |                                | | Indirect Block Offset        | Where It Points                                                    |   |   |                                          |
84 +|                     | |                                | |                                | +==============================+====================================================================+   |   |                                          |
85 +|                     | |                                | |                                | | 0 to (``$block_size`` / 4)   | Direct map to (``$block_size`` / 4) blocks (1024 if 4KiB blocks)   |   |   |                                          |
86 +|                     | |                                | |                                | +------------------------------+--------------------------------------------------------------------+   |   |                                          |
87 +|                     | |                                | +--------------------------------+---------------------------------------------------------------------------------------------------------+   |                                          |
88 +|                     | +--------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------+                                          |
89 ++---------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
90 diff --git a/Documentation/filesystems/ext4/ondisk/dynamic.rst b/Documentation/filesystems/ext4/ondisk/dynamic.rst
91 index 7c5f5019b9d6..f090de8dd1c1 100644
92 --- a/Documentation/filesystems/ext4/ondisk/dynamic.rst
93 +++ b/Documentation/filesystems/ext4/ondisk/dynamic.rst
94 @@ -7,3 +7,4 @@ Dynamic metadata are created on the fly when files and blocks are
95  allocated to files.
97  .. include:: inodes.rst
98 +.. include:: ifork.rst
99 diff --git a/Documentation/filesystems/ext4/ondisk/ifork.rst b/Documentation/filesystems/ext4/ondisk/ifork.rst
100 new file mode 100644
101 index 000000000000..5dbe3b2b121a
102 --- /dev/null
103 +++ b/Documentation/filesystems/ext4/ondisk/ifork.rst
104 @@ -0,0 +1,194 @@
105 +.. SPDX-License-Identifier: GPL-2.0
107 +The Contents of inode.i\_block
108 +------------------------------
110 +Depending on the type of file an inode describes, the 60 bytes of
111 +storage in ``inode.i_block`` can be used in different ways. In general,
112 +regular files and directories will use it for file block indexing
113 +information, and special files will use it for special purposes.
115 +Symbolic Links
116 +~~~~~~~~~~~~~~
118 +The target of a symbolic link will be stored in this field if the target
119 +string is less than 60 bytes long. Otherwise, either extents or block
120 +maps will be used to allocate data blocks to store the link target.
122 +Direct/Indirect Block Addressing
123 +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
125 +In ext2/3, file block numbers were mapped to logical block numbers by
126 +means of an (up to) three level 1-1 block map. To find the logical block
127 +that stores a particular file block, the code would navigate through
128 +this increasingly complicated structure. Notice that there is neither a
129 +magic number nor a checksum to provide any level of confidence that the
130 +block isn't full of garbage.
132 +.. ifconfig:: builder != 'latex'
134 +   .. include:: blockmap.rst
136 +.. ifconfig:: builder == 'latex'
138 +   [Table omitted because LaTeX doesn't support nested tables.]
140 +Note that with this block mapping scheme, it is necessary to fill out a
141 +lot of mapping data even for a large contiguous file! This inefficiency
142 +led to the creation of the extent mapping scheme, discussed below.
144 +Notice also that a file using this mapping scheme cannot be placed
145 +higher than 2^32 blocks.
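The arithmetic behind the block map can be checked with a short sketch. It is illustrative only and not taken from the kernel sources: the helper names `blockmap_ranges` and `mapping_blocks_needed` are hypothetical, and the code assumes 4-byte block numbers and a file laid out from file block 0 with no holes.

```python
def blockmap_ranges(block_size):
    """Return the inclusive (first, last) file block reachable through each
    class of i_block slot, for a block size given in bytes."""
    ptrs = block_size // 4  # each mapping block holds 4-byte block numbers
    direct = (0, 11)
    indirect = (12, 11 + ptrs)
    double = (indirect[1] + 1, indirect[1] + ptrs ** 2)
    triple = (double[1] + 1, double[1] + ptrs ** 3)
    return {"direct": direct, "indirect": indirect,
            "double": double, "triple": triple}


def mapping_blocks_needed(nr_blocks, block_size):
    """Count the mapping (non-data) blocks needed to describe a file of
    nr_blocks file blocks starting at file block 0 (assumed: no holes)."""
    ptrs = block_size // 4
    remaining = max(nr_blocks - 12, 0)  # the first 12 blocks are mapped directly
    total = 0
    for level in (1, 2, 3):  # indirect, double-indirect, triple-indirect slot
        span = min(remaining, ptrs ** level)
        for tier in range(1, level + 1):
            # ceil(span / ptrs**tier) mapping blocks are needed at this tier
            total += -(-span // ptrs ** tier)
        remaining -= span
    return total


ranges = blockmap_ranges(4096)
print(ranges["indirect"], ranges["double"])
print(mapping_blocks_needed(1000, 4096), mapping_blocks_needed(262144, 4096))
```

With 4KiB blocks, a contiguous 1,000-block file needs one indirect block, while a contiguous 262,144-block (1GiB) file already needs 257 mapping blocks under these assumptions.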
147 +Extent Tree
148 +~~~~~~~~~~~
150 +In ext4, the file to logical block map has been replaced with an extent
151 +tree. Under the old scheme, allocating a contiguous run of 1,000 blocks
152 +requires an indirect block to map all 1,000 entries; with extents, the
153 +mapping is reduced to a single ``struct ext4_extent`` with
154 +``ee_len = 1000``. If flex\_bg is enabled, it is possible to allocate
155 +very large files with a single extent, at a considerable reduction in
156 +metadata block use, and some improvement in disk efficiency. The inode
157 +must have the extents flag (0x80000) set for this feature to be in
158 +use.
160 +Extents are arranged as a tree. Each node of the tree begins with a
161 +``struct ext4_extent_header``. If the node is an interior node
162 +(``eh.eh_depth > 0``), the header is followed by ``eh.eh_entries``
163 +instances of ``struct ext4_extent_idx``; each of these index entries
164 +points to a block containing more nodes in the extent tree. If the node
165 +is a leaf node (``eh.eh_depth == 0``), then the header is followed by
166 +``eh.eh_entries`` instances of ``struct ext4_extent``; these instances
167 +point to the file's data blocks. The root node of the extent tree is
168 +stored in ``inode.i_block``, which allows for the first four extents to
169 +be recorded without the use of extra metadata blocks.
171 +The extent tree header is recorded in ``struct ext4_extent_header``,
172 +which is 12 bytes long:
174 +.. list-table::
175 +   :widths: 1 1 1 77
176 +   :header-rows: 1
178 +   * - Offset
179 +     - Size
180 +     - Name
181 +     - Description
182 +   * - 0x0
183 +     - \_\_le16
184 +     - eh\_magic
185 +     - Magic number, 0xF30A.
186 +   * - 0x2
187 +     - \_\_le16
188 +     - eh\_entries
189 +     - Number of valid entries following the header.
190 +   * - 0x4
191 +     - \_\_le16
192 +     - eh\_max
193 +     - Maximum number of entries that could follow the header.
194 +   * - 0x6
195 +     - \_\_le16
196 +     - eh\_depth
197 +     - Depth of this extent node in the extent tree. 0 = this extent node
198 +       points to data blocks; otherwise, this extent node points to other
199 +       extent nodes. The extent tree can be at most 5 levels deep: a logical
200 +       block number can be at most ``2^32``, and the smallest ``n`` that
201 +       satisfies ``4*(((blocksize - 12)/12)^n) >= 2^32`` is 5.
202 +   * - 0x8
203 +     - \_\_le32
204 +     - eh\_generation
205 +     - Generation of the tree. (Used by Lustre, but not standard ext4).
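As a rough illustration of the header layout in the table above, the following sketch decodes the 12-byte header with Python's `struct` module and evaluates the depth bound quoted for `eh_depth`. The function names, the returned dictionary keys, and the sanity checks are assumptions made for this example, not kernel interfaces.

```python
import struct

EXT4_EXT_MAGIC = 0xF30A
# Little-endian: eh_magic, eh_entries, eh_max, eh_depth (u16 each), eh_generation (u32)
HEADER = struct.Struct("<HHHHI")


def parse_extent_header(buf):
    """Decode the header at the start of an extent tree node."""
    magic, entries, max_entries, depth, generation = HEADER.unpack_from(buf, 0)
    if magic != EXT4_EXT_MAGIC:
        raise ValueError("bad extent header magic: 0x%04X" % magic)
    if entries > max_entries:
        raise ValueError("eh_entries exceeds eh_max")
    return {"entries": entries, "max": max_entries,
            "depth": depth, "generation": generation}


def depth_bound(block_size):
    """Smallest n such that 4 * ((block_size - 12) / 12) ** n >= 2 ** 32."""
    per_block = (block_size - 12) // 12  # 12-byte entries after a 12-byte header
    n = 0
    while 4 * per_block ** n < 2 ** 32:
        n += 1
    return n


# A root node as it might appear in the 60 bytes of inode.i_block:
# one valid extent out of (60 - 12) // 12 == 4 slots, leaf level.
i_block = HEADER.pack(EXT4_EXT_MAGIC, 1, 4, 0, 0) + bytes(48)
print(parse_extent_header(i_block))
print(depth_bound(1024), depth_bound(4096))
```

Under this reading of the formula, the bound of 5 corresponds to 1KiB blocks; larger block sizes reach 2^32 blocks with fewer levels.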
207 +Internal nodes of the extent tree, also known as index nodes, are
208 +recorded as ``struct ext4_extent_idx``, and are 12 bytes long:
210 +.. list-table::
211 +   :widths: 1 1 1 77
212 +   :header-rows: 1
214 +   * - Offset
215 +     - Size
216 +     - Name
217 +     - Description
218 +   * - 0x0
219 +     - \_\_le32
220 +     - ei\_block
221 +     - This index node covers file blocks from 'block' onward.
222 +   * - 0x4
223 +     - \_\_le32
224 +     - ei\_leaf\_lo
225 +     - Lower 32 bits of the block number of the extent node that is the next
226 +       level lower in the tree. The tree node pointed to can be either another
227 +       internal node or a leaf node, described below.
228 +   * - 0x8
229 +     - \_\_le16
230 +     - ei\_leaf\_hi
231 +     - Upper 16 bits of the block number whose lower 32 bits are in the previous field.
232 +   * - 0xA
233 +     - \_\_u16
234 +     - ei\_unused
235 +     -
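A minimal sketch of how an index entry could be decoded, assuming the little-endian layout in the table above. The name `parse_extent_idx` and the dictionary keys are hypothetical; the only logic shown is recombining `ei_leaf_hi` and `ei_leaf_lo` into one 48-bit block number.

```python
import struct

# Little-endian: ei_block (u32), ei_leaf_lo (u32), ei_leaf_hi (u16), ei_unused (u16)
INDEX = struct.Struct("<IIHH")


def parse_extent_idx(buf, offset=0):
    """Decode one 12-byte index entry found at the given byte offset."""
    ei_block, leaf_lo, leaf_hi, _unused = INDEX.unpack_from(buf, offset)
    return {
        "first_file_block": ei_block,             # covers file blocks from here onward
        "child_block": (leaf_hi << 32) | leaf_lo,  # 48-bit block number of the child node
    }


raw = INDEX.pack(1000, 0x89ABCDEF, 0x0123, 0)
print(parse_extent_idx(raw))
```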
237 +Leaf nodes of the extent tree are recorded as ``struct ext4_extent``,
238 +and are also 12 bytes long:
240 +.. list-table::
241 +   :widths: 1 1 1 77
242 +   :header-rows: 1
244 +   * - Offset
245 +     - Size
246 +     - Name
247 +     - Description
248 +   * - 0x0
249 +     - \_\_le32
250 +     - ee\_block
251 +     - First file block number that this extent covers.
252 +   * - 0x4
253 +     - \_\_le16
254 +     - ee\_len
255 +     - Number of blocks covered by extent. If the value of this field is <=
256 +       32768, the extent is initialized. If the value of the field is > 32768,
257 +       the extent is uninitialized and the actual extent length is ``ee_len`` -
258 +       32768. Therefore, the maximum length of an initialized extent is 32768
259 +       blocks, and the maximum length of an uninitialized extent is 32767.
260 +   * - 0x6
261 +     - \_\_le16
262 +     - ee\_start\_hi
263 +     - Upper 16 bits of the block number to which this extent points.
264 +   * - 0x8
265 +     - \_\_le32
266 +     - ee\_start\_lo
267 +     - Lower 32 bits of the block number to which this extent points.
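The `ee_len` encoding described above can be sketched as follows. This is an illustration under the layout given in the table, with a hypothetical function name and result keys; it only shows how the initialized state and the real length are recovered from the raw field.

```python
import struct

# Little-endian: ee_block (u32), ee_len (u16), ee_start_hi (u16), ee_start_lo (u32)
EXTENT = struct.Struct("<IHHI")


def parse_extent(buf, offset=0):
    """Decode one 12-byte leaf extent found at the given byte offset."""
    ee_block, ee_len, start_hi, start_lo = EXTENT.unpack_from(buf, offset)
    initialized = ee_len <= 32768
    # Values above 32768 mark an uninitialized extent; subtract to get the length.
    length = ee_len if initialized else ee_len - 32768
    return {
        "first_file_block": ee_block,
        "length": length,
        "initialized": initialized,
        "start_block": (start_hi << 32) | start_lo,
    }


# The 1,000-block contiguous run mentioned earlier fits in a single extent.
print(parse_extent(EXTENT.pack(0, 1000, 0, 5000)))
# An uninitialized extent of 10 blocks is stored with ee_len = 32768 + 10.
print(parse_extent(EXTENT.pack(0, 32768 + 10, 1, 2)))
```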
269 +Prior to the introduction of metadata checksums, the extent header +
270 +extent entries always left at least 4 bytes of unallocated space at the
271 +end of each extent tree data block (because (2^x % 12) >= 4). Therefore,
272 +the 32-bit checksum is inserted into this space. The 4 extents in the
273 +inode do not need checksumming, since the inode is already checksummed.
274 +The checksum is calculated against the FS UUID, the inode number, the
275 +inode generation, and the entire extent block leading up to (but not
276 +including) the checksum itself.
278 +``struct ext4_extent_tail`` is 4 bytes long:
280 +.. list-table::
281 +   :widths: 1 1 1 77
282 +   :header-rows: 1
284 +   * - Offset
285 +     - Size
286 +     - Name
287 +     - Description
288 +   * - 0x0
289 +     - \_\_le32
290 +     - eb\_checksum
291 +     - Checksum of the extent block, crc32c(uuid+inum+igeneration+extentblock)
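The claim that every extent block leaves at least 4 spare bytes can be checked numerically. The sketch below assumes the tail sits immediately after the `eh_max` entry slots that follow the header, and that `eh_max` is the largest entry count that fits in the block; `extent_block_layout` is a hypothetical helper, and the checksum itself is not computed here.

```python
HEADER_SIZE = 12  # struct ext4_extent_header
ENTRY_SIZE = 12   # struct ext4_extent or struct ext4_extent_idx
TAIL_SIZE = 4     # struct ext4_extent_tail


def extent_block_layout(block_size):
    """Return (eh_max, tail_offset, slack) for an extent tree block."""
    eh_max = (block_size - HEADER_SIZE) // ENTRY_SIZE
    tail_offset = HEADER_SIZE + ENTRY_SIZE * eh_max  # assumed tail position
    slack = block_size - tail_offset                 # bytes left for the tail
    return eh_max, tail_offset, slack


for shift in range(10, 17):  # 1KiB through 64KiB blocks
    block_size = 1 << shift
    eh_max, tail_offset, slack = extent_block_layout(block_size)
    print(block_size, eh_max, tail_offset, slack)
```

Because the header and the entries are both 12 bytes, the slack equals `block_size % 12`, which alternates between 4 and 8 for power-of-two block sizes in this range.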
293 +Inline Data
294 +~~~~~~~~~~~
296 +If the inline data feature is enabled for the filesystem and the flag is
297 +set for the inode, it is possible that the first 60 bytes of the file
298 +data are stored here.