docs: make ext4 readme tables readable

From: Darrick J. Wong <darrick.wong@oracle.com>

The tables in the ext4 readme are not particularly space efficient in
the text or html outputs, and they're totally broken in the pdf output.
Convert them into titled paragraphs so that they render more nicely.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

Documentation/filesystems/ext4/ext4.rst | 821 +++++++++++++++----------------
1 file changed, 391 insertions(+), 430 deletions(-)
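
Note for anyone repeating this kind of conversion on another file: the commit does not say how the tables were converted, so the sketch below is not the method used here. It is one minimal way to turn a two-column reST simple table into titled paragraphs. The function name table_to_paragraphs, the two-space/eight-space indentation, and the rule that rows between blank lines form a single entry are all assumptions made for illustration.

```python
import re
import textwrap


def table_to_paragraphs(table: str, width: int = 72) -> str:
    """Convert a two-column reST simple table into titled paragraphs.

    Hypothetical helper: rows separated by blank lines are treated as one
    entry, and names stacked on consecutive rows are joined with commas.
    """
    lines = table.splitlines()
    # Border lines consist only of '=' runs separated by spaces.
    borders = [i for i, line in enumerate(lines)
               if line.strip() and set(line.strip()) <= {"=", " "}]
    if len(borders) < 3:
        raise ValueError("expected top, header and bottom border lines")

    # The description column starts where the second '=' run begins.
    match = re.match(r"(\s*=+\s+)=", lines[borders[0]])
    if match is None:
        raise ValueError("expected a two-column border line")
    split = match.end(1)

    entries = []          # list of (names, description fragments)
    names, text = [], []
    # The trailing "" flushes the last entry.
    for line in lines[borders[1] + 1:borders[-1]] + [""]:
        if line.strip():
            name, desc = line[:split].strip(), line[split:].strip()
            if name:
                names.append(name)
            if desc:
                text.append(desc)
            continue
        if names:
            entries.append((names, text))
        elif text and entries:
            # Unnamed continuation block: keep it with the previous entry.
            entries[-1][1].extend(text)
        names, text = [], []

    out = []
    for names, text in entries:
        out.append("  " + ", ".join(names))
        out.append(textwrap.fill(" ".join(text), width=width,
                                 initial_indent=" " * 8,
                                 subsequent_indent=" " * 8))
        out.append("")
    return "\n".join(out).rstrip() + "\n"
```

Rows that sit next to each other without a blank line (bsddf and minixdf in the old table, for instance) would be merged by this heuristic, so the output still needs a manual pass.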
diff --git a/Documentation/filesystems/ext4/ext4.rst b/Documentation/filesystems/ext4/ext4.rst
index 9d4368d591fa..e2b6bb7c2730 100644
--- a/Documentation/filesystems/ext4/ext4.rst
+++ b/Documentation/filesystems/ext4/ext4.rst
@@ -101,269 +101,256 @@ Options
When mounting an ext4 filesystem, the following option are accepted:
-======================= =======================================================
-Mount Option Description
-======================= =======================================================
-ro Mount filesystem read only. Note that ext4 will
- replay the journal (and thus write to the
- partition) even when mounted "read only". The
- mount options "ro,noload" can be used to prevent
- writes to the filesystem.
-journal_checksum Enable checksumming of the journal transactions.
- This will allow the recovery code in e2fsck and the
- kernel to detect corruption in the kernel. It is a
- compatible change and will be ignored by older kernels.
-journal_async_commit Commit block can be written to disk without waiting
- for descriptor blocks. If enabled older kernels cannot
- mount the device. This will enable 'journal_checksum'
-journal_dev=devnum When the external journal device's major/minor numbers
- have changed, these options allow the user to specify
- the new journal location. The journal device is
- identified through either its new major/minor numbers
- encoded in devnum, or via a path to the device.
-norecovery Don't load the journal on mounting. Note that
-noload if the filesystem was not unmounted cleanly,
- skipping the journal replay will lead to the
- filesystem containing inconsistencies that can
- lead to any number of problems.
-data=journal All data are committed into the journal prior to being
- written into the main file system. Enabling
- this mode will disable delayed allocation and
-data=ordered (*) All data are forced directly out to the main file
- system prior to its metadata being committed to the
-data=writeback Data ordering is not preserved, data may be written
- into the main file system after its metadata has been
- committed to the journal.
-commit=nrsec (*) Ext4 can be told to sync all its data and metadata
- every 'nrsec' seconds. The default value is 5 seconds.
- This means that if you lose your power, you will lose
- as much as the latest 5 seconds of work (your
- filesystem will not be damaged though, thanks to the
- journaling). This default value (or any low value)
- will hurt performance, but it's good for data-safety.
- Setting it to 0 will have the same effect as leaving
- it at the default (5 seconds).
- Setting it to very large values will improve
-barrier=<0|1(*)> This enables/disables the use of write barriers in
-barrier(*) the jbd code. barrier=0 disables, barrier=1 enables.
-nobarrier This also requires an IO stack which can support
- barriers, and if jbd gets an error on a barrier
- write, it will disable again with a warning.
- Write barriers enforce proper on-disk ordering
- of journal commits, making volatile disk write caches
- safe to use, at some performance penalty. If
- your disks are battery-backed in one way or another,
- disabling barriers may safely improve performance.
- The mount options "barrier" and "nobarrier" can
- also be used to enable or disable barriers, for
- consistency with other ext4 mount options.
-inode_readahead_blks=n This tuning parameter controls the maximum
- number of inode table blocks that ext4's inode
- table readahead algorithm will pre-read into
- the buffer cache. The default value is 32 blocks.
-nouser_xattr Disables Extended User Attributes. See the
- attr(5) manual page for more information about
- extended attributes.
-noacl This option disables POSIX Access Control List
- support. If ACL support is enabled in the kernel
- configuration (CONFIG_EXT4_FS_POSIX_ACL), ACL is
- enabled by default on mount. See the acl(5) manual
- page for more information about acl.
-bsddf (*) Make 'df' act like BSD.
-minixdf Make 'df' act like Minix.
-debug Extra debugging information is sent to syslog.
-abort Simulate the effects of calling ext4_abort() for
- debugging purposes. This is normally used while
- remounting a filesystem which is already mounted.
-errors=remount-ro Remount the filesystem read-only on an error.
-errors=continue Keep going on a filesystem error.
-errors=panic Panic and halt the machine if an error occurs.
- (These mount options override the errors behavior
- specified in the superblock, which can be configured
-data_err=ignore(*) Just print an error message if an error occurs
- in a file data buffer in ordered mode.
-data_err=abort Abort the journal if an error occurs in a file
- data buffer in ordered mode.
-grpid New objects have the group ID of their parent.
-nogrpid (*) New objects have the group ID of their creator.
-resgid=n The group ID which may use the reserved blocks.
-resuid=n The user ID which may use the reserved blocks.
-sb=n Use alternate superblock at this location.
-quota These options are ignored by the filesystem. They
-noquota are used only by quota tools to recognize volumes
-grpquota where quota should be turned on. See documentation
-usrquota in the quota-tools package for more details
- (http://sourceforge.net/projects/linuxquota).
-jqfmt=<quota type> These options tell filesystem details about quota
-usrjquota=<file> so that quota information can be properly updated
-grpjquota=<file> during journal replay. They replace the above
- quota options. See documentation in the quota-tools
- package for more details
- (http://sourceforge.net/projects/linuxquota).
-stripe=n Number of filesystem blocks that mballoc will try
- to use for allocation size and alignment. For RAID5/6
- systems this should be the number of data
- disks * RAID chunk size in file system blocks.
-delalloc (*) Defer block allocation until just before ext4
- writes out the block(s) in question. This
- allows ext4 to better allocation decisions
-nodelalloc Disable delayed allocation. Blocks are allocated
- when the data is copied from userspace to the
- page cache, either via the write(2) system call
- or when an mmap'ed page which was previously
- unallocated is written for the first time.
-max_batch_time=usec Maximum amount of time ext4 should wait for
- additional filesystem operations to be batch
- together with a synchronous write operation.
- Since a synchronous write operation is going to
- force a commit and then a wait for the I/O
- complete, it doesn't cost much, and can be a
- huge throughput win, we wait for a small amount
- of time to see if any other transactions can
- piggyback on the synchronous write. The
- algorithm used is designed to automatically tune
- for the speed of the disk, by measuring the
- amount of time (on average) that it takes to
- finish committing a transaction. Call this time
- the "commit time". If the time that the
- transaction has been running is less than the
- commit time, ext4 will try sleeping for the
- commit time to see if other operations will join
- the transaction. The commit time is capped by
- the max_batch_time, which defaults to 15000us
- (15ms). This optimization can be turned off
- entirely by setting max_batch_time to 0.
-min_batch_time=usec This parameter sets the commit time (as
- described above) to be at least min_batch_time.
- It defaults to zero microseconds. Increasing
- this parameter may improve the throughput of
- multi-threaded, synchronous workloads on very
- fast disks, at the cost of increasing latency.
-journal_ioprio=prio The I/O priority (from 0 to 7, where 0 is the
- highest priority) which should be used for I/O
- operations submitted by kjournald2 during a
- commit operation. This defaults to 3, which is
- a slightly higher priority than the default I/O
-auto_da_alloc(*) Many broken applications don't use fsync() when
-noauto_da_alloc replacing existing files via patterns such as
- fd = open("foo.new")/write(fd,..)/close(fd)/
- rename("foo.new", "foo"), or worse yet,
- fd = open("foo", O_TRUNC)/write(fd,..)/close(fd).
- If auto_da_alloc is enabled, ext4 will detect
- the replace-via-rename and replace-via-truncate
- patterns and force that any delayed allocation
- blocks are allocated such that at the next
- journal commit, in the default data=ordered
- mode, the data blocks of the new file are forced
- to disk before the rename() operation is
- committed. This provides roughly the same level
- of guarantees as ext3, and avoids the
- "zero-length" problem that can happen when a
- system crashes before the delayed allocation
- blocks are forced to disk.
-noinit_itable Do not initialize any uninitialized inode table
- blocks in the background. This feature may be
- used by installation CD's so that the install
- process can complete as quickly as possible; the
- inode table initialization process would then be
- deferred until the next time the file system
-init_itable=n The lazy itable init code will wait n times the
- number of milliseconds it took to zero out the
- previous block group's inode table. This
- minimizes the impact on the system performance
- while file system's inode table is being initialized.
-discard Controls whether ext4 should issue discard/TRIM
-nodiscard(*) commands to the underlying block device when
- blocks are freed. This is useful for SSD devices
- and sparse/thinly-provisioned LUNs, but it is off
- by default until sufficient testing has been done.
-nouid32 Disables 32-bit UIDs and GIDs. This is for
- interoperability with older kernels which only
- store and expect 16-bit values.
-block_validity(*) These options enable or disable the in-kernel
-noblock_validity facility for tracking filesystem metadata blocks
- within internal data structures. This allows multi-
- block allocator and other routines to notice
- bugs or corrupted allocation bitmaps which cause
- blocks to be allocated which overlap with
- filesystem metadata blocks.
-dioread_lock Controls whether or not ext4 should use the DIO read
-dioread_nolock locking. If the dioread_nolock option is specified
- ext4 will allocate uninitialized extent before buffer
- write and convert the extent to initialized after IO
- completes. This approach allows ext4 code to avoid
- using inode mutex, which improves scalability on high
- speed storages. However this does not work with
- data journaling and dioread_nolock option will be
- ignored with kernel warning. Note that dioread_nolock
- code path is only used for extent-based files.
- Because of the restrictions this options comprises
- it is off by default (e.g. dioread_lock).
-max_dir_size_kb=n This limits the size of directories so that any
- attempt to expand them beyond the specified
- limit in kilobytes will cause an ENOSPC error.
- This is useful in memory constrained
- environments, where a very large directory can
- cause severe performance problems or even
- provoke the Out Of Memory killer. (For example,
- if there is only 512mb memory available, a 176mb
- directory may seriously cramp the system's style.)
-i_version Enable 64-bit inode version support. This option is
-dax Use direct access (no page cache). See
- Documentation/filesystems/dax.txt. Note that
- this option is incompatible with data=journal.
-======================= =======================================================
+ Mount filesystem read only. Note that ext4 will replay the journal (and
+ thus write to the partition) even when mounted "read only". The mount
+ options "ro,noload" can be used to prevent writes to the filesystem.
+ Enable checksumming of the journal transactions. This will allow the
+ recovery code in e2fsck and the kernel to detect corruption in the
+ kernel. It is a compatible change and will be ignored by older
+ journal_async_commit
+ Commit block can be written to disk without waiting for descriptor
+ blocks. If enabled older kernels cannot mount the device. This will
+ enable 'journal_checksum' internally.
+ journal_path=path, journal_dev=devnum
+ When the external journal device's major/minor numbers have changed,
+ these options allow the user to specify the new journal location. The
+ journal device is identified through either its new major/minor numbers
+ encoded in devnum, or via a path to the device.
+ Don't load the journal on mounting. Note that if the filesystem was
+ not unmounted cleanly, skipping the journal replay will lead to the
+ filesystem containing inconsistencies that can lead to any number of
+ All data are committed into the journal prior to being written into the
+ main file system. Enabling this mode will disable delayed allocation
+ and O_DIRECT support.
+ All data are forced directly out to the main file system prior to its
+ metadata being committed to the journal.
+ Data ordering is not preserved, data may be written into the main file
+ system after its metadata has been committed to the journal.
+ Ext4 can be told to sync all its data and metadata every 'nrsec'
+ seconds. The default value is 5 seconds. This means that if you lose
+ your power, you will lose as much as the latest 5 seconds of work (your
+ filesystem will not be damaged though, thanks to the journaling). This
+ default value (or any low value) will hurt performance, but it's good
+ for data-safety. Setting it to 0 will have the same effect as leaving
+ it at the default (5 seconds). Setting it to very large values will
+ improve performance.
+ barrier=<0|1(*)>, barrier(*), nobarrier
+ This enables/disables the use of write barriers in the jbd code.
+ barrier=0 disables, barrier=1 enables. This also requires an IO stack
+ which can support barriers, and if jbd gets an error on a barrier
+ write, it will disable again with a warning. Write barriers enforce
+ proper on-disk ordering of journal commits, making volatile disk write
+ caches safe to use, at some performance penalty. If your disks are
+ battery-backed in one way or another, disabling barriers may safely
+ improve performance. The mount options "barrier" and "nobarrier" can
+ also be used to enable or disable barriers, for consistency with other
+ ext4 mount options.
+ inode_readahead_blks=n
+ This tuning parameter controls the maximum number of inode table blocks
+ that ext4's inode table readahead algorithm will pre-read into the
+ buffer cache. The default value is 32 blocks.
+ Disables Extended User Attributes. See the attr(5) manual page for
+ more information about extended attributes.
+ This option disables POSIX Access Control List support. If ACL support
+ is enabled in the kernel configuration (CONFIG_EXT4_FS_POSIX_ACL), ACL
+ is enabled by default on mount. See the acl(5) manual page for more
+ information about acl.
+ Make 'df' act like BSD.
+ Make 'df' act like Minix.
+ Extra debugging information is sent to syslog.
+ Simulate the effects of calling ext4_abort() for debugging purposes.
+ This is normally used while remounting a filesystem which is already
+ Remount the filesystem read-only on an error.
+ Keep going on a filesystem error.
+ Panic and halt the machine if an error occurs. (These mount options
+ override the errors behavior specified in the superblock, which can be
+ configured using tune2fs)
+ Just print an error message if an error occurs in a file data buffer in
+ Abort the journal if an error occurs in a file data buffer in ordered
+ New objects have the group ID of their parent.
+ nogrpid (*) | sysvgroups
+ New objects have the group ID of their creator.
+ The group ID which may use the reserved blocks.
+ The user ID which may use the reserved blocks.
+ Use alternate superblock at this location.
+ quota, noquota, grpquota, usrquota
+ These options are ignored by the filesystem. They are used only by
+ quota tools to recognize volumes where quota should be turned on. See
+ documentation in the quota-tools package for more details
+ (http://sourceforge.net/projects/linuxquota).
+ jqfmt=<quota type>, usrjquota=<file>, grpjquota=<file>
+ These options tell filesystem details about quota so that quota
+ information can be properly updated during journal replay. They replace
+ the above quota options. See documentation in the quota-tools package
+ for more details (http://sourceforge.net/projects/linuxquota).
+ Number of filesystem blocks that mballoc will try to use for allocation
+ size and alignment. For RAID5/6 systems this should be the number of
+ data disks * RAID chunk size in file system blocks.
+ Defer block allocation until just before ext4 writes out the block(s)
+ in question. This allows ext4 to better allocation decisions more
+ Disable delayed allocation. Blocks are allocated when the data is
+ copied from userspace to the page cache, either via the write(2) system
+ call or when an mmap'ed page which was previously unallocated is
+ written for the first time.
+ max_batch_time=usec
+ Maximum amount of time ext4 should wait for additional filesystem
+ operations to be batch together with a synchronous write operation.
+ Since a synchronous write operation is going to force a commit and then
+ a wait for the I/O complete, it doesn't cost much, and can be a huge
+ throughput win, we wait for a small amount of time to see if any other
+ transactions can piggyback on the synchronous write. The algorithm
+ used is designed to automatically tune for the speed of the disk, by
+ measuring the amount of time (on average) that it takes to finish
+ committing a transaction. Call this time the "commit time". If the
+ time that the transaction has been running is less than the commit
+ time, ext4 will try sleeping for the commit time to see if other
+ operations will join the transaction. The commit time is capped by
+ the max_batch_time, which defaults to 15000us (15ms). This
+ optimization can be turned off entirely by setting max_batch_time to 0.
+ min_batch_time=usec
+ This parameter sets the commit time (as described above) to be at least
+ min_batch_time. It defaults to zero microseconds. Increasing this
+ parameter may improve the throughput of multi-threaded, synchronous
+ workloads on very fast disks, at the cost of increasing latency.
+ journal_ioprio=prio
+ The I/O priority (from 0 to 7, where 0 is the highest priority) which
+ should be used for I/O operations submitted by kjournald2 during a
+ commit operation. This defaults to 3, which is a slightly higher
+ priority than the default I/O priority.
+ auto_da_alloc(*), noauto_da_alloc
+ Many broken applications don't use fsync() when replacing existing
+ files via patterns such as fd = open("foo.new")/write(fd,..)/close(fd)/
+ rename("foo.new", "foo"), or worse yet, fd = open("foo",
+ O_TRUNC)/write(fd,..)/close(fd). If auto_da_alloc is enabled, ext4
+ will detect the replace-via-rename and replace-via-truncate patterns
+ and force that any delayed allocation blocks are allocated such that at
+ the next journal commit, in the default data=ordered mode, the data
+ blocks of the new file are forced to disk before the rename() operation
+ is committed. This provides roughly the same level of guarantees as
+ ext3, and avoids the "zero-length" problem that can happen when a
+ system crashes before the delayed allocation blocks are forced to disk.
+ Do not initialize any uninitialized inode table blocks in the
+ background. This feature may be used by installation CD's so that the
+ install process can complete as quickly as possible; the inode table
+ initialization process would then be deferred until the next time the
+ file system is unmounted.
+ The lazy itable init code will wait n times the number of milliseconds
+ it took to zero out the previous block group's inode table. This
+ minimizes the impact on the system performance while file system's
+ inode table is being initialized.
+ discard, nodiscard(*)
+ Controls whether ext4 should issue discard/TRIM commands to the
+ underlying block device when blocks are freed. This is useful for SSD
+ devices and sparse/thinly-provisioned LUNs, but it is off by default
+ until sufficient testing has been done.
+ Disables 32-bit UIDs and GIDs. This is for interoperability with
+ older kernels which only store and expect 16-bit values.
+ block_validity(*), noblock_validity
+ These options enable or disable the in-kernel facility for tracking
+ filesystem metadata blocks within internal data structures. This
+ allows multi- block allocator and other routines to notice bugs or
+ corrupted allocation bitmaps which cause blocks to be allocated which
+ overlap with filesystem metadata blocks.
+ dioread_lock, dioread_nolock
+ Controls whether or not ext4 should use the DIO read locking. If the
+ dioread_nolock option is specified ext4 will allocate uninitialized
+ extent before buffer write and convert the extent to initialized after
+ IO completes. This approach allows ext4 code to avoid using inode
+ mutex, which improves scalability on high speed storages. However this
+ does not work with data journaling and dioread_nolock option will be
+ ignored with kernel warning. Note that dioread_nolock code path is only
+ used for extent-based files. Because of the restrictions this options
+ comprises it is off by default (e.g. dioread_lock).
+ This limits the size of directories so that any attempt to expand them
+ beyond the specified limit in kilobytes will cause an ENOSPC error.
+ This is useful in memory constrained environments, where a very large
+ directory can cause severe performance problems or even provoke the Out
+ Of Memory killer. (For example, if there is only 512mb memory
+ available, a 176mb directory may seriously cramp the system's style.)
+ Enable 64-bit inode version support. This option is off by default.
+ Use direct access (no page cache). See
+ Documentation/filesystems/dax.txt. Note that this option is
+ incompatible with data=journal.
@@ -407,11 +394,8 @@ in table below.
Files in /proc/fs/ext4/<devname>
-================ =======
-================ =======
- mb_groups details of multiblock allocator buddy cache of free blocks
-================ =======
+ details of multiblock allocator buddy cache of free blocks
@@ -426,74 +410,71 @@ Files in /sys/fs/ext4/<devname>:
(see also Documentation/ABI/testing/sysfs-fs-ext4)
-============================= =================================================
-============================= =================================================
- delayed_allocation_blocks This file is read-only and shows the number of
- blocks that are dirty in the page cache, but
- which do not have their location in the
- filesystem allocated yet.
-inode_goal Tuning parameter which (if non-zero) controls
- the goal inode used by the inode allocator in
- preference to all other allocation heuristics.
- This is intended for debugging use only, and
- should be 0 on production systems.
-inode_readahead_blks Tuning parameter which controls the maximum
- number of inode table blocks that ext4's inode
- table readahead algorithm will pre-read into
-lifetime_write_kbytes This file is read-only and shows the number of
- kilobytes of data that have been written to this
- filesystem since it was created.
- max_writeback_mb_bump The maximum number of megabytes the writeback
- code will try to write out before move on to
- mb_group_prealloc The multiblock allocator will round up allocation
- requests to a multiple of this tuning parameter if
- the stripe size is not set in the ext4 superblock
- mb_max_to_scan The maximum number of extents the multiblock
- allocator will search to find the best extent
- mb_min_to_scan The minimum number of extents the multiblock
- allocator will search to find the best extent
- mb_order2_req Tuning parameter which controls the minimum size
- for requests (as a power of 2) where the buddy
- mb_stats Controls whether the multiblock allocator should
- collect statistics, which are shown during the
- unmount. 1 means to collect statistics, 0 means
- not to collect statistics
- mb_stream_req Files which have fewer blocks than this tunable
- parameter will have their blocks allocated out
- of a block group specific preallocation pool, so
- that small files are packed closely together.
- Each large file will have its blocks allocated
- out of its own unique preallocation pool.
- session_write_kbytes This file is read-only and shows the number of
- kilobytes of data that have been written to this
- filesystem since it was mounted.
- reserved_clusters This is RW file and contains number of reserved
- clusters in the file system which will be used
- in the specific situations to avoid costly
- zeroout, unexpected ENOSPC, or possible data
- loss. The default is 2% or 4096 clusters,
- whichever is smaller and this can be changed
- however it can never exceed number of clusters
- in the file system. If there is not enough space
- for the reserved space when mounting the file
- mount will _not_ fail.
-============================= =================================================
+ delayed_allocation_blocks
+ This file is read-only and shows the number of blocks that are dirty in
+ the page cache, but which do not have their location in the filesystem
+ Tuning parameter which (if non-zero) controls the goal inode used by
+ the inode allocator in preference to all other allocation heuristics.
+ This is intended for debugging use only, and should be 0 on production
+ inode_readahead_blks
+ Tuning parameter which controls the maximum number of inode table
+ blocks that ext4's inode table readahead algorithm will pre-read into
+ lifetime_write_kbytes
+ This file is read-only and shows the number of kilobytes of data that
+ have been written to this filesystem since it was created.
+ max_writeback_mb_bump
+ The maximum number of megabytes the writeback code will try to write
+ out before move on to another inode.
+ The multiblock allocator will round up allocation requests to a
+ multiple of this tuning parameter if the stripe size is not set in the
+ The maximum number of extents the multiblock allocator will search to
+ find the best extent.
+ The minimum number of extents the multiblock allocator will search to
+ find the best extent.
+ Tuning parameter which controls the minimum size for requests (as a
+ power of 2) where the buddy cache is used.
+ Controls whether the multiblock allocator should collect statistics,
+ which are shown during the unmount. 1 means to collect statistics, 0
+ means not to collect statistics.
+ Files which have fewer blocks than this tunable parameter will have
+ their blocks allocated out of a block group specific preallocation
+ pool, so that small files are packed closely together. Each large file
+ will have its blocks allocated out of its own unique preallocation
+ session_write_kbytes
+ This file is read-only and shows the number of kilobytes of data that
+ have been written to this filesystem since it was mounted.
+ This is RW file and contains number of reserved clusters in the file
+ system which will be used in the specific situations to avoid costly
+ zeroout, unexpected ENOSPC, or possible data loss. The default is 2% or
+ 4096 clusters, whichever is smaller and this can be changed however it
+ can never exceed number of clusters in the file system. If there is not
+ enough space for the reserved space when mounting the file mount will
@@ -504,100 +485,80 @@ shown in the table below.
Table of Ext4 specific ioctls
-============================= =================================================
-============================= =================================================
- EXT4_IOC_GETFLAGS Get additional attributes associated with inode.
- The ioctl argument is an integer bitfield, with
- bit values described in ext4.h. This ioctl is an
- alias for FS_IOC_GETFLAGS.
- EXT4_IOC_SETFLAGS Set additional attributes associated with inode.
- The ioctl argument is an integer bitfield, with
- bit values described in ext4.h. This ioctl is an
- alias for FS_IOC_SETFLAGS.
- EXT4_IOC_GETVERSION
- EXT4_IOC_GETVERSION_OLD
- Get the inode i_generation number stored for
- each inode. The i_generation number is normally
- changed only when new inode is created and it is
- particularly useful for network filesystems. The
- '_OLD' version of this ioctl is an alias for
- EXT4_IOC_SETVERSION
- EXT4_IOC_SETVERSION_OLD
- Set the inode i_generation number stored for
- each inode. The '_OLD' version of this ioctl
- is an alias for FS_IOC_SETVERSION.
- EXT4_IOC_GROUP_EXTEND This ioctl has the same purpose as the resize
- mount option. It allows to resize filesystem
- to the end of the last existing block group,
- further resize has to be done with resize2fs,
- either online, or offline. The argument points
- to the unsigned logn number representing the
- filesystem new block count.
- EXT4_IOC_MOVE_EXT Move the block extents from orig_fd (the one
- this ioctl is pointing to) to the donor_fd (the
- one specified in move_extent structure passed
- as an argument to this ioctl). Then, exchange
- inode metadata between orig_fd and donor_fd.
- This is especially useful for online
- defragmentation, because the allocator has the
- opportunity to allocate moved blocks better,
- ideally into one contiguous extent.
- EXT4_IOC_GROUP_ADD Add a new group descriptor to an existing or
- new group descriptor block. The new group
- descriptor is described by ext4_new_group_input
- structure, which is passed as an argument to
- this ioctl. This is especially useful in
- conjunction with EXT4_IOC_GROUP_EXTEND,
- which allows online resize of the filesystem
- to the end of the last existing block group.
- Those two ioctls combined is used in userspace
- online resize tool (e.g. resize2fs).
755 - EXT4_IOC_MIGRATE This ioctl operates on the filesystem itself.
756 - It converts (migrates) ext3 indirect block mapped
757 - inode to ext4 extent mapped inode by walking
758 - through indirect block mapping of the original
759 - inode and converting contiguous block ranges
760 - into ext4 extents of the temporary inode. Then,
761 - inodes are swapped. This ioctl might help, when
762 - migrating from ext3 to ext4 filesystem, however
763 - suggestion is to create fresh ext4 filesystem
764 - and copy data from the backup. Note, that
765 - filesystem has to support extents for this ioctl
766 - to work.
768 - EXT4_IOC_ALLOC_DA_BLKS Force all of the delay allocated blocks to be
769 - allocated to preserve application-expected ext3
770 - behaviour. Note that this will also start
771 - triggering a write of the data blocks, but this
772 - behaviour may change in the future as it is
773 - not necessary and has been done this way only
774 - for sake of simplicity.
776 - EXT4_IOC_RESIZE_FS Resize the filesystem to a new size. The number
777 - of blocks of resized filesystem is passed in via
778 - 64 bit integer argument. The kernel allocates
779 - bitmaps and inode table, the userspace tool thus
780 - just passes the new number of blocks.
782 - EXT4_IOC_SWAP_BOOT Swap i_blocks and associated attributes
783 - (like i_blocks, i_size, i_flags, ...) from
784 - the specified inode with inode
785 - EXT4_BOOT_LOADER_INO (#5). This is typically
786 - used to store a boot loader in a secure part of
787 - the filesystem, where it can't be changed by a
788 - normal user by accident.
789 - The data blocks of the previous boot loader
790 - will be associated with the given inode.
791 -============================= =================================================
792 + EXT4_IOC_GETFLAGS
793 + Get additional attributes associated with inode. The ioctl argument is
794 + an integer bitfield, with bit values described in ext4.h. This ioctl is
795 + an alias for FS_IOC_GETFLAGS.
797 + EXT4_IOC_SETFLAGS
798 + Set additional attributes associated with inode. The ioctl argument is
799 + an integer bitfield, with bit values described in ext4.h. This ioctl is
800 + an alias for FS_IOC_SETFLAGS.
802 + EXT4_IOC_GETVERSION, EXT4_IOC_GETVERSION_OLD
803 + Get the inode i_generation number stored for each inode. The
804 + i_generation number is normally changed only when a new inode is created
805 + and it is particularly useful for network filesystems. The '_OLD'
806 + version of this ioctl is an alias for FS_IOC_GETVERSION.
808 + EXT4_IOC_SETVERSION, EXT4_IOC_SETVERSION_OLD
809 + Set the inode i_generation number stored for each inode. The '_OLD'
810 + version of this ioctl is an alias for FS_IOC_SETVERSION.
812 + EXT4_IOC_GROUP_EXTEND
813 + This ioctl has the same purpose as the resize mount option. It allows
814 + resizing the filesystem to the end of the last existing block group;
815 + further resizing has to be done with resize2fs, either online or
816 + offline. The argument points to the unsigned long number representing
817 + the filesystem's new block count.
819 + EXT4_IOC_MOVE_EXT
820 + Move the block extents from orig_fd (the one this ioctl is pointing to)
821 + to the donor_fd (the one specified in move_extent structure passed as
822 + an argument to this ioctl). Then, exchange inode metadata between
823 + orig_fd and donor_fd. This is especially useful for online
824 + defragmentation, because the allocator has the opportunity to allocate
825 + moved blocks better, ideally into one contiguous extent.
827 + EXT4_IOC_GROUP_ADD
828 + Add a new group descriptor to an existing or new group descriptor
829 + block. The new group descriptor is described by ext4_new_group_input
830 + structure, which is passed as an argument to this ioctl. This is
831 + especially useful in conjunction with EXT4_IOC_GROUP_EXTEND, which
832 + allows online resize of the filesystem to the end of the last existing
833 + block group. Those two ioctls combined are used in userspace online
834 + resize tools (e.g. resize2fs).
836 + EXT4_IOC_MIGRATE
837 + This ioctl operates on the filesystem itself. It converts (migrates)
838 + ext3 indirect block mapped inode to ext4 extent mapped inode by walking
839 + through indirect block mapping of the original inode and converting
840 + contiguous block ranges into ext4 extents of the temporary inode. Then,
841 + inodes are swapped. This ioctl might help when migrating from ext3 to
842 + ext4; however, the suggestion is to create a fresh ext4 filesystem
843 + and copy data from the backup. Note that the filesystem has to support
844 + extents for this ioctl to work.
846 + EXT4_IOC_ALLOC_DA_BLKS
847 + Force all of the delay allocated blocks to be allocated to preserve
848 + application-expected ext3 behaviour. Note that this will also start
849 + triggering a write of the data blocks, but this behaviour may change in
850 + the future as it is not necessary and has been done this way only for
851 + sake of simplicity.
853 + EXT4_IOC_RESIZE_FS
854 + Resize the filesystem to a new size. The number of blocks of the resized
855 + filesystem is passed in via a 64-bit integer argument. The kernel
856 + allocates the bitmaps and inode table, so the userspace tool just passes
857 + the new number of blocks.
859 + EXT4_IOC_SWAP_BOOT
860 + Swap i_blocks and associated attributes (like i_blocks, i_size,
861 + i_flags, ...) from the specified inode with inode EXT4_BOOT_LOADER_INO
862 + (#5). This is typically used to store a boot loader in a secure part of
863 + the filesystem, where it can't be changed by a normal user by accident.
864 + The data blocks of the previous boot loader will be associated with the