From a48813665f472d19c76410d053dfda924685c527 Mon Sep 17 00:00:00 2001
From: Theodore Ts'o
Date: Wed, 2 Jul 2008 09:24:59 -0400
Subject: [PATCH] Fix up patch comments and add signed-off by message.

Move and expand comments about delayed allocation not getting quotas
right, as well as the i_blocks problem, into the series file.
---
 Add-range_cont-mode-for-writeback.patch | 1 -
 delalloc-ext4-reserve-locking-order-support.patch | 14 +++++----
 delalloc-ext4-writeback-mode.patch | 5 ----
 ...le-nonextents-mount-on-large-than-16TB-fs.patch | 23 ++++++---------
 ...dle-page-writhout-buffers-in-da-writepage.patch | 33 +++++++++++----------
 series | 19 +++++++++++++
 6 files changed, 52 insertions(+), 43 deletions(-)

diff --git a/Add-range_cont-mode-for-writeback.patch b/Add-range_cont-mode-for-writeback.patch
index add9c099..13f3dc7b 100644
--- a/Add-range_cont-mode-for-writeback.patch
+++ b/Add-range_cont-mode-for-writeback.patch
@@ -20,7 +20,6 @@ Signed-off-by: Aneesh Kumar K.V
 Signed-off-by: Mingming Cao
 Acked-by: Jan Kara
 Signed-off-by: "Theodore Ts'o"
-
 ---
 
  include/linux/writeback.h | 1 +
diff --git a/delalloc-ext4-reserve-locking-order-support.patch b/delalloc-ext4-reserve-locking-order-support.patch
index 5be20f6a..4fdff850 100644
--- a/delalloc-ext4-reserve-locking-order-support.patch
+++ b/delalloc-ext4-reserve-locking-order-support.patch
@@ -2,17 +2,19 @@ ext4: Invert lock ordering of page_lock and transaction start in delalloc
 
 From: Mingming Cao
 
-With the reverse locking, we need to start a transation before taking the page
-lock, so in ext4_da_writepages() we need to break the write-out into chunks,
-and restart the journal for each chunck to ensure the write-out fits in
-a single transaction.
+With the reverse locking, we need to start a transation before taking
+the page lock, so in ext4_da_writepages() we need to break the write-out
+into chunks, and restart the journal for each chunck to ensure the
+write-out fits in a single transaction.
 
-Updated patch from Aneesh Kumar K.V which
-fixes delalloc sync hang with journal lock inversion, and address the performance regression issue.
+Updated patch from Aneesh Kumar K.V
+which fixes delalloc sync hang with journal lock inversion, and address
+the performance regression issue.
 
 Signed-off-by: Mingming Cao
 Signed-off-by: Aneesh Kumar K.V
 Signed-off-by: Jan Kara
+Signed-off-by: "Theodore Ts'o"
 ---
 
  fs/ext4/extents.c | 10 ++
diff --git a/delalloc-ext4-writeback-mode.patch b/delalloc-ext4-writeback-mode.patch
index 61fd1091..a8010044 100644
--- a/delalloc-ext4-writeback-mode.patch
+++ b/delalloc-ext4-writeback-mode.patch
@@ -21,16 +21,11 @@ Updated fixes from Aneesh Kumar K.V
 to update i_disksize properly with delayed allocation, and add bmap
 support for delalloc.
 
-
-Todo:
- * quota
-
 Signed-off-by: Alex Tomas
 Signed-off-by: Mingming Cao
 Signed-off-by: Dave Kleikamp
 Signed-off-by: "Theodore Ts'o"
 Signed-off-by: Aneesh Kumar K.V
-
 ---
  fs/ext4/ext4.h | 1
  fs/ext4/inode.c | 296 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
diff --git a/ext4-disable-nonextents-mount-on-large-than-16TB-fs.patch b/ext4-disable-nonextents-mount-on-large-than-16TB-fs.patch
index 95be17f4..2efb2792 100644
--- a/ext4-disable-nonextents-mount-on-large-than-16TB-fs.patch
+++ b/ext4-disable-nonextents-mount-on-large-than-16TB-fs.patch
@@ -1,24 +1,17 @@
-From: "Aneesh Kumar K.V"
 ext4: Don't allow nonextenst mount option for large filesystem
 
-The block mapped inode format can address only blocks within
-2**32. So with file system larger than 2**32 don't allow
-noextents mount option. There are multiple issues that need
-to addressed in the long run.
-
-a) When e2fsprogs support resizing an already existing ext3
-file system to greater than 2**32 we need to add support to block
-allocator to handle growing already existing block mapped inode
-so that blocks allocated for them fall within 2**32
-
-b) When we do that we should enable mounting larger file system
-with -o noextents mount option.
-
-Till them we fail mount of large file systems with -o noextents.
+From: "Aneesh Kumar K.V"
 
+The block mapped inode format can address only blocks within 2**32. This
+causes a number of issues, the biggest of which is that the block
+allocator needs to be taught that certain inodes can not utilize block
+numbers > 2**32. So until this is fixed, it is simplest to fail
+mounting of file systems with more than 2**32 blocks if the -o noextents
+option is given.
 
 Signed-off-by: Aneesh Kumar K.V
 Signed-off-by: Mingming Cao
+Signed-off-by: "Theodore Ts'o"
 ---
  fs/ext4/super.c | 15 +++++++++++++++
  1 file changed, 15 insertions(+)
diff --git a/ext4-handle-page-writhout-buffers-in-da-writepage.patch b/ext4-handle-page-writhout-buffers-in-da-writepage.patch
index 9178a824..ec32db35 100644
--- a/ext4-handle-page-writhout-buffers-in-da-writepage.patch
+++ b/ext4-handle-page-writhout-buffers-in-da-writepage.patch
@@ -3,27 +3,28 @@ ext4: Handle page without buffers in ext4_*_writepage()
 From: Aneesh Kumar
 
 It can happen that buffers are removed from the page before it gets
-marked dirty and then is passed to writepage(). In writepage()
-we just initialize the buffers and check whether they are mapped and non
+marked dirty and then is passed to writepage(). In writepage() we just
+initialize the buffers and check whether they are mapped and non
 delay. If they are mapped and non delay we write the page. Otherwise we
-mark then dirty. With this change we don't do block allocation at all in
-ext4_*_write_page.
-
-writepage() get called under many condition and with a locking order
-of journal_start -> lock_page we shouldnot try to allocate blocks in
-writepage() which get called after taking page lock. writepage can get
-called via shrink_page_list even with a journal handle which was created
-for doing inode update. For example when doing ext4_da_write_begin we
-create a journal handle with credit 1 expecting a i_disksize update for
-the inode. But ext4_da_write_begin can cause shrink_page_list via
-_grab_page_cache. So having a valid handle via ext4_journal_current_handle
-is not a guarantee that we can use the handle for block allocation in
-writepage. We should not be using credits which are taken for other updates.
-That would result in we running out of credits when we update inodes.
+mark them dirty. With this change we don't do block allocation at all
+in ext4_*_write_page.
 
+writepage() can get called under many condition and with a locking order
+of journal_start -> lock_page, we should not try to allocate blocks in
+writepage() which get called after taking page lock. writepage() can
+get called via shrink_page_list even with a journal handle which was
+created for doing inode update. For example when doing
+ext4_da_write_begin we create a journal handle with credit 1 expecting a
+i_disksize update for the inode. But ext4_da_write_begin can cause
+shrink_page_list via _grab_page_cache. So having a valid handle via
+ext4_journal_current_handle is not a guarantee that we can use the
+handle for block allocation in writepage, since we shouldn't be using
+credits that had been reserved for other updates. That it could result
+in we running out of credits when we update inodes.
 
 Signed-off-by: Aneesh Kumar K.V
 Signed-off-by: Mingming Cao
+Signed-off-by: "Theodore Ts'o"
 ---
  fs/ext4/inode.c | 169 +++++++++++++++++++++++++++++++++++++++++---------------
  1 file changed, 124 insertions(+), 45 deletions(-)
diff --git a/series b/series
index e6130448..7800ca8c 100644
--- a/series
+++ b/series
@@ -58,6 +58,25 @@ ext4-Use-new-framework-for-data-ordered-mode-in-JBD.patch
 jbd2-Remove-data-ordered-mode-support-using-jbd-buf.patch
 
 # New delayed allocation patch
+#
+# Note: we still need to improve quota handling; at the moment, if the
+# user goes over quota, the block is not allocated and the page is left
+# dirty in the page cache. This will prevent the filesystem from being
+# mounted, and if the system crashes, the user's data will be lost without
+# warning. This is doubleplusungood!!
+#
+# Another major (and related) problem is that i_blocks is not getting
+# updated until the disks are actually allocaed on disk. This means
+# that right after files are copied, "ls -sF" shoes the file as taking
+# 0 blocks on disk. "du" also shows the files taking zero space, which
+# is highly confusing to the user.
+#
+# A quick way of fixing this is to keep a count of blocks that are
+# subject to delayed allocation, and use that to adjust the value returned
+# by stat(2).
+#
+# XXX does "df do the right thing with blocks not yet allocated?
+#
 delalloc-vfs.patch
 delalloc-ext4-writeback-mode.patch
 
-- 
2.11.4.GIT