Red Hat Bugzilla – Bug 510379
resize2fs corrupting ext4 filesystems on ENOSPC
Last modified: 2009-10-06 10:29:59 EDT
Resizing the attached image to 589824 blocks will corrupt it.
Things to note:
* 589824 is actually well above the claimed minimum fs size
* Things go bad in ext2fs_extent_set_bmap()
- "(re/un)mapping first block in extent"
- modifies existing extent (removes the block that is moving)
- goto again
- "mapping unmapped logical block"
- tries to insert node and fails with ENOSPC
- extent tree is now corrupted, it lost a block
* It's handy to make an e2image -r of the attached image for testing.
* Even if ENOSPC doesn't corrupt extent tree, block bitmaps are still left bad
* We could try to do better front & back merging
* We could bail early on any modification requiring a split if we have few blocks left
* We could try to unwrap after the error
* We could try to insert the new node before we modify the old
* etc :)
Oh, you'll probably want http://marc.info/?l=linux-ext4&m=124699503718719&w=2 too or you may hit that bug first...
Three solutions proposed on #ext4 today:
* Create a special purpose ad hoc undo manager
* Fix ext2fs_extent_set_bmap() to return in uncorrupted state
* Fix the size calculation (which may include other changes to moving blocks - e.g., extent merging - to make this easier to compute)
* How hard is each individual solution to implement?
* What is the likelihood of future bugs appearing after each fix?
* What is the common case/use case/situation we have to fix the most?
I think this bug is triggered by the Fedora livecd creator, which used to iteratively resize the file system smaller and now attempts to do it all in one fell swoop.
Answers to above questions welcomed...
Another proposed solution: Keep track of the free space as the resize progresses. Once it hits a certain threshold (the worst case number of blocks needed for some granularity of operation?), quit at a consistent place.
Created attachment 351192 [details]
problematic image, pre-resize
resize this image to 589824 blocks & fsck to see the problem.
From tytso on #ext4:
The ext4 kernel code assumes a depth of 5; we've *never* seen anything with more than a depth of 3. We don't have anything that enforces that limit; maybe we should.
I guess what we can do is iterate over all of the extent nodes and count the number of index nodes that have n-1 entries filled.
we don't even need to check all of the index nodes, just the ones on the path from the root to the extent leaf that we need to modify.
I just looked at the code; and it's trivial to count the number of blocks that might need to be split.
might as well do it right the first time around. I can whip up the patch in about 5-10 minutes.
I have a proposed patch that fixes most of this problem here:
Note that finding this problem wasn't really all that difficult. I just set a break point in resize2fs's get_new_block() where it returns 0 indicating an allocation failure, and at that point just walked up the stack trace finding the problem.
There is still may be a potential problem if we fail inside a recursive node split (i.e., where an inode has more than one level of the extent tree that needs splitting) but that wasn't what was hitting us here. This was another case of ext2fs_extent_set_bmap() doing something destructive (specifically, shortening an existing extent from the front end) before checking to make sure the insert would be successful. The fix for this isn't that hard.
As long as we don't lose any data, I think it's fair game to ask the user to run e2fsck after a resize2fs failure to fix up the block allocation bitmaps, which is what still remains to be done after this patch is applied.
This has been fixed upstream now by:
Author: Theodore Ts'o <email@example.com>
Date: Fri Jul 10 00:19:28 2009 -0400
libext2fs: Make ext2fs_extent_set_bmap() more robust against ENOSPC