Red Hat Bugzilla – Bug 496982
resize2fs is corrupting ext4 filesystems during livecd creation
Last modified: 2013-01-10 00:10:55 EST
During the livecd-creator process, the resizes of the ext4 filesystem are corrupting it. I added post-resize fscks and got:
e2fsck 1.41.4 (27-Jan-2009)
Pass 1: Checking inodes, blocks, and sizes
Inode 170 has an invalid extent
(logical block 6144, invalid physical block 753664, len 1)
Inode 170, i_blocks is 53560, should be 53552. Fix? yes
Inode 35435, i_blocks is 448, should be 424. Fix? yes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Block bitmap differences: -(539069--539071)
Free blocks count wrong for group #16 (0, counted=3).
Free blocks count wrong (0, counted=3).
I have patches from Ted that fix this; we will certainly want them for F11.
Jeremy, I'd strongly suggest adding post-resize fscks (or at least a final fsck before the image is munged into the ISO) to the livecd process.
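For illustration only, a post-resize check along the lines suggested above might look like the sketch below. It is not livecd-creator's actual code: the helper names `describe_status` and `check_image` are hypothetical. The flags `-f` (force a check) and `-n` (read-only, answer "no" to everything) and the exit-status bit mask are taken from the e2fsck(8) man page.

```python
import subprocess

# e2fsck's exit status is a bit mask; meanings as listed in e2fsck(8).
E2FSCK_BITS = {
    1: "errors corrected",
    2: "errors corrected, reboot advised",
    4: "errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "cancelled by user",
    128: "shared library error",
}


def describe_status(code):
    """Return the conditions encoded in an e2fsck exit status (hypothetical helper)."""
    return [text for bit, text in sorted(E2FSCK_BITS.items()) if code & bit]


def check_image(image_path, run=subprocess.run):
    """Force a read-only check of an unmounted image (hypothetical helper).

    Raises RuntimeError if e2fsck exits with anything other than 0.
    """
    result = run(["e2fsck", "-f", "-n", image_path])
    if result.returncode != 0:
        raise RuntimeError(
            "e2fsck reported: " + ", ".join(describe_status(result.returncode))
        )
```

Calling `check_image()` after every resize step, or once on the final image, would turn a silently corrupted image into a build failure.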
In rawhide & F11 now (and tagged for F11)
Do these patches (or their equivalent) also need adding to livecd-iso-to-disk? I ask because I wonder whether this is the problem that is causing BZ #499149.
(In reply to comment #2)
> Do these patches (or their equivalent) also need adding to livecd-iso-to-disk.
> The reason I ask is I wonder if this is the problem that is causing BZ #499149
livecd-iso-to-disk doesn't do any resizing, just a copy. So there shouldn't be a need for an explicit fsck there.
OK, I have a reproducer now.
# resize2fs <that image> 578639
fails to resize ... and corrupts it :(
This happens during the binary chop of searching for a minimum size.
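As a rough sketch of what such a binary chop looks like (this is not the livecd-creator source), the search below finds the smallest block count a resize would accept. The function name `find_min_size` and the `fits` predicate are assumptions for illustration. In a real tool `fits` would attempt a resize, which, given the failure described here, would be safer to run against a scratch copy of the image.

```python
def find_min_size(lo, hi, fits):
    """Return the smallest n in [lo, hi] for which fits(n) is true.

    Assumes fits is monotonic (once a size works, every larger size
    works too) and that fits(hi) is true.
    """
    while lo < hi:
        mid = (lo + hi) // 2
        if fits(mid):
            hi = mid        # mid works, so the minimum is mid or smaller
        else:
            lo = mid + 1    # mid is too small, so the minimum is larger
    return lo


# Stand-in predicate with a made-up threshold; a real one would try the resize.
minimum = find_min_size(0, 1_000_000, lambda blocks: blocks >= 1000)
```

Each probe that fails is a resize that hit the too-small case, so the search exercises exactly the path that corrupts the image.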
Jeremy, remind me again why you're not using the -M option you wanted so badly? :)
So, this looks really random. Most likely that is partly due to pdflush timing during the original image population, and partly due to get_random_bytes() in the inode allocation code (!)
resize2fs really should not be finding itself in a point of no return once it hits ENOSPC, but auditing that is probably going to be really tricky.
I know this is listed as a blocker, but in reality, if you hit it, and you run the tool again, it will probably pass just fine (and really will be fine). Given how many times you need a good livecd before F11 ships ... once, right? I wonder if we can live with "oops, got a bad one, try again"
But doesn't this also mean that a user who runs resize2fs on his installed machine for whatever reason faces risk of data loss?
If the user tries to shrink it to something extremely small then yes. resize2fs -M should provide a safe way to do this, though...
(I don't mean it shouldn't be fixed, of course, I just wonder about blocker status)
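A minimal sketch of the `resize2fs -M` route mentioned above, assuming an unmounted image file. The wrapper `shrink_to_minimum` is hypothetical. The `e2fsck -f -p` step is there because resize2fs normally asks for a forced check before it will resize an unmounted filesystem.

```python
import subprocess


def shrink_to_minimum(image_path, run=subprocess.run):
    """Check an unmounted image, then let resize2fs pick the minimum size.

    Hypothetical helper. Raises RuntimeError if the check leaves errors
    behind, so a damaged image is never resized.
    """
    fsck = run(["e2fsck", "-f", "-p", image_path])
    # Exit status 0 means clean, 1 means errors were found and corrected.
    if fsck.returncode not in (0, 1):
        raise RuntimeError(f"e2fsck exited with status {fsck.returncode}")
    # -M shrinks the filesystem to the smallest size resize2fs computes.
    run(["resize2fs", "-M", image_path], check=True)
```

The point of this design is that the size is computed by resize2fs itself, so the caller never probes sizes that are too small.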
Is this the cause of bug #499452?
Gr, I confused these two bugs a bit I think.
This one was for a pretty repeatable corruption which I put patches in for, and it was mostly fixed.
But then bug #499452 is for a much harder to hit, more random corruption during the resize phase of livecd creation...
After about comment #4 in this bug, the comments should have been on bug #499452 ... sigh.
This bug appears to have been reported against 'rawhide' during the Fedora 11 development cycle.
Changing version to '11'.
More information and reason for this action is here:
MODIFIED->CLOSED magic doesn't work for rawhide I guess. :)