Bug 496982 - resize2fs is corrupting ext4 filesystems during livecd creation
Status: CLOSED CURRENTRELEASE
Product: Fedora
Classification: Fedora
Component: e2fsprogs
Version: 11
Hardware: All Linux
Priority: high, Severity: high
Assigned To: Eric Sandeen
QA Contact: Fedora Extras Quality Assurance
Blocks: 516996
Reported: 2009-04-21 18:11 EDT by Eric Sandeen
Modified: 2013-01-10 00:10 EST
Doc Type: Bug Fix
Last Closed: 2009-10-06 10:32:36 EDT
Description Eric Sandeen 2009-04-21 18:11:06 EDT
During the livecd-creator process, resizing the ext4 filesystem corrupts it.  I added post-resize fsck runs and got:

e2fsck 1.41.4 (27-Jan-2009)
Pass 1: Checking inodes, blocks, and sizes
Inode 170 has an invalid extent
	(logical block 6144, invalid physical block 753664, len 1)
Clear? yes

Inode 170, i_blocks is 53560, should be 53552.  Fix? yes

Inode 35435, i_blocks is 448, should be 424.  Fix? yes

Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Block bitmap differences:  -(539069--539071)
Fix? yes

Free blocks count wrong for group #16 (0, counted=3).
Fix? yes

Free blocks count wrong (0, counted=3).
Fix? yes


I have patches from Ted that fix this; we will certainly want them for F11.

Jeremy, I'd strongly suggest adding post-resize fscks (or at least a final, pre-munging-into-iso fsck) to the livecd process.
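
To illustrate the post-resize check suggested above, here is a minimal sketch in Python. It is not livecd-creator's actual code: the function names `image_is_clean` and `resize_and_verify` and the injectable `run` parameter are hypothetical, introduced only for this example. It assumes `e2fsck` and `resize2fs` are on the PATH and that the image is not mounted.

```python
import subprocess

# e2fsck reports its result as an exit status (see the e2fsck man page):
# 0 = no errors, 1 = errors corrected, 4 = errors left uncorrected,
# 8 = operational error, and so on. Only 0 counts as clean here.
FSCK_CLEAN = 0


def image_is_clean(image_path, run=subprocess.run):
    """Run a forced, read-only e2fsck on image_path; True if no errors found.

    `run` is injectable (hypothetical design choice) so the check can be
    exercised without e2fsprogs installed.
    """
    # -f: check even if the filesystem is marked clean
    # -n: open read-only and answer "no" to all prompts, so nothing is changed
    result = run(["e2fsck", "-f", "-n", image_path])
    return result.returncode == FSCK_CLEAN


def resize_and_verify(image_path, new_size_blocks, run=subprocess.run):
    """Resize, then refuse to continue if fsck finds damage afterwards.

    Returns True if resize2fs itself succeeded. The fsck runs even when the
    resize fails, since a failed resize is exactly the case reported here.
    """
    resize = run(["resize2fs", image_path, str(new_size_blocks)])
    if not image_is_clean(image_path, run=run):
        raise RuntimeError("filesystem not clean after resize: %s" % image_path)
    return resize.returncode == 0
```

Using `-n` keeps the check from silently repairing the image, so a corrupted result is reported instead of being papered over.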
Comment 1 Eric Sandeen 2009-04-23 13:07:05 EDT
In rawhide & F11 now (and tagged for F11)
Comment 2 Jonathan Underwood 2009-05-11 20:20:09 EDT
Do these patches (or their equivalent) also need to be added to livecd-iso-to-disk? I ask because I wonder whether this is the problem that is causing BZ #499149.
Comment 3 Jeremy Katz 2009-05-11 21:45:41 EDT
(In reply to comment #2)
> Do these patches (or their equivalent) also need adding to livecd-iso-to-disk.
> The reason I ask is I wonder if this is the problem that is causing BZ #499149

livecd-iso-to-disk doesn't do any resizing, just a copy.  So there shouldn't be a need for an explicit fsck there.
Comment 4 Eric Sandeen 2009-05-11 23:47:23 EDT
OK, I have a reproducer now.

http://people.redhat.com/esandeen/livecd-creator-imagefile.bz2

# resize2fs <that image> 578639

fails to resize ... and corrupts it :(

This happens during the binary chop that searches for a minimum size.

Jeremy, remind me again why you're not using the -M option you wanted so badly? :)

-Eric
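
For readers unfamiliar with the "binary chop" mentioned above, the general shape of such a search is sketched below. This is an illustration only, not the code livecd-creator uses: `find_min_size` and the `fits` callback are hypothetical names, with `fits(size)` standing in for "try to resize the filesystem to `size` blocks and report whether it worked". The sketch assumes that every size at or above the true minimum fits and every size below it does not.

```python
def find_min_size(lo, hi, fits):
    """Return the smallest size in [lo, hi] for which fits(size) is true.

    Assumes fits is monotonic: false below some threshold, true at and
    above it. Raises ValueError if even the upper bound does not fit.
    """
    if not fits(hi):
        raise ValueError("upper bound does not fit")
    while lo < hi:
        mid = (lo + hi) // 2
        if fits(mid):
            # mid works, so the minimum is mid or something smaller
            hi = mid
        else:
            # mid is too small; this is the probe that is expected to fail
            lo = mid + 1
    return lo
```

For example, `find_min_size(0, 4096, lambda n: n >= 1000)` returns 1000. Roughly half of the probes in a search like this are expected to fail, which is why a resize that damages the filesystem on failure is a problem for this approach.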
Comment 5 Eric Sandeen 2009-05-11 23:54:53 EDT
So, this looks really random, most likely due partly to pdflush timing during the original image population and partly to get_random_bytes() in the inode allocation code (!)

resize2fs really should not find itself at a point of no return once it hits ENOSPC, but auditing that is probably going to be really tricky.

I know this is listed as a blocker, but in reality, if you hit it and you run the tool again, it will probably pass just fine (and really will be fine).  Given how many times you need a good livecd before F11 ships ... once, right?  I wonder if we can live with "oops, got a bad one, try again".
Comment 6 Jonathan Underwood 2009-05-12 07:15:22 EDT
But doesn't this also mean that a user who runs resize2fs on an installed machine for whatever reason faces a risk of data loss?
Comment 7 Eric Sandeen 2009-05-12 09:29:44 EDT
If the user tries to shrink it to something extremely small then yes.  resize2fs -M should provide a safe way to do this, though...
Comment 8 Eric Sandeen 2009-05-12 09:36:28 EDT
(I don't mean it shouldn't be fixed, of course, I just wonder about blocker status)
Comment 9 Chris Shoemaker 2009-05-15 17:06:12 EDT
Is this the cause of #499452 ?
Comment 10 Eric Sandeen 2009-05-15 17:15:34 EDT
Gr, I confused these two bugs a bit I think.

This one was for a pretty repeatable corruption which I put patches in for, and it was mostly fixed.

But then bug #499452 is for a much harder to hit, more random corruption during the resize phase of livecd creation...

Everything after about comment #4 in this bug should have been on bug #499452 ... sigh.
Comment 11 Bug Zapper 2009-06-09 10:21:37 EDT
This bug appears to have been reported against 'rawhide' during the Fedora 11 development cycle.
Changing version to '11'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 12 Eric Sandeen 2009-10-06 10:32:36 EDT
MODIFIED->CLOSED magic doesn't work for rawhide I guess.  :)
