Bug 442106
| Field | Value |
|---|---|
| Summary | on initial reboot, filesystem had errors |
| Product | [Fedora] Fedora |
| Component | anaconda |
| Status | CLOSED RAWHIDE |
| Severity | low |
| Priority | low |
| Version | rawhide |
| Hardware | All |
| OS | Linux |
| Reporter | Bill Nottingham <notting> |
| Assignee | Anaconda Maintenance Team <anaconda-maint-list> |
| QA Contact | Fedora Extras Quality Assurance <extras-qa> |
| CC | cpanceac, esandeen, rvokal |
| Doc Type | Bug Fix |
| Last Closed | 2008-04-21 21:38:52 UTC |
| Bug Blocks | 235706 |

Description
Bill Nottingham
2008-04-11 19:03:44 UTC
Created attachment 302162 [details]
dumpe2fs output
Here's the dumpe2fs output.
    [root@localhost ~]# e2fsck -n -v /dev/mapper/moofoo
    e2fsck 1.40.8 (13-Mar-2008)
    / contains a file system with errors, check forced.
    Pass 1: Checking inodes, blocks, and sizes
    Pass 2: Checking directory structure
    Pass 3: Checking directory connectivity
    Pass 4: Checking reference counts
    Pass 5: Checking group summary information
    Block bitmap differences: -(577587--577611)
    Fix? no

    /: ********** WARNING: Filesystem still has errors **********

       87480 inodes used (0.87%)
         517 non-contiguous inodes (0.6%)
             # of inodes with ind/dind/tind blocks: 6144/30/0
     1200262 blocks used (2.99%)
           0 bad blocks
           1 large file

       65955 regular files
        7776 directories
           8 character device files
           0 block device files
           0 fifos
        3499 links
       13732 symbolic links (13697 fast symbolic links)
           0 sockets
    --------
       90970 files

I have in the log:

    EXT3-fs error (device dm-2): ext3_free_blocks: Freeing blocks not in datazone - block = 202366355, count = 1
    EXT3-fs error (device dm-2): ext3_free_blocks: Freeing blocks not in datazone - block = 1673128691, count = 1
    EXT3-fs error (device dm-2): ext3_free_blocks: Freeing blocks not in datazone - block = 1148093888, count = 1
    EXT3-fs error (device dm-2): ext3_free_blocks: Freeing blocks not in datazone - block = 1338544658, count = 1
    EXT3-fs error (device dm-2): ext3_free_blocks: Freeing blocks not in datazone - block = 3927257361, count = 1
    EXT3-fs error (device dm-2): ext3_free_blocks: Freeing blocks not in datazone - block = 1674848470, count = 1
    EXT3-fs error (device dm-2): ext3_free_blocks: Freeing blocks not in datazone - block = 2292416204, count = 1
    EXT3-fs error (device dm-2): ext3_free_blocks: Freeing blocks not in datazone - block = 2231364692, count = 1
    EXT3-fs error (device dm-2): ext3_free_blocks_sb: bit already cleared for block 28802063
    ....

Created attachment 302164 [details]
dmesg
This appears to be happening (at least the errors appear) when running mkinitrd in the post-install step. No, that doesn't make sense to me either.

hrm, can't do an x86_64 install right now, so, I tried:

    get the livecd iso notting used.
    jump through the hoops to get ext3fs.img extracted from it.
    under the 2.6.25-0.218.rc8.git7.fc9.x86_64 kernel...
    truncate the image to 164587324416 (the size of notting's lvm volume)
    mount the image
    grow the image
    mount up /sys /boot /proc /dev and whatnot, and chroot in
    try mkinitrd

... no problems. Also, the ext3fs.img seems clean before & after the resize.

when I can try a livecd i'll either try with errors=panic (assuming ext3 can be remounted that way... need to check...) or try replacing ext3 to get more info on how it got to that error point.

-Eric

notting, if you get to it before me, see if you can pop over to a shell and do

    mount -o remount,errors=panic

on the dm-2 fs early in the install process...

Initial error is:

    EXT3-fs error (device dm-2): ext3_free_blocks: Freeing blocks not in datazone - block = 202366355, count = 1
    Kernel panic - not syncing: EXT3-fs (device dm-2): panic forced after error

panic isn't doing a stack trace. Haven't tried to rebuild the kernel for the livecd.

yep... yay, ext3_error() :/

rebuilding ext3.ko to provide more info would maybe help. I could provide that if you want... put it in the initrd... rebuild/reburn... yuck.

FWIW I can hit this too. :) Trying to cook up a way to hit it outside of anaconda so I can debug it more easily... so far, no luck.

-Eric

What does anaconda actually copy when it copies the fs image to the system root disk? I put a pre-resize early return in livecd.py and the un-resized image has the corruption that notting and I both saw... however:

    [root@localhost]# e2fsck -fn /dev/mapper/live-osimg-min

this checks clean... is this what anaconda is copying?

Thanks,
-Eric

Believe so, yes.

hrm. Ok will keep trying to narrow it down....

The inode for the corrupted file itself is intact, but the first indirect block that it points to is full of garbage, even on my not-resized-but-just-copied LV. The inode metadata matches the original fs image; i.e. atime, ctime etc are not changed; whatever corruption happened doesn't seem related to operations on this inode, it was just that an indirect block seems clobbered.

-Eric

in debugfs-speak:

    [root@localhost foo]# debugfs /dev/mapper/live-osimg-min
    debugfs 1.40.8 (13-Mar-2008)
    debugfs: stat <86555>
    Inode: 86555   Type: regular   Mode: 0644   Flags: 0x0   Generation: 2462646136
    User: 0   Group: 0   Size: 150359
    File ACL: 0   Directory ACL: 0
    Links: 1   Blockcount: 304
    Fragment: Address: 0   Number: 0   Size: 0
    ctime: 0x47ff7b37 -- Fri Apr 11 09:52:39 2008
    atime: 0x47ff78ca -- Fri Apr 11 09:42:18 2008
    mtime: 0x47ff78ca -- Fri Apr 11 09:42:18 2008
    Size of extra inode fields: 4
    Extended attributes stored in inode body:
      selinux = "system_u:object_r:modules_dep_t:s0\000" (35)
    BLOCKS:
    (0-11):577574-577585, (IND):577586, (12-36):577587-577611
    TOTAL: 38
    debugfs: quit

    [root@localhost foo]# debugfs /dev/mapper/VolGroup00-LogVol00
    debugfs 1.40.8 (13-Mar-2008)
    debugfs: stat <86555>
    Inode: 86555   Type: regular   Mode: 0644   Flags: 0x0   Generation: 2462646136
    User: 0   Group: 0   Size: 150359
    File ACL: 0   Directory ACL: 0
    Links: 1   Blockcount: 304
    Fragment: Address: 0   Number: 0   Size: 0
    ctime: 0x47ff7b37 -- Fri Apr 11 09:52:39 2008
    atime: 0x47ff78ca -- Fri Apr 11 09:42:18 2008
    mtime: 0x47ff78ca -- Fri Apr 11 09:42:18 2008
    Size of extra inode fields: 4
    Extended attributes stored in inode body:
      selinux = "system_u:object_r:modules_dep_t:s0\000" (35)
    BLOCKS:
    (0-11):577574-577585, (IND):577586, (12):2943334241, (13):1038652331,

so the indirect block (577586) is just full of junk at this point.

And this block is correct on the 'live' filesystem, theoretically?

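As a cross-check on the debugfs output above, here is a small Python sketch that reproduces its numbers. It assumes a 4096-byte filesystem block size, which the thread does not state but which is consistent with the figures, and `block_layout` is a hypothetical helper written for this illustration. ext2/ext3 inodes hold 12 direct block pointers, so every data block past the twelfth is reachable only through the indirect block. If that one block is clobbered, all of those mappings are lost even though the inode itself is untouched.

```python
import math

DIRECT_BLOCKS = 12  # ext2/ext3 inodes hold 12 direct block pointers


def block_layout(size_bytes, blksize=4096):
    """Return (data_blocks, via_indirect, total_blocks, sectors_512).

    Only valid while the file fits in the single-indirect range,
    which is the case for the 150359-byte file in this report.
    blksize=4096 is an assumption, not a value taken from the thread.
    """
    data_blocks = math.ceil(size_bytes / blksize)
    via_indirect = max(0, data_blocks - DIRECT_BLOCKS)
    # One extra block holds the indirect pointers, if any are needed.
    indirect_blocks = 1 if via_indirect else 0
    total = data_blocks + indirect_blocks
    # debugfs reports Blockcount in 512-byte units.
    return data_blocks, via_indirect, total, total * blksize // 512


layout = block_layout(150359)

# Length of the range e2fsck flagged: 577587 through 577611 inclusive.
flagged = 577611 - 577587 + 1

print(layout, flagged)
```

Under that assumption the file has 37 data blocks, 25 of them mapped through the indirect block, for 38 blocks in total (matching `TOTAL: 38`) and 304 sectors (matching `Blockcount: 304`). The 25 blocks in the range e2fsck reported as block bitmap differences are exactly logical blocks 12-36 of this inode.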
Because of the way the copy loop gets sizes:

    readamt = 1024 * 1024 * 8 # 8 megs at a time
    size = float(self._getLiveSizeMB() * 1024 * 1024)
    copied = 0
    while copied < size:
        buf = os.read(osfd, readamt)
        written = os.write(rootfd, buf)
        if (written < readamt) and (written < len(buf)):
            raise RuntimeError, "error copying filesystem!"
        copied += written
        progress.set_fraction(pct = copied / size)
        progress.processEvents()

and _getLiveSizeMB does:

    return blkcnt * blksize / 1024 / 1024

This is going to round the copied size down to the nearest megabyte, no? And miss the last part of the filesystem... where this corrupt block we saw just happens to live...

-Eric

Thanks for the debugging! Fixed in http://git.fedorahosted.org/git/?p=anaconda.git;a=commit;h=9083f70668bfeb72b5dfea73d5cc68685e057e8b

*** Bug 431647 has been marked as a duplicate of this bug. ***

Closing, did a test of this patch.
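To make the rounding concern about the copy loop concrete, here is a minimal Python 3 sketch. It is not anaconda's code and not the committed fix. The names `live_size_mb` and `copy_exact` and the sample block count are hypothetical, and in-memory streams stand in for the source and destination devices. It assumes `blkcnt` and `blksize` are integers, in which case Python 2's `/` floors the result just as `//` does here.

```python
import io

MB = 1024 * 1024


def live_size_mb(blkcnt, blksize):
    """Mimic the quoted size calculation: integer division floors to whole MB."""
    return blkcnt * blksize // 1024 // 1024


def copy_exact(src, dst, nbytes, chunk=8 * MB):
    """Copy exactly nbytes from src to dst, never requesting more than remains."""
    copied = 0
    while copied < nbytes:
        buf = src.read(min(chunk, nbytes - copied))
        if not buf:
            raise RuntimeError("short read while copying filesystem")
        dst.write(buf)
        copied += len(buf)
    return copied


# Hypothetical filesystem: 4096-byte blocks, 2048 MB plus 100 extra blocks.
blkcnt, blksize = 2048 * 256 + 100, 4096
exact_bytes = blkcnt * blksize
rounded_bytes = live_size_mb(blkcnt, blksize) * MB
lost_bytes = exact_bytes - rounded_bytes  # tail a MB-rounded size leaves out

# Small demonstration of a byte-exact copy on sample data.
data = bytes(range(256)) * 10  # 2560 bytes
dst = io.BytesIO()
n = copy_exact(io.BytesIO(data), dst, 2500, chunk=1024)

print(lost_bytes, n)
```

In this made-up case the rounded size is a multiple of the 8 MB read size, so a loop bounded by it would stop exactly at 2048 MB and leave the final 409600 bytes (100 blocks) uncopied. Tracking the size in bytes rather than megabytes, as `copy_exact` does, avoids dropping the tail.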