Bug 521107
Summary: | fsck cannot clean up filesystem, eventually hangs forever | | |
---|---|---|---|
Product: | Red Hat Enterprise Linux 5 | Reporter: | Dave <dave.costakos> |
Component: | e2fsprogs | Assignee: | Eric Sandeen <esandeen> |
Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | BaseOS QE <qe-baseos-auto> |
Severity: | high | Docs Contact: | |
Priority: | low | | |
Version: | 5.1 | CC: | fhirtz, sct, zbrown |
Target Milestone: | rc | | |
Target Release: | --- | | |
Hardware: | x86_64 | | |
OS: | Linux | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2011-01-26 21:20:57 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Attachments: | | | |
Description
Dave
2009-09-03 17:13:13 UTC
Created attachment 359705
Output of fsck.ext2 command on the filesystem

Running strace -p <PID> against fsck.ext2 after it is "hung" creates an empty file, yet that process is using 100% of one CPU. I've also uploaded the core, core-bz521107.bz2, to dropbox.redhat.com/incoming; this is for fsck.ext2.

Created attachment 359706
Output of fsck.ext3 command
Also put core-fsck.ext3-bz521107.bz2 on dropbox.redhat.com/incoming for the core dump of fsck.ext3.

Could you please create an "e2image -r" of the problematic filesystem, compress it, and provide it for analysis? I can probably work backwards from the corefile, but with a filesystem image I could verify any fix. If there is concern about sensitive filenames, the -s option will scramble them up in the image.

Thanks,
-Eric

Thanks Eric,

Here's the output of my command:

[root@almcrpstg01 workspace]# e2image -s -r /dev/mapper/almcrpprd03VG-localmnt2 fsimage-bz521107.img
e2image 1.39 (29-May-2006)
e2image: A block group is missing an inode table while getting next inode

It creates a 0 byte image file:

[root@almcrpstg01 workspace]# ls -lah fsimage-bz521107.img
-rw------- 1 root bin 0 Sep 3 10:50 fsimage-bz521107.img

Note: also opened service request 1949408 for this.

Ah crud. Just in case more recent e2fsprogs can handle this, you might try installing e4fsprogs (userspace for the ext4 tech preview) and running e4image ... but I bet it dies the same way. I'll try to look backwards from the core.

Any idea what happened to this filesystem?

-Eric

I think it was pretty straightforward in this case. My understanding is that we initiated a reboot (using the reboot command manually), most probably while some application was still using this filesystem. I'm presuming that the process didn't get killed off before the system rebooted. Before that reboot, we noted a syslog error message saying that multipathd segfaulted. When the system came back up, this filesystem would not fsck.

When e2fsck is running, are you getting any errors in dmesg from the storage? e2fsck should handle it more gracefully of course, but I wonder if everything got put back together again properly after the reboot ...

Thanks,
-Eric

I've looked and I just don't see anything. Unfortunately as well, this is a down application.
We had to completely wipe this filesystem to get our internal customer back up, so I can't try fscking the broken fs again. Also, I'm not confident that we can reproduce this error reliably (though I will try).

I wasn't able to sort out from the core what was wrong, and I am afraid that without access to the broken image, this will be nigh impossible to fix... I'm afraid I'll have to close this one.
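
For anyone who hits the same symptom from the description (strace -p shows nothing while the process sits at 100% of one CPU): that pattern is consistent with a loop that makes no system calls, though it does not prove it. A rough way to check is to sample the process's CPU counters from /proc/<PID>/stat twice and see whether the time is accumulating in user mode or in the kernel. The sketch below is not from this bug; the function names cpu_ticks and spin_kind are hypothetical, and it assumes a Linux /proc and a POSIX shell.

```shell
#!/bin/sh
# Hypothetical helper (not part of e2fsprogs): print "utime stime",
# in clock ticks, for the given PID, read from /proc/<PID>/stat.
cpu_ticks() {
    stat=$(cat "/proc/$1/stat" 2>/dev/null) || return 1
    # Drop "pid (comm) " up to the last ") " so that a command name
    # containing spaces or parentheses cannot shift the fields.
    rest=${stat##*) }
    set -- $rest
    # The remainder starts at field 3 (state), so utime (field 14)
    # is now $12 and stime (field 15) is now $13.
    echo "${12} ${13}"
}

# Hypothetical helper: sample a PID twice, INTERVAL seconds apart
# (default 2), and print where its CPU time went: user, kernel or idle.
spin_kind() {
    pid=$1
    interval=${2:-2}
    before=$(cpu_ticks "$pid") || return 1
    sleep "$interval"
    after=$(cpu_ticks "$pid") || return 1
    user=$(( ${after% *} - ${before% *} ))
    sys=$(( ${after#* } - ${before#* } ))
    if [ "$user" -eq 0 ] && [ "$sys" -eq 0 ]; then
        echo idle
    elif [ "$user" -ge "$sys" ]; then
        echo user
    else
        echo kernel
    fi
}
```

Usage would be along the lines of spin_kind "$(pidof fsck.ext2)". A result of "user" would suggest the process is looping in its own code, in which case a backtrace from gdb -p <PID> or a core file, as was uploaded here, is the more useful artifact; "kernel" would point back toward the storage stack and dmesg.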