|Summary:||ext3 filesystem won't fsck after power loss|
|Product:||[Retired] Red Hat Linux||Reporter:||Drew Vogel <andrew.vogel>|
|Component:||e2fsprogs||Assignee:||Florian La Roche <laroche>|
|Status:||CLOSED NOTABUG||QA Contact:||Jay Turner <jturner>|
|Version:||8.0||CC:||barryn, menscher, srevivo|
|Fixed In Version:||Doc Type:||Bug Fix|
|Last Closed:||2003-06-03 12:51:58 UTC||Type:||---|
Description Drew Vogel 2003-04-22 03:44:21 UTC
Description of problem:
A power outage/brownout seems to have corrupted several of my ext3 partitions. When running fsck on them, it returns "short read; bad superblock". During bootup, RH stops with "Can't find matching filesystem: LABEL=/home" and offers the console. All attempts to fix have failed.

Version-Release number of selected component (if applicable):
RedHat 8.0, kept up2date.

How reproducible:
Hard to reproduce.

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
Comment 1 Drew Vogel 2003-04-22 03:51:23 UTC
The alternate superblocks can be listed with mke2fs -n <device>, but running e2fsck -b <superblock> <device> returns an error ("Invalid argument while reading block -2147450360, invalid argument reading journal superblock, invalid argument while checking ext3 journal for /var").
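For context on where backup superblocks usually live, here is a small Python sketch that computes candidate block numbers to pass to e2fsck -b. It assumes the filesystem was created with mke2fs defaults (sparse_super enabled, 8 * block_size blocks per group). The function name and parameters are hypothetical, and mke2fs -n remains the authoritative listing for a given device.

```python
def backup_superblock_blocks(block_size, total_blocks):
    """Candidate backup superblock block numbers (assumes mke2fs defaults)."""
    # One block-bitmap block has 8 * block_size bits, hence that many blocks per group.
    blocks_per_group = 8 * block_size
    # With 1 KiB blocks the boot block occupies block 0, so data starts at block 1.
    first_data_block = 1 if block_size == 1024 else 0
    # Number of block groups, rounded up.
    total_groups = -(-(total_blocks - first_data_block) // blocks_per_group)

    # sparse_super keeps backups in group 1 and in groups that are powers of 3, 5 and 7.
    groups = {1}
    for base in (3, 5, 7):
        group = base
        while group < total_groups:
            groups.add(group)
            group *= base

    return [first_data_block + g * blocks_per_group
            for g in sorted(groups) if g < total_groups]


if __name__ == "__main__":
    for size in (1024, 2048, 4096):
        print(size, backup_superblock_blocks(size, 1_000_000)[:3])
```

Under these assumptions the first candidates for 1 KiB, 2 KiB and 4 KiB blocks are 8193, 16384 and 32768, which are the values tried in the IRC log in comment 2.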
Comment 2 Drew Vogel 2003-04-22 13:04:46 UTC
This offers additional information in the form of my /etc/fstab, a listing of what we've determined to be the partitioning scheme on the drive, and a log from an IRC chat showing what we've tried.

/etc/fstab:
===========
LABEL=/      /              ext3     defaults                 1 1
LABEL=/boot  /boot          ext3     exec,dev,suid,rw         1 2
none         /dev/pts       devpts   gid=5,mode=620           0 0
LABEL=/home  /home          ext3     exec,dev,suid,rw         1 2
none         /proc          proc     defaults                 0 0
none         /dev/shm       tmpfs    exec,dev,suid,rw         0 0
LABEL=/tmp   /tmp           ext3     exec,dev,suid,rw         1 2
LABEL=/var   /var           ext3     exec,dev,suid,rw         1 2
/dev/hda3    swap           swap     defaults                 0 0
/dev/cdrom   /mnt/cdrom     iso9660  noauto,owner,kudzu,rw    0 0
/dev/hdd4    /mnt/zip100.0  vfat     noauto,owner,kudzu,rw    0 0
/dev/fd0     /mnt/floppy    auto     noauto,owner,kudzu       0 0

PARTITIONS:
===========
HDA1 = /boot
HDA2 = /
HDA3 = SWAP
HDA4 = ??? (short read; bad superblock)
HDA5 = /tmp
HDB1 = /var
HDB2 = /home

IRC LOG
=======
[21:48:26] <drew> I'm wondering if anyone can help with a bootup problem I'm having with my RH8.0 server...
[21:49:24] <drew> Anyone?
[21:49:54] <vexas-z> state the problem and pray...
[21:50:03] <drew> hehehe. Thanks.
[21:51:35] <drew> Last night at 8:30pm, I was on my machine from my parent's house, and it was working fine. Got home at 9:30, and it was crashed. Rebooted, and when I reboot, it locates and mounts my FIRST physical drive (master, IDE channel 1), but then when it tries to hit the SECOND drive (slave, IDE channel 1), it gets an error.
[21:52:12] <vexas-z> whats the error say?
[21:52:12] <drew> "Couldn't find matching filesystem: LABEL=/home"
[21:52:38] <vexas-z> is the bios at boot seeing both drives?
[21:53:15] <drew> Yes. Bios sees both physical drives. It's NOT seeing my ZIP drive (second IDE channel, slave). The CD burner, IDE channel 2, master, shows in bios.
[21:54:15] <vexas-z> did you check the /etc/fstab file?
[21:54:32] <drew> Yes, it's there, but honestly, I don't know what I'm looking at there. I can copy it for you.
[21:55:03] <vexas-z> usually....I believe this is where AUTO mounting occurs of drives.
[21:55:45] <drew> LABEL=/ / ext3 defaults 1 1 LABEL=/boot /boot ext3 exec,dev,suid,rw 1 2 none /dev/pts devpts gid=5,mode=620 0 0 LABEL=/home /home ext3 exec,dev,suid,rw 1 2 none /proc proc defaults 0 0 none /dev/shm tmpfs exec,dev,suid,rw 0 0 LABEL=/tmp /tmp ext3 exec,dev,suid,rw 1 2 LABEL=/var /var ext3 exec,dev,suid,rw 1 2 /dev/hda3 swap swap defaults 0 0 /dev/cdrom /mnt/cdrom iso9660 noauto,owner,kudzu,rw 0 0 /dev/hdd4 /mnt/zip100.0 vfat noauto,owner,kudzu,rw 0 0 /dev/fd0 /mnt/floppy auto noauto,owner,kudzu 0 0
[21:58:33] <vexas-z> did you you try to manually mount your second drive?
[21:59:33] <drew> How?
[22:00:55] <vexas-z> mount /dev/hdb /mnt/whereyouwant_it _mounted
[22:01:20] <drew> Must specify FS type... The drives are Ext3.
[22:02:02] <drew> Is that "mount -t ext3 /dev/hdb /home"?
[22:02:49] <vexas-z> dont mount it at /home.
[22:03:20] <vexas-z> was home lcated on your second ide drive?
[22:03:42] <drew> I _THINK_ so... I think it had /var and /home on it. I don't know how to check that, though...
[22:04:25] <vexas-z> you can just type "mount" without anything else -- it will display whats going on.
[22:06:03] <drew> Three lines... 1. /dev/hda2 on / type ext3 (rw)
[22:06:20] <drew> 2. none on /proc type proc (rw)
[22:06:46] <drew> 3. usbdevfs on /proc/bus/usb type usbdevfs (rw)
[22:06:48] <drew> That's it.
[22:08:51] <vexas-z> In think your are over my head.
[22:10:40] <drew> Thanks for working on it with me, Vexas! I appreciate it.
[22:10:51] <vexas-z> i try..sorry I cant help more.
[22:11:14] <drew> Nononono... You know where to look, which is further than _I've_ gotten on my own, so I appreciate it!
[22:12:17] <drew> The closest I've gotten is that it might be a zero-sector partition, which it's not.
[22:12:44] <drew> This is NOT a new install; it's been running reliably for months and months before last night. We had a thunderstorm, and I expect that the house browned-out during a write or something.
[22:12:52] <drew> It's also said something about a bad superblock...
[22:13:28] <drew> Anyone else have any ideas?
[22:13:47] <Hydrogenum> drew: wasn't really paying attention, but sounds like you can't fsck?
[22:14:50] <drew> True... Hi Hydro... It's getting part way through system reboot -- finds my first physical drive just fine. Gets to the second and says "Couldn't find matching filesystem: LABEL=/home".
[22:14:56] <drew> Gives me a console.
[22:15:32] <Hydrogenum> what happens if you try to fsck it?
[22:16:23] <drew> What should the command line be?
[22:16:28] <drew> fsck /dev/hdb ???
[22:16:36] <|Jef|> drew: sounds like a job for........rescue mode
[22:16:45] <Hydrogenum> probably fsck /dev/hdb1
[22:16:53] <drew> Lemme try it, Hydro.
[22:16:54] <Hydrogenum> depends on your partition layout
[22:17:01] <Hydrogenum> I dont' know which partition /home is for you
[22:17:08] <|Jef|> Hydrogenum: silly labels
[22:17:22] <|Jef|> Hydrogenum: will fdisk show you the labels?
[22:17:28] <Hydrogenum> doubtful
[22:17:36] <drew> I think it does...
[22:17:41] <Hydrogenum> drew: what's mounted now?
[22:17:54] <|Jef|> Hydrogenum: i thought the labels as used from fstab were the volume labels on the partitions
[22:17:57] <drew> Err... Fdisk does NOT show labels.
[22:18:14] <drew> Typing mount shows me three lines:
[22:18:18] <drew> 1. /dev/hda2 on / type ext3 (rw)
[22:18:25] <drew> 2. none on /proc type proc (rw)
[22:18:32] <drew> 3. usbdevfs on /proc/bus/usb type usbdevfs (rw)
[22:18:35] <drew> That's it.
[22:18:54] <Hydrogenum> and you're sure /home is /dev/hda4 ?
[22:18:56] <drew> Using fsconf, I can manually mount all EXCEPT for /home and /var
[22:19:13] <drew> Hydro... I THOUGHT something else showed me that. fsconf? How would I find out?
[22:19:31] <drew> RH8.0... Something called fsconf
[22:19:44] <strawman> try fdisk -l /dev/hda
[22:21:36] <drew> Strawman... It's got /dev/hda1 through /dev/hda5. hda3 is SWAP. hda5 is EXTENDED, and has the same start/end sectors as hda5, though hda5 has slightly fewer BLOCKS than hda4.
[22:22:42] <drew> Doing 'fdisk -l /dev/hdb' shows /dev/hdb1 & /dev/hdb2, both system type of LINUX.
[22:22:57] <Hydrogenum> drew: FYI, there can only be four primary partitions. They get around that by allowing them to be "extended" partitions, which can be subdivided
[22:23:09] <Hydrogenum> so hda5 is really a subset of hda4
[22:23:37] <drew> Gotcha. I don't remember how, but I think I saw somewhere in this process that /home lived in /dev/hda4.
[22:23:53] <|Jef|> Hydrogenum: so the trick is...how do you go about figureing which parition is /home since fstab is using labels instead of actually device listings
[22:24:09] <drew> That's the trick.
[22:24:15] <Hydrogenum> |Jef|: no idea... labels scare me
[22:24:30] <drew> RH8 musta set them up automagically... I wouldn't do that.
[22:24:33] <strawman> fsck each partition manually, reboot :)
[22:24:50] <drew> And I've heard from others that "labels are scary"... Dunno WHY, but I believe it!
[22:25:09] <Hydrogenum> drew: now you know why ;)
[22:25:44] <drew> Straw... If I do "fsck /dev/hda4", I get "Couldn't find matching filesystem: LABEL=/home".
[22:25:50] <drew> True, Hydro.
[22:26:17] <Hydrogenum> seriously, labels are probably a good thing. It's just that nobody is used to them.
[22:26:23] <drew> OIC.
[22:26:49] <drew> Any ideas how to figure out which partion is /home?
[22:26:54] <drew> partion ==partition
[22:27:19] <Hydrogenum> what is the partition type for hda4, etc/ ?
[22:27:31] <Hydrogenum> fdisk should at least know that...
[22:27:36] <drew> hda4 == EXTENDED.
[22:27:51] <drew> hda3 == SWAP
[22:27:55] <Hydrogenum> and hdb1 and hdb2 were LINUX, right?
[22:28:02] <drew> hda1,2,5==LINUX
[22:28:08] <strawman> i'll bet /home lives on hda5
[22:28:22] <Hydrogenum> ok... try to fsck /dev/hda5
[22:28:26] <drew> Remember, though, I've got two partitions on /dev/hdb, too...
[22:28:50] <drew> "fsck /dev/hda5" gives me "Couldn't find matching filesystem: LABEL=/home".
[22:28:51] <drew> Hrm.
[22:29:02] <|Jef|> Hydrogenum: ah it looks like tune2fs is where you set the label
[22:29:09] <drew> So does fsck /dev/hda3
[22:29:18] <drew> tune2fs?
[22:29:27] <Hydrogenum> do *not* fsck /dev/hda3
[22:29:35] <haji> how easy is it to network with windows on RH?
[22:30:02] <drew> OK, Hydro. hda2 complained about active fs, so I said "NO".
[22:30:23] <Hydrogenum> drew: try the hdb? partitions?
[22:30:38] <|Jef|> /sbin/tune2fs -l /dev/hda3 for example should list the info including the volume label
[22:30:58] <Hydrogenum> |Jef|: even if not mounted?
[22:31:04] <|Jef|> Hydrogenum: its just a list
[22:31:05] <drew> Both of the hdb? partitions gave me "Couldn't find matching filesystem: LABEL=/home".
[22:31:19] <Hydrogenum> why are these all saying /home ???
[22:31:21] <|Jef|> Hydrogenum: you can actually use tune2fs to mounted systems
[22:31:48] <|Jef|> Hydrogenum: maybe because hes not in rescue mode yet
[22:31:59] <|Jef|> Hydrogenum: and the running system is expecting a /home directory to be mounted
[22:32:03] <drew> Would it be beneficial to boot from RH8 CD into RESCUE mode?
[22:32:11] <Hydrogenum> drew: yes, do that -- might help
[22:32:27] <drew> Okay. Working.
[22:33:27] <drew> Rebooting from RH8 Cd...
[22:33:51] <|Jef|> drew: once a boot process craps out with a filesystem error that a journal recovery cant get around...its probably wisest to go into rescue mode so you can work out of the ramdisk from the cdrom and leave the harddrive partitions unmounted...so you can fix them
[22:34:27] <drew> Thanks, Jef! I'm booting into rescue mode now -- typed "linux rescue".
[22:35:43] <drew> Do I want to mount file systems READ-ONLY or into /mnt/sysimage?
[22:36:07] <drew> Or I can SKIP and go right to a command prompt... Is that what I want, Jef?
[22:36:12] <|Jef|> drew: you want to leave the harddrtive partitions unmounted so you can fsck them
[22:36:25] <drew> So, SKIP or READ-ONLY?
[22:36:38] <|Jef|> drew: skip
[22:36:43] <drew> K. SKIPPING.
[22:36:56] <drew> At command shell.
[22:37:45] <|Jef|> drew: tune2fs -l /dev/hda3 for example
[22:37:52] <|Jef|> drew: to see what the label for that partition is
[22:37:55] <drew> Ok.
[22:38:37] <|Jef|> drew: its spew a lot of info the volume name is at the top
[22:39:00] <|Jef|> drew: so now the trick is...to find the /home partition and fsck it
[22:40:49] <Hydrogenum> keep in mind your /dev/hda3 is swap, so you shouldn't expect that one to have a volume label
[22:40:51] <drew> Here's the report from /dev/hda?:
[22:40:58] <drew> hda1 == /boot
[22:41:03] <drew> hda2 == /
[22:41:16] <drew> hda3 == short read; bad superblock
[22:41:21] <drew> hda4 = short read; bad superblock
[22:41:25] <drew> hda5 == /tmp
[22:41:28] <drew> Working on hdb? now.
[22:41:39] <Hydrogenum> ooh! progress!
[22:42:00] <|Jef|> Hydrogenum: so is hda5 inside hda3 or hda4
[22:42:06] <Hydrogenum> it's inside hda4
[22:42:06] <drew> Both hdb1 & hdb2 == short read; bad superblock
[22:42:11] <Hydrogenum> hda3 is his swap partition
[22:42:18] <|Jef|> Hydrogenum: ah well...
[22:42:24] <Hydrogenum> and hdb? sounds screwed... :-(
[22:42:31] <drew> My guess is that /var & /home are on hdb.
[22:42:32] <|Jef|> drew: fsck the hdb* paritions
[22:42:46] <drew> Jef... "fsck /dev/hdb1"???
[22:43:06] <Hydrogenum> yes
[22:43:33] <drew> Did "fsck /dev/hdb1": Got a LONG error... Want it verbatum?
[22:43:55] <Hydrogenum> just the first line
[22:44:15] <drew> fsck.ext2: Filesystem revision too high while trying to open /dev/hdb1.
[22:44:36] <Hydrogenum> you're running RH9, right?
[22:44:41] <Hydrogenum> but that was a RH8 bootdisk?
[22:44:52] <drew> RH8 == installed, rescue == RH8
[22:45:29] <drew> Shall I try fsck /dev/hdb2?
[22:45:31] <|Jef|> Hydrogenum: what the hell does that revision thing mean
[22:45:50] <drew> Same error for "fsck /dev/hdb2".
[22:45:59] <Hydrogenum> |Jef|: I was guessing it has a later version of ext2 than the rescue disk can handle
[22:46:30] <drew> At the bottom of the error, it says "The superblock could not be read or does not describe a correct ext2 filesystem..."
[22:46:37] <|Jef|> drew: and hdb paritions are not mounted?
[22:46:47] <drew> Suggests that the superblock is corrupt...
[22:46:53] <Hydrogenum> drew: I think we need to have it read an alternate superblock
[22:47:01] <drew> Jef: Nope. Not mounted.
[22:47:31] <drew> It suggests "e2fsck -b 8193 <device>". Shall I try that?
[22:47:41] <Hydrogenum> yes, that's the command I was looking for, actually
[22:48:50] <|Jef|> Hydrogenum: looks like the revision error is superblock related too
[22:48:52] <drew> Tried "e2fsck -b 8193 /dev/hdb1" and "... /dev/hdb2". Got "bad magic number in super-block while trying to open /dev/hdb?".
[22:49:13] <drew> I think I've got a RH7 CD around here. Run RESCUE from that?
[22:49:14] <Hydrogenum> drew: try again, but with 16384 instead of 8193
[22:49:33] <Hydrogenum> and if that doesn't work, try 32768
[22:50:07] <drew> 16384 returns "Filesystem has unexpected block size while trying to open..."
[22:50:25] <Hydrogenum> basically, we're looking for the alternate superblock here
[22:50:37] <drew> 32768 gives a different looking error... Here:
[22:50:40] <Hydrogenum> just so you know what's going on
[22:51:28] <drew> on HDB1 it says "/var: Invalid arguement while reading block - 2147450360. So we know that HDB1 == /var
[22:51:49] <drew> On hdb2 it's the same error, except it's "/home" instead of "/var".
[22:51:51] <|Jef|> Hydrogenum: grrr....googles giving me some similar power outage spawned error reports...but no solutions yet
[22:52:05] <drew> And the block is -2147450361
[22:52:23] <drew> Jef: So we're thinking it was a power outage/blip?
[22:52:33] <|Jef|> drew: didnt say you had a power blip?
[22:52:42] <drew> Yes, Jef.
[22:53:05] <Hydrogenum> hrmm... those numbers are close to 2^31
[22:53:06] <drew> I've got a good surge protector, but it mighta been a brownout because no clocks were blinking, etc.
[22:53:08] <|Jef|> grrrr...all the most relavent google responses are not in english....
[22:53:36] <strawman> yes :) farking German
[22:53:37] <drew> Do we think a RH_7_ rescue CD would be useful?
[22:53:41] <drew> babelfish?
[22:54:06] <|Jef|> Hydrogenum: do we have to roll out the journal?
[22:54:09] <Hydrogenum> drew: doubt RH7 CD would be any better
[22:54:14] <|Jef|> drew: was this a fresh install?
[22:54:21] <Hydrogenum> |Jef|: define "roll out the journal" ?
[22:54:30] <drew> This was a fresh 8.0 install way back when, kept up2date.
[22:54:34] <|Jef|> Hydrogenum: force the removal of the journal...
[22:54:44] <Hydrogenum> |Jef|: shouldn't matter at all
[22:54:46] <drew> Hydro... <<gulp>
[22:55:30] <drew> Dunno what "remove the journal" means, but it sounds like I should get some rubber gloves.
[22:55:53] <Hydrogenum> drew: don't worry about it... the journal is irrelevant here
[22:56:12] <|Jef|> drew: at this point im just groping
[22:56:22] <|Jef|> drew: im even on google...
[22:56:25] <drew> I am, of course, worried about data loss...
[22:56:32] <Hydrogenum> drew: if you're not in a rush, you might want to wait until later, when others can help
[22:56:52] <Hydrogenum> as |Jef| said, we're at the limit of our knowledge
[22:56:58] <drew> I'm either in a rush, or not... I'm leaving Wednesday morning and will be AFK until Sun evening...
[22:57:25] <Hydrogenum> well, I'm just saying that maybe in an hour some expert will sign on
[22:57:31] <drew> Tomorrow, I gotta work (AF server), then packing that evening... Can work on it after ~9:00pm EST tomorrow...
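
The observation near the end of the log that the reported block numbers are close to 2^31 can be checked with a few lines of arithmetic. The Python sketch below assumes the block number is a 32-bit quantity that e2fsck printed as a signed integer; that is an assumption, not something the log confirms, and the helper names are hypothetical.

```python
def as_unsigned32(value):
    """Reinterpret a signed 32-bit value as unsigned (assumed width: 32 bits)."""
    return value & 0xFFFFFFFF


def offset_above_2_31(value):
    """Distance of the unsigned value above 2**31, i.e. the value with bit 31 cleared."""
    return as_unsigned32(value) - 2**31


if __name__ == "__main__":
    # Block numbers quoted in the log for /var (hdb1) and /home (hdb2).
    for reported in (-2147450360, -2147450361):
        print(reported, as_unsigned32(reported), offset_above_2_31(reported))
```

Under that assumption both values have the top bit set and sit roughly 33,000 above 2^31, which makes them unlikely to be valid block numbers on these partitions. This is only arithmetic, not a diagnosis of how the values got there.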
Comment 3 Barry K. Nathan 2003-05-16 09:30:34 UTC
Drew, what kind of hard drive experienced this problem? Some (if not most?) IDE drives can render sectors completely unreadable if the power goes out in the middle of a disk write. To the best of my knowledge, few (if any) SCSI drives are affected by this problem (I guess they can ensure that the entire sector gets written out before the drive completely loses power). If the drive did a partial write, then when it tries to read the sector back it will look like a bad sector (with the same type of error reported back to the Linux kernel). That could be the cause of the "short read" errors.
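To make the difference between a "short read" and a bad superblock concrete, here is a minimal Python sketch of what reading the primary ext2/ext3 superblock involves. The offsets used (superblock at byte 1024, block-size field at offset 24, magic 0xEF53 at offset 56, revision at offset 76, volume label at offset 120) follow the on-disk ext2 layout. The function name, the returned dictionary and the status strings are hypothetical and are not e2fsprogs output.

```python
import io
import struct

SUPERBLOCK_OFFSET = 1024   # the primary superblock starts 1024 bytes into the partition
SUPERBLOCK_SIZE = 1024
EXT2_MAGIC = 0xEF53        # shared by ext2 and ext3


def inspect_superblock(dev):
    """Read the primary superblock from a binary file-like object and classify it."""
    try:
        dev.seek(SUPERBLOCK_OFFSET)
        raw = dev.read(SUPERBLOCK_SIZE)
    except OSError as exc:
        # An unreadable sector typically surfaces as an I/O error from the kernel.
        return {"status": "read error", "detail": str(exc)}

    if len(raw) < SUPERBLOCK_SIZE:
        return {"status": "short read", "bytes": len(raw)}

    (magic,) = struct.unpack_from("<H", raw, 56)
    if magic != EXT2_MAGIC:
        return {"status": "bad magic", "magic": magic}

    (log_block_size,) = struct.unpack_from("<I", raw, 24)
    (rev_level,) = struct.unpack_from("<I", raw, 76)
    label = raw[120:136].split(b"\0", 1)[0].decode("ascii", "replace")
    return {
        "status": "ok",
        "label": label,                       # what LABEL=/home in fstab is matched against
        "block_size": 1024 << log_block_size,
        "rev_level": rev_level,
    }


if __name__ == "__main__":
    # A 1500-byte image is too small to hold a full superblock.
    print(inspect_superblock(io.BytesIO(b"\0" * 1500)))
```

On a real system the same function could be given a partition opened read-only in binary mode, which normally requires root. It only reads and never writes to the device.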
Comment 4 Florian La Roche 2003-06-03 12:51:58 UTC
You need to guess alternative superblocks to get access to this filesystem again. I think there are tools which try to find partitions on a disk with a lost partition table, but I have not used them so far, and they have not been included in Red Hat Linux so far.

Greetings, and I hope you have recovered your data,
Florian La Roche
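
As a rough illustration of what "guessing" alternative superblocks means, the Python sketch below scans a raw image for the ext2/ext3 magic number at 1 KiB-aligned offsets. It is a simplified stand-in for real recovery tools, not their algorithm. The function name and the `step` parameter are hypothetical, and a match on a two-byte magic can easily be a false positive.

```python
import struct

EXT2_MAGIC = 0xEF53
MAGIC_OFFSET = 56   # position of the magic field inside a superblock


def find_superblock_candidates(image, step=1024):
    """Return byte offsets in `image` where a superblock might start."""
    hits = []
    # Stop early enough that the two magic bytes are always inside the buffer.
    for offset in range(0, len(image) - (MAGIC_OFFSET + 2) + 1, step):
        (magic,) = struct.unpack_from("<H", image, offset + MAGIC_OFFSET)
        if magic == EXT2_MAGIC:
            hits.append(offset)
    return hits


if __name__ == "__main__":
    # Synthetic 8 KiB image with a fake superblock magic planted at byte 1024.
    demo = bytearray(8192)
    struct.pack_into("<H", demo, 1024 + MAGIC_OFFSET, EXT2_MAGIC)
    print(find_superblock_candidates(bytes(demo)))
```

For a backup superblock, a hit at byte offset N would correspond to block N // block_size for e2fsck -b, with the block size supplied through -B. Each candidate would still need to be validated by e2fsck itself, preferably against a copy of the partition rather than the original.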