From Bugzilla Helper:
User-Agent: Mozilla/5.0 Galeon/1.2.0 (X11; Linux i686; U;) Gecko/20020516

Description of problem:
Recently my computer overheated and crashed. (It's a dual processor Athlon MP 1600 in an unairconditioned room. :) Not real bright of me.) I restarted it, and it then _crashed during journal recovery._ I doubt the crash was due to Linux... I suspect the machine had cooled insufficiently and crashed. I let it cool for a long time, and then restarted again. It did a journal recovery (I believe) and then moved on w/o any interaction with me.

Later, though, some filesystem corruption was found. Running "ls" in one of the home directories returned:

ls: kpulse10.f: Input/output error
ls: x9pt003.dat: Input/output error
ls: x9pr003.dat: Input/output error
ls: x9ps003.dat: Input/output error
ls: x9pp003.dat: Input/output error
fftw_f77.i kpuls10.dat kpulse10.out test.dat test.f x9pk003.dat

Looking back at /var/log/messages, there are six instances of fsck clearing orphaned inodes on /home:

Jun 5 22:52:56 crossbow fsck: /home: Clearing orphaned inode 2305690 (uid=502, gid=502, mode=0100664, size=0)
Jun 5 22:52:56 crossbow fsck: /home: Clearing orphaned inode 2305695 (uid=502, gid=502, mode=0100664, size=0)
Jun 5 22:53:39 crossbow kernel: 0x378: FIFO is 16 bytes
Jun 5 22:52:56 crossbow fsck: /home: Clearing orphaned inode 2305694 (uid=502, gid=502, mode=0100664, size=0)
Jun 5 22:53:40 crossbow kernel: 0x378: writeIntrThreshold is 9
Jun 5 22:52:57 crossbow fsck: /home: Clearing orphaned inode 2305696 (uid=502, gid=502, mode=0100664, size=53248)
Jun 5 22:53:40 crossbow kernel: 0x378: readIntrThreshold is 9
Jun 5 22:52:57 crossbow fsck: /home: Clearing orphaned inode 2305687 (uid=502, gid=502, mode=0100664, size=0)
Jun 5 22:52:57 crossbow fsck: /home: Clearing orphaned inode 2305686 (uid=502, gid=502, mode=0100775, size=378017)
Jun 5 22:52:57 crossbow fsck: /home: Clearing orphaned inode 1635643 (uid=500, gid=500, mode=0100700, size=3184)
Jun 5 22:52:57 crossbow fsck: /home: clean, 89262/4480448 files, 2354637/8960253 blocks

(I can't seem to find the startup which crashed while recovering the journal in the log. Weird. I'll have to make another search for it.)

I e2fscked the /home partition, and that seems to have fixed the problem. But I thought ext3 was supposed to fix this sort of thing. It looks like the journaling system does not gracefully recover if the system crashes during journal recovery. (I could be wrong... that's just what it looks like to me.)

Hopefully this is useful! Let me know if you need any more information!

John

Version-Release number of selected component (if applicable):

How reproducible:
Didn't try

Additional info:

"I/O error" usually means that the disk has a bad sector and the filesystem cannot read data from it. Journaling filesystems protect you from the effects of a crash *assuming the data is still intact on disk*. Ext3 does survive a crash during recovery perfectly well. However, if you get I/O errors and bad sectors on disk, then there's nothing that the journaling filesystem can do to correct that. e2fsck might fix it simply by removing the unreadable files completely.

I/O errors also occasionally occur because there is corrupt data on disk even if that data is still readable. Journaling relies on the data that the filesystem sent to disk being written correctly. If the hardware is overheating then it is quite possible for the data to get corrupted, and in that case again the filesystem is powerless to protect you: there's no point in writing a journal carefully to disk if the memory, controller, cpu or disk drive is flipping bits in the journal on its way. Again, a full fsck may be required to sort out the mess afterwards.
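
If it helps to see exactly which directory entries are affected before running a full fsck, something like the following Python sketch could list the names that fail with an I/O error, much as "ls" did in the report. The function name find_io_errors and its lstat parameter are made up for illustration (the parameter only exists so the check can be exercised without a damaged disk). It only looks at inode metadata, so it would not catch files whose data blocks are unreadable.

```python
import errno
import os


def find_io_errors(directory, lstat=os.lstat):
    """Return the names in `directory` whose lstat() fails with EIO.

    `lstat` is injectable purely for testing; by default the real
    os.lstat is used, which reads each entry's inode without
    following symlinks.
    """
    failing = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        try:
            lstat(path)
        except OSError as exc:
            if exc.errno == errno.EIO:
                # Same condition "ls" reports as "Input/output error".
                failing.append(name)
            else:
                # Anything else (permissions, races) is not our concern here.
                raise
    return failing


if __name__ == "__main__":
    import sys

    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for name in find_io_errors(target):
        print(f"{name}: Input/output error")
```

Run against the affected home directory, this would be expected to print the same file names that "ls" complained about; an empty result only means the inodes were readable, not that the file contents are intact.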