Bug 523378

Summary: [kernel] file system errors for recent 2.6.31 kernels
Product: [Fedora] Fedora
Reporter: Joachim Frieben <jfrieben>
Component: kernel
Assignee: Eric Sandeen <esandeen>
Status: CLOSED DUPLICATE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: low
Version: rawhide
CC: bloch, esandeen, itamar, kernel-maint, vedran, yaneti
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Last Closed: 2009-10-22 16:03:57 UTC
Bug Blocks: 473303
Attachments:
  boot.log after forced reboot due to file system inconsistencies (flags: none)

Description Joachim Frieben 2009-09-15 09:31:22 UTC
Description of problem:
After a fresh install from the "rawhide" tree, the (already familiar) failed attempts to suspend the system with KMS active, or resets of the system after an X freeze, lead to the following error messages on the next boot:

  "/dev/VolGroup00/LogVol00 contains a file system error, check forced
   Type Ctrl -d to proceed with normal startup, (or give root password
   for system maintenance)
   ..".

Without waiting for user input, the system then complains about a missing library /usr/lib/libfreebl3.so. After that it reboots immediately, and a file system check on /dev/VolGroup00/LogVol00 is performed automatically upon reboot, after which the system comes up as expected.

Version-Release number of selected component (if applicable):
kernel-2.6.31.6-2.fc12.x86_64

How reproducible:


Steps to Reproduce:
1. Reset system.
  
Actual results:
Upon reboot, the system complains about file system errors.

Expected results:
System picks up journal and checks file system integrity.

Additional info:
- /dev/VolGroup00/LogVol00 contains the root file system. /boot, /usr and
  /home are hosted by different partitions/volumes.
- This issue seems to be recent, having arisen within the last two weeks.

Comment 1 Joachim Frieben 2009-09-15 09:33:31 UTC
System actually complains about a missing library /usr/lib64/libfreebl3.so.

Comment 2 Joachim Frieben 2009-09-15 09:35:42 UTC
Root volume /dev/VolGroup00/LogVol00 has file system type ext4.

Comment 3 Joachim Frieben 2009-09-25 12:26:39 UTC
After a fresh install from the "rawhide" tree using the 2009-08-18 boot.iso (because the most recent anaconda versions crash), no initrd entry is present in /etc/grub.conf, and consequently the kernel panics upon the next reboot.
After booting into rescue mode, chrooting and installing the latest device-mapper, dracut, lvm2 and kernel packages from Koji, the new kernel entry in /etc/grub.conf looks correct, and the system seems to boot as expected.
However, shortly after starting udev, the system reports "Unexpected inconsistencies of file system", and then "Automatic reboot in progress".
Even running fsck.ext4 on /dev/VolGroup00/LogVol00 has no effect: after rebooting the system, the same error messages about inconsistencies reappear.
There seems to be no way to rescue the system other than installing it from scratch, possibly using ext3 instead of ext4 .. :o(
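
A missing initrd entry like the one described above can be spotted by scanning the GRUB legacy configuration for boot stanzas that have a kernel line but no initrd line. The following is a minimal Python sketch, not taken from this report: the function name and the sample stanzas are hypothetical, and it assumes the usual GRUB legacy layout in which each "title" line starts a stanza.

```python
def stanzas_missing_initrd(grub_conf_text):
    """Return titles of stanzas that have a kernel line but no initrd line."""
    stanzas = []
    for raw in grub_conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split(None, 1)
        keyword = fields[0].lower()
        if keyword == "title":
            # A title line opens a new boot stanza.
            title = fields[1] if len(fields) > 1 else ""
            stanzas.append({"title": title, "kernel": False, "initrd": False})
        elif stanzas and keyword in ("kernel", "initrd"):
            stanzas[-1][keyword] = True
    return [s["title"] for s in stanzas if s["kernel"] and not s["initrd"]]


# Hypothetical sample; file names are placeholders, not real package names.
SAMPLE_GRUB_CONF = """\
default=0
timeout=5
title Fedora (hypothetical entry with initrd)
    root (hd0,0)
    kernel /vmlinuz-example ro root=/dev/mapper/VolGroup00-LogVol00
    initrd /initrd-example.img
title Fedora (hypothetical entry without initrd)
    root (hd0,0)
    kernel /vmlinuz-example ro root=/dev/mapper/VolGroup00-LogVol00
"""

for title in stanzas_missing_initrd(SAMPLE_GRUB_CONF):
    print("no initrd line for:", title)
```

On a real system the text would be read from /etc/grub.conf, typically from within the rescue-mode chroot.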

Comment 4 Joachim Frieben 2009-10-02 07:44:34 UTC
Created attachment 363426 [details]
boot.log after forced reboot due to file system inconsistencies

Because of temporary problems with the power supply, the system happens to power off fairly frequently. Here is the typical boot log after the forced reboot that is triggered once file system inconsistencies have been detected, which happens every time after a power cut. They always appear on the root [ext4] volume, never on the /usr [ext4] or /home [ext3] volumes.

Comment 5 Joachim Frieben 2009-10-02 07:57:58 UTC
Comment on attachment 363426 [details]
boot.log after forced reboot due to file system inconsistencies

Welcome to Fedora 
Press 'I' to enter interactive startup.
Starting udev: [ OK ]
Setting hostname banach:  [ OK ]
Setting up Logical Volume Management:   4 logical volume(s) in volume group "VolGroup00" now active  [ OK ]
Checking filesystems
/dev/mapper/VolGroup00-LogVol00 contains a file system with errors, check forced.
/dev/mapper/VolGroup00-LogVol00: |=                               |  2.2%
/dev/mapper/VolGroup00-LogVol00: |=                               /  4.4%
/dev/mapper/VolGroup00-LogVol00: |======                          - 19.7%
/dev/mapper/VolGroup00-LogVol00: |========================        \ 74.9%
/dev/mapper/VolGroup00-LogVol00: |===========================     | 85.7%
/dev/mapper/VolGroup00-LogVol00: |================================| 100.0%
/dev/mapper/VolGroup00-LogVol00: 25890/262144 files (0.4% non-contiguous), 443960/1048576 blocks
/dev/sda2: recovering journal
/dev/sda2: clean, 36/26208 files, 26549/104420 blocks
/dev/mapper/VolGroup00-LogVol02: recovering journal
/dev/mapper/VolGroup00-LogVol02: clean, 79288/6635520 files, 13493266/26533888 blocks
/dev/mapper/VolGroup00-LogVol01: recovering journal
/dev/mapper/VolGroup00-LogVol01: clean, 266965/524288 files, 1648210/2097152 blocks  [ OK ]
Remounting root filesystem in read-write mode:  [ OK ]
Mounting local filesystems:  [ OK ]
Enabling local filesystem quotas:  [ OK ]
Enabling /etc/fstab swaps:  [ OK ]
Entering non-interactive startup
Starting monitoring for VG VolGroup00:   4 logical volume(s) in volume group "VolGroup00" monitored  [ OK ]
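
The per-device summary lines in a log like the one above can be turned into usage figures for comparison between boots. This is a minimal Python sketch that assumes the line shape seen in this log ("device: [clean, ]N/M files[ (...)], N/M blocks"); the regular expression and function name are illustrative and not part of e2fsprogs.

```python
import re

# Matches only the final summary line per device, not progress bars
# or "recovering journal" lines.
SUMMARY = re.compile(
    r"^(?P<dev>\S+): (?:clean, )?"
    r"(?P<files_used>\d+)/(?P<files_total>\d+) files"
    r"(?: \([^)]*\))?, "
    r"(?P<blocks_used>\d+)/(?P<blocks_total>\d+) blocks"
)
FIELDS = ("files_used", "files_total", "blocks_used", "blocks_total")


def parse_fsck_summaries(log_text):
    """Map device name -> (files_used, files_total, blocks_used, blocks_total)."""
    results = {}
    for line in log_text.splitlines():
        match = SUMMARY.match(line.strip())
        if match:
            results[match.group("dev")] = tuple(int(match.group(f)) for f in FIELDS)
    return results


# Lines copied from the boot log in this comment.
SAMPLE_LOG = """\
/dev/mapper/VolGroup00-LogVol00: |================================| 100.0%
/dev/mapper/VolGroup00-LogVol00: 25890/262144 files (0.4% non-contiguous), 443960/1048576 blocks
/dev/sda2: recovering journal
/dev/sda2: clean, 36/26208 files, 26549/104420 blocks
"""

for dev, (fu, ft, bu, bt) in parse_fsck_summaries(SAMPLE_LOG).items():
    print(f"{dev}: {100 * fu / ft:.1f}% inodes, {100 * bu / bt:.1f}% blocks in use")
```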

Comment 6 Joachim Frieben 2009-10-02 12:36:42 UTC
After a shutdown triggered by pressing the soft power button while investigating bug 526433, the system does not recover from the file system inconsistencies anymore. It always proceeds to the automatic reboot step and is hence caught in an endless loop of reboots.
In the past I had already tried to run fsck.ext4 in rescue mode, but after rebooting the system from the system disk, the same inconsistencies were reported. The hardware profile of the affected system can be found at

http://www.smolts.org/client/show/pub_ce3d646e-2918-44c9-a6c0-10fb3d9a0180

Comment 7 Joachim Frieben 2009-10-02 12:46:25 UTC
I should add that the system lives on a Hitachi Deskstar T7K250 160GB PATA disk which is connected to the onboard nVidia Corporation CK804 IDE interface.

Comment 8 Chuck Ebbert 2009-10-05 04:54:32 UTC
(In reply to comment #1)
> System actually complains about a missing library /usr/lib64/libfreebl3.so.  

From the nss-softokn-freebl package.

Comment 9 Joachim Frieben 2009-10-14 11:41:09 UTC
The issue has been confirmed on two other systems, so faulty hardware can be ruled out. It might be worthwhile checking the ext4 default settings. It seems that recent kernels are a bit too aggressive about maximizing disk I/O performance.

Comment 10 Yanko Kaneti 2009-10-14 11:51:23 UTC
Apart from the libfreebl3.so issue, which I haven't seen, this reads to me like bug 522969. Are you east of UTC?

Comment 11 Joachim Frieben 2009-10-14 13:23:25 UTC
(In reply to comment #10)
Correct, the local time zone is CEST (UTC+2h). I have to repeat that this issue is fairly recent, maybe 1-2 months old. It does not occur on current F11 even when the root volume has file system type ext4.

Comment 12 Joachim Frieben 2009-10-16 16:27:32 UTC
Btw, before the automatic reboot is triggered with the announcement,

  "Automatic reboot in progress."

there -is- a message

  "Superblock last write time is in the future"

And yes, the hardware clock is set to the local time zone CEST.
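
A hardware clock kept in local time together with a "last write time is in the future" complaint is consistent with a time zone offset error. The Python sketch below illustrates the arithmetic only. It assumes (this is not stated in the report) that the hardware clock value is briefly treated as UTC early in boot, that a superblock timestamp is written during that window, and that the file system check runs after the clock has been corrected. The example instant and all names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

CEST = timezone(timedelta(hours=2), "CEST")


def apparent_future_skew(true_instant, rtc_zone):
    """How far ahead a timestamp looks if local RTC time is misread as UTC."""
    # What the hardware clock shows: local wall-clock time without zone info.
    rtc_wall = true_instant.astimezone(rtc_zone).replace(tzinfo=None)
    # Assumed misreading: the same digits interpreted as UTC.
    misread = rtc_wall.replace(tzinfo=timezone.utc)
    # A timestamp written with the misread clock, compared to the true time.
    return misread - true_instant


# Hypothetical instant; any instant gives the same offset for a fixed zone.
true_now = datetime(2009, 10, 16, 14, 0, tzinfo=timezone.utc)
print(apparent_future_skew(true_now, CEST))  # prints 2:00:00
```

Under these assumptions a zone west of UTC yields a negative skew, that is, timestamps that look old rather than future, which would fit the question in comment 10 about being east of UTC.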

Comment 13 Eric Sandeen 2009-10-22 14:32:59 UTC
Regarding the superblock write time in the future, it should be fixed as of bug 522969's resolution in rawhide & f12, so you might update to pick up that fix and get that annoyance out of the way.

Other than that, there have been other reports of corruption problems in ext4 upstream. If you have a repeatable test case, we may have a patch you could try, if you're handy with such things; it's on my list of things to look into today.

Thanks,
-Eric

Comment 14 Joachim Frieben 2009-10-22 16:03:57 UTC

*** This bug has been marked as a duplicate of bug 522969 ***

Comment 15 Eric Sandeen 2009-10-22 16:36:16 UTC
Joachim, so were no other errors found during fsck?

Comment 16 Joachim Frieben 2009-10-22 16:52:45 UTC
No, the situation has evolved toward the normal procedure after an unclean reboot:

Setting up Logical Volume Management:   4 logical volume(s) in volume group "VolGroup00" now active
                                                           [  OK  ]
Checking filesystems
/dev/mapper/VolGroup00-LogVol00: clean, 26468/262144 files, 293417/1048576 blocks
/dev/sda2: clean, 36/51200 files, 29848/204800 blocks
/dev/mapper/VolGroup00-LogVol02: clean, 81179/6635520 files, 11247838/26524672 blocks
/dev/mapper/VolGroup00-LogVol01: clean, 279431/524288 files, 1713916/2097152 blocks
                                                           [  OK  ]
Remounting root filesystem in read-write mode:             [  OK  ]
Mounting local filesystems:                                [  OK  ]
Enabling local filesystem quotas:                          [  OK  ]
Enabling /etc/fstab swaps:                                 [  OK  ]
Entering non-interactive startup
Starting monitoring for VG VolGroup00:   4 logical volume(s) in volume group "VolGroup00" monitored

Comment 17 Adam Huffman 2009-10-23 00:04:38 UTC
Just adding to this report as I've seen these exact symptoms after upgrading my laptop from F11 to the Beta tonight, including the complaint about a missing libfreebl3.so.

I will try installing the updated e2fsprogs via the rescue image.

It's now also complaining that it's unable to write some udev-related files, before the "Welcome to Fedora" text appears.