Red Hat Bugzilla – Bug 506376
ext3 file system became inconsistent
Last modified: 2009-07-03 21:13:05 EDT
Created attachment 348196 [details]
Description of problem:
I recently installed F11 on my PC. I installed it into a newly formatted ext4 partition. However, I kept several of the ext3 partitions from my F9 system intact. These partitions include /home, /opt, and the F9 root partition, which I have been mounting on /mnt/oldroot.
When I rebooted, the F9 root partition (ext3) was inconsistent. I started to enter Y to the fsck questions. After a few errors, I decided to enter N to all the rest. There were a lot of them, and eventually I just rebooted without getting to the end.
In order to boot into F11, I had to comment out the line for this partition in /etc/fstab. The commented out line looks like:
#/dev/mapper/vg00-root2 /mnt/oldroot ext3 defaults 1 2
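For readers unfamiliar with the format, the fstab line above has six whitespace-separated fields, and the last one is the fsck pass number. A non-zero value asks the boot-time fsck to check the filesystem (1 is conventionally the root filesystem, 2 everything else), which is why commenting the line out let the boot proceed. The following is a minimal sketch of that field layout; the names `FstabEntry` and `parse_fstab_line` are hypothetical helpers, not part of any tool.

```python
from typing import NamedTuple


class FstabEntry(NamedTuple):
    device: str
    mount_point: str
    fs_type: str
    options: str
    dump: int
    fsck_pass: int


def parse_fstab_line(line: str) -> FstabEntry:
    """Split one fstab line into its six fields, ignoring a leading '#'."""
    fields = line.lstrip("#").split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    device, mount_point, fs_type, options, dump, fsck_pass = fields
    return FstabEntry(device, mount_point, fs_type, options,
                      int(dump), int(fsck_pass))


entry = parse_fstab_line(
    "#/dev/mapper/vg00-root2 /mnt/oldroot ext3 defaults 1 2"
)
# A non-zero pass number means boot-time fsck will examine this filesystem.
print(entry.mount_point, entry.fs_type, entry.fsck_pass)
```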
I have also noticed some kernel messages which seem to indicate a problem. I am attaching these messages in a file.
Version-Release number of selected component (if applicable):
How reproducible:
I don't know why this happened or if there is any troubleshooting that I can do.
Steps to Reproduce:
Actual results:
The /dev/mapper/vg00-root2 filesystem became inconsistent.
Expected results:
The /dev/mapper/vg00-root2 filesystem should not have become inconsistent.
Additional info:
My PC is a Dell Optiplex 755 with the Q35 Express chipset. I believe that the ICH is ICH9 DO.
There are two drives in a dmraid configuration (RAID 1).
I don't know if this is related, but when I run the find command in my home directory it aborts.
Can you post the output of these commands:
[root@frogn ~]# lvs
LV VG Attr LSize Origin Snap% Move Log Copy% Convert
files vg00 -wi-ao 40.00G
home vg00 -wi-ao 40.00G
root vg00 -wi-ao 20.00G
root2 vg00 -wi-a- 20.00G
swap vg00 -wi-ao 6.00G
tmp vg00 -wi-ao 10.00G
vartmp vg00 -wi-ao 20.00G
winxp vg00 -wi-a- 30.00G
[root@frogn ~]# pvs
PV VG Fmt Attr PSize PFree
/dev/dm-2 vg00 lvm2 a- 232.59G 46.59G
[root@frogn ~]# vgs
VG #PV #LV #SN Attr VSize VFree
vg00 1 8 0 wz--n- 232.59G 46.59G
dm-5: rw=0, want=2952931456, limit=83886080
attempt to access beyond end of device
83886080 sectors (512 bytes each) = 40G
2952931456 sectors (512 bytes each) = 1408G
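The conversion above can be checked with a few lines of Python. This is only a sketch of the arithmetic, assuming that `want` and `limit` in the kernel message are counted in 512-byte sectors; the helper name `sectors_to_gib` is made up for illustration.

```python
SECTOR_BYTES = 512   # assumed unit of "want" and "limit" in the message
GIB = 1024 ** 3


def sectors_to_gib(sectors: int) -> float:
    """Convert a count of 512-byte sectors to GiB."""
    return sectors * SECTOR_BYTES / GIB


limit = sectors_to_gib(83886080)     # size of the device (dm-5)
want = sectors_to_gib(2952931456)    # offset the read tried to reach
print(f"limit = {limit:.0f}G, want = {want:.0f}G")
# prints "limit = 40G, want = 1408G"
```

The requested offset is roughly 35 times the size of the 40G device, so the read was nowhere near a valid block.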
Can you attach the entire /var/log/dmesg file?
Created attachment 348502 [details]
The timestamp on this file is 2009-06-16 19:02.
When I was running F9, there was a problem with dmraid on some early kernels:
For a while, dmraid wasn't working on this computer. A few months later, I got it working again. (I don't remember exactly what I did; I think I adjusted the kernel parameters.) It's possible that some metadata became messed up, but I never had any problems with dmraid after that on F9.
I am using dmraid on F11 on a different PC with a newer Intel chipset. I don't have any problems there.
I just remembered how I got dmraid working under F9. It involved the initrd image. I unpacked an initrd image from F8 and looked at the init script inside. I then copied a couple of lines from init in F8's initrd to the init in F9's initrd. After that, RAID seemed to work OK.
Created attachment 348696 [details]
a new dmesg output with new errors
I just noticed a bunch of errors in dmesg which I would like to call to your attention. The e1000 driver seems to be involved. I don't know if this is related to the filesystem inconsistency that I experienced earlier.
I don't know exactly what I was doing that caused these errors. I was doing some downloading earlier. I will try to test that and report back.
Just to be clear, you had been using the F9 system up until the F11 upgrade without any problems?
After I updated the initrd in F9, dmraid seemed to be working fine.
I haven't had any further inconsistency errors in F11. (Two ext3 filesystems are no longer being mounted.) However, when I run the find command on the /home filesystem (ext3), it still aborts.
Let me rephrase the question ... when did the problems with the F9/ext3 partitions start?
Boot-time fsck when booting F11 would not have touched them unless they were flagged as having errors from a previous mount, or if you happened to exceed the maximal mount count. So I wonder if something went bad under F9; perhaps you can look at the F9 system logs.
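The two triggers described above can be read off the superblock fields that `tune2fs -l` prints. The sketch below is a simplified model, not e2fsck's actual logic: it looks only at the `Filesystem state`, `Mount count`, and `Maximum mount count` fields and ignores the time-based check interval. The sample text is invented for illustration and does not come from the reporter's system; `parse_tune2fs` and `boot_fsck_expected` are hypothetical names.

```python
def parse_tune2fs(text: str) -> dict:
    """Turn 'Key:   value' lines from `tune2fs -l` into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def boot_fsck_expected(fields: dict) -> bool:
    """Simplified test: error flag set, or mount count at its maximum."""
    has_errors = "with errors" in fields.get("Filesystem state", "")
    mount_count = int(fields.get("Mount count", 0))
    max_count = int(fields.get("Maximum mount count", -1))
    # A maximum of 0 or -1 disables the mount-count check.
    over_limit = max_count > 0 and mount_count >= max_count
    return has_errors or over_limit


# Invented sample output, for illustration only.
SAMPLE = """\
Filesystem state:         clean with errors
Mount count:              12
Maximum mount count:      30
"""

print(boot_fsck_expected(parse_tune2fs(SAMPLE)))
```

A filesystem whose state carries the error flag is checked on the next boot regardless of its mount count.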
After the initrd fix mentioned earlier, I never noticed any problems related to filesystems when running F9. I just looked at and grepped the messages and dmesg log files from F9. I didn't find anything that would indicate to me that there was a problem related to filesystems when running F9.
To answer the earlier question, the first problem that I noticed was the inconsistency error that occurred when I rebooted F11 on Tuesday, 6/16. I believe that the inconsistency error occurred the second time that I booted F11.
I had installed F11 and booted for the first time four days earlier.
I just looked through the messages files from F11. I found one relevant error message:
Jun 12 16:16:52 frogn kernel: EXT3-fs error (device dm-9): htree_dirblock_to_tree: bad entry in directory #1295877: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
> Jun 12 16:16:52 frogn kernel: EXT3-fs error (device dm-9):
> htree_dirblock_to_tree: bad entry in directory #1295877: rec_len is smaller
> than minimal - offset=0, inode=0, rec_len=0, name_len=0
Ok, thanks, that explains the fsck on the subsequent boot. Also, ext3 somewhat easily corrupts directory entries when a drive with write caching loses power, if barriers aren't enabled ... and ext3 disables barriers by default, and dm in the past didn't pass them through even if you enabled them.
Is it possible that the box lost power while this device (dm-9) was mounted?
dm-9 was my root partition for F9, so it was always mounted.
As best as I can remember, the computer hasn't lost power recently. (It is on a UPS.) I'm sure that it has lost power at some point.
When I first installed F9, the video driver was locking up the computer. I had to force the computer to power down in order to restart it. I did that many times, but it was at least 9 months ago.
dm-5 is my home partition. Perhaps the error messages in the first attachment have something to do with why the find command doesn't work. I think I'll try to reformat this partition as ext4 and then restore from backup.
I reformatted the partition and restored the home filesystem from backup. When I did this, I noticed that the disk space numbers in df didn't match those from du -s. I remembered that this was the case even in F9. So, the problems with these ext3 filesystems had existed in F9 ever since I tried to fix dmraid. However, fsck wasn't triggered for the root partition on F9, but it was on F11.
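The df versus du -s comparison mentioned above can be approximated in Python. This is a rough sketch, assuming Linux (where `st_blocks` counts 512-byte units) and a path that is the mount point of its own filesystem; `du_bytes` and `df_used_bytes` are hypothetical helper names. A small gap is normal (filesystem metadata, the journal, files deleted while still open), so only a large unexplained difference would hint at inconsistent block accounting, and fsck would still be needed to confirm it.

```python
import os
import shutil


def du_bytes(root: str) -> int:
    """Sum allocated bytes under root, counting each inode once."""
    seen = set()
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # os.walk does not descend into symlinked directories,
        # so count those links here along with regular entries.
        links = [d for d in dirnames
                 if os.path.islink(os.path.join(dirpath, d))]
        paths = [dirpath] + [os.path.join(dirpath, n)
                             for n in filenames + links]
        for path in paths:
            try:
                st = os.lstat(path)
            except OSError:
                continue  # unreadable or vanished entry
            key = (st.st_dev, st.st_ino)
            if key in seen:
                continue  # hard link already counted
            seen.add(key)
            total += st.st_blocks * 512
    return total


def df_used_bytes(mount_point: str) -> int:
    """Used space as the filesystem itself reports it."""
    return shutil.disk_usage(mount_point).used
```

For example, `df_used_bytes("/home") - du_bytes("/home")`, run as root so every directory is readable, gives the gap in bytes.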
Thank you for all responses.
Ok, thanks for the update. If you run into any more problems, let us know!