Bug 477739 - EXt3 Raid 1 File system corruption with LVM2 Snapshot volume overflow. (Possibly)
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: lvm2
Hardware: x86_64 Linux
Priority: low
Severity: high
Assigned To: LVM and device-mapper development team
QA Contact: Cluster QE
Reported: 2008-12-23 02:29 EST by Richard Chapman
Modified: 2010-10-05 10:18 EDT
Doc Type: Bug Fix
Last Closed: 2010-10-05 10:18:27 EDT
External Trackers: CentOS 3305

Description Richard Chapman 2008-12-23 02:29:27 EST
Description of problem:
I had some kernel errors and two system crashes on a system which had run fine for more than a year. After the second crash, the filesystem was so corrupt that the system would not boot. Both the journal and the volume group descriptor were reported to be corrupt. I have rebuilt the system from backups, and it is working fine again.
I have an ext3 root file-system running on a software raid 1 array with LVM2.
At the time of the incident I was running 2.6.18-92.1.18.el5.
I had created a snapshot volume in VolGroup00. I believe the snapshot volume was physically located between LogVol00 (swap) and LogVol01 (root), because I shrank LogVol00 to make space for the snapshot.
I forgot about the snapshot and left it in place for several days, until I got the following kernel errors in my logwatch.
 --------------------- Kernel Begin ------------------------ 

 WARNING: Kernel Errors Present
    Buffer I/O error on device dm-2, ...: 20 Time(s)
 ---------------------- Kernel End ------------------------- 
--------------------- Kernel Begin ------------------------ 

 WARNING: Kernel Errors Present
    Buffer I/O error on device dm-2, ...: 5 Time(s)
    EXT3-fs error (device dm-2): e ...: 750 Time(s)
    EXT3-fs error (dffset 0 ...: 1 Time(s)
 ---------------------- Kernel End ------------------------- 
At that time I deleted the snapshot volume, and everything seemed fine. A few days later the system crashed but rebooted OK. A few days after that it crashed again, and this time the root filesystem was totally corrupt.

There is only circumstantial evidence, but my theory is that the file system suffered some minor corruption at the time of the snapshot overflow, and that the corruption was exacerbated by subsequent events.
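
For anyone wanting to catch a forgotten snapshot before it fills up, here is a minimal sketch of a fill-level check. It is not something I ran at the time. It assumes the text being parsed comes from `lvs --noheadings --separator , -o lv_name,origin,snap_percent`, and that the `snap_percent` field is available in the installed LVM2 version. The function names, the sample text, and the 80% threshold are all made up for illustration.

```python
def parse_snapshot_usage(lvs_output):
    """Map snapshot LV name -> percent full.

    Expects comma-separated lines of lv_name,origin,snap_percent
    (an assumed `lvs` invocation; see the note above).
    """
    usage = {}
    for line in lvs_output.splitlines():
        fields = [f.strip() for f in line.split(",")]
        if len(fields) != 3:
            continue  # blank or unexpected line
        name, origin, percent = fields
        if not origin or not percent:
            continue  # ordinary volume, not a snapshot
        try:
            usage[name] = float(percent)
        except ValueError:
            continue  # percent field was not numeric
    return usage


def snapshots_over(usage, threshold=80.0):
    """Return the names of snapshots at or above the threshold, sorted."""
    return sorted(name for name, pct in usage.items() if pct >= threshold)


# Hypothetical sample text, not captured from the affected system.
sample = "  LogVol00,,\n  LogVol01,,\n  snap0,LogVol01,93.50\n"
for name in snapshots_over(parse_snapshot_usage(sample)):
    print("snapshot %s is nearly full" % name)
```

Run from cron against real `lvs` output, a check like this would have flagged the snapshot days before the Buffer I/O errors appeared.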

If there is any more information that would be useful, please contact me and I'll see if I can provide it. I can be contacted at chapman dot richard at gmail dot com.

This may be related to Bug 461289, but in this case it was definitely the root file system which was corrupted, not just the snapshot. It isn't clear to me which was corrupted in Bug 461289.

Version-Release number of selected component (if applicable):

How reproducible:

I haven't attempted to reproduce the problem.

Steps to Reproduce:
Actual results:

Expected results:

Additional info:
Comment 2 Heinz Mauelshagen 2010-10-05 10:18:27 EDT
Closing because of long term dormancy. Reopen if problem still exists in current release.
