Bug 477739 - ext3 RAID 1 file system corruption with LVM2 snapshot volume overflow (possibly)
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: lvm2
Version: 5.2
Hardware: x86_64
OS: Linux
Priority: low
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: LVM and device-mapper development team
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2008-12-23 07:29 UTC by Richard Chapman
Modified: 2010-10-05 14:18 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-10-05 14:18:27 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
CentOS 3305 0 None None None Never

Description Richard Chapman 2008-12-23 07:29:27 UTC
Description of problem:
I had some kernel errors and two system crashes on a system which had run fine for more than a year. After the second crash, the filesystem was so corrupt that the system would not boot. Both the journal and the volume group descriptor were reported to be corrupt. I have rebuilt the system from backups, and it is working fine again.
I have an ext3 root file-system running on a software raid 1 array with LVM2.
At the time of the incident I was running 2.6.18-92.1.18.el5.
I had created a snapshot volume in VolGroup00. I believe the snapshot volume was physically located between LogVol00 (swap) and LogVol01 (root), because I shrank LogVol00 to make space for the snapshot.
I forgot about the snapshot and left it in place for several days, until I got the following kernel errors in my logwatch.
 --------------------- Kernel Begin ------------------------ 

 
 WARNING: Kernel Errors Present
    Buffer I/O error on device dm-2, ...: 20 Time(s)
 
 ---------------------- Kernel End ------------------------- 
--------------------- Kernel Begin ------------------------ 

 WARNING: Kernel Errors Present
    Buffer I/O error on device dm-2, ...: 5 Time(s)
    EXT3-fs error (device dm-2): e ...: 750 Time(s)
    EXT3-fs error (dffset 0 ...: 1 Time(s)
 
 ---------------------- Kernel End ------------------------- 
At that time I deleted the snapshot volume, and everything seemed fine. A few days later the system crashed, but rebooted OK. A few days after that it crashed again, and this time the root filesystem was totally corrupt.

There is only circumstantial evidence, but my theory is that the file-system got some minor corruption at the time of the snapshot overflow, and that the corruption was exacerbated by subsequent events.

If there is any more information that would be useful, please contact me and I'll see if I can provide it. I can be contacted at chapman dot richard at gmail dot com.

This may be related to Bug 461289 BUT in this case it was definitely the root file system which was corrupted - not just the snapshot. It isn't clear to me which was corrupted in Bug 461289.
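
Since the trigger here appears to have been a forgotten snapshot filling up, a small check that warns before a snapshot overflows may be useful to others reading this report. The following is only a minimal sketch, not something that was run on the affected system. It assumes that `lvs` reports a snapshot's fill level in the `snap_percent` column on the installed LVM2 release, and that output is produced in the C locale. The function names, the 80% threshold, and the sample volume names are all hypothetical.

```python
import os
import subprocess


def parse_snapshot_usage(lvs_output):
    """Parse output of:
        lvs --noheadings --separator '|' -o lv_name,origin,snap_percent

    Returns a dict mapping snapshot LV name -> percent full (float).
    Rows with an empty origin (not a snapshot) or an empty or
    non-numeric percentage are skipped.
    """
    usage = {}
    for line in lvs_output.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) != 3:
            continue
        name, origin, percent = fields
        if not origin or not percent:
            continue
        try:
            usage[name] = float(percent)
        except ValueError:
            continue
    return usage


def snapshots_over(usage, threshold=80.0):
    """Return the sorted names of snapshots at or above the threshold."""
    return sorted(name for name, pct in usage.items() if pct >= threshold)


def read_lvs(volume_group="VolGroup00"):
    """Run lvs for one volume group and return its raw text output.

    Assumption: the snap_percent field is available on this LVM2 release.
    """
    cmd = ["lvs", "--noheadings", "--separator", "|",
           "-o", "lv_name,origin,snap_percent", volume_group]
    env = dict(os.environ, LC_ALL="C")  # avoid locale-specific number formats
    result = subprocess.run(cmd, check=True, capture_output=True,
                            text=True, env=env)
    return result.stdout
```

Run periodically (for example from cron, as root), `snapshots_over(parse_snapshot_usage(read_lvs()))` would list any snapshot in VolGroup00 that is close to full, so it can be extended or removed before it overflows.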


How reproducible:

I haven't attempted to reproduce the problem.


Comment 2 Heinz Mauelshagen 2010-10-05 14:18:27 UTC
Closing because of long term dormancy. Reopen if problem still exists in current release.

