Bug 200176

Summary: mirror volumes appear to be corrupted after leg failure for a period of time
Product: Red Hat Enterprise Linux 4
Reporter: Corey Marthaler <cmarthal>
Component: lvm2
Assignee: Jonathan Earl Brassow <jbrassow>
Status: CLOSED DUPLICATE
Severity: medium
Priority: high
Version: 4.0
CC: agk, dwysocha, mbroz
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2006-12-05 18:05:28 UTC

Description Corey Marthaler 2006-07-25 21:47:53 UTC
Description of problem:
After one leg of a mirror is failed, the mirrored volume appears corrupted for
about five minutes.

Right after the failure:
[root@taft-04 ~]# lvscan
  /dev/sdd1: read failed after 0 of 2048 at 0: Input/output error
  /dev/sdd2: read failed after 0 of 2048 at 0: Input/output error
  /dev/dm-3: read failed after 0 of 4096 at 1073676288: Input/output error
  /dev/dm-3: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdd: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdd: read failed after 0 of 4096 at 1999073378304: Input/output error
  /dev/sdd: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdd1: read failed after 0 of 2048 at 1011548160: Input/output error
  /dev/sdd1: read failed after 0 of 2048 at 0: Input/output error
  /dev/sdd2: read failed after 0 of 512 at 1998060257280: Input/output error
  /dev/sdd2: read failed after 0 of 2048 at 0: Input/output error
  Couldn't find device with uuid 'Pd41er-YXpT-tP4G-dXWo-LaUx-SKXH-ZptJIm'.
  Couldn't find all physical volumes for volume group vg.
  /dev/dm-3: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdd: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdd1: read failed after 0 of 2048 at 0: Input/output error
  /dev/sdd2: read failed after 0 of 2048 at 0: Input/output error
  Couldn't find device with uuid 'Pd41er-YXpT-tP4G-dXWo-LaUx-SKXH-ZptJIm'.
  Couldn't find all physical volumes for volume group vg.
  /dev/dm-3: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdd: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdd1: read failed after 0 of 2048 at 0: Input/output error
  /dev/sdd2: read failed after 0 of 2048 at 0: Input/output error
  /dev/dm-3: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdd: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdd1: read failed after 0 of 2048 at 0: Input/output error
  /dev/sdd2: read failed after 0 of 2048 at 0: Input/output error
  Couldn't find device with uuid 'Pd41er-YXpT-tP4G-dXWo-LaUx-SKXH-ZptJIm'.
  Couldn't find all physical volumes for volume group vg.
  /dev/dm-3: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdd: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdd1: read failed after 0 of 2048 at 0: Input/output error
  /dev/sdd2: read failed after 0 of 2048 at 0: Input/output error
  Couldn't find device with uuid 'Pd41er-YXpT-tP4G-dXWo-LaUx-SKXH-ZptJIm'.
  Couldn't find all physical volumes for volume group vg.
  Volume group "vg" not found
  ACTIVE            '/dev/VolGroup00/LogVol00' [19.53 GB] inherit
  ACTIVE            '/dev/VolGroup00/LogVol01' [1.94 GB] inherit


Five or so minutes after the failure:
[root@taft-04 ~]# lvscan
  /dev/sdd: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdd1: read failed after 0 of 2048 at 0: Input/output error
  /dev/sdd2: read failed after 0 of 2048 at 0: Input/output error
  ACTIVE            '/dev/vg/mirror' [1.00 GB] inherit
  ACTIVE            '/dev/VolGroup00/LogVol00' [19.53 GB] inherit
  ACTIVE            '/dev/VolGroup00/LogVol01' [1.94 GB] inherit


[root@taft-04 ~]# dmsetup ls
vg-mirror       (253, 5)
VolGroup00-LogVol01     (253, 1)
VolGroup00-LogVol00     (253, 0)


Version-Release number of selected component (if applicable):
[root@taft-04 ~]# rpm -q lvm2
lvm2-2.02.06-6.0.RHEL4
[root@taft-04 ~]# rpm -q device-mapper
device-mapper-1.02.07-4.0.RHEL4


How reproducible:
every so often
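
The two lvscan outputs in the description differ mainly in whether the missing-PV
errors are present and whether '/dev/vg/mirror' is listed as ACTIVE. A minimal
sketch of a helper that summarizes such output is shown here, in case it helps
when comparing runs. The function name summarize_lvscan and the returned keys
are hypothetical, and the patterns are based only on the lines pasted in this
report, so other lvm2 versions may print different messages.

```python
import re
from collections import Counter

# Patterns derived from the lvscan output pasted in this report (assumption:
# other lvm2 versions may word these messages differently).
READ_FAILED = re.compile(
    r"^\s*(/dev/\S+): read failed after \d+ of \d+ at \d+: Input/output error\s*$"
)
MISSING_PV = re.compile(r"^\s*Couldn't find device with uuid '([^']+)'\.\s*$")
ACTIVE_LV = re.compile(r"^\s*ACTIVE\s+'([^']+)'")


def summarize_lvscan(text):
    """Summarize lvscan output: read failures per device, missing PVs, active LVs."""
    read_failures = Counter()
    missing_pv_uuids = set()
    active_lvs = []
    for line in text.splitlines():
        match = READ_FAILED.match(line)
        if match:
            read_failures[match.group(1)] += 1
            continue
        match = MISSING_PV.match(line)
        if match:
            missing_pv_uuids.add(match.group(1))
            continue
        match = ACTIVE_LV.match(line)
        if match:
            active_lvs.append(match.group(1))
    return {
        "read_failures": dict(read_failures),
        "missing_pv_uuids": sorted(missing_pv_uuids),
        "active_lvs": active_lvs,
    }
```

Feeding it the "right after the failure" output should report the missing PV
uuid and no '/dev/vg/mirror' entry, while the later output should list the
mirror among the active LVs.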

Comment 1 Corey Marthaler 2006-07-25 22:03:59 UTC
During another attempt at this same test case, lvm was stuck in an flock for the
whole "five minute" period. After that stuck period, the volumes appeared fine.

[root@taft-04 lvm]# lvscan
  /dev/sdd1: read failed after 0 of 2048 at 0: Input/output error
  /dev/sdd2: read failed after 0 of 2048 at 0: Input/output error
  ACTIVE            '/dev/vg/mirror' [1.00 GB] inherit
  ACTIVE            '/dev/VolGroup00/LogVol00' [19.53 GB] inherit
  ACTIVE            '/dev/VolGroup00/LogVol01' [1.94 GB] inherit


Comment 2 Corey Marthaler 2006-08-16 15:07:33 UTC
I believe that we already know this, but after the leg failure, the lvm2
mirror will stay "corrupted" until a write to it is attempted. I reproduced this
by failing the leg while no I/O was going to the mirror. The volume was
corrupt for 12 hours or so overnight until I attempted to write to it in the
morning.
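
Since the volume stays in this state until a write is attempted, one way to
force a write without changing any data might be to read the first block and
write the same bytes back. The following is only a sketch under assumptions:
the helper name rewrite_first_block is hypothetical, the 4096-byte block size is
arbitrary, and it assumes that rewriting identical data followed by fsync
reaches the device as write I/O. It has not been verified against this bug, and
writing to a block device that holds a mounted filesystem is risky.

```python
import os


def rewrite_first_block(path, block_size=4096):
    """Read the first block of `path` and write the same bytes back.

    Assumption: flushing the rewritten block with fsync is enough for the
    underlying device to see a write. Returns the number of bytes written.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        data = os.pread(fd, block_size, 0)  # read from offset 0
        if not data:
            raise ValueError("nothing to read from %s" % path)
        written = os.pwrite(fd, data, 0)    # write identical bytes back
        os.fsync(fd)                        # push the write out of the page cache
        return written
    finally:
        os.close(fd)
```

On the test machine this would be called as root with the mirror device from
this report, for example rewrite_first_block("/dev/vg/mirror"), and only while
the device is visible under that path.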

Comment 3 Jonathan Earl Brassow 2006-12-05 18:05:28 UTC
Unless you can show me otherwise, I don't think the volume is corrupted.  It is
simply incomplete.  IOW, dmeventd has not yet reconfigured the mirror.

Read failures will not trigger a reconfiguration (because reads can go to other
devices and it is possible for the drive to remap the block - S.M.A.R.T.).

Yes, while the reconfiguration is happening, LVM commands will be stuck.

This is a duplicate of bug 199724, AFAICT

*** This bug has been marked as a duplicate of 199724 ***