Bug 199749

Summary: mirror leg failure during syncing will cause corrupt volume
Product: Red Hat Enterprise Linux 4
Reporter: Corey Marthaler <cmarthal>
Component: lvm2
Assignee: Jonathan Earl Brassow <jbrassow>
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: medium
Version: 4.0
CC: agk, dwysocha, jbrassow, mbroz
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2007-01-26 19:06:54 UTC

Description Corey Marthaler 2006-07-21 18:26:14 UTC
Description of problem:
I created a mirror and while it was syncing I failed one of the legs. 

[root@taft-04 ~]# lvscan
  /dev/dm-3: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdd: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdd1: read failed after 0 of 2048 at 0: Input/output error
  /dev/sdd2: read failed after 0 of 2048 at 0: Input/output error
  /dev/dm-3: read failed after 0 of 4096 at 5368643584: Input/output error
  /dev/dm-3: read failed after 0 of 4096 at 0: Input/output error
  /dev/dm-5: read failed after 0 of 4096 at 5368643584: Input/output error
  /dev/sdd: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdd: read failed after 0 of 4096 at 1999073378304: Input/output error
  /dev/sdd: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdd1: read failed after 0 of 2048 at 1011548160: Input/output error
  /dev/sdd1: read failed after 0 of 2048 at 0: Input/output error
  /dev/sdd2: read failed after 0 of 512 at 1998060257280: Input/output error
  /dev/sdd2: read failed after 0 of 2048 at 0: Input/output error
  Couldn't find device with uuid 'V6Fx4p-jzEO-Qlxc-fCBN-f1o1-46V5-IA0jyk'.
  Couldn't find all physical volumes for volume group vg.
  /dev/dm-3: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdd: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdd1: read failed after 0 of 2048 at 0: Input/output error
  /dev/sdd2: read failed after 0 of 2048 at 0: Input/output error
  Couldn't find device with uuid 'V6Fx4p-jzEO-Qlxc-fCBN-f1o1-46V5-IA0jyk'.
  Couldn't find all physical volumes for volume group vg.
  /dev/dm-3: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdd: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdd1: read failed after 0 of 2048 at 0: Input/output error
  /dev/sdd2: read failed after 0 of 2048 at 0: Input/output error
  /dev/dm-3: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdd: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdd1: read failed after 0 of 2048 at 0: Input/output error
  /dev/sdd2: read failed after 0 of 2048 at 0: Input/output error
  Couldn't find device with uuid 'V6Fx4p-jzEO-Qlxc-fCBN-f1o1-46V5-IA0jyk'.
  Couldn't find all physical volumes for volume group vg.
  /dev/dm-3: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdd: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdd1: read failed after 0 of 2048 at 0: Input/output error
  /dev/sdd2: read failed after 0 of 2048 at 0: Input/output error
  Couldn't find device with uuid 'V6Fx4p-jzEO-Qlxc-fCBN-f1o1-46V5-IA0jyk'.
  Couldn't find all physical volumes for volume group vg.
  Volume group "vg" not found
  ACTIVE            '/dev/VolGroup00/LogVol00' [19.53 GB] inherit
  ACTIVE            '/dev/VolGroup00/LogVol01' [1.94 GB] inherit
[root@taft-04 ~]# dmsetup ls
vg-mirror_mimage_1      (253, 4)
vg-mirror_mimage_0      (253, 3)
vg-mirror       (253, 5)
VolGroup00-LogVol01     (253, 1)
VolGroup00-LogVol00     (253, 0)
vg-mirror_mlog  (253, 2)
[root@taft-04 ~]# ls /dev/vg/mirror
/dev/vg/mirror
[root@taft-04 ~]# mkfs /dev/vg/mirror
mke2fs 1.35 (28-Feb-2004)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
655360 inodes, 1310720 blocks
65536 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=1342177280
40 block groups
32768 blocks per group, 32768 fragments per group
16384 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736

Writing inode tables: done
Writing superblocks and filesystem accounting information:
Warning, had trouble writing out superblocks.done

This filesystem will be automatically checked every 28 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.
[root@taft-04 ~]# mount /dev/vg/mirror /mnt/mirror
[root@taft-04 ~]# df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                      20158332   8790188  10344144  46% /
/dev/sda1               101086     84436     11431  89% /boot
none                   4084472         0   4084472   0% /dev/shm
/dev/mapper/vg-mirror
                       5160576     10232   4888200   1% /mnt/mirror
[root@taft-04 ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                       20G  8.4G  9.9G  46% /
/dev/sda1              99M   83M   12M  89% /boot
none                  3.9G     0  3.9G   0% /dev/shm
/dev/mapper/vg-mirror
                      5.0G   10M  4.7G   1% /mnt/mirror
[root@taft-04 ~]# cd /mnt/mirror
[root@taft-04 mirror]# touch foo
touch: cannot touch `foo': Input/output error

I've reproduced this with cluster mirrors as well.

[root@taft-04 mirror]#  rpm -q device-mapper
device-mapper-1.02.07-4.0.RHEL4
[root@taft-04 mirror]#  rpm -q lvm2
lvm2-2.02.06-6.0.RHEL4
[root@taft-04 mirror]# uname -ar
Linux taft-04 2.6.9-42.ELsmp #1 SMP Wed Jul 12 23:32:02 EDT 2006 x86_64 x86_64 x86_64 GNU/Linux

Comment 1 Jonathan Earl Brassow 2006-11-27 23:29:09 UTC
Can you post the output of 'dmsetup status'?


Comment 2 Jonathan Earl Brassow 2006-12-07 21:59:37 UTC
Which device did you disable: the primary leg or the other one?


Comment 3 Corey Marthaler 2006-12-07 23:11:18 UTC
It's been a while since I tried that, but I have to believe that both cases were
attempted (primary and other), given the large number of times that I ran
that test case. I'll try to reproduce this again and get more info for you...

Comment 4 Corey Marthaler 2007-01-03 23:30:19 UTC
Back to mirror leg failure testing once again... 

I tried this today but had different results. Not quite as bad as in the
original report, but still an issue. After the primary leg failure, I was still
able to access the filesystem and the data on the mirror. The mirror then
appeared to be automatically down-converted to a linear volume. All appeared to
be fine until I powered up the failed leg. Once it was back up, lvm started
reporting the volume group as inconsistent. I then attempted to up-convert the
linear volume back to the way the mirror was originally, and that failed.

# Powered up the failed leg and then...
[root@link-08 ~]# lvscan
  Volume group "vg" inconsistent
  Inconsistent metadata copies found - updating to use version 7
  ACTIVE            '/dev/vg/mirror' [752.00 MB] inherit
[root@link-08 ~]# pvscan
  Warning: Volume Group vg is not consistent
  PV /dev/sda1   VG vg   lvm2 [135.66 GB / 134.92 GB free]
  PV /dev/sdb1   VG vg   lvm2 [135.66 GB / 135.66 GB free]
  PV /dev/sdc1   VG vg   lvm2 [135.66 GB / 135.66 GB free]
  PV /dev/sdd1   VG vg   lvm2 [135.66 GB / 135.66 GB free]
  PV /dev/sde1   VG vg   lvm2 [135.66 GB / 135.66 GB free]
  PV /dev/sdf1   VG vg   lvm2 [135.66 GB / 135.66 GB free]
  PV /dev/sdg1   VG vg   lvm2 [135.66 GB / 135.66 GB free]
  Total: 7 [949.59 GB] / in use: 7 [949.59 GB] / in no VG: 0 [0   ]
[root@link-08 ~]# vgscan
  Reading all physical volumes.  This may take a while...
  Volume group "vg" inconsistent
  Inconsistent metadata copies found - updating to use version 8
  Found volume group "vg" using metadata type lvm2
[root@link-08 ~]# lvs -a -o +devices
  Volume group "vg" inconsistent
  Inconsistent metadata copies found - updating to use version 10
  LV     VG   Attr   LSize   Origin Snap%  Move Log Copy%  Devices
  mirror vg   -wi-ao 752.00M                               /dev/sda1(0)
[root@link-08 ~]# lvconvert -m 0 /dev/vg/mirror
  Inconsistent metadata copies found - updating to use version 11
  Logical volume mirror is already not mirrored.
[root@link-08 ~]# lvconvert -m 1 /dev/vg/mirror
  Inconsistent metadata copies found - updating to use version 12
  Volume group vg metadata is inconsistent
  Volume group for uuid not found: KgTtQPpOsnujX3DSDtRDiYZ97saUuNAAnrJPJNfix9cSfeBr76wwsWbTArWWnswa
  Aborting. Failed to activate mirror log. Remove new LVs and retry.
  Failed to create mirror log.
[root@link-08 ~]# lvscan
  Volume group "vg" inconsistent
  Inconsistent metadata copies found - updating to use version 14
  ACTIVE            '/dev/vg/mirror' [752.00 MB] inherit
  inactive          '/dev/vg/mirror_mlog' [4.00 MB] inherit


# Note the appearance of the new mirror_mlog

[root@link-08 ~]# dmsetup ls
vg-mirror       (253, 5)
VolGroup00-LogVol01     (253, 1)
VolGroup00-LogVol00     (253, 0)
[root@link-08 ~]# dmsetup table
vg-mirror: 0 1540096 linear 8:1 384
VolGroup00-LogVol01: 0 4063232 linear 3:2 151912832
VolGroup00-LogVol00: 0 151912448 linear 3:2 384
[root@link-08 ~]# dmsetup ls --tree
vg-mirror (253:5)
 └─ (8:1)
VolGroup00-LogVol01 (253:1)
 └─ (3:2)
VolGroup00-LogVol00 (253:0)
 └─ (3:2)


Comment 5 Jonathan Earl Brassow 2007-01-08 23:11:37 UTC
When you bring back the failed device, two VGs with the same name are
present: an inconsistent VG named "vg" and a consistent VG named "vg". They
conflict in the namespace.

The fix is to run pvcreate and then vgextend on the device you just brought
back to life.
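
A minimal sketch of the fix above, for illustration only. The helper names
(run, readd_failed_leg), the RUN variable, and the device path /dev/sdX1 are
hypothetical and do not come from this report; substitute the device that was
actually brought back. The final lvconvert step is the up-convert attempted in
comment 4, not part of the fix itself. By default the helper only prints the
commands so they can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper, not from the bug report.
# Prints each command by default; set RUN=1 to actually execute them.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$@"
    fi
}

# Usage: readd_failed_leg <vg> <lv> <returned-device>
readd_failed_leg() {
    vg=$1
    lv=$2
    dev=$3

    # Re-initialise the returned device so its stale copy of the VG
    # metadata no longer conflicts with the surviving VG.
    # Assumption: pvcreate may refuse while the old label is still
    # present and may need -ff, which destroys whatever is on the device.
    run pvcreate "$dev"

    # Add the freshly initialised PV back into the volume group.
    run vgextend "$vg" "$dev"

    # Optional follow-up: convert the linear LV back to a two-way mirror.
    run lvconvert -m 1 "/dev/$vg/$lv"
}

# Dry run with a placeholder device name.
readd_failed_leg vg mirror /dev/sdX1
```

Running the script as-is only echoes the three commands. Once they look right
for the system at hand, rerun it with RUN=1 in the environment.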


Comment 6 Corey Marthaler 2007-01-22 19:24:35 UTC
The workaround mentioned in comment #5 has been verified to work. 

Marking verified.