Bug 1016296 - Leg and Log failure in a 2-way mirror can cause the Repair to fail
Leg and Log failure in a 2-way mirror can cause the Repair to fail
Status: CLOSED WONTFIX
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: lvm2 (Show other bugs)
6.5
x86_64 Linux
unspecified Severity urgent
: rc
: ---
Assigned To: Jonathan Earl Brassow
Cluster QE
: TestBlocker
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2013-10-07 15:59 EDT by Corey Marthaler
Modified: 2014-07-25 22:33 EDT (History)
9 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-07-25 22:33:17 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description Corey Marthaler 2013-10-07 15:59:57 EDT
Description of problem:
I've seen this twice now on two different machines running similar test cases. One involved failing the primary and log device, and the other the secondary and log device. In both cases, it appear that mirror recovery had started just a bit before the second device was failed. Also, in both of these cases, lvmetad was running. 

Scenario kill_primary_and_log_synced_2_legs: Kill primary leg and disk log of synced 2 leg mirror(s)

********* Mirror hash info for this scenario *********
* names:              syncd_primary_log_2legs_1
* sync:               1
* striped:            0
* leg devices:        /dev/sdd1 /dev/sda1
* log devices:        /dev/sdc1
* no MDA devices:     
* failpv(s):          /dev/sdd1 /dev/sdc1
* failnode(s):        virt-049.cluster-qe.lab.eng.brq.redhat.com
* lvmetad:            1
* leg fault policy:   allocate
* log fault policy:   allocate
******************************************************

Creating mirror(s) on virt-049.cluster-qe.lab.eng.brq.redhat.com...
virt-049.cluster-qe.lab.eng.brq.redhat.com: lvcreate -m 1 -n syncd_primary_log_2legs_1 -L 500M helter_skelter /dev/sdd1:0-1000 /dev/sda1:0-1000 /dev/sdc1:0-150

Current mirror/raid device structure(s):
  LV                                   Attr       LSize   Cpy%Sync Devices
  syncd_primary_log_2legs_1            mwi-a-m--- 500.00m    16.00 syncd_primary_log_2legs_1_mimage_0(0),syncd_primary_log_2legs_1_mimage_1(0)
  [syncd_primary_log_2legs_1_mimage_0] Iwi-aom--- 500.00m          /dev/sdd1(0)
  [syncd_primary_log_2legs_1_mimage_1] Iwi-aom--- 500.00m          /dev/sda1(0)
  [syncd_primary_log_2legs_1_mlog]     lwi-aom---   4.00m          /dev/sdc1(0)

Waiting until all mirror|raid volumes become fully syncd...
   1/1 mirror(s) are fully synced: ( 100.00% )

Creating ext on top of mirror(s) on virt-049.cluster-qe.lab.eng.brq.redhat.com...
mke2fs 1.41.12 (17-May-2010)
Mounting mirrored ext filesystems on virt-049.cluster-qe.lab.eng.brq.redhat.com...

PV=/dev/sdd1
       syncd_primary_log_2legs_1_mimage_0: 5.1
PV=/dev/sdc1
       syncd_primary_log_2legs_1_mlog: 1.2
PV=/dev/sdd1
       syncd_primary_log_2legs_1_mimage_0: 5.1
PV=/dev/sdc1
       syncd_primary_log_2legs_1_mlog: 1.2

Writing verification files (checkit) to mirror(s) on...
       ---- virt-049.cluster-qe.lab.eng.brq.redhat.com ----

Sleeping 15 seconds to get some outsanding EXT I/O locks before the failure 
Verifying files (checkit) on mirror(s) on...

Disabling device sdd on virt-049.cluster-qe.lab.eng.brq.redhat.com (lvmetad is on)
Disabling device sdc on virt-049.cluster-qe.lab.eng.brq.redhat.com (lvmetad is on)

Getting recovery check start time from /var/log/messages: Oct  5 04:07
Attempting I/O to cause mirror down conversion(s) on virt-049.cluster-qe.lab.eng.brq.redhat.com
10+0 records in
10+0 records out
41943040 bytes (42 MB) copied, 0.268018 s, 156 MB/s
Verifying current sanity of lvm after the failure

Current mirror/raid device structure(s):
  PV zzQtQB-zAWG-INcc-CcN6-4maz-6DsP-oHopY0 not recognised. Is the device missing?
  [...]
  PV DSvz2q-dvac-zxq5-KTQ5-BaKi-cLdo-Nc0lq1 not recognised. Is the device missing?
  LV                             Attr       LSize   Cpy%Sync Devices        
   syncd_primary_log_2legs_1      -wi-ao---- 500.00m          /dev/sda1(0)   
   syncd_primary_log_2legs_1_mlog -wi-------   4.00m          /dev/sdb1(0)   


Verifying FAILED device /dev/sdd1 is *NOT* in the volume(s)
Verifying FAILED device /dev/sdc1 is *NOT* in the volume(s)
Verifying LOG device(s) /dev/sdc1 are *NOT* in the mirror/linear(s)
Verifying LEG device /dev/sda1 *IS* in the volume(s)
verify the newly allocated dm devices were added as a result of the failures
Checking EXISTENCE of syncd_primary_log_2legs_1_mimage_0 on:  virt-049.cluster-qe.lab.eng.brq.redhat.com
syncd_primary_log_2legs_1_mimage_0 on virt-049.cluster-qe.lab.eng.brq.redhat.com should now exist
** ARE THERE ENOUGH SPARE PVS TO ALLOCATE? **
FI_engine: recover() method failed

[root@virt-049 ~]# lvs -a -o +devices
  PV zzQtQB-zAWG-INcc-CcN6-4maz-6DsP-oHopY0 not recognised. Is the device missing?
  PV DSvz2q-dvac-zxq5-KTQ5-BaKi-cLdo-Nc0lq1 not recognised. Is the device missing?
  PV zzQtQB-zAWG-INcc-CcN6-4maz-6DsP-oHopY0 not recognised. Is the device missing?
  PV DSvz2q-dvac-zxq5-KTQ5-BaKi-cLdo-Nc0lq1 not recognised. Is the device missing?
  LV                             VG             Attr       LSize   Devices
  syncd_primary_log_2legs_1      helter_skelter -wi-ao---- 500.00m /dev/sda1(0)
  syncd_primary_log_2legs_1_mlog helter_skelter -wi-------   4.00m /dev/sdb1(0)

LOG:
qarshd[16771]: Running cmdline: echo 1 > /sys/block/sdd/device/delete &
lvm[3522]: Primary mirror device 253:3 has failed (D).
lvm[3522]: Device failure in helter_skelter-syncd_primary_log_2legs_1.
lvm[3522]: PV zzQtQB-zAWG-INcc-CcN6-4maz-6DsP-oHopY0 not recognised. Is the device missing?
lvm[3522]: Mirror status: 1 of 2 images failed.
lvm[3522]: Trying to up-convert to 2 images, 1 logs.
qarshd[16798]: Running cmdline: echo 1 > /sys/block/sdc/device/delete &
lvm[3522]: /dev/sdc1: read failed after 0 of 1024 at 4096: Input/output error
lvm[3522]: Trying to up-convert to 2 images, 0 logs.
lvm[3522]: intermediate VG write failed.
lvm[3522]: WARNING: Failed to replace 1 of 1 logs in volume syncd_primary_log_2legs_1
lvm[3522]: PV zzQtQB-zAWG-INcc-CcN6-4maz-6DsP-oHopY0 not recognised. Is the device missing?
lvm[3522]: ABORTING: Mirror percentage check failed.
lvm[3522]: Repair of mirrored device helter_skelter-syncd_primary_log_2legs_1 failed.
lvm[3522]: Failed to remove faulty devices in helter_skelter-syncd_primary_log_2legs_1.
lvm[3522]: No longer monitoring mirror device helter_skelter-syncd_primary_log_2legs_1 for events.


Version-Release number of selected component (if applicable):
2.6.32-420.el6.x86_64
lvm2-2.02.100-4.el6    BUILT: Fri Sep 27 16:05:32 CEST 2013
lvm2-libs-2.02.100-4.el6    BUILT: Fri Sep 27 16:05:32 CEST 2013
lvm2-cluster-2.02.100-4.el6    BUILT: Fri Sep 27 16:05:32 CEST 2013
udev-147-2.48.el6    BUILT: Fri Aug  9 13:09:50 CEST 2013
device-mapper-1.02.79-4.el6    BUILT: Fri Sep 27 16:05:32 CEST 2013
device-mapper-libs-1.02.79-4.el6    BUILT: Fri Sep 27 16:05:32 CEST 2013
device-mapper-event-1.02.79-4.el6    BUILT: Fri Sep 27 16:05:32 CEST 2013
device-mapper-event-libs-1.02.79-4.el6    BUILT: Fri Sep 27 16:05:32 CEST 2013
cmirror-2.02.100-4.el6    BUILT: Fri Sep 27 16:05:32 CEST 2013
Comment 2 Corey Marthaler 2013-10-07 16:09:03 EDT
Here's the second case of what appears to be the exact same issue:

Scenario kill_secondary_and_log_synced_2_legs: Kill secondary leg and disk log of synced 2 leg mirror(s)

********* Mirror hash info for this scenario *********
* names:              syncd_secondary_log_2legs_1
* sync:               1
* striped:            0
* leg devices:        /dev/sdc1 /dev/sdh1
* log devices:        /dev/sdd1
* no MDA devices:     
* failpv(s):          /dev/sdh1 /dev/sdd1
* failnode(s):        virt-024.cluster-qe.lab.eng.brq.redhat.com
* lvmetad:            1
* leg fault policy:   allocate
* log fault policy:   allocate
******************************************************

Creating mirror(s) on virt-024.cluster-qe.lab.eng.brq.redhat.com...
virt-024.cluster-qe.lab.eng.brq.redhat.com: lvcreate -m 1 -n syncd_secondary_log_2legs_1 -L 500M helter_skelter /dev/sdc1:0-1000 /dev/sdh1:0-1000 /dev/sdd1:0-150

Current mirror/raid device structure(s):
  LV                                     Attr       LSize   Cpy%Sync Devices
  syncd_secondary_log_2legs_1            mwi-a-m--- 500.00m    15.20 syncd_secondary_log_2legs_1_mimage_0(0),syncd_secondary_log_2legs_1_mimage_1(0)
  [syncd_secondary_log_2legs_1_mimage_0] Iwi-aom--- 500.00m          /dev/sdc1(0)
  [syncd_secondary_log_2legs_1_mimage_1] Iwi-aom--- 500.00m          /dev/sdh1(0)
  [syncd_secondary_log_2legs_1_mlog]     lwi-aom---   4.00m          /dev/sdd1(0)

Waiting until all mirror|raid volumes become fully syncd...
   1/1 mirror(s) are fully synced: ( 100.00% )

Creating ext on top of mirror(s) on virt-024.cluster-qe.lab.eng.brq.redhat.com...
mke2fs 1.41.12 (17-May-2010)
Mounting mirrored ext filesystems on virt-024.cluster-qe.lab.eng.brq.redhat.com...

PV=/dev/sdh1
       syncd_secondary_log_2legs_1_mimage_1: 5.1
PV=/dev/sdd1
       syncd_secondary_log_2legs_1_mlog: 1.2
PV=/dev/sdh1
       syncd_secondary_log_2legs_1_mimage_1: 5.1
PV=/dev/sdd1
       syncd_secondary_log_2legs_1_mlog: 1.2

Writing verification files (checkit) to mirror(s) on...
       ---- virt-024.cluster-qe.lab.eng.brq.redhat.com ----

Sleeping 15 seconds to get some outsanding EXT I/O locks before the failure 
Verifying files (checkit) on mirror(s) on...
       ---- virt-024.cluster-qe.lab.eng.brq.redhat.com ----

Disabling device sdh on virt-024.cluster-qe.lab.eng.brq.redhat.com (lvmetad is on)
Disabling device sdd on virt-024.cluster-qe.lab.eng.brq.redhat.com (lvmetad is on)

Getting recovery check start time from /var/log/messages: Oct  5 04:03
Attempting I/O to cause mirror down conversion(s) on virt-024.cluster-qe.lab.eng.brq.redhat.com
10+0 records in
10+0 records out
41943040 bytes (42 MB) copied, 0.304512 s, 138 MB/s
Verifying current sanity of lvm after the failure

Current mirror/raid device structure(s):
  PV BKrdgC-KTeE-bfdt-HX5l-6V6t-H4Ey-Mo6Ny3 not recognised. Is the device missing?
  [...]
  PV iUqSL5-5i7L-tJFh-fI2G-olZt-7Uds-IYOKa4 not recognised. Is the device missing?
  LV                               Attr       LSize   Cpy%Sync Devices
  syncd_secondary_log_2legs_1      -wi-ao---- 500.00m          /dev/sdc1(0)
  syncd_secondary_log_2legs_1_mlog -wi-------   4.00m          /dev/sdb1(0)

Verifying FAILED device /dev/sdh1 is *NOT* in the volume(s)
Verifying FAILED device /dev/sdd1 is *NOT* in the volume(s)
Verifying LOG device(s) /dev/sdd1 are *NOT* in the mirror/linear(s)
Verifying LEG device /dev/sdc1 *IS* in the volume(s)
verify the newly allocated dm devices were added as a result of the failures
Checking EXISTENCE of syncd_secondary_log_2legs_1_mimage_0 on:  virt-024.cluster-qe.lab.eng.brq.redhat.com
syncd_secondary_log_2legs_1_mimage_0 on virt-024.cluster-qe.lab.eng.brq.redhat.com should now exist
** ARE THERE ENOUGH SPARE PVS TO ALLOCATE? **
FI_engine: recover() method failed

[root@virt-024 ~]# lvs -a -o +devices
  PV BKrdgC-KTeE-bfdt-HX5l-6V6t-H4Ey-Mo6Ny3 not recognised. Is the device missing?
  PV iUqSL5-5i7L-tJFh-fI2G-olZt-7Uds-IYOKa4 not recognised. Is the device missing?
  PV BKrdgC-KTeE-bfdt-HX5l-6V6t-H4Ey-Mo6Ny3 not recognised. Is the device missing?
  PV iUqSL5-5i7L-tJFh-fI2G-olZt-7Uds-IYOKa4 not recognised. Is the device missing?
  LV                               VG             Attr       LSize   Devices
  syncd_secondary_log_2legs_1      helter_skelter -wi-ao---- 500.00m /dev/sdc1(0)
  syncd_secondary_log_2legs_1_mlog helter_skelter -wi-------   4.00m /dev/sdb1(0)

LOG:
qarshd[18480]: Running cmdline: echo 1 > /sys/block/sdh/device/delete &
lvm[3710]: Secondary mirror device 253:4 has failed (D).
lvm[3710]: Device failure in helter_skelter-syncd_secondary_log_2legs_1.
qarshd[18488]: Running cmdline: echo 1 > /sys/block/sdd/device/delete &
lvm[3710]: PV iUqSL5-5i7L-tJFh-fI2G-olZt-7Uds-IYOKa4 not recognised. Is the device missing?
lvm[3710]: Mirror status: 1 of 2 images failed.
lvm[3710]: Trying to up-convert to 2 images, 1 logs.
lvm[3710]: /dev/sdd1: read failed after 0 of 1024 at 4096: Input/output error
lvm[3710]: Trying to up-convert to 2 images, 0 logs.
lvm[3710]: intermediate VG write failed.
lvm[3710]: WARNING: Failed to replace 1 of 1 logs in volume syncd_secondary_log_2legs_1
lvm[3710]: PV BKrdgC-KTeE-bfdt-HX5l-6V6t-H4Ey-Mo6Ny3 not recognised. Is the device missing?
lvm[3710]: PV iUqSL5-5i7L-tJFh-fI2G-olZt-7Uds-IYOKa4 not recognised. Is the device missing?
lvm[3710]: ABORTING: Mirror percentage check failed.
lvm[3710]: Repair of mirrored device helter_skelter-syncd_secondary_log_2legs_1 failed.
lvm[3710]: Failed to remove faulty devices in helter_skelter-syncd_secondary_log_2legs_1.
lvm[3710]: No longer monitoring mirror device helter_skelter-syncd_secondary_log_2legs_1 for events.
Comment 4 Jonathan Earl Brassow 2013-10-15 11:42:33 EDT
Intermediate operations are failing:
  lvm[3522]: intermediate VG write failed.

This is going to cause things to fail.  In the cases above, it causes the repair procedure to fail part way through.  It is failing after the mirror has been down converted to linear and before it can be up-converted and the old log removed.

Couple things about this.
1) Users generally don't have 'mirror_image_fault_policy' set to "allocate" - this is not the default.  The repair happens in two stages - remove the bad devices and then allocate new devices according to user-set policy.  It is the "allocate" portion of this that is failing.

2) There is no corruption.  Clean-up is required because the command failed half-way through, but that isn't preventing things from continuing to operation.  The user must - at their convenience - remove the now-visible log device and attempt a fresh up-convert.

3) Failing intermediate operations cannot be completely handled.  The devices are failing in stages.  One device fails during the repair operation for the first failure.  It would not be all that dissimilar to kill the power to the machine during a repair operation.  Upon booting up, the user would find a similar situation with a half-completed repair.

I'm not sure to what extent lvmetad is responsible here.  But here are the steps to test:
1) disable dmeventd (making it easier to perform steps by hand)
2) kill one of the mirror devices
3) do some I/O to trigger kernel to recognise the failure (this will block I/O)
4) use gdb to run 'lvconvert --repair' and stop before adding new mirror images
5) kill second device (i.e. the log device)
6) proceed to perform vg_write during final "allocate" stage
We can then determine if this is a special problem with lvmetad.  (My guess is that although this has only showed up with lvmetad, it is not isolated to it.)
Comment 6 Jonathan Earl Brassow 2013-10-15 11:47:01 EDT
FWIW, I can run 20 clean iterations on my machine just fine.  This is likely because of timing issues.  The second device must fail during the repair of the first.  I'll try this by hand using GDB in the way described in comment 4.

Note You need to log in before you can comment on or make changes to this bug.