Bug 1060294

Summary: raid10 multiple device failure scenario causes vgreduce to deadlock
Product: Red Hat Enterprise Linux 7 Reporter: Corey Marthaler <cmarthal>
Component: lvm2Assignee: Heinz Mauelshagen <heinzm>
lvm2 sub component: Mirroring and RAID QA Contact: cluster-qe <cluster-qe>
Status: CLOSED WORKSFORME Docs Contact:
Severity: high    
Priority: medium CC: agk, cmarthal, heinzm, jbrassow, msnitzer, prajnoha, prockai, zkabelac
Version: 7.0Keywords: Triaged
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-05-22 20:48:12 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1075262, 1205796    
Attachments:
Description Flags
kernel stack none

Description Corey Marthaler 2014-01-31 17:34:27 UTC
Description of problem:
This may be related to bug 1031204.

./black_bird -f -e kill_two_synced_raid10_3legs

================================================================================
Iteration 0.33 started at Thu Jan 30 18:48:58 CST 2014
================================================================================
Scenario kill_two_synced_raid10_3legs: Kill two legs (none of which share the same stripe leg) of synced 3 leg raid10 volume(s)

********* RAID hash info for this scenario *********
* names:              synced_two_raid10_3legs_1
* sync:               1
* type:               raid10
* -m |-i value:       3
* leg devices:        /dev/sdc1 /dev/sdb1 /dev/sda1 /dev/sdd1 /dev/sde1 /dev/sdh1
* failpv(s):          /dev/sdc1 /dev/sda1
* failnode(s):        host-002.virt.lab.msp.redhat.com
* additional snap:    /dev/sdb1
* lvmetad:             1
* raid fault policy:   warn
******************************************************

Creating raids(s) on host-002.virt.lab.msp.redhat.com...
host-002.virt.lab.msp.redhat.com: lvcreate --type raid10 -i 3 -n synced_two_raid10_3legs_1 -L 500M black_bird /dev/sdc1:0-2000 /dev/sdb1:0-2000 /dev/sda1:0-2000 /dev/sdd1:0-2000 /dev/sde1:0-2000 /dev/sdh1:0-2000
WARNING: ext3 signature detected on /dev/black_bird/synced_two_raid10_3legs_1 at offset 1080. Wipe it? [y/n] 
  1 existing signature left on the device.

Current mirror/raid device structure(s):
  LV                                   Attr       LSize   Cpy%Sync Devices
   synced_two_raid10_3legs_1            rwi-aor--- 504.00m     5.56 synced_two_raid10_3legs_1_rimage_0(0),synced_two_raid10_3legs_1_rimage_1(0),synced_two_raid10_3legs_1_rimage_2(0),synced_two_raid10_3legs_1_rimage_3(0),synced_two_raid10_3legs_1_rimage_4(0),synced_two_raid10_3legs_1_rimage_5(0)
   [synced_two_raid10_3legs_1_rimage_0] Iwi-aor--- 168.00m          /dev/sdc1(1)
   [synced_two_raid10_3legs_1_rimage_1] Iwi-aor--- 168.00m          /dev/sdb1(1)
   [synced_two_raid10_3legs_1_rimage_2] Iwi-aor--- 168.00m          /dev/sda1(1)
   [synced_two_raid10_3legs_1_rimage_3] Iwi-aor--- 168.00m          /dev/sdd1(1)
   [synced_two_raid10_3legs_1_rimage_4] Iwi-aor--- 168.00m          /dev/sde1(1)
   [synced_two_raid10_3legs_1_rimage_5] Iwi-aor--- 168.00m          /dev/sdh1(1)
   [synced_two_raid10_3legs_1_rmeta_0]  ewi-aor---   4.00m          /dev/sdc1(0)
   [synced_two_raid10_3legs_1_rmeta_1]  ewi-aor---   4.00m          /dev/sdb1(0)
   [synced_two_raid10_3legs_1_rmeta_2]  ewi-aor---   4.00m          /dev/sda1(0)
   [synced_two_raid10_3legs_1_rmeta_3]  ewi-aor---   4.00m          /dev/sdd1(0)
   [synced_two_raid10_3legs_1_rmeta_4]  ewi-aor---   4.00m          /dev/sde1(0)
   [synced_two_raid10_3legs_1_rmeta_5]  ewi-aor---   4.00m          /dev/sdh1(0)

* NOTE: not enough available devices for allocation fault polices to fully work *
(well technically, since we have 1, some allocation should work)

Waiting until all mirror|raid volumes become fully syncd...
   0/1 mirror(s) are fully synced: ( 56.36% )
   1/1 mirror(s) are fully synced: ( 100.00% )

Creating ext on top of mirror(s) on host-002.virt.lab.msp.redhat.com...
mke2fs 1.42.9 (28-Dec-2013)
Mounting mirrored ext filesystems on host-002.virt.lab.msp.redhat.com...

PV=/dev/sda1
        synced_two_raid10_3legs_1_rimage_2: 2
        synced_two_raid10_3legs_1_rmeta_2: 2
PV=/dev/sdc1
        synced_two_raid10_3legs_1_rimage_0: 2
        synced_two_raid10_3legs_1_rmeta_0: 2

Creating a snapshot volume of each of the raids

Writing verification files (checkit) to mirror(s) on...
        ---- host-002.virt.lab.msp.redhat.com ----

Sleeping 15 seconds to get some outsanding EXT I/O locks before the failure 
Verifying files (checkit) on mirror(s) on...
        ---- host-002.virt.lab.msp.redhat.com ----

Disabling device sdc on host-002.virt.lab.msp.redhat.com (lvmetad is on)
Disabling device sda on host-002.virt.lab.msp.redhat.com (lvmetad is on)

Getting recovery check start time from /var/log/messages: Jan 30 18:49
Attempting I/O to cause mirror down conversion(s) on host-002.virt.lab.msp.redhat.com
10+0 records in
10+0 records out
41943040 bytes (42 MB) copied, 0.354805 s, 118 MB/s

Verifying current sanity of lvm after the failure

Current mirror/raid device structure(s):
  PV OuO3aj-5uki-4uH3-H03t-uoOc-rE5R-BfJj4f not recognised. Is the device missing?
  [...]
  PV UgfkxC-hZBz-mbks-GYO1-V3h1-he4F-ixdwRf not recognised. Is the device missing?
  LV                                   Attr       LSize   Cpy%Sync Devices
   bb_snap1                             swi-a-s--- 252.00m          /dev/sdb1(43)
   synced_two_raid10_3legs_1            owi-aor-p- 504.00m   100.00 synced_two_raid10_3legs_1_rimage_0(0),synced_two_raid10_3legs_1_rimage_1(0),synced_two_raid10_3legs_1_rimage_2(0),synced_two_raid10_3legs_1_rimage_3(0),synced_two_raid10_3legs_1_rimage_4(0),synced_two_raid10_3legs_1_rimage_5(0)
   [synced_two_raid10_3legs_1_rimage_0] iwi-aor-p- 168.00m          unknown device(1)
   [synced_two_raid10_3legs_1_rimage_1] iwi-aor--- 168.00m          /dev/sdb1(1)
   [synced_two_raid10_3legs_1_rimage_2] iwi-aor-p- 168.00m          unknown device(1)
   [synced_two_raid10_3legs_1_rimage_3] iwi-aor--- 168.00m          /dev/sdd1(1)
   [synced_two_raid10_3legs_1_rimage_4] iwi-aor--- 168.00m          /dev/sde1(1)
   [synced_two_raid10_3legs_1_rimage_5] iwi-aor--- 168.00m          /dev/sdh1(1)
   [synced_two_raid10_3legs_1_rmeta_0]  ewi-aor-p-   4.00m          unknown device(0)
   [synced_two_raid10_3legs_1_rmeta_1]  ewi-aor---   4.00m          /dev/sdb1(0)
   [synced_two_raid10_3legs_1_rmeta_2]  ewi-aor-p-   4.00m          unknown device(0)
   [synced_two_raid10_3legs_1_rmeta_3]  ewi-aor---   4.00m          /dev/sdd1(0)
   [synced_two_raid10_3legs_1_rmeta_4]  ewi-aor---   4.00m          /dev/sde1(0)
   [synced_two_raid10_3legs_1_rmeta_5]  ewi-aor---   4.00m          /dev/sdh1(0)

Verifying FAILED device /dev/sdc1 is *NOT* in the volume(s)
Verifying FAILED device /dev/sda1 is *NOT* in the volume(s)
Verifying IMAGE device /dev/sdb1 *IS* in the volume(s)
Verifying IMAGE device /dev/sdd1 *IS* in the volume(s)
Verifying IMAGE device /dev/sde1 *IS* in the volume(s)
Verifying IMAGE device /dev/sdh1 *IS* in the volume(s)
verify the rimage/rmeta dm devices remain after the failures

Checking EXISTENCE and STATE of synced_two_raid10_3legs_1_rimage_2 on: host-002.virt.lab.msp.redhat.com 
Checking EXISTENCE and STATE of synced_two_raid10_3legs_1_rmeta_2 on: host-002.virt.lab.msp.redhat.com 
Checking EXISTENCE and STATE of synced_two_raid10_3legs_1_rimage_0 on: host-002.virt.lab.msp.redhat.com 
Checking EXISTENCE and STATE of synced_two_raid10_3legs_1_rmeta_0 on: host-002.virt.lab.msp.redhat.com 

Verify the raid image order is what's expected based on raid fault policy
EXPECTED LEG ORDER: unknown /dev/sdb1 unknown /dev/sdd1 /dev/sde1 /dev/sdh1
ACTUAL LEG ORDER: unknown /dev/sdb1 unknown /dev/sdd1 /dev/sde1 /dev/sdh1

Fault policy is warn, manually repairing failed raid volumes
host-002.virt.lab.msp.redhat.com: 'lvconvert --yes --repair black_bird/synced_two_raid10_3legs_1'
  PV OuO3aj-5uki-4uH3-H03t-uoOc-rE5R-BfJj4f not recognised. Is the device missing?
  PV UgfkxC-hZBz-mbks-GYO1-V3h1-he4F-ixdwRf not recognised. Is the device missing?
  Insufficient suitable allocatable extents for logical volume : 254 more required
  Failed to allocate replacement images for black_bird/synced_two_raid10_3legs_1
  Attempting replacement of 1 devices instead of 2

Waiting until all mirror|raid volumes become fully syncd...
   0/1 mirror(s) are fully synced: ( 20.96% )
   1/1 mirror(s) are fully synced: ( 100.00% )

Verifying files (checkit) on mirror(s) on...
        ---- host-002.virt.lab.msp.redhat.com ----

Enabling device sdc on host-002.virt.lab.msp.redhat.com (lvmetad is on)
Enabling device sda on host-002.virt.lab.msp.redhat.com (lvmetad is on)

-------------------------------------------------------------------------------
Force a vgreduce to clean up the corrupt additional LV
-------------------------------------------------------------------------------
  PV OuO3aj-5uki-4uH3-H03t-uoOc-rE5R-BfJj4f not recognised. Is the device missing?
  PV UgfkxC-hZBz-mbks-GYO1-V3h1-he4F-ixdwRf not recognised. Is the device missing?
  PV OuO3aj-5uki-4uH3-H03t-uoOc-rE5R-BfJj4f not recognised. Is the device missing?
  PV UgfkxC-hZBz-mbks-GYO1-V3h1-he4F-ixdwRf not recognised. Is the device missing?
  Internal error: Performing unsafe table load while 18 device(s) are known to be suspended:  (253:16) 
  
  [DEADLOCK]


# From syslog:
qarshd[25797]: Running cmdline: echo "- - -" > /sys/class/scsi_host/host2/scan
kernel: [ 7427.071965] scsi 2:0:0:1: Direct-Access     IET      VIRTUAL-DISK     0001 PQ: 0 ANSI: 5
kernel: [ 7427.076708] sd 2:0:0:1: Attached scsi generic sg8 type 0
kernel: [ 7427.077472] sd 2:0:0:1: [sda] 20971520 512-byte logical blocks: (10.7 GB/10.0 GiB)
kernel: [ 7427.081328] sd 2:0:0:1: [sda] Write Protect is off
kernel: [ 7427.082969] sd 2:0:0:1: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
kernel: [ 7427.091721]  sda: sda1
kernel: [ 7427.098108] sd 2:0:0:1: [sda] Attached SCSI disk
qarshd[25806]: Running cmdline: echo "- - -" > /sys/class/scsi_host/host7/scan
kernel: [ 7427.206696] scsi 7:0:0:1: Direct-Access     IET      VIRTUAL-DISK     0001 PQ: 0 ANSI: 5
kernel: [ 7427.210626] sd 7:0:0:1: [sdc] 20971520 512-byte logical blocks: (10.7 GB/10.0 GiB)
kernel: [ 7427.212202] sd 7:0:0:1: Attached scsi generic sg10 type 0
kernel: [ 7427.217196] sd 7:0:0:1: [sdc] Write Protect is off
kernel: [ 7427.220121] sd 7:0:0:1: [sdc] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
kernel: [ 7427.231385]  sdc: sdc1
kernel: [ 7427.239565] sd 7:0:0:1: [sdc] Attached SCSI disk
qarshd[25810]: Running cmdline: vgreduce --removemissing --force black_bird
systemd: Starting LVM2 PV scan on device 8:33...
systemd: Starting LVM2 PV scan on device 8:1...
pvscan: WARNING: Inconsistent metadata found for VG black_bird
systemd: Started LVM2 PV scan on device 8:33.
kernel: [ 7427.545488] device-mapper: raid: Failed to read superblock of device at position 0
pvscan: WARNING: Inconsistent metadata found for VG black_bird
pvscan: Refusing refresh of partial LV synced_two_raid10_3legs_1. Use --partial to override.
lvm[3300]: No longer monitoring snapshot black_bird-bb_snap1
lvm[3300]: Monitoring snapshot black_bird-bb_snap1
kernel: [ 7427.671092] md/raid10:mdX: active with 5 out of 6 devices
kernel: [ 7427.673447] created bitmap (1 pages) for device mdX
lvm[3300]: No longer monitoring snapshot black_bird-bb_snap1
lvm[3300]: No longer monitoring RAID device black_bird-synced_two_raid10_3legs_1-real for events.
pvscan: Refusing refresh of partial LV synced_two_raid10_3legs_1. Use --partial to override.
kernel: [ 7427.753724] mdX: bitmap initialized from disk: read 1 pages, set 112 of 1008 bits
lvm[3300]: Monitoring snapshot black_bird-bb_snap1
pvscan: Refusing refresh of partial LV synced_two_raid10_3legs_1. Use --partial to override.
pvscan: Attempted to decrement suspended device counter below zero.
pvscan: Refusing refresh of partial LV synced_two_raid10_3legs_1. Use --partial to override.
lvm[3300]: No longer monitoring snapshot black_bird-bb_snap1
systemd-udevd: worker [25799] /devices/virtual/block/dm-13 timeout; kill it
systemd-udevd: seq 8955 '/devices/virtual/block/dm-13' killed
systemd-udevd: worker [25800] /devices/virtual/block/dm-15 timeout; kill it
systemd-udevd: seq 8967 '/devices/virtual/block/dm-15' killed


Version-Release number of selected component (if applicable):
3.10.0-80.el7.x86_64

lvm2-2.02.105-2.el7    BUILT: Sun Jan 26 09:00:43 CST 2014
lvm2-libs-2.02.105-2.el7    BUILT: Sun Jan 26 09:00:43 CST 2014
lvm2-cluster-2.02.105-2.el7    BUILT: Sun Jan 26 09:00:43 CST 2014
device-mapper-1.02.84-2.el7    BUILT: Sun Jan 26 09:00:43 CST 2014
device-mapper-libs-1.02.84-2.el7    BUILT: Sun Jan 26 09:00:43 CST 2014
device-mapper-event-1.02.84-2.el7    BUILT: Sun Jan 26 09:00:43 CST 2014
device-mapper-event-libs-1.02.84-2.el7    BUILT: Sun Jan 26 09:00:43 CST 2014
device-mapper-persistent-data-0.2.8-3.el7    BUILT: Fri Dec 27 13:40:56 CST 2013
cmirror-2.02.105-2.el7    BUILT: Sun Jan 26 09:00:43 CST 2014

Comment 1 Corey Marthaler 2014-01-31 17:36:33 UTC
Created attachment 857936 [details]
kernel stack

Comment 6 Jonathan Earl Brassow 2015-04-10 22:16:33 UTC
corey, is it possible to reproduce this?

I tried some time back (March 2014) but was never able to reproduce it.

I will close if you are unable.

Comment 7 Corey Marthaler 2015-05-22 20:48:12 UTC
I don't remember seeing this in the testing of the latest 7.1 rpms. Let's close this and we can reopen it if this is reproduced in 7.2.