Description of problem:
When a single PV volume has its device fail, a dm entry still exists for the LV on it:

[root@taft-01 ~]# lvcreate -L 100M -n test TEST
  Logical volume "test" created

[root@taft-01 ~]# lvs -a -o +devices
  LV    VG    Attr       LSize    Devices
  test  TEST  -wi-a----- 100.00m  /dev/sdb1(0)

[root@taft-01 ~]# echo offline > /sys/block/sdb/device/state

[root@taft-01 ~]# lvs -a -o +devices
  /dev/TEST/test: read failed after 0 of 4096 at 104792064: Input/output error
  [...]
  /dev/sdb1: read failed after 0 of 512 at 4096: Input/output error
  Couldn't find device with uuid Q5XL0S-9LPI-ctEi-FD5h-R31H-TX1T-AsnB7e.
  LV    VG    Attr       LSize    Devices
  test  TEST  -wi-a---p- 100.00m  unknown device(0)

[root@taft-01 ~]# dmsetup ls
TEST-test       (253:3)

However, when the lvol0_pmspare volume has its device fail, the dm entry is gone. Is this expected behavior? Should my test not care here and move on?

Current mirror/raid device structure(s):
  LV                                           Attr       LSize   Cpy%Sync Devices
  [lvol0_pmspare]                              ewi------- 200.00m          /dev/sdc1(126)
  synced_random_raid1_3legs_1                  twi-a-tz-- 500.00m          synced_random_raid1_3legs_1_tdata(0)
  [synced_random_raid1_3legs_1_tdata]          rwi-aor--- 500.00m 100.00   synced_random_raid1_3legs_1_tdata_rimage_0(0),synced_random_raid1_3legs_1_tdata_rimage_1(0),synced_random_raid1_3legs_1_tdata_rimage_2(0),synced_random_raid1_3legs_1_tdata_rimage_3(0)
  [synced_random_raid1_3legs_1_tdata_rimage_0] iwi-aor--- 500.00m          /dev/sde1(1)
  [synced_random_raid1_3legs_1_tdata_rimage_1] iwi-aor--- 500.00m          /dev/sdb1(1)
  [synced_random_raid1_3legs_1_tdata_rimage_2] iwi-aor--- 500.00m          /dev/sdc1(1)
  [synced_random_raid1_3legs_1_tdata_rimage_3] iwi-aor--- 500.00m          /dev/sdd1(1)
  [synced_random_raid1_3legs_1_tdata_rmeta_0]  ewi-aor--- 4.00m            /dev/sde1(0)
  [synced_random_raid1_3legs_1_tdata_rmeta_1]  ewi-aor--- 4.00m            /dev/sdb1(0)
  [synced_random_raid1_3legs_1_tdata_rmeta_2]  ewi-aor--- 4.00m            /dev/sdc1(0)
  [synced_random_raid1_3legs_1_tdata_rmeta_3]  ewi-aor--- 4.00m            /dev/sdd1(0)
  [synced_random_raid1_3legs_1_tmeta]          ewi-ao---- 200.00m          /dev/sde1(126)
  virt_synced_random_raid1_3legs_1             Vwi-aotz-- 200.00m

Disabling device sdc on taft-01
Getting recovery check start time from /var/log/messages: Nov 13 16:26
Attempting I/O to cause mirror down conversion(s) on taft-01
10+0 records in
10+0 records out
41943040 bytes (42 MB) copied, 0.62023 s, 67.6 MB/s

Verifying current sanity of lvm after the failure
Current mirror/raid device structure(s):
  /dev/sdc1: read failed after 0 of 2048 at 0: Input/output error
  [...]
  /dev/sdc1: read failed after 0 of 512 at 4096: Input/output error
  Couldn't find device with uuid 3O6Ulm-NPhO-eDGX-A8in-68rS-3AKx-1Y7gFd.
  LV                                           Attr       LSize   Cpy%Sync Devices
  [lvol0_pmspare]                              ewi-----p- 200.00m          unknown device(126)
  snap1_synced_random_raid1_3legs_1            Vwi---tzpk 200.00m
  snap2_synced_random_raid1_3legs_1            Vwi---tzpk 200.00m
  snap3_synced_random_raid1_3legs_1            Vwi---tzpk 200.00m
  synced_random_raid1_3legs_1                  twi-a-tzp- 500.00m          synced_random_raid1_3legs_1_tdata(0)
  [synced_random_raid1_3legs_1_tdata]          rwi-aor-p- 500.00m 100.00   synced_random_raid1_3legs_1_tdata_rimage_0(0),synced_random_raid1_3legs_1_tdata_rimage_1(0),synced_random_raid1_3legs_1_tdata_rimage_2(0),synced_random_raid1_3legs_1_tdata_rimage_3(0)
  [synced_random_raid1_3legs_1_tdata_rimage_0] iwi-aor--- 500.00m          /dev/sde1(1)
  [synced_random_raid1_3legs_1_tdata_rimage_1] iwi-aor--- 500.00m          /dev/sdb1(1)
  [synced_random_raid1_3legs_1_tdata_rimage_2] iwi-aor-p- 500.00m          unknown device(1)
  [synced_random_raid1_3legs_1_tdata_rimage_3] iwi-aor--- 500.00m          /dev/sdd1(1)
  [synced_random_raid1_3legs_1_tdata_rmeta_0]  ewi-aor--- 4.00m            /dev/sde1(0)
  [synced_random_raid1_3legs_1_tdata_rmeta_1]  ewi-aor--- 4.00m            /dev/sdb1(0)
  [synced_random_raid1_3legs_1_tdata_rmeta_2]  ewi-aor-p- 4.00m            unknown device(0)
  [synced_random_raid1_3legs_1_tdata_rmeta_3]  ewi-aor--- 4.00m            /dev/sdd1(0)
  [synced_random_raid1_3legs_1_tmeta]          ewi-ao---- 200.00m          /dev/sde1(126)
  virt_synced_random_raid1_3legs_1             Vwi-aotzp- 200.00m

Verifying FAILED device /dev/sdc1 is *NOT* in the volume(s)
Verifying IMAGE device /dev/sde1 *IS* in the volume(s)
Verifying IMAGE device /dev/sdb1 *IS* in the volume(s)
Verifying IMAGE device /dev/sdd1 *IS* in the volume(s)
verify the rimage/rmeta dm devices remain after the failures
Checking EXISTENCE and STATE of lvol0_pmspare on: taft-01
lvol0_pmspare on taft-01 should still exist

[root@taft-01 ~]# dmsetup ls
black_bird-synced_random_raid1_3legs_1_tdata_rmeta_1    (253:6)
black_bird-synced_random_raid1_3legs_1_tdata_rmeta_0    (253:4)
black_bird-synced_random_raid1_3legs_1_tdata_rimage_3   (253:11)
black_bird-synced_random_raid1_3legs_1_tdata_rimage_2   (253:9)
black_bird-virt_synced_random_raid1_3legs_1             (253:15)
black_bird-synced_random_raid1_3legs_1_tdata_rimage_1   (253:7)
black_bird-synced_random_raid1_3legs_1_tdata_rimage_0   (253:5)
black_bird-synced_random_raid1_3legs_1                  (253:14)
black_bird-synced_random_raid1_3legs_1-tpool            (253:13)
black_bird-synced_random_raid1_3legs_1_tdata            (253:12)
black_bird-synced_random_raid1_3legs_1_tdata_rmeta_3    (253:10)
black_bird-synced_random_raid1_3legs_1_tdata_rmeta_2    (253:8)
black_bird-synced_random_raid1_3legs_1_tmeta            (253:3)

Version-Release number of selected component (if applicable):
2.6.32-425.el6.x86_64
lvm2-2.02.100-8.el6                        BUILT: Wed Oct 30 03:10:56 CDT 2013
lvm2-libs-2.02.100-8.el6                   BUILT: Wed Oct 30 03:10:56 CDT 2013
lvm2-cluster-2.02.100-8.el6                BUILT: Wed Oct 30 03:10:56 CDT 2013
udev-147-2.51.el6                          BUILT: Thu Oct 17 06:14:34 CDT 2013
device-mapper-1.02.79-8.el6                BUILT: Wed Oct 30 03:10:56 CDT 2013
device-mapper-libs-1.02.79-8.el6           BUILT: Wed Oct 30 03:10:56 CDT 2013
device-mapper-event-1.02.79-8.el6          BUILT: Wed Oct 30 03:10:56 CDT 2013
device-mapper-event-libs-1.02.79-8.el6     BUILT: Wed Oct 30 03:10:56 CDT 2013
device-mapper-persistent-data-0.2.8-2.el6  BUILT: Mon Oct 21 09:14:25 CDT 2013
cmirror-2.02.100-8.el6                     BUILT: Wed Oct 30 03:10:56 CDT 2013

How reproducible:
Every time
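For the test-harness question above, one way to decide whether an LV still has a dm entry is to grep captured `dmsetup ls` output. The following is only a sketch, not part of the report: the sample output is inlined from the session above so the snippet is self-contained, and it ignores the fact that real dm names double any hyphens occurring inside VG/LV names.

```shell
# Captured `dmsetup ls` output from the first scenario above (tab-separated).
dmsetup_output='TEST-test	(253:3)'

# Hypothetical helper: does VG $1 / LV $2 have a dm entry?
# dm device names join VG and LV as "VG-LV" (hyphen doubling ignored here).
dm_entry_exists() {
    printf '%s\n' "$dmsetup_output" | grep -q "^${1}-${2}[[:space:]]"
}

if dm_entry_exists TEST test; then
    echo "dm entry for TEST/test still present"
else
    echo "no dm entry for TEST/test"
fi
```

A test could assert presence for the failed-but-active LV and absence for lvol0_pmspare using the same helper.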
Comment 1:

I'm somewhat confused by this question. From the 'lvs' output, the lvol0_pmspare volume appears to be inactive, and 'dmsetup' likewise does not list it as active. The tool will likely not handle 'double faults' automatically - so if multiple errors appear at once, the user needs to resolve the problems manually.
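The activity and health that comment 1 reads off the listing are encoded in the lv_attr string: character 5 is the activation state ('a' = active) and character 9 is health ('p' = partial, i.e. a PV is missing). A minimal sketch, using the pmspare attr string copied from the failure output above:

```shell
# lv_attr for [lvol0_pmspare] after the failure, copied from the report.
# Positions assumed from lvm2's lv_attr field documentation:
#   char 5 = state ('a' active, '-' inactive), char 9 = health ('p' partial).
attr='ewi-----p-'

state=$(printf '%s' "$attr" | cut -c5)
health=$(printf '%s' "$attr" | cut -c9)

if [ "$state" = "a" ]; then echo "active"; else echo "inactive"; fi
if [ "$health" = "p" ]; then echo "partial (missing PV)"; fi
# → prints "inactive" then "partial (missing PV)"
```

So the LV is simultaneously inactive and partial, which is exactly the combination the reporter observed.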
Comment 2:

Comment 1 makes a valid point. After looking at this more, the pmspare device *never* appears to be active (even before the failure), so it should never show up in dmsetup, correct? If that's the case, this can be closed NOTABUG.

[root@host-109 ~]# lvs -a -o +devices
  LV                     Attr       LSize   Pool  Data% Meta% Cpy%Sync Devices
  [lvol0_pmspare]        ewi------- 500.00m                            /dev/sda1(126)
  raid1                  twi-a-tz-- 500.00m        0.00  0.01          raid1_tdata(0)
  [raid1_tdata]          rwi-aor--- 500.00m                   100.00   raid1_tdata_rimage_0(0),raid1_tdata_rimage_1(0)
  [raid1_tdata_rimage_0] iwi-aor--- 500.00m                            /dev/sda1(1)
  [raid1_tdata_rimage_1] iwi-aor--- 500.00m                            /dev/sdb1(1)
  [raid1_tdata_rmeta_0]  ewi-aor--- 4.00m                              /dev/sda1(0)
  [raid1_tdata_rmeta_1]  ewi-aor--- 4.00m                              /dev/sdb1(0)
  [raid1_tmeta]          ewi-ao---- 500.00m                            /dev/sdd1(0)
  virt_1                 Vwi-a-tz-- 100.00m raid1  0.00
  virt_2                 Vwi-a-tz-- 100.00m raid1  0.00

[root@host-109 ~]# lvchange -ay TEST/lvol0_pmspare
  Unable to change internal LV lvol0_pmspare directly

[root@host-109 ~]# lvs -a -o +devices
  LV                     Attr       LSize   Pool  Data% Meta% Cpy%Sync Devices
  [lvol0_pmspare]        ewi------- 500.00m                            /dev/sda1(126)
  raid1                  twi-a-tz-- 500.00m        0.00  0.01          raid1_tdata(0)
  [raid1_tdata]          rwi-aor--- 500.00m                   100.00   raid1_tdata_rimage_0(0),raid1_tdata_rimage_1(0)
  [raid1_tdata_rimage_0] iwi-aor--- 500.00m                            /dev/sda1(1)
  [raid1_tdata_rimage_1] iwi-aor--- 500.00m                            /dev/sdb1(1)
  [raid1_tdata_rmeta_0]  ewi-aor--- 4.00m                              /dev/sda1(0)
  [raid1_tdata_rmeta_1]  ewi-aor--- 4.00m                              /dev/sdb1(0)
  [raid1_tmeta]          ewi-ao---- 500.00m                            /dev/sdd1(0)
  virt_1                 Vwi-a-tz-- 100.00m raid1  0.00
  virt_2                 Vwi-a-tz-- 100.00m raid1  0.00

[root@host-109 ~]# dmsetup ls
TEST-raid1                 (253:9)
TEST-raid1-tpool           (253:8)
TEST-raid1_tdata           (253:7)
TEST-raid1_tmeta           (253:2)
TEST-virt_2                (253:11)
TEST-raid1_tdata_rimage_1  (253:6)
TEST-virt_1                (253:10)
TEST-raid1_tdata_rimage_0  (253:4)
TEST-raid1_tdata_rmeta_1   (253:5)
TEST-raid1_tdata_rmeta_0   (253:3)

[root@host-109 ~]# echo offline > /sys/block/sda/device/state

[root@host-109 ~]# pvscan --cache /dev/sda1
  /dev/sda1: read failed after 0 of 2048 at 0: Input/output error
  No PV label found on /dev/sda1.

[root@host-109 ~]# lvs -a -o +devices
  PV 85qEBV-5uad-iH4L-KAjW-DMgj-EvHJ-fL2VfE not recognised. Is the device missing?
  PV 85qEBV-5uad-iH4L-KAjW-DMgj-EvHJ-fL2VfE not recognised. Is the device missing?
  LV                     Attr       LSize   Pool  Data% Meta% Cpy%Sync Devices
  [lvol0_pmspare]        ewi-----p- 500.00m                            unknown device(126)
  raid1                  twi-a-tzp- 500.00m        0.00  0.01          raid1_tdata(0)
  [raid1_tdata]          rwi-aor-p- 500.00m                   100.00   raid1_tdata_rimage_0(0),raid1_tdata_rimage_1(0)
  [raid1_tdata_rimage_0] iwi-aor-p- 500.00m                            unknown device(1)
  [raid1_tdata_rimage_1] iwi-aor--- 500.00m                            /dev/sdb1(1)
  [raid1_tdata_rmeta_0]  ewi-aor-p- 4.00m                              unknown device(0)
  [raid1_tdata_rmeta_1]  ewi-aor--- 4.00m                              /dev/sdb1(0)
  [raid1_tmeta]          ewi-ao---- 500.00m                            /dev/sdd1(0)
  virt_1                 Vwi-a-tzp- 100.00m raid1  0.00
  virt_2                 Vwi-a-tzp- 100.00m raid1  0.00

[root@host-109 ~]# dmsetup ls
TEST-raid1                 (253:9)
TEST-raid1-tpool           (253:8)
TEST-raid1_tdata           (253:7)
TEST-raid1_tmeta           (253:2)
TEST-virt_2                (253:11)
TEST-raid1_tdata_rimage_1  (253:6)
TEST-virt_1                (253:10)
TEST-raid1_tdata_rimage_0  (253:4)
TEST-raid1_tdata_rmeta_1   (253:5)
TEST-raid1_tdata_rmeta_0   (253:3)

3.10.0-163.el7.x86_64
lvm2-2.02.111-1.el7                        BUILT: Mon Sep 29 09:18:07 CDT 2014
lvm2-libs-2.02.111-1.el7                   BUILT: Mon Sep 29 09:18:07 CDT 2014
lvm2-cluster-2.02.111-1.el7                BUILT: Mon Sep 29 09:18:07 CDT 2014
device-mapper-1.02.90-1.el7                BUILT: Mon Sep 29 09:18:07 CDT 2014
device-mapper-libs-1.02.90-1.el7           BUILT: Mon Sep 29 09:18:07 CDT 2014
device-mapper-event-1.02.90-1.el7          BUILT: Mon Sep 29 09:18:07 CDT 2014
device-mapper-event-libs-1.02.90-1.el7     BUILT: Mon Sep 29 09:18:07 CDT 2014
device-mapper-persistent-data-0.3.2-1.el7  BUILT: Thu Apr 3 09:58:51 CDT 2014
cmirror-2.02.111-1.el7                     BUILT: Mon Sep 29 09:18:07 CDT 2014
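The "Unable to change internal LV lvol0_pmspare directly" error above points at a rule a test can rely on: `lvs -a` prints internal LVs (pmspare, rimage/rmeta, tmeta/tdata) in square brackets, and those cannot be activated directly. A minimal sketch of how a harness might use that marker (the helper name is hypothetical):

```shell
# Hypothetical helper: internal LVs are the bracketed names in `lvs -a`.
is_internal_lv() {
    case "$1" in
        \[*\]) return 0 ;;   # e.g. [lvol0_pmspare], [raid1_tdata_rimage_0]
        *)     return 1 ;;
    esac
}

for lv in '[lvol0_pmspare]' 'raid1' 'virt_1'; do
    if is_internal_lv "$lv"; then
        echo "$lv: internal, skip direct activation checks"
    else
        echo "$lv: top-level"
    fi
done
```

A test that iterates over `lvs -a` output could use this to avoid expecting dm entries for LVs it can never activate itself.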
Comment 3:

Correct - _pmspare will appear in the dm table only in certain cases. One of them is 'pvmove', where we so far don't know how to move extents assigned to an LV while it is offline. Another case is a repair operation in progress. Other than these limited cases, it should stay inactive.
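The expectation in comment 3 could be encoded in a test roughly as follows. This is a sketch, not from the report: the `repair_in_progress` flag is hypothetical (a harness would set it around `pvmove` or `lvconvert --repair`), and the `dmsetup ls` output is inlined from the session above.

```shell
# Captured `dmsetup ls` excerpt from above: no pmspare entry present.
dmsetup_ls='TEST-raid1	(253:9)
TEST-raid1_tmeta	(253:2)'

# Hypothetical harness flag: true only while pvmove/lvconvert --repair runs.
repair_in_progress=false

if printf '%s\n' "$dmsetup_ls" | grep -q 'lvol0_pmspare'; then
    if [ "$repair_in_progress" = true ]; then
        echo "pmspare active during repair/pvmove: expected"
    else
        echo "pmspare active outside repair/pvmove: unexpected"
    fi
else
    echo "pmspare inactive: expected"
fi
```

With this rule the original test can stop treating the missing pmspare dm entry as a failure.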