Red Hat Bugzilla – Bug 995440
lvmetad unable to update metadata after failed PV has returned to mirror.
Last modified: 2013-11-21 18:26:47 EST
Description of problem:
When a PV fails and leaves the VG (mirror leg failure), then subsequently returns, lvmetad cannot seem to cope with updating the metadata. It keeps repeating the same message over and over again.

This is the output:

[root@tardis-01 log]# vgs
  Missing device /dev/sdk1 reappeared, updating metadata for VG revolution_9 to version 11.
  Missing device /dev/sdk1 reappeared, updating metadata for VG revolution_9 to version 11.
  VG           #PV #LV #SN Attr   VSize   VFree
  revolution_9   6   1   0 wz--n- 465.80g 459.80g
  vg_tardis01    1   3   0 wz--n- 278.88g      0

[root@tardis-01 log]# pvs
  Missing device /dev/sdk1 reappeared, updating metadata for VG revolution_9 to version 11.
  Missing device /dev/sdk1 reappeared, updating metadata for VG revolution_9 to version 11.
  Missing device /dev/sdk1 reappeared, updating metadata for VG revolution_9 to version 11.
  Missing device /dev/sdk1 reappeared, updating metadata for VG revolution_9 to version 11.
  Missing device /dev/sdk1 reappeared, updating metadata for VG revolution_9 to version 11.
  Missing device /dev/sdk1 reappeared, updating metadata for VG revolution_9 to version 11.
  Missing device /dev/sdk1 reappeared, updating metadata for VG revolution_9 to version 11.
  PV         VG           Fmt  Attr PSize   PFree
  /dev/sda2  vg_tardis01  lvm2 a--  278.88g       0
  /dev/sdb1  revolution_9 lvm2 a--   93.12g  91.12g
  /dev/sdc1               lvm2 a--   93.13g  93.13g
  /dev/sdd1  revolution_9 lvm2 a--   93.12g  91.12g
  /dev/sde1  revolution_9 lvm2 a--  184.00m 184.00m
  /dev/sdf1               lvm2 a--   93.13g  93.13g
  /dev/sdg1               lvm2 a--   93.13g  93.13g
  /dev/sdi1  revolution_9 lvm2 a--   93.12g  93.12g
  /dev/sdk1  revolution_9 lvm2 a--   93.12g  93.12g
  /dev/sdl1               lvm2 a--   93.13g  93.13g
  /dev/sdm1  revolution_9 lvm2 a--   93.12g  91.12g
  /dev/sdo1               lvm2 a--   93.13g  93.13g

[root@tardis-01 log]# lvs
  Missing device /dev/sdk1 reappeared, updating metadata for VG revolution_9 to version 11.
  Missing device /dev/sdk1 reappeared, updating metadata for VG revolution_9 to version 11.
  LV       VG           Attr       LSize   Pool Origin Data%  Move Log           Cpy%Sync Convert
  mirror_1 revolution_9 mwi-a-m---   2.00g                         mirror_1_mlog   100.00
  lv_home  vg_tardis01  -wi-ao---- 224.88g
  lv_root  vg_tardis01  -wi-ao----  50.00g
  lv_swap  vg_tardis01  -wi-ao----   4.00g

The message:

  Missing device /dev/sdk1 reappeared, updating metadata for VG revolution_9 to version 11.

keeps repeating with every LVM command. Note that the returned PV is actually shown in the VG as present.

With lvmetad off it goes like so:

[root@tardis-01 log]# lvs --config 'global{use_lvmetad=0}'
  WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!
  WARNING: Inconsistent metadata found for VG revolution_9 - updating to use version 11
  Missing device /dev/sdk1 reappeared, updating metadata for VG revolution_9 to version 11.
  LV       VG           Attr       LSize   Pool Origin Data%  Move Log           Cpy%Sync Convert
  mirror_1 revolution_9 mwi-a-m---   2.00g                         mirror_1_mlog   100.00
  lv_home  vg_tardis01  -wi-ao---- 224.88g
  lv_root  vg_tardis01  -wi-ao----  50.00g
  lv_swap  vg_tardis01  -wi-ao----   4.00g

[root@tardis-01 log]# lvs --config 'global{use_lvmetad=0}'
  WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!
  LV       VG           Attr       LSize   Pool Origin Data%  Move Log           Cpy%Sync Convert
  mirror_1 revolution_9 mwi-a-m---   2.00g                         mirror_1_mlog   100.00
  lv_home  vg_tardis01  -wi-ao---- 224.88g
  lv_root  vg_tardis01  -wi-ao----  50.00g
  lv_swap  vg_tardis01  -wi-ao----   4.00g

As you can see, the message does not repeat anymore. Turn on lvmetad again and we get:

[root@tardis-01 log]# lvs
  Missing device /dev/sdk1 reappeared, updating metadata for VG revolution_9 to version 11.
  Missing device /dev/sdk1 reappeared, updating metadata for VG revolution_9 to version 11.
  LV       VG           Attr       LSize   Pool Origin Data%  Move Log           Cpy%Sync Convert
  mirror_1 revolution_9 mwi-a-m---   2.00g                         mirror_1_mlog   100.00
  lv_home  vg_tardis01  -wi-ao---- 224.88g
  lv_root  vg_tardis01  -wi-ao----  50.00g
  lv_swap  vg_tardis01  -wi-ao----   4.00g

Version-Release number of selected component (if applicable):
lvm2-2.02.100-0.45.el6.x86_64

How reproducible:
Every time

Steps to Reproduce:
1. Create a VG, create a mirror LV, wait for sync.
2. Fail a random PV (wait for repair/conversion).
3. Get the PV back and try to execute any LVM command with lvmetad running.

Expected results:
The metadata and its version should be updated, as happens when lvmetad is not in use.
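
The symptom above (the notice still being printed on the second and later runs of a command) can be checked mechanically. The following is only a minimal sketch: the helper names count_reappeared and notice_persists are hypothetical, and it assumes the notice text matches the output quoted in this report.

```shell
#!/bin/sh
# Hypothetical helpers, not part of lvm2.

# Count "Missing device ... reappeared" notices in text read from stdin.
# grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
count_reappeared() {
    grep -c 'reappeared, updating metadata for VG' || true
}

# Run the given command twice. The first run may legitimately print the
# notice once; succeed (exit 0) only if the second run still prints it.
notice_persists() {
    "$@" >/dev/null 2>&1
    second=$("$@" 2>&1 | count_reappeared)
    [ "$second" -gt 0 ]
}
```

For example, `notice_persists lvs && echo "notice still repeating"` would report the behaviour described here, while with lvmetad disabled the second run is expected to be quiet.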
I can still reproduce this easily by running the revolution_9 test:

revolution_9 -i 5 -o virt-012 -e kill_random_legs,kill_random_devices

It is 100% reproducible for me (lvmetad on).

[root@virt-012 ~]# vgs
  PV LpymAu-GeS1-jD9f-OHgn-keyq-kPcv-Jj1qO8 not recognised. Is the device missing?
  Missing device /dev/sdj1 reappeared, updating metadata for VG revolution_9 to version 55.
  Missing device /dev/sdb1 reappeared, updating metadata for VG revolution_9 to version 55.
  PV LpymAu-GeS1-jD9f-OHgn-keyq-kPcv-Jj1qO8 not recognised. Is the device missing?
  Missing device /dev/sdj1 reappeared, updating metadata for VG revolution_9 to version 55.
  Missing device /dev/sdb1 reappeared, updating metadata for VG revolution_9 to version 55.
  VG           #PV #LV #SN Attr   VSize  VFree
  revolution_9   6   1   0 wz-pn- 59.95g 55.95g
  vg_virt012     1   2   0 wz--n-  7.51g     0

[root@virt-012 ~]# lvs
  PV LpymAu-GeS1-jD9f-OHgn-keyq-kPcv-Jj1qO8 not recognised. Is the device missing?
  Missing device /dev/sdj1 reappeared, updating metadata for VG revolution_9 to version 55.
  Missing device /dev/sdb1 reappeared, updating metadata for VG revolution_9 to version 55.
  PV LpymAu-GeS1-jD9f-OHgn-keyq-kPcv-Jj1qO8 not recognised. Is the device missing?
  Missing device /dev/sdj1 reappeared, updating metadata for VG revolution_9 to version 55.
  Missing device /dev/sdb1 reappeared, updating metadata for VG revolution_9 to version 55.
  LV       VG           Attr       LSize   Pool Origin Data%  Move Log           Cpy%Sync Convert
  mirror_1 revolution_9 mwi-aom---   2.00g                         mirror_1_mlog   100.00
  lv_root  vg_virt012   -wi-ao----   6.71g
  lv_swap  vg_virt012   -wi-ao---- 816.00m

================================================================================

An easy way to reproduce WITHOUT this revolution test would be:

Turn off lvmetad and set use_lvmetad to 0.

Have these set in lvm.conf:

  mirror_log_fault_policy="allocate"
  mirror_image_fault_policy="allocate"
  mirror_segtype_default="mirror"

Create a VG and a mirrored LV, wait for sync, then fail a device:

 - vgcreate newvg /dev/sd{a..f}1
 - lvcreate -m1 -L3G -n mirror newvg
 - echo 1 >/sys/block/sda/device/delete
 - do some I/O to force repair and replacement of the device
 - wait for the sync to finish (so that LVM is not doing anything on the LV anymore)

When the device is replaced and the sync is done, turn on lvmetad by changing use_lvmetad to 1 in lvm.conf and starting the lvm2-lvmetad daemon (/etc/init.d/lvm2-lvmetad start).

Return the failed device, for example by rescanning the SCSI bus:

echo "- - -" >/sys/class/scsi_host/host6/scan

Now try any lvm command. Here are the results:

[root@virt-012 ~]# lvs vg
  Missing device /dev/sda1 reappeared, updating metadata for VG newvg to version 12.
  Missing device /dev/sda1 reappeared, updating metadata for VG newvg to version 12.
  LV      VG         Attr       LSize   Pool Origin Data%  Move Log         Cpy%Sync Convert
  mirror  newvg      mwi-a-m---   3.00g                         mirror_mlog   100.00
  lv_root vg_virt012 -wi-ao----   6.71g
  lv_swap vg_virt012 -wi-ao---- 816.00m

[root@virt-012 ~]# vgs
  Missing device /dev/sda1 reappeared, updating metadata for VG newvg to version 12.
  Missing device /dev/sda1 reappeared, updating metadata for VG newvg to version 12.
  VG         #PV #LV #SN Attr   VSize  VFree
  newvg        6   1   0 wz--n- 59.95g 53.95g
  vg_virt012   1   2   0 wz--n-  7.51g     0

[root@virt-012 ~]# pvscan --cache
  WARNING: Inconsistent metadata found for VG newvg

[root@virt-012 ~]# vgs
  Missing device /dev/sda1 reappeared, updating metadata for VG newvg to version 12.
  Missing device /dev/sda1 reappeared, updating metadata for VG newvg to version 12.
  VG         #PV #LV #SN Attr   VSize  VFree
  newvg        6   1   0 wz--n- 59.95g 53.95g
  vg_virt012   1   2   0 wz--n-  7.51g     0

The version number just stays the same, even after pvscan --cache.
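
To make the "number stays the same" observation easier to check across runs, the announced target version can be pulled out of the notice text. This is a rough sketch under the assumption that the notice has exactly the wording shown in this comment; reappeared_versions and version_stuck are hypothetical helper names.

```shell
#!/bin/sh
# Hypothetical helpers, not part of lvm2.

# Print one "VG version" pair for each distinct notice found on stdin.
reappeared_versions() {
    sed -n 's/.*reappeared, updating metadata for VG \([^ ]*\) to version \([0-9][0-9]*\).*/\1 \2/p' | sort -u
}

# Run the given command twice; succeed (exit 0) only if the second run
# still announces an update and names the same version as the first run.
version_stuck() {
    first=$("$@" 2>&1 | reappeared_versions)
    second=$("$@" 2>&1 | reappeared_versions)
    [ -n "$second" ] && [ "$first" = "$second" ]
}
```

With the output above, both runs of vgs would yield "newvg 12", so `version_stuck vgs` would succeed and flag the stuck update.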
The problem here seems to be with lvmetad updating its cache. I've upstreamed a test case for the internal lvm test suite that covers this scenario: https://www.redhat.com/archives/lvm-devel/2013-October/msg00021.html A fix for the lvmetad case still needs to be added.
This should be fixed upstream in 0decd7553ac9dcf4a7d81f5b10b1f4ca053ae9a5. (We have cleared up the matter about lvconvert --repair: without lvmetad, it accidentally repairs the metadata even though it's not supposed to do that. It's not a major issue, although it might be surprising that PVs are re-integrated by dmeventd. With "normal" commands, things now work as expected both with and without lvmetad.)
The messages no longer repeat and metadata is updated as it should be.

[root@virt-011 ~]# vgs
  PV SRfdXr-cr5q-1lgz-UFi3-7uRd-dJhd-vcKszr not recognised. Is the device missing?
  PV SRfdXr-cr5q-1lgz-UFi3-7uRd-dJhd-vcKszr not recognised. Is the device missing?
  VG         #PV #LV #SN Attr   VSize  VFree
  newvg        4   1   0 wz-pn- 39.97g 33.96g
  vg_virt011   1   2   0 wz--n-  7.51g     0

[root@virt-011 ~]# echo "- - -" >/sys/class/scsi_host/host9/scan

[root@virt-011 ~]# lvs -a
  Missing device /dev/sdb1 reappeared, updating metadata for VG newvg to version 10.
  LV                VG         Attr       LSize   Pool Origin Data%  Move Log         Cpy%Sync Convert
  mirror            newvg      mwi-a-m---   3.00g                         mirror_mlog   100.00
  [mirror_mimage_0] newvg      iwi-aom---   3.00g
  [mirror_mimage_1] newvg      iwi-aom---   3.00g
  [mirror_mlog]     newvg      lwi-aom---   4.00m
  lv_root           vg_virt011 -wi-ao----   6.71g
  lv_swap           vg_virt011 -wi-ao---- 816.00m

[root@virt-011 ~]# lvs -a
  LV                VG         Attr       LSize   Pool Origin Data%  Move Log         Cpy%Sync Convert
  mirror            newvg      mwi-a-m---   3.00g                         mirror_mlog   100.00
  [mirror_mimage_0] newvg      iwi-aom---   3.00g
  [mirror_mimage_1] newvg      iwi-aom---   3.00g
  [mirror_mlog]     newvg      lwi-aom---   4.00m
  lv_root           vg_virt011 -wi-ao----   6.71g
  lv_swap           vg_virt011 -wi-ao---- 816.00m

Marking VERIFIED with:
lvm2-2.02.100-5.el6.x86_64
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2013-1704.html