Bug 1467975
| Field | Value |
| --- | --- |
| Summary | stray lvmlockd lock when attempting to create volume using an already in use minor number |
| Product | Red Hat Enterprise Linux 7 |
| Component | lvm2 |
| lvm2 sub component | LVM lock daemon / lvmlockd |
| Version | 7.4 |
| Hardware | x86_64 |
| OS | Linux |
| Status | CLOSED ERRATA |
| Severity | medium |
| Priority | unspecified |
| Target Milestone | rc |
| Fixed In Version | lvm2-2.02.175-1.el7 |
| Doc Type | If docs needed, set a value |
| Reporter | Corey Marthaler <cmarthal> |
| Assignee | LVM and device-mapper development team <lvm-team> |
| QA Contact | cluster-qe <cluster-qe> |
| CC | agk, heinzm, jbrassow, mcsontos, prajnoha, teigland, zkabelac |
| Type | Bug |
| Last Closed | 2018-04-10 15:20:44 UTC |
| Attachments | 1294744: lvmlockctl --dump from failing node host-115 |
Description
Corey Marthaler
2017-07-05 17:36:28 UTC
> #vgchange --lock-stop raid_sanity
> VG raid_sanity stop failed: LVs must first be deactivated
This is printed when lvmlockd returns -EBUSY for the lock-stop request from vgchange. lvmlockd returns -EBUSY if it finds that LV locks still exist in the lockspace, and vgchange translates that -EBUSY into the "LVs must first be deactivated" message.
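For reference, the expected sequence is to deactivate the shared VG's LVs and then stop its lockspace; a minimal sketch of that flow, using the raid_sanity VG from this report:

# Deactivate all LVs in the shared VG on this host
vgchange -an raid_sanity
# With no LV locks left in the lockspace, the lock-stop should then succeed
vgchange --lock-stop raid_sanity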
What's not clear is what LV locks still exist and why. If no LVs exist, no LV locks should exist either. So, I'm guessing that there may have been a command run during the test that left a stray LV lock in the lockspace (a bug).
'sanlock gets' reports info about lockspaces+hosts, but not about locks. The 'sanlock status' command will show info about lockspaces+locks, which should show us if lvmlockd is still holding any stray LV locks. To actually debug where a stray lock came from, we'll need to collect the full debug log from lvmlockd after the --lock-stop failure. The command for that is 'lvmlockctl --dump' redirected to some file.
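As a hedged example of gathering that data right after a failed lock-stop (the dump path /tmp/lvmlockd.dump is only an illustrative choice):

# List lockspaces and any locks sanlock still holds on behalf of lvmlockd
sanlock status
# Save lvmlockd's full debug log for later analysis
lvmlockctl --dump > /tmp/lvmlockd.dump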
[root@host-114 ~]# vgchange --lock-stop raid_sanity

[root@host-114 ~]# sanlock status
daemon 116c961f-2475-43fa-9bef-373f3f068f46.host-114.v
p -1 helper
p -1 listener
p 28036 lvmlockd
p -1 status
s lvm_global:105:/dev/mapper/global-lvmlock:0

[root@host-115 ~]# vgchange --lock-stop raid_sanity
  VG raid_sanity stop failed: LVs must first be deactivated

[root@host-115 ~]# sanlock status
daemon 2e704e09-785f-400d-82c1-9675ca80bdf4.host-115.v
p -1 helper
p -1 listener
p 16581 lvmlockd
p 16581 lvmlockd
p -1 status
s lvm_raid_sanity:1063:/dev/mapper/raid_sanity-lvmlock:0
s lvm_global:1063:/dev/mapper/global-lvmlock:0
r lvm_raid_sanity:zd1NSD-pHdO-Kxy3-VmFD-urzN-EecT-7mzRbk:/dev/mapper/raid_sanity-lvmlock:71303168:1 p 16581

Created attachment 1294744 [details]
lvmlockctl --dump from failing node host-115
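The leftover LV lock shows up in 'sanlock status' as an 'r' (resource) line whose name matches the UUID of the LV that lvcreate was creating. One way to trace it in the dump, assuming it was saved to the illustrative /tmp/lvmlockd.dump path above, is a simple search:

# Find every dump entry that references the stray LV lock's UUID
grep zd1NSD-pHdO-Kxy3-VmFD-urzN-EecT-7mzRbk /tmp/lvmlockd.dump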
r lvm_raid_sanity:zd1NSD-pHdO-Kxy3-VmFD-urzN-EecT-7mzRbk:/dev/mapper/raid_sanity-lvmlock:71303168:1 p 16581

Searching for "zd1NSD-pHdO-Kxy3-VmFD-urzN-EecT-7mzRbk":

1499286064 send lvcreate[32619] cl 330 find_free_lock vg rv 0
1499286064 recv lvcreate[32619] cl 330 init lv "raid_sanity" mode iv flags 0
1499286064 work init_lv raid_sanity/inuse_minorB uuid zd1NSD-pHdO-Kxy3-VmFD-urzN-EecT-7mzRbk
1499286064 S lvm_raid_sanity init_lv_san zd1NSD-pHdO-Kxy3-VmFD-urzN-EecT-7mzRbk found unused area at 71303168
1499286064 send lvcreate[32619] cl 330 init lv rv 0 vg_args 1.0.0:lvmlock lv_args 1.0.0:71303168
1499286064 recv lvcreate[32619] cl 330 update vg "raid_sanity" mode iv flags 0
1499286064 S lvm_raid_sanity R VGLK action update iv
1499286064 S lvm_raid_sanity R VGLK res_update cl 330 lk version to 281
1499286064 send lvcreate[32619] cl 330 update vg rv 0
1499286064 recv lvcreate[32619] cl 330 update vg "raid_sanity" mode iv flags 0
1499286064 S lvm_raid_sanity R VGLK action update iv
1499286064 S lvm_raid_sanity R VGLK res_update cl 330 lk version to 282
1499286064 send lvcreate[32619] cl 330 update vg rv 0
1499286065 recv lvcreate[32619] cl 330 lock lv "raid_sanity" mode ex flags 1
1499286065 S lvm_raid_sanity R zd1NSD-pHdO-Kxy3-VmFD-urzN-EecT-7mzRbk action lock ex
1499286065 S lvm_raid_sanity R zd1NSD-pHdO-Kxy3-VmFD-urzN-EecT-7mzRbk res_lock cl 330 mode ex (inuse_minorB)
1499286065 S lvm_raid_sanity R zd1NSD-pHdO-Kxy3-VmFD-urzN-EecT-7mzRbk lock_san ex at /dev/mapper/raid_sanity-lvmlock:71303168
1499286065 S lvm_raid_sanity R zd1NSD-pHdO-Kxy3-VmFD-urzN-EecT-7mzRbk res_lock rv 0
1499286065 send lvcreate[32619] cl 330 lock lv rv 0
1499286065 recv lvcreate[32619] cl 330 update vg "raid_sanity" mode iv flags 0
1499286065 S lvm_raid_sanity R VGLK action update iv
1499286065 S lvm_raid_sanity R VGLK res_update cl 330 lk version to 283
1499286065 send lvcreate[32619] cl 330 update vg rv 0
1499286065 recv lvcreate[32619] cl 330 lock vg "raid_sanity" mode un flags 0

Which is the following test case:

SCENARIO (raid1) - [create_inuse_minor_raid]
Create a raid and then attempt to reuse it's minor num on a new raid
Creating raid with rand minor num 165
lvcreate --activate ey --type raid1 -m 1 -n inuse_minorA -L 300M -My --major 253 --minor 165 raid_sanity
  WARNING: Ignoring supplied major number 253 - kernel assigns major numbers dynamically. Using major number 253 instead.
dmsetup ls | grep inuse_minorA | grep 165
Attempt to create raid with in use minor num 165
lvcreate --activate ey --type raid1 -m 1 -n inuse_minorB -L 300M -My --major 253 --minor 165 raid_sanity
  WARNING: Ignoring supplied major number 253 - kernel assigns major numbers dynamically. Using major number 253 instead.
  The requested major:minor pair (253:165) is already used.
  Failed to activate new LV.
Deactivating raid inuse_minorA...
and removing

VG raid_sanity stop failed: LVs must first be deactivated
unable to stop lock space for raid_sanity on host-115

[root@host-115 ~]# lvs
  LV   VG            Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  root rhel_host-115 -wi-ao---- <6.20g
  swap rhel_host-115 -wi-ao---- 820.00m

[root@host-115 ~]# sanlock status
daemon 9986d712-c73e-4690-a2ce-51b911a84641.host-115.v
p -1 helper
p -1 listener
p 21184 lvmlockd
p 21184 lvmlockd
p -1 status
s lvm_raid_sanity:1063:/dev/mapper/raid_sanity-lvmlock:0
s lvm_global:1063:/dev/mapper/global-lvmlock:0
r lvm_raid_sanity:WT1nN5-EtwM-pJnJ-uexB-zoHi-cQij-lI4g2I:/dev/mapper/raid_sanity-lvmlock:71303168:1 p 21184

Thanks for the debugging, it looks like the error path is missing the unlock which leaves the stray LV lock as suspected. Fixed here:
https://sourceware.org/git/?p=lvm2.git;a=commit;h=3797f47ecf23c41d4476e2cce0f210b48b32923d

*** Bug 1489986 has been marked as a duplicate of this bug. ***

Fix verified in the latest rpms.

lvm2-2.02.175-2.el7    BUILT: Fri Oct 13 06:31:22 CDT 2017
lvm2-libs-2.02.175-2.el7    BUILT: Fri Oct 13 06:31:22 CDT 2017
lvm2-cluster-2.02.175-2.el7    BUILT: Fri Oct 13 06:31:22 CDT 2017
device-mapper-1.02.144-2.el7    BUILT: Fri Oct 13 06:31:22 CDT 2017
device-mapper-libs-1.02.144-2.el7    BUILT: Fri Oct 13 06:31:22 CDT 2017
device-mapper-event-1.02.144-2.el7    BUILT: Fri Oct 13 06:31:22 CDT 2017
device-mapper-event-libs-1.02.144-2.el7    BUILT: Fri Oct 13 06:31:22 CDT 2017
device-mapper-persistent-data-0.7.0-0.1.rc6.el7    BUILT: Mon Mar 27 10:15:46 CDT 2017
cmirror-2.02.175-2.el7    BUILT: Fri Oct 13 06:31:22 CDT 2017
sanlock-3.5.0-1.el7    BUILT: Wed Apr 26 09:37:30 CDT 2017
sanlock-lib-3.5.0-1.el7    BUILT: Wed Apr 26 09:37:30 CDT 2017
lvm2-lockd-2.02.175-2.el7    BUILT: Fri Oct 13 06:31:22 CDT 2017

# mirrors
host-040: pvcreate /dev/sdf2 /dev/sdf1 /dev/sda2 /dev/sda1 /dev/sdh2 /dev/sdh1 /dev/sdd2 /dev/sdd1 /dev/sdg2 /dev/sdg1
host-040: vgcreate --shared mirror_sanity /dev/sdf2 /dev/sdf1 /dev/sda2 /dev/sda1 /dev/sdh2 /dev/sdh1 /dev/sdd2 /dev/sdd1 /dev/sdg2 /dev/sdg1
host-040: vgchange --lock-start mirror_sanity
host-041: vgchange --lock-start mirror_sanity
host-042: vgchange --lock-start mirror_sanity

============================================================
Iteration 1 of 1 started at Mon Oct 16 17:43:28 CDT 2017
============================================================
SCENARIO - [create_inuse_minor_mirror]
Create a mirror and then attempt to reuse it's minor num on a new mirror
Creating mirror with rand minor num 145
  WARNING: Ignoring supplied major number 253 - kernel assigns major numbers dynamically. Using major number 253 instead.
Attempt to create mirror with in use minor num 145
  WARNING: Ignoring supplied major number 253 - kernel assigns major numbers dynamically. Using major number 253 instead.
  The requested major:minor pair (253:145) is already used.
  Failed to activate new LV.
Deactivating mirror inuse_minorA...
and removing

# thin pools
host-040: pvcreate /dev/sdf1 /dev/sda1 /dev/sdh1 /dev/sdd1 /dev/sdg1
host-040: vgcreate --shared snapper_thinp /dev/sdf1 /dev/sda1 /dev/sdh1 /dev/sdd1 /dev/sdg1
host-040: vgchange --lock-start snapper_thinp
host-041: vgchange --lock-start snapper_thinp
host-042: vgchange --lock-start snapper_thinp

============================================================
Iteration 1 of 1 started at Mon Oct 16 17:46:40 CDT 2017
============================================================
SCENARIO - [create_inuse_minor_thin_snap]
Create a snapshot and then attempt to reuse it's minor num on a new snapshot
Making pool volume
lvcreate --activate ey --thinpool POOL -L 1G --profile thin-performance --zero n --poolmetadatasize 4M snapper_thinp
There should be no "stripesize" messages in pool create output (possible regression of bug 1382860)
  Using default stripesize 64.00 KiB.
  Thin pool volume with chunk size 64.00 KiB can address at most 15.81 TiB of data.
  Logical volume "POOL" created.
Making origin volume
lvcreate --activate ey --virtualsize 1G -T snapper_thinp/POOL -n origin
lvcreate --activate ey -V 1G -T snapper_thinp/POOL -n other1
  WARNING: Sum of all thin volume sizes (2.00 GiB) exceeds the size of thin pool snapper_thinp/POOL (1.00 GiB).
lvcreate --activate ey -V 1G -T snapper_thinp/POOL -n other2
  WARNING: Sum of all thin volume sizes (3.00 GiB) exceeds the size of thin pool snapper_thinp/POOL (1.00 GiB).
lvcreate --activate ey -V 1G -T snapper_thinp/POOL -n other3
  WARNING: Sum of all thin volume sizes (4.00 GiB) exceeds the size of thin pool snapper_thinp/POOL (1.00 GiB).
lvcreate --activate ey --virtualsize 1G -T snapper_thinp/POOL -n other4
  WARNING: Sum of all thin volume sizes (5.00 GiB) exceeds the size of thin pool snapper_thinp/POOL (1.00 GiB).
lvcreate --activate ey -V 1G -T snapper_thinp/POOL -n other5
  WARNING: Sum of all thin volume sizes (6.00 GiB) exceeds the size of thin pool snapper_thinp/POOL (1.00 GiB).
Creating snapshot with rand minor num 102
lvcreate --activate ey -k n -s /dev/snapper_thinp/origin -n inuse_minorA -My --major 253 --minor 102
  WARNING: Ignoring supplied major number 253 - kernel assigns major numbers dynamically. Using major number 253 instead.
  WARNING: Sum of all thin volume sizes (7.00 GiB) exceeds the size of thin pool snapper_thinp/POOL (1.00 GiB).
Attempt to create snapshot with in use minor num 102
Removing thin origin and other virtual thin volumes
Removing pool snapper_thinp/POOL

# raids
host-040: pvcreate /dev/sdf2 /dev/sdf1 /dev/sda2 /dev/sda1 /dev/sdh2 /dev/sdh1 /dev/sdd2 /dev/sdd1 /dev/sdg2 /dev/sdg1
host-040: vgcreate --shared raid_sanity /dev/sdf2 /dev/sdf1 /dev/sda2 /dev/sda1 /dev/sdh2 /dev/sdh1 /dev/sdd2 /dev/sdd1 /dev/sdg2 /dev/sdg1
host-040: vgchange --lock-start raid_sanity
host-041: vgchange --lock-start raid_sanity
host-042: vgchange --lock-start raid_sanity

============================================================
Iteration 1 of 1 started at Mon Oct 16 17:50:09 CDT 2017
============================================================
SCENARIO (raid1) - [create_inuse_minor_raid]
Create a raid and then attempt to reuse it's minor num on a new raid
Creating raid with rand minor num 104
  WARNING: Ignoring supplied major number 253 - kernel assigns major numbers dynamically. Using major number 253 instead.
Attempt to create raid with in use minor num 104
  WARNING: Ignoring supplied major number 253 - kernel assigns major numbers dynamically. Using major number 253 instead.
  The requested major:minor pair (253:104) is already used.
  Failed to activate new LV.
perform raid scrubbing (lvchange --syncaction check) on raid raid_sanity/inuse_minorA
raid_sanity/inuse_minorA state is currently "resync". Unable to switch to "check".
Waiting until all mirror|raid volumes become fully syncd...
  1/1 mirror(s) are fully synced: ( 100.00% )
Sleeping 15 sec
Deactivating raid inuse_minorA... and removing

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:0853