| Summary: | [RFE] better error when unable to grab VG lock on one node | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Corey Marthaler <cmarthal> |
| Component: | lvm2 | Assignee: | LVM and device-mapper development team <lvm-team> |
| lvm2 sub component: | LVM lock daemon / lvmlockd | QA Contact: | cluster-qe <cluster-qe> |
| Status: | CLOSED CURRENTRELEASE | Docs Contact: | |
| Severity: | low | ||
| Priority: | unspecified | CC: | agk, heinzm, jbrassow, prajnoha, teigland, zkabelac |
| Version: | 7.3 | Keywords: | FutureFeature |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2020-09-09 15:09:32 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
|
Description
Corey Marthaler
2016-09-27 16:44:05 UTC
This is much easier to reproduce now that we've added more meta data change intensive testing for 7.5. This test keeps a continuous amount of meta operation commands running on different nodes in the cluster simultaneously, so it's not likely a "real world" type scenario. Sep 28 14:22:29 host-040 qarshd[21990]: Running cmdline: lvconvert --type mirror -m 1 lock_stress/host-040.16366 Up converting linear to a 1 redundant legged mirror on host-040 Retrying ex lock on VG lock_stress Retrying ex lock on VG lock_stress Retrying ex lock on VG lock_stress VG lock_stress lock failed: error -11 couldn't up convert cmirror on host-040 3.10.0-693.2.1.el7.x86_64 lvm2-2.02.171-8.el7 BUILT: Wed Jun 28 13:28:58 CDT 2017 lvm2-libs-2.02.171-8.el7 BUILT: Wed Jun 28 13:28:58 CDT 2017 lvm2-cluster-2.02.171-8.el7 BUILT: Wed Jun 28 13:28:58 CDT 2017 device-mapper-1.02.140-8.el7 BUILT: Wed Jun 28 13:28:58 CDT 2017 device-mapper-libs-1.02.140-8.el7 BUILT: Wed Jun 28 13:28:58 CDT 2017 device-mapper-event-1.02.140-8.el7 BUILT: Wed Jun 28 13:28:58 CDT 2017 device-mapper-event-libs-1.02.140-8.el7 BUILT: Wed Jun 28 13:28:58 CDT 2017 device-mapper-persistent-data-0.7.0-0.1.rc6.el7 BUILT: Mon Mar 27 10:15:46 CDT 2017 cmirror-2.02.171-8.el7 BUILT: Wed Jun 28 13:28:58 CDT 2017 sanlock-3.5.0-1.el7 BUILT: Wed Apr 26 09:37:30 CDT 2017 sanlock-lib-3.5.0-1.el7 BUILT: Wed Apr 26 09:37:30 CDT 2017 lvm2-lockd-2.02.171-8.el7 BUILT: Wed Jun 28 13:28:58 CDT 2017 Some additional lvmpolld messages during the failure...
Creating a 1 redundant legged cmirror named host-034.15783
Logical volume "host-034.15783" created.
Down converting cmirror from 1 legs to linear on host-034
Retrying ex lock on VG lock_stress
Logical volume lock_stress/host-034.15783 converted.
Up converting linear to a 3 redundant legged cmirror on host-034
Retrying ex lock on VG lock_stress
Logical volume lock_stress/host-034.15783 being converted.
lock_stress/host-034.15783: Converted: 64.00%
lock_stress/host-034.15783: Converted: 100.00%
lock_stress/host-034.15783: Converted: 100.00%
lock_stress/host-034.15783: Converted: 100.00%
For more information see lvmpolld messages in syslog or lvmpolld log file.
LVM command executed by lvmpolld failed.
couldn't up convert cmirror on host-034
Oct 5 00:32:49 host-034 qarshd[16606]: Running cmdline: lvs lock_stress/host-034.15763 --noheadings -o copy_percent | awk {'print $1'}
Oct 5 00:32:50 host-034 qarshd[16607]: Running cmdline: lvs lock_stress/host-034.15765 --noheadings -o copy_percent | awk {'print $1'}
Oct 5 00:32:55 host-034 lvmpolld: W: #011LVPOLL: PID 16516: STDERR: ' Retrying ex lock on VG lock_stress'
Oct 5 00:33:01 host-034 lvmpolld: W: #011LVPOLL: PID 16592: STDERR: ' Retrying ex lock on VG lock_stress'
Oct 5 00:33:07 host-034 lvmpolld: W: #011LVPOLL: PID 16516: STDERR: ' Retrying ex lock on VG lock_stress'
Oct 5 00:33:13 host-034 lvmpolld: W: #011LVPOLL: PID 16592: STDERR: ' Retrying ex lock on VG lock_stress'
Oct 5 00:33:19 host-034 lvmpolld: W: #011LVPOLL: PID 16516: STDERR: ' Retrying ex lock on VG lock_stress'
Oct 5 00:33:24 host-034 systemd: Started qarsh Per-Connection Server (10.15.80.218:48300).
Oct 5 00:33:24 host-034 systemd: Starting qarsh Per-Connection Server (10.15.80.218:48300)...
Oct 5 00:33:24 host-034 qarshd[16639]: Talking to peer ::ffff:10.15.80.218:48300 (IPv6)
Oct 5 00:33:24 host-034 qarshd[16639]: Running cmdline: lvconvert --type mirror -m 3 --corelog lock_stress/host-034.15758
Oct 5 00:33:24 host-034 lvmpolld: W: #011LVPOLL: PID 16592: STDERR: ' Retrying ex lock on VG lock_stress'
Oct 5 00:33:31 host-034 lvmpolld[16309]: LVMPOLLD: lvm2 cmd (PID 16516) failed (retcode: 5)
Oct 5 00:33:31 host-034 lvmpolld: W: #011LVPOLL: PID 16516: STDERR: ' VG lock_stress lock failed: error -11'
Oct 5 00:33:31 host-034 lvmpolld: W: #011LVPOLL: PID 16516: STDERR: ABORTING: Can't lock VG for lock_stress/host-034.15783.
Oct 5 00:33:34 host-034 qarshd[16229]: Sending child 16231 signal 2
Oct 5 00:33:34 host-034 qarshd[16639]: Sending child 16640 signal 2
At some point error -11 was changed to "VG %s lock failed: held by other host." |