Bug 1647167

Summary: vg_lookup vgid name found incomplete mapping
Product: Red Hat Enterprise Linux 7 Reporter: Corey Marthaler <cmarthal>
Component: lvm2 Assignee: David Teigland <teigland>
lvm2 sub component: LVM lock daemon / lvmlockd QA Contact: cluster-qe <cluster-qe>
Status: CLOSED WONTFIX Docs Contact:
Severity: low    
Priority: unspecified CC: agk, heinzm, jbrassow, pasik, prajnoha, sbradley, teigland, zkabelac
Version: 7.6   
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-11-15 19:29:45 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Corey Marthaler 2018-11-06 19:10:48 UTC
Description of problem:
During a test that locks VG metadata intensively, I briefly saw this failure while attempting to get the sync percent of an active mirror. This test has passed on other clusters, so this is probably a hard-to-reproduce timing issue.


Nov  2 05:41:42 mckinley-02 qarshd[63209]: Running cmdline: lvs lock_stress/mckinley-02.28583 --noheadings -o copy_percent | awk {'print $1'}
Nov  2 05:41:42 mckinley-02 lvmetad[4892]: vg_lookup vgid btouMp-IJi0-TLFq-x0EO-G2qH-21dy-jpCZC3 name lock_stress found incomplete mapping uuid none name none

The same command, run again later, succeeded:
[root@mckinley-02 ~]# lvs lock_stress/mckinley-02.28583 --noheadings -o copy_percent 
  100.00  

This test performs many creates, conversions, and similar operations on exclusively active volumes from different machines, all located on the same VG. Here is what was done on this machine before this lvs command failed:

Creating a 5 redundant legged cmirror named mckinley-02.28583
  Logical volume "mckinley-02.28583" created.

Down converting cmirror from 5 legs to 1 on mckinley-02
  Retrying ex lock on VG lock_stress
  Logical volume lock_stress/mckinley-02.28583 converted.
Now... down converting cmirror from  legs to linear on mckinley-02
  Logical volume lock_stress/mckinley-02.28583 converted.

Up converting linear to a 2 redundant legged cmirror on mckinley-02
  Logical volume lock_stress/mckinley-02.28583 being converted.
  lock_stress/mckinley-02.28583: Converted: 0.00%
  lock_stress/mckinley-02.28583: Converted: 100.00%

Nov  2 05:41:42 mckinley-02 qarshd[63209]: Running cmdline: lvs lock_stress/mckinley-02.28583 --noheadings -o copy_percent | awk {'print $1'}
Nov  2 05:41:42 mckinley-02 lvmetad[4892]: vg_lookup vgid btouMp-IJi0-TLFq-x0EO-G2qH-21dy-jpCZC3 name lock_stress found incomplete mapping uuid none name none

  Volume group "lock_stress" not found
  Cannot process volume group lock_stress





Version-Release number of selected component (if applicable):
3.10.0-957.el7.x86_64

lvm2-2.02.180-10.el7_6.2    BUILT: Wed Oct 31 03:55:58 CDT 2018
lvm2-libs-2.02.180-10.el7_6.2    BUILT: Wed Oct 31 03:55:58 CDT 2018
lvm2-cluster-2.02.180-10.el7_6.2    BUILT: Wed Oct 31 03:55:58 CDT 2018
lvm2-lockd-2.02.180-10.el7_6.2    BUILT: Wed Oct 31 03:55:58 CDT 2018
cmirror-2.02.180-10.el7_6.2    BUILT: Wed Oct 31 03:55:58 CDT 2018
device-mapper-1.02.149-10.el7_6.2    BUILT: Wed Oct 31 03:55:58 CDT 2018
device-mapper-libs-1.02.149-10.el7_6.2    BUILT: Wed Oct 31 03:55:58 CDT 2018
device-mapper-event-1.02.149-10.el7_6.2    BUILT: Wed Oct 31 03:55:58 CDT 2018
device-mapper-event-libs-1.02.149-10.el7_6.2    BUILT: Wed Oct 31 03:55:58 CDT 2018


How reproducible:
Just once so far

Comment 3 David Teigland 2018-11-15 19:29:45 UTC
I've seen this a few times in recent years, and it's never really caused any trouble.  It looks like there's some unusual race between one process updating the lvmetad contents and another process reading them.  Given it's rare, and the effect isn't too severe, I don't think it's worth the effort to pick apart lvmetad since it's going away.
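
Because the failing lvs command succeeded when it was run again later, a test harness could probably tolerate this transient failure by retrying. The following is only a minimal sketch under that assumption: `retry` is a hypothetical helper, not part of lvm2 or the QA harness, and the attempt count and delay are arbitrary.

```shell
# Hypothetical helper (an assumption, not part of lvm2 or the QA harness):
# rerun a command until it succeeds or the attempt budget is used up.
# Usage: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Note: plain POSIX sh has no "local", so these variables are global.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while :; do
        # Success on any attempt ends the loop immediately.
        "$@" && return 0
        status=$?
        # Out of attempts: report the last failure's exit status.
        [ "$n" -ge "$attempts" ] && return "$status"
        n=$((n + 1))
        sleep "$delay"
    done
}

# Example (command taken from the report; 3 attempts and a 1 s delay are arbitrary):
#   retry 3 1 lvs lock_stress/mckinley-02.28583 --noheadings -o copy_percent
```

The helper wraps lvs directly rather than the `lvs ... | awk` pipeline, because a pipeline's exit status is normally that of its last command, so a failing lvs would be masked by a successful awk.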