Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.
RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.

Bug 1691826

Summary: fix concurrent lvmetad initializations
Product: Red Hat Enterprise Linux 7 Reporter: David Teigland <teigland>
Component: lvm2Assignee: David Teigland <teigland>
lvm2 sub component: LVM Metadata / lvmetad QA Contact: cluster-qe <cluster-qe>
Status: CLOSED DUPLICATE Docs Contact:
Severity: unspecified    
Priority: unspecified CC: agk, heinzm, jbrassow, jonathon.parrish, mcsontos, msnitzer, pdwyer, prajnoha, rhandlin, zkabelac
Version: 7.7   
Target Milestone: rc   
Target Release: 7.7   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-04-05 14:17:36 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description David Teigland 2019-03-22 15:32:56 UTC
Description of problem:

This is copying fedora bug 1672062.


The first pvscan[a] is for the nvme dev, and the second pvscan[b] is for the md0 dev.

Both pvscan's believe that they are doing lvmetad initialization (this is a bug we figured out for a different bz today.)

pvscan[a] gets the list of devices which does not include md0 because md0 hasn't appeared yet.
pvscan[a] begins the device scan based on the list of devs without md0,
pvscan[a] is delayed a bit (looks like udev db delays), but this delay isn't essential to the bug.
pvscan[a] is paused at this point while pvscan[b] runs...

pvscan[b] gets the list of devices including md0.
pvscan[b] does the device scan including both PVs and VGs.
pvscan[b] begins lvmetad initialization including md0 and finishes the update, so lvmetad correctly holds both vgs.
pvscan[b] pauses here (before the activation phase) and pvscan[a] runs...

pvscan[a] does the lvmetad initialization *without* md0 which did not exist when it scanned devs.

lvmetad has now been updated from pvscan[a] without md0!

pvscan[b] now runs the activation phase based on the two VGs it saw above.
pvscan[b] asks lvmetad for information about vg01 on md0, but lvmetad no longer knows about vg01 or md0.
pvscan[b] gets an error from lvmetad, and lvmetad logs this as an error:

Feb 12 20:41:13 myhostmydom.local lvmetad[849]: vg_lookup vgid 2FB2CO-8aRt-bY8B-rgQQ-A8YO-vMMa-wWdDTP name vg01 found incomple
Feb 12 20:41:13 myhostmydom.local lvm[1165]: #cache/lvmetad.c:986           Asking lvmetad for VG 2FB2CO-8aRt-bY8B-rgQQ-A8YO-v
Feb 12 20:41:13 myhostmydom.local lvm[1165]: #cache/lvmetad.c:785         Request to lookup VG 2FB2CO-8aRt-bY8B-rgQQ-A8YO-vMMa

pvscan[b] cannot get info for vg01 from lvmetad so it quits without activating vg01.  (But pvscan[b] does activate the other v

Also, lvmetad is left in the incorrect state.

As mentioned above, the other bug I sorted out earlier today causes multiple pvscan's to believe that they should do lvmetad i

Analysis of the bug causing multiple pvscans to believe they are doing lvmetad initialization, based on debug logs from Peter.

Because the pvscans run from systemd, are all run concurrently, each of them thinks that it is doing the first initialization
VGs.  The code that should serialize them isn't working because of timing issues (a combination of scanning taking a long time

Each pvscan does:

1. lvmetad_token_matches() to check the lvmetad state which detects
   if initialization is needed, and is supposed to wait for another
   pvscan that's doing initialization.

   a. this logs "Sending lvmetad get_global_info"
   b. see if another pvscan is updating lvmetad, and if so wait for
      that other pvscan to finish before continuing.
      (this is what we want to happen, but it's not)
   c. in your logs, each pvscan comes back seeing lvmetad is uninitialized
      and is not being updated, so each one believes it needs to do
      initialization

2. lvmetad_pvscan_all_devs(cmd, 0) to scan devs and populate lvmetad.

   a. this starts by scanning all devs (with the new parallel/aio)
   b. then it calls _token_update() to tell lvmetad it is updating state
   c. this logs "Sending lvmetad token_update update in progress"
   d. any new pvscan that starts *now* will wait for this one to finish,
      but any other pvscan already running will be doing these same
      initialization steps (sadly they are already running in this case)
   e. it sends lvmetad info about each PV that it saw from 2.a
   f. it calls _token_update() to tell lvmetad that it's done populating
      the cache, and sets an actual token.

The timing of things is defeating the serialization logic:

- Step 2a is taking over 10 seconds.

- The scanning in 2a is taking place *before* pvscan sets
  the "update in progress" token in 2c.  Because the scan takes so
  long, it gives all the pvscans a chance to start and think
  that init is required.  Before the new aio scan, I believe
  that "update in progress" would be set, *then* each device
  would be read serially, and as each was read, the info for that
  PV was sent to lvmetad.

My first idea for fixing this is to set the "update in progress" token
*before* scanning all the devs.


The timing of those two lvmetad serialization steps from your logs:

Feb 12 13:55:28.497284 fed.virt lvm[950]: #cache/lvmetad.c:257 Sending lvmetad get_global_info

Feb 12 13:55:28.714728 fed.virt lvm[972]: #cache/lvmetad.c:257 Sending lvmetad get_global_info

Feb 12 13:55:29.077168 fed.virt lvm[979]: #cache/lvmetad.c:257 Sending lvmetad get_global_info

Feb 12 13:55:29.548005 fed.virt lvm[991]: #cache/lvmetad.c:257 Sending lvmetad get_global_info

Feb 12 13:55:29.210852 fed.virt lvm[978]: #cache/lvmetad.c:257 Sending lvmetad get_global_info

(all pvscans now running, none have set "update in progress", so all think
they need to do initialization)

10+ seconds later, all are done scanning devices, and they all begin to
populate lvmetad in steps 2c - 2f above.

Feb 12 13:55:40.698830 fed.virt lvm[950]: #cache/lvmetad.c:559 Sending lvmetad token_update update in progress

Feb 12 13:55:40.784047 fed.virt lvm[972]: #cache/lvmetad.c:559 Sending lvmetad token_update update in progress

Feb 12 13:55:46.955881 fed.virt lvm[979]: #cache/lvmetad.c:559 Sending lvmetad token_update update in progress

Feb 12 13:55:47.952818 fed.virt lvm[991]: #cache/lvmetad.c:559 Sending lvmetad token_update update in progress

Feb 12 13:55:47.948535 fed.virt lvm[978]: #cache/lvmetad.c:559 Sending lvmetad token_update update in progress


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 2 Jonathan Earl Brassow 2019-04-04 20:46:34 UTC
a duplicate of bug 1672336 ?

If so, please close as such.  We will use that one, since it has user comments.

Comment 3 David Teigland 2019-04-05 14:17:36 UTC

*** This bug has been marked as a duplicate of bug 1672336 ***