Description of problem:

Current chain: xxx < 20d6814c < 7a5d99a4 (top)
Live Merge result: xxx < 20d6814c (top)

Problem: 20d6814c VOLTYPE is set to INTERNAL instead of LEAF.

Checking the logs, it seems this is the result of deleteVolume on the SPM failing to do [1]:

2019-03-22 07:24:32,251+1300 INFO (jsonrpc/1) [vdsm.api] START deleteVolume(sdUUID=u'0b0107b2-9d2a-4871-b4d6-76bdffba84cd', spUUID=u'58117215-0261-0167-0311-0000000002b6', imgUUID=u'778f6bf5-77e8-4f75-a41d-a1d5341f0c3d', volumes=[u'7a5d99a4-ad44-41f1-94cd-1d67b3d8c13c'], postZero=u'false', force=u'false', discard=False) from=::ffff:10.1.224.10,59366, flow_id=dcdde73b-3b1f-4259-89c1-3b63ff21a67a, task_id=478b1e32-aea5-4710-850a-8b525e5c165f (api:46)

2019-03-22 07:24:56,108+1300 WARN (tasks/2) [storage.Volume] cannot finalize parent volume 20d6814c-1234-480d-8128-17fee729927d (blockVolume:625)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/blockVolume.py", line 617, in delete
    self.setParent(sc.BLANK_UUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line 1494, in setParent
    self.setParentMeta(puuid)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/blockVolume.py", line 756, in setParentMeta
    return self._manifest.setParentMeta(puuid)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/blockVolume.py", line 270, in setParentMeta
    self.setMetaParam(sc.PUUID, puuid)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line 327, in setMetaParam
    meta = self.getMetadata()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/blockVolume.py", line 100, in getMetadata
    metaId = self.getMetadataId()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/blockVolume.py", line 81, in getMetadataId
    return (self.sdUUID, self.getMetaOffset())
  File "/usr/lib/python2.7/site-packages/vdsm/storage/blockVolume.py", line 85, in getMetaOffset
    md = getVolumeTag(self.sdUUID, self.volUUID, sc.TAG_PREFIX_MD)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/blockVolume.py", line 773, in getVolumeTag
    tags = lvm.getLV(sdUUID, volUUID).tags
  File "/usr/lib/python2.7/site-packages/vdsm/storage/lvm.py", line 970, in getLV
    raise se.LogicalVolumeDoesNotExistError("%s/%s" % (vgName, lvName))
LogicalVolumeDoesNotExistError: Logical volume does not exist: (u'0b0107b2-9d2a-4871-b4d6-76bdffba84cd/7a5d99a4-ad44-41f1-94cd-1d67b3d8c13c',)

This happened because the "lvm lvs" preceding it failed, and it failed with a very weird error:

2019-03-22 07:24:56,107+1300 WARN (tasks/2) [storage.LVM] lvm lvs failed: 5 [] [' Metadata on /dev/mapper/36000d31000e8a600000000000000000c at 27487763185152 has wrong VG name "= [', '"pv1", 167317', ']', '}', 'segment9 {', 'start_extent = 88', 'extent_count = 8', '', 'type = "striped"', 'stripe_count = 1', '', 'stripes = [', '"pv1", 16751" expected 0b0107b2-9d2a-4871-b4d6-76bdffba84cd.', ' Volume group "0b0107b2-9d2a-4871-b4d6-76bdffba84cd" not found', ' Cannot process volume group 0b0107b2-9d2a-4871-b4d6-76bdffba84cd'] (lvm:431)

2019-03-22 07:24:56,107+1300 WARN (tasks/2) [storage.LVM] lv: 7a5d99a4-ad44-41f1-94cd-1d67b3d8c13c not found in lvs vg: 0b0107b2-9d2a-4871-b4d6-76bdffba84cd response (lvm:627)

So the VG name was "= [', '"pv1", 167317', ']', '}', 'segm....." - actually a random piece of LVM metadata on that PV.

* The LVM metadata for this SD is 1.6MB (1300 LVs, high fragmentation of LV segments)
* Several snapshots were created/deleted around these times (Commvault)

Unlike BZ1553133, here the VG is not getting corrupted; it just fails to read the metadata properly for short periods of time. It looks like there is some improper locking of LVM commands, or the metadata is being read while something else is still writing it.

[1] https://github.com/oVirt/vdsm/blob/master/lib/vdsm/storage/blockVolume.py#L616

Version-Release number of selected component (if applicable):
vdsm-4.20.46-1.el7ev.x86_64
kernel-3.10.0-957.1.3.el7.x86_64
lvm2-2.02.180-10.el7_6.2.x86_64

How reproducible:
Not yet
This also seems to have caused the metadata not being written at all for a newly created volume; I'll update with more details tomorrow once I finish analyzing that one.
(In reply to Germano Veit Michel from comment #0)
> Differently to BZ1553133, here the VG is not getting corrupted, it just
> fails to properly read the metadata for short periods of time.

Not entirely sure, actually. But here the seqno of the metadata is sequential; there are no jumps. Still, there are not that many copies in the first 128MB, as each one is 1.6MB.
Nir, this seems like the same root cause as in bug 1553133, although the outcome is a bit different (no corruption). Since the solution changes the locking mechanism, it should fix both, right?
This looks like the temporary random failures reported by QE.

David, is this a result of lvm on another host trying to "fix" the metadata during lvchange --refresh or another command that should not change the metadata?
(In reply to Nir Soffer from comment #6)
> This seems like like the temporary random failures reported by qe.
>
> David, is this a result of lvm on another host, trying to "fix" the metadata
> during lvchange --refresh or other command that should not change the
> metadata?

While one host is writing VG metadata (as a normal change, not fixing anything), if another host tries to read the VG (e.g. lvs) at the same time, the lvs may see bad VG metadata and report errors. The locking_type 4 change from bug 1553133 should prevent the 'lvs' from "fixing" any transient problem it sees.

It shouldn't be difficult to test this with lvm directly. On one host run a loop that does repeated 'lvcreate' in the VG, and on the second host run a loop that does repeated 'lvs' in the same VG. I expect you may see the 'lvs' report errors periodically, especially when the VG metadata grows large.
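The two-host test suggested above could be sketched roughly like this (a sketch only, not a tested reproducer: the VG name "testvg", the LV size, and the iteration counts are placeholders, and the `run` parameter exists only to make the helpers testable without real LVM):

```python
import subprocess


def churn_metadata(vg, count, run=subprocess.run):
    """Host A: repeatedly create and remove an LV, rewriting VG metadata."""
    for i in range(count):
        lv = "stress_%d" % i
        run(["lvcreate", "--yes", "-L", "128m", "-n", lv, vg], check=True)
        run(["lvremove", "--yes", "%s/%s" % (vg, lv)], check=True)


def scan_vg(vg, iterations, run=subprocess.run):
    """Host B: repeatedly scan the VG, collecting anything lvm prints to
    stderr (e.g. "wrong VG name" or seqno mismatch warnings)."""
    warnings = []
    for _ in range(iterations):
        p = run(["lvs", vg], stdout=subprocess.PIPE,
                stderr=subprocess.PIPE, universal_newlines=True)
        if p.stderr:
            warnings.append(p.stderr.strip())
    return warnings
```

Running `churn_metadata` on the SPM-like host and `scan_vg` on the other should surface the periodic read errors David expects, especially once the VG metadata grows large.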
(In reply to David Teigland from comment #7)
> (In reply to Nir Soffer from comment #6)

In this case we see an error in an lvs command on the SPM - the only host adding, removing or extending LVs.

Is it possible that we had:

1. host A: create/extend/delete/change tags -> writing new metadata
2. host B: run lvs/vgs/pvs/lvchange --refresh -> try to fix inconsistent metadata
3. host A: run lvs -> see inconsistent metadata from bad fix by host B, fix it again

Steps 1 and 2 may happen in parallel.
> Is it possible that we had:
>
> 1. host A: create/extend/delete/change tags -> writing new metadata
> 2. host B: run lvs/vgs/pvs/lvchange --refresh -> try fix inconsistent
> metadata
> 3. host A: run lvs -> see inconsistent metadata from bad fix by host B, fix
> it again
>
> Step 1 and 2 may happen in parallel.

If host B was able to modify the metadata, then host A could report errors and try to fix things again. If the metadata buffer is large enough, you might be able to see evidence of those changes from previous copies of the metadata in the buffer.
David, do we have a way to detect when LVM tries to fix inconsistent metadata? Can we enable specific debug logs that will reveal when this happens?

We already collect anything LVM writes to stderr. Currently this goes to a debug log that is disabled by default, but we can change it to log warnings if that can help diagnose such issues.
Is the code not using the locking_type 4 change from bug 1553133, which should prevent the unwanted repair? If not, then you would probably see this:

  WARNING: Inconsistent metadata found for VG %s - updating to use version %u
(In reply to David Teigland from comment #11)
> Is the code not using the locking_type 4 change from bug 1553133 which
> should prevent the unwanted repair?

Not yet; integrating locking_type 4 is not that easy.

> If not, then you would probably see this:
> WARNING: Inconsistent metadata found for VG %s - updating to use version %u"

Sounds good, I'll change the logging to log lvm stderr as warnings.
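The proposed logging change could look roughly like this (a sketch only, not vdsm's actual code; the function name and logger name are made up). The point is that anything LVM prints to stderr is promoted from a disabled-by-default debug log to a warning, so "Inconsistent metadata found" messages become visible with default log levels:

```python
import logging
import subprocess

log = logging.getLogger("storage.LVM")


def run_lvm(args):
    """Run an lvm command; log anything it prints to stderr as a warning."""
    p = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                         universal_newlines=True)
    out, err = p.communicate()
    if p.returncode != 0:
        # Failures already surface; include stderr for diagnosis.
        log.warning("Command %s failed rc=%s err=%r", args, p.returncode, err)
    elif err:
        # Command succeeded, but lvm complained - this is the interesting case.
        log.warning("Command %s succeeded with warnings: %r", args, err)
    return p.returncode, out
```

With this in place, a succeeding lvs/lvchange that tried to "fix" metadata would leave a "succeeded with warnings" line in the vdsm log, which is exactly the evidence needed here.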
I think the attached patch is the best we can do for now. With the improved log, we will see warnings when LVM tries to fix metadata.

I think we should deliver the improved log in 4.3.5, and reopen this bug when we have more data from users.

Avihai, I think your team filed one or more bugs about the same issue, which may be duplicates of this bug.
Possible duplicate: bug 1637405
(In reply to Nir Soffer from comment #13)
> I think the attached patch is the best we can do for now. With the improved
> log, we will warnings when lvm try to fix metadata.
>
> I think should deliver the improved log in 4.3.5, and reopen this bug when
> we have more data from users.
>
> Avihai, I think your team filed one or more bugs about the same issue, which
> may be a duplicate of this bug.

In the bug you mentioned, the deleteVolume VDSM error looks different, so I'm not sure about its similarity:

2019-05-04 21:55:30,237+0300 ERROR (tasks/7) [storage.TaskManager.Task] (Task='3f8954bd-093f-4dff-a69e-266965267a33') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
    return fn(*args, **kargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 336, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sp.py", line 1967, in deleteVolume
    vol.delete(postZero=postZero, force=force, discard=discard)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/blockVolume.py", line 588, in delete
    self.validateDelete()

However, bug 1666795, which was resolved in 4.3.2, shows similar errors in the VDSM log (VolumeDoesNotExist); maybe that one is more related?
(In reply to Avihai from comment #16)

The error in the vdsm log does not matter; it is a result of the actual issue, not the root cause.

When we run the automation tests again with the fix, we may get warnings from LVM showing the root cause. The expected result is:

1. A random flow fails on the SPM, trying to access LVM, with a message about bad LVM metadata, like the one from comment 0:

   WARN (tasks/2) [storage.LVM] lvm lvs failed: 5 [] [' Metadata on /dev/mapper/36000d31000e8a600000000000000000c at 27487763185152 has wrong VG name "= [', '"pv1", 167317', ']', '}', 'segment9 {',...

2. At the same time, on another host, a WARN log message with:

   WARN [storage.LVM] Command [...] succeeded with warnings: [' WARNING: Inconsistent metadata found for VG xxx-yyy-zzz - updating to use version yy']

In this case we know that an LVM command running on that host tried to fix the VG metadata, which may corrupt the VG metadata - bug 1553133.
(In reply to Nir Soffer from comment #18)
> (In reply to Avihai from comment #16)
>
> The error in vdsm log does not matter. It is a result of the actual issue,
> not the root cause.
>
> When we run the automation tests again with the fix, we may get warnings
> from LVM showing the root cause.
>
> The expected result is:
>
> 1. Random flow fail on the SPM, trying to access LVM with a message about
> bad LVM metadata, like the one from comment 0:
>
> WARN (tasks/2) [storage.LVM] lvm lvs failed: 5 [] [' Metadata on
> /dev/mapper/36000d31000e8a600000000000000000c at 27487763185152 has
> wrong VG name "= [', '"pv1", 167317', ']', '}', 'segment9 {',...
>
> 2. At the same time, on another host, WARN log message with:
>
> WARN [storage.LVM] Command [...] succeeded with warnings: [' WARNING:
> Inconsistent metadata found for VG xxx-yyy-zzz - updating to use version
> yy']
>
> In this case we know that LVM command running on that host tried to fix
> the VG metadata, which may corrupt the VG metadata - bug 1553133.

Moved the discussion to the 4.3 clone, bug 1693075; please update all info there. Also see my comment about retargeting this bug to 4.3.6 in that bug.
David, we see these LVM warnings in oVirt system tests:

2019-07-04 12:11:45,478-0400 WARN (jsonrpc/5) [storage.LVM] Command ['/sbin/lvm', 'lvs', '--config', 'devices { preferred_names=["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter=["a|^/dev/mapper/3600140525911f82f9ce4e34b70c73bc2$|^/dev/mapper/3600140541592ee797ff458aba08a28a4$|", "r|.*|"] } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 use_lvmetad=0 } backup { retain_min=50 retain_days=0 }', '--noheadings', '--units', 'b', '--nosuffix', '--separator', '|', '--ignoreskippedcluster', '-o', 'uuid,name,vg_name,attr,size,seg_start_pe,devices,tags', '05a08319-1d06-41a1-a7ad-fccd5fa3768e'] succeeded with warnings: [' WARNING: Cannot find matching striped segment for 05a08319-1d06-41a1-a7ad-fccd5fa3768e/97fec0b7-b94c-46d1-a47d-494b4f0ccdcc.'] (lvm:391)

This warning was repeated 35 times on a host which is not the SPM. What does this warning mean?
I think it means that this host has an active LV that has been modified (e.g. extended) from another host, but has not been refreshed on this host. So, lvm prints this warning when it sees that the metadata for the LV does not match the LV active in device mapper.
(In reply to David Teigland from comment #21)
> That think it means that this host has an active LV that has been modified
> (e.g. extended) from another host, but has not been refreshed on this host.
> So, lvm prints this warning when it seems the metadata for the LV does not
> match the LV active in device mapper.

This may happen during the automatic extend flow.

We see these warnings now in our system tests (49 warnings per run):

2019-07-12 15:18:34,034-0400 WARN (jsonrpc/7) [storage.LVM] Command ['/sbin/lvm', 'lvchange', '--config', 'devices { preferred_names=["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter=["a|^/dev/mapper/3600140510698bd4b8b84143b0f7dbbc7$|^/dev/mapper/360014053544df693e4b4f63a6ba4a428$|", "r|.*|"] } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 use_lvmetad=0 } backup { retain_min=50 retain_days=0 }', '--autobackup', 'n', '-a', 'y', '--deltag', u'IU_651741de-1549-4659-88bb-37d117044df9', '--addtag', u'IU__remove_me_651741de-1549-4659-88bb-37d117044df9', '91047f09-f2d7-4453-8e11-483fed09556d/fdd994be-e45e-43b2-a614-9b9626cb0aef'] succeeded with warnings: [' Scan of VG 91047f09-f2d7-4453-8e11-483fed09556d from /dev/mapper/360014053544df693e4b4f63a6ba4a428 found metadata seqno 72 vs previous 71.', ' Scan of VG 91047f09-f2d7-4453-8e11-483fed09556d from /dev/mapper/360014053544df693e4b4f63a6ba4a428 found metadata seqno 72 vs previous 71.'] (lvm:391)

This should run only on the SPM host, so I wonder how lvm can see inconsistent metadata if this is the only host changing the VG/LVs? Does lvm try to fix the metadata in this flow?
This is the complete flow when this error happens:

2019-07-12 15:11:56,157-0400 DEBUG (jsonrpc/3) [jsonrpc.JsonRpcServer] Calling 'Image.delete' in bridge with {u'force': u'false', u'imageID': u'cba6f922-1def-40b7-b197-c65a1088847d', u'postZero': u'false', u'storagepoolID': u'5eddd851-93f5-482d-b0c7-7245d98e66ea', u'storagedomainID': u'91047f09-f2d7-4453-8e11-483fed09556d', u'discard': True} (__init__:329)

2019-07-12 15:11:56,174-0400 INFO (jsonrpc/3) [vdsm.api] START deleteImage(sdUUID=u'91047f09-f2d7-4453-8e11-483fed09556d', spUUID=u'5eddd851-93f5-482d-b0c7-7245d98e66ea', imgUUID=u'cba6f922-1def-40b7-b197-c65a1088847d', postZero=u'false', force=u'false', discard=True) from=::ffff:192.168.201.4,59046, flow_id=7240e365, task_id=457a38dc-b1a5-4735-816f-65e6840ada4a (api:48)

2019-07-12 15:11:56,543-0400 WARN (jsonrpc/3) [storage.LVM] Command ['/sbin/lvm', 'lvchange', '--config', 'devices { preferred_names=["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter=["a|^/dev/mapper/3600140510698bd4b8b84143b0f7dbbc7$|^/dev/mapper/360014053544df693e4b4f63a6ba4a428$|", "r|.*|"] } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 use_lvmetad=0 } backup { retain_min=50 retain_days=0 }', '--autobackup', 'n', '-a', 'y', '--deltag', u'IU_cba6f922-1def-40b7-b197-c65a1088847d', '--addtag', u'IU__remove_me_cba6f922-1def-40b7-b197-c65a1088847d', '91047f09-f2d7-4453-8e11-483fed09556d/034748fe-3c34-4d05-88d4-00a2a3856513'] succeeded with warnings: [' Scan of VG 91047f09-f2d7-4453-8e11-483fed09556d from /dev/mapper/360014053544df693e4b4f63a6ba4a428 found metadata seqno 31 vs previous 30.', ' Scan of VG 91047f09-f2d7-4453-8e11-483fed09556d from /dev/mapper/360014053544df693e4b4f63a6ba4a428 found metadata seqno 31 vs previous 30.'] (lvm:391)

2019-07-12 15:11:56,567-0400 INFO (jsonrpc/3) [vdsm.api] FINISH deleteImage return=None from=::ffff:192.168.201.4,59046, flow_id=7240e365, task_id=457a38dc-b1a5-4735-816f-65e6840ada4a (api:54)

This flow runs only on the SPM. Tested with latest CentOS 7.6.
On the other host in this setup, we see these LVM commands related to this VG before this warning on the SPM host:

2019-07-12 15:02:30,610-0400 INFO (jsonrpc/4) [storage.LVM] Activating lvs: vg=91047f09-f2d7-4453-8e11-483fed09556d lvs=['metadata'] (lvm:1448)
2019-07-12 15:02:30,904-0400 INFO (jsonrpc/4) [storage.LVM] Activating lvs: vg=91047f09-f2d7-4453-8e11-483fed09556d lvs=['ids'] (lvm:1448)
2019-07-12 15:02:31,092-0400 INFO (jsonrpc/4) [storage.LVM] Activating lvs: vg=91047f09-f2d7-4453-8e11-483fed09556d lvs=['leases'] (lvm:1448)
2019-07-12 15:02:31,452-0400 INFO (jsonrpc/4) [storage.LVM] Activating lvs: vg=91047f09-f2d7-4453-8e11-483fed09556d lvs=['inbox', 'outbox', 'xleases', 'master'] (lvm:1448)
2019-07-12 15:02:39,893-0400 INFO (jsonrpc/2) [storage.LVM] Activating lvs: vg=91047f09-f2d7-4453-8e11-483fed09556d lvs=['c8ab8b74-dfe7-4306-9924-bc970d4a1074'] (lvm:1448)
2019-07-12 15:02:40,718-0400 INFO (jsonrpc/4) [storage.LVM] Deactivating lvs: vg=91047f09-f2d7-4453-8e11-483fed09556d lvs=['c8ab8b74-dfe7-4306-9924-bc970d4a1074'] (lvm:1456)
2019-07-12 15:02:41,415-0400 INFO (jsonrpc/0) [storage.LVM] Activating lvs: vg=91047f09-f2d7-4453-8e11-483fed09556d lvs=['122328ea-bf51-43a6-8b10-fb0800aaa832'] (lvm:1448)
2019-07-12 15:02:42,159-0400 INFO (jsonrpc/3) [storage.LVM] Deactivating lvs: vg=91047f09-f2d7-4453-8e11-483fed09556d lvs=['122328ea-bf51-43a6-8b10-fb0800aaa832'] (lvm:1456)
2019-07-12 15:04:13,009-0400 INFO (jsonrpc/0) [storage.LVM] Activating lvs: vg=91047f09-f2d7-4453-8e11-483fed09556d lvs=['c8ab8b74-dfe7-4306-9924-bc970d4a1074'] (lvm:1448)
2019-07-12 15:04:14,238-0400 INFO (jsonrpc/5) [storage.LVM] Deactivating lvs: vg=91047f09-f2d7-4453-8e11-483fed09556d lvs=['c8ab8b74-dfe7-4306-9924-bc970d4a1074'] (lvm:1456)
2019-07-12 15:04:38,777-0400 INFO (jsonrpc/5) [storage.LVM] Activating lvs: vg=91047f09-f2d7-4453-8e11-483fed09556d lvs=['122328ea-bf51-43a6-8b10-fb0800aaa832'] (lvm:1448)
2019-07-12 15:04:40,551-0400 INFO (jsonrpc/4) [storage.LVM] Deactivating lvs: vg=91047f09-f2d7-4453-8e11-483fed09556d lvs=['122328ea-bf51-43a6-8b10-fb0800aaa832'] (lvm:1456)
2019-07-12 15:08:37,610-0400 INFO (jsonrpc/4) [storage.LVM] Activating lvs: vg=91047f09-f2d7-4453-8e11-483fed09556d lvs=['b798074f-c891-459f-aafe-fb54ce6d0134', 'c8ab8b74-dfe7-4306-9924-bc970d4a1074'] (lvm:1448)
2019-07-12 15:08:38,313-0400 INFO (jsonrpc/0) [storage.LVM] Deactivating lvs: vg=91047f09-f2d7-4453-8e11-483fed09556d lvs=['b798074f-c891-459f-aafe-fb54ce6d0134'] (lvm:1456)
2019-07-12 15:09:40,533-0400 INFO (vm/762f2d6f) [storage.LVM] Activating lvs: vg=91047f09-f2d7-4453-8e11-483fed09556d lvs=['122328ea-bf51-43a6-8b10-fb0800aaa832'] (lvm:1448)
2019-07-12 15:10:34,554-0400 INFO (mailbox-hsm/2) [storage.LVM] Refreshing LVs (vg=91047f09-f2d7-4453-8e11-483fed09556d, lvs=('122328ea-bf51-43a6-8b10-fb0800aaa832',)) (lvm:1472)
2019-07-12 15:11:20,133-0400 INFO (migsrc/762f2d6f) [storage.LVM] Activating lvs: vg=91047f09-f2d7-4453-8e11-483fed09556d lvs=['034748fe-3c34-4d05-88d4-00a2a3856513'] (lvm:1448)
2019-07-12 15:11:34,255-0400 INFO (migsrc/762f2d6f) [storage.LVM] Deactivating lvs: vg=91047f09-f2d7-4453-8e11-483fed09556d lvs=['034748fe-3c34-4d05-88d4-00a2a3856513'] (lvm:1456)
2019-07-12 15:11:34,491-0400 INFO (migsrc/762f2d6f) [storage.LVM] Activating lvs: vg=91047f09-f2d7-4453-8e11-483fed09556d lvs=['89750db5-703b-4da3-a971-a831422c319e'] (lvm:1448)
2019-07-12 15:11:35,045-0400 INFO (migsrc/762f2d6f) [storage.LVM] Deactivating lvs: vg=91047f09-f2d7-4453-8e11-483fed09556d lvs=['89750db5-703b-4da3-a971-a831422c319e'] (lvm:1456)
2019-07-12 15:11:35,397-0400 INFO (jsonrpc/4) [storage.LVM] Deactivating lvs: vg=91047f09-f2d7-4453-8e11-483fed09556d lvs=['122328ea-bf51-43a6-8b10-fb0800aaa832'] (lvm:1456)
2019-07-12 15:11:41,532-0400 INFO (jsonrpc/0) [storage.LVM] Activating lvs: vg=91047f09-f2d7-4453-8e11-483fed09556d lvs=['89750db5-703b-4da3-a971-a831422c319e'] (lvm:1448)
2019-07-12 15:11:42,054-0400 INFO (jsonrpc/0) [storage.LVM] Deactivating lvs: vg=91047f09-f2d7-4453-8e11-483fed09556d lvs=['89750db5-703b-4da3-a971-a831422c319e'] (lvm:1456)
2019-07-12 15:11:42,629-0400 INFO (vm/762f2d6f) [storage.LVM] Activating lvs: vg=91047f09-f2d7-4453-8e11-483fed09556d lvs=['122328ea-bf51-43a6-8b10-fb0800aaa832'] (lvm:1448)
2019-07-12 15:11:44,411-0400 INFO (vm/762f2d6f) [storage.LVM] Activating lvs: vg=91047f09-f2d7-4453-8e11-483fed09556d lvs=['034748fe-3c34-4d05-88d4-00a2a3856513'] (lvm:1448)
2019-07-12 15:11:55,847-0400 INFO (vm/762f2d6f) [storage.LVM] Deactivating lvs: vg=91047f09-f2d7-4453-8e11-483fed09556d lvs=['034748fe-3c34-4d05-88d4-00a2a3856513'] (lvm:1456)

Is it possible that activating or deactivating LVs, or refreshing another LV from other hosts, caused LVM to try to fix the VG metadata without logging any warnings?
"found metadata seqno 31 vs previous 30" - these messages usually mean that the metadata has changed between scanning consecutive metadata areas for the VG. This happens during the label scan phase, where VG locks are not held, so it could happen legitimately from time to time if you're reading a VG while it's being written from another host. LVM will not try to repair when it sees this.

I think I should look at complete debug logs of the LVM commands running during one of these events, to check exactly what's happening. It would be best to log lvm commands from all hosts during this event, in case there are any changes being made that we don't expect.
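To collect such logs, the log section of lvm.conf can direct full command debug output to a file. A minimal sketch (the file path is an example; level 7 is lvm's most verbose debug level, and the same settings can be passed per command via --config):

```
log {
    file = "/var/log/lvm2-debug.log"   # append command debug output here
    level = 7                          # most verbose debug level
    activation = 1                     # include activation details
}
```

Enabling this on all hosts around one of these events would show exactly which host wrote or rewrote the VG metadata and whether any command attempted a repair.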
Hi Nir,

Lots of work has been done on this bug so far, but no fix yet. With the work done on bug 1553133, could there be any progress on this BZ as well?
This looks like a duplicate of bug 1553133. *** This bug has been marked as a duplicate of bug 1553133 ***
Removing stale need infos.