Bug 1812802 - lvchange --refresh succeeds with various warnings during stress test
Summary: lvchange --refresh succeeds with various warnings during stress test
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: lvm2
Version: 8.2
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: unspecified
Target Milestone: rc
Target Release: 8.0
Assignee: David Teigland
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-03-12 08:33 UTC by David Vaanunu
Modified: 2021-09-07 11:49 UTC
CC List: 10 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-05-26 17:40:32 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments
command debug output (173.17 KB, text/plain), 2020-03-18 20:14 UTC, David Teigland

Description David Vaanunu 2020-03-12 08:33:12 UTC
Description of problem:

WARNING messages appear while running the extend.py script.

Version-Release number of selected component (if applicable):

vdsm-4.40.0-164.git38a19bb.el8ev.x86_64
rhv-release-4.4.0-13-999.noarch
redhat-release-8.2-0.7.el8.x86_64
python2-subprocess32-3.2.6-14.el7.x86_64


How reproducible:

While trying to reproduce bug 1553133 on RHEL 8.2, the extend.py script is run on 2 hosts (create, extend & remove LVs).

High-load test: 5 extends per second (running with the delay flag).

Steps to Reproduce:
1. Copy python script to both hosts from: 
   https://github.com/oVirt/vdsm/blob/master/tests/storage/stress/extend.py

2. Install python2 package on both hosts - python2-subprocess32-3.2.6-14.el7.x86_64

3. On the first host:
   python2 extend.py create-vg <VG_NAME> /dev/mapper/<LUN_ID>
   firewall-cmd --add-port=8000/tcp
   python2 extend.py run-manager -p 8000 <VG_NAME> /dev/mapper/<LUN_ID> 2>1_run-manager.log

4. On the second host:
   python2 extend.py run-regular --extend-delay 10 --iterations 100 -p 8000 <First Host hostname/IP> <VG_NAME> /dev/mapper/<LUN_ID> 2>1_run-regular.log


Actual results:

Get WARNING Messages:
  * fopen failed: No such file or directory
  * Inconsistent metadata found for VG Bug
  * fgets failed: Invalid argument


Expected results:

The LV create, extend, and remove commands run successfully.
No errors or warnings.


Additional info:

*** Test log files

https://drive.google.com/drive/folders/17qInSKdoEX9ysebFNo4KP8Bo8vQ8VBKV
(RHEL82_delay_run-manager.log.xz & RHEL82_delay_run-regular.log.xz)

*** Full WARNING messages:

16 times
--------
2020-03-10 10:02:45,567 WARNING (worker-011) succeeded with warnings cmd=['lvchange', '--config', ' devices {  preferred_names = ["^/dev/mapper/"]  ignore_suspended_devices=1  write_cache_state=0  disable_after_error_count=3  filter=[ "a|^/dev/mapper/3600a098038304437415d4b6a59684474$|", "r|.*|" ] } global {  locking_type=1  prioritise_write_locks=1  wait_for_locks=1  use_lvmetad=0 } backup {  retain_min=50  retain_days=0 } ', '--refresh', 'Bug/worker-011-000000'] out='' err='  /sys/dev/block/253:28/dm/uuid: fopen failed: No such file or directory\n'

6 times
-------
2020-03-10 10:04:04,422 WARNING (worker-014) succeeded with warnings cmd=['lvchange', '--config', ' devices {  preferred_names = ["^/dev/mapper/"]  ignore_suspended_devices=1  write_cache_state=0  disable_after_error_count=3  filter=[ "a|^/dev/mapper/3600a098038304437415d4b6a59684474$|", "r|.*|" ] } global {  locking_type=1  prioritise_write_locks=1  wait_for_locks=1  use_lvmetad=0 } backup {  retain_min=50  retain_days=0 } ', '--refresh', 'Bug/worker-014-000001'] out='' err='  WARNING: ignoring metadata seqno 1566 on /dev/mapper/3600a098038304437415d4b6a59684474 for seqno 1567 on /dev/mapper/3600a098038304437415d4b6a59684474 for VG Bug.\n  WARNING: Inconsistent metadata found for VG Bug.\n'

2 times
-------
2020-03-10 14:30:36,324 WARNING (worker-036) succeeded with warnings cmd=['lvchange', '--config', ' devices {  preferred_names = ["^/dev/mapper/"]  ignore_suspended_devices=1  write_cache_state=0  disable_after_error_count=3  filter=[ "a|^/dev/mapper/3600a098038304437415d4b6a59684474$|", "r|.*|" ] } global {  locking_type=1  prioritise_write_locks=1  wait_for_locks=1  use_lvmetad=0 } backup {  retain_min=50  retain_days=0 } ', '--refresh', 'Bug/worker-036-000078'] out='' err='  /sys/dev/block/253:49/dm/uuid: fgets failed: Invalid argument\n'

Comment 1 Nir Soffer 2020-03-16 18:27:25 UTC
It seems that some warnings have already been eliminated upstream:

commit 957904933b2b0f94eddc6204ca2fa61e2df741a0 (HEAD -> master, origin/master, origin/HEAD)
Author: David Teigland <teigland>
Date:   Thu Mar 12 10:18:51 2020 -0500

    reduce device path error messsages
    
    When /dev entries or sysfs entries are changing
    due to concurrent lvm commands, it can cause
    warning/error messages about missing paths.

David, what about:

    WARNING: ignoring metadata seqno 1566 on /dev/mapper/x for seqno 1567 on /dev/mapper/y for VG vg-name.
    WARNING: Inconsistent metadata found for VG Bug.

These messages may be a warning when running lvm locally, but in the clustered
setup used by RHV this is an expected condition.

RHV can filter out these warnings, but it looks like LVM is too verbose by default,
and it would help more users if LVM stopped writing these warnings to stderr.

Comment 2 David Teigland 2020-03-16 18:46:41 UTC
Removing those warnings (about metadata inconsistency) for all users might be a good idea, but there are probably cases where we would need to replace them with different/better messages.  Since it's a real issue and can indicate real problems for non-RHV users, we would need to do some careful research and testing before dropping them, i.e. it's not a trivial case (like the fopen messages) that we can push out quickly.  The next time I'm working in this area it's something I'll look at, but for now you'll need to just ignore them.

Comment 3 Nir Soffer 2020-03-16 19:04:45 UTC
(In reply to David Teigland from comment #2)
Thanks David. I filed bug 1814022 for suppressing such warnings until we have
a better solution in LVM.

Comment 4 Zdenek Kabelac 2020-03-18 09:41:07 UTC
I think this case is effectively showing our label scanning is accessing devices which it should never open
(clearly filter command only lists 1 device).

Older version of lvm2 always accesses (open/read...) only devices that have passed the filter.

Current version seems to have parallel code for label scanning which is probably not properly filtering scanned devices through a whole filter chain and opens & reports messages from irrelevant devices causing possibly some other issues as well.

I believe the idea was to replace current label scanning with a new io manager engine - which was supposed to address the issue, but it has not yet happened.

Comment 5 Nir Soffer 2020-03-18 10:04:50 UTC
(In reply to Zdenek Kabelac from comment #4)
> I think this case is effectively showing our label scanning is accessing
> devices which it should never open
> (clearly filter command only lists 1 device).

I guess adding -vvvv to the lvm commands will shed more light on this.

David, can you rerun the same tests with --verbose flag?

    python2 extend.py run-regular \
        --verbose \
        --extend-delay 10 \
        --iterations 100 \
        --port 8000 \
        <manager-address> \
        <VG_NAME> \
        /dev/mapper/<LUN_ID> \
        2>run-regular.log

The verbose output should show more info about the warnings.

Comment 6 David Teigland 2020-03-18 19:59:30 UTC
(In reply to Zdenek Kabelac from comment #4)
> I think this case is effectively showing our label scanning is accessing
> devices which it should never open
> (clearly filter command only lists 1 device).

That is incorrect AFAIK; I would be interested to see any logging that indicates this.

> Older version of lvm2 always accesses (open/read...) only devices that have
> passed the filter.
> 
> Current version seems to have parallel code for label scanning which is
> probably not properly filtering scanned devices through a whole filter chain
> and opens & reports messages from irrelevant devices causing possibly some
> other issues as well.

I have not seen this, please let me know if you find any evidence of that.

> I believe the idea was to replace current label scanning with a new io
> manager engine - which was supposed to address the issue, but it has not yet
> happened.

That should not make any difference here.

Comment 7 David Teigland 2020-03-18 20:06:35 UTC
Using the extend test I have collected a full debug log of an lvchange
--refresh command that prints the "Inconsistent" message.  As I've said a
number of times in the past, this is expected given the way vdsm is using
lvm.  In the following I will explain that in more detail, using the debug
log to show what is happening and why it's not a problem.

Steps in the lvm command:

dev cache scan
--------------

Creates a list of device path names from /dev dir entries (or udev).  This
is the list of devs available on the system that could potentially be
scanned.  This step does not open or read the devices.

Start of dev cache scan:
label/label.c:1001  Finding devices to scan
device/dev-cache.c:1173  Creating list of system devices.

Example of finding a path:
device/dev-cache.c:712  Found dev ...


filtering
---------

Excludes items from the previous dev cache list based on various filters,
e.g. lvm.conf regex filter, zero size dev, dev is a multipath component.
The size check requires opening the dev so stat can be run.  No devices
are read.  Some filters which require reading the dev (if not already
excluded by regex), like the md component filter and signature filters,
are deferred and applied later.

Start of filtering:
label/label.c:1038  Filtering devices to scan

Example of a dev being skipped by lvm.conf filter:
filters/filter-regex.c:172  /dev/loop6: Skipping (regex)

Example of a filter being deferred to later because it requires reading:
filters/filter-signature.c:32  filter signature deferred /dev/sde
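
The deferred-filter behaviour above can be illustrated with a small sketch
(Python pseudocode for illustration only, not lvm2's internals; the device
dicts and helper names are made up): cheap filters decided from the path run
before any device is read, while filters that need the device's data only run
once the label scan has read it.

    # Illustration of deferred filters -- not lvm2 code.
    # Each device is simulated as a dict holding its path and its on-disk data.
    devices = [
        {"path": "/dev/loop6", "data": b""},                       # excluded by path filter, never read
        {"path": "/dev/mapper/mpatha", "data": b"LABELONE..."},    # passes both filters
        {"path": "/dev/mapper/swapdev", "data": b"...SWAPSPACE2"}, # excluded by the deferred filter
    ]

    def path_filter(dev):
        """Cheap filter (stands in for the lvm.conf regex filter): path only, no read needed."""
        return dev["path"].startswith("/dev/mapper/")

    def signature_filter(dev):
        """Deferred filter: needs the device's data, so it can only run after the read."""
        return b"SWAPSPACE" not in dev["data"]

    def scan(devs):
        # Cheap filters run first; devices excluded here are never opened or read.
        candidates = [d for d in devs if path_filter(d)]
        passed = []
        for dev in candidates:
            # dev["data"] stands in for the data actually read from the device here.
            if signature_filter(dev):   # deferred filter applied now that data is available
                passed.append(dev)
        return passed

    print([d["path"] for d in scan(devices)])   # -> ['/dev/mapper/mpatha']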


label scan
----------

The devs from the dev cache scan, minus those already excluded by
filtering, are opened and read here.  After each dev is read, any deferred
filters from the previous step are applied (e.g. signatures are checked
now that the data is available), and devs are immediately excluded if they
do not pass the deferred filters.

Remaining devs are checked to see if they have an LVM label.  If no LVM
label, the dev is dropped here.  With an LVM label, the VG metadata is
read from the dev and a summary of the metadata is saved in lvmcache
structures (info and vginfo).

(In the example below only one device passed the filters.)

Start of label scan:
label/label.c:665  Scanning 1 devices for VG info

Data has been read from a device:
label/label.c:719  Processing data from device /dev/sde 8:64 fd 4 block 0x5646f330e190

Deferred filters are applied to the device:
label/label.c:379  Scan filtering /dev/sde

An lvm label is found on the device:
label/label.c:323  /dev/sde: lvm2 label detected at sector 1

Start of metadata scan (first mda):
format_text/text_label.c:497  Scanning /dev/sde mda1 summary.

Metadata is found:
format_text/format-text.c:1657  Found metadata summary on /dev/sde at 85184512 size 77783 for VG vvgg

Summary of metadata is saved in lvmcache:
cache/lvmcache.c:1435  lvmcache /dev/sde: now in VG vvgg with 2 mda(s).
cache/lvmcache.c:1259  lvmcache /dev/sde: VG vvgg: set VGID to jVWyt6vKaEcboTssdp7OJTvWepW67eC9.
cache/lvmcache.c:1634  lvmcache /dev/sde mda1 VG vvgg set seqno 2296 checksum b425d952 mda_size 77783

End of metadata scan (first mda):
format_text/text_label.c:525  Scanned /dev/sde mda1 seqno 2296

Start of metadata scan (second mda):
format_text/text_label.c:548  Scanning /dev/sde mda2 summary.

Metadata is found:
format_text/format-text.c:1657  Found metadata summary on /dev/sde at 999617183744 size 77783 for VG vvgg

Summary of metadata is saved in lvmcache:
(skipped because both mda1 and mda2 have the same metadata seqno 2296)

End of metadata scan (second mda):
format_text/text_label.c:575  Scanned /dev/sde mda2 seqno 2296

End of label scan:
label/label.c:798  Scanned devices: read errors 0 process errors 0 failed 0


Command processing
------------------

The command-specific code begins, usually via process_each_vg,
process_each_lv or process_each_pv.  Most commands are implemented by
processing one or more VGs.

Begin processing of a VG:
toollib.c:3668  Processing VG vvgg jVWyt6-vKaE-cboT-ssdp-7OJT-vWep-W67eC9

Lock the VG (local flock plus a distributed lock for shared VGs):
misc/lvm-flock.c:230  Locking /run/lock/lvm/V_vvgg RB


Read the VG metadata
--------------------

Now that a lock is held on the VG, the VG metadata is read again.  It was
already read above during label scan, during which a summary of it was
saved in lvmcache.  Another process or host could have changed the VG
metadata between the label scan and the flock being taken, so the metadata
is reread before using it for the command-specific actions.

Begin reading the VG:
metadata/metadata.c:4700  Reading VG vvgg jVWyt6vKaEcboTssdp7OJTvWepW67eC9

Rescan metadata from devs:
metadata/metadata.c:4747  Rescanning devices for vvgg

(Reusing the same code as in label scan above, so most of the same
messages are seen, but note the differences in metadata found here vs the
metadata details reported by the same messages above during initial label
scan.  These differences illustrate the purpose of rescanning and how it's
working correctly.)

Start of label scan:
label/label.c:665  Scanning 1 devices for VG info

Data has been read from a device:
label/label.c:719  Processing data from device /dev/sde 8:64 fd 4 block 0x5646f330e350

An lvm label is found on the device:
label/label.c:323  /dev/sde: lvm2 label detected at sector 1 

Start of metadata scan (first mda):
format_text/text_label.c:497  Scanning /dev/sde mda1 summary.

Metadata is found:
format_text/format-text.c:1657  Found metadata summary on /dev/sde at 85262336 size 77891 for VG vvgg

Summary of metadata is saved in lvmcache:
cache/lvmcache.c:1435  lvmcache /dev/sde: now in VG vvgg with 2 mda(s).
cache/lvmcache.c:1259  lvmcache /dev/sde: VG vvgg: set VGID to jVWyt6vKaEcboTssdp7OJTvWepW67eC9.
cache/lvmcache.c:1634  lvmcache /dev/sde mda1 VG vvgg set seqno 2297 checksum a03ca8a9 mda_size 77891

End of metadata scan (first mda):
format_text/text_label.c:525  Scanned /dev/sde mda1 seqno 2297

Start of metadata scan (second mda):
format_text/text_label.c:548  Scanning /dev/sde mda2 summary.

Metadata is found:
format_text/format-text.c:1657  Found metadata summary on /dev/sde at 999617339904 size 78001 for VG vvgg

Summary of metadata is saved in lvmcache:
cache/lvmcache.c:1655  lvmcache /dev/sde mda2 VG vvgg newer seqno 2298 checksum f56ca155 mda_size 78001

(Here we observe something different.  During the initial label scan, both
mda1 and mda2 contained the same seqno 2296.  The other host then wrote
seqno 2297 after our initial label scan but before we rescanned mda1, which
is why we saw 2297 there.  The other host then wrote seqno 2298 between our
reading of mda1 (seqno 2297) and our reading of mda2 (seqno 2298).  Between
local commands using flock, and remote commands using distributed locks,
the seqno difference between 2296 and 2297 is normal and expected.  But the
difference between mda1 seqno 2297 and mda2 seqno 2298 is unique to RHV,
because RHV runs concurrent remote commands with no distributed lock.)

End of metadata scan (second mda):
format_text/text_label.c:575  Scanned /dev/sde mda2 seqno 2298

End of label scan:
label/label.c:798  Scanned devices: read errors 0 process errors 0 failed 0

Read the VG metadata for processing (all fields, not just summary):
metadata/metadata.c:4853  Reading VG vvgg metadata from /dev/sde 4096

(Reading the VG metadata for processing uses the same data that was used
to read the VG summary just above, no new io is performed.)

Start of metadata read (first mda):
format_text/import.c:155  Reading metadata from /dev/sde at 85262336 size 77891 (+0)

Metadata is found:
format_text/format-text.c:497  Found metadata on /dev/sde at 85262336 size 77891 for VG vvgg

Start of metadata read (second mda):
format_text/import.c:155  Reading metadata from /dev/sde at 999617339904 size 78001 (+0)

Metadata is found:
format_text/format-text.c:497  Found metadata on /dev/sde at 999617339904 size 78001 for VG vvgg

Warning is produced because different metadata is seen in mda1 vs mda2:
metadata/metadata.c:4885  WARNING: ignoring metadata seqno 2297 on /dev/sde for seqno 2298 on /dev/sde for VG vvgg.
metadata/metadata.c:4902  WARNING: Inconsistent metadata found for VG vvgg.

The lvm command will continue to be processed using the newest version of
the metadata, 2298.
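
To make that last step concrete, here is a minimal illustrative sketch in
Python (not lvm2 source; the class and function names are made up for
illustration) of how a reader can resolve two metadata areas carrying
different seqnos: warn about the older copy and proceed with the newest one.

    # Illustrative sketch only -- not lvm2 code.  It mimics the behaviour shown
    # in the debug log above: when mda1 and mda2 report different seqnos, the
    # command warns and continues with the newest metadata it found.
    from dataclasses import dataclass

    @dataclass
    class MdaSummary:
        device: str   # e.g. "/dev/sde"
        area: str     # "mda1" or "mda2"
        seqno: int    # metadata sequence number

    def choose_metadata(vg_name, summaries):
        """Pick the newest metadata copy; warn about any older copies."""
        newest = max(summaries, key=lambda s: s.seqno)
        for s in summaries:
            if s.seqno < newest.seqno:
                print(f"WARNING: ignoring metadata seqno {s.seqno} on {s.device} "
                      f"for seqno {newest.seqno} on {newest.device} for VG {vg_name}.")
                print(f"WARNING: Inconsistent metadata found for VG {vg_name}.")
        return newest

    # The situation from the debug log: mda1 holds seqno 2297, mda2 holds 2298.
    chosen = choose_metadata("vvgg", [
        MdaSummary("/dev/sde", "mda1", 2297),
        MdaSummary("/dev/sde", "mda2", 2298),
    ])
    print(f"The command continues with metadata seqno {chosen.seqno}")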

Comment 8 David Teigland 2020-03-18 20:14:34 UTC
Created attachment 1671207 [details]
command debug output

The full debugging output that was used for comment 7.

Comment 9 Zdenek Kabelac 2020-03-19 14:17:37 UTC
Ahh - so according to comment 7, it looks like we are reading the proper set of devices, but we repeatedly read/access the same data during one command because we have a lockless first scan.

So with current master we have:

lvchange[8728] toollib.c:3808      Obtaining the complete list of VGs before processing their LVs
lvchange[8728] toollib.c:3668      Processing VG vg XXXXXXXXXXXXXXXXX
lvchange[8728] misc/lvm-flock.c:229      Locking /run/lock/lvm/V_vg RB
lvchange[8728] misc/lvm-flock.c:113        _do_flock /run/lock/lvm/V_vg:aux WB

With stable-2.02 branch we have:

#cache/lvmcache.c:1570          Found VG info for 1 VGs
#toollib.c:3815        Obtaining the complete list of VGs before processing their LVs
#toollib.c:3677        Processing VG vg XXXXXXXXXXXXXXXXXX
#locking/locking.c:331           Dropping cache for vg.
#misc/lvm-flock.c:201         Locking /run/lock/lvm/V_vg RB
#misc/lvm-flock.c:99            _do_flock /run/lock/lvm/V_vg:aux WB

So anything cached before is erased, so no inconsistencies can occur between the
metadata discovered and cached in the 1st scan and the 2nd scan.

But this brings back the question of the 'locking' mechanism RHEV deploys.

Is this the case:

1st.host changing in loop lvm2 metadata on a single disk  A
2nd.host running another lvm2 command with disk A - and there is no locking between these 2 commands ?


As in attachment from comment 8:

/dev/sde has seqno  2296 during 'unlocked' scan
then after taking the VG lock seqno 2297 is reported from mda1 and seqno 2298 goes from mda2 ??
When lvm2 helds the lock - obtained metadata from  mda1 & mda2 must match.
This situation should not be possible in the case the locking works properly between commands 
unless I'm missing something??)

So how does the trace from comment 8 is relevant with the original report of this bug ?

We need to see  -vvvv from  'RHEV' case.

Is now RHEV relaying on lockless reading of metadata for read-only lvm2 commands and uses 2 metadata areas per PV ?

And we need to see how the locking works between running commands.

Comment 10 David Teigland 2020-03-19 15:36:18 UTC
> 1st.host changing in loop lvm2 metadata on a single disk  A
> 2nd.host running another lvm2 command with disk A - and there is no locking
> between these 2 commands ?

Correct, there is no locking.  vdsm has its own clustering implementation that ensures only one host is updating the vg and other hosts are only performing safe lvm commands in that vg.  

> As in attachment from comment 8:
> 
> /dev/sde has seqno  2296 during 'unlocked' scan
> then after taking the VG lock seqno 2297 is reported from mda1 and seqno
> 2298 goes from mda2 ??
> When lvm2 helds the lock - obtained metadata from  mda1 & mda2 must match.
> This situation should not be possible in the case the locking works properly
> between commands 
> unless I'm missing something??)

You're missing that the commands are running on different hosts, so there is no locking.
If the commands were on the same host, then flocks would prevent this, or if they were using lvmlockd that would also prevent it.

> So how does the trace from comment 8 is relevant with the original report of
> this bug ?

This bug was opened about the WARNING message about inconsistent metadata.  That is now explained and shown to be harmless.

> We need to see  -vvvv from  'RHEV' case.

I don't think we need any other traces unless there is another problem we need to explain.

> Is now RHEV relaying on lockless reading of metadata for read-only lvm2
> commands and uses 2 metadata areas per PV ?

I don't know if they have always used 2 mdas.

> And we need to see how the locking works between running commands.

vdsm coordinates which hosts are running which commands to ensure things are done safely.  Based on my memory of their descriptions over the years, the basic idea is that one host does lvcreate/lvextend on behalf of other hosts, and then tells those other hosts to do lvchange -ay or lvchange --refresh.

Comment 11 Nir Soffer 2020-03-19 15:51:30 UTC
(In reply to David Teigland from comment #10)
> > And we need to see how the locking works between running commands.
> 
> vdsm coordinates which hosts are running which commands to ensure things are
> done safely.  Based on my memory of their descriptions over the years, the
> basic idea is that one host does lvcreate/lvextend on behalf of other hosts,
> and then tells those other hosts to do lvchange -ay or lvchange --refresh.

Correct.

lvchange -ay/n is use when a host need to start/finish using a lv. For example
when running a vm using this lv, or copying lv contents.

lvchange --refresh is done after a host requested the manager host to extend
a lv. When the host gets a reply that the extend was completed it refreshes
the volume. This volume is active and used by a running vm during this. This
used only for qcow2 images.

lvchange --refresh is also used when a lv is active when we start to use it,
to make sure we don't use lvs which are not synced with current vg metadata.

We also run pvs, vgs, lvs, and vgck from other hosts.
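
As a rough sketch of the division of labour described above (hypothetical
helper names, not vdsm's actual API): the single manager host runs the
read-write command, and a client host only refreshes its view of the LV
afterwards, which is where the harmless warnings show up.

    # Rough sketch of the flow from comments 10-11 -- hypothetical helpers,
    # not vdsm code.  Only the manager host modifies VG metadata; client hosts
    # run read-only commands such as lvchange --refresh.
    import subprocess

    def run(cmd):
        """Run an LVM command, returning its stderr (where the warnings appear)."""
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
        return result.stderr

    # On the manager host (the only host updating the VG metadata):
    def extend_lv(vg, lv, extra_mb):
        run(["lvextend", "-L", f"+{extra_mb}m", f"{vg}/{lv}"])

    # On a client host, after the manager reports that the extend completed:
    def refresh_lv(vg, lv):
        stderr = run(["lvchange", "--refresh", f"{vg}/{lv}"])
        if "Inconsistent metadata" in stderr:
            # Expected in this lockless setup (see comments 7 and 10); the
            # command still succeeds using the newest metadata copy.
            pass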

Comment 12 Zdenek Kabelac 2020-03-19 17:13:22 UTC
(In reply to Nir Soffer from comment #11)
> (In reply to David Teigland from comment #10)
> > > And we need to see how the locking works between running commands.
> > 
> > vdsm coordinates which hosts are running which commands to ensure things are
> > done safely.  Based on my memory of their descriptions over the years, the
> > basic idea is that one host does lvcreate/lvextend on behalf of other hosts,
> > and then tells those other hosts to do lvchange -ay or lvchange --refresh.
> 
> Correct.
> 
> lvchange -ay/n is use when a host need to start/finish using a lv. For
> example
> when running a vm using this lv, or copying lv contents.
> 
> lvchange --refresh is done after a host requested the manager host to extend
> a lv. When the host gets a reply that the extend was completed it refreshes
> the volume. This volume is active and used by a running vm during this. This
> used only for qcow2 images.
> 
> lvchange --refresh is also used when a lv is active when we start to use it,
> to make sure we don't use lvs which are not synced with current vg metadata.
> 
> We also run pvs, vgs, lvs, and vgck from other hosts.

The main question is - how is it protected that 'vdsm master' is i.e. running 'lvextend'
command - and  'vdms client'  runs  'lvs' command. ?

Is it possible to have command  'lvextend' running in parallel with 'lvs'  - is there some 'locking' engine for this on vdsm side ?

Comment 13 Nir Soffer 2020-03-19 17:26:35 UTC
(In reply to Zdenek Kabelac from comment #12)
> (In reply to Nir Soffer from comment #11)
> > (In reply to David Teigland from comment #10)
> > > > And we need to see how the locking works between running commands.
> > > 
> > > vdsm coordinates which hosts are running which commands to ensure things are
> > > done safely.  Based on my memory of their descriptions over the years, the
> > > basic idea is that one host does lvcreate/lvextend on behalf of other hosts,
> > > and then tells those other hosts to do lvchange -ay or lvchange --refresh.
> > 
> > Correct.
> > 
> > lvchange -ay/n is use when a host need to start/finish using a lv. For
> > example
> > when running a vm using this lv, or copying lv contents.
> > 
> > lvchange --refresh is done after a host requested the manager host to extend
> > a lv. When the host gets a reply that the extend was completed it refreshes
> > the volume. This volume is active and used by a running vm during this. This
> > used only for qcow2 images.
> > 
> > lvchange --refresh is also used when a lv is active when we start to use it,
> > to make sure we don't use lvs which are not synced with current vg metadata.
> > 
> > We also run pvs, vgs, lvs, and vgck from other hosts.
> 
> The main question is - how is it protected that 'vdsm master' is i.e.
> running 'lvextend'
> command - and  'vdms client'  runs  'lvs' command. ?

The only protection is having one host do the read-write operations,
using a sanlock lease.

> Is it possible to have command  'lvextend' running in parallel with 'lvs'  -
> is there some 'locking' engine for this on vdsm side ?

Sure, we can have 50 "client" hosts doing lvs, pvs, vgs, lvchange --refresh,
lvchange -ay/n, vgck at the same time while the "master" host is running
lvcreate, lvextend, lvremove, lvchange, vgchange.

Comment 14 David Teigland 2020-07-15 16:54:38 UTC
As mentioned in comment 1, this has already been fixed by commit
https://sourceware.org/git/?p=lvm2.git;a=commit;h=957904933b2b0f94eddc6204ca2fa61e2df741a0

