Bug 1829597 - Require lvm2 with supporting locking_type=4 for pvs command on CentOS
Summary: Require lvm2 with supporting locking_type=4 for pvs command on CentOS
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: vdsm
Classification: oVirt
Component: Core
Version: 4.30.45
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ovirt-4.3.10
Target Release: 4.30.46
Assignee: bugs@ovirt.org
QA Contact: Ilan Zuckerman
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-04-29 20:41 UTC by Nir Soffer
Modified: 2020-06-10 10:57 UTC (History)

Fixed In Version: vdsm-4.30.46
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-06-10 10:57:08 UTC
oVirt Team: Storage
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 108693 0 ovirt-4.3 MERGED lvm: Enable read-only pvs command on CentOS 2021-01-07 19:22:36 UTC

Description Nir Soffer 2020-04-29 20:41:55 UTC
Description of problem:

Bug 1811391 was fixed in 4.3.9 for RHEL 7.8. On CentOS the fix was partial
because the lvm2 package providing the fix for bug 1809660 was not available
when vdsm was released.

Using locking_type=1 when running the pvs command can lead to VG metadata
corruption: when a non-SPM host runs pvs while the SPM is modifying VG
metadata, the command may see inconsistent metadata and try to repair it,
corrupting the VG metadata.

The package has been available since 2020-04-08:
http://mirror.centos.org/centos/7/updates/x86_64/Packages/lvm2-2.02.186-7.el7_8.1.x86_64.rpm

Vdsm should require this package and enable locking_type=4 when running on
CentOS 7.8.
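
As a rough illustration of the requested behavior, here is a minimal Python sketch of choosing the locking type by host role. The function and constant names are hypothetical and this is not vdsm's actual code. Only the option names and values are taken from the lvm command lines logged in this bug.

```python
# Hypothetical helper for illustration only; not vdsm's actual implementation.
LOCKING_TYPE_READ_WRITE = 1  # local file-based locking, metadata may be modified
LOCKING_TYPE_READ_ONLY = 4   # read-only locking, metadata changes are refused


def global_section(is_spm):
    """Return the 'global { ... }' part of an lvm --config string."""
    locking_type = LOCKING_TYPE_READ_WRITE if is_spm else LOCKING_TYPE_READ_ONLY
    return (
        "global {  locking_type=%d  prioritise_write_locks=1  "
        "wait_for_locks=1  use_lvmetad=0 }" % locking_type
    )


# A non-SPM host should get the read-only locking type.
print(global_section(is_spm=False))
```

The point of the sketch is only that the SPM keeps locking_type=1, while every other host runs read-only commands such as pvs with locking_type=4.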

Version-Release number of selected component (if applicable):
4.30.45

How reproducible:
Always

Steps to Reproduce:
1. Activate a non-spm host on CentOS 7.8

Actual results:
Vdsm uses locking_type=1 in the pvs command when the host is not the SPM.

Expected results:
Vdsm uses locking_type=4 in the pvs command when the host is not the SPM.

Comment 1 Nir Soffer 2020-04-29 20:43:52 UTC
Milestone 4.3.10 is not available, setting 4.3.9-1.

Sandro, we need to deliver this fix in the next ovirt release.

Comment 2 Avihai 2020-05-17 08:04:10 UTC
Hi Nir,
Please provide a clear verification scenario.

Comment 3 Nir Soffer 2020-05-18 09:30:18 UTC
This bug is relevant only to CentOS - on RHEL this was fixed in 4.3.9.

To verify, check the vdsm DEBUG logs on a non-SPM host. Before this fix vdsm
would run the pvs command with locking_type=1. Now the pvs command runs with
locking_type=4.

Note that on the SPM host, the pvs command still runs with locking_type=1.
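
The log check can be sketched in Python as below. It assumes the DEBUG line format seen in this bug ("/lvm <subcommand> --config '... locking_type=N ...'"). The helper name and the shortened sample lines are made up for illustration. The script reports the locking type only for the requested lvm subcommand, so vgs lines are not mistaken for pvs lines.

```python
import re

# Captures the lvm subcommand and the locking_type from a vdsm DEBUG log line.
LVM_CMD = re.compile(r"/lvm (\w+) --config .*?locking_type=(\d+)")


def lvm_locking_types(lines, subcommand="pvs"):
    """Yield the locking_type (as int) of each logged lvm <subcommand> call."""
    for line in lines:
        match = LVM_CMD.search(line)
        if match and match.group(1) == subcommand:
            yield int(match.group(2))


# Shortened, made-up sample lines in the same shape as the real log entries.
sample = [
    "DEBUG (jsonrpc/2) [storage.Misc.excCmd] /usr/bin/sudo -n /usr/sbin/lvm pvs "
    "--config 'devices { filter=[\"r|.*|\"] } global {  locking_type=4 }' (cwd None)",
    "DEBUG (monitor/52a60f2) [storage.Misc.excCmd] /usr/bin/sudo -n /usr/sbin/lvm vgs "
    "--config 'global {  locking_type=4 }' (cwd None)",
]

# Only the pvs line is reported; the vgs line is skipped.
print(list(lvm_locking_types(sample)))
```

On a real host, pass the open log file in place of the sample list, for example `lvm_locking_types(open("/var/log/vdsm/vdsm.log"))`. A non-SPM host should yield only 4.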

Comment 4 Ilan Zuckerman 2020-05-18 13:20:07 UTC
Verified on a specially built CentOS env:

ovirt-engine-4.3.10.3-0.0.master.20200513172426.git741b00b.el7.noarch
vdsm-4.30.45-1.el7.x86_64


[root@storage-ge4-vdsm1 ~]# cat /etc/redhat-release 
CentOS Linux release 7.8.2003 (Core)


Vdsm log on a non-SPM host showing locking_type=4:

2020-05-18 14:10:48,891+0300 DEBUG (monitor/52a60f2) [storage.Misc.excCmd] /usr/bin/taskset --cpu-list 0-3 /usr/bin/sudo -n /usr/sbin/lvm vgs --config 'devices {  preferred_names=["^/dev/mapper/"]  ignore_suspended_devices=1  write_cache_state=0  disable_after_error_count=3  filter=["a|^/dev/mapper/360002ac0000000000000001600021f6b$|^/dev/mapper/360002ac0000000000000001700021f6b$|^/dev/mapper/360002ac0000000000000001800021f6b$|^/dev/mapper/360002ac0000000000000001900021f6b$|^/dev/mapper/360002ac0000000000000001a00021f6b$|^/dev/mapper/360002ac0000000000000001b00021f6b$|", "r|.*|"] } global {  locking_type=4  prioritise_write_locks=1  wait_for_locks=1  use_lvmetad=0 } backup {  retain_min=50  retain_days=0 }' --noheadings --units b --nosuffix --separator '|' --ignoreskippedcluster -o uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name --select 'vg_name = 52a60f26-b230-4505-a80e-5f9c2305cc62' (cwd None) (commands:198)


Vdsm log on the SPM host showing locking_type=1:

2020-05-18 13:10:57,923+0300 DEBUG (monitor/0e3a104) [storage.Misc.excCmd] /usr/bin/taskset --cpu-list 0-3 /usr/bin/sudo -n /usr/sbin/lvm vgs --config 'devices {  preferred_names=["^/dev/mapper/"]  ignore_suspended_devices=1  write_cache_state=0  disable_after_error_count=3  filter=["a|^/dev/mapper/360002ac0000000000000001600021f6b$|^/dev/mapper/360002ac0000000000000001700021f6b$|^/dev/mapper/360002ac0000000000000001800021f6b$|^/dev/mapper/360002ac0000000000000001900021f6b$|^/dev/mapper/360002ac0000000000000001a00021f6b$|^/dev/mapper/360002ac0000000000000001b00021f6b$|", "r|.*|"] } global {  locking_type=1  prioritise_write_locks=1  wait_for_locks=1  use_lvmetad=0 } backup {  retain_min=50  retain_days=0 }' --noheadings --units b --nosuffix --separator '|' --ignoreskippedcluster -o uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name --select 'vg_name = 0e3a104d-cf58-4446-8e88-1a93e653331c' (cwd None) (commands:198)


Nir, please confirm whether this can be considered a verification. I didn't find a 'pvs' command anywhere in the vdsm log, only 'vgs'.

Comment 5 Nir Soffer 2020-05-18 14:54:36 UTC
(In reply to Ilan Zuckerman from comment #4)
> Verified on specially build centos env:

Verification is incorrect; the issue was only in the pvs command. The vgs
command has been run correctly since 4.3.9.

To trigger the pvs command you can run:

# vdsm-client Host setLogLevel level=DEBUG 
true

# echo -n >/var/log/vdsm/vdsm.log

# vdsm-client Host getDeviceList > /dev/null

# grep pvs /var/log/vdsm/vdsm.log | tail -1
2020-05-18 17:43:43,211+0300 DEBUG (jsonrpc/1) [common.commands] /usr/bin/taskset --cpu-list 0-1 /usr/bin/sudo -n /sbin/lvm pvs --config 'devices {  preferred_names=["^/dev/mapper/"]  ignore_suspended_devices=1  write_cache_state=0  disable_after_error_count=3  filter=["r|.*|"]  hints="none" } global {  locking_type=1  prioritise_write_locks=1  wait_for_locks=1  use_lvmetad=0 } backup {  retain_min=50  retain_days=0 }' --noheadings --units b --nosuffix --separator '|' --ignoreskippedcluster -o uuid,name,size,vg_name,vg_uuid,pe_start,pe_count,pe_alloc_count,mda_count,dev_size,mda_used_count (cwd None) (commands:153)

This was on the SPM host; on other hosts we should see locking_type=4.

Comment 7 Ilan Zuckerman 2020-05-19 10:17:09 UTC
Verified on:
ovirt-engine-4.3.10.3-0.0.master.20200513172426.git741b00b.el7.noarch
vdsm-4.30.46-1.el7.x86_64

On the SPM host, locking_type=1:

[root@storage-ge4-vdsm3 ~]# grep pvs /var/log/vdsm/vdsm.log | grep locking_type
2020-05-19 13:11:05,410+0300 DEBUG (jsonrpc/5) [storage.Misc.excCmd] /usr/bin/taskset --cpu-list 0-3 /usr/bin/sudo -n /usr/sbin/lvm pvs --config 'devices {  preferred_names=["^/dev/mapper/"]  ignore_suspended_devices=1  write_cache_state=0  disable_after_error_count=3  filter=["a|^/dev/mapper/360002ac0000000000000001600021f6b$|^/dev/mapper/360002ac0000000000000001700021f6b$|^/dev/mapper/360002ac0000000000000001800021f6b$|^/dev/mapper/360002ac0000000000000001900021f6b$|^/dev/mapper/360002ac0000000000000001a00021f6b$|^/dev/mapper/360002ac0000000000000001b00021f6b$|", "r|.*|"] } global {  locking_type=1  prioritise_write_locks=1  wait_for_locks=1  use_lvmetad=0 } backup {  retain_min=50  retain_days=0 }' --noheadings --units b --nosuffix --separator '|' --ignoreskippedcluster -o uuid,name,size,vg_name,vg_uuid,pe_start,pe_count,pe_alloc_count,mda_count,dev_size,mda_used_count (cwd None) (commands:198)


On a regular (non-SPM) host, locking_type=4:

[root@storage-ge4-vdsm1 ~]# grep pvs /var/log/vdsm/vdsm.log | grep locking_type
2020-05-19 13:13:22,538+0300 DEBUG (jsonrpc/2) [storage.Misc.excCmd] /usr/bin/taskset --cpu-list 0-3 /usr/bin/sudo -n /usr/sbin/lvm pvs --config 'devices {  preferred_names=["^/dev/mapper/"]  ignore_suspended_devices=1  write_cache_state=0  disable_after_error_count=3  filter=["a|^/dev/mapper/360002ac0000000000000001600021f6b$|^/dev/mapper/360002ac0000000000000001700021f6b$|^/dev/mapper/360002ac0000000000000001800021f6b$|^/dev/mapper/360002ac0000000000000001900021f6b$|^/dev/mapper/360002ac0000000000000001a00021f6b$|^/dev/mapper/360002ac0000000000000001b00021f6b$|", "r|.*|"] } global {  locking_type=4  prioritise_write_locks=1  wait_for_locks=1  use_lvmetad=0 } backup {  retain_min=50  retain_days=0 }' --noheadings --units b --nosuffix --separator '|' --ignoreskippedcluster -o uuid,name,size,vg_name,vg_uuid,pe_start,pe_count,pe_alloc_count,mda_count,dev_size,mda_used_count (cwd None) (commands:198)

Comment 8 Michal Skrivanek 2020-06-10 10:57:08 UTC
4.3.10 was released

