Description of problem:
A VM running in qemu fails to send S3-PR (SG_IO) commands to the storage path when the path is presented via multipath.

Version-Release number of selected component (if applicable):
all

How reproducible:
always

Steps to Reproduce:
1. Have a VM installed and started by libvirt (the qemu process runs with uid != 0).
2. Try to send a S3-PR (e.g. run the Windows cluster tests).

Actual results:
- The test fails.

Expected results:
- The test should pass.

Additional info:
The reason is that the unpriv_sgio setting passed to the multipath parent device is not propagated to its children. The dm device therefore accepts the command, but it is never sent to the attached/active path, because the path device lacks the unpriv_sgio setting. So we need multipath to propagate the unpriv_sgio setting to all of its connected path devices. This is one of the blockers for supporting Windows Cluster on RHV.
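The propagation failure can be seen directly in sysfs. A minimal check, assuming dm-0 is the multipath device and sdb/sdd are two of its paths (the device names here are illustrative, not from a specific setup):

# cat /sys/block/dm-0/queue/unpriv_sgio
1
# cat /sys/block/sdb/queue/unpriv_sgio
0
# cat /sys/block/sdd/queue/unpriv_sgio
0

With unpriv_sgio set only on the dm device, the unprivileged qemu process can submit the PR command to the dm device, but it is rejected once it is forwarded to a path device that still has unpriv_sgio at 0.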
The same bug was opened against the kernel (bug 1254316), but it was considered expected behavior there.
(In reply to Martin Tessun from comment #0)
> Description of problem:
> A VM running in qemu fails to send S3-PR (SG_IO) commands to the storage
> path when the path is presented via multipath.
>
> Version-Release number of selected component (if applicable):
> all
>
> How reproducible:
> always
>
> Steps to Reproduce:
> 1. Have a VM installed and started by libvirt (the qemu process runs with
> uid != 0).
> 2. Try to send a S3-PR (e.g. run the Windows cluster tests).

So let's assume you have 1 path up and 1 path down. You send the PR, and it gets sent via the path that is up. Later on, things change: the path that was up is now down, and the one that was down is now up. Is the reservation still in place?
(In reply to Yaniv Kaul from comment #5)
> So let's assume you have 1 path up and 1 path down. You send the PR, and it
> gets sent via the path that is up. Later on, things change: the path that
> was up is now down, and the one that was down is now up. Is the reservation
> still in place?

For the iSCSI use case, yes (the initiator does not change in RHV); for FC, no (each FC port has its own WWNN/WWPN assigned to it).
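One way to check this after a path flip is to read the registrations and the reservation back through the multipath device. A short sketch, assuming mpatha is the shared LUN (the device name is illustrative):

# mpathpersist --in -k /dev/mapper/mpatha    (list registered reservation keys)
# mpathpersist --in -r /dev/mapper/mpatha    (show the current reservation holder)

If the key registered before the failover is still listed, the reservation survived the path change.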
Hi,

Tried to test this bug with the packages provided in comment#12; the MSFC failover validation test also failed (error message attached). Tested through RHVM, which boots the VMs as the qemu user (uid != 0).

Used the following packages:
device-mapper-multipath-libs-0.4.9-111.el7.bz1452210.x86_64
device-mapper-multipath-sysvinit-0.4.9-111.el7.bz1452210.x86_64
device-mapper-multipath-devel-0.4.9-111.el7.bz1452210.x86_64
device-mapper-multipath-debuginfo-0.4.9-111.el7.bz1452210.x86_64
device-mapper-multipath-0.4.9-111.el7.bz1452210.x86_64
kpartx-0.4.9-111.el7.bz1452210.x86_64
libdmmp-0.4.9-111.el7.bz1452210.x86_64
libdmmp-devel-0.4.9-111.el7.bz1452210.x86_64
kernel-3.10.0-663.el7.x86_64
qemu-kvm-rhev-2.9.0-16.el7_4.4.x86_64
libvirt-3.2.0-14.el7.x86_64
vdsm-4.19.28-1.el7ev.x86_64
rhv-4.1.5.2-0.1.el7
ovirt-engine-4.1.5.2-0.1.el7.noarch
virtio-win-1.9.3-1.el7.noarch
seabios-1.10.2-1.el7.x86_64

FC host info:
# multipath -ll
mpathb (360050763008084e6e000000000000195) dm-1 IBM ,2145
size=120G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 1:0:0:1 sdc 8:32 active ready running
| `- 2:0:0:1 sdf 8:80 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  |- 1:0:1:1 sdg 8:96 active ready running
  `- 2:0:1:1 sdi 8:128 active ready running
mpatha (360050763008084e6e000000000000194) dm-0 IBM ,2145
size=100G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 1:0:1:0 sde 8:64 active ready running
| `- 2:0:1:0 sdh 8:112 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  |- 1:0:0:0 sdb 8:16 active ready running
  `- 2:0:0:0 sdd 8:48 active ready running

Boot command lines:
-------------------------------------------------------------------------------
qemu 26936 23.7 3.5 2014688 1171756 ? Rl 03:21 12:50 /usr/libexec/qemu-kvm -name guest=msfc-vm1...-drive file=/dev/mapper/360050763008084e6e000000000000195,format=raw,if=none,id=drive-scsi0-0-0-0,cache=none,werror=stop,rerror=stop,aio=native -device scsi-block,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0...
-------------------------------------------------------------------------------
qemu 27294 25.8 3.5 2015724 1166660 ? Sl 03:24 13:13 /usr/libexec/qemu-kvm -name guest=msfc-vm2...-drive file=/dev/mapper/360050763008084e6e000000000000195,format=raw,if=none,id=drive-scsi0-0-0-0,cache=none,werror=stop,rerror=stop,aio=native -device scsi-block,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0...
-------------------------------------------------------------------------------
Created attachment 1316185 [details] failover cluster validation test report
Did you add "unpriv_sgio yes" to the defaults section of multipath.conf? Was this a setup where the only issue was that unpriv_sgio wasn't getting set correctly on all the multipath path devices? All those packages do is make it possible to configure multipath to set unpriv_sgio on its path devices. The packages that include the code to remove the need for setting the reservation key in multipath.conf will be coming shortly (now that I'm finally done with all my summer time off).
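For reference, a minimal multipath.conf sketch of the setting being asked about (only the unpriv_sgio line matters here; the rest of an existing configuration stays as it is):

defaults {
        unpriv_sgio yes
}

After editing the file, have multipathd reload its configuration (for example via its interactive shell: multipathd -k, then "reconfigure") so the setting is applied to the path devices.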
Hi,

I tried the following 3 tests for this bug in an FC environment:

1. Tested with "unpriv_sgio yes" added to multipath.conf. Used RHVM to run this test, since it boots the VMs as the qemu user (uid != 0).

The shared disk unpriv_sgio status:
[root@hp-dl388g9-01 etc]# grep -e 0 -e 1 /sys/devices/virtual/block/dm-0/queue/unpriv_sgio /sys/devices/platform/host*/session*/target*/*/block/sd*/queue/unpriv_sgio
/sys/devices/virtual/block/dm-0/queue/unpriv_sgio:1
/sys/devices/platform/host3/session1/target3:0:0/3:0:0:0/block/sdj/queue/unpriv_sgio:1

The shared disk multipath info:
[root@hp-dl388g9-01 ~]# multipath -ll
360050763008084e6e000000000000195 dm-0 IBM ,2145
size=120G features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 1:0:0:1 sdc 8:32 active ready running
| `- 2:0:0:1 sdg 8:96 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  |- 1:0:1:1 sdf 8:80 active ready running
  `- 2:0:1:1 sdi 8:128 active ready running

Result: also failed, but at a different step than in comment#23: the "ListDisk To Be Validated" step passed, and it failed at the "Validation SCSI-3 Reservation" step. For the report, please see the attachment.

2. Tested as the root user at the qemu level (not through RHVM), using the disk /dev/mapper/360050763008084e6e000000000000195; the command line is the same as in comment#23. It also failed, with the same error as in the first test: "ListDisk To Be Validated" passed, and it failed at "Validation SCSI-3 Reservation".

3. Tested as the root user at the qemu level (not through RHVM), using the disks /dev/sdc and /dev/sdf directly for the failover validation test. It passed. The command lines:
--------------------------------------------------------------------------------
root 18437 9.8 6.6 3082072 2167452 pts/0 Sl+ 22:27 3:24 /usr/libexec/qemu-kvm -name server1...-drive file=/dev/sdc,if=none,media=disk,format=raw,rerror=stop,werror=stop,readonly=off,aio=threads,cache=none,cache.direct=on,id=drive-hotadd,serial=sas-test -device scsi-block,drive=drive-hotadd,bus=scsi-hotadd.0...
--------------------------------------------------------------------------------
root 18480 8.4 6.6 3067692 2165884 pts/1 Sl+ 22:27 2:56 /usr/libexec/qemu-kvm -name server2...-drive file=/dev/sdf,if=none,media=disk,format=raw,rerror=stop,werror=stop,readonly=off,aio=threads,cache=none,cache.direct=on,id=drive-hotadd,serial=sas-test -device scsi-block,drive=drive-hotadd,bus=scsi-hotadd.0...
--------------------------------------------------------------------------------

Hope the above information is useful.

Best Regards,
Peixiu Hou
Created attachment 1317562 [details] the validation report after setting "unpriv_sgio yes"
Along with the unpriv_sgio option, I've now added the ability to use /etc/multipath/prkeys to keep track of the persistent reservation keys. To use this, you must set "reservation_key file" in /etc/multipath.conf. This tells multipathd to look in the prkeys file to see whether there is a reservation key associated with the device's WWID.

You will not need to manually add or remove keys from the prkeys file; when reservation_key is set to "file", mpathpersist handles this for you. Simply create a registration as usual with mpathpersist:

# mpathpersist -oGS <key> <device>

Along with passing the registration to all the path devices, this notifies multipathd to add the key to the prkeys file, so that it can be used to set registrations on other paths that come up for this device. When you remove a registration with mpathpersist:

# mpathpersist -oGK <key> <device>

it notifies multipathd to remove the key from the prkeys file.
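Putting the unpriv_sgio and reservation_key pieces together, a minimal end-to-end sketch; the key 123abc and the device mpatha are illustrative, not taken from this bug:

/etc/multipath.conf:

defaults {
        unpriv_sgio     yes
        reservation_key file
}

# mpathpersist -oGS 123abc /dev/mapper/mpatha    (register; multipathd records the key in /etc/multipath/prkeys)
# mpathpersist --out --reserve --param-rk=123abc --prout-type=5 /dev/mapper/mpatha    (take a Write Exclusive, registrants only reservation)
# mpathpersist --in -k /dev/mapper/mpatha        (verify the registration)
# mpathpersist -oGK 123abc /dev/mapper/mpatha    (unregister; the key is removed from prkeys)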
Changed to VERIFIED according to comment 39.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2018:0884