Bug 1452210

Summary: multipath does not distribute the unpriv-SGIO setting to its child devices
Product: Red Hat Enterprise Linux 7
Reporter: Martin Tessun <mtessun>
Component: device-mapper-multipath
Assignee: Ben Marzinski <bmarzins>
Status: CLOSED ERRATA
QA Contact: Lin Li <lilin>
Severity: urgent
Docs Contact: Marek Suchánek <msuchane>
Priority: urgent
Version: 7.3
CC: agk, ailan, amureini, bmarzins, boruvka.michal, heinzm, jbrassow, knoel, lijin, lilin, loberman, lvm-team, michen, msnitzer, mtessun, nsoffer, pasik, pbonzini, phou, prajnoha, rhandlin, slevine, vanhoof, ykaul, ylavi
Target Milestone: rc
Keywords: ZStream
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version: device-mapper-multipath-0.4.9-118.el7
Doc Type: Enhancement
Doc Text:
DM Multipath no longer requires reservation keys in advance

DM Multipath now supports two new configuration options in the `multipath.conf` file:

* `unpriv_sgio`
* `prkeys_file`

The `reservation_key` option of the `defaults` and `multipaths` sections accepts a new keyword: `file`. When set, the `multipathd` service will now use the file configured in the `prkeys_file` option of the `defaults` section to get the reservation key to use for the paths of a multipath device. The `prkeys` file is automatically updated by the `mpathpersist` utility. The default for the `reservation_key` option remains undefined, and default for the `prkeys_file` is `/etc/multipath/prkeys`.

If the new `unpriv_sgio` option is set to `yes`, DM Multipath will now create all new devices and their paths with the `unpriv_sgio` attribute. This option is used internally by other software, and is unnecessary for most DM Multipath users. It defaults to `no`.

These changes make it possible to use the `mpathpersist` utility without knowing ahead of time what reservation keys will be used and without adding them to the `multipath.conf` configuration file. As a result, it is now easier to use the `mpathpersist` utility to manage multipath persistent reservations in multiple setups.
Story Points: ---
Clone Of:
: 1510834 1540718
Environment:
Last Closed: 2018-04-10 16:10:28 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1111783, 1111784, 1420851, 1469559, 1510834, 1540718
Attachments:
Description                                          Flags
failover cluster validation test report              none
the validation report after set "unpriv_sgio yes"    none

Description Martin Tessun 2017-05-18 14:56:55 UTC
Description of problem:
A VM running in qemu fails to send S3-PR (SGIO) commands to the storage path when the path is presented via multipath.

Version-Release number of selected component (if applicable):
all

How reproducible:
always

Steps to Reproduce:
1. Have a VM installed and started by libvirt (the qemu process is run by a uid!=0)
2. Try to send an S3-PR (e.g. run the Windows cluster tests)

Actual results:
- The test fails

Expected results:
- The test should pass.

Additional info:
The reason is that the unpriv-SGIO setting applied to the multipath parent device is not propagated to its children. As a result, the dm device accepts the command, but the command is not sent on to the attached paths (or the active path) because the unpriv-SGIO setting is missing there.

So we need multipath to propagate the unpriv SGIO setting to all its connected devices.

This is one of the blockers for supporting Windows Cluster on RHV.
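
As an illustration of the propagation requested above, here is a minimal Python sketch. It is not the code used by multipath. It assumes the usual sysfs layout, where the path devices of a dm device appear under `/sys/block/<dm>/slaves/` and each block device exposes `queue/unpriv_sgio`. The function name and the `sysfs_root` parameter are hypothetical, and writing to the real sysfs needs root.

```python
from pathlib import Path


def propagate_unpriv_sgio(dm_name, sysfs_root="/sys"):
    """Copy the dm device's unpriv_sgio value to each of its path devices.

    Returns the names of the path devices that had to be updated.
    """
    dm_dir = Path(sysfs_root) / "block" / dm_name
    # Value currently set on the multipath (parent) device, e.g. "1".
    parent_value = (dm_dir / "queue" / "unpriv_sgio").read_text().strip()

    updated = []
    # Each entry under slaves/ refers to one underlying path device.
    for slave in sorted((dm_dir / "slaves").iterdir()):
        attr = slave / "queue" / "unpriv_sgio"
        if attr.read_text().strip() != parent_value:
            attr.write_text(parent_value + "\n")
            updated.append(slave.name)
    return updated
```

Calling `propagate_unpriv_sgio("dm-0")` would bring every path of dm-0 in line with the parent. The `sysfs_root` argument only exists so the sketch can be tried against a scratch directory first.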

Comment 1 Paolo Bonzini 2017-05-21 14:44:16 UTC
The same bug was opened on the kernel (bug 1254316) but it was expected behavior there.

Comment 5 Yaniv Kaul 2017-06-18 08:28:18 UTC
(In reply to Martin Tessun from comment #0)
> Description of problem:
> A VM running in qemu fails to send S3-PR (SGIO) commands to the storage
> path when the path is presented via multipath.
> 
> Version-Release number of selected component (if applicable):
> all
> 
> How reproducible:
> always
> 
> Steps to Reproduce:
> 1. Have a VM installed and started by libvirt (the qemu process is run by a
> uid!=0)
> 2. Try to send an S3-PR (e.g. run the Windows cluster tests)

So let's assume you have 1 path up and 1 path down.
You send the PR, it gets sent via the path that is up. Later on, things change - and the path that was up is now down, the one that was down is now up. Is the reservation still in place?

Comment 7 Martin Tessun 2017-06-19 12:17:37 UTC
(In reply to Yaniv Kaul from comment #5)
> (In reply to Martin Tessun from comment #0)
> > Description of problem:
> > A VM running in qemu fails to send S3-PR (SGIO) commands to the storage
> > path when the path is presented via multipath.
> > 
> > Version-Release number of selected component (if applicable):
> > all
> > 
> > How reproducible:
> > always
> > 
> > Steps to Reproduce:
> > 1. Have a VM installed and started by libvirt (the qemu process is run by a
> > uid!=0)
> > 2. Try to send an S3-PR (e.g. run the Windows cluster tests)
> 
> So let's assume you have 1 path up and 1 path down.
> You send the PR, it gets sent via the path that is up. Later on, things
> change - and the path that was up is now down, the one that was down is now
> up. Is the reservation still in place?

In the iSCSI use case, yes (as the initiator does not change in RHV); for FC, no (as each FC port has its own WWNN/WWPN assigned to it).

Comment 23 Peixiu Hou 2017-08-21 10:26:53 UTC
Hi,

I tested this bug with the packages provided in comment #12. The MSFC failover validation test also failed; the error message is attached.

Tested through RHVM, which uses the qemu user (uid!=0) to boot the VMs.

Used packages as follows:
device-mapper-multipath-libs-0.4.9-111.el7.bz1452210.x86_64
device-mapper-multipath-sysvinit-0.4.9-111.el7.bz1452210.x86_64
device-mapper-multipath-devel-0.4.9-111.el7.bz1452210.x86_64
device-mapper-multipath-debuginfo-0.4.9-111.el7.bz1452210.x86_64
device-mapper-multipath-0.4.9-111.el7.bz1452210.x86_64
kpartx-0.4.9-111.el7.bz1452210.x86_64
libdmmp-0.4.9-111.el7.bz1452210.x86_64
libdmmp-devel-0.4.9-111.el7.bz1452210.x86_64

kernel-3.10.0-663.el7.x86_64
qemu-kvm-rhev-2.9.0-16.el7_4.4.x86_64
libvirt-3.2.0-14.el7.x86_64
vdsm-4.19.28-1.el7ev.x86_64
rhv-4.1.5.2-0.1.el7
ovirt-engine-4.1.5.2-0.1.el7.noarch
virtio-win-1.9.3-1.el7.noarch
seabios-1.10.2-1.el7.x86_64

FC host info:
# multipath -ll
mpathb (360050763008084e6e000000000000195) dm-1 IBM     ,2145            
size=120G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 1:0:0:1 sdc 8:32  active ready running
| `- 2:0:0:1 sdf 8:80  active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  |- 1:0:1:1 sdg 8:96  active ready running
  `- 2:0:1:1 sdi 8:128 active ready running
mpatha (360050763008084e6e000000000000194) dm-0 IBM     ,2145            
size=100G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 1:0:1:0 sde 8:64  active ready running
| `- 2:0:1:0 sdh 8:112 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  |- 1:0:0:0 sdb 8:16  active ready running
  `- 2:0:0:0 sdd 8:48  active ready running

Booted command info:
-------------------------------------------------------------------------------
qemu 26936 23.7 3.5 2014688 1171756 ? Rl 03:21 12:50 /usr/libexec/qemu-kvm -name guest=msfc-vm1...-drive file=/dev/mapper/360050763008084e6e000000000000195,format=raw,if=none,id=drive-scsi0-0-0-0,cache=none,werror=stop,rerror=stop,aio=native -device scsi-block,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0...
-------------------------------------------------------------------------------
qemu 27294 25.8 3.5 2015724 1166660 ? Sl 03:24 13:13 /usr/libexec/qemu-kvm -name guest=msfc-vm2...-drive file=/dev/mapper/360050763008084e6e000000000000195,format=raw,if=none,id=drive-scsi0-0-0-0,cache=none,werror=stop,rerror=stop,aio=native -device scsi-block,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0...
-------------------------------------------------------------------------------

Comment 24 Peixiu Hou 2017-08-21 10:35:29 UTC
Created attachment 1316185 [details]
failover cluster validation test report

Comment 25 Ben Marzinski 2017-08-22 17:13:47 UTC
Did you add

unpriv_sgio yes

to the defaults section of multipath.conf? Was this a setup where the only issue was that unpriv_sgio wasn't getting set on all the multipath paths correctly? All those packages did was make it possible to configure multipath to set unpriv_sgio on its path devices.
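
To double-check that the option really sits in the defaults section, a small Python sketch like the following could be used. It is a simplistic reader written for this illustration, not multipath's own parser: it assumes one `keyword value` pair per line, `#` comments, and sections opened by `name {` and closed by `}`. The function name and the example text are hypothetical.

```python
def defaults_option(conf_text, option):
    """Return the value of `option` in the defaults section, or None."""
    in_defaults = False
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.endswith("{"):
            # Entering a section; only "defaults" is of interest here.
            in_defaults = line[:-1].strip() == "defaults"
        elif line == "}":
            in_defaults = False
        elif in_defaults:
            parts = line.split(None, 1)
            if parts[0] == option:
                return parts[1].strip('"') if len(parts) > 1 else ""
    return None


# Example configuration text (hypothetical, mirrors the setting asked about).
EXAMPLE_CONF = """
defaults {
    unpriv_sgio yes
}
"""

print(defaults_option(EXAMPLE_CONF, "unpriv_sgio"))  # prints: yes
```

For a real host, the text would come from reading `/etc/multipath.conf`. A result of `None` means the option is not set in the defaults section.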

The packages that will include the code to remove the need for setting the reservation key in multipath.conf will be coming shortly (now that I'm finally done with all my summer time off).

Comment 26 Peixiu Hou 2017-08-24 09:55:27 UTC
Hi,

I tried the following 3 tests for this bug in an FC environment:

1. Tested with "unpriv_sgio yes" added to multipath.conf. Used RHVM for this test because it uses the qemu user to boot the VMs.
The unpriv_sgio status of the shared disk was as follows:

[root@hp-dl388g9-01 etc]# grep -e 0 -e 1 /sys/devices/virtual/block/dm-0/queue/unpriv_sgio /sys/devices/platform/host*/session*/target*/*/block/sd*/queue/unpriv_sgio
/sys/devices/virtual/block/dm-0/queue/unpriv_sgio:1
/sys/devices/platform/host3/session1/target3:0:0/3:0:0:0/block/sdj/queue/unpriv_sgio:1

the shared disk multipath info:
[root@hp-dl388g9-01 ~]# multipath -ll
360050763008084e6e000000000000195 dm-0 IBM     ,2145            
size=120G features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 1:0:0:1 sdc 8:32  active ready running
| `- 2:0:0:1 sdg 8:96  active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  |- 1:0:1:1 sdf 8:80  active ready running
  `- 2:0:1:1 sdi 8:128 active ready running

The test result:
It also failed, but at a different step than in comment #23: the step "ListDisk To Be Validated" passed, and the test failed at the step "Validation SCSI-3 Reservation". Please refer to the attached report.


2. Tested as the root user at the qemu level (not RHVM), using the disk /dev/mapper/360050763008084e6e000000000000195. The command line was the same as in comment #23. It also failed with the same error as in the first test: the step "ListDisk To Be Validated" passed, and it failed at the step "Validation SCSI-3 Reservation".

3. Tested as the root user at the qemu level (not RHVM), using the disks /dev/sdc and /dev/sdf directly. The failover validation test passed.

The command line was:
--------------------------------------------------------------------------------
root     18437  9.8  6.6 3082072 2167452 pts/0 Sl+  22:27   3:24 /usr/libexec/qemu-kvm -name server1...-drive file=/dev/sdc,if=none,media=disk,format=raw,rerror=stop,werror=stop,readonly=off,aio=threads,cache=none,cache.direct=on,id=drive-hotadd,serial=sas-test -device scsi-block,drive=drive-hotadd,bus=scsi-hotadd.0...
--------------------------------------------------------------------------------
root     18480  8.4  6.6 3067692 2165884 pts/1 Sl+  22:27   2:56 /usr/libexec/qemu-kvm -name server2...-drive file=/dev/sdf,if=none,media=disk,format=raw,rerror=stop,werror=stop,readonly=off,aio=threads,cache=none,cache.direct=on,id=drive-hotadd,serial=sas-test -device scsi-block,drive=drive-hotadd,bus=scsi-hotadd.0...
--------------------------------------------------------------------------------
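
The grep in test 1 only covers devices under `session*` directories. To check every path that `multipath -ll` lists, the device names can be pulled out of its output, as in this Python sketch. The regular expression is an assumption based on the path lines shown in this comment (H:C:T:L address, device name, major:minor), and the helper names are hypothetical.

```python
import re

# A path line looks like "| |- 1:0:0:1 sdc 8:32  active ready running":
# H:C:T:L address, kernel device name, major:minor, then the state words.
PATH_LINE = re.compile(r"\d+:\d+:\d+:\d+\s+(\S+)\s+\d+:\d+\s")

# Sample copied from the `multipath -ll` output in this comment.
SAMPLE = """\
360050763008084e6e000000000000195 dm-0 IBM     ,2145
size=120G features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 1:0:0:1 sdc 8:32  active ready running
| `- 2:0:0:1 sdg 8:96  active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  |- 1:0:1:1 sdf 8:80  active ready running
  `- 2:0:1:1 sdi 8:128 active ready running
"""


def path_devices(multipath_ll_output):
    """Return the kernel names of the path devices in `multipath -ll` output."""
    return PATH_LINE.findall(multipath_ll_output)


def unpriv_sgio_attrs(devices):
    """Build the sysfs attribute path to inspect for each path device."""
    return [f"/sys/block/{dev}/queue/unpriv_sgio" for dev in devices]


print(path_devices(SAMPLE))  # prints: ['sdc', 'sdg', 'sdf', 'sdi']
```

Reading each file returned by `unpriv_sgio_attrs()` would then show whether all four FC paths carry the setting, not only the dm device.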

Hope the above information is useful~

Best Regards~
Peixiu Hou

Comment 27 Peixiu Hou 2017-08-24 10:00:41 UTC
Created attachment 1317562 [details]
the validation report after set "unpriv_sgio yes"

Comment 29 Ben Marzinski 2017-09-20 00:08:58 UTC
Along with the unpriv_sgio option, I've now added the ability to use /etc/multipath/prkeys to keep track of the persistent reservation keys.  To do this, you must set

reservation_key file

in /etc/multipath.conf

This will tell multipathd to look in the prkeys file to see if there is any reservation key associated with the device's wwid.  However, you will not need to manually add or remove the keys from the prkeys file. When you have reservation_key set to "file", mpathpersist will handle this for you.  Simply create a registration as usual with mpathpersist

# mpathpersist -oGS <key> <device>

and along with passing the registration to all the path devices, it will notify multipathd to add the key to the prkeys file, so that it will be used for setting
registrations on other paths that come up for this device. When you remove a registration with mpathpersist

# mpathpersist -oGK <key> <device>

it will notify multipathd to remove the key from the prkeys file.
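
For scripting the two calls above, a thin Python wrapper could look like the sketch below. It only rebuilds the command lines quoted in this comment. The function names, the key `0x123abc`, and the device `/dev/mapper/mpatha` are hypothetical, and nothing is executed unless `dry_run` is turned off.

```python
import subprocess


def register_cmd(key, device):
    """argv for creating a registration: mpathpersist -oGS <key> <device>."""
    return ["mpathpersist", "-oGS", key, device]


def unregister_cmd(key, device):
    """argv for removing a registration: mpathpersist -oGK <key> <device>."""
    return ["mpathpersist", "-oGK", key, device]


def run(cmd, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(" ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)


# Hypothetical key and device, shown as a dry run only.
run(register_cmd("0x123abc", "/dev/mapper/mpatha"))
run(unregister_cmd("0x123abc", "/dev/mapper/mpatha"))
```

With reservation_key set to "file", no extra step is needed for the prkeys file; as described above, mpathpersist notifies multipathd itself.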

Comment 42 Lin Li 2017-12-20 12:01:42 UTC
Changed to verified according to comment 39.

Comment 50 errata-xmlrpc 2018-04-10 16:10:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:0884