Bug 1452210
Summary: | multipath does not distribute the unpriv-SGIO setting to its child devices | ||||||||
---|---|---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Martin Tessun <mtessun> | ||||||
Component: | device-mapper-multipath | Assignee: | Ben Marzinski <bmarzins> | ||||||
Status: | CLOSED ERRATA | QA Contact: | Lin Li <lilin> | ||||||
Severity: | urgent | Docs Contact: | Marek Suchánek <msuchane> | ||||||
Priority: | urgent | ||||||||
Version: | 7.3 | CC: | agk, ailan, amureini, bmarzins, boruvka.michal, heinzm, jbrassow, knoel, lijin, lilin, loberman, lvm-team, michen, msnitzer, mtessun, nsoffer, pasik, pbonzini, phou, prajnoha, rhandlin, slevine, vanhoof, ykaul, ylavi | ||||||
Target Milestone: | rc | Keywords: | ZStream | ||||||
Target Release: | --- | ||||||||
Hardware: | All | ||||||||
OS: | Linux | ||||||||
Whiteboard: | |||||||||
Fixed In Version: | device-mapper-multipath-0.4.9-118.el7 | Doc Type: | Enhancement | ||||||
Doc Text: |
DM Multipath no longer requires reservation keys in advance
DM Multipath now supports two new configuration options in the `multipath.conf` file:
* `unpriv_sgio`
* `prkeys_file`
The `reservation_key` option of the `defaults` and `multipaths` sections accepts a new keyword: `file`. When it is set, the `multipathd` service uses the file configured in the `prkeys_file` option of the `defaults` section to get the reservation key to use for the paths of a multipath device. The `prkeys` file is automatically updated by the `mpathpersist` utility. The default for the `reservation_key` option remains undefined, and the default for the `prkeys_file` option is `/etc/multipath/prkeys`.
If the new `unpriv_sgio` option is set to `yes`, DM Multipath will now create all new devices and their paths with the `unpriv_sgio` attribute. This option is used internally by other software, and is unnecessary for most DM Multipath users. It defaults to `no`.
These changes make it possible to use the `mpathpersist` utility without knowing ahead of time what reservation keys will be used and without adding them to the `multipath.conf` configuration file. As a result, it is now easier to use the `mpathpersist` utility to manage multipath persistent reservations in multiple setups.
|
Story Points: | --- | ||||||
Clone Of: | |||||||||
: | 1510834 1540718 (view as bug list) | Environment: | |||||||
Last Closed: | 2018-04-10 16:10:28 UTC | Type: | Bug | ||||||
Regression: | --- | Mount Type: | --- | ||||||
Documentation: | --- | CRM: | |||||||
Verified Versions: | Category: | --- | |||||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||
Embargoed: | |||||||||
Bug Depends On: | |||||||||
Bug Blocks: | 1111783, 1111784, 1420851, 1469559, 1510834, 1540718 | ||||||||
Attachments: |
|
Description
Martin Tessun
2017-05-18 14:56:55 UTC
The same bug was opened on the kernel (bug 1254316), but it was expected behavior there.

(In reply to Martin Tessun from comment #0)
> Description of problem:
> A VM running in qemu does fail to send S3-PR (SGIO) commands to the storage
> path, in case the path is presented via multipath.
>
> Version-Release number of selected component (if applicable):
> all
>
> How reproducible:
> always
>
> Steps to Reproduce:
> 1. Have a VM installed and started by libvirt (qemu process is run by a
> uid!=0)
> 2. Try to send a S3-PR (e.g. run the Windows cluster tests)

So let's assume you have 1 path up and 1 path down. You send the PR, and it gets sent via the path that is up. Later on, things change: the path that was up is now down, and the one that was down is now up. Is the reservation still in place?

(In reply to Yaniv Kaul from comment #5)
> (In reply to Martin Tessun from comment #0)
> > Description of problem:
> > A VM running in qemu does fail to send S3-PR (SGIO) commands to the storage
> > path, in case the path is presented via multipath.
> >
> > Version-Release number of selected component (if applicable):
> > all
> >
> > How reproducible:
> > always
> >
> > Steps to Reproduce:
> > 1. Have a VM installed and started by libvirt (qemu process is run by a
> > uid!=0)
> > 2. Try to send a S3-PR (e.g. run the Windows cluster tests)
>
> So let's assume you have 1 path up and 1 path down.
> You send the PR, it gets sent via the path that is up. Later on, things
> change - and the path that was up is now down, the one that was down is now
> up. Is the reservation still in place?

In the iSCSI use case, yes (as the initiator does not change in RHV); for FC, no (as each FC port has its own WWNN/PN assigned to it).

Hi,

I tried to test this bug with the packages provided in comment#12. The MSFC failover validation test also failed; the error message is in the attachment.
Tested through RHVM, which uses the qemu user (uid!=0) to boot the VMs.
Used packages as follows:
device-mapper-multipath-libs-0.4.9-111.el7.bz1452210.x86_64
device-mapper-multipath-sysvinit-0.4.9-111.el7.bz1452210.x86_64
device-mapper-multipath-devel-0.4.9-111.el7.bz1452210.x86_64
device-mapper-multipath-debuginfo-0.4.9-111.el7.bz1452210.x86_64
device-mapper-multipath-0.4.9-111.el7.bz1452210.x86_64
kpartx-0.4.9-111.el7.bz1452210.x86_64
libdmmp-0.4.9-111.el7.bz1452210.x86_64
libdmmp-devel-0.4.9-111.el7.bz1452210.x86_64
kernel-3.10.0-663.el7.x86_64
qemu-kvm-rhev-2.9.0-16.el7_4.4.x86_64
libvirt-3.2.0-14.el7.x86_64
vdsm-4.19.28-1.el7ev.x86_64
rhv-4.1.5.2-0.1.el7
ovirt-engine-4.1.5.2-0.1.el7.noarch
virtio-win-1.9.3-1.el7.noarch
seabios-1.10.2-1.el7.x86_64

FC host info:
# multipath -ll
mpathb (360050763008084e6e000000000000195) dm-1 IBM ,2145
size=120G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 1:0:0:1 sdc 8:32 active ready running
| `- 2:0:0:1 sdf 8:80 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  |- 1:0:1:1 sdg 8:96 active ready running
  `- 2:0:1:1 sdi 8:128 active ready running
mpatha (360050763008084e6e000000000000194) dm-0 IBM ,2145
size=100G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 1:0:1:0 sde 8:64 active ready running
| `- 2:0:1:0 sdh 8:112 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  |- 1:0:0:0 sdb 8:16 active ready running
  `- 2:0:0:0 sdd 8:48 active ready running

Booted command info:
-------------------------------------------------------------------------------
qemu 26936 23.7 3.5 2014688 1171756 ? Rl 03:21 12:50 /usr/libexec/qemu-kvm -name guest=msfc-vm1...-drive file=/dev/mapper/360050763008084e6e000000000000195,format=raw,if=none,id=drive-scsi0-0-0-0,cache=none,werror=stop,rerror=stop,aio=native -device scsi-block,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0...
-------------------------------------------------------------------------------
qemu 27294 25.8 3.5 2015724 1166660 ? Sl 03:24 13:13 /usr/libexec/qemu-kvm -name guest=msfc-vm2...-drive file=/dev/mapper/360050763008084e6e000000000000195,format=raw,if=none,id=drive-scsi0-0-0-0,cache=none,werror=stop,rerror=stop,aio=native -device scsi-block,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0...
-------------------------------------------------------------------------------

Created attachment 1316185 [details]
failover cluster validation test report
Did you add "unpriv_sgio yes" to the defaults section of multipath.conf? Was this a setup where the only issue was that unpriv_sgio wasn't getting set correctly on all the multipath paths? All those packages did was make it possible to configure multipath to set unpriv_sgio on its path devices. The packages that include the code to remove the need for setting the reservation key in multipath.conf will be coming shortly (now that I'm finally done with all my summer time off).

Hi,

I tried the following 3 tests for this bug in an FC environment:

1. Tested with "unpriv_sgio yes" added to multipath.conf.
Used RHVM for this test because it uses the qemu user to boot the VMs.

The unpriv_sgio status of the shared disk:
[root@hp-dl388g9-01 etc]# grep -e 0 -e 1 /sys/devices/virtual/block/dm-0/queue/unpriv_sgio /sys/devices/platform/host*/session*/target*/*/block/sd*/queue/unpriv_sgio
/sys/devices/virtual/block/dm-0/queue/unpriv_sgio:1
/sys/devices/platform/host3/session1/target3:0:0/3:0:0:0/block/sdj/queue/unpriv_sgio:1

The multipath info of the shared disk:
[root@hp-dl388g9-01 ~]# multipath -ll
360050763008084e6e000000000000195 dm-0 IBM ,2145
size=120G features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 1:0:0:1 sdc 8:32 active ready running
| `- 2:0:0:1 sdg 8:96 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  |- 1:0:1:1 sdf 8:80 active ready running
  `- 2:0:1:1 sdi 8:128 active ready running

The test result:
It also failed, but not at the same step as in comment#23: the step "ListDisk To Be Validated" passed, and the test failed at the step "Validation SCSi-3 Reservation". For the report, please refer to the attachment.

2. Tested as the root user at the qemu level (not RHVM).
Used the disk /dev/mapper/360050763008084e6e000000000000195 for the test; the command line is the same as in comment#23. It also failed with the same error as in the first test: the step "ListDisk To Be Validated" passed, and the test failed at the step "Validation SCSi-3 Reservation".

3. Tested as the root user at the qemu level (not RHVM).
Used the disks /dev/sdc and /dev/sdf directly for the failover validation test, and it passed.
The command lines were:
--------------------------------------------------------------------------------
root 18437 9.8 6.6 3082072 2167452 pts/0 Sl+ 22:27 3:24 /usr/libexec/qemu-kvm -name server1...-drive file=/dev/sdc,if=none,media=disk,format=raw,rerror=stop,werror=stop,readonly=off,aio=threads,cache=none,cache.direct=on,id=drive-hotadd,serial=sas-test -device scsi-block,drive=drive-hotadd,bus=scsi-hotadd.0...
--------------------------------------------------------------------------------
root 18480 8.4 6.6 3067692 2165884 pts/1 Sl+ 22:27 2:56 /usr/libexec/qemu-kvm -name server2...-drive file=/dev/sdf,if=none,media=disk,format=raw,rerror=stop,werror=stop,readonly=off,aio=threads,cache=none,cache.direct=on,id=drive-hotadd,serial=sas-test -device scsi-block,drive=drive-hotadd,bus=scsi-hotadd.0...
--------------------------------------------------------------------------------

Hope the above information is useful~

Best Regards~
Peixiu Hou

Created attachment 1317562 [details]
the validation report after set "unpriv_sgio yes"
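
The sysfs check in test 1 above could also be scripted so that it follows the multipath device to its own path devices. The following is only a minimal sketch, not part of the fix: it assumes that the `unpriv_sgio` attribute is reachable as `/sys/block/<dev>/queue/unpriv_sgio` and that the path devices are listed under `/sys/block/<dm>/slaves/`. The function names `unpriv_sgio_report` and `mismatched_paths` are hypothetical.

```python
import os


def unpriv_sgio_report(dm_name, sysfs_block="/sys/block"):
    """Return {device: value} for a dm device and each of its path devices.

    The value is the stripped content of queue/unpriv_sgio, or None when the
    attribute cannot be read (assumption: not every kernel exposes it).
    """
    def read_flag(dev):
        path = os.path.join(sysfs_block, dev, "queue", "unpriv_sgio")
        try:
            with open(path) as f:
                return f.read().strip()
        except OSError:
            return None

    report = {dm_name: read_flag(dm_name)}
    # Assumption: the slaves directory names the path devices of the dm device.
    try:
        slaves = sorted(os.listdir(os.path.join(sysfs_block, dm_name, "slaves")))
    except OSError:
        slaves = []
    for dev in slaves:
        report[dev] = read_flag(dev)
    return report


def mismatched_paths(report, dm_name):
    """List path devices whose setting differs from the dm device's setting."""
    want = report[dm_name]
    return [dev for dev, val in report.items() if dev != dm_name and val != want]


if __name__ == "__main__":
    rep = unpriv_sgio_report("dm-0")
    print(rep)
    print("paths that differ from dm-0:", mismatched_paths(rep, "dm-0"))
```

A non-empty result from `mismatched_paths` would point at the situation described in the summary of this bug, where the setting on the multipath device is not reflected on its child devices.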
Along with the unpriv_sgio option, I've now added the ability to use /etc/multipath/prkeys to keep track of the persistent reservation keys. To do this, you must set

reservation_key file

in /etc/multipath.conf. This tells multipathd to look in the prkeys file to see if there is any reservation key associated with the device's wwid. However, you will not need to manually add or remove the keys from the prkeys file. When you have reservation_key set to "file", mpathpersist will handle this for you. Simply create a registration as usual with mpathpersist

# mpathpersist -oGS <key> <device>

and along with passing the registration to all the path devices, it will notify multipathd to add the key to the prkeys file, so that it will be used for setting registrations on other paths that come up for this device. When you remove a registration with mpathpersist

# mpathpersist -oGK <key> <device>

it will notify multipathd to remove the key from the prkeys file.

Changed to VERIFIED according to comment 39.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:0884
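
For reference, the options discussed in this report could be combined in the defaults section of /etc/multipath.conf roughly as follows. This is only a sketch based on the Doc Text above, not a tested configuration. The `prkeys_file` line restates the documented default and could be left out, and `unpriv_sgio yes` is described above as unnecessary for most DM Multipath users.

```
defaults {
    unpriv_sgio      yes
    reservation_key  file
    prkeys_file      /etc/multipath/prkeys
}
```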