Bug 1920571

Summary: fcp multipath will not recover failed paths automatically
Product: OpenShift Container Platform
Component: RHCOS
Version: 4.7
Hardware: s390x
OS: Unspecified
Severity: medium
Priority: medium
Status: CLOSED ERRATA
Reporter: Alexander Klein <alklein>
Assignee: Jonathan Lebon <jlebon>
QA Contact: Michael Nguyen <mnguyen>
CC: bbreard, bgilbert, hwolf, imcleod, jligon, miabbott, ndubrovs, nstielau, psundara, slowrie, smilner, sorth, wvoesch
Target Release: 4.7.0
Doc Type: No Doc Update
Type: Bug
Last Closed: 2021-02-24 15:56:23 UTC
Bug Blocks: 1903544, 1922292

Description Alexander Klein 2021-01-26 16:03:18 UTC
Description of problem:

Install a node and configure it to use multipath by changing the parmline (adding more paths and enabling multipath)
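For illustration, the parmline additions for such a setup would look roughly like the following; the device bus-IDs, WWPNs, and LUNs are placeholders rather than the values from this system (one rd.zfcp entry per path, plus rd.multipath=default as referenced later in this bug):

    rd.zfcp=0.0.1900,0x500507630708d3b3,0x4001401e00000000
    rd.zfcp=0.0.1940,0x500507630718d3b3,0x4001401e00000000
    rd.zfcp=0.0.1980,0x500507630728d3b3,0x4001401e00000000
    rd.zfcp=0.0.19c0,0x500507630738d3b3,0x4001401e00000000
    rd.multipath=default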

mpatha (36005076307ffc5e3000000000000080b) dm-0 ##,##
size=100G features='0' hwhandler='1 alua' wp=rw
`-+- policy='service-time 0' prio=0 status=active
  |- #:#:#:# sdb 8:16 active undef running
  |- #:#:#:# sdd 8:48 active undef running
  |- #:#:#:# sda 8:0 active undef running
  `- #:#:#:# sdc 8:32 active undef running

Fail one path (e.g. by deconfiguring it, detaching it in z/VM, or pulling a cable). The failed paths will be detected correctly:

mpatha (36005076307ffc5e3000000000000080b) dm-0 ##,##
size=100G features='0' hwhandler='1 alua' wp=rw
`-+- policy='service-time 0' prio=0 status=active
  |- #:#:#:# sdb 8:16 failed undef running
  |- #:#:#:# sdd 8:48 failed undef running
  |- #:#:#:# sda 8:0 active undef running
  `- #:#:#:# sdc 8:32 active undef running

Once the paths are back online, the node will not activate the failed paths again; the only way of re-enabling them is to restart the whole node, which is not the expected behaviour.

Some other findings that might be related:
- multipath reports that it does not have a multipath config
- multipath -ll does not show SCSI IDs for any devices (just #)
- lsscsi is missing in RHCOS
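For reference, these findings can be checked with commands along these lines (exact output differs per system):

    sudo multipathd show config   # inspect the configuration the daemon is actually using
    sudo multipath -ll            # paths are listed, but SCSI IDs show up only as '#'
    lsscsi                        # fails, since the tool is not shipped in RHCOS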

Version-Release number of selected component (if applicable):

oc version
Client Version: 4.7.0-0.nightly-s390x-2021-01-15-033217
Server Version: 4.7.0-0.nightly-s390x-2021-01-15-033217
Kubernetes Version: v1.20.0+7616fab

rhcos-47.83.202101142111

Comment 1 Micah Abbott 2021-01-26 16:17:53 UTC
@Jonathan @Prashanth, could one of you investigate this scenario?

Comment 2 Jonathan Lebon 2021-01-26 16:49:59 UTC
I'm not a multipath SME, so I'm not sure about the specifics here, but this is unlikely to be an RHCOS-specific bug. We ship the same multipath as vanilla RHEL.

Is something perhaps missing in /etc/multipath.conf that would make it automatically re-activate previously failed paths?

Comment 3 Prashanth Sundararaman 2021-01-26 17:31:48 UTC
@Alexander Klein - How did you simulate the failover on z/VM? Are there any commands you can share so I can try the same?

But I agree that this looks more like a general multipath question/issue than an RHCOS one.

Comment 4 Alexander Klein 2021-01-27 07:42:12 UTC
On the z/VM side you can just detach one of the FCP devices with

#cp det <4-digit fcp device>

From Linux:

sudo vmcp det 1234


Once you attach it again with

#cp att 1234 to *

(use * if you are issuing the command from that guest, or the guest name to do it remotely), Linux will automatically detect the FCP device being available again.
Only multipath will not bring the path up again.

Another way would be to deconfigure single paths in Linux with

sudo chzdev -d <fcpdevice>:0x<wwpn>:0x<lun> 

and afterwards reconfigure them with

sudo chzdev -e <fcpdevice>:0x<wwpn>:0x<lun> 

They will come up for Linux, but multipath will still not recognize them.
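Put together, the reproduction sequence looks roughly like this (bus-ID, WWPN, and LUN are placeholders):

    # deconfigure one path and watch multipath mark it as failed
    sudo chzdev -d 0.0.1940:0x500507630718d3b3:0x4001401e00000000
    sudo multipath -ll

    # reconfigure the path; the SCSI device reappears,
    # but multipath still reports the path as failed
    sudo chzdev -e 0.0.1940:0x500507630718d3b3:0x4001401e00000000
    sudo multipath -ll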

Comment 5 Alexander Klein 2021-01-27 09:26:52 UTC
I just created a valid /etc/multipath.conf (the s390x FCP recommendation):

    defaults {
        default_features "1 queue_if_no_path"
        user_friendly_names yes
        path_grouping_policy multibus
        dev_loss_tmo 2147483647
        fast_io_fail_tmo 5
    }

    blacklist {
        devnode '*'
    }

    blacklist_exceptions {
        devnode "^sd[a-z]+"
    }


With this config the failed paths recover correctly and the SCSI IDs are displayed correctly, so this should be fine as long as the documentation mentions that you have to do this step.
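For reference, one way to apply such a config on a running node would be roughly (multipathd reconfigure reloads the configuration without a reboot):

    # after writing the config above to /etc/multipath.conf:
    sudo multipathd reconfigure   # make the running daemon re-read /etc/multipath.conf
    sudo multipath -ll            # paths and SCSI IDs should now be shown correctly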

Comment 6 Alexander Klein 2021-01-27 09:39:08 UTC
On RHEL you would get a default multipath.conf by running

sudo /sbin/mpathconf --enable

This works on RHCOS too, but it is not done automatically when enabling multipath, so it might be a good idea to do this automatically for a default config and still have the option to enable a custom config via MCO.
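As a sketch, the RHEL-style invocation and a quick check would be (--with_multipathd also starts the daemon right away):

    sudo /sbin/mpathconf --enable --with_multipathd y   # write a default /etc/multipath.conf and start multipathd
    sudo multipathd show config | head                  # confirm the daemon picked up the generated config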

Comment 7 Stefan Orth 2021-01-27 14:14:41 UTC
You can create a multipath.conf with MCO before you enable MP via MCO:

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: "multipath"
  name: worker-enablempath-config
spec:
  config:
    ignition:
      version: 3.2.0
    storage:
      files:
        - contents:
           source: data:text/plain;charset=utf-8;base64,IyBkZXZpY2UtbWFwcGVyLW11bHRpcGF0aCBjb25maWd1cmF0aW9uIGZpbGUKCiMgRm9yIGEgY29tcGxldGUgbGlzdCBvZiB0aGUgZGVmYXVsdCBjb25maWd1cmF0aW9uIHZhbHVlcywgcnVuIGVpdGhlcjoKIyAjIG11bHRpcGF0aCAtdAojIG9yCiMgIyBtdWx0aXBhdGhkIHNob3cgY29uZmlnCgojIEZvciBhIGxpc3Qgb2YgY29uZmlndXJhdGlvbiBvcHRpb25zIHdpdGggZGVzY3JpcHRpb25zLCBzZWUgdGhlCiMgbXVsdGlwYXRoLmNvbmYgbWFuIHBhZ2UuCgpkZWZhdWx0cyB7Cgl1c2VyX2ZyaWVuZGx5X25hbWVzIHllcwoJZmluZF9tdWx0aXBhdGhzIHllcwoJZW5hYmxlX2ZvcmVpZ24gIl4kIgp9CgpibGFja2xpc3RfZXhjZXB0aW9ucyB7CiAgICAgICAgcHJvcGVydHkgIihTQ1NJX0lERU5UX3xJRF9XV04pIgp9CgpibGFja2xpc3Qgewp9Cg==
          filesystem: root
          mode: 0644
          path: /etc/multipath.conf

I used a default configuration, but it is also possible to create another (recommended) config. I labeled my node as "multipath" to apply it to a specific node only.
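As an illustration, the base64 payload and the MachineConfig could be produced and applied along these lines (file names are placeholders):

    # encode the desired multipath.conf as the data-URL payload
    base64 -w0 multipath.conf
    # paste the output into the 'source:' field above, then apply the MachineConfig
    oc apply -f worker-enablempath-config.yaml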

Comment 8 Prashanth Sundararaman 2021-01-27 16:21:50 UTC
OK, yes, I see this issue too, and adding the `/etc/multipath.conf` does work.

@Jonathan Lebon - Could the /etc/multipath.conf be persisted to the real root when the rd.multipath=default karg is present in dracut? Is that a change we could potentially make, and does it make sense?

Comment 9 Jonathan Lebon 2021-01-27 21:31:55 UTC
(In reply to Prashanth Sundararaman from comment #8)
> @Jonathan Lebon - Could the /etc/multipath.conf be persisted to the real
> root in the case that the rd.multipath=default kargs is present in dracut?
> is that a change we could potentially make? and does it make sense?

This is mostly how it's set up: https://github.com/coreos/fedora-coreos-config/blob/6fe1cff5ade8bc7489b6ab542d6f3cb5c92d9082/overlay.d/05core/usr/lib/dracut/modules.d/35coreos-ignition/coreos-teardown-initramfs.sh#L170-L184

But the thing is that this code only kicks in on first boot, while `rd.multipath=default` is now a second-boot thing, so the config doesn't get propagated.

But yeah, I think it's reasonable to fix this so as not to make the UX too painful. I'll look at hoisting that code out so it runs on every boot instead.
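For context, the propagation step amounts to something like the following sketch (not the actual coreos-teardown-initramfs.sh code, just the idea: copy the config generated in the initramfs onto the real root so it survives later boots):

    # in the initramfs, before switching root:
    if [ -f /etc/multipath.conf ]; then
        cp -v /etc/multipath.conf /sysroot/etc/multipath.conf
    fi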

Comment 10 Jonathan Lebon 2021-01-27 22:03:17 UTC
@psundara If you have the cycles, I would appreciate a sanity check of that patch on RHCOS with the same s390x scenario. I did confirm it works fine in `cosa run --qemu-multipath` on x86.

Comment 11 Prashanth Sundararaman 2021-01-28 13:50:58 UTC
(In reply to Jonathan Lebon from comment #10)
> @psundara If you have the cycles, would appreciate a sanity-check
> of that patch on RHCOS with the same s390x scenario. I did confirm it works
> fine in `cosa run --qemu-multipath` on x86.

Will test this out today. Thank you!

Comment 12 Prashanth Sundararaman 2021-01-28 20:38:14 UTC
The patch posted above works. I tested disk failover, and when I re-enabled the device, the path was recognized as active. Thanks, Jonathan!

Comment 13 Micah Abbott 2021-01-28 21:52:04 UTC
(In reply to Prashanth Sundararaman from comment #12)
> The patch posted above works. Tested disk failover and when i enabled the
> device, the path was recognized as active. thanks Jonathan!

Good news!  I'm going to target this BZ for 4.7 with hopes of getting it in the next boot image bump.

Comment 14 Prashanth Sundararaman 2021-01-28 22:08:50 UTC
(In reply to Micah Abbott from comment #13)
> (In reply to Prashanth Sundararaman from comment #12)
> > The patch posted above works. Tested disk failover and when i enabled the
> > device, the path was recognized as active. thanks Jonathan!
> 
> Good news!  I'm going to target this BZ for 4.7 with hopes of getting it in
> the next boot image bump.

that would be great. Thanks Micah!

Comment 19 Prashanth Sundararaman 2021-02-03 17:13:33 UTC
@Alexander Klein - Can you please retest with the latest RHCOS image: https://releases-rhcos-art.cloud.privileged.psi.redhat.com/?stream=releases/rhcos-4.7-s390x&release=47.83.202102031512-0#47.83.202102031512-0. I tested it and it works fine.

Comment 25 Micah Abbott 2021-02-05 14:53:17 UTC
Prashanth verified this in comment #21 using a recent RHCOS 4.7 build.

Comment 28 errata-xmlrpc 2021-02-24 15:56:23 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633