Bug 1949369

Summary: path devices are not properly removed after flushing a multipath device
Product: Red Hat Enterprise Linux 8 Reporter: Takashi Kajinami <tkajinam>
Component: device-mapper-multipath Assignee: Ben Marzinski <bmarzins>
Status: CLOSED NOTABUG QA Contact: Lin Li <lilin>
Severity: urgent Docs Contact:
Priority: low    
Version: 8.2 CC: agk, bmarzins, heinzm, jmagrini, lilin, msnitzer, prajnoha, zkabelac
Target Milestone: beta Keywords: Triaged
Target Release: --- Flags: pm-rhel: mirror+
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-10-27 01:07:25 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Takashi Kajinami 2021-04-14 06:51:31 UTC
Description of problem:

In RHOSP, we run the following steps when powering on an instance
to ensure that all volume devices are available.

1. Flush the multipath device
2. Remove each iSCSI device with
   $ echo 1 | sudo tee -a /sys/block/<device>/device/delete
3. Log out of all iSCSI portals
4. Log back in to the iSCSI portals
5. Set node.startup=manual
6. Scan for iSCSI devices
7. Set up a multipath device from the available iSCSI devices with
   $ multipathd -a /dev/<device>

The problem is that the path devices are not properly removed from multipathd
after steps 1-3, so step 7 fails because of the orphaned paths left behind.
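
One way to check for leftover paths after steps 1-3 is to filter the path list that multipathd reports. The following is only a sketch: the helper name orphan_paths is hypothetical, and it assumes that "multipathd show paths format '%d %m'" prints one "<dev> <map>" pair per line with "[orphan]" in the map column for paths that belong to no map. Verify that output format on the installed version before relying on it.

```shell
#!/bin/sh
# Sketch: print the path devices that multipathd still tracks but that
# belong to no multipath map. Reads "<dev> <map>" lines on stdin.
# Assumption: orphaned paths carry the literal "[orphan]" in column 2.
orphan_paths() {
    awk '$2 == "[orphan]" { print $1 }'
}

# Usage on a live host (needs root and a running multipathd):
#   multipathd show paths format '%d %m' | orphan_paths
```

Any device printed here after the flush and logout would be a candidate for the orphaned paths described above.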

This issue was initially observed in an RHOSP 16.1 deployment with the following packages
installed.
 - device-mapper-multipath-0.8.3-3.el8_2.3.x86_64
 - device-mapper-multipath-libs-0.8.3-3.el8_2.3.x86_64
 - kpartx-0.8.3-3.el8_2.3.x86_64

We confirmed that updating the packages to the following versions resolves the issue.
 - device-mapper-multipath-0.8.4-5.el8.x86_64
 - device-mapper-multipath-libs-0.8.4-5.el8.x86_64
 - kpartx-0.8.4-5.el8.x86_64

This version is available in the normal RHEL 8 repo but not in the EUS repo.
Since RHOSP 16.1 depends on RHEL 8.2, we need the fixed version in RHEL 8.2.
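
To tell whether a host already carries the version confirmed above, the installed version-release string can be compared against 0.8.4-5.el8. This is a rough sketch: version_at_least is a hypothetical helper, and it uses GNU "sort -V" as an approximation of RPM's own version comparison, which is an assumption that may not hold for every release string.

```shell
#!/bin/sh
# Sketch: succeed if version-release string $1 sorts at or above $2.
# Assumption: "sort -V" ordering is close enough to RPM's comparison
# for the version strings quoted in this bug.
version_at_least() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# Usage on a live host:
#   installed=$(rpm -q --qf '%{VERSION}-%{RELEASE}' device-mapper-multipath)
#   version_at_least "$installed" "0.8.4-5.el8" || echo "update needed"
```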


Comment 2 Jon Magrini 2021-04-14 13:36:28 UTC
---
1. Flush multipath device
---

Which procedure is used to flush the device, i.e. multipath -f? Are you able to reproduce this issue using the multipathd CLI instead of multipath -f, i.e. multipathd remove|del map|multipath $map? Judging from the changelog of 0.8.4-3, I believe the following patches, added for bz #1845875, address this issue:

- Add 0024-libmultipath-make-dm_get_map-status-return-codes-sym.patch
- Add 0025-multipathd-fix-check_path-errors-with-removed-map.patch
- Add 0026-libmultipath-make-dm_flush_maps-only-return-0-on-suc.patch
- Add 0027-multipathd-add-del-maps-multipathd-command.patch
- Add 0028-multipath-make-flushing-maps-work-like-other-command.patch
- Add 0029-multipath-delegate-flushing-maps-to-multipathd.patch
- Add 0030-multipath-add-option-to-skip-multipathd-delegation.patch
  * The above 7 patches fix bz #1845875. Multipath now attempts to
    delegate device removal to multipathd, and multipathd handles
    external device removal better.

Further details are noted here:
https://bugzilla.redhat.com/show_bug.cgi?id=1845875#c3 


If it is not too invasive, those patches could possibly be backported to EUS.

Comment 3 Jon Magrini 2021-04-14 13:45:18 UTC
A z-stream/EUS bug was cloned from bz #1845875 as bz #1856944 (https://bugzilla.redhat.com/show_bug.cgi?id=1856944#c1), but it does not seem to contain what is needed to address the issue described here:

- Add 0020-libmultipath-make-dm_get_map-status-return-codes-sym.patch
- Add 0021-multipathd-fix-check_path-errors-with-removed-map.patch
  * The above 2 patches fix bz #1856944. multipathd handles external
    device removal better.

Comment 4 Takashi Kajinami 2021-04-14 13:49:42 UTC
> What procedure is used to flush the device, ie: multipath -f? 

Yes, multipath -f is used to flush the device.

> Are you able to recreate this issue using multipathd cli vs multipath -f? IE: multipathd remove|del map|multipath $map.

We have not tried using only the multipathd CLI.
Before updating the device-mapper-multipath package, we tried modifying the implementation in OpenStack
to execute "multipathd del path /dev/<device>" after "multipath -f" but before removing the iSCSI device.
This was tested in the customer's deployment, and we confirmed that it also solves the problem.
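
The ordering described above can be sketched as follows. This is not the actual os-brick change: run and teardown_path are hypothetical helpers, the map and device names are placeholders, and the sketch assumes it runs as root. By default it only prints the commands (DRY_RUN=1).

```shell
#!/bin/sh
# Sketch: print (DRY_RUN=1, the default) or execute a command.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Flush the map, drop the path from multipathd, then delete the SCSI device.
# $1 = multipath map name, $2 = path device name without /dev/ (placeholders).
teardown_path() {
    map=$1
    dev=$2
    run multipath -f "$map"
    run multipathd del path "/dev/$dev"
    run sh -c "echo 1 | tee -a /sys/block/$dev/device/delete"
}

# Usage (dry run): teardown_path mpatha sdb
```

The point of the ordering is that multipathd is told about the path removal explicitly before the kernel device disappears, rather than relying on it to notice the removal on its own.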

Comment 16 Takashi Kajinami 2021-10-27 01:07:25 UTC
Sorry, I forgot to update this bug after we reached a conclusion on the RHOSP side.

I'm closing this bug as NOTABUG because the issue will be fixed in RHOSP (os-brick).