Bug 1418856

Summary: multipath service fails to start on compute node
Product: Red Hat OpenStack Reporter: William <wlehman>
Component: python-os-brick Assignee: Gorka Eguileor <geguileo>
Status: CLOSED ERRATA QA Contact: Avi Avraham <aavraham>
Severity: high Docs Contact:
Priority: high    
Version: 8.0 (Liberty) CC: aathomas, apevec, bschmaus, dmaley, egafford, eharney, geguileo, hmatsumo, jschluet, jthomas, jwaterwo, knoha, lhh, lkuchlan, lyarwood, mas-hatada, mlopes, nlevinki, pgrist, scohen, sgotliv, srevivo, timotcla, vcojot
Target Milestone: zstream Keywords: Triaged, ZStream
Target Release: 8.0 (Liberty)   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: python-os-brick-0.5.0-4.el7ost Doc Type: Bug Fix
Doc Text:
This update contains a complete refactoring of the iSCSI connection mechanism, resulting in improved reliability. For optimal results, use with openstack-cinder >= 7.0.3-8 and iscsi-initiator-utils >= 6.2.0.874-2.
Story Points: ---
Clone Of: 1372428 Environment:
Last Closed: 2017-10-25 17:06:46 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1322044, 1372428, 1422941, 1427342    
Bug Blocks: 1194008, 1372431, 1378574, 1381347, 1386469    

Description William 2017-02-02 22:13:02 UTC
+++ This bug was initially created as a clone of Bug #1372428 +++

+++ This bug was initially created as a clone of Bug #1322044 +++

The storage is configured in ALUA mode. In an OpenStack environment, ALUA RTPG
response failures can happen: if a target LUN is not visible to a host, the RTPG
request fails even though the storage is configured for ALUA. The LUN becomes
invisible because of a race between scanning and attaching/detaching LUNs in
parallel, as I explained before.
Here is how the issue arises in an OpenStack environment.

When OpenStack allocates a LUN for a new VM (virtual machine):

A1. OpenStack asks the storage controller to create a new LUN and make it visible to the host; the command provided by EMC is used for this purpose. The storage controller may reuse the WWID of a LUN that was created in the past and has already been removed.
   
A2. OpenStack scans so that the host detects the newly added LUN, using the
   following command:
   
     echo "- - -" > /sys/class/scsi_host/hostN/scan

When OpenStack removes a LUN while destroying a VM:

R1. OpenStack deletes a path for the LUN with the following command:

     echo 1 > /sys/bus/scsi/drivers/sd/h:b:t:l/delete
     
R2. OpenStack asks the storage controller to remove the LUN.
    The command provided by EMC is used for this purpose.
    
The problem is that OpenStack executes the above process in parallel with multiple threads.

Here is an example of the race happening between VM1 and VM2:

       VM1                                  VM2
1. A1(Create LUN):WWID X assigned 
2. A2(Scan LUN):Detected sdaa-WWID X
3. ...
4. R1(Delete path):sdaa removed
5.                                       A1(Create LUN):WWID Y assigned 
6.                                       A2(Scan LUN):Detected sdaa-WWID X, Detected sdab-WWID Y
7. R2(Remove LUN):Remove WWID X

If the scan for VM2 (step 6) happens between deleting the path (step 4) and removing the LUN (step 7) for VM1, the path sdaa becomes visible again even though it was already deleted at step 4. The storage controller then removes the LUN at step 7, but the path sdaa still remains.

The example continues for another VM: VM3.

       VM3 
8. A1(Create LUN):WWID X reassigned 
9. A2(Scan LUN):Detected sdac-WWID X

multipathd falsely groups sdaa and sdac into the same multipath device because
they have the same WWID X. However, the LUN behind the path sdaa was already
removed at step 7, so sdaa is recognized as a down path.

There are two workarounds:

 1. Change multipath.conf to remove the ALUA configuration
 2. Delete all down paths manually (see the sketch below)
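
As an illustration of workaround 2, here is a minimal Python sketch that removes a failed path by writing to its sysfs delete node. The device name is a placeholder, and in practice the down paths would first be identified from "multipath -ll" output; this is not os-brick code.

     # Hypothetical sketch: remove a failed SCSI path via its sysfs delete node.
     # The device name ("sdaa") is a placeholder for a path reported as faulty.
     def delete_scsi_path(device):
         delete_node = "/sys/block/%s/device/delete" % device
         with open(delete_node, "w") as f:
             f.write("1")

     # e.g. delete_scsi_path("sdaa")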

--- Additional comment from Red Hat Bugzilla Rules Engine on 2016-03-29 15:20:35 EDT ---

Since this issue was entered in bugzilla without a release flag set, rhos-8.0? has been automatically added to ensure that it is properly evaluated for this release.

--- Additional comment from Sergey Gotliv on 2016-03-30 09:53:48 EDT ---

I guess both Cinder and Nova should scan for the specific device instead of using 
a wildcard '- - -'.

echo "c t l" > /sys/class/scsi_host/hosth/scan

c - channel on the HBA, 
t - SCSI target ID, 
l - LUN.
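
For illustration only, a minimal Python sketch of such a targeted rescan; the function name and the host/channel/target/LUN values are placeholders, not os-brick's actual implementation:

     # Hypothetical sketch: rescan a single "c t l" triple on one SCSI host
     # instead of the "- - -" wildcard, which rescans every target and LUN.
     def targeted_rescan(host_num, channel, target_id, lun):
         scan_path = "/sys/class/scsi_host/host%d/scan" % host_num
         with open(scan_path, "w") as f:
             f.write("%s %s %s" % (channel, target_id, lun))

     # e.g. targeted_rescan(2, 0, 0, 184)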

--- Additional comment from Lee Yarwood on 2016-03-30 12:23:27 EDT ---

(In reply to Sergey Gotliv from comment #2)
> I guess both Cinder and Nova should scan for the specific device instead of
> using 
> a wildcard '- - -'.
> 
> echo "c t l" > /sys/class/scsi_host/hosth/scan
> 
> c - channel on the HBA, 
> t - SCSI target ID, 
> l - LUN.

So we should have the LUN when calling connect_volume via connection_info['target_lun'], we could then use this in the rescan [2]. Would that be enough do you think?

[1] https://github.com/openstack/os-brick/blob/master/os_brick/initiator/connector.py#L1369
[2] https://github.com/openstack/os-brick/blob/master/os_brick/initiator/connector.py#L1418

--- Additional comment from Sergey Gotliv on 2016-04-03 03:33:19 EDT ---

(In reply to Lee Yarwood from comment #3)
> (In reply to Sergey Gotliv from comment #2)
> > I guess both Cinder and Nova should scan for the specific device instead of
> > using 
> > a wildcard '- - -'.
> > 
> > echo "c t l" > /sys/class/scsi_host/hosth/scan
> > 
> > c - channel on the HBA, 
> > t - SCSI target ID, 
> > l - LUN.
> 
> So we should have the LUN when calling connect_volume via
> connection_info['target_lun'], we could then use this in the rescan [2].
> Would that be enough do you think?

I believe we also need to determine the correct channel and SCSI target ID values, as described here:

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Storage_Administration_Guide/adding_storage-device-or-path.html

> 
> [1]
> https://github.com/openstack/os-brick/blob/master/os_brick/initiator/
> connector.py#L1369
> [2]
> https://github.com/openstack/os-brick/blob/master/os_brick/initiator/
> connector.py#L1418
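
As an illustration of that lookup, here is a hedged Python sketch that lists the channel and target IDs already present under a SCSI host in sysfs; the sysfs layout and parsing are assumptions that may vary by transport and kernel version, and this is not os-brick's actual code:

     # Hypothetical sketch: list "target<host>:<channel>:<target_id>" entries
     # under a SCSI host, e.g. "target2:0:0" -> host 2, channel 0, target 0.
     import glob
     import os

     def channels_and_targets(host_num):
         pattern = "/sys/class/scsi_host/host%d/device/target*" % host_num
         pairs = []
         for target_dir in glob.glob(pattern):
             _, channel, target_id = os.path.basename(target_dir).split(":")
             pairs.append((channel, target_id))
         return pairs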

--- Additional comment from Sergey Gotliv on 2016-04-07 01:31:18 EDT ---

This upstream patch is trying to scan for the specific LUN

https://review.openstack.org/#/c/299552/

--- Additional comment from Hidehiko Matsumoto on 2016-04-18 10:29:50 EDT ---

Sorry for bothering you.
Is there any update about this?

--- Additional comment from Sergey Gotliv on 2016-06-01 09:28:51 EDT ---

(In reply to Hidehiko Matsumoto from comment #6)
> Sorry for bothering you.
> Is there any update about this?

Its going slow, mostly because we have a limited access to the FC environment.
Will keep you informed with the progress.

--- Additional comment from Hidehiko Matsumoto on 2016-06-09 03:03:30 EDT ---

(In reply to Sergey Gotliv from comment #7)
> > Sorry for bothering you.
> > Is there any update about this?
> 
> Its going slow, mostly because we have a limited access to the FC
> environment.
> Will keep you informed with the progress.

Thank you for your information.  I got it.

--- Additional comment from Hidehiko Matsumoto on 2016-07-05 05:40:59 EDT ---

(In reply to Sergey Gotliv from comment #7)
> (In reply to Hidehiko Matsumoto from comment #6)
> > Sorry for bothering you.
> > Is there any update about this?
> 
> Its going slow, mostly because we have a limited access to the FC
> environment.
> Will keep you informed with the progress.

Sorry for bothering you, but I would like to confirm the status of this BZ.
Are we still waiting for access to the FC environment?
Thanks,

--- Additional comment from Sean Cohen on 2016-07-21 11:21:00 EDT ---

Gorka,
Please follow-up with update 
Thanks,
Sean

--- Additional comment from Gorka Eguileor on 2016-07-27 08:13:50 EDT ---

I have created a patch for master and RHOS-9 that removes the wildcards if possible, but I'm waiting on a system where I can actually test that it works before submitting it upstream and doing the backports to RHOS-8 and RHOS-7.

--- Additional comment from Hidehiko Matsumoto on 2016-07-29 12:38:13 EDT ---

Sorry for bothering you, but I would like to confirm the status of this BZ.

Are we still waiting for access to the FC environment?
Or is access to the FC environment scheduled at the moment? Can we estimate when we will be able to access it?
If we really cannot access the FC environment, please let me know.

Best regards,
Hidehiko Matsumoto

--- Additional comment from Gorka Eguileor on 2016-07-29 12:56:08 EDT ---

We now have a couple of hosts with the dual FC ports, only the configuration and deployment of the cloud remains.

We expect the configuration to be done on Sunday and if everything goes well I'll be testing the patch on Monday.

--- Additional comment from Hidehiko Matsumoto on 2016-07-29 13:02:27 EDT ---

(In reply to Gorka Eguileor from comment #13)
> We now have a couple of hosts with the dual FC ports, only the configuration
> and deployment of the cloud remains.
> 
> We expect the configuration to be done on Sunday and if everything goes well
> I'll be testing the patch on Monday.

Thank you for the information. I appreciate it.
Thanks,

--- Additional comment from Hidehiko Matsumoto on 2016-08-02 10:58:59 EDT ---

(In reply to Gorka Eguileor from comment #13)
> We now have a couple of hosts with the dual FC ports, only the configuration
> and deployment of the cloud remains.
> 
> We expect the configuration to be done on Sunday and if everything goes well
> I'll be testing the patch on Monday.

Sorry for rushing you, but the customer NEC would like to know the status of testing.
Were we able to test the patch, or was there any problem with testing?
Thanks,

--- Additional comment from Gorka Eguileor on 2016-08-02 11:13:29 EDT ---

Testing went well, and I have submitted the patch upstream.

Now we have to follow the standard flow, reach agreement upstream on the solution, get it merged, and do the downstream backports to 9, 8, and 7, review it downstream, merge it, build a package, test it, etc...

--- Additional comment from Hidehiko Matsumoto on 2016-08-02 12:14:50 EDT ---

(In reply to Gorka Eguileor from comment #16)
> Testing went well, and I have submitted the patch upstream.
> 
> Now we have to follow the standard flow, reach agreement upstream on the
> solution, get it merged, and do the downstream backports to 9, 8, and 7,
> review it downstream, merge it, build a package, test it, etc...

Thank you for the information. I appreciate it.
Thanks,


--- Additional comment from Keigo Noha on 2016-08-23 21:21:45 EDT ---

Hi Gorka,

According to the upstream gerrit, https://review.openstack.org/#/c/349598/,
the fix is already merged.

Could you share the progress of backporting the fix?

Regards,
Keigo

--- Additional comment from Gorka Eguileor on 2016-09-01 12:34:59 EDT ---

(In reply to Keigo Noha from comment #26)
> Hi Gorka,
> 
> According to the upstream gerrit, https://review.openstack.org/#/c/349598/,
> the fix is already merged.
> 
> Could you share the progress of backporting the fix?
> 
> Regards,
> Keigo

We had to come to an agreement upstream regarding how we would deal with these kinds of situations, because they were initially rejecting backports to stable branches, but we also could not bump the library's upper constraints on stable branches because of incompatibilities.

We have now reached an agreement and I'll begin backporting from 10 to 7 downstream.

--- Additional comment from Dave Maley on 2016-09-23 10:16:42 EDT ---

delivering python-os-brick-0.5.0-3.el7ost as a supported hotfix 

--- Additional comment from errata-xmlrpc on 2016-09-29 13:57:54 EDT ---

Bug report changed to ON_QA status by Errata System.
A QE request has been submitted for advisory RHBA-2016:25012-01
https://errata.devel.redhat.com/advisory/25012

--- Additional comment from errata-xmlrpc on 2016-10-28 11:16:58 EDT ---

Bug report changed to ON_QA status by Errata System.
A QE request has been submitted for advisory RHBA-2016:25012-02
https://errata.devel.redhat.com/advisory/25012


--- Additional comment from Aaron Thomas on 2016-11-03 17:22:24 EDT ---

I requested the following from Cisco and wanted to relay:

> Can you relay if you were able to confirm the rpm python-os-brick-0.5.0-3.el7ost.noarch.rpm that was provided on 10/10/2016 resolves the issue?  


Cisco's response:

Created By: Tim Clark  (11/2/2016 9:47 AM)

Hey Aaron,

As Charles hinted at below, we're unable to test this RPM at the customer site - we actually hit three separate edge cases with os-brick failing to clean up multipath iSCSI connections, which impacted cinder-volume.  We've got two other patches in place at the customer site to provide functionality, which we may seek to send upstream in the near future.  The patch you referenced definitely fixed one of the three issues we've encountered.

--- Additional comment from errata-xmlrpc on 2016-11-08 07:53:19 EST ---

Bug report changed to RELEASE_PENDING status by Errata System.
Advisory RHBA-2016:25012-02 has been changed to PUSH_READY status.
https://errata.devel.redhat.com/advisory/25012

--- Additional comment from Benjamin Schmaus on 2016-11-09 14:04:36 EST ---

Another customer launched a heat stack with 48 instances booted from volume, with an additional cinder volume attached to each instance.  The backend was EMC VNX.  The stack completed successfully, but it still left behind 2 dms on controller node 2.

3600601609e603d00fb620f5fa0a6e611 dm-3 DGC     ,VRAID           
size=10G features='1 retain_attached_hw_handler' hwhandler='1 alua' wp=rw
 |-+- policy='round-robin 0' prio=0 status=enabled
 | |- 4:0:0:184 sdac 65:192 failed faulty running
 | `- 2:0:0:184 sdaa 65:160 failed faulty running
 `-+- policy='round-robin 0' prio=0 status=enabled
   |- 1:0:0:184 sdz  65:144 failed faulty running
   `- 3:0:0:184 sdab 65:176 failed faulty running

sosreport is available from controller node 2 if needed.

python-os-brick-0.5.0-3.el7ost was used.

--- Additional comment from Benjamin Schmaus on 2016-11-09 14:05:57 EST ---

Wonder if the above is a different bug?

--- Additional comment from errata-xmlrpc on 2016-11-14 14:58:09 EST ---

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2712.html

Comment 9 Gorka Eguileor 2017-02-16 15:54:02 UTC
There are a few issues with the scanning of targets that can be summarized as:

- No retries on "map in use" dm flushing (os-brick)
- iSCSI scans are too broad
- Automatic iSCSI scan performed by iscsid on AER/AEN reception

Because of these, we end up with leftover dms in the system, which conflict with other paths as time goes by.

There are proposed fixes for all of them, pending review and the backport process.
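
For the first item, a minimal sketch of what a retry loop around flushing a busy device-mapper map could look like; the command invocation, retry count, and delay are illustrative assumptions, not os-brick's actual implementation:

     # Hypothetical sketch: retry "multipath -f <map>" a few times when the
     # flush fails because the map is still reported as in use.
     import subprocess
     import time

     def flush_multipath_map(map_name, retries=3, delay=5):
         for _ in range(retries):
             result = subprocess.run(["multipath", "-f", map_name])
             if result.returncode == 0:
                 return True
             time.sleep(delay)  # map may still be in use; wait and retry
         return False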

Comment 10 Paul Grist 2017-03-30 01:07:34 UTC
In further testing last week, some key additional fixes were identified for this collection, and Gorka is in the process of getting them ready to post upstream. We don't have a specific ETA, but we will get the BZs updated once the patches are ready.

Comment 11 Paul Grist 2017-04-12 03:25:00 UTC
Patches are posted for review, and comprehensive testing is now passing on iSCSI (FC multipath testing will follow).

The relevant patch set will actually be the following, and we will confirm the proper collection needed, which may vary from the initial set proposed.  This BZ will be the right place to track the status of the collection.

https://review.openstack.org/#/c/455394/
https://review.openstack.org/#/c/455393/
https://review.openstack.org/#/c/455392/

Comment 13 Lee Yarwood 2017-05-29 10:33:47 UTC
*** Bug 1343377 has been marked as a duplicate of this bug. ***

Comment 15 Eric Harney 2017-09-15 17:11:13 UTC
*** Bug 1378572 has been marked as a duplicate of this bug. ***

Comment 18 Avi Avraham 2017-10-18 12:48:14 UTC
Verified.
Package installed:
python-os-brick-0.5.0-4.el7ost.noarch
The environment setup includes XtremIO connected with 2-path iSCSI as the backend.
Tests need to be run on the specific storage backend of the relevant setup to verify the fix.

Comment 20 errata-xmlrpc 2017-10-25 17:06:46 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:3067