Bug 1808583 - [Backport][RHOSP16.1] [PowerMax] Dell EMC PowerMax Driver Features
Summary: [Backport][RHOSP16.1] [PowerMax] Dell EMC PowerMax Driver Features
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-cinder
Version: 16.0 (Train)
Hardware: Unspecified
OS: Unspecified
medium
medium
Target Milestone: z2
: 16.1 (Train on RHEL 8.2)
Assignee: Pablo Caruana
QA Contact: Tzach Shefi
Chuck Copello
URL:
Whiteboard:
Depends On:
Blocks: 1595325 1715964
TreeView+ depends on / blocked
 
Reported: 2020-02-28 21:03 UTC by Rajini Karthik
Modified: 2020-10-28 15:46 UTC (History)
16 users (show)

Fixed In Version: openstack-cinder-15.3.0-0.20200625043419.3ecfe37.el8ost
Doc Type: Release Note
Doc Text:
Red Hat OpenStack Platform 16.1 includes the following PowerMax Driver updates: + _Feature updates:_ + ** PowerMax Driver - Unisphere storage group/array tagging support ** PowerMax Driver - Short host name and port group name override ** PowerMax Driver - SRDF Enhancement ** PowerMax Driver - Support of Multiple Replication + _Bug fixes:_ + ** PowerMax Driver - Debug Metadata Fix ** PowerMax Driver - Volume group delete failure ** PowerMax Driver - Setting minimum Unisphere version to 9.1.0.5 ** PowerMax Driver - Unmanage Snapshot Delete Fix ** PowerMax Driver - RDF clean snapvx target fix ** PowerMax Driver - Get Manageable Volumes Fix ** PowerMax Driver - Print extend volume info ** PowerMax Driver - Legacy volume not found ** PowerMax Driver - Safeguarding retype to some in-use replicated modes ** PowerMax Driver - Replication array serial check ** PowerMax Driver - Support of Multiple Replication ** PowerMax Driver - Update single underscores ** PowerMax Driver - SRDF Replication Fixes ** PowerMax Driver - Replication Metadata Fix ** PowerMax Driver - Limit replication devices ** PowerMax Driver - Allowing for default volume type in group ** PowerMax Driver - Version comparison correction ** PowerMax Driver - Detach RepConfig logging & Retype rename remote fix ** PowerMax Driver - Manage volume emulation check ** PowerMax Driver - Deletion of group with volumes ** PowerMax Driver - PowerMax Pools Fix ** PowerMax Driver - RDF status validation ** PowerMax Driver - Concurrent live migrations failure ** PowerMax Driver - Live migrate remove rep vol from sg ** PowerMax Driver - U4P failover lock not released on exception ** PowerMax Driver - Compression Change Bug Fix
Clone Of:
Environment:
Last Closed: 2020-10-28 15:45:55 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 697436 0 None MERGED PowerMax Driver - Unisphere storage group/array tagging support 2021-01-18 18:24:20 UTC
OpenStack gerrit 702229 0 None MERGED PowerMax Driver - Short host name and port group name override 2021-01-18 18:24:20 UTC
OpenStack gerrit 709564 0 None MERGED PowerMax Driver - SRDF Enhancement 2021-01-18 18:24:20 UTC
OpenStack gerrit 709669 0 None MERGED PowerMax Driver - Support of Multiple Replication 2021-01-18 18:24:20 UTC
Red Hat Product Errata RHSA-2020:4283 0 None None None 2020-10-28 15:46:43 UTC

Description Rajini Karthik 2020-02-28 21:03:58 UTC
Description of problem:
Here are our Ussuri features for the PowerMax driver. 
https://review.opendev.org/#/c/702229/
https://review.opendev.org/#/c/709564/
https://review.opendev.org/#/c/709669/
https://review.opendev.org/#/c/697436/


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 Luigi Toscano 2020-03-17 10:43:29 UTC
Most (if not all) of them can't be probably backported upstream.

Comment 5 Pablo Caruana 2020-05-20 20:10:04 UTC
Hello Rajini, After analyzing the mentioned revisions and following  each parent case, we noticed several conflicts in the code change, namely in the following order 

 -1- PowerMax Driver - Unisphere storage group/array tagging support ( upstream-Usurri: https://review.opendev.org/697436 for https://blueprints.launchpad.net/cinder/+spec/powermax-storage-group-tagging)

 -2- PowerMax Driver - Short host name and port group name override (upstream-Usurri: https://review.opendev.org/702229 for blueprint powermax-user-defined-hostname-portgroup)

	Now  from the original comment next should be  PowerMax Driver - SRDF Enhancement as #3 but but before that one we notice code incorporated by:
 
 -3- PowerMax Driver - Print extend volume info (Upstream-Usurri: https://review.opendev.org/711046) that conflicts 1 (this one may be or not relevant but code was incorporated before 5)

 	Then as part of the " SRDF Enhancement" and "Support of Multiple Replication" topic relate, this particular is relevant as  removing lot of code before  5 (#3 according to the original tracking):
 
 -4- PowerMax Driver - Safeguarding retype to some in-use replicated modes (Upstream-Usurri: https://review.opendev.org/706539/ )
 
 -5- PowerMax Driver - SRDF Enhancement    (Upstream-Usurri: https://review.opendev.org/709564 for blueprint powermax-srdf-enhancement)    
 
 -6- PowerMax Driver - Support of Multiple Replication  (Upstream-Usurri: https://review.opendev.org/709669 for blueprint powermax-multiple-replication-devices) marked as old number #4 

	Finally as part of the SDRF topic conversation but not mentioned on the original topic:

-7- PowerMax Driver - SRDF Replication Fixes  (Upstream-Usurri: https://review.opendev.org/714723 for fixing some test scenarios related to 5- )

Meanwhile we perfomed the basics including code parent analysis, resolving conflicts as closer the original code meanwhile incorporating the incoming ones and ran the unit tests including verifying it against our own RHOSP16 CI, still feels more a natural way to handle by backporting to Train branch as  forked repo (just as example) and from that cherry picking the code already merged and tested . This looks fitting better  from dell-emc side as knowing better the code pieces related to the driver, specially if something relevant missing.

Looking forward for your feedback.
Pablo.

Comment 6 Alan Bishop 2020-05-21 15:47:36 UTC
Rajini,

As Pablo noted in comment #5, we've found the task of backporting the Powermax driver too complex for us to undertake, and therefore we ask that Dell EMC do the work and supply us with the results. We have done this with other partner's driver backport requests, and the process works rather well for both companies.

We recognize the fact that the backports need to be downstream-only due to the patches not being suitable for upstream stable branches. With this in mind, here's the procedure we recommend:

Dell EMC tasks:
- Create a publicly accessible fork of the main openstack cinder git repository (https://opendev.org/openstack/cinder.git), perhaps under a Dell EMC area on github.
- Create a branch on the forked repo that is based off stable/train HEAD
- Cherry pick all of the patches required to satisfy your Powermax driver backport
- Signal when the work is done

Red Hat tasks:
- We will cherry pick all of the patches you post on the Dell EMC repo into our Red Hat OSP repo for OSP-16.1

A couple of notes regarding the patches we're looking for you to provide:
- Each patch must be "clean" (passing pep8 and unit tests)
- Whenever the patch requires resolving conflicts, the list of conflicts is included in the commit message immediately after git's "(cherry picked from commit xxx)" so that it would like like this:

    <snip>
    (cherry picked from commit abcdef0123456789abcdef0123456789abcdef01)
    Conflicts:
            cinder/tests/unit/volume/drivers/file_x.py
            cinder/volume/drivers/dell_emc/file_y.py

This is standard practice that Red Hat follows when we perform downstream backports.

Comment 7 Rajini Karthik 2020-05-21 16:24:25 UTC
There were also a lot of bug fixes added for the features in Ussuri, so just backporting the feature in isolation is not enough.
In fact we recommend Ussuri driver as is and retrofit it to Train.  


Having said that, do we still need a fork etc...? The powermax team is waiting to proceed

Comment 8 Alan Bishop 2020-05-21 17:56:56 UTC
I understand additional patches are required to address bug fixes in the new driver. This is a significant part of the reason why we need Dell EMC to do the work, especially so that the full set of backports can be tested by the Dell EMC team (Red Hat cannot do the testing).

And from a support perspective, it's critical that the backports allow us to track the bug fixes. Squashing everything by simply taking the entire Ussuri driver as-is will not allow us to do that.

So yes, please proceed with the step outlined in comment #6.

Comment 9 helen.walsh 2020-05-28 13:14:52 UTC
Hi Alan,
I will be taking up this task.  I have created the following https://github.com/helenwalsh/test_backport.  Please let me know if you have access and I can begin the backports

Kind regards,
Helen

Comment 10 Alan Bishop 2020-05-28 15:07:26 UTC
Hi Helen,

Yes, I can access that repo (although I'm curious why you didn't name it "cinder"). I recommend you backport the first patch, and then ping me before adding further patches. That will give me a chance to provide feedback that could be useful when you work on the subsequent patches.

Comment 11 helen.walsh 2020-05-28 15:41:23 UTC
I will rename it.  Thank you for checking.  I will backport the first patch and get back to you.

Comment 12 helen.walsh 2020-06-05 12:13:50 UTC
Hi Alan,

Apologies, I had forgotten to get back to you after the first patch.
Work has been completed on https://github.com/helenwalsh/cinder.git on the stable/train branch

Please see commits made on June 2nd and June 3rd:
https://github.com/helenwalsh/cinder/commits/stable/train


All commits have clean unit tests and pep8. The cherry-picks were clean for the most part but a couple of changes needed to be made to make pep8 run cleanly.
I ran tempest on this overnight and all tests ran clean.

Thank you,
Helen

Comment 13 Alan Bishop 2020-06-08 20:38:02 UTC
Hi Helen,

I have some good news and some bad news :-/ The bad news is the backport process doesn't seem to be on track for success, and I'll list a few of the issues I'm seeing. The good news is I have another way forward that I think will be easier for both companies to pursue.

Here are some of the issues I see after reviewing your branch [1].

1. It shows 30 patches were cherry picked, FAR more than the 4 patches listed in the BZ description. I'm not sure how we got from 4 to 30 patches, but that's a significant leap.

2. The very first patch [2] has problems. It's supposed to be a cherry pick of [3], but clearly the code changes are quite different.

3. The second patch [4] appears to be a nop (no actual code change), and that's because the patch is already present on stable/train [5].

4. Patches [6] and [7] have the same commit message and Change-Id. I'm guessing this should have been a single patch that initially had a merge conflict.

[1] https://github.com/helenwalsh/cinder/commits/stable/train
[2] https://github.com/helenwalsh/cinder/commit/d3668fe241a1ff308568a53b2c05f8353914a14f
[3] https://github.com/helenwalsh/cinder/commit/f1eb128c821c257ab035518f34b7df269eaa7cdd
[4] https://github.com/helenwalsh/cinder/commit/a13a7b6bc48bb30646e0da9fa9fa73bf0fcebf79
[5] https://review.opendev.org/698600
[6] https://github.com/helenwalsh/cinder/commit/163b1d2a31c6056d028a0e1d492972ee13b0b14e
[7] https://github.com/helenwalsh/cinder/commit/4ac531bc14c42b40445688cdc989dd01d8872af9

Instead of tracking every patch in what's clearly a complex backport, let's try another approach in which you squash all of the pieces into a single commit. This has the disadvantage of obscuring the revision history, but it eliminates the need to supply a large number (4 <= N <= 30) of clean patches. I suggest trying the following:

- Squash all of the PowerMax commits
- Verify pep8, unit tests, etc
- I highly suggest you do a TripleO deployment (Rajini can help) to verify the driver functions correctly in a real deployment
- Update the commit message to summarize all of the changes associated with that single patch (driver version, list of features, bug fixes, etc.)

Comment 14 helen.walsh 2020-06-10 09:22:42 UTC
Thanks Alan.  I will do what you suggest and sync up with Rajini on the TripleO deployment.  
I can verify that tempest ran clean on Train with this driver.
Kind regards,
Helen

Comment 15 helen.walsh 2020-06-11 15:35:51 UTC
Hi Alan,
I have squashed all the commits down to one
https://github.com/helenwalsh/cinder/commit/1eb7cba7a89dfac32a63bfabb705bed8b9a5ef3a

Unit tests and pep8 runs cleanly.
I will be running CI(tempest) on it overnight but I have had clean runs for the last week on this branch.
Kind regards,
Helen

Comment 16 Alan Bishop 2020-06-15 16:23:34 UTC
@Helen, that commit looks good, and we're currently running it through our own downstream CI. Thanks for the update!

Comment 26 errata-xmlrpc 2020-10-28 15:45:55 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: openstack-cinder security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:4283


Note You need to log in before you can comment on or make changes to this bug.