Bug 1908366

Summary: [OSP-13] OS-Brick: VxFlexOS Volume-attach failed with KeyError: 'config_group'
Product: Red Hat OpenStack Reporter: Alan Bishop <abishop>
Component: python-os-brick Assignee: Alan Bishop <abishop>
Status: CLOSED ERRATA QA Contact: Tzach Shefi <tshefi>
Severity: high Docs Contact:
Priority: high    
Version: 13.0 (Queens) CC: apevec, astupnik, dhill, dhruv, ealcaniz, fiezzi, fnebiolo, gcharot, geguileo, ianwatson, jamsmith, jschluet, jvisser, kholtz, lhh, pgrist
Target Milestone: z15 Keywords: OtherQA, Triaged, ZStream
Target Release: 13.0 (Queens)   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: python-os-brick-2.3.9-8.el7ost Doc Type: Bug Fix
Doc Text:
This update fixes an incompatibility that caused VxFlex volume detachment attempts to fail. A recent change in VxFlex cinder volume credentialing methods was not backward compatible with pre-existing volume attachments. If a VxFlex volume attachment was made before the credentialing method change, attempts to detach the volume failed. Now the detachments do not fail.
Story Points: ---
Clone Of: 1869346 Environment:
Last Closed: 2021-03-18 13:09:23 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1869346    
Bug Blocks:    

Description Alan Bishop 2020-12-16 14:22:23 UTC
+++ This bug was initially created as a clone of Bug #1869346 +++

+++ This bug was initially created as a clone of Bug #1862213 +++

Description of problem:

This BZ covers the os-brick portion of the problem reported in bug #1862213. I am including relevant comments from the original BZ, below.

Additional info:

--- Additional comment from Takashi Kajinami on 2020-08-13 03:07:57 UTC ---

I've checked the pasted patch here, but I'm afraid that the fix in cinder might not be sufficient.

If I understand correctly, the current error is raised because the "config_group" key doesn't exist
in the connection_properties stored in the nova bdm record.
This means that even if we apply the cinder fix to make it expose the config_group parameter in its response,
it won't resolve the attach failure for existing instances that were created before the update to 16.1;
it will only work for new attachments created after the update.
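
A minimal illustration of the failure described above, assuming the stored connection properties behave like a plain dictionary. The record contents and the `read_config_group` helper are hypothetical and are not taken from nova or os-brick:

```python
# Hypothetical connection properties saved before the update:
# the record was written without a "config_group" key.
stored_properties = {"volume_name": "example-volume"}


def read_config_group(connection_properties):
    # Direct indexing raises KeyError when the key was never stored,
    # no matter what cinder returns for attachments created later.
    return connection_properties["config_group"]


try:
    read_config_group(stored_properties)
    failure = None
except KeyError as exc:
    failure = str(exc)

print(failure)
```

Because the lookup runs against the record saved at attach time, changing what cinder returns for new attachments does not alter the outcome for records that already exist.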

Please correct me if I'm wrong. I might have missed some logic about attachment management on the nova side,
if something has changed since queens (or if something special is implemented in scaleio).

--- Additional comment from Gorka Eguileor on 2020-08-13 10:41:39 UTC ---

Hi Takashi-san,

I believe you are correct, this fix will only work for new attachments and won't help with any already attached volumes.
Old attachments should already have the password in the connection properties, so we need to make the ScaleIO connector in OS-Brick backward compatible with that information.
There is no patch to fix this upstream, so we would have to write a new one.
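
The backward-compatibility idea can be sketched roughly as follows. This is not the actual os-brick change: `resolve_credentials`, the `load_from_config` callback, and the `serverUsername`/`serverPassword` key names are assumptions made for illustration; only the `config_group` key comes from this report.

```python
from typing import Callable, Mapping, Tuple


def resolve_credentials(
    connection_properties: Mapping[str, str],
    load_from_config: Callable[[str], Tuple[str, str]],
) -> Tuple[str, str]:
    """Return (username, password) for a volume connection.

    Hypothetical helper: key names other than "config_group" are assumed.
    """
    # Old attachments: the password was stored with the attachment,
    # so use it directly and never touch "config_group".
    if "serverPassword" in connection_properties:
        return (
            connection_properties["serverUsername"],
            connection_properties["serverPassword"],
        )

    # New attachments: only a reference to a configuration group is stored.
    config_group = connection_properties.get("config_group")
    if config_group is None:
        raise ValueError("no stored password and no config_group available")
    return load_from_config(config_group)
```

Checking for the stored password first means records created before the credentialing change are handled without requiring a `config_group` entry, while newer records still go through the configuration lookup.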

I'm going on PTO today for 2 weeks, but I'll try to write a patch today to fix this, and then one of my colleagues will handle the rest (babysitting the upstream patch, backporting upstream and downstream, etc.).

Regards,
Gorka.

--- Additional comment from Gorka Eguileor on 2020-08-13 11:24:07 UTC ---

I have submitted patch https://review.opendev.org/#/c/746109 for review.
It should fix the backward compatibility issue mentioned by Takashi-san in comment #16.

Comment 1 Alan Bishop 2020-12-17 21:34:43 UTC
The patch merged on upstream stable/queens. z14 has already been released, and z15 will be a full import from stable/queens.

Comment 3 Alan Bishop 2021-01-11 13:53:25 UTC
I'm not aware of a workaround, and so I requested a hotfix.

Comment 15 Alan Bishop 2021-01-18 14:10:01 UTC
Here is the full list of services that rely on os-brick:

1. nova-compute
2. cinder-volume
3. glance-api (if glance is using cinder for its storage backend)
4. cinder-backup (an optional service)

Typically an os-brick hotfix would need to be applied to each of the above container images (#3, 4 if applicable). But, as noted, the nature of the bug is such that it affects nova. You might be able to update just the nova-compute container image, but cinder-volume should also be updated if there are signs of the failure in the cinder-volume.log.
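
As a rough aid for the general rule above, the service list can be expressed as a small helper. The function name and its two flags are hypothetical; it only encodes the list from this comment and does not inspect a real deployment:

```python
def services_needing_os_brick_hotfix(glance_uses_cinder, cinder_backup_enabled):
    """List the services whose container images carry os-brick.

    Hypothetical helper based solely on the list in this comment.
    """
    # nova-compute and cinder-volume always rely on os-brick.
    services = ["nova-compute", "cinder-volume"]
    # glance-api only counts when glance uses cinder as its storage backend.
    if glance_uses_cinder:
        services.append("glance-api")
    # cinder-backup is an optional service.
    if cinder_backup_enabled:
        services.append("cinder-backup")
    return services


print(services_needing_os_brick_hotfix(False, True))
```

For this particular bug, the narrower option described above (updating only nova-compute unless cinder-volume.log shows the failure) still applies.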

Comment 28 errata-xmlrpc 2021-03-18 13:09:23 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 13.0 bug fix and enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:0932