Bug 1848417 - Local attach with ceph fails on RHEL 8.2
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: python-os-brick
Version: 16.1 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: z1
Target Release: 16.1 (Train on RHEL 8.2)
Assignee: Stephen Finucane
QA Contact: Tzach Shefi
URL:
Whiteboard:
Depends On:
Blocks: 1824115
 
Reported: 2020-06-18 10:50 UTC by Stephen Finucane
Modified: 2020-09-04 15:04 UTC (History)

Fixed In Version: python-os-brick-2.10.3-0.20200605063442.55fc998.el8ost
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-08-27 15:19:10 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Launchpad | 1884052 | 0 | None | None | None | 2020-06-18 10:50:25 UTC
OpenStack gerrit | 736566 | 0 | None | MERGED | rbd: Support 'rbd showmapped' output from ceph 13.2.0+ | 2020-12-18 12:54:51 UTC
OpenStack gerrit | 736758 | 0 | None | MERGED | rbd: Warn if ceph udev rules are not configured | 2020-12-18 12:54:51 UTC
Red Hat Product Errata | RHBA-2020:3542 | 0 | None | None | None | 2020-08-27 15:19:32 UTC

Description Stephen Finucane 2020-06-18 10:50:01 UTC
Description of problem:

When nova fails to attach a volume to an instance, it will attempt to disconnect it. On a compute node with ceph 14.x (the version currently packaged on RHEL 8.2), this fails:

  Traceback (most recent call last):
    File "/usr/lib/python3.6/site-packages/nova/virt/block_device.py", line 582, in _volume_attach
      device_type=self['device_type'], encryption=encryption)
    File "/usr/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", line 1806, in attach_volume
      encryption=encryption)
    File "/usr/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", line 1594, in _connect_volume
      vol_driver.disconnect_volume(connection_info, instance)
    File "/usr/lib/python3.6/site-packages/nova/virt/libvirt/volume/net.py", line 96, in disconnect_volume
      self.connector.disconnect_volume(connection_info['data'], None)
    File "/usr/lib/python3.6/site-packages/os_brick/utils.py", line 150, in trace_logging_wrapper
      result = f(*args, **kwargs)
    File "/usr/lib/python3.6/site-packages/os_brick/initiator/connectors/rbd.py", line 248, in disconnect_volume
      root_device = self._find_root_device(connection_properties)
    File "/usr/lib/python3.6/site-packages/os_brick/initiator/connectors/rbd.py", line 229, in _find_root_device
      for index, mapping in jsonutils.loads(out).items():
  AttributeError: 'list' object has no attribute 'items'

The culprit appears to be a change introduced in ceph v13.2.0 (Mimic). From [1]:

  The rbd CLI’s showmapped JSON and XML output has changed.

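The specific change appears to be [2]. Judging by the traceback, the JSON output of 'rbd showmapped' is now a list, whereas os-brick expects a dict and calls '.items()' on it.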
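
A minimal sketch of how the parsing could tolerate both output shapes. It assumes the pre-Mimic output is a JSON object keyed by mapping ID and the newer output is a JSON list of mappings. The helper name find_mapped_device and the sample payloads are hypothetical and only illustrate the idea; this is not the merged os-brick patch.

```python
import json


def find_mapped_device(showmapped_json, pool, volume):
    """Return the device path for pool/volume, or None if it is not mapped.

    Hypothetical helper: accepts either a dict keyed by mapping ID
    (assumed pre-13.2.0 shape) or a list of mappings (assumed 13.2.0+ shape).
    """
    mappings = json.loads(showmapped_json)
    if isinstance(mappings, dict):
        # Older shape: drop the ID keys and keep only the mapping entries.
        mappings = list(mappings.values())
    for mapping in mappings:
        if mapping.get("pool") == pool and mapping.get("name") == volume:
            return mapping.get("device")
    return None


# Illustrative payloads only; the field values are assumptions.
OLD_STYLE = ('{"0": {"pool": "volumes", "name": "volume-1", '
             '"snap": "-", "device": "/dev/rbd0"}}')
NEW_STYLE = ('[{"id": "0", "pool": "volumes", "name": "volume-1", '
             '"snap": "-", "device": "/dev/rbd0"}]')

if __name__ == "__main__":
    print(find_mapped_device(OLD_STYLE, "volumes", "volume-1"))
    print(find_mapped_device(NEW_STYLE, "volumes", "volume-1"))
```

Normalising the dict case to a list keeps a single lookup loop, so the same code path serves both ceph versions.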

There appear to be additional issues here with regard to why the *attach* failed, but I can track those separately.

[1] https://docs.ceph.com/docs/master/releases/mimic/#upgrade-compatibility-notes
[2] https://github.com/ceph/ceph/commit/de97a3be483242eab14175a3cc3408df34d5c19a

Version-Release number of selected component (if applicable):

  python3-os-brick.noarch   2.10.3-0.20200605063441.55fc998.el8ost
  python3-nova.noarch       1:20.2.1-0.20200528080027.1e95025.el8ost

How reproducible:

  Always

Steps to Reproduce:
1. Attempt to attach an encrypted volume to an existing instance

Actual results:

The attach fails (for reasons not covered here). When nova attempts to clean up after itself, the call to 'disconnect_volume' also fails with the traceback above.

Expected results:

The volume should be disconnected.

Additional info:

This is being tracked upstream via https://bugs.launchpad.net/os-brick/+bug/1884052

Comment 6 spower 2020-07-14 18:47:23 UTC
This issue has conditional approval for the 16.1 Z1 release. It must be in the first compose and tested before the release of 16.1.1. If not, we will move it to TM=Z2.

Comment 12 errata-xmlrpc 2020-08-27 15:19:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.1 director bug fix advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:3542

