Bug 1923789 - RBD _get_pool_stats fails on Ceph Pacific
Summary: RBD _get_pool_stats fails on Ceph Pacific
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-cinder
Version: 17.0 (Wallaby)
Hardware: Unspecified
OS: Unspecified
urgent
urgent
Target Milestone: ---
: ---
Assignee: Eric Harney
QA Contact: Tzach Shefi
Chuck Copello
URL:
Whiteboard:
Depends On:
Blocks: 1925312
TreeView+ depends on / blocked
 
Reported: 2021-02-01 22:18 UTC by wes hayutin
Modified: 2021-04-01 10:36 UTC (History)
10 users (show)

Fixed In Version: openstack-cinder-17.1.0-0.20210217042010.bdfcdb6.el8ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-04-01 10:36:59 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Launchpad 1913449 0 None None None 2021-02-03 09:51:47 UTC
OpenStack gerrit 773694 0 None MERGED RBD: Pass bytes type for mon_command inbuf 2021-02-15 07:35:22 UTC

Description wes hayutin 2021-02-01 22:18:08 UTC
Description of problem:

021-01-31 23:37:53.739 43 DEBUG eventlet.wsgi.server [-] (43) accepted ('192.168.24.1', 60744) server /usr/lib/python3.6/site-packages/eventlet/wsgi.py:992
2021-01-31 23:37:53.742 43 INFO eventlet.wsgi.server [-] 192.168.24.1 - - [31/Jan/2021 23:37:53] "GET /healthcheck HTTP/1.0" 200 137 0.001902
2021-01-31 23:37:54.770 43 ERROR glance_store._drivers.cinder [req-31c961e6-8031-4562-8f30-1b1eb1144d1c a098d05f0b1b4d3b9e0e264e05fe49e3 25ab9b38cc0b4bd782cf94ffad96f139 - default default] The status of volume 657c9d55-a807-497d-835c-0895c5c039d5 is unexpected: status = error, expected = available.
2021-01-31 23:37:54.774 43 ERROR glance.api.v2.image_data [req-31c961e6-8031-4562-8f30-1b1eb1144d1c a098d05f0b1b4d3b9e0e264e05fe49e3 25ab9b38cc0b4bd782cf94ffad96f139 - default default] Failed to upload image data due to internal error: glance_store.exceptions.BackendException: The status of volume 657c9d55-a807-497d-835c-0895c5c039d5 is unexpected: status = error, expected = available.
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi [req-31c961e6-8031-4562-8f30-1b1eb1144d1c a098d05f0b1b4d3b9e0e264e05fe49e3 25ab9b38cc0b4bd782cf94ffad96f139 - default default] Caught error: The status of volume 657c9d55-a807-497d-835c-0895c5c039d5 is unexpected: status = error, expected = available.: glance_store.exceptions.BackendException: The status of volume 657c9d55-a807-497d-835c-0895c5c039d5 is unexpected: status = error, expected = available.
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi Traceback (most recent call last):
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi   File "/usr/lib/python3.6/site-packages/glance/common/wsgi.py", line 1348, in __call__
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi     request, **action_args)
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi   File "/usr/lib/python3.6/site-packages/glance/common/wsgi.py", line 1391, in dispatch
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi     return method(*args, **kwargs)
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi   File "/usr/lib/python3.6/site-packages/glance/common/utils.py", line 416, in wrapped
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi     return func(self, req, *args, **kwargs)
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi   File "/usr/lib/python3.6/site-packages/glance/api/v2/image_data.py", line 298, in upload
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi     self._restore(image_repo, image)
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi   File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi     self.force_reraise()
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi   File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi     six.reraise(self.type_, self.value, self.tb)
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi   File "/usr/lib/python3.6/site-packages/six.py", line 703, in reraise
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi     raise value
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi   File "/usr/lib/python3.6/site-packages/glance/api/v2/image_data.py", line 163, in upload
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi     image.set_data(data, size, backend=backend)
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi   File "/usr/lib/python3.6/site-packages/glance/domain/proxy.py", line 208, in set_data
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi     self.base.set_data(data, size, backend=backend, set_active=set_active)
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi   File "/usr/lib/python3.6/site-packages/glance/notifier.py", line 501, in set_data
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi     _send_notification(notify_error, 'image.upload', msg)
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi   File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi     self.force_reraise()
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi   File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi     six.reraise(self.type_, self.value, self.tb)
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi   File "/usr/lib/python3.6/site-packages/six.py", line 703, in reraise
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi     raise value
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi   File "/usr/lib/python3.6/site-packages/glance/notifier.py", line 448, in set_data
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi     set_active=set_active)
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi   File "/usr/lib/python3.6/site-packages/glance/api/policy.py", line 198, in set_data
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi     return self.image.set_data(*args, **kwargs)
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi   File "/usr/lib/python3.6/site-packages/glance/quota/__init__.py", line 319, in set_data
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi     set_active=set_active)
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi   File "/usr/lib/python3.6/site-packages/glance/location.py", line 567, in set_data
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi     self._upload_to_store(data, verifier, backend, size)
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi   File "/usr/lib/python3.6/site-packages/glance/location.py", line 467, in _upload_to_store
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi     verifier=verifier)
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi   File "/usr/lib/python3.6/site-packages/glance_store/multi_backend.py", line 399, in add_with_multihash
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi     image_id, data, size, hashing_algo, store, context, verifier)
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi   File "/usr/lib/python3.6/site-packages/glance_store/multi_backend.py", line 481, in store_add_to_backend_with_multihash
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi     image_id, data, size, hashing_algo, context=context, verifier=verifier)
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi   File "/usr/lib/python3.6/site-packages/glance_store/driver.py", line 279, in add_adapter
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi     metadata_dict) = store_add_fun(*args, **kwargs)
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi   File "/usr/lib/python3.6/site-packages/glance_store/capabilities.py", line 176, in op_checker
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi     return store_op_fun(store, *args, **kwargs)
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi   File "/usr/lib/python3.6/site-packages/glance_store/_drivers/cinder.py", line 854, in add
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi     volume = self._wait_volume_status(volume, 'creating', 'available')
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi   File "/usr/lib/python3.6/site-packages/glance_store/_drivers/cinder.py", line 598, in _wait_volume_status
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi     raise exceptions.BackendException(msg)
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi glance_store.exceptions.BackendException: The status of volume 657c9d55-a807-497d-835c-0895c5c039d5 is unexpected: status = error, expected = available.
2021-01-31 23:37:54.804 43 ERROR glance.common.wsgi 
2021-01-31 23:37:54.833 43 INFO eventlet.wsgi.server [req-31c961e6-8031-4562-8f30-1b1eb1144d1c a098d05f0b1b4d3b9e0e264e05fe49e3 25ab9b38cc0b4bd782cf94ffad96f139 - default default] 192.168.24.1 - - [31/Jan/2021 23:37:54] "PUT /v2/images/6919e796-37bd-433f-9f4d-c294844bc730/file HTTP/1.1" 500 336 1.948595



https://sf.hosted.upshift.rdu2.redhat.com/logs/openstack-periodic-rhos-17/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-rhel-8-scenario001-standalone-rhos-17/98cc08c/logs/undercloud/var/log/containers/glance/api.log



Causes the cirros image upload to fail:
2021-01-31 18:37:55.026055 | primary | 2021-01-31 23:37:52.202 286759 INFO config_tempest.constants [-] Creating image 'cirros-0.4.0-x86_64-disk.img'[00m
2021-01-31 18:37:55.026059 | primary | 2021-01-31 23:37:52.202 286759 INFO config_tempest.constants [-] Downloading 'http://images.rdoproject.org/cirros/cirros-0.4.0-x86_64-disk.img' and saving as 'etc/cirros-0.4.0-x86_64-disk.img'[00m
2021-01-31 18:37:55.026062 | primary | 2021-01-31 23:37:52.742 286759 INFO config_tempest.constants [-] Uploading image 'cirros-0.4.0-x86_64-disk.img' from '/home/zuul/tempest/etc/cirros-0.4.0-x86_64-disk.img'[00m
2021-01-31 18:37:55.026066 | primary | 2021-01-31 23:37:52.873 286759 INFO tempest.lib.common.rest_client [req-3fbe2673-8cef-486b-952f-3e44550856da ] Request (main): 201 POST http://192.168.24.3:9292/v2/images 0.130s[00m
2021-01-31 18:37:55.026069 | primary | 2021-01-31 23:37:54.834 286759 INFO tempest.lib.common.rest_client [req-31c961e6-8031-4562-8f30-1b1eb1144d1c ] Request (main): 500 PUT http://192.168.24.3:9292/v2/images/6919e796-37bd-433f-9f4d-c294844bc730/file 1.961s[00m
2021-01-31 18:37:55.026073 | primary | 2021-01-31 23:37:54.835 286759 CRITICAL tempest [-] Unhandled error: tempest.lib.exceptions.ServerFault: Got server fault
2021-01-31 18:37:55.026077 | primary | Details: b'500 Internal Server Error\n\nThe server has either erred or is incapable of performing the requested operation.\n\n  

https://sf.hosted.upshift.rdu2.redhat.com/logs/openstack-periodic-rhos-17/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-rhel-8-scenario001-standalone-rhos-17/98cc08c/job-output.txt

Comment 1 Giulio Fidente 2021-02-03 08:43:39 UTC
glance is failing to upload the image because there is an error in stats report from cinder-volume [1]

Line 560 in rbd driver is failing with : Error starting thread.: TypeError: Argument 'inbuf' has incorrect type (expected bytes, got str) but that seems a 4 years old piece of code; Ceph cluster seems healthy from logs [2]

1. https://sf.hosted.upshift.rdu2.redhat.com/logs/openstack-periodic-rhos-17/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-rhel-8-scenario001-standalone-rhos-17/98cc08c/logs/undercloud/var/log/containers/cinder/cinder-volume.log
2. https://sf.hosted.upshift.rdu2.redhat.com/logs/openstack-periodic-rhos-17/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-rhel-8-scenario001-standalone-rhos-17/98cc08c/logs/undercloud/var/log/extra/podman/containers/ceph-mon-standalone/stdout.log

Comment 2 Gorka Eguileor 2021-02-03 09:36:21 UTC
This error seems to be caused by the annotations that were "recently" added to Ceph's `mon_command` [1][2] in the rados Python library.

The jobs must be running with a newer library version.  We need to pin it to a lower one until we fix the RBD Cinder driver.


[1]: https://github.com/ceph/ceph/commit/51a2a72215ac93836ccc37733607b8e97d808e10

    +  @requires(('cmd': str), ('inbuf': bytes), ('timeout', opt(int)), ('target', opt(str)))
       def mon_command(self, cmd, inbuf, timeout=0, target=None):


[2]: https://github.com/ceph/ceph/commit/bbf9f5bbe32b37bb1b79bbca4b9cc726dbb2cf34#diff-bc4ce88c894231511678ecdbb17afe336c30badb9dc7a820d10324fafdcf83e7L1359

    - @requires(('cmd': str), ('inbuf': bytes), ('timeout', opt(int)), ('target', opt(str)))
    - def mon_command(self, cmd, inbuf, timeout=0, target=None):
    + def mon_command(self,
    +                 cmd: str,
    +                 inbuf: bytes,
    +                 timeout: int = 0,
    +                 target: Optional[Union[str, int]] = None) -> Tuple[int, str, bytes]:

Comment 3 Luigi Toscano 2021-02-03 09:47:42 UTC
I guess addressed by https://review.opendev.org/c/openstack/cinder/+/773694

Comment 8 Luigi Toscano 2021-04-01 10:36:59 UTC
This has been addressed.


Note You need to log in before you can comment on or make changes to this bug.