Bug 1898578 - [OSP 16.1] n-cpu raising MessageUndeliverable when replying to RPC call [NEEDINFO]
Summary: [OSP 16.1] n-cpu raising MessageUndeliverable when replying to RPC call
Keywords:
Status: VERIFIED
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: python-oslo-messaging
Version: 16.1 (Train)
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Hervé Beraud
QA Contact: pkomarov
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-11-17 14:45 UTC by Andre
Modified: 2021-04-21 13:28 UTC
CC List: 21 users

Fixed In Version: python-oslo-messaging-10.2.1-1.20201114001303.el8ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:
afariasa: needinfo?
hberaud: needinfo? (jeckersb)


Links
System ID Private Priority Status Summary Last Updated
Launchpad 1854992 0 None None None 2020-11-19 16:00:14 UTC
Launchpad 1905965 0 None None None 2020-11-27 14:05:23 UTC
OpenStack gerrit 764776 0 None MERGED Deprecate the mandatory flag 2021-02-16 13:33:50 UTC
OpenStack gerrit 768252 0 None MERGED Fix type of direct_mandatory_flag opt 2021-02-16 13:33:50 UTC
OpenStack gerrit 771232 0 None MERGED Correctly handle missing RabbitMQ queues 2021-02-16 13:33:50 UTC

Description Andre 2020-11-17 14:45:01 UTC
Description of problem:
Note: the customer is using Dell EMC VNX as the Cinder driver, but we'd like to make sure this issue is not caused on the OpenStack side.

Volume attachment is failing: Cinder shows the volume as available, but Nova shows it as attached.
I'll post more information and logs in the next comment as private, since they may contain sensitive customer information.

We're expecting an RCA for this issue.

Version-Release number of selected component (if applicable):
dellemc/openstack-cinder-volume-dellemc-rhosp16:latest
rhosp-rhel8/openstack-cinder-scheduler:16.1-49
rhosp-rhel8/openstack-cinder-api:16.1-49

rhosp-rhel8/openstack-nova-compute:16.1-52.1602000860
rhosp-rhel8/openstack-nova-libvirt:16.1-56.1602000855
rhosp-rhel8/openstack-nova-api:16.1-55
rhosp-rhel8/openstack-nova-novncproxy:16.1-54
rhosp-rhel8/openstack-nova-scheduler:16.1-53
rhosp-rhel8/openstack-nova-conductor:16.1-53

How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:
sosreports are available on supportshell under /cases/02791747

Comment 7 Andre 2020-11-23 10:02:56 UTC
Do we currently have any workaround for this issue? Or maybe just to mitigate, like increasing some timeout?

Comment 9 Lee Yarwood 2020-11-23 17:00:12 UTC
(In reply to Andre from comment #7)
> Do we currently have any workaround for this issue? Or maybe just to
> mitigate, like increasing some timeout?

There's no workaround at present. We have talked for some time about removing this initial RPC call entirely and creating the BDM record in the DB from within the API, to avoid the fallout we are seeing here from the timeout.

That said, the underlying issue here seems to be more of an RPC issue, so let's address that; I might spawn a separate bug to track the additional rework/refactoring in openstack-nova.
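
For context, a minimal sketch of the mechanism behind the MessageUndeliverable error (illustrative only; it uses the standalone pika client rather than oslo.messaging's own driver code, and the queue name is made up): when a reply is published with the AMQP mandatory flag set and the caller's transient reply queue has already been deleted (for example after an RPC timeout), the broker returns the message instead of silently dropping it, and that broker return is what oslo.messaging surfaces as MessageUndeliverable.

```python
# Illustration only (not oslo.messaging's actual code path): publish a message
# with the AMQP "mandatory" flag to a reply queue that no longer exists.
# Requires a reachable RabbitMQ and the pika client, used here purely for the demo.
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.confirm_delivery()  # needed so unroutable messages raise an error

try:
    channel.basic_publish(
        exchange="",                         # default exchange routes by queue name
        routing_key="reply_missing_queue",   # hypothetical, already-deleted reply queue
        body=b'{"result": "..."}',
        mandatory=True,                      # ask the broker to return unroutable messages
    )
except pika.exceptions.UnroutableError:
    # This is the situation n-cpu hits: the reply can no longer be delivered
    # because the caller gave up and its transient reply queue is gone.
    print("reply could not be routed - caller's reply queue is gone")
finally:
    connection.close()
```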

Comment 10 Lee Yarwood 2020-11-24 11:46:31 UTC
(In reply to Lee Yarwood from comment #9)
> I might spawn a separate bug to track the additional rework/refactor in openstack-nova.

Apologies, I had already done this in bug #1899581.

Comment 28 Hervé Beraud 2020-12-02 13:46:00 UTC
Fix submitted on master upstream:

https://review.opendev.org/c/openstack/oslo.messaging/+/764776
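
Related to the reviews linked above, a minimal read-only sketch of the configuration knob they touch (the nova.conf path is an assumption for a 16.1 compute node and should be adjusted as needed). The direct_mandatory_flag option lives under [oslo_messaging_rabbit]; once the "Fix type of direct_mandatory_flag opt" change is in place, setting it to false makes the driver stop requesting broker returns, so undeliverable replies are dropped instead of raising MessageUndeliverable. Whether turning it off is appropriate for a given deployment should be confirmed with support.

```python
# A minimal sketch: report whether the RabbitMQ "mandatory" flag is enabled
# for direct sends on a compute node. The config file path is an assumption.
import configparser

NOVA_CONF = "/var/lib/config-data/puppet-generated/nova_libvirt/etc/nova/nova.conf"  # assumed path

parser = configparser.ConfigParser()
parser.read(NOVA_CONF)
flag = parser.get("oslo_messaging_rabbit", "direct_mandatory_flag", fallback="true")
print("direct_mandatory_flag =", flag, "(defaults to true when unset)")
```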

Comment 58 Hervé Beraud 2021-03-18 10:40:27 UTC
All the fixes have been cherry-picked from upstream Train into OSP 16.1.

Build successfully completed https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=35535009

Generated version python-oslo-messaging-10.2.1-1.20201114001303.el8ost
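
As a quick sanity check after updating, a sketch (assuming it is run with the same Python interpreter nova-compute uses, for example inside the nova_compute container) to confirm the fixed oslo.messaging build is the one actually loaded:

```python
# Print the oslo.messaging version visible to the nova-compute Python environment.
import pkg_resources

version = pkg_resources.get_distribution("oslo.messaging").version
print("oslo.messaging", version)  # expect 10.2.1 or later for the fixed package
```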

