Bug 1837983

Summary: Implement auto-healing after rabbitmq crash
Product: Red Hat OpenStack
Component: python-oslo-messaging
Version: 13.0 (Queens)
Status: CLOSED NEXTRELEASE
Severity: medium
Priority: medium
Reporter: Andreas Karis <akaris>
Assignee: Hervé Beraud <hberaud>
QA Contact: pkomarov
CC: apevec, athomas, jeckersb, lhh
Keywords: Triaged, ZStream
Hardware: Unspecified
OS: Unspecified
Type: Bug
Last Closed: 2021-05-05 14:26:29 UTC

Description Andreas Karis 2020-05-20 10:28:27 UTC
Description of problem:

A customer just ran into https://access.redhat.com/solutions/3880881

This is a known issue, and I think oslo.messaging (or the services themselves?) should have an auto-healing feature for the case where the exchange cannot be found.


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 4 John Eckersberg 2020-05-27 16:08:57 UTC
Without diving too deeply into the details of this particular issue, the flow I described here is still generally applicable:

https://bugzilla.redhat.com/show_bug.cgi?id=1399237#c15

Normally the "no exchange" errors are some variation of that scenario, and are "normal" at the application layer.  Oslo.messaging is designed to retry against the exchange for some time in hopes that it reappears.  If not, it should discard the reply and stop logging that error.
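
The retry-then-discard behaviour described above can be sketched roughly as follows. This is an illustration only, not oslo.messaging's actual code: `ExchangeNotFound`, `send_reply`, the `publish` callable, and the timeout and interval values are all hypothetical names and numbers chosen for the example.

```python
import time


class ExchangeNotFound(Exception):
    """Hypothetical error raised by `publish` when the reply exchange is missing."""


def send_reply(publish, reply, timeout=60.0, interval=1.0,
               clock=time.monotonic, sleep=time.sleep):
    """Try to publish `reply` until it succeeds or `timeout` seconds elapse.

    Returns True if the reply was published, False if it was discarded
    because the exchange never reappeared within the retry window.
    """
    deadline = clock() + timeout
    while True:
        try:
            publish(reply)
            return True
        except ExchangeNotFound:
            if clock() >= deadline:
                # Give up quietly: the caller is assumed to be gone, so the
                # reply is dropped and no further errors are logged for it.
                return False
            sleep(interval)
```

The `clock` and `sleep` parameters exist only so the loop can be exercised without real waiting; a caller would normally leave them at their defaults.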

Also possibly related:

https://bugzilla.redhat.com/show_bug.cgi?id=1753264#c4

It may be that the origin of the RPC request keeps publishing requests with an incorrect/undeclared exchange.  It is on the sender to ensure the return path is set up before sending the RPC request.
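
As a rough illustration of "set up the return path first", the sketch below uses a small in-memory stand-in for a broker. `FakeBroker`, `call`, and the exchange names are hypothetical and exist only for this example; they are not part of oslo.messaging, kombu, or RabbitMQ.

```python
class FakeBroker:
    """In-memory stand-in for a message broker (assumption for illustration)."""

    def __init__(self):
        self.exchanges = set()
        self.published = []

    def declare_exchange(self, name):
        # Declaring an exchange that already exists is a no-op here.
        self.exchanges.add(name)

    def publish(self, exchange, message):
        if exchange not in self.exchanges:
            raise LookupError(f"no exchange '{exchange}'")
        self.published.append((exchange, message))


def call(broker, target_exchange, reply_exchange, payload):
    """Send an RPC request, declaring the reply exchange beforehand."""
    # The sender owns the return path: declare it before the request goes
    # out, so the server never has to publish to a missing exchange.
    broker.declare_exchange(reply_exchange)
    broker.publish(target_exchange, {"payload": payload, "reply_to": reply_exchange})
```

If the sender skipped the `declare_exchange` step, the server's later publish to `reply_exchange` would fail in exactly the way the "no exchange" errors describe.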