
Bug 1474507

Summary: Need RCA why rabbitmq problems prevented dnsmasq from working.
Product: Red Hat OpenStack
Reporter: Jeremy <jmelvin>
Component: rabbitmq-server
Assignee: Peter Lemenkov <plemenko>
Status: CLOSED DUPLICATE
QA Contact: Udi Shkalim <ushkalim>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 10.0 (Newton)
CC: apevec, chjones, jeckersb, lhh, mzamot, plemenko, srevivo
Target Milestone: ---
Keywords: Triaged, ZStream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-09-12 15:01:25 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Jeremy 2017-07-24 19:30:47 UTC
Description of problem: After a package update and a reboot of the controllers, the customer noticed that neutron-dhcp-agent would not start. The cause turned out to be rabbitmq. The question is why rabbitmq was failing and why that prevented other services from working.


Version-Release number of selected component (if applicable):
rabbitmq-server-3.6.3-6.el7ost.noarch

How reproducible:
unknown

Steps to Reproduce:
1. unknown

Actual results:
rabbitmq db corruption causes dnsmasq not to work.

Expected results:
rabbitmq works properly

Additional info:
The problem was resolved by running: pcs resource restart rabbitmq-clone

Comment 3 Michael Zamot 2017-08-28 19:23:34 UTC
Invalid

Comment 4 Chris Jones 2017-09-11 20:33:00 UTC
@Michael - could you explain your comment "Invalid" please?

Comment 5 Peter Lemenkov 2017-09-12 14:55:06 UTC
I believe we've found what's going on. Short story: please upgrade to rabbitmq-server-3.6.3-7.el7ost.

Long story: we tried to speed up various rabbitmqctl operations with an experimental out-of-tree patch, which worked rather well in previous RHOS versions. Unfortunately, it started causing issues on versions higher than RHOS10 because of changed (to be fair, improved) iptables rules. The same applies during intermittent network outages, where rabbitmqctl can no longer work properly. We reverted that patch in this version, so everything should work with rabbitmq-server-3.6.3-7.el7ost.

Comment 6 Peter Lemenkov 2017-09-12 15:01:25 UTC

*** This bug has been marked as a duplicate of bug 1434593 ***

Comment 7 Michael Zamot 2017-09-12 20:47:52 UTC
The issue reported here is why dnsmasq didn't start when RabbitMQ was down. The explanation is that dnsmasq is started by neutron-dhcp-agent, which won't start while RabbitMQ is down; this is normal behaviour.

In this case, RabbitMQ had stuck processes and was misbehaving because of networking issues (port flapping).
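
When dnsmasq is missing on a controller, a quick first step is to check whether the message broker is reachable at all. The following is a minimal sketch and not part of any Red Hat or OpenStack tooling: the function name amqp_port_reachable is hypothetical, "localhost" is a placeholder address, and 5672 is assumed because it is RabbitMQ's default AMQP port. A successful TCP connect only shows that the port is open; a broker with stuck processes may still accept connections, so a positive result does not prove RabbitMQ is healthy.

```python
import socket


def amqp_port_reachable(host: str, port: int = 5672, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # "localhost" is a placeholder; substitute the address RabbitMQ listens on.
    if amqp_port_reachable("localhost"):
        print("AMQP port reachable; check broker health next")
    else:
        print("AMQP port unreachable; neutron-dhcp-agent (and so dnsmasq) is unlikely to start")
```

If the port is unreachable, the dependency chain described above (RabbitMQ, then neutron-dhcp-agent, then dnsmasq) is the likely explanation for the missing dnsmasq process.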