Bug 1591501 - Default RabbitMQ timeout settings cause issues with OpenStack services.
Status: CLOSED ERRATA
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 10.0 (Newton)
Hardware: All Linux
Priority: high
Severity: high
Target Milestone: z9
Target Release: 10.0 (Newton)
Assigned To: John Eckersberg
QA Contact: Gurenko Alex
Keywords: Triaged, ZStream
Depends On:
Blocks: 1592554
Reported: 2018-06-14 18:10 EDT by Siggy Sigwald
Modified: 2018-09-17 15:25 EDT (History)
CC List: 9 users

See Also:
Fixed In Version: openstack-tripleo-heat-templates-5.3.10-14.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Clones: 1592554
Environment:
Last Closed: 2018-09-17 12:56:14 EDT
Type: Bug


External Trackers:
Tracker: Red Hat Product Errata
ID: RHBA-2018:2670
Priority: None
Status: None
Summary: None
Last Updated: 2018-09-17 12:57 EDT

Description Siggy Sigwald 2018-06-14 18:10:28 EDT
Description of problem:
The RabbitMQ timeout is set to 5000 ms (5 seconds) by default in all our deployments.
In some cases, if the load on the system or network is too high, the 5-second timeout can force RabbitMQ into a split-brain situation.
This value is hardcoded in /usr/share/openstack-tripleo-heat-templates/puppet/services/rabbitmq.yaml

Current value:
RABBITMQ_SERVER_ERL_ARGS: '"+K true +P 1048576 -kernel inet_default_connect_options [{nodelay,true},{raw,6,18,<<5000:64/native>>}] -kernel inet_default_listen_options [{raw,6,18,<<5000:64/native>>}]"'

Should be changed to:
RABBITMQ_SERVER_ERL_ARGS: '"+K true +P 1048576 -kernel inet_default_connect_options [{nodelay,true},{raw,6,18,<<30000:64/native>>}] -kernel inet_default_listen_options [{raw,6,18,<<30000:64/native>>}]"'
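For readers unfamiliar with the `{raw,6,18,<<N:64/native>>}` syntax, the following Python sketch shows how the value is put together. It rests on an assumption that the report does not state explicitly: that level 6 and option 18 are IPPROTO_TCP and the Linux TCP_USER_TIMEOUT socket option, whose value is in milliseconds. The function names are hypothetical and exist only for illustration; this is not how the templates generate the string.

```python
import struct

# Assumption: these are the Linux values behind "raw,6,18" in the Erlang args.
IPPROTO_TCP = 6        # protocol level
TCP_USER_TIMEOUT = 18  # socket option number; the value is in milliseconds


def erl_raw_timeout(ms):
    """Render the Erlang raw socket option tuple for a timeout in milliseconds."""
    return "{raw,%d,%d,<<%d:64/native>>}" % (IPPROTO_TCP, TCP_USER_TIMEOUT, ms)


def server_erl_args(ms):
    """Build the inner RABBITMQ_SERVER_ERL_ARGS string quoted in this report."""
    raw = erl_raw_timeout(ms)
    return ("+K true +P 1048576 "
            "-kernel inet_default_connect_options [{nodelay,true},%s] "
            "-kernel inet_default_listen_options [%s]" % (raw, raw))


def native_bytes(ms):
    """Python equivalent of Erlang's <<Ms:64/native>>: 8 bytes, host byte order."""
    return struct.pack("=Q", ms)


if __name__ == "__main__":
    print(server_erl_args(5000))   # the current value quoted above
    print(server_erl_args(30000))  # the value proposed above
```

Only the integer inside the two binaries changes between the current and proposed settings; the rest of the argument string is identical.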

Version-Release number of selected component (if applicable):
This setting is present in all currently supported versions of RHOSP.

How reproducible:
100%
Comment 2 John Eckersberg 2018-06-18 16:02:37 EDT
This was fixed in 12/pike by increasing it from 5 to 15 seconds:

https://review.openstack.org/#/c/485248/

We should backport that for 10/newton and 11/ocata.

Note that 13/queens removes this behavior entirely, see:

https://review.openstack.org/#/c/503788/
https://bugs.launchpad.net/tripleo/+bug/1717006
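Since the value differs by release (5 seconds originally, 15 seconds from 12/pike, and removed in 13/queens), a quick way to see what a given node carries is to scan its rabbitmq.yaml for the raw option. The sketch below is a minimal, hypothetical helper that assumes the option appears in the same textual form quoted in the description; it is not part of any shipped tooling.

```python
import re
from typing import List

# Matches the raw option exactly as written in the description of this bug.
RAW_TIMEOUT = re.compile(r"\{raw,6,18,<<(\d+):64/native>>\}")


def find_timeouts_ms(template_text: str) -> List[int]:
    """Return every timeout value (in ms) found in the template text."""
    return [int(value) for value in RAW_TIMEOUT.findall(template_text)]


def describe(template_text: str) -> str:
    """Summarise which timeout values, if any, the template sets."""
    values = sorted(set(find_timeouts_ms(template_text)))
    if not values:
        return "no raw 6,18 option set"
    return "timeout(s) in ms: " + ", ".join(str(v) for v in values)


if __name__ == "__main__":
    path = "/usr/share/openstack-tripleo-heat-templates/puppet/services/rabbitmq.yaml"
    with open(path) as handle:
        print(describe(handle.read()))
```

A result of 5000 would indicate the template still has the original setting, while no match would be consistent with a release where the option has been removed.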
Comment 14 Alex 2018-09-03 04:01:31 EDT
Hi there,

If this bug requires doc text for errata release, please set the 'Doc Type' and provide draft text according to the template in the 'Doc Text' field.

The documentation team will review, edit, and approve the text.

If this bug does not require doc text, please set the 'requires_doc_text' flag to -.

Thanks,
Alex
Comment 16 errata-xmlrpc 2018-09-17 12:56:14 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2670
