Bug 1123296
Summary: | Rubygem-staypuft: HA: rabbitmq haproxy config timeout must be a lot higher. | ||
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Leonid Natapov <lnatapov> |
Component: | openstack-foreman-installer | Assignee: | John Eckersberg <jeckersb> |
Status: | CLOSED ERRATA | QA Contact: | Leonid Natapov <lnatapov> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 5.0 (RHEL 7) | CC: | aberezin, bperkins, breeler, fdinitto, jeckersb, jguiditt, kschroed, mburns, morazi, rhos-maint, yeylon |
Target Milestone: | z1 | Keywords: | Triaged |
Target Release: | Installer | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | openstack-foreman-installer-2.0.24-1.el6ost | Doc Type: | Bug Fix |
Doc Text: |
Previously, the timeout for RabbitMQ HAProxy was set too short. As a consequence, services disconnected and reconnected too often.
This has been fixed by increasing the timeout period.
Now, services will disconnect and reconnect less frequently.
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2014-10-01 13:25:37 UTC | Type: | Bug |
Description
Leonid Natapov
2014-07-25 09:13:31 UTC
(In reply to Leonid Natapov from comment #0)
> Rubygem-staypuft: HA: amqp is still set to A/P hot-standby mode. Should be
> A/A.

A/P hot standby mode is absolutely valid for when the user has selected qpid as the messaging layer. Did you mean for $subject of the bug to read:

HA: When using RabbitMQ, amqp is still set to A/P hot-standby mode and its timeout must be a lot higher.

(In reply to Leonid Natapov from comment #0)
> Rubygem-staypuft: HA: amqp is still set to A/P hot-standby mode. Should be
> A/A.
> its timeout must be a lot higher. (900m or so) due to rabbitmq oslo driver
> missing a rabbitmq_heartbeat feature. This is a known upstream
> limitation/bug (already reported). Or you will end up with services
> reconnecting a gazillion times to rabbit.

Are you referring to the haproxy timeout here, or rabbit itself (or both)? If both, I would think we want haproxy to not have exactly the same timeout value as the service it is in front of, so what are the values we want to set here?

(In reply to Jason Guiditta from comment #2)
> (In reply to Leonid Natapov from comment #0)
> > Rubygem-staypuft: HA: amqp is still set to A/P hot-standby mode. Should be
> > A/A.
> > its timeout must be a lot higher. (900m or so) due to rabbitmq oslo driver
> > missing a rabbitmq_heartbeat feature. This is a known upstream
> > limitation/bug (already reported). Or you will end up with services
> > reconnecting a gazillion times to rabbit.
>
> Are you referring to the haproxy timeout here, or rabbit itself (or both)?
> If both, I would think we want haproxy to not have exactly the same timeout
> value as the service it is in front of, so what are the values we want to
> set here?

haproxy timeout.

Eck, any updates?

The A/A bit has been fixed, see bug 1121185.

The timeout bit is more involved. The short of it is to read through this upstream bug:

https://bugs.launchpad.net/oslo.messaging/+bug/856764/

Setting the haproxy timeout higher isn't necessarily going to help things, and may hurt under some circumstances. I need to do some more legwork to examine all the different versions of components we have for RHOS5 (amqplib, python-kombu, rabbitmq-server) and evaluate the state of oslo.messaging, which components are using it, and which are off doing their own thing. It's a tangled mess.

Any updates on this?

I am going to update the backend timeout to 900m to match:

https://github.com/fabbione/rhos-ha-deploy/blob/master/rhos5-rhel7/mrgcloud-setup/RHOS-RHEL-HA-how-to-mrgcloud-rhos5-on-rhel7-lb-latest.txt#L41

The oslo stuff is being tracked elsewhere. Pretend I didn't mention it here.

I should have said, update both the backend and client timeouts to 900m.

*** Bug 1142915 has been marked as a duplicate of this bug. ***

listen amqp
    bind 192.168.0.36:5672
    mode tcp
    option tcplog
    timeout client 900m
    timeout server 900m

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1350.html
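
For anyone re-verifying this on a deployed controller, the check against the `listen amqp` stanza quoted above can be sketched in a few lines of Python. This is only an illustration and not part of the installer: the helper names (`to_ms`, `section_timeouts`, `timeouts_ok`) are made up for this note, the parser recognises only the most common section keywords, and it assumes the timeouts are set directly inside the `listen` section instead of being inherited from `defaults`.

```python
import re

# HAProxy time suffixes, expressed in milliseconds. A bare number on a
# "timeout" line is interpreted by HAProxy as milliseconds.
_UNIT_MS = {"us": 0.001, "ms": 1, "s": 1000, "m": 60_000,
            "h": 3_600_000, "d": 86_400_000}

# Simplification: only the most common section keywords are recognised.
_SECTION_KEYWORDS = ("global", "defaults", "listen", "frontend", "backend")


def to_ms(value):
    """Convert an HAProxy time value such as '900m' or '50s' to milliseconds."""
    match = re.fullmatch(r"(\d+)(us|ms|s|m|h|d)?", value)
    if match is None:
        raise ValueError(f"unrecognised haproxy time value: {value!r}")
    number, unit = match.groups()
    return int(number) * _UNIT_MS[unit or "ms"]


def section_timeouts(config_text, section="amqp"):
    """Return the timeouts set inside one `listen` section, in milliseconds."""
    timeouts = {}
    in_section = False
    for raw in config_text.splitlines():
        words = raw.split("#", 1)[0].split()  # drop comments, then tokenise
        if not words:
            continue
        if words[0] in _SECTION_KEYWORDS:
            in_section = (words[0] == "listen" and len(words) > 1
                          and words[1] == section)
            continue
        if in_section and words[0] == "timeout" and len(words) >= 3:
            timeouts[words[1]] = to_ms(words[2])
    return timeouts


def timeouts_ok(config_text, minimum="900m", section="amqp"):
    """True if both client and server timeouts are at least `minimum`."""
    found = section_timeouts(config_text, section)
    floor = to_ms(minimum)
    return all(found.get(side, 0) >= floor for side in ("client", "server"))


# The stanza from this bug, used as sample input.
EXAMPLE = """\
listen amqp
    bind 192.168.0.36:5672
    mode tcp
    option tcplog
    timeout client 900m
    timeout server 900m
"""
```

Calling `timeouts_ok(EXAMPLE)` checks the stanza from this bug; on a real host the text would be read from the HAProxy configuration file (commonly `/etc/haproxy/haproxy.cfg`, which is an assumption about the deployment).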