Bug 1133958

Summary: Configure mco client conf so broker can handle activemq outages gracefully
Product: OpenShift Container Platform
Component: Documentation
Version: 2.1.0
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: low
Reporter: Luke Meyer <lmeyer>
Assignee: brice <bfallonf>
QA Contact: Alex Dellapenta <adellape>
CC: bleanhar, charles_sheridan, gpei, jokerman, libra-bugs, libra-onpremise-devel, lmeyer, mmccomas, xiama
Doc Type: Bug Fix
Type: Bug
Clone Of: 1065048
Bug Depends On: 1065048
Last Closed: 2014-09-15 04:57:21 UTC

Description Luke Meyer 2014-08-26 14:20:52 UTC
+++ This bug was initially created as a clone of Bug #1065048 +++

Description of problem:
Previously, when activemq was unavailable (which can happen due to any number of failures: DNS record missing, network broken, port blocked, activemq stopped or crashed...), the broker set no timeout on its attempt to reach activemq via MCollective. As a result, users' requests to the broker stalled until the httpd request timed out, and they got no useful error message. There wasn't even anything in the broker logs to tell an administrator what was going on.

The installer was changed to address this by configuring the mco client so that it gives up with a relevant error message after a brief period of trying to connect. Incidentally, I also changed the default server (node) connection retry timeout so that nodes reconnect faster after an activemq outage.
https://github.com/openshift/openshift-extras/pull/440

The following related changes should be made in the relevant docs sections:

broker mco client config: add to /opt/rh/ruby193/root/etc/mcollective/client.cfg
# Broker will retry ActiveMQ connection, then report error
plugin.activemq.initial_reconnect_delay = 0.1
plugin.activemq.max_reconnect_attempts = 6

node mco server config: add to /opt/rh/ruby193/root/etc/mcollective/server.cfg
# Node should retry connecting to ActiveMQ forever
plugin.activemq.max_reconnect_attempts = 0
plugin.activemq.initial_reconnect_delay = 0.1
plugin.activemq.max_reconnect_delay = 4.0
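
To get a feel for what these values mean in practice, here is a rough sketch of the retry schedule they would produce. It rests on assumptions not stated in this bug: that exponential backoff is enabled with a multiplier of 2 (which I believe are the connector defaults), that max_reconnect_delay falls back to 30.0 where the stanza does not set it, and that the delay doubles after each failed attempt. The function name and parameters are made up for illustration, and the real connector's timing may differ in detail.

```python
def reconnect_delays(initial_delay, max_delay, attempts, multiplier=2.0):
    """Return the wait before each retry under capped exponential backoff.

    Hypothetical helper: models the schedule only, not the connector itself.
    """
    delays = []
    delay = initial_delay
    for _ in range(attempts):
        # Each wait is capped at max_delay, then the uncapped delay grows.
        delays.append(min(delay, max_delay))
        delay *= multiplier
    return delays


# Broker client: initial delay 0.1, 6 attempts.
# max_reconnect_delay is not set in the client stanza; 30.0 is an assumed default.
broker = reconnect_delays(0.1, 30.0, 6)
print("broker waits:", broker, "total:", round(sum(broker), 1))

# Node server: max_reconnect_attempts = 0 means retry forever, so only the
# first few waits are shown; they level off at max_reconnect_delay = 4.0.
node = reconnect_delays(0.1, 4.0, 8)
print("node waits:", node)
```

Under these assumptions the broker spends only a few seconds waiting between attempts (not counting the time each connection attempt itself takes) before reporting an error, while a node never waits more than 4 seconds between retries once activemq is back.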

Comment 2 brice 2014-08-27 06:06:26 UTC
Hi, Luke.

I added the recommended stanzas to 7.7.2 and 8.7.

Luke, is that all that was needed for this BZ?

Thanks!

Comment 10 Luke Meyer 2014-09-08 19:58:53 UTC
Looks good.

Comment 11 brice 2014-09-09 02:06:55 UTC
Groovy, thanks Luke. 

Putting this onto QA.

Comment 12 Alex Dellapenta 2014-09-11 18:51:11 UTC
QA'd, looks good.