Bug 1348492 - RabbitMQ fails to start on one controller in an HA environment
Summary: RabbitMQ fails to start on one controller in an HA environment
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: rhosp-director
Version: 10.0 (Newton)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: 10.0 (Newton)
Assignee: Angus Thomas
QA Contact: Omri Hochman
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-06-21 10:16 UTC by Marius Cornea
Modified: 2016-10-12 13:32 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-10-12 13:32:03 UTC
Target Upstream Version:



Description Marius Cornea 2016-06-21 10:16:42 UTC
Description of problem:
RabbitMQ fails to start on one controller in an HA environment.

Version-Release number of selected component (if applicable):
rabbitmq-server-3.6.2-3.el7ost.noarch
puppet-rabbitmq-5.4.0-0.20160608174914.bc10e46.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Deploy overcloud with 3 controllers
2. Run pcs status

Actual results:
 Clone Set: rabbitmq-clone [rabbitmq]
     Started: [ overcloud-controller-0 overcloud-controller-2 ]
     Stopped: [ overcloud-controller-1 ]
* rabbitmq_start_0 on overcloud-controller-1 'unknown error' (1): call=81, status=Timed Out, exitreason='none',
    last-rc-change='Tue Jun 21 10:04:37 2016', queued=1ms, exec=100008ms

Expected results:
The server gets started on all nodes.

Additional info:
[root@overcloud-controller-0 ~]# rabbitmqctl cluster_status
Cluster status of node 'rabbit@overcloud-controller-0' ...
[{nodes,[{disc,['rabbit@overcloud-controller-0',
                'rabbit@overcloud-controller-1',
                'rabbit@overcloud-controller-2']}]},
 {running_nodes,['rabbit@overcloud-controller-2',
                 'rabbit@overcloud-controller-0']},
 {cluster_name,<<"rabbit@overcloud-controller-2.localdomain">>},
 {partitions,[]},
 {alarms,[{'rabbit@overcloud-controller-2',[]},
          {'rabbit@overcloud-controller-0',[]}]}]
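
The cluster_status output above lists overcloud-controller-1 as a cluster member but not as a running node. A minimal sketch of how that comparison could be automated follows. The function name stopped_nodes and the SAMPLE constant are hypothetical, and the regex assumes the Erlang-term layout printed by rabbitmqctl 3.6.x as shown in this report; other versions may format the output differently.

```python
import re

# Output copied from the cluster_status call in this report (trimmed).
SAMPLE = """\
[{nodes,[{disc,['rabbit@overcloud-controller-0',
                'rabbit@overcloud-controller-1',
                'rabbit@overcloud-controller-2']}]},
 {running_nodes,['rabbit@overcloud-controller-2',
                 'rabbit@overcloud-controller-0']},
 {partitions,[]}]
"""


def stopped_nodes(cluster_status):
    """Return cluster members that do not appear under running_nodes."""

    def names(section):
        # Capture everything from "{<section>," up to the first "]}",
        # then pull out the single-quoted node names inside it.
        match = re.search(r"\{" + section + r",(.*?)\]\}", cluster_status, re.S)
        return re.findall(r"'([^']+)'", match.group(1)) if match else []

    members = names("nodes")
    running = set(names("running_nodes"))
    return [node for node in members if node not in running]


if __name__ == "__main__":
    print(stopped_nodes(SAMPLE))
```

In practice the input would come from capturing the command's stdout on a controller rather than from a pasted constant.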


The RabbitMQ log on overcloud-controller-1 shows the following:
=INFO REPORT==== 21-Jun-2016::10:04:51 ===
Server startup complete; 0 plugins started.

=INFO REPORT==== 21-Jun-2016::10:04:53 ===
Stopping RabbitMQ

=INFO REPORT==== 21-Jun-2016::10:04:53 ===
stopped TCP Listener on 10.0.0.13:5672

=INFO REPORT==== 21-Jun-2016::10:04:54 ===
Stopped RabbitMQ application

=INFO REPORT==== 21-Jun-2016::10:04:58 ===
Clustering with ['rabbit@overcloud-controller-0',
                 'rabbit@overcloud-controller-2'] as disc node

=INFO REPORT==== 21-Jun-2016::10:06:36 ===
Stopping RabbitMQ

=INFO REPORT==== 21-Jun-2016::10:06:36 ===
Stopped RabbitMQ application

=INFO REPORT==== 21-Jun-2016::10:06:36 ===
Halting Erlang VM
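
The log shows a long pause between the "Clustering with" report and the next "Stopping RabbitMQ" report. The sketch below measures that gap from the report headers; for the excerpt in this bug it is comparable to the exec=100008ms reported by pcs for the timed-out start operation, although the log alone does not prove the two are related. The names parse_reports, seconds_between and LOG are hypothetical, and the header pattern assumes the "=LEVEL REPORT==== dd-Mon-yyyy::hh:mm:ss ===" format seen above, with English month abbreviations.

```python
import re
from datetime import datetime

# Report header as it appears in the log excerpt in this bug.
HEADER = re.compile(r"^=\w+ REPORT==== (\S+) ===$")

# Two reports copied from the overcloud-controller-1 log.
LOG = """\
=INFO REPORT==== 21-Jun-2016::10:04:58 ===
Clustering with ['rabbit@overcloud-controller-0',
                 'rabbit@overcloud-controller-2'] as disc node

=INFO REPORT==== 21-Jun-2016::10:06:36 ===
Stopping RabbitMQ
"""


def parse_reports(log_text):
    """Yield (timestamp, message) pairs, one per report block."""
    stamp, body = None, []
    for raw in log_text.splitlines():
        line = raw.strip()
        match = HEADER.match(line)
        if match:
            if stamp is not None:
                yield stamp, " ".join(body)
            stamp = datetime.strptime(match.group(1), "%d-%b-%Y::%H:%M:%S")
            body = []
        elif line:
            body.append(line)
    if stamp is not None:
        yield stamp, " ".join(body)


def seconds_between(log_text, start_marker, end_marker):
    """Seconds from the first start_marker report to the next end_marker report."""
    start = None
    for stamp, message in parse_reports(log_text):
        if start is None and message.startswith(start_marker):
            start = stamp
        elif start is not None and message.startswith(end_marker):
            return (stamp - start).total_seconds()
    return None  # one of the markers was not found


if __name__ == "__main__":
    print(seconds_between(LOG, "Clustering with", "Stopping RabbitMQ"))
```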

I can provide a reproducer environment if needed. Please let me know. Thanks.

