Bug 1285842 - RabbitMQ fails to start
Status: CLOSED CURRENTRELEASE
Product: Red Hat OpenStack
Classification: Red Hat
Component: rabbitmq-server
Version: 7.0 (Kilo)
Hardware: x86_64 Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 6.0 (Juno)
Assigned To: Peter Lemenkov
QA Contact: Shai Revivo
Keywords: TestOnly, ZStream
Depends On:
Blocks: 1273812
Reported: 2015-11-26 11:07 EST by Pablo Iranzo Gómez
Modified: 2018-02-08 06:04 EST (History)

See Also:
Fixed In Version: rabbitmq-server-3.3.5-12.el7ost
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-08-05 13:39:18 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Pablo Iranzo Gómez 2015-11-26 11:07:23 EST
Description of problem:


RabbitMQ fails to start on one node. According to the resource agent:


# unconditionally join the cluster
	$RMQ_CTL stop_app > /dev/null 2>&1
	for node in $(echo "$join_list"); do
		ocf_log info "Attempting to join cluster with target node $node"
		$RMQ_CTL join_cluster $node
		if [ $? -eq 0 ]; then
			ocf_log info "Joined cluster by connecting to node $node, starting app"
			$RMQ_CTL start_app
			rc=$?
			if [ $rc -ne 0 ]; then
				ocf_log err "'$RMQ_CTL start_app' failed"
			fi
			break;
		fi
	done


'/usr/sbin/rabbitmqctl start_app' fails, even though the preceding join_cluster call must have succeeded for the script to reach that point.
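
To make the two failure modes easier to tell apart when reproducing this by hand, the same sequence can be wrapped so that a failed join and a failed start return different exit codes. This is only a minimal sketch: join_and_start and its return codes are hypothetical and are not part of the resource agent, and RMQ_CTL is assumed to default to the path seen in the logs.

```shell
#!/bin/sh
# Hypothetical helper, NOT part of the rabbitmq-cluster resource agent.
# It runs the same stop_app / join_cluster / start_app sequence, but
# returns a distinct code for each failure mode.
: "${RMQ_CTL:=/usr/sbin/rabbitmqctl}"

join_and_start() {
	# Arguments: candidate node names, tried in order.
	$RMQ_CTL stop_app > /dev/null 2>&1
	for node in "$@"; do
		if $RMQ_CTL join_cluster "$node"; then
			if $RMQ_CTL start_app; then
				return 0	# joined and started
			fi
			return 2	# joined, but start_app failed (the case reported here)
		fi
	done
	return 1	# no node in the list accepted the join
}
```

As in the agent's loop, no further nodes are tried once one join succeeds, so a return code of 2 corresponds to the "'$RMQ_CTL start_app' failed" message.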


Logs:

Nov 12 12:01:19 server rabbitmq-cluster(rabbitmq-server)[22172]: ERROR: '/usr/sbin/rabbitmqctl start_app' failed
Nov 12 12:01:19 server rabbitmq-cluster(rabbitmq-server)[22172]: INFO: Join process incomplete, shutting down.
Nov 12 12:01:19 server rabbitmq-cluster(rabbitmq-server)[22172]: INFO: node failed to join even after reseting local data. Check SELINUX policy
Nov 12 12:01:19 server lrmd[4211]: notice: operation_finished: rabbitmq-server_start_0:22172:stderr [ Error: {rabbit,failure_during_boot, ]
Nov 12 12:01:19 server lrmd[4211]: notice: operation_finished: rabbitmq-server_start_0:22172:stderr [            {could_not_start,rabbitmq_management, ]
Nov 12 12:01:19 server lrmd[4211]: notice: operation_finished: rabbitmq-server_start_0:22172:stderr [                {{shutdown, ]
Nov 12 12:01:19 server lrmd[4211]: notice: operation_finished: rabbitmq-server_start_0:22172:stderr [                     {failed_to_start_child,rabbit_mgmt_sup, ]
Nov 12 12:01:19 server lrmd[4211]: notice: operation_finished: rabbitmq-server_start_0:22172:stderr [                         {'EXIT', ]
Nov 12 12:01:19 server lrmd[4211]: notice: operation_finished: rabbitmq-server_start_0:22172:stderr [                             {{shutdown, ]
Nov 12 12:01:19 server lrmd[4211]: notice: operation_finished: rabbitmq-server_start_0:22172:stderr [                                  [{{already_started,<5817.370.0>}, ]
Nov 12 12:01:19 server lrmd[4211]: notice: operation_finished: rabbitmq-server_start_0:22172:stderr [                                    {child,undefined,rabbit_mgmt_db, ]
Nov 12 12:01:19 server lrmd[4211]: notice: operation_finished: rabbitmq-server_start_0:22172:stderr [                                        {rabbit_mgmt_db,start_link,[]}, ]
Nov 12 12:01:19 server lrmd[4211]: notice: operation_finished: rabbitmq-server_start_0:22172:stderr [                                        permanent,4294967295,worker, ]
Nov 12 12:01:19 server lrmd[4211]: notice: operation_finished: rabbitmq-server_start_0:22172:stderr [                                        [rabbit_mgmt_db]}}]}, ]
Nov 12 12:01:19 server lrmd[4211]: notice: operation_finished: rabbitmq-server_start_0:22172:stderr [                              {gen_server2,call, ]
Nov 12 12:01:19 server lrmd[4211]: notice: operation_finished: rabbitmq-server_start_0:22172:stderr [                                  [<5324.556.0>, ]
Nov 12 12:01:19 server lrmd[4211]: notice: operation_finished: rabbitmq-server_start_0:22172:stderr [                                   {init,<5324.554.0>}, ]
Nov 12 12:01:19 server lrmd[4211]: notice: operation_finished: rabbitmq-server_start_0:22172:stderr [                                   infinity]}}}}}, ]
Nov 12 12:01:19 server lrmd[4211]: notice: operation_finished: rabbitmq-server_start_0:22172:stderr [                 {rabbit_mgmt_app,start,[normal,[]]}}}} ]
Nov 12 12:01:19 server lrmd[4211]: notice: operation_finished: rabbitmq-server_start_0:22172:stderr [ Error: {rabbit,failure_during_boot, ]
Nov 12 12:01:19 server lrmd[4211]: notice: operation_finished: rabbitmq-server_start_0:22172:stderr [            {could_not_start,rabbitmq_management, ]
Nov 12 12:01:19 server lrmd[4211]: notice: operation_finished: rabbitmq-server_start_0:22172:stderr [                {{shutdown, ]
Nov 12 12:01:19 server lrmd[4211]: notice: operation_finished: rabbitmq-server_start_0:22172:stderr [                     {failed_to_start_child,rabbit_mgmt_sup, ]
Nov 12 12:01:19 server lrmd[4211]: notice: operation_finished: rabbitmq-server_start_0:22172:stderr [                         {'EXIT', ]
Nov 12 12:01:19 server lrmd[4211]: notice: operation_finished: rabbitmq-server_start_0:22172:stderr [                             {{shutdown, ]
Nov 12 12:01:19 server lrmd[4211]: notice: operation_finished: rabbitmq-server_start_0:22172:stderr [                                  [{{already_started,<5817.370.0>}, ]
Nov 12 12:01:19 server lrmd[4211]: notice: operation_finished: rabbitmq-server_start_0:22172:stderr [                                    {child,undefined,rabbit_mgmt_db, ]
Nov 12 12:01:19 server lrmd[4211]: notice: operation_finished: rabbitmq-server_start_0:22172:stderr [                                        {rabbit_mgmt_db,start_link,[]}, ]
Nov 12 12:01:19 server lrmd[4211]: notice: operation_finished: rabbitmq-server_start_0:22172:stderr [                                        permanent,4294967295,worker, ]
Nov 12 12:01:19 server lrmd[4211]: notice: operation_finished: rabbitmq-server_start_0:22172:stderr [                                        [rabbit_mgmt_db]}}]}, ]
Nov 12 12:01:19 server lrmd[4211]: notice: operation_finished: rabbitmq-server_start_0:22172:stderr [                              {gen_server2,call, ]
Nov 12 12:01:19 server lrmd[4211]: notice: operation_finished: rabbitmq-server_start_0:22172:stderr [                                  [<5324.666.0>, ]
Nov 12 12:01:19 server lrmd[4211]: notice: operation_finished: rabbitmq-server_start_0:22172:stderr [                                   {init,<5324.664.0>}, ]
Nov 12 12:01:19 server lrmd[4211]: notice: operation_finished: rabbitmq-server_start_0:22172:stderr [                                   infinity]}}}}}, ]
Nov 12 12:01:19 server lrmd[4211]: notice: operation_finished: rabbitmq-server_start_0:22172:stderr [                 {rabbit_mgmt_app,start,[normal,[]]}}}} ]
Nov 12 12:01:19 server crmd[4214]: notice: process_lrm_event: Operation rabbitmq-server_start_0: unknown error (node=pcmk-server, call=1972, rc=1, cib-update=1075, confirmed=true)
Nov 12 12:01:19 server crmd[4214]: notice: process_lrm_event: pcmk-server-rabbitmq-server_start_0:1972 [ Error: {rabbit,failure_during_boot,\n           {could_not_start,rabbitmq_management,\n               {{shutdown,\n                    {failed_to_start_child,rabbit_mgmt_sup,\n                        {'EXIT',\n                            {{shutdown,\n                                 [{{already_started,<5817.370.0>},\n                                   {child,undefined,rabbit_mgmt_db,\n
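
The Erlang error term is easier to read once the lrmd wrapper is stripped from each line. A minimal sketch, assuming the lines have the "...:stderr [ payload ]" shape shown above; unwrap_lrmd_stderr is a hypothetical helper name, not an existing tool:

```shell
#!/bin/sh
# Hypothetical helper: read syslog lines on standard input and print only
# the payload of lrmd "stderr [ ... ]" lines, dropping the syslog prefix
# and the surrounding brackets. Lines without that wrapper are skipped.
unwrap_lrmd_stderr() {
	sed -n 's/^.*:stderr \[ \(.*\) ]$/\1/p'
}
```

Piping the saved log excerpt through unwrap_lrmd_stderr yields the nested {rabbit,failure_during_boot,...} term with its indentation preserved, which makes the {already_started,<5817.370.0>} entry for rabbit_mgmt_db easier to spot.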





Version-Release number of selected component (if applicable):


rabbitmq-server-3.3.5-3.el7ost.noarch
Comment 3 Peter Lemenkov 2015-12-02 12:16:17 EST
This looks like a known bug in the pacemaker resource agent. See these links for details:

* https://bugs.launchpad.net/fuel/+bug/1495885
* https://bugs.launchpad.net/fuel/+bug/1484280 (this one contains a fix which might be used in this situation as well).

No other news so far.
Comment 4 Peter Lemenkov 2015-12-02 14:04:50 EST
OK, we've got a patch from the RabbitMQ side! I'm working on backporting it and building a test package.
Comment 5 Peter Lemenkov 2015-12-03 10:08:23 EST
Here is a build (for the upcoming RHOS8):

* https://brewweb.devel.redhat.com/taskinfo?taskID=10187990

You may try it if you really want to. However, I would advise you not to make any rash or hasty decisions. This issue seems to be a rare one, so it is probably not necessary to upgrade a customer's setup just because of it. If you do decide not to take any destructive actions, then the next time this happens please reboot the entire node; that should fix it.
Comment 10 Lon Hohberger 2016-07-25 10:27:09 EDT
According to our records, this should be resolved by rabbitmq-server-3.3.5-22.el7ost.  This build is available now.
