Description of problem:

Customer hit a RabbitMQ issue where the logs grew very large and exhausted the root partition space. I found this trace in the logs, among many other traces. Things seem to start here:

Mirrored queue 'q-reports-plugin_fanout_fe6e03d722504b54983d5b369ee9f545' in vhost '/': Promoting slave <rabbit.1064.0> to master

=INFO REPORT==== 28-Jun-2018::17:32:43 ===
Mirrored queue 'compute.ock00001i2.cor00005.cni.ukcloud.com' in vhost '/': Slave <rabbit.9587.0> saw deaths of mirrors <rabbit.8080.0>

=INFO REPORT==== 28-Jun-2018::17:32:43 ===
Mirrored queue 'compute.ock00001i2.cor00005.cni.ukcloud.com' in vhost '/': Promoting slave <rabbit.9587.0> to master

=INFO REPORT==== 28-Jun-2018::17:32:43 ===
Mirrored queue 'reply_2478c84f57c0400bbc70ccd3d1e06d29' in vhost '/': Slave <rabbit.5786.8> saw deaths of mirrors <rabbit.2934.9>

=INFO REPORT==== 28-Jun-2018::17:32:43 ===
node rabbit@occ00002i2 down: etimedout

=INFO REPORT==== 28-Jun-2018::17:32:43 ===
node rabbit@occ00002i2 up

=ERROR REPORT==== 28-Jun-2018::17:32:43 ===
Partial partition detected:
 * We saw DOWN from rabbit@occ00002i2
 * We can still see rabbit@occ00003i2 which can see rabbit@occ00002i2
 * pause_minority mode enabled
We will therefore pause until the *entire* cluster recovers

=WARNING REPORT==== 28-Jun-2018::17:32:43 ===
Cluster minority/secondary status detected - awaiting recovery

=INFO REPORT==== 28-Jun-2018::17:32:43 ===
Stopping RabbitMQ

=INFO REPORT==== 28-Jun-2018::17:32:43 ===
stopped TCP Listener on 10.41.54.20:5672

=ERROR REPORT==== 28-Jun-2018::17:32:43 ===
Error on AMQP connection <0.11773.1> (10.41.54.19:58070 -> 10.41.54.20:5672, vhost: '/', user: 'guest', state: running), channel 0:
operation none caused a connection exception connection_forced: "broker forced connection closure with reason 'shutdown'"

=ERROR REPORT==== 28-Jun-2018::17:32:43 ===
Error on AMQP connection <0.17892.24> (10.41.54.15:40896 -> 10.41.54.20:5672, vhost: '/', user: 'guest', state: running), channel 0:
operation none caused a connection exception connection_forced: "broker forced connection closure with reason 'shutdown'"

..
..

=ERROR REPORT==== 28-Jun-2018::17:32:44 ===
** Generic server <0.28123.24> terminating
** Last message in was {'$gen_cast',join}
** When Server state == {state,
    {4,<0.28123.24>},
    {{4,<0.28123.24>},undefined},
    {{4,<0.28123.24>},undefined},
    {resource,<<"/">>,queue,
        <<"reply_093474841b384c55833c3da39c8561e6">>},
    rabbit_mirror_queue_slave,undefined,-1,undefined,
    [<0.28122.24>],
    {[],[]},
    [],0,undefined,
    #Fun<rabbit_misc.execute_mnesia_transaction.1>,
    false}
** Reason for termination ==
** {bad_return_value,
    {error,
     {{badmatch,{error,not_found}},
      [{gm,'-record_dead_member_in_group/5-fun-1-',4,
           [{file,"src/gm.erl"},{line,1140}]},
       {mnesia_tm,apply_fun,3,[{file,"mnesia_tm.erl"},{line,833}]},
       {mnesia_tm,execute_transaction,5,
           [{file,"mnesia_tm.erl"},{line,808}]},
       {rabbit_misc,'-execute_mnesia_transaction/1-fun-0-',1,
           [{file,"src/rabbit_misc.erl"},{line,532}]},
       {worker_pool_worker,'-run/2-fun-0-',3,
           [{file,"src/worker_pool_worker.erl"},{line,77}]}]}}}

...
... more traces ...
...
And then this trace, which seems related to [1] or [2]:

=ERROR REPORT==== 28-Jun-2018::17:32:44 ===
** Reason for termination ==
** {{badmatch,{error,not_found}},
    [{rabbit_mirror_queue_master,stop_all_slaves,2,
        [{file,"src/rabbit_mirror_queue_master.erl"},
         {line,214}]},
     {rabbit_mirror_queue_master,delete_and_terminate,2,
        [{file,"src/rabbit_mirror_queue_master.erl"},
         {line,210}]},
     {rabbit_amqqueue_process,'-terminate_delete/3-fun-1-',6,
        [{file,"src/rabbit_amqqueue_process.erl"},
         {line,283}]},
     {rabbit_amqqueue_process,terminate_shutdown,2,
        [{file,"src/rabbit_amqqueue_process.erl"},
         {line,308}]},
     {gen_server2,terminate,3,[{file,"src/gen_server2.erl"},{line,1129}]},
     {proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,240}]}]}
** In 'terminate' callback with reason ==
** normal

Version-Release number of selected component (if applicable):
OSP 9
rabbitmq-server-3.6.3-6.el7ost

[1] https://github.com/rabbitmq/rabbitmq-server/issues/953
[2] https://github.com/rabbitmq/rabbitmq-server/issues/1035
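For context, the "pause_minority mode enabled" line in the partial-partition trace above comes from the broker's partition handling setting. A minimal sketch of what that looks like in the broker config (the file path and surrounding structure are illustrative; only the cluster_partition_handling key is taken from the log):

```erlang
%% /etc/rabbitmq/rabbitmq.config (illustrative fragment)
[
 {rabbit, [
   %% On a partition, nodes that find themselves in the minority side
   %% pause until the whole cluster is reachable again -- this is the
   %% "We will therefore pause until the *entire* cluster recovers"
   %% behaviour seen in the trace above.
   {cluster_partition_handling, pause_minority}
 ]}
].
```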
(In reply to Eduard Barrera from comment #0)

> ** {bad_return_value,
>     {error,
>      {{badmatch,{error,not_found}},
>       [{gm,'-record_dead_member_in_group/5-fun-1-',4,
>            [{file,"src/gm.erl"},{line,1140}]},
>        {mnesia_tm,apply_fun,3,[{file,"mnesia_tm.erl"},{line,833}]},
>        {mnesia_tm,execute_transaction,5,
>            [{file,"mnesia_tm.erl"},{line,808}]},
>        {rabbit_misc,'-execute_mnesia_transaction/1-fun-0-',1,
>            [{file,"src/rabbit_misc.erl"},{line,532}]},
>        {worker_pool_worker,'-run/2-fun-0-',3,
>            [{file,"src/worker_pool_worker.erl"},{line,77}]}]}}}

^^^ This issue was fixed in this upstream ticket: https://github.com/rabbitmq/rabbitmq-server/issues/950
The fix is available only in RHOS-13 and higher.

> ** {{badmatch,{error,not_found}},
>     [{rabbit_mirror_queue_master,stop_all_slaves,2,
>         [{file,"src/rabbit_mirror_queue_master.erl"},
>          {line,214}]},
>      {rabbit_mirror_queue_master,delete_and_terminate,2,
>         [{file,"src/rabbit_mirror_queue_master.erl"},
>          {line,210}]},
>      {rabbit_amqqueue_process,'-terminate_delete/3-fun-1-',6,
>         [{file,"src/rabbit_amqqueue_process.erl"},
>          {line,283}]},
>      {rabbit_amqqueue_process,terminate_shutdown,2,
>         [{file,"src/rabbit_amqqueue_process.erl"},
>          {line,308}]},
>      {gen_server2,terminate,3,[{file,"src/gen_server2.erl"},{line,1129}]},
>      {proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,240}]}]}

Yes, this issue is addressed in these two tickets (fixed completely in rabbitmq-server-3.6.3-10.el7ost):

> [1] https://github.com/rabbitmq/rabbitmq-server/issues/953
> [2] https://github.com/rabbitmq/rabbitmq-server/issues/1035
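As a side note, a quick sketch of how to tell whether a given environment already carries the fixed build. The installed version is hard-coded here from this report; on a real system it would come from `rpm -q --qf '%{VERSION}-%{RELEASE}\n' rabbitmq-server`, and the comparison leans on GNU `sort -V` version ordering:

```shell
#!/bin/sh
# Illustrative check: does the installed rabbitmq-server predate the fix?
installed="3.6.3-6.el7ost"   # version reported in this bug
fixed="3.6.3-10.el7ost"      # release said to fix issues 953/1035 above

# sort -V orders version strings numerically (6 < 10); if the installed
# version sorts first and differs from the fixed one, it predates the fix.
oldest=$(printf '%s\n%s\n' "$installed" "$fixed" | sort -V | head -n1)
if [ "$oldest" = "$installed" ] && [ "$installed" != "$fixed" ]; then
    echo "rabbitmq-server $installed predates the fix ($fixed) - update needed"
else
    echo "rabbitmq-server $installed includes the fix"
fi
```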
Hi there,

If this bug requires doc text for an errata release, please set the 'Doc Type' and provide draft text according to the template in the 'Doc Text' field. The documentation team will review, edit, and approve the text.

If this bug does not require doc text, please set the 'requires_doc_text' flag to -.

Thanks,
Alex
I don't think this particular ticket requires any documentation, so I'm going to set requires_doc_text to '-'.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:2671