Bug 1348276

Summary: Queue master process terminates in rabbit_mirror_queue_master:stop_all_slaves on promotion
Product: Red Hat OpenStack
Reporter: Marian Krcmarik <mkrcmari>
Component: rabbitmq-server
Assignee: Peter Lemenkov <plemenko>
Status: CLOSED ERRATA
QA Contact: Leonid Natapov <lnatapov>
Severity: high
Priority: unspecified
Version: 9.0 (Mitaka)
CC: apevec, jeckersb, jjoyce, lhh, mkrcmari, oblaut, sclewis, srevivo, ushkalim
Target Milestone: ga
Keywords: AutomationBlocker
Target Release: 9.0 (Mitaka)
Hardware: Unspecified
OS: Unspecified
Fixed In Version: rabbitmq-server-3.6.2-4.el7ost
Doc Type: If docs needed, set a value
Last Closed: 2016-08-11 12:27:05 UTC
Type: Bug
Bug Blocks: 1319334

Description Marian Krcmarik 2016-06-20 16:03:28 UTC
Description of problem:
The RabbitMQ server fails frequently during HA stress tests of OSP9. The failure appears to be caused by a regression that is already reported upstream and targeted for the upstream 3.6.3 release.

Upstream bug: https://github.com/rabbitmq/rabbitmq-server/issues/812

Version-Release number of selected component (if applicable):
rabbitmq-server-3.6.2-3.el7ost.noarch

How reproducible:
Often

Steps to Reproduce:
1. Non-gracefully reset a controller of the HA OSP9 environment.
2. Repeat step 1 until one of the RabbitMQ servers crashes.

Additional info:

** Generic server <0.21044.0> terminating
** Last message in was {maybe_expire,4}
** When Server state == {q,
                         {amqqueue,
                          {resource,<<"/">>,queue,
                           <<"heat-engine-listener_fanout_67a9b88061cb4a7d93cb4381fe86ec7f">>},
                          false,false,none,
                          [{<<"x-expires">>,signedint,600000},
                           {<<"x-ha-policy">>,longstr,<<"all">>}],
                          <0.21044.0>,
                          [<24587.4110.0>,<24588.1171.2>],
                          [<24587.4110.0>],
                          ['rabbit@overcloud-controller-0',
                           'rabbit@overcloud-controller-2'],
                          [{vhost,<<"/">>},
                           {name,<<"ha-all">>},
                           {pattern,<<"^(?!amq\\.).*">>},
                           {'apply-to',<<"all">>},
                           {definition,[{<<"ha-mode">>,<<"all">>}]},
                           {priority,0}],
                          [{<24588.2786.2>,<24588.1171.2>},
                           {<24587.4111.0>,<24587.4110.0>},
                           {<0.21048.0>,<0.21044.0>}],
                          [],live},
                         none,false,rabbit_mirror_queue_master,
                         {state,
                          {resource,<<"/">>,queue,
                           <<"heat-engine-listener_fanout_67a9b88061cb4a7d93cb4381fe86ec7f">>},
                          <0.21048.0>,<0.3691.1>,rabbit_priority_queue,
                          {passthrough,rabbit_variable_queue,
                           {vqstate,
                            {0,{[],[]}},
                            {0,{[],[]}},
                            {delta,undefined,0,undefined},
                            {0,{[],[]}},
                            {0,{[],[]}},
                            0,
                            {0,nil},
                            {0,nil},
                            {0,nil},
                            {qistate,
                             "/var/lib/rabbitmq/mnesia/rabbit@overcloud-controller-1/queues/6OOTXLJWFMOE5W8XW53XWFYJT",
                             {{dict,0,16,16,8,80,48,
                               {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],
                                []},
                               {{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],
                                 []}}},
                              []},
                             undefined,0,32768,
                             #Fun<rabbit_variable_queue.2.131658179>,
                             #Fun<rabbit_variable_queue.3.131658179>,
                             {0,nil},
                             {0,nil},
                             [],[]},
                            {undefined,
                             {client_msstate,msg_store_transient,
                              <<254,88,125,220,97,216,175,80,36,32,92,89,170,
                                80,191,154>>,
                              {dict,0,16,16,8,80,48,
                               {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],
                                []},
                               {{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],
                                 []}}},
                              {state,933956,
                               "/var/lib/rabbitmq/mnesia/rabbit@overcloud-controller-1/msg_store_transient"},
                              rabbit_msg_store_ets_index,
                              "/var/lib/rabbitmq/mnesia/rabbit@overcloud-controller-1/msg_store_transient",
                              <0.827.0>,938051,929813,942146,946240,
                              {2000,500}}},
                            false,0,4096,0,0,0,0,0,infinity,0,0,0,0,0,0,
                            {rates,0.0,0.0,0.0,0.0,-576458772371924324},
                            {0,nil},
                            {0,nil},
                            {0,nil},
                            {0,nil},
                            0,0,0,0,2048,default}},
                          {dict,0,16,16,8,80,48,
                           {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},
                           {{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],
                             []}}},
                          [],
                          {set,0,16,16,8,80,48,
                           {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},
                           {{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],
                             []}}},
                          undefined},
                         {state,{queue,[],[],0},{active,-576458772925912,1.0}},
                         600000,undefined,undefined,
                         {erlang,#Ref<0.0.4.54411>},
                         {state,none,5000,undefined},
                         {0,nil},
                         undefined,undefined,undefined,
                         {state,
                          {dict,0,16,16,8,80,48,
                           {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},
                           {{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],
                             []}}},
                          delegate},
                         undefined,undefined,undefined,undefined,4,running}
** Reason for termination ==
** {timeout_value,
       [{rabbit_mirror_queue_master,'-stop_all_slaves/2-lc$^1/1-1-',3,
            [{file,"src/rabbit_mirror_queue_master.erl"},{line,217}]},
        {rabbit_mirror_queue_master,stop_all_slaves,2,
            [{file,"src/rabbit_mirror_queue_master.erl"},{line,217}]},
        {rabbit_mirror_queue_master,delete_and_terminate,2,
            [{file,"src/rabbit_mirror_queue_master.erl"},{line,205}]},
        {rabbit_amqqueue_process,'-terminate_delete/3-fun-1-',6,
            [{file,"src/rabbit_amqqueue_process.erl"},{line,252}]},
        {rabbit_amqqueue_process,terminate_shutdown,2,
            [{file,"src/rabbit_amqqueue_process.erl"},{line,277}]},
        {gen_server2,terminate,3,[{file,"src/gen_server2.erl"},{line,1146}]},
        {proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,250}]}]}
** In 'terminate' callback with reason ==
** normal
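
The trace above has two distinctive markers: the exit reason `timeout_value` (the error Erlang raises when a `receive ... after` clause is given an invalid timeout) and a stack frame in rabbit_mirror_queue_master:stop_all_slaves. When repeating the reproduction steps, a small helper can confirm that a given crash is this bug and not an unrelated one. This is only a sketch: the function name `has_crash_signature` is made up for illustration, and the log file location is not given in this report, so it is passed in as an argument.

```shell
#!/bin/sh
# has_crash_signature LOGFILE
# Hypothetical helper: succeeds (exit status 0) only if LOGFILE contains
# both markers seen in the trace above:
#   - the `timeout_value` exit reason
#   - the rabbit_mirror_queue_master stop_all_slaves stack frame
has_crash_signature() {
    grep -q 'timeout_value' "$1" &&
        grep -q 'rabbit_mirror_queue_master,stop_all_slaves' "$1"
}
```

Usage: `has_crash_signature /path/to/rabbitmq.log && echo "matches bug 1348276"`, where the path is a placeholder for wherever the broker log lives on the controller.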

Comment 2 Peter Lemenkov 2016-06-29 15:42:13 UTC
This build should fix the issue. Marian, please test.

Comment 4 Marian Krcmarik 2016-07-01 22:43:24 UTC
(In reply to Peter Lemenkov from comment #2)
> This build should fix the issue. Marian, please test.

This particular bug seems to be fixed by the build; I was not able to reproduce it on a setup with the updated build. We should push the package into the puddle.
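
To tell whether a controller still runs the affected build (rabbitmq-server-3.6.2-3.el7ost) or the one listed under Fixed In Version (rabbitmq-server-3.6.2-4.el7ost), the installed version-release string can be compared with the fixed one. The following is a rough sketch, not an official check: `build_status` is a hypothetical name, and it assumes that GNU `sort -V` orders builds of this shape the same way RPM would, which holds for simple cases but is not a full RPM version comparison.

```shell
#!/bin/sh
# First build containing the fix, taken from "Fixed In Version" in this report.
FIXED_BUILD="3.6.2-4.el7ost"

# build_status VERSION-RELEASE
# Hypothetical helper: prints "fixed" if the given build sorts at or after
# FIXED_BUILD, otherwise prints "affected".
# Assumption: GNU `sort -V` is close enough to RPM ordering for these strings.
build_status() {
    lowest=$(printf '%s\n%s\n' "$1" "$FIXED_BUILD" | sort -V | head -n 1)
    if [ "$lowest" = "$FIXED_BUILD" ]; then
        echo "fixed"
    else
        echo "affected"
    fi
}
```

On a controller, the argument could come from `rpm -q --qf '%{VERSION}-%{RELEASE}' rabbitmq-server`, for example `build_status "$(rpm -q --qf '%{VERSION}-%{RELEASE}' rabbitmq-server)"`.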

Comment 6 Udi Shkalim 2016-07-20 12:59:33 UTC
Verified based on comment #4

Comment 8 errata-xmlrpc 2016-08-11 12:27:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-1597.html