Bug 1441685 - RabbitMQ cannot reset the node after failure
Summary: RabbitMQ cannot reset the node after failure
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: rabbitmq-server
Version: 10.0 (Newton)
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: z8
Target Release: 10.0 (Newton)
Assignee: Peter Lemenkov
QA Contact: Udi Shkalim
URL:
Whiteboard:
Duplicates: 1575885
Depends On:
Blocks: 1565136 1565164
 
Reported: 2017-04-12 13:19 UTC by Sergii Mykhailushko
Modified: 2021-09-09 12:14 UTC (History)

Fixed In Version: rabbitmq-server-3.6.3-9.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1565136
Environment:
Last Closed: 2018-05-17 15:49:28 UTC
Target Upstream Version:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | rabbitmq rabbitmq-server issues 530 | 0 | None | closed | Channel errors causing rabbit cluster to be unreachable | 2021-01-22 08:39:37 UTC
Github | rabbitmq rabbitmq-server issues 544 | 0 | None | closed | rabbit_reader doesn't handle exit signals from the socket process, when it's terminating channels | 2021-01-22 08:40:19 UTC
Red Hat Bugzilla | 1431336 | 0 | high | CLOSED | Some instances fail to get VIF plugged in | 2021-02-22 00:41:40 UTC
Red Hat Bugzilla | 1441635 | 0 | urgent | CLOSED | OSP10 -> OSP11 upgrade: nova instance live migration gets stuck with MIGRATING status before running compute node upgrad... | 2021-02-22 00:41:40 UTC
Red Hat Knowledge Base (Solution) | 3440481 | 0 | None | None | None | 2018-05-11 08:02:33 UTC
Red Hat Product Errata | RHBA-2018:1596 | 0 | None | None | None | 2018-05-17 15:50:40 UTC

Internal Links: 1441635 1447355

Description Sergii Mykhailushko 2017-04-12 13:19:12 UTC
Description of problem:
Socket process errors when terminating channel

Version-Release number of selected component (if applicable):
rabbitmq-server-3.6.3-6.el7ost.noarch 

=CRASH REPORT==== 27-Mar-2017::19:32:17 ===
  crasher:
    initial call: rabbit_reader:init/4
    pid: <0.22090.0>
    registered_name: []
    exception exit: channel_termination_timeout
      in function  rabbit_reader:wait_for_channel_termination/3 (src/rabbit_reader.erl, line 767)
      in call from rabbit_reader:send_error_on_channel0_and_close/4 (src/rabbit_reader.erl, line 1504)
      in call from rabbit_reader:terminate/2 (src/rabbit_reader.erl, line 612)
      in call from rabbit_reader:handle_other/2 (src/rabbit_reader.erl, line 537)
      in call from rabbit_reader:mainloop/4 (src/rabbit_reader.erl, line 499)
      in call from rabbit_reader:run/1 (src/rabbit_reader.erl, line 424)
      in call from rabbit_reader:start_connection/4 (src/rabbit_reader.erl, line 382)
    ancestors: [<0.22087.0>,<0.340.0>,<0.339.0>,<0.338.0>,rabbit_sup,
                  <0.80.0>]
    messages: [{tcp_closed,#Port<0.12469>},{'EXIT',#Port<0.12469>,normal}]
    links: []
    dictionary: [{{ch_pid,<0.22104.0>},{1,#Ref<0.0.28.1361>}},
                  {{channel,1},
                   {<0.22104.0>,{method,rabbit_framing_amqp_0_9_1}}},
                  {process_name,
                      {rabbit_reader,
                          <<"172.31.11.8:55576 -> 172.31.11.14:5672">>}}]
    trap_exit: true
    status: running
    heap_size: 987
    stack_size: 27
    reductions: 217442
  neighbours:

=SUPERVISOR REPORT==== 27-Mar-2017::19:32:17 ===
     Supervisor: {<0.22087.0>,rabbit_connection_sup}
     Context:    shutdown_error
     Reason:     channel_termination_timeout
     Offender:   [{pid,<0.22090.0>},
                  {name,reader},
                  {mfargs,
                      {rabbit_reader,start_link,
                          [<0.22089.0>,
                           {acceptor,{172,31,11,14},5672},
                           #Port<0.12469>]}},
                  {restart_type,intrinsic},
                  {shutdown,30000},
                  {child_type,worker}]


=CRASH REPORT==== 27-Mar-2017::19:32:17 ===
  crasher:
    initial call: rabbit_reader:init/4
    pid: <0.21972.0>
    registered_name: []
    exception exit: channel_termination_timeout
      in function  rabbit_reader:wait_for_channel_termination/3 (src/rabbit_reader.erl, line 767)
      in call from rabbit_reader:send_error_on_channel0_and_close/4 (src/rabbit_reader.erl, line 1504)
      in call from rabbit_reader:terminate/2 (src/rabbit_reader.erl, line 612)
      in call from rabbit_reader:handle_other/2 (src/rabbit_reader.erl, line 537)
      in call from rabbit_reader:mainloop/4 (src/rabbit_reader.erl, line 499)
      in call from rabbit_reader:run/1 (src/rabbit_reader.erl, line 424)
      in call from rabbit_reader:start_connection/4 (src/rabbit_reader.erl, line 382)
    ancestors: [<0.21970.0>,<0.340.0>,<0.339.0>,<0.338.0>,rabbit_sup,
                  <0.80.0>]
    messages: [{tcp_closed,#Port<0.12456>},{'EXIT',#Port<0.12456>,normal}]
    links: []
    dictionary: [{{channel,1},
                   {<0.21985.0>,{method,rabbit_framing_amqp_0_9_1}}},
                  {{ch_pid,<0.21985.0>},{1,#Ref<0.0.22.2311>}},
                  {process_name,
                      {rabbit_reader,
                          <<"172.31.11.8:55486 -> 172.31.11.14:5672">>}}]
    trap_exit: true
    status: running
    heap_size: 987
    stack_size: 27
    reductions: 220786
  neighbours:
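
When a log contains many reports like the two above, it can help to count how many rabbit_reader crashes with channel_termination_timeout came from each client. The following is only a minimal sketch in Python, not part of the original report: the function name summarize_timeouts and the regular expressions are hypothetical helpers, and they assume the report layout shown above (a "=CRASH REPORT====" header and a process_name entry of the form <<"client_ip:port -> server_ip:port">>).

```python
import re
from collections import Counter

# Matches the connection name that rabbit_reader keeps in its process
# dictionary, e.g. <<"172.31.11.8:55576 -> 172.31.11.14:5672">>.
# Assumption: IPv4 addresses, as in the reports above.
CONNECTION_RE = re.compile(r'<<"(?P<peer>[\d.]+):\d+ -> (?P<local>[\d.]+:\d+)">>')


def summarize_timeouts(log_text):
    """Count channel_termination_timeout crash reports per client IP."""
    counts = Counter()
    # Every report starts with a line such as "=CRASH REPORT==== <time> ===".
    # Splitting on the leading "=" leaves each chunk starting with its type.
    for report in re.split(r"^=(?=[A-Z ]+ REPORT====)", log_text, flags=re.M):
        if not report.startswith("CRASH REPORT"):
            continue  # skip supervisor reports and any text before the first report
        if "exception exit: channel_termination_timeout" not in report:
            continue  # a crash report for some other reason
        match = CONNECTION_RE.search(report)
        if match:
            counts[match.group("peer")] += 1
    return counts
```

The function takes the log contents as a string, so it can be fed whatever log file the broker writes on the affected node; the exact path depends on the deployment and is not stated in this report.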

Comment 2 Peter Lemenkov 2017-04-12 14:24:52 UTC
Strange. This is the same issue as the one described in bug 1431336. It should be fixed in upstream version 3.6.3.

Really dumb question: are you sure you have been using the 3.6.3 build all the time (no upgrade from 3.3.5, for example)?

Comment 17 Pablo Caruana 2018-01-02 10:49:18 UTC
Closed as insufficient data. Feel free to reopen this bug once all the debug data is in place.

Comment 21 Peter Lemenkov 2018-04-09 13:46:26 UTC
The remaining issue should be fixed in rabbitmq-server-3.6.3-9.el7ost.

Comment 34 errata-xmlrpc 2018-05-17 15:49:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1596

Comment 35 Peter Lemenkov 2018-05-28 14:00:07 UTC
*** Bug 1575885 has been marked as a duplicate of this bug. ***

