Bug 1715315 - Rabbitmq broker crashed every couple of minutes in OSP14
Summary: Rabbitmq broker crashed every couple of minutes in OSP14
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: rabbitmq-server
Version: 14.0 (Rocky)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: z4
Target Release: 14.0 (Rocky)
Assignee: Peter Lemenkov
QA Contact: pkomarov
URL:
Whiteboard:
Depends On: 1699993
Blocks:
 
Reported: 2019-05-30 06:20 UTC by Chen
Modified: 2023-03-24 14:56 UTC (History)

Fixed In Version: rabbitmq-server-3.6.16-4.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-11-06 16:53:25 UTC
Target Upstream Version:
Embargoed:


Attachments
erl_crash.dump file (1.90 MB, text/plain)
2019-05-30 06:20 UTC, Chen


Links
System                  ID              Private  Priority  Status  Summary  Last Updated
Red Hat Issue Tracker   OSP-23666       0        None      None    None     2023-03-24 14:56:41 UTC
Red Hat Product Errata  RHBA-2019:3747  0        None      None    None     2019-11-06 16:53:41 UTC

Description Chen 2019-05-30 06:20:51 UTC
Created attachment 1575081
erl_crash.dump file

Description of problem:

The RabbitMQ broker crashed every couple of minutes in OSP14.

Version-Release number of selected component (if applicable):

OSP14
14.0-136:pcmklatest

How reproducible:

100% at the customer site, on all 3 controller nodes

Steps to Reproduce:
1.
2.
3.

Actual results:

The broker crashed and an erl_crash.dump file was generated.

Expected results:


Additional info:

Comment 1 Chen 2019-06-03 10:40:47 UTC
Hi,

Is there any update on this continuous crash issue?

Best Regards,
Chen

Comment 2 Peter Lemenkov 2019-06-04 07:43:22 UTC
Hello,
We have several other issues which are quite similar. So far our best shot has been to increase the ARP cache size (see bug 1653242 comment 15 for an example). We are still investigating this. Meanwhile, could you please ask the customer to increase the ARP cache size so that we can rule that cause out?
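
Editorial note: on Linux, the ARP (neighbor) cache limits are controlled by the `net.ipv4.neigh.default.gc_thresh1`, `gc_thresh2`, and `gc_thresh3` sysctls, with `gc_thresh3` being the hard upper bound on entries. The following is a minimal sketch of how such an increase could be prepared. The helper name `arp_cache_conf`, the drop-in file name, and the numeric values are all illustrative assumptions; they are not the values recommended in bug 1653242, which are not quoted in this report.

```shell
# Print a sysctl.d fragment that sets the three neighbor-table thresholds.
# Arguments: gc_thresh1 gc_thresh2 gc_thresh3 (placeholder values below).
arp_cache_conf() {
    printf 'net.ipv4.neigh.default.gc_thresh1 = %s\n' "$1"
    printf 'net.ipv4.neigh.default.gc_thresh2 = %s\n' "$2"
    printf 'net.ipv4.neigh.default.gc_thresh3 = %s\n' "$3"
}

# Illustrative values only; choose real ones from the referenced bug.
arp_cache_conf 1024 2048 4096

# To inspect the current values on a node:
#   sysctl net.ipv4.neigh.default.gc_thresh1 \
#          net.ipv4.neigh.default.gc_thresh2 \
#          net.ipv4.neigh.default.gc_thresh3
# To persist the change (as root; the file name is hypothetical):
#   arp_cache_conf 1024 2048 4096 > /etc/sysctl.d/90-arp-cache.conf
#   sysctl --system
```

Writing the settings to a drop-in under /etc/sysctl.d keeps them across reboots, whereas `sysctl -w` alone changes only the running kernel.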

Comment 3 Chen 2019-06-04 07:52:16 UTC
Hi Peter,

Thank you very much for your reply!

Sure, fully understood. I will ask the customer to try increasing the ARP cache size.

Best Regards,
Chen

Comment 5 Peter Lemenkov 2019-06-14 14:49:53 UTC
See bug 1699993 for the same issue.

Comment 9 Chen 2019-07-31 03:51:59 UTC
Hi Peter,

Thank you very much for your response.

Best Regards,
Chen

Comment 14 pkomarov 2019-10-23 21:55:19 UTC
Verified.

[stack@undercloud-0 ~]$ rhos-release -L
Installed repositories (rhel-7.7):
  14
  ceph-3
  ceph-osd-3
  rhel-7.7
[stack@undercloud-0 ~]$ cat core_puddle_version 
2019-10-21.1

[root@controller-0 ~]# grep 'Oct 23 14:01:33' -A 99999 /var/log/cluster/corosync.log|grep 'ocf::heartbeat:rabbitmq-cluster'|grep rabbitmq-bundle|grep -v Started||echo 'no rabbit problems were found'
no rabbit problems were found

[root@controller-0 ~]# tail -n 1 /var/log/cluster/corosync.log
Oct 23 21:51:54 [26538] controller-0        cib:     info: cib_process_request:	Completed cib_modify operation for section nodes: OK (rc=0, origin=controller-2/crm_attribute/4, version=0.84.4)

[root@controller-0 ~]# docker exec -it  rabbitmq-bundle-docker-0 bash
()[root@controller-0 /]# rpm -q rabbitmq-server
rabbitmq-server-3.6.16-4.el7ost.noarch
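
Editorial note: the corosync.log check above can be wrapped in a small reusable function. This is a sketch built from the same pipeline; the function name `check_rabbit_log` and its argument order are assumptions made here for illustration. It prints any `rabbitmq-cluster` resource lines for `rabbitmq-bundle` logged from the given timestamp onward that do not contain "Started", and prints the notice when none remain.

```shell
# check_rabbit_log LOGFILE 'TIMESTAMP'
# Relies on the pipeline's exit status being that of the last grep:
# "grep -v" exits non-zero when it outputs nothing, which triggers the echo.
check_rabbit_log() {
    logfile=$1
    since=$2
    grep -A 99999 -- "$since" "$logfile" \
        | grep 'ocf::heartbeat:rabbitmq-cluster' \
        | grep 'rabbitmq-bundle' \
        | grep -v 'Started' \
        || echo 'no rabbit problems were found'
}

# Usage, with the path and timestamp from the verification above:
#   check_rabbit_log /var/log/cluster/corosync.log 'Oct 23 14:01:33'
```

One caveat of this approach: if the timestamp does not occur in the log at all, the first grep outputs nothing and the notice is printed as well, so it is worth confirming that the starting timestamp actually exists in the file.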

Comment 16 errata-xmlrpc 2019-11-06 16:53:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:3747

