Bug 1557519 - [OSP9 UPDATES] RMQ fails to start after minor update and reboot
Summary: [OSP9 UPDATES] RMQ fails to start after minor update and reboot
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-puppet-modules
Version: 9.0 (Mitaka)
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: zstream
Target Release: 9.0 (Mitaka)
Assignee: John Eckersberg
QA Contact: pkomarov
URL:
Whiteboard:
Duplicates: 1552043
Depends On: 1533406 1536064 1557513
Blocks: 1647474 1647587 1647593 1654041 1654042
Reported: 2018-03-16 18:51 UTC by John Eckersberg
Modified: 2018-11-27 21:53 UTC (History)

Fixed In Version: openstack-puppet-modules-8.1.13-8.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1557513
Clones: 1647474 1647593
Environment:
Last Closed: 2018-11-27 21:20:29 UTC
Target Upstream Version:
Embargoed:


Links
System                  ID              Private  Priority  Status  Summary  Last Updated
OpenStack gerrit        533738          0        None      None    None     2018-03-16 18:51:23 UTC
RDO                     11322           0        None      None    None     2018-03-16 18:51:23 UTC
Red Hat Product Errata  RHBA-2018:3692  0        None      None    None     2018-11-27 21:20:53 UTC

Comment 1 John Eckersberg 2018-03-16 18:53:47 UTC
For OSP9, opm needs fixes for both puppet-rabbitmq and puppet-tripleo in this same bug (10 and 11 have separate bugs per component).

Comment 2 John Eckersberg 2018-03-19 15:04:22 UTC
*** Bug 1552043 has been marked as a duplicate of this bug. ***

Comment 3 John Eckersberg 2018-03-20 14:37:50 UTC
(In reply to John Eckersberg from comment #1)
> For OSP9, opm needs fixes for both puppet-rabbitmq and puppet-tripleo in
> this same bug (10 and 11 have separate bugs per component)

Correction: in OSP9 this logic lives directly in tht (tripleo-heat-templates); there is no rabbitmq logic in puppet-tripleo.

Comment 4 John Eckersberg 2018-03-20 18:55:40 UTC
This likely only happens on pure-IPv6 deployments.  I have an IPv6-enabled OSP9 deployment but it also has IPv4 addresses configured.  

OSP9 does not contain the fixes for https://bugs.launchpad.net/tripleo/+bug/1645898, so epmd listens on all interfaces. The Erlang resolver returns an IPv4 address for epmd connections, and because epmd is also listening on IPv4, everything works, somewhat by accident.
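To see whether a controller is in the "listening on all interfaces" state described above, one can check where epmd is bound. The following is only a minimal sketch. It assumes epmd is on its default port, 4369, and that the output of `ss -tln` is piped in on stdin. The function name `epmd_bind_scope` is hypothetical and not part of any tool mentioned in this bug.

```shell
# Hypothetical helper: classify epmd listeners from `ss -tln` output on stdin.
# Assumption: epmd uses its default port, 4369.
epmd_bind_scope() {
    awk '
        # With `ss -tln`, column 4 is "Local Address:Port".
        $1 == "LISTEN" && $4 ~ /:4369$/ {
            addr = $4
            sub(/:4369$/, "", addr)
            # Wildcard spellings vary between iproute2 versions.
            if (addr == "*" || addr == "0.0.0.0" || addr == "::" || addr == "[::]")
                wildcard = 1
            else
                specific = 1
        }
        END {
            if (wildcard)      print "all-interfaces"
            else if (specific) print "specific-address"
            else               print "not-listening"
        }
    '
}
```

On a controller this would be run as `ss -tln | epmd_bind_scope`. A result of `all-interfaces` on a dual-stack node would be consistent with the accidental IPv4 fallback described in this comment.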

Comment 12 pkomarov 2018-11-19 16:38:19 UTC
Verified:

[stack@undercloud-0 ~]$ rhos-release -L
Installed repositories (rhel-7.6):
  9-director
  9
  ceph-1.3
  ceph-osd-1.3
  rhel-7.6
[stack@undercloud-0 ~]$ openstack stack list
+--------------------------------------+------------+-----------------+---------------------+---------------------+
| ID                                   | Stack Name | Stack Status    | Creation Time       | Updated Time        |
+--------------------------------------+------------+-----------------+---------------------+---------------------+
| 6b8726c8-63ab-4675-b0a4-da0808931fb4 | overcloud  | UPDATE_COMPLETE | 2018-11-19T13:18:34 | 2018-11-19T14:27:39 |
+--------------------------------------+------------+-----------------+---------------------+---------------------+

[stack@undercloud-0 ~]$ cat core_puddle_version
2018-11-15.1
[stack@undercloud-0 ~]$ rpm -qa|grep openstack-tripleo-heat-templates
openstack-tripleo-heat-templates-liberty-2.0.0-69.el7ost.noarch
openstack-tripleo-heat-templates-2.0.0-69.el7ost.noarch

[stack@undercloud-0 ~]$ rpm -qa|grep openstack-puppet-modules
openstack-puppet-modules-8.1.13-8.el7ost.noarch

[stack@undercloud-0 ~]$ ansible controller -mshell -b -a'pcs cluster stop --request-timeout=300'
controller-0 | SUCCESS | rc=0 >>
Stopping Cluster (pacemaker)...
Stopping Cluster (corosync)...

controller-1 | SUCCESS | rc=0 >>
Stopping Cluster (pacemaker)...
Stopping Cluster (corosync)...

controller-2 | SUCCESS | rc=0 >>
Stopping Cluster (pacemaker)...
Stopping Cluster (corosync)...

[stack@undercloud-0 ~]$ ansible controller -mshell -b -a'reboot'


[stack@undercloud-0 ~]$ ansible controller -mshell -b -a'pcs status|grep -A1 rabbitmq-clone'
 [WARNING]: Found both group and host with same name: undercloud

controller-0 | SUCCESS | rc=0 >>
 Clone Set: rabbitmq-clone [rabbitmq]
     Started: [ controller-0 controller-1 controller-2 ]

controller-2 | SUCCESS | rc=0 >>
 Clone Set: rabbitmq-clone [rabbitmq]
     Started: [ controller-0 controller-1 controller-2 ]

controller-1 | SUCCESS | rc=0 >>
 Clone Set: rabbitmq-clone [rabbitmq]
     Started: [ controller-0 controller-1 controller-2 ]
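The visual check above could also be scripted. This is a rough sketch, not part of the original verification. It assumes the `pcs status` text is supplied on stdin and that the `Started:` line directly follows the `rabbitmq-clone` header, as in the output shown. The function name `rabbitmq_clone_started_on` is hypothetical.

```shell
# Hypothetical helper: succeed only if every node named in "$@" appears in the
# "Started: [ ... ]" line that directly follows the rabbitmq-clone header.
rabbitmq_clone_started_on() {
    started=$(awk '
        /Clone Set: rabbitmq-clone/ { in_clone = 1; next }
        in_clone && /Started:/ {
            sub(/.*\[/, "")    # drop everything up to the opening bracket
            sub(/\].*/, "")    # drop the closing bracket and anything after
            print
            exit
        }
        in_clone { in_clone = 0 }   # header not followed by a Started line
    ')
    for node in "$@"; do
        case " $started " in
            *" $node "*) ;;
            *) echo "rabbitmq not started on $node" >&2; return 1 ;;
        esac
    done
}
```

For example, `pcs status | rabbitmq_clone_started_on controller-0 controller-1 controller-2` returns success only when all three nodes are listed as started.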



[stack@undercloud-0 ~]$ ansible controller -mshell -b -a' rabbitmqctl eval "rabbit_mnesia:cluster_status_from_mnesia()."'
 [WARNING]: Found both group and host with same name: undercloud

controller-0 | SUCCESS | rc=0 >>
{ok,{['rabbit@controller-1','rabbit@controller-2','rabbit@controller-0'],
     ['rabbit@controller-0','rabbit@controller-1','rabbit@controller-2'],
     ['rabbit@controller-1','rabbit@controller-2','rabbit@controller-0']}}

controller-2 | SUCCESS | rc=0 >>
{ok,{['rabbit@controller-1','rabbit@controller-2','rabbit@controller-0'],
     ['rabbit@controller-0','rabbit@controller-1','rabbit@controller-2'],
     ['rabbit@controller-1','rabbit@controller-0','rabbit@controller-2']}}

controller-1 | SUCCESS | rc=0 >>
{ok,{['rabbit@controller-1','rabbit@controller-2','rabbit@controller-0'],
     ['rabbit@controller-0','rabbit@controller-1','rabbit@controller-2'],
     ['rabbit@controller-0','rabbit@controller-2','rabbit@controller-1']}}
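The node lists above differ between controllers only in ordering. A small sketch for comparing them while ignoring order follows. It assumes the `rabbitmqctl eval` output is piped in on stdin, and the function name `normalize_node_lists` is hypothetical. Only bracketed lists that begin with a quoted atom are matched, so lines such as `[WARNING]` are skipped.

```shell
# Hypothetical helper: print each Erlang node list found on stdin as a sorted,
# comma-separated line, so lists that differ only in order compare equal.
normalize_node_lists() {
    # Match lists of the form ['a','b',...]; skips "[WARNING]" and similar.
    grep -o "\['[^]]*]" | while IFS= read -r list; do
        printf '%s\n' "$list" \
            | sed "s/[][' ]//g" \
            | tr ',' '\n' \
            | sort \
            | paste -sd, -
    done
}
```

Piping the combined output through `normalize_node_lists | sort -u` should then yield a single line when every list on every controller names the same set of nodes.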

Comment 13 pkomarov 2018-11-19 17:10:02 UTC
More checks are needed.

Comment 14 Joanne O'Flynn 2018-11-22 11:26:40 UTC
This bug is marked for inclusion in the errata but does not currently contain draft documentation text. To ensure the timely release of this advisory, please provide draft documentation text for this bug as soon as possible.

If you do not think this bug requires errata documentation, set the requires_doc_text flag to "-".


To add draft documentation text:

1. Select the documentation type from the Doc Type drop-down field.

2. A template is provided in the Doc Text field based on the Doc Type value selected. Enter the draft text in the Doc Text field.

Comment 16 errata-xmlrpc 2018-11-27 21:20:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3692

