Bug 1533406
Summary: | [UPDATES] RMQ fails to start after minor update and reboot | |||
---|---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Yurii Prokulevych <yprokule> | |
Component: | puppet-tripleo | Assignee: | John Eckersberg <jeckersb> | |
Status: | CLOSED CURRENTRELEASE | QA Contact: | pkomarov | |
Severity: | urgent | Docs Contact: | ||
Priority: | urgent | |||
Version: | 11.0 (Ocata) | CC: | apevec, augol, chjones, jeckersb, jjoyce, jschluet, lhh, mbultel, pkomarov, slinaber, srevivo, tvignaud | |
Target Milestone: | zstream | Keywords: | TestOnly, Triaged, ZStream | |
Target Release: | 11.0 (Ocata) | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | puppet-tripleo-6.5.10-2.el7ost | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 1555317 1557513 | Environment: | |
Last Closed: | 2018-06-08 12:21:36 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | 1536064 | |||
Bug Blocks: | 1555317, 1557513, 1557519, 1557522, 1647474, 1647587, 1647593, 1654041, 1654042 |
Description
Yurii Prokulevych
2018-01-11 09:43:23 UTC
This is probably the same as bug 1522896, except 1522896 is filed against OSP12.

OK, this looks related to bug 1461190. Erlang can't reach epmd, because epmd is listening on IPv6, but the Erlang resolver returns IPv4 addresses:

[root@controller-1 rabbitmq]# erl -sname foo -proto_dist inet6_tcp
Erlang/OTP 18 [erts-7.3.1.3] [source] [64-bit] [smp:4:4] [async-threads:10] [hipe] [kernel-poll:false]

Eshell V7.3.1.3 (abort with ^G)
(foo@controller-1)1> inet:gethostbyname("controller-1").
{ok,{hostent,"controller-1",[],inet,4,
     [{192,168,24,6},{192,168,24,13},{172,17,2,11}]}}

The reason seems to be that OSP11 is missing https://github.com/voxpupuli/puppet-rabbitmq/pull/552. I've verified that the patch is included in OSP12 as per https://bugzilla.redhat.com/show_bug.cgi?id=1484547#c9, so this is *not* the same as bug 1522896.

This also requires the updated erlang from bug 1536064.

This has been fixed in rdo stable/ocata for a while now: https://trunk.rdoproject.org/centos7-ocata/current/puppet-rabbitmq-5.6.1-0.20180115161315.5ac45de.el7.centos.noarch.rpm

Pending stable/ocata import - https://trello.com/c/YWYdFmLe/703-osp11-import-rdo-ocata-promotion-2018-03-08

According to our records, this should be resolved by puppet-tripleo-6.5.10-3.el7ost. This build is available now.
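The address-family mismatch described above (epmd bound to IPv6 while the resolver hands back only IPv4 addresses) can be illustrated with a small Python sketch. This is only a rough model, not how Erlang or epmd decide reachability: the function name `shared_ip_versions` and the IPv6-only listen address `::1` are assumptions made for illustration, while the resolved addresses are the ones from the `inet:gethostbyname` output.

```python
import ipaddress


def shared_ip_versions(listen_addrs, resolved_addrs):
    """Return the IP versions (4 and/or 6) that appear both among the
    addresses a daemon listens on and the addresses a resolver returned.

    An empty result means a client using the resolved addresses has no
    address family in common with the listener.
    """
    listening = {ipaddress.ip_address(a).version for a in listen_addrs}
    resolved = {ipaddress.ip_address(a).version for a in resolved_addrs}
    return listening & resolved


# Addresses returned by inet:gethostbyname("controller-1") in the report.
resolved = ["192.168.24.6", "192.168.24.13", "172.17.2.11"]

# Hypothetical IPv6-only listener, standing in for the broken state.
broken = shared_ip_versions(["::1"], resolved)

# Listener addresses seen on controller-1 after the fix.
fixed = shared_ip_versions(["172.17.1.22", "127.0.0.1", "::1"], resolved)

print("broken:", sorted(broken))
print("fixed:", sorted(fixed))
```

The check only compares address families; it says nothing about whether a specific resolved address is actually routable to the listening socket.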
Verified. Tested on minor update: OSP11 2018-01-04.2 -> OSP11 2018-05-23.1

Package version:

[stack@undercloud-0 ~]$ ansible overcloud -b -mshell -a'rpm -qa |grep puppet-tripleo'
compute-0 | SUCCESS | rc=0 >>
puppet-tripleo-6.5.10-3.el7ost.noarch
compute-1 | SUCCESS | rc=0 >>
puppet-tripleo-6.5.10-3.el7ost.noarch
controller-1 | SUCCESS | rc=0 >>
puppet-tripleo-6.5.10-3.el7ost.noarch
controller-0 | SUCCESS | rc=0 >>
puppet-tripleo-6.5.10-3.el7ost.noarch
controller-2 | SUCCESS | rc=0 >>
puppet-tripleo-6.5.10-3.el7ost.noarch

Reproducing the reboot process: for each node running pacemaker, do pcs cluster stop; reboot; pcs cluster start.

Checking cluster status after the disruption: all OK:

Cluster name: tripleo_cluster
Stack: corosync
Current DC: controller-1 (version 1.1.18-11.el7_5.2-2b07d5c5a9) - partition with quorum
Last updated: Sun Jun 3 10:01:28 2018
Last change: Sun Jun 3 09:57:53 2018 by hacluster via crmd on controller-2

3 nodes configured
22 resources configured

Online: [ controller-0 controller-1 controller-2 ]

Full list of resources:

 Master/Slave Set: galera-master [galera]
     Masters: [ controller-0 controller-1 controller-2 ]
 Clone Set: rabbitmq-clone [rabbitmq]
     Started: [ controller-0 controller-1 controller-2 ]
 Master/Slave Set: redis-master [redis]
     Masters: [ controller-0 ]
     Slaves: [ controller-1 controller-2 ]
 ip-192.168.24.14 (ocf::heartbeat:IPaddr2): Started controller-0
 ip-10.0.0.101 (ocf::heartbeat:IPaddr2): Started controller-1
 ip-172.17.1.16 (ocf::heartbeat:IPaddr2): Started controller-0
 ip-172.17.1.10 (ocf::heartbeat:IPaddr2): Started controller-0
 ip-172.17.3.10 (ocf::heartbeat:IPaddr2): Started controller-1
 ip-172.17.4.18 (ocf::heartbeat:IPaddr2): Started controller-1
 Clone Set: haproxy-clone [haproxy]
     Started: [ controller-0 controller-1 controller-2 ]
 openstack-cinder-volume (systemd:openstack-cinder-volume): Started controller-0
 stonith-fence_ipmilan-52540050eb11 (stonith:fence_ipmilan): Started controller-0
 stonith-fence_ipmilan-525400cbbc07 (stonith:fence_ipmilan): Started controller-1
 stonith-fence_ipmilan-525400f5c568 (stonith:fence_ipmilan): Started controller-1

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

Checking that epmd is listening on IPv4:

[stack@undercloud-0 ~]$ ansible controller -b -mshell -a'ss -anlp | grep 4369'
[WARNING]: Found both group and host with same name: undercloud
controller-2 | SUCCESS | rc=0 >>
tcp LISTEN 0 128 172.17.1.17:4369 *:* users:(("epmd",pid=6925,fd=5))
tcp LISTEN 0 128 127.0.0.1:4369 *:* users:(("epmd",pid=6925,fd=3))
tcp LISTEN 0 128 ::1:4369 :::* users:(("epmd",pid=6925,fd=4))
controller-0 | SUCCESS | rc=0 >>
nl UNCONN 0 0 16:-4369 0:0
tcp LISTEN 0 128 172.17.1.13:4369 *:* users:(("epmd",pid=4431,fd=5))
tcp LISTEN 0 128 127.0.0.1:4369 *:* users:(("epmd",pid=4431,fd=3))
tcp LISTEN 0 128 ::1:4369 :::* users:(("epmd",pid=4431,fd=4))
controller-1 | SUCCESS | rc=0 >>
nl UNCONN 0 0 16:-4369 0:0
tcp LISTEN 0 128 172.17.1.22:4369 *:* users:(("epmd",pid=4426,fd=5))
tcp LISTEN 0 128 127.0.0.1:4369 *:* users:(("epmd",pid=4426,fd=3))
tcp LISTEN 0 128 ::1:4369 :::* users:(("epmd",pid=4426,fd=4))

Checking rabbitmq startup logs: all OK:

[stack@undercloud-0 ~]$ ansible controller -b -mshell -a'cat /var/log/rabbitmq/startup_log'
[WARNING]: Found both group and host with same name: undercloud
controller-0 | SUCCESS | rc=0 >>
              RabbitMQ 3.6.5. Copyright (C) 2007-2016 Pivotal Software, Inc.
  ##  ##      Licensed under the MPL. See http://www.rabbitmq.com/
  ##  ##
  ##########  Logs: /var/log/rabbitmq/rabbit
  ######  ##        /var/log/rabbitmq/rabbit
  ##########
              Starting broker... completed with 0 plugins.
controller-1 | SUCCESS | rc=0 >>
              RabbitMQ 3.6.5. Copyright (C) 2007-2016 Pivotal Software, Inc.
  ##  ##      Licensed under the MPL. See http://www.rabbitmq.com/
  ##  ##
  ##########  Logs: /var/log/rabbitmq/rabbit
  ######  ##        /var/log/rabbitmq/rabbit
  ##########
              Starting broker... completed with 0 plugins.
              RabbitMQ 3.6.5. Copyright (C) 2007-2016 Pivotal Software, Inc.
  ##  ##      Licensed under the MPL. See http://www.rabbitmq.com/
  ##  ##
  ##########  Logs: /var/log/rabbitmq/rabbit
  ######  ##        /var/log/rabbitmq/rabbit
  ##########
              Starting broker... completed with 0 plugins.
controller-2 | SUCCESS | rc=0 >>
              RabbitMQ 3.6.5. Copyright (C) 2007-2016 Pivotal Software, Inc.
  ##  ##      Licensed under the MPL. See http://www.rabbitmq.com/
  ##  ##
  ##########  Logs: /var/log/rabbitmq/rabbit
  ######  ##        /var/log/rabbitmq/rabbit
  ##########
              Starting broker... completed with 0 plugins.
              RabbitMQ 3.6.5. Copyright (C) 2007-2016 Pivotal Software, Inc.
  ##  ##      Licensed under the MPL. See http://www.rabbitmq.com/
  ##  ##
  ##########  Logs: /var/log/rabbitmq/rabbit
  ######  ##        /var/log/rabbitmq/rabbit
  ##########
              Starting broker... completed with 0 plugins.

OSP11 EOL'd with a newer version of this package.
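The `ss -anlp | grep 4369` check from the verification comment could be automated along the following lines. This is a minimal sketch that assumes the column layout shown in that output (netid, state, two queue counters, local address, peer address, process info); the helper names `epmd_listen_addresses` and `listens_on_ipv4` are hypothetical, and the sample text is the controller-0 output quoted in the comment.

```python
import ipaddress

# controller-0 output of `ss -anlp | grep 4369`, as quoted in the report.
SAMPLE = """nl UNCONN 0 0 16:-4369 0:0
tcp LISTEN 0 128 172.17.1.13:4369 *:* users:(("epmd",pid=4431,fd=5))
tcp LISTEN 0 128 127.0.0.1:4369 *:* users:(("epmd",pid=4431,fd=3))
tcp LISTEN 0 128 ::1:4369 :::* users:(("epmd",pid=4431,fd=4))
"""


def epmd_listen_addresses(ss_output, port=4369):
    """Collect the addresses epmd listens on for `port` from ss output."""
    addresses = []
    for line in ss_output.splitlines():
        fields = line.split()
        # Keep only TCP listening sockets owned by epmd.
        if len(fields) < 6 or fields[0] != "tcp" or fields[1] != "LISTEN":
            continue
        if '"epmd"' not in line:
            continue
        # The local address is "host:port"; split on the last colon so
        # IPv6 hosts such as "::1" stay intact.
        host, _, local_port = fields[4].rpartition(":")
        if local_port != str(port):
            continue
        try:
            addresses.append(ipaddress.ip_address(host.strip("[]")))
        except ValueError:
            # Assumption: wildcard binds such as "*" are skipped here
            # rather than being classified as IPv4 or IPv6.
            continue
    return addresses


def listens_on_ipv4(ss_output, port=4369):
    """True if at least one epmd listener on `port` is an IPv4 address."""
    return any(a.version == 4 for a in epmd_listen_addresses(ss_output, port))


print([str(a) for a in epmd_listen_addresses(SAMPLE)])
print(listens_on_ipv4(SAMPLE))
```

Feeding each controller's output through `listens_on_ipv4` gives a yes/no answer per node instead of requiring the listener list to be read by eye.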