Description of problem:

While performing a minor update, the process timed out and pacemaker appears unable to determine addresses for the bundle connections:

Jul 19 18:01:58 overcloud-controller-2 crmd[313631]: warning: Input I_ELECTION_DC received in state S_INTEGRATION from do_election_check
Jul 19 18:01:58 overcloud-controller-2 corosync[313564]: [TOTEM ] A new membership (10.10.10.10:7732) was formed. Members left: 1
Jul 19 18:01:58 overcloud-controller-2 corosync[313564]: [QUORUM] Members[2]: 2 3
Jul 19 18:01:58 overcloud-controller-2 corosync[313564]: [MAIN ] Completed service synchronization, ready to provide service.
Jul 19 18:01:58 overcloud-controller-2 pacemakerd[313576]: notice: Node overcloud-controller-0 state is now lost
Jul 19 18:01:59 overcloud-controller-2 dnsmasq[180147]: read /var/lib/neutron/dhcp/90292b43-3cd9-4c98-b008-013208e4d9e4/addn_hosts - 4 addresses
Jul 19 18:01:59 overcloud-controller-2 dnsmasq-dhcp[180147]: read /var/lib/neutron/dhcp/90292b43-3cd9-4c98-b008-013208e4d9e4/host
Jul 19 18:01:59 overcloud-controller-2 dnsmasq-dhcp[180147]: read /var/lib/neutron/dhcp/90292b43-3cd9-4c98-b008-013208e4d9e4/opts
Jul 19 18:02:00 overcloud-controller-2 crmd[313631]: notice: Node galera-bundle-2 state is now member
Jul 19 18:02:00 overcloud-controller-2 crmd[313631]: notice: Node redis-bundle-0 state is now lost
Jul 19 18:02:00 overcloud-controller-2 crmd[313631]: warning: No reason to expect node redis-bundle-0 to be down
Jul 19 18:02:00 overcloud-controller-2 crmd[313631]: notice: Stonith/shutdown of redis-bundle-0 not matched
Jul 19 18:02:00 overcloud-controller-2 crmd[313631]: notice: Node rabbitmq-bundle-1 state is now lost
Jul 19 18:02:00 overcloud-controller-2 crmd[313631]: warning: No reason to expect node rabbitmq-bundle-1 to be down
Jul 19 18:02:00 overcloud-controller-2 crmd[313631]: notice: Stonith/shutdown of rabbitmq-bundle-1 not matched
Jul 19 18:02:00 overcloud-controller-2 crmd[313631]: notice: Node galera-bundle-0 state is now lost
Jul 19 18:02:00 overcloud-controller-2 crmd[313631]: warning: No reason to expect node galera-bundle-0 to be down
Jul 19 18:02:00 overcloud-controller-2 crmd[313631]: notice: Stonith/shutdown of galera-bundle-0 not matched
Jul 19 18:02:00 overcloud-controller-2 crmd[313631]: notice: Node redis-bundle-2 state is now member
Jul 19 18:02:00 overcloud-controller-2 crmd[313631]: notice: Node rabbitmq-bundle-0 state is now member
Jul 19 18:02:00 overcloud-controller-2 crmd[313631]: notice: Node overcloud-controller-0 state is now lost
Jul 19 18:02:00 overcloud-controller-2 crmd[313631]: warning: No reason to expect node 1 to be down
Jul 19 18:02:00 overcloud-controller-2 crmd[313631]: notice: Stonith/shutdown of overcloud-controller-0 not matched
Jul 19 18:02:00 overcloud-controller-2 pengine[313630]: warning: Blind faith: not fencing unseen nodes
Jul 19 18:02:00 overcloud-controller-2 cib[313626]: warning: A-Sync reply to crmd failed: No message of desired type
Jul 19 18:02:00 overcloud-controller-2 pengine[313630]: notice: * Start rabbitmq-bundle-1 ( overcloud-controller-2 ) due to unrunnable rabbitmq-bundle-docker-1 start (blocked)
Jul 19 18:02:00 overcloud-controller-2 pengine[313630]: notice: * Start rabbitmq:1 ( rabbitmq-bundle-1 ) due to unrunnable rabbitmq-bundle-docker-1 start (blocked)
Jul 19 18:02:00 overcloud-controller-2 pengine[313630]: notice: * Start rabbitmq-bundle-2 ( overcloud-controller-2 ) due to unrunnable rabbitmq-bundle-docker-2 start (blocked)
Jul 19 18:02:00 overcloud-controller-2 pengine[313630]: notice: * Start rabbitmq:2 ( rabbitmq-bundle-2 ) due to unrunnable rabbitmq-bundle-docker-2 start (blocked)
Jul 19 18:02:00 overcloud-controller-2 pengine[313630]: notice: * Start galera-bundle-0 ( overcloud-controller-1 ) due to unrunnable galera-bundle-docker-0 start (blocked)
Jul 19 18:02:00 overcloud-controller-2 pengine[313630]: notice: * Start galera:0 ( galera-bundle-0 ) due to unrunnable galera-bundle-docker-0 start (blocked)
Jul 19 18:02:00 overcloud-controller-2 pengine[313630]: notice: * Start redis-bundle-0 ( overcloud-controller-2 ) due to unrunnable redis-bundle-docker-0 start (blocked)
Jul 19 18:02:00 overcloud-controller-2 pengine[313630]: notice: * Start redis:0 ( redis-bundle-0 ) due to unrunnable redis-bundle-docker-0 start (blocked)
Jul 19 18:02:00 overcloud-controller-2 pengine[313630]: error: Could not determine address for bundle connection rabbitmq-bundle-1
Jul 19 18:02:00 overcloud-controller-2 pengine[313630]: error: Could not determine address for bundle connection rabbitmq-bundle-2
Jul 19 18:02:00 overcloud-controller-2 pengine[313630]: error: Could not determine address for bundle connection galera-bundle-0
Jul 19 18:02:00 overcloud-controller-2 pengine[313630]: error: Could not determine address for bundle connection redis-bundle-0
Jul 19 18:02:00 overcloud-controller-2 pengine[313630]: notice: Calculated transition 0, saving inputs in /var/lib/pacemaker/pengine/pe-input-2328.bz2

Version-Release number of selected component (if applicable):

pacemaker-libs-1.1.19-8.el7_6.2.x86_64

How reproducible:

This environment

Steps to Reproduce:
1. Minor update timed out.
2. Bundles fail to start due to "Could not determine address for bundle connection".

Actual results:
Minor update failure

Expected results:
No failures.

Additional info:
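The transition that produced these errors was saved in pe-input-2328.bz2 (last pengine line above) and can be replayed offline with crm_simulate; a minimal sketch, assuming the file is still present on overcloud-controller-2:

  # Replay the saved policy-engine input to reproduce the scheduler's decisions
  # -x reads the saved transition input, -S runs the simulation
  crm_simulate -x /var/lib/pacemaker/pengine/pe-input-2328.bz2 -S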
The resource was banned, and running "pcs resource clear rabbitmq-bundle" solved this. The issue we have now is that this rabbitmq won't join the cluster, and I'm wondering at this stage whether simply re-running the minor update procedure would solve it.
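For reference, a sketch of how the ban can be confirmed and cleared, plus one way to check whether rabbitmq rejoined its cluster (the docker name filter below is an assumption about this environment's container naming):

  # Leftover cli-ban/cli-prefer location constraints from a move/ban show up here
  pcs constraint --full
  # Remove those constraints so the bundle is allowed to start again
  pcs resource clear rabbitmq-bundle
  # Check cluster membership from inside the rabbitmq bundle container
  docker exec $(docker ps -q -f name=rabbitmq-bundle) rabbitmqctl cluster_status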
Will the update run with pcs not in a healthy state, though?
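Before re-running the update, it seems worth confirming the cluster is back to a clean state; a minimal check with stock tooling:

  # Overall cluster health; look for stopped bundles or failed actions
  pcs status
  # Clear stale failures and operation history so the next transition starts clean
  pcs resource cleanup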