Description of problem:
Minor update of the overcloud to osp11 + rhel7.4 testing repos + local mirrors got stuck.

From controller /var/log/messages:
Jun 23 16:30:24 controller-0 snmpd[72072]: Connection from UDP: [192.168.24.1]:42353->[192.168.24.10]:161
Jun 23 16:30:53 controller-0 proxy-server: STDERR: ERROR:oslo.messaging._drivers.impl_rabbit:[a59afc1f-2efd-4969-a268-a28450ccbb1c] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 32 seconds. Client port: None
Jun 23 16:31:25 controller-0 proxy-server: STDERR: ERROR:oslo.messaging._drivers.impl_rabbit:[a59afc1f-2efd-4969-a268-a28450ccbb1c] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 32 seconds. Client port: None
Jun 23 16:31:57 controller-0 proxy-server: STDERR: ERROR:oslo.messaging._drivers.impl_rabbit:[a59afc1f-2efd-4969-a268-a28450ccbb1c] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 32 seconds. Client port: None
Jun 23 16:32:29 controller-0 proxy-server: STDERR: ERROR:oslo.messaging._drivers.impl_rabbit:[a59afc1f-2efd-4969-a268-a28450ccbb1c] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 32 seconds. Client port: None
Jun 23 16:33:01 controller-0 proxy-server: STDERR: ERROR:oslo.messaging._drivers.impl_rabbit:[a59afc1f-2efd-4969-a268-a28450ccbb1c] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 32 seconds. Client port: None
Jun 23 16:33:33 controller-0 proxy-server: STDERR: ERROR:oslo.messaging._drivers.impl_rabbit:[a59afc1f-2efd-4969-a268-a28450ccbb1c] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 32 seconds. Client port: None
Jun 23 16:34:05 controller-0 proxy-server: STDERR: ERROR:oslo.messaging._drivers.impl_rabbit:[a59afc1f-2efd-4969-a268-a28450ccbb1c] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 32 seconds. Client port: None
Jun 23 16:34:37 controller-0 proxy-server: STDERR: ERROR:oslo.messaging._drivers.impl_rabbit:[a59afc1f-2efd-4969-a268-a28450ccbb1c] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 32 seconds. Client port: None
Jun 23 16:35:09 controller-0 proxy-server: STDERR: ERROR:oslo.messaging._drivers.impl_rabbit:[a59afc1f-2efd-4969-a268-a28450ccbb1c] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 32 seconds. Client port: None
Jun 23 16:35:41 controller-0 proxy-server: STDERR: ERROR:oslo.messaging._drivers.impl_rabbit:[a59afc1f-2efd-4969-a268-a28450ccbb1c] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 32 seconds. Client port: None
Jun 23 16:36:13 controller-0 proxy-server: STDERR: ERROR:oslo.messaging._drivers.impl_rabbit:[a59afc1f-2efd-4969-a268-a28450ccbb1c] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 32 seconds. Client port: None
Jun 23 16:36:45 controller-0 proxy-server: STDERR: ERROR:oslo.messaging._drivers.impl_rabbit:[a59afc1f-2efd-4969-a268-a28450ccbb1c] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 32 seconds. Client port: None
Jun 23 16:37:17 controller-0 proxy-server: STDERR: ERROR:oslo.messaging._drivers.impl_rabbit:[a59afc1f-2efd-4969-a268-a28450ccbb1c] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 32 seconds. Client port: None
Jun 23 16:37:49 controller-0 proxy-server: STDERR: ERROR:oslo.messaging._drivers.impl_rabbit:[a59afc1f-2efd-4969-a268-a28450ccbb1c] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 32 seconds. Client port: None
Jun 23 16:38:21 controller-0 proxy-server: STDERR: ERROR:oslo.messaging._drivers.impl_rabbit:[a59afc1f-2efd-4969-a268-a28450ccbb1c] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 32 seconds. Client port: None
Jun 23 16:38:53 controller-0 proxy-server: STDERR: ERROR:oslo.messaging._drivers.impl_rabbit:[a59afc1f-2efd-4969-a268-a28450ccbb1c] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 32 seconds. Client port: None
Jun 23 16:39:25 controller-0 proxy-server: STDERR: ERROR:oslo.messaging._drivers.impl_rabbit:[a59afc1f-2efd-4969-a268-a28450ccbb1c] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 32 seconds. Client port: None
Jun 23 16:39:57 controller-0 proxy-server: STDERR: ERROR:oslo.messaging._drivers.impl_rabbit:[a59afc1f-2efd-4969-a268-a28450ccbb1c] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 32 seconds. Client port: None
Jun 23 16:40:17 controller-0 snmpd[72072]: Connection from UDP: [192.168.24.1]:48634->[192.168.24.10]:161

[heat-admin@controller-0 ~]$ systemctl --failed
  UNIT                            LOAD   ACTIVE SUB    DESCRIPTION
● dhcp-interface                  loaded failed failed DHCP interface ovs-system
● haproxy.service                 loaded failed failed HAProxy Load Balancer
● openstack-cinder-volume.service loaded failed failed OpenStack Cinder Volume Server

[heat-admin@controller-0 ~]$ sudo systemctl status haproxy
● haproxy.service - HAProxy Load Balancer
   Loaded: loaded (/usr/lib/systemd/system/haproxy.service; disabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Fri 2017-06-23 13:08:00 UTC; 7h ago
 Main PID: 31616 (code=exited, status=143)

Jun 23 13:07:29 controller-0.localdomain haproxy[116940]: Connect from 172.17.1.14:60200 to 172.17.1.18:3306 (mysql/TCP)
Jun 23 13:07:58 controller-0.localdomain haproxy[116940]: Connect from 172.17.1.21:56104 to 172.17.1.18:8778 (nova_placement/HTTP)
Jun 23 13:07:58 controller-0.localdomain haproxy[116940]: Connect from 192.168.24.14:43458 to 192.168.24.14:35357 (keystone_admin/HTTP)
Jun 23 13:08:00 controller-0.localdomain systemd[1]: Stopping HAProxy Load Balancer...
Jun 23 13:08:00 controller-0.localdomain haproxy-systemd-wrapper[31616]: haproxy-systemd-wrapper: SIGTERM -> 116940.
Jun 23 13:08:00 controller-0.localdomain haproxy-systemd-wrapper[31616]: haproxy-systemd-wrapper: exit, haproxy RC=143
Jun 23 13:08:00 controller-0.localdomain systemd[1]: haproxy.service: main process exited, code=exited, status=143/n/a
Jun 23 13:08:00 controller-0.localdomain systemd[1]: Stopped HAProxy Load Balancer.
Jun 23 13:08:00 controller-0.localdomain systemd[1]: Unit haproxy.service entered failed state.
Jun 23 13:08:00 controller-0.localdomain systemd[1]: haproxy.service failed.
[heat-admin@controller-0 ~]$
Jun 23 13:08:00 controller-0.localdomain pengine[15325]: notice: Stop haproxy:0 (controller-0)
Jun 23 13:08:00 controller-0.localdomain crmd[15326]: notice: Initiating stop operation haproxy_stop_0 locally on controller-0
Jun 23 13:08:00 controller-0.localdomain haproxy-systemd-wrapper[31616]: haproxy-systemd-wrapper: SIGTERM -> 116940.
Jun 23 13:08:00 controller-0.localdomain haproxy-systemd-wrapper[31616]: haproxy-systemd-wrapper: exit, haproxy RC=143
Jun 23 13:08:00 controller-0.localdomain systemd[1]: haproxy.service: main process exited, code=exited, status=143/n/a
Jun 23 13:08:00 controller-0.localdomain systemd[1]: Unit haproxy.service entered failed state.
Jun 23 13:08:00 controller-0.localdomain systemd[1]: haproxy.service failed.
Jun 23 13:08:02 controller-0.localdomain crmd[15326]: notice: Result of stop operation for haproxy on controller-0: 0 (ok)
Jun 23 13:13:14 controller-0.localdomain yum[347498]: Updated: haproxy-1.5.18-6.el7.x86_64

[heat-admin@controller-0 ~]$ yum -v repolist
Not loading "rhnplugin" plugin, as it is disabled
Loading "product-id" plugin
Loading "search-disabled-repos" plugin
Loading "subscription-manager" plugin
Not root, Subscription Management repositories not updated
Config time: 0.083
Yum version: 3.4.3
rhelosp-11.0-devtools-puddle 4/4
rhelosp-11.0-puddle 737/737
Setting up Package Sacks
pkgsack time: 0.005

Repo-id      : rhelosp-11.0-ceph-2.0-mon/x86_64
Repo-name    : Ceph 2.0 MON
Repo-revision: 1497883293
Repo-updated : Mon Jun 19 14:41:33 2017
Repo-pkgs    : 133
Repo-size    : 729 M
Repo-baseurl : http://rhos-qe-mirror-qeos.usersys.redhat.com/pulp/content/dist/rhel/server/7/7Server/x86_64/ceph-mon/2/os/
Repo-expire  : 21,600 second(s) (last: Fri Jun 23 13:01:45 2017)
  Filter     : read-only:present
Repo-filename: /etc/yum.repos.d/rhos-release-11.repo

Repo-id      : rhelosp-11.0-ceph-2.0-osd/x86_64
Repo-name    : Ceph 2.0 OSD
Repo-revision: 1497883292
Repo-updated : Mon Jun 19 14:41:32 2017
Repo-pkgs    : 115
Repo-size    : 672 M
Repo-baseurl : http://rhos-qe-mirror-qeos.usersys.redhat.com/pulp/content/dist/rhel/server/7/7Server/x86_64/ceph-osd/2/os/
Repo-expire  : 21,600 second(s) (last: Fri Jun 23 13:01:45 2017)
  Filter     : read-only:present
Repo-filename: /etc/yum.repos.d/rhos-release-11.repo

Repo-id      : rhelosp-11.0-ceph-2.0-tools/x86_64
Repo-name    : Ceph 2.0 Tools
Repo-revision: 1497883292
Repo-updated : Mon Jun 19 14:41:32 2017
Repo-pkgs    : 153
Repo-size    : 230 M
Repo-baseurl : http://rhos-qe-mirror-qeos.usersys.redhat.com/pulp/content/dist/rhel/server/7/7Server/x86_64/ceph-tools/2/os/
Repo-expire  : 21,600 second(s) (last: Fri Jun 23 13:01:45 2017)
  Filter     : read-only:present
Repo-filename: /etc/yum.repos.d/rhos-release-11.repo

Repo-id      : rhelosp-11.0-devtools-puddle/x86_64
Repo-name    : RHOS-11.0
Repo-revision: 1497972122
Repo-updated : Tue Jun 20 15:22:03 2017
Repo-pkgs    : 4
Repo-size    : 1.2 M
Repo-baseurl : http://rhos-qe-mirror-qeos.usersys.redhat.com/rcm-guest/puddles/OpenStack/11.0-RHEL-7/2017-06-20.2/RH7-RHOS-DEVTOOLS-11.0/x86_64/os
Repo-expire  : 21,600 second(s) (last: Fri Jun 23 13:01:45 2017)
  Filter     : read-only:present
Repo-filename: /etc/yum.repos.d/rhos-release-11.repo

Repo-id      : rhelosp-11.0-puddle/x86_64
Repo-name    : RHOS-11.0
Repo-revision: 1497972077
Repo-updated : Tue Jun 20 15:21:56 2017
Repo-pkgs    : 737
Repo-size    : 2.0 G
Repo-baseurl : http://rhos-qe-mirror-qeos.usersys.redhat.com/rcm-guest/puddles/OpenStack/11.0-RHEL-7/2017-06-20.2/RH7-RHOS-11.0/x86_64/os
Repo-expire  : 21,600 second(s) (last: Fri Jun 23 13:01:45 2017)
  Filter     : read-only:present
Repo-filename: /etc/yum.repos.d/rhos-release-11.repo

Repo-id      : rhelosp-rhel-7.4-extras/x86_64
Repo-name    : Red Hat Enterprise Linux 7Server - x86_64 - Extras
Repo-revision: 1498209942
Repo-updated : Fri Jun 23 09:25:41 2017
Repo-pkgs    : 1
Repo-size    : 452 k
Repo-baseurl : http://rhos-qe-mirror-qeos.usersys.redhat.com/composes/nightly/EXTRAS-RHEL-7.4/latest-EXTRAS-7-RHEL-7/compose/Server/x86_64/os/
Repo-expire  : 21,600 second(s) (last: Fri Jun 23 13:01:45 2017)
  Filter     : read-only:present
Repo-filename: /etc/yum.repos.d/rhos-release-rhel-7.4.repo

Repo-id      : rhelosp-rhel-7.4-ha/x86_64
Repo-name    : Red Hat Enterprise Linux 7Server - x86_64 - HA
Repo-revision: 1498067830
Repo-updated : Wed Jun 21 17:57:10 2017
Repo-pkgs    : 35
Repo-size    : 13 M
Repo-baseurl : http://rhos-qe-mirror-qeos.usersys.redhat.com/composes/nightly/latest-RHEL-7/compose/Server/x86_64/os/addons/HighAvailability/
Repo-expire  : 21,600 second(s) (last: Fri Jun 23 13:01:45 2017)
  Filter     : read-only:present
Repo-filename: /etc/yum.repos.d/rhos-release-rhel-7.4.repo

Repo-id      : rhelosp-rhel-7.4-server/x86_64
Repo-name    : Red Hat Enterprise Linux 7Server - x86_64 - Server
Repo-revision: 1498067805
Repo-updated : Wed Jun 21 17:56:45 2017
Repo-pkgs    : 5,142
Repo-size    : 3.7 G
Repo-baseurl : http://rhos-qe-mirror-qeos.usersys.redhat.com/composes/nightly/latest-RHEL-7/compose/Server/x86_64/os/
Repo-expire  : 21,600 second(s) (last: Fri Jun 23 13:01:45 2017)
  Filter     : read-only:present
Repo-filename: /etc/yum.repos.d/rhos-release-rhel-7.4.repo

Repo-id      : rhos-release
Repo-name    : RHOS Release
Repo-revision: 1498149399
Repo-updated : Thu Jun 22 16:36:43 2017
Repo-pkgs    : 165
Repo-size    : 3.0 M
Repo-baseurl : http://download-node-02.eng.bos.redhat.com/rcm-guest/puddles/OpenStack/rhos-release/
Repo-expire  : 21,600 second(s) (last: Fri Jun 23 13:01:45 2017)
  Filter     : read-only:present
Repo-filename: /etc/yum.repos.d/rhos-release.repo

Repo-id      : rhos-release-extras/7Server
Repo-name    : RHOS Release Extras
Repo-revision: 1443035482
Repo-updated : Wed Sep 23 19:11:23 2015
Repo-pkgs    : 2
Repo-size    : 655 k
Repo-baseurl : http://download-node-02.eng.bos.redhat.com/rcm-guest/puddles/OpenStack/rhos-release/extras/7Server
Repo-expire  : 21,600 second(s) (last: Fri Jun 23 13:01:46 2017)
  Filter     : read-only:present
Repo-filename: /etc/yum.repos.d/rhos-release.repo

repolist: 6,487
[heat-admin@controller-0 ~]$

Version-Release number of selected component (if applicable):
osp11 -> osp11 + rhel7.4 testing repos

How reproducible:

Steps to Reproduce:
1. Deploy undercloud and overcloud osp11 using infrared:
   infrared virsh -v --host-address $HOST --host-key ~/.ssh/id_rsa --cleanup yes &&
   infrared virsh -v --host-address $HOST --host-key ~/.ssh/id_rsa --topology-nodes undercloud:1,controller:1,compute:1 -e override.controller.cpu=6 -e override.controller.memory=16384 -e override.undercloud.disks.disk1.size=100G &&
   infrared tripleo-undercloud --version 11 --images-task=rpm &&
   infrared tripleo-overcloud -v --introspect yes --tagging yes --post no --deployment-files virt --version 11 --deploy yes
2. Minor update of the undercloud node using local mirrors:
   ir tripleo-undercloud -v --update-undercloud yes --mirror qeos --build 7.4-testing --osrelease 7.4
3. Update the overcloud using ir and local mirrors:
   ir tripleo-overcloud -v --updateto 7.4-testing --deployment-files virt --mirror qeos --osrelease 7.4

Actual results:
The overcloud update gets stuck.

Expected results:
The overcloud update succeeds.

Additional info:
This bugzilla has been removed from the release and needs to be reviewed and Triaged for another Target Release.
Update failed after 4 hours:

cmd: source ~/stackrc ; openstack stack failures list overcloud
start: 2017-06-25 22:57:03.016532
end: 2017-06-25 22:57:19.857321
delta: 0:00:16.840789
stdout:
overcloud.Controller.0.UpdateDeployment:
  resource_type: OS::Heat::SoftwareDeployment
  physical_resource_id: 1951898e-23d8-4eb1-8c85-bbe787f9e4c0
  status: UPDATE_FAILED
  status_reason: |
    UPDATE aborted
  deploy_stdout: |
    Started yum_update.sh on server 39200262-621f-427d-98cf-229b49140c83 at Sun Jun 25 17:50:56 EDT 2017
    Not running due to unset update_identifier
  deploy_stderr: |

overcloud.Compute.0:
  resource_type: OS::TripleO::Compute
  physical_resource_id: 1a9fa9f1-0049-4231-afc2-cf091e0c2f27
  status: UPDATE_FAILED
  status_reason: |
    UPDATE aborted

[[ previous task time: 4:01:33.617169 = 14493.62s / 14808.38s ]]
I had a look at the logs, and it seems rabbit crashed early in the update (or perhaps even before the update was started), with errors like:

=ERROR REPORT==== 25-Jun-2017::22:59:31 ===
Error on AMQP connection <0.4605.0> (172.17.1.13:47914 -> 172.17.1.13:5672 - neutron-server:109031:f8f7b06a-abd1-463e-928a-406867f0a948, vhost: '/', user: 'guest', state: running), channel 0:
operation none caused a connection exception connection_forced: "broker forced connection closure with reason 'shutdown'"

The update then got stuck because the yum update triggers service restarts, which failed because rabbit wasn't working. So the question is: why did rabbit fail, and was it working before the update was attempted?
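To help answer when the broker first became unreachable, the matching client-side errors can be counted and the first one located in a log file. This is only a minimal sketch: the function names count_amqp_unreachable and first_amqp_unreachable are hypothetical, and the pattern is taken from the oslo.messaging lines quoted in this report.

```shell
# Hypothetical helpers; the pattern matches the oslo.messaging errors quoted in this bug.

# Print how many lines in the given log report an unreachable AMQP server.
count_amqp_unreachable() {
    # grep -c prints the count but exits 1 when it is 0; "|| true" hides that exit code.
    grep -c 'AMQP server on .* is unreachable' "$1" || true
}

# Print the first such line, which carries the timestamp of the first failure.
first_amqp_unreachable() {
    grep -m 1 'AMQP server on .* is unreachable' "$1"
}
```

For example, count_amqp_unreachable /var/log/messages followed by first_amqp_unreachable /var/log/messages on the controller would show how many retries were logged and when they began, which can then be compared with the time the update was started.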
This is quite common for OpenStack services; we should be able to handle a rabbitmq failure and die gracefully too.
Before the update, rabbit worked as expected. During the update it was shut down and did not recover: http://pastebin.test.redhat.com/497588
The deployment uses pacemaker, which means we shut down pacemaker services on the controller node that runs yum. Because this is a single-controller setup, that effectively takes down rabbitmq and mysql.

[root@controller-0 ~]# pcs status
Error: cluster is not currently running on this node

Doing so causes the OpenStack services to loop on AMQP and mariadb, and even though they get a shutdown request they neither stop nor restart. pcs cluster start and pcs resource cleanup could fix this, but I'm pretty sure galera won't survive it anyway. Is this supported by the HA/pacemaker team?
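The state described above could be detected from a script by looking for the error text that pcs status prints in this comment. A minimal sketch, assuming that exact message; the function name cluster_is_down is hypothetical and it only inspects text passed to it, it does not call pcs itself.

```shell
# Hypothetical helper: succeeds (exit 0) when the given `pcs status` output
# contains the "not running" error quoted in this bug, fails otherwise.
cluster_is_down() {
    printf '%s\n' "$1" | grep -q 'cluster is not currently running on this node'
}

# Usage on the controller (as root, with pcs installed):
#   if cluster_is_down "$(pcs status 2>&1)"; then
#       echo "pacemaker is stopped; rabbitmq and mysql are down as well"
#   fi
```

Passing the output in as an argument keeps the check usable on saved command output as well as on a live node.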
The seventh attempt at redeployment and update succeeded.
Stuck again, no workaround. This blocks us in testing the non-HA upgrade from osp11 -> osp12.

2017-06-27 21:04:47.617 80982 ERROR heat.engine.service [req-2701d8db-688b-4696-be24-6199b41380c2 - - - - -] Service 62b80531-5a95-4554-9339-ef8d51b1be85 update failed: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '172.17.1.15' ([Errno 113] EHOSTUNREACH)")
2017-06-27 21:04:47.617 80979 ERROR heat.engine.service [req-aa1cfb5a-07a8-4bfb-a9cb-77832046e1bc - - - - -] Service 70739846-253a-4816-bde7-8fa639862eae update failed: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '172.17.1.15' ([Errno 113] EHOSTUNREACH)")
2017-06-27 21:04:47.618 80981 ERROR heat.engine.service [req-99cf4795-e6c6-46d0-8255-baae368813b7 - - - - -] Service 8eb3f8ef-06d0-4d8b-9686-62f737815ce5 update failed: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '172.17.1.15' ([Errno 113] EHOSTUNREACH)")
2017-06-27 21:04:50.621 80984 ERROR heat.engine.service [req-75774a56-aca8-4346-bf85-d3d43c4dec6c - - - - -] Service d50eafaa-2331-4ac7-bafe-db344f13b2b2 update failed: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '172.17.1.15' ([Errno 113] EHOSTUNREACH)")
2017-06-27 21:04:50.621 80980 ERROR heat.engine.service [req-431f04f5-28d2-4425-921b-f2a40c9f9edd - - - - -] Service 64658e0c-1d87-4d87-ac9b-16b3b1b31b91 update failed: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '172.17.1.15' ([Errno 113] EHOSTUNREACH)")
2017-06-27 21:04:50.621 80985 ERROR heat.engine.service [req-cba05be3-4214-494b-9b68-9b488467442f - - - - -] Service adf56641-803d-49f0-a264-ba88356dfa27 update failed: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '172.17.1.15' ([Errno 113] EHOSTUNREACH)")
2017-06-27 21:04:50.621 80978 ERROR heat.engine.service [req-2a580256-f07c-4d30-80fd-7850e663bdfa - - - - -] Service 1035c73f-d520-478b-8e66-8c6ac3ccf4d3 update failed: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '172.17.1.15' ([Errno 113] EHOSTUNREACH)")
2017-06-27 21:04:50.621 80983 ERROR heat.engine.service [req-e4615cf4-798c-4104-a70e-afaf259d8916 - - - - -] Service 6dafb60a-54ee-4cec-b1eb-1430441b9b9b update failed: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '172.17.1.15' ([Errno 113] EHOSTUNREACH)")
2017-06-27 21:04:59.785 80978 ERROR oslo.messaging._drivers.impl_rabbit [-] [eea6961b-ffb3-45a8-a4b7-23fda988df3e] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 32 seconds. Client port: 45358
2017-06-27 21:04:59.786 80978 ERROR oslo.messaging._drivers.impl_rabbit [-] [6f5c7f3a-d0be-4b61-aa37-1058f12b7353] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 32 seconds. Client port: None
2017-06-27 21:04:59.787 80978 ERROR oslo.messaging._drivers.impl_rabbit [-] [1cd0b13e-86b8-47e9-861f-6cff12f3149f] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 32 seconds. Client port: 45386
2017-06-27 21:04:59.792 80983 ERROR oslo.messaging._drivers.impl_rabbit [-] [bd3be3e0-6288-47ea-bde9-ee01b5ec3832] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 32 seconds. Client port: 45384
2017-06-27 21:04:59.793 80983 ERROR oslo.messaging._drivers.impl_rabbit [-] [a533e0e3-6577-40e5-93fd-c7d74c649b1c] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 32 seconds. Client port: 45364
2017-06-27 21:04:59.794 80983 ERROR oslo.messaging._drivers.impl_rabbit [-] [350b7603-6228-465e-bec8-012bdbc6131b] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 32 seconds. Client port: 45350
2017-06-27 21:04:59.836 80984 ERROR oslo.messaging._drivers.impl_rabbit [-] [71f28f53-986c-49de-b8e6-1b1f26a3fcf0] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 32 seconds. Client port: 45348
2017-06-27 21:04:59.836 80984 ERROR oslo.messaging._drivers.impl_rabbit [-] [c8e685ec-894a-45d6-95f8-e56372372b86] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 32 seconds. Client port: 45368
2017-06-27 21:04:59.837 80984 ERROR oslo.messaging._drivers.impl_rabbit [-] [6e5eb076-0e91-4a76-a6e2-669335cf40c3] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 32 seconds. Client port: 45382
2017-06-27 21:04:59.901 80982 ERROR oslo.messaging._drivers.impl_rabbit [-] [6da7814b-aebf-4503-9bf1-c4d89b65b0f8] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 32 seconds. Client port: 45374
2017-06-27 21:04:59.902 80980 ERROR oslo.messaging._drivers.impl_rabbit [-] [a47deabb-c688-4c91-88a2-5249b2ea7b8c] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 32 seconds. Client port: 45344
2017-06-27 21:04:59.902 80982 ERROR oslo.messaging._drivers.impl_rabbit [-] [32b02e87-6008-42b9-b0af-fe20ebb06212] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 32 seconds. Client port: 45360
2017-06-27 21:04:59.902 80980 ERROR oslo.messaging._drivers.impl_rabbit [-] [d80d5bb1-675e-446c-b9df-37005bbba66b] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 32 seconds. Client port: None
2017-06-27 21:04:59.903 80982 ERROR oslo.messaging._drivers.impl_rabbit [-] [f1ae847e-d26f-4f50-b82d-93607d63fdc5] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 32 seconds. Client port: 45346
2017-06-27 21:04:59.903 80980 ERROR oslo.messaging._drivers.impl_rabbit [-] [e3584be9-76b4-4081-9e90-bbd79febdb7c] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 32 seconds. Client port: 45362
2017-06-27 21:04:59.907 80981 ERROR oslo.messaging._drivers.impl_rabbit [-] [4aff3f62-64d6-4235-b4f3-55102e87b8c0] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 32 seconds. Client port: None
2017-06-27 21:04:59.921 80981 ERROR oslo.messaging._drivers.impl_rabbit [-] [a9b8fea3-88b8-4f1c-850a-26889d2163d0] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 32 seconds. Client port: None
2017-06-27 21:04:59.922 80981 ERROR oslo.messaging._drivers.impl_rabbit [-] [ad9d7ac4-a016-47ac-82eb-84f015bfdfd5] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 32 seconds. Client port: 45342
2017-06-27 21:05:00.038 80985 ERROR oslo.messaging._drivers.impl_rabbit [-] [d0376174-50da-41e7-a298-f3d09fafb1f4] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 32 seconds. Client port: 45352
2017-06-27 21:05:00.039 80979 ERROR oslo.messaging._drivers.impl_rabbit [-] [2fd77107-0f57-4700-b09f-e00ea9e61366] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 32 seconds. Client port: 45366
2017-06-27 21:05:00.047 80985 ERROR oslo.messaging._drivers.impl_rabbit [-] [6f1dba1a-906f-4405-885a-8cce351acc7b] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 32 seconds. Client port: None
2017-06-27 21:05:00.047 80979 ERROR oslo.messaging._drivers.impl_rabbit [-] [d4a2ac11-8b30-4c0a-9e5b-901e5e3c5d06] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 32 seconds. Client port: None
2017-06-27 21:05:00.055 80979 ERROR oslo.messaging._drivers.impl_rabbit [-] [2503334f-a607-4b04-8f26-698debe70e1d] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 32 seconds. Client port: None
2017-06-27 21:05:00.055 80985 ERROR oslo.messaging._drivers.impl_rabbit [-] [fc6d1799-4576-4890-9a65-10a2c9289845] AMQP server on controller-0.internalapi.localdomain:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 32 seconds. Client port: None
See https://bugzilla.redhat.com/show_bug.cgi?id=1464588#c6: last time it passed only because pcs cluster start and pcs resource cleanup were run during the yum update on the controller. It will _always_ fail otherwise.
Confirmed. Running pcs cluster start and pcs resource cleanup during the yum update on the controller helps.
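The manual workaround confirmed here could be wrapped roughly as follows. This is a sketch, not a supported procedure: the two pcs commands are the ones named in this bug, while the run and recover_cluster helpers and the DRY_RUN variable are hypothetical. By default it only prints what it would do.

```shell
# Hypothetical wrapper around the two commands named in this bug.
# DRY_RUN=1 (the default) prints each command instead of executing it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Start the cluster on this node, then clear failed resource state.
# The cleanup is skipped if starting the cluster fails.
recover_cluster() {
    run pcs cluster start && run pcs resource cleanup
}
```

On the controller this would be invoked as root with DRY_RUN=0 recover_cluster while the yum update is in progress. As noted earlier in this bug, galera may still not survive the restart, so the result should be checked with pcs status afterwards.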
Hi Artem, as Lukas pointed out, non-HA with pacemaker is not supported unless you apply a manual workaround. It's more of a quick dev platform. I'm closing this as not a bug, but if you still think this should be supported, we can have it as an RFE for the next release, tracked in its own bz. Thanks,
*** Bug 1463287 has been marked as a duplicate of this bug. ***