Description of problem:
-----------------------
During a major RHOS upgrade, newer images have their own namespace. This has to be handled for pcs* managed services to avoid an issue where the docker image from the current RHOS version has already been removed but is still referenced in bundles. E.g.:

pcs status
Cluster name: tripleo_cluster
Stack: corosync
Current DC: database-0 (version 1.1.18-11.el7-2b07d5c5a9) - partition with quorum
Last updated: Fri Apr 6 09:28:56 2018
Last change: Fri Apr 6 09:20:13 2018 by hacluster via crmd on controller-2

20 nodes configured
38 resources configured

Online: [ controller-0 controller-1 controller-2 database-0 database-1 database-2 messaging-0 messaging-1 messaging-2 ]
RemoteOnline: [ networker-0 networker-1 ]
GuestOnline: [ galera-bundle-0@database-0 galera-bundle-1@database-1 galera-bundle-2@database-2 rabbitmq-bundle-0@messaging-0 rabbitmq-bundle-1@messaging-1 rabbitmq-bundle-2@messaging-2 ]

Full list of resources:

 networker-0 (ocf::pacemaker:remote): Started database-0
 networker-1 (ocf::pacemaker:remote): Started database-1
 Docker container set: galera-bundle [192.168.24.1:8787/rhosp12/openstack-mariadb:pcmklatest]
   galera-bundle-0 (ocf::heartbeat:galera): Master database-0
   galera-bundle-1 (ocf::heartbeat:galera): Master database-1
   galera-bundle-2 (ocf::heartbeat:galera): Master database-2
 Docker container set: redis-bundle [192.168.24.1:8787/rhosp12/openstack-redis:pcmklatest]
   redis-bundle-0 (ocf::heartbeat:redis): Stopped
   redis-bundle-1 (ocf::heartbeat:redis): Stopped
   redis-bundle-2 (ocf::heartbeat:redis): Stopped
 Docker container set: rabbitmq-bundle [192.168.24.1:8787/rhosp12/openstack-rabbitmq:pcmklatest]
   rabbitmq-bundle-0 (ocf::heartbeat:rabbitmq-cluster): Started messaging-0
   rabbitmq-bundle-1 (ocf::heartbeat:rabbitmq-cluster): Started messaging-1
   rabbitmq-bundle-2 (ocf::heartbeat:rabbitmq-cluster): Started messaging-2
 ip-192.168.24.7 (ocf::heartbeat:IPaddr2): Stopped
 ip-10.0.0.101 (ocf::heartbeat:IPaddr2): Stopped
 ip-172.17.1.13 (ocf::heartbeat:IPaddr2): Stopped
 ip-172.17.1.17 (ocf::heartbeat:IPaddr2): Stopped
 ip-172.17.3.11 (ocf::heartbeat:IPaddr2): Stopped
 ip-172.17.4.14 (ocf::heartbeat:IPaddr2): Stopped
 Docker container set: haproxy-bundle [192.168.24.1:8787/rhosp12/openstack-haproxy:pcmklatest]
   haproxy-bundle-docker-0 (ocf::heartbeat:docker): Stopped
   haproxy-bundle-docker-1 (ocf::heartbeat:docker): Stopped
   haproxy-bundle-docker-2 (ocf::heartbeat:docker): Stopped

Failed Actions:
* redis-bundle-docker-0_start_0 on controller-2 'unknown error' (1): call=227, status=complete, exitreason='failed to pull image 192.168.24.1:8787/rhosp12/openstack-redis:pcmklatest',
    last-rc-change='Fri Apr 6 09:20:20 2018', queued=0ms, exec=253ms
* redis-bundle-docker-1_start_0 on controller-2 'unknown error' (1): call=231, status=complete, exitreason='failed to pull image 192.168.24.1:8787/rhosp12/openstack-redis:pcmklatest',
    last-rc-change='Fri Apr 6 09:20:21 2018', queued=0ms, exec=280ms
* redis-bundle-docker-2_start_0 on controller-2 'unknown error' (1): call=220, status=complete, exitreason='failed to pull image 192.168.24.1:8787/rhosp12/openstack-redis:pcmklatest',
    last-rc-change='Fri Apr 6 09:20:15 2018', queued=0ms, exec=226ms

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
openstack-tripleo-heat-templates-8.0.2-0.20180327213843.f25e2d8.el7ost.noarch

How reproducible:
-----------------
100%

Steps to Reproduce:
-------------------
1. Upgrade UC to RHOS-13
2. Set up latest repos on oc
3. Prepare docker images from RHOS-13
4. Run `openstack overcloud upgrade prepare ...` to generate upgrade playbooks
5. Start upgrade of nodes hosting pcs* managed services

Actual results:
---------------
The upgrade process hangs until the bundles are updated with the correct image, e.g.:

pcs resource bundle update redis-bundle container image=<path/to/new/image>
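
The manual workaround above can be sketched for all four bundles at once. This is only a dry-run illustration, not the fix itself: it prints the `pcs resource bundle update` commands instead of running them. The helper names `rewrite_image` and `bundle_update_cmd` are hypothetical, and the new namespace `rhosp13` is an assumption (the report only shows `rhosp12` and mentions RHOS-13); substitute the real path to the new image for your environment.

```shell
#!/bin/bash
# Dry-run sketch: print the pcs commands that would repoint each bundle
# from the old image namespace to the new one. Nothing is executed.

OLD_NS=rhosp12   # namespace seen in the pcs status output of this report
NEW_NS=rhosp13   # ASSUMPTION: not stated in the report, adjust as needed

# Hypothetical helper: swap the namespace component of an image reference.
# $1 = image reference, $2 = old namespace, $3 = new namespace
rewrite_image() {
    printf '%s\n' "$1" | sed "s|/$2/|/$3/|"
}

# Hypothetical helper: format the update command quoted in the report.
# $1 = bundle id, $2 = new image reference
bundle_update_cmd() {
    printf 'pcs resource bundle update %s container image=%s\n' "$1" "$2"
}

# Bundle ids and image names as listed in the pcs status output.
for pair in galera-bundle:openstack-mariadb \
            redis-bundle:openstack-redis \
            rabbitmq-bundle:openstack-rabbitmq \
            haproxy-bundle:openstack-haproxy; do
    bundle=${pair%%:*}
    name=${pair#*:}
    old_image="192.168.24.1:8787/${OLD_NS}/${name}:pcmklatest"
    bundle_update_cmd "$bundle" "$(rewrite_image "$old_image" "$OLD_NS" "$NEW_NS")"
done
```

Review the printed commands before running any of them on a cluster node; whether the `pcmklatest` tag exists under the new namespace depends on how the RHOS-13 images were prepared.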
Review 560426 should fix the bug in upstream queens. It should be applied on top of 560322, which is not yet merged in queens but is already merged in master.
in build, sorry for the noise.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2018:2086