Bug 1576782 - [UPDATE] update failed at Task [Retag pcmklatest to latest Cinder-Backup image]
Summary: [UPDATE] update failed at Task [Retag pcmklatest to latest Cinder-Backup image]
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: rc
Target Release: 13.0 (Queens)
Assignee: Emilien Macchi
QA Contact: Raviv Bar-Tal
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-05-10 11:54 UTC by Raviv Bar-Tal
Modified: 2018-06-27 13:56 UTC (History)

Fixed In Version: openstack-tripleo-heat-templates-8.0.2-22.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-06-27 13:55:31 UTC
Target Upstream Version:


Attachments
controller sosreport part a (19.00 MB, application/x-xz)
2018-05-10 12:01 UTC, Raviv Bar-Tal
controller sosreport part b (19.00 MB, application/octet-stream)
2018-05-10 12:05 UTC, Raviv Bar-Tal
controller sosreport part c (19.00 MB, application/octet-stream)
2018-05-10 12:06 UTC, Raviv Bar-Tal
controller sosreport part d (19.00 MB, application/octet-stream)
2018-05-10 12:08 UTC, Raviv Bar-Tal
controller sosreport part e (615.74 KB, application/octet-stream)
2018-05-10 12:10 UTC, Raviv Bar-Tal
/home/stack files (15.18 MB, application/x-xz)
2018-05-10 12:12 UTC, Raviv Bar-Tal


Links
System ID Priority Status Summary Last Updated
Launchpad 1770598 None None None 2018-05-11 14:59:31 UTC
OpenStack gerrit 567806 None MERGED Fix cinder-backup image wrangling on update 2020-05-20 20:06:25 UTC
OpenStack gerrit 569146 None MERGED Fix cinder-backup image wrangling on update 2020-05-20 20:06:26 UTC
Red Hat Product Errata RHEA-2018:2086 None None None 2018-06-27 13:56:36 UTC

Description Raviv Bar-Tal 2018-05-10 11:54:50 UTC
Description of problem:
The update from the 2018-05-07.2 build failed during the controller update, in the task [Retag pcmklatest to latest Cinder-Backup image].
Error message: 
"Error response from daemon: no such id: 192.168.24.1:8787/rhosp13/openstack-cinder-backup:2018-05-07.2"

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Install osp13 build 2018-05-07.2
2. update undercloud
3. update overcloud


Actual results:


Expected results:


Additional info:
See attached logs.
Automatic job on stage server:
http://staging-jenkins2-qe-playground.usersys.redhat.com/view/DFG/view/upgrades/view/update/job/DFG-upgrades-updates-13-from-2018-05-07.2-HA-ipv4/1/console
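
For context on the error message above, here is a minimal sketch of a guarded retag. It assumes "no such id" means the source image was not in the controller's local image store when the tag step ran, which this report does not confirm. The function names and the DOCKER override variable are hypothetical; this is not the code from the tripleo-heat-templates update tasks.

```shell
# Sketch only: guard a pcmklatest retag so that a missing source image is
# reported clearly instead of surfacing as "no such id".
# DOCKER is a hypothetical override hook (e.g. DOCKER=echo for a dry run).

# Print the pcmklatest reference for an image, keeping a registry port intact.
pcmklatest_ref() {
  local image="$1"
  local name="${image##*/}"   # last path component, e.g. name:tag
  case "$name" in
    *:*) image="${image%:*}" ;;   # drop only the tag, not the registry port
  esac
  printf '%s:pcmklatest\n' "$image"
}

# Retag IMAGE as pcmklatest, but only if IMAGE exists locally.
retag_to_pcmklatest() {
  local image="$1"
  local docker="${DOCKER:-docker}"
  if ! "$docker" image inspect "$image" >/dev/null 2>&1; then
    echo "image not present locally: $image (pull it first)" >&2
    return 1
  fi
  "$docker" tag "$image" "$(pcmklatest_ref "$image")"
}
```

On a controller this could be called as `retag_to_pcmklatest 192.168.24.1:8787/rhosp13/openstack-cinder-backup:2018-05-07.2`; a non-zero return would indicate that the image needs to be pulled before retagging.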

Comment 1 Raviv Bar-Tal 2018-05-10 12:01:50 UTC
Created attachment 1434337 [details]
controller sosreport part a

Comment 2 Raviv Bar-Tal 2018-05-10 12:03:17 UTC
As a result of the error, controller-2 is offline:
[heat-admin@controller-0 ~]$ sudo pcs status
Cluster name: tripleo_cluster
Stack: corosync
Current DC: controller-1 (version 1.1.18-11.el7_5.2-2b07d5c5a9) - partition with quorum
Last updated: Thu May 10 11:59:37 2018
Last change: Wed May  9 16:26:09 2018 by root via cibadmin on controller-0

12 nodes configured
38 resources configured

Online: [ controller-0 controller-1 ]
OFFLINE: [ controller-2 ]
GuestOnline: [ galera-bundle-0@controller-0 galera-bundle-1@controller-1 rabbitmq-bundle-0@controller-0 rabbitmq-bundle-1@controller-1 redis-bundle-0@controller-0 redis-bundle-1@controller-1 ]

Full list of resources:

 Docker container set: rabbitmq-bundle [192.168.24.1:8787/rhosp13/openstack-rabbitmq:pcmklatest]
   rabbitmq-bundle-0	(ocf::heartbeat:rabbitmq-cluster):	Started controller-0
   rabbitmq-bundle-1	(ocf::heartbeat:rabbitmq-cluster):	Started controller-1
   rabbitmq-bundle-2	(ocf::heartbeat:rabbitmq-cluster):	Stopped
 Docker container set: galera-bundle [192.168.24.1:8787/rhosp13/openstack-mariadb:pcmklatest]
   galera-bundle-0	(ocf::heartbeat:galera):	Master controller-0
   galera-bundle-1	(ocf::heartbeat:galera):	Master controller-1
   galera-bundle-2	(ocf::heartbeat:galera):	Stopped
 Docker container set: redis-bundle [192.168.24.1:8787/rhosp13/openstack-redis:pcmklatest]
   redis-bundle-0	(ocf::heartbeat:redis):	Master controller-0
   redis-bundle-1	(ocf::heartbeat:redis):	Slave controller-1
   redis-bundle-2	(ocf::heartbeat:redis):	Stopped
 ip-192.168.24.8	(ocf::heartbeat:IPaddr2):	Started controller-0
 ip-10.0.0.101	(ocf::heartbeat:IPaddr2):	Started controller-1
 ip-172.17.1.12	(ocf::heartbeat:IPaddr2):	Started controller-1
 ip-172.17.1.13	(ocf::heartbeat:IPaddr2):	Started controller-0
 ip-172.17.3.10	(ocf::heartbeat:IPaddr2):	Started controller-1
 ip-172.17.4.19	(ocf::heartbeat:IPaddr2):	Started controller-0
 Docker container set: haproxy-bundle [192.168.24.1:8787/rhosp13/openstack-haproxy:pcmklatest]
   haproxy-bundle-docker-0	(ocf::heartbeat:docker):	Started controller-0
   haproxy-bundle-docker-1	(ocf::heartbeat:docker):	Started controller-1
   haproxy-bundle-docker-2	(ocf::heartbeat:docker):	Stopped
 Docker container: openstack-cinder-volume [192.168.24.1:8787/rhosp13/openstack-cinder-volume:pcmklatest]
   openstack-cinder-volume-docker-0	(ocf::heartbeat:docker):	Started controller-0
 Docker container: openstack-cinder-backup [192.168.24.1:8787/rhosp13/openstack-cinder-backup:pcmklatest]
   openstack-cinder-backup-docker-0	(ocf::heartbeat:docker):	Started controller-1

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
[heat-admin@controller-0 ~]$
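
The offline node can also be pulled out of output like the above with a small filter. This is a sketch that assumes the "OFFLINE: [ ... ]" line format shown here; the function name offline_nodes is hypothetical.

```shell
# Print the node names from the "OFFLINE:" line of `pcs status` output
# read on stdin, one name per line. Prints nothing if no node is offline.
offline_nodes() {
  sed -n 's/^OFFLINE: \[ \(.*\) \][[:space:]]*$/\1/p' | tr ' ' '\n'
}
```

Usage would be along the lines of `sudo pcs status | offline_nodes`, which is handy when checking cluster state between update steps.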

Comment 3 Raviv Bar-Tal 2018-05-10 12:05:14 UTC
Created attachment 1434338 [details]
controller sosreport part b

Comment 4 Raviv Bar-Tal 2018-05-10 12:06:26 UTC
Created attachment 1434339 [details]
controller sosreport part c

Comment 5 Raviv Bar-Tal 2018-05-10 12:08:17 UTC
Created attachment 1434340 [details]
controller sosreport part d

Comment 6 Raviv Bar-Tal 2018-05-10 12:10:35 UTC
Created attachment 1434341 [details]
controller sosreport part e

Comment 7 Raviv Bar-Tal 2018-05-10 12:12:05 UTC
Created attachment 1434342 [details]
/home/stack files

Comment 9 Jiri Stransky 2018-05-11 14:59:31 UTC
Looking at the logs and code, this probably affects the cinder-backup service specifically. I have a fix proposal but haven't been able to test it yet, as I hit unrelated issues with the upstream environment.

Raviv, to move forward with testing, I think you can either:

* apply the intended fix https://review.openstack.org/567806 to your environment (this would be nice as we'd also pre-validate the fix downstream),

or

* temporarily remove environments/cinder-backup.yaml from the command lines used when testing.
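
The second option can be sketched as a filter over the argument list of the update command. This assumes the environment file is passed with -e or --environment-file followed by a path ending in environments/cinder-backup.yaml; the function name drop_cinder_backup_env is hypothetical.

```shell
# Print the given arguments, one per line, without any
# "-e <path>/environments/cinder-backup.yaml" pair.
drop_cinder_backup_env() {
  while [ "$#" -gt 0 ]; do
    case "$1" in
      -e|--environment-file)
        case "${2-}" in
          */environments/cinder-backup.yaml)
            shift 2     # skip both the flag and the file path
            continue
            ;;
        esac
        ;;
    esac
    printf '%s\n' "$1"
    shift
  done
}
```

The output is meant for reviewing which arguments remain, not for direct re-execution, since arguments containing whitespace would need extra quoting.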

Comment 10 Raviv Bar-Tal 2018-05-14 11:25:43 UTC
I have manually applied the patch and the update passed this stage.
We should get this patch merged and landed downstream ASAP.

Comment 12 Jiri Stransky 2018-05-15 13:16:57 UTC
The patch is hitting instability in the upstream CI, but once it lands at least on master, I think we can propose a downstream backport without waiting for the upstream one.

Comment 21 errata-xmlrpc 2018-06-27 13:55:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:2086

