Bug 1523707
Summary: | [UPDATES] PCS managed containers ain't restarted with latest images | ||
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Yurii Prokulevych <yprokule> |
Component: | openstack-tripleo-heat-templates | Assignee: | mathieu bultel <mbultel> |
Status: | CLOSED ERRATA | QA Contact: | Yurii Prokulevych <yprokule> |
Severity: | urgent | Docs Contact: | |
Priority: | urgent | ||
Version: | 12.0 (Pike) | CC: | achernet, amachluf, aopincar, aschultz, atalmor, augol, chjones, dbecker, dciabrin, dnavale, lbezdick, maandre, mbracho, mbultel, mburns, michele, morazi, rhel-osp-director-maint, sasha, sathlang, skatlapa, tvignaud |
Target Milestone: | z2 | Keywords: | Regression, Triaged, ZStream |
Target Release: | 12.0 (Pike) | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | openstack-tripleo-heat-templates-7.0.3-24.el7ost python-tripleoclient-7.3.3-8.el7ost openstack-tripleo-common-7.6.9-1.el7ost | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2018-03-28 17:14:53 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Description
Yurii Prokulevych
2017-12-08 16:16:06 UTC
Hi, so in the ansible run we can see (for haproxy, for instance):

TASK [Get a list of container using Haproxy image] *****************************
skipping: [192.168.24.20]

TASK [Remove any container using the same Haproxy image] ***********************
skipping: [192.168.24.20]

TASK [Remove previous Haproxy images] ******************************************
skipping: [192.168.24.20]

TASK [Pull latest Haproxy images] **********************************************
skipping: [192.168.24.20]

TASK [Retag pcmklatest to latest Haproxy image] ********************************
skipping: [192.168.24.20]

The crucial tasks are skipped.

The previous comment has to be ignored; this is done later on.

So the problem seems to be that we stop the pcs cluster at step 1 and search for pcs managed containers at step 2. The problem is that the containers are stopped and we run 'docker ps -q -f ancestor=<image_id>', which by default shows only running containers.

So Damien, Yurii and I spent some more time on this. We started from a clean environment and we could not reproduce the problem:
- Each controller did exactly what we expected and updated to the latest pacemaker image

Tomorrow we will run some more tests. Right now the theory is that some additional steps need to happen for us to see the issue (maybe rerunning some steps like the --init-minor-update or the config download multiple times). I think we need to fully understand the root cause before we look at throwing any patches at the problem.

So the issue here is that the config container is updated before the heat stack update is finished; that's why the config doesn't get all the latest docker images. The workaround would be to run --init-minor-update twice for GA only if we want to update the docker registry file. For 0 day or Z release, I have something that fixes this wrong behavior.
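For readers following the 'docker ps' remark in the thread above, here is a minimal sketch of why a stopped container would not be found by that query. It illustrates only that intermediate theory, not the root cause identified later in the thread. The helper names, the container id and the image name are hypothetical, and the filter is modelled as an exact image match, which is simpler than Docker's real ancestor filter. The only Docker behaviour relied on is that 'docker ps' lists running containers by default and that '-a' also lists stopped ones.

```python
from typing import Dict, List


def docker_ps_args(image_id: str, include_stopped: bool = False) -> List[str]:
    """Build the argument list for a `docker ps` query filtered by image.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    args = ["docker", "ps", "-q"]
    if include_stopped:
        # -a (--all) makes docker list stopped containers as well.
        args.append("-a")
    args += ["-f", "ancestor=" + image_id]
    return args


def visible_containers(containers: List[Dict], image_id: str,
                       include_stopped: bool = False) -> List[str]:
    """Toy model of the listing: exact image match, running-only by default."""
    return [
        c["id"]
        for c in containers
        if c["image"] == image_id and (include_stopped or c["running"])
    ]


# Hypothetical state after the cluster has been stopped at step 1.
containers = [
    {"id": "haproxy-bundle-0", "image": "haproxy:pcmklatest", "running": False},
]

print(docker_ps_args("haproxy:pcmklatest"))
print(visible_containers(containers, "haproxy:pcmklatest"))
print(visible_containers(containers, "haproxy:pcmklatest", include_stopped=True))
```

In this model the default query returns an empty list for a stopped container, which would be consistent with the follow-up tasks being skipped, while the query with '-a' still finds it.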
LP and master review attached.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:0602

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 365 days.