Bug 1810119
Summary: | OSP13 update from GA to latest: image name registry change makes pacemaker fail to restart | ||
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Sofer Athlan-Guyot <sathlang> |
Component: | openstack-tripleo-heat-templates | Assignee: | Sofer Athlan-Guyot <sathlang> |
Status: | CLOSED ERRATA | QA Contact: | Sasha Smolyak <ssmolyak> |
Severity: | urgent | Docs Contact: | |
Priority: | urgent | ||
Version: | 13.0 (Queens) | CC: | chjones, jjoyce, lmiccini, mburns |
Target Milestone: | z12 | Keywords: | TestBlocker, Triaged, ZStream |
Target Release: | 13.0 (Queens) | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Fixed In Version: | openstack-tripleo-heat-templates-8.4.1-53.el7ost | Doc Type: | No Doc Update |
Last Closed: | 2020-06-24 11:33:20 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Sofer Athlan-Guyot
2020-03-04 15:13:32 UTC
Solved upstream for OSP16; the fix needs to be backported.

Hi,

The last puddle we can update to is 2020-01-15.3 [1]. Starting with 2020-02-10.8 [2], the registry uses a new path, and this breaks the update of HA containers.

Note that CI can still be green, because pacemaker can recover during the update, so the "breakage" (which needs to be formally analysed) can go unseen in CI.

The following sequence happens:

1. Pacemaker is stopped on ctl-0; ctl-1 and ctl-2 are still up and running.
2. The resource is updated with the new pcmklatest on ctl-0.
3. The change is picked up right away by ctl-1 and ctl-2; they try to pull the new image and fail.
4. At that point all HA services are down except on ctl-0.

So at step 3 there should be no API outage, since ctl-0 takes the load, but ctl-1 and ctl-2 are down. They recover when those nodes are updated in turn, but we lose high availability during the update. There may be other consequences, which need further analysis.

Thanks,

[1] http://rhos-qe-mirror-tlv.usersys.redhat.com/rcm-guest/puddles/OpenStack/13.0-RHEL-7/2020-01-15.3/overcloud_container_image_prepare.yaml
[2] http://rhos-qe-mirror-tlv.usersys.redhat.com/rcm-guest/puddles/OpenStack/13.0-RHEL-7/2020-02-10.8/overcloud_container_image_prepare.yaml

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2718
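For readers who want to reason about the update sequence in the description, here is a minimal sketch that models it as a toy simulation. It is not the tripleo or pacemaker implementation. The image names are hypothetical placeholders (the report does not give the real registry paths), and the model assumes that the resource definition is cluster-wide, that pacemaker is running again on ctl-0 when the resource is updated, and that a node can only start an HA container when the named image is available locally.

```python
from dataclasses import dataclass, field

# Hypothetical image references; the real registry paths are not in the report.
OLD_IMAGE = "old-registry-path/ha-service:pcmklatest"
NEW_IMAGE = "new-registry-path/ha-service:pcmklatest"


@dataclass
class Controller:
    name: str
    local_images: set = field(default_factory=set)
    pacemaker_running: bool = True

    def service_up(self, cluster_image: str) -> bool:
        # Assumption: an HA container starts only if pacemaker runs on the
        # node and the image named by the cluster resource exists locally.
        return self.pacemaker_running and cluster_image in self.local_images


def simulate_update_of_first_node() -> dict:
    """Return {node name: HA service up?} after ctl-0 has been updated."""
    nodes = [Controller(f"ctl-{i}", {OLD_IMAGE}) for i in range(3)]
    ctl0 = nodes[0]

    # Step 1: pacemaker is stopped on ctl-0; ctl-1 and ctl-2 keep running.
    ctl0.pacemaker_running = False

    # Assumption: the update makes the new image available on ctl-0 only,
    # and pacemaker is started again on ctl-0 afterwards.
    ctl0.local_images.add(NEW_IMAGE)
    ctl0.pacemaker_running = True

    # Step 2: the resource is updated from ctl-0; the definition is shared
    # by the whole cluster, so every node now sees the new image name.
    cluster_image = NEW_IMAGE

    # Steps 3 and 4: ctl-1 and ctl-2 do not have the new image (their pull
    # fails in the report), so only ctl-0 can run the HA services.
    return {node.name: node.service_up(cluster_image) for node in nodes}


if __name__ == "__main__":
    for name, up in simulate_update_of_first_node().items():
        print(name, "up" if up else "down")
```

Under these assumptions the model reproduces the state reported in step 4: the services stay up on ctl-0 only, so the API remains reachable but there is no redundancy until ctl-1 and ctl-2 are updated.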