Bug 1516275
Summary: | OSP11 -> OSP12 upgrade: major-upgrade-composable-steps-docker.yaml fails while running cinder-manage db_sync when an incorrect location of Docker images is provided | ||
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Marius Cornea <mcornea> |
Component: | python-paunch | Assignee: | Steve Baker <sbaker> |
Status: | CLOSED ERRATA | QA Contact: | Yurii Prokulevych <yprokule> |
Severity: | urgent | Docs Contact: | |
Priority: | urgent | ||
Version: | 12.0 (Pike) | CC: | augol, dbecker, emacchi, goneri, jamsmith, maandre, mandreou, m.andre, mburns, morazi, ohochman, rhel-osp-director-maint, sbaker |
Target Milestone: | z3 | Keywords: | TestOnly, Triaged, ZStream |
Target Release: | 12.0 (Pike) | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | python-paunch-1.5.3-1.el7ost | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2018-08-20 12:53:41 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 1516634 |
Description
Marius Cornea
2017-11-22 11:51:24 UTC
We discussed this on the upgrades call today. Reaching out to the Containers and Deployment DFGs to see if they have any thoughts about how we might catch this earlier. The upgrade_tasks and the upgrade workflow itself currently don't check images or do anything to the containers (they mainly stop and disable systemd services). Please see comment #1. Thanks.

There is an enhancement to paunch which would make this failure a lot less obscure. Currently, detached containers are launched by doing a "docker run" and then continuing with the next tasks. If the image can't be pulled (wrong image ref, network issue), the container will eventually fail to start. If paunch checked whether the image exists locally and, when it doesn't, did a docker pull first, it could fail early with a clear message.

This won't catch the cases where the container isn't starting for some other reason, because paunch is not a service manager. For that we would need specific validator resources in tripleo-heat-templates which (for example) assert that mariadb is running and responding just before the first db_sync runs.

Upstream fix has landed. I'd like to know whether this should get downstream via a stable/pike backport or a direct downstream backport.

There is no downstream git/gerrit for paunch [1], but there is an upstream stable backport.

[1] http://git.app.eng.bos.redhat.com/git/?q=python-paunch

According to our records, this should be resolved by python-paunch-1.5.3-1.el7ost. This build is available now.

Verified with python-paunch-1.5.5-1.el7ost.noarch

Upgrade step failed:

    overcloud.AllNodesDeploySteps.AllNodesPostUpgradeSteps.CephStorageDeployment_Step1.1:
      resource_type: OS::Heat::StructuredDeployment
      physical_resource_id: 2c45805b-e217-4bb1-a446-7bd738652292
      status: CREATE_FAILED
      status_reason: |
        Error: resources[1]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2
      deploy_stdout: |
        ...
                "See '/usr/bin/docker-current run --help'.",
                "2018-07-02 12:17:23,043 INFO: 62691 -- Finished processing puppet configs",
                "2018-07-02 12:17:23,043 ERROR: 62690 -- ERROR configuring crond"
            ]
        }
            to retry, use: --limit @/var/lib/heat-config/heat-config-ansible/fbad6ef5-eae2-4898-a39c-1df245294d85_playbook.retry

        PLAY RECAP *********************************************************************
        localhost : ok=6 changed=2 unreachable=0 failed=1

    Heat Stack update failed.
    Heat Stack update failed.

And in os-collect-config logs:

    ...
    "2018-07-02 12:17:28,334 ERROR: 62966 -- Failed running docker-puppet.py for crond",
    "2018-07-02 12:17:28,335 ERROR: 62966 -- Unable to find image '192.168.24.1:8787/rhosp12/openstack-cron:inexistent' locally",
    "Trying to pull repository 192.168.24.1:8787/rhosp12/openstack-cron ... ",
    "Pulling repository 192.168.24.1:8787/rhosp12/openstack-cron",
    "/usr/bin/docker-current: Error: image rhosp12/openstack-cron:inexistent not found.",
    "See '/usr/bin/docker-current run --help'.",
    "2018-07-02 12:17:28,335 INFO: 62966 -- Finished processing puppet configs",
    "2018-07-02 12:17:28,335 ERROR: 62965 -- ERROR configuring crond"
    ...

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2521
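For readers who want to see the idea from the paunch comment above in concrete form, here is a minimal sketch of the "check locally, then pull, then fail early" logic. It is not the code that landed in paunch. The names `ensure_image`, `run_quiet` and `ImageUnavailable` are hypothetical and exist only for illustration; the only real commands assumed are `docker inspect --type image` and `docker pull`.

```python
import subprocess


class ImageUnavailable(Exception):
    """Hypothetical error: image is neither present locally nor pullable."""


def run_quiet(cmd):
    """Run a command and return (exit_code, combined stdout/stderr)."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return proc.returncode, proc.stdout


def ensure_image(image, run=run_quiet, docker_cmd="docker"):
    """Make sure `image` is available before any "docker run" is attempted.

    Returns "present" if the image already exists locally, "pulled" if it
    had to be pulled, and raises ImageUnavailable if the pull fails.
    The `run` parameter is an assumption added so the logic can be
    exercised without a docker daemon.
    """
    # Step 1: is the image already in the local store?
    rc, _ = run([docker_cmd, "inspect", "--type", "image", image])
    if rc == 0:
        return "present"

    # Step 2: not local, so try to pull it now rather than letting a
    # detached "docker run" fail later with an obscure error.
    rc, out = run([docker_cmd, "pull", image])
    if rc != 0:
        raise ImageUnavailable(
            "Unable to pull image %s: %s" % (image, out.strip())
        )
    return "pulled"
```

Calling `ensure_image("192.168.24.1:8787/rhosp12/openstack-cron:inexistent")` before launching the container would, under these assumptions, raise immediately with the image reference in the message. As noted in the comment above, this only covers missing images; a container that fails to start for other reasons still needs a separate validation step.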