Created attachment 1106501
2nd.update.output

Description of problem:
I'm trying to update a 7.0 environment and simulate a failure during the update. To do this, I did not install any repos on the overcloud nodes. The first update attempt failed. I then installed the repos on the nodes and reran the update command, but the stack still failed.

Version-Release number of selected component (if applicable):
openstack-heat-api-2015.1.2-4.el7ost.noarch
openstack-heat-api-cfn-2015.1.2-4.el7ost.noarch
openstack-heat-common-2015.1.2-4.el7ost.noarch
openstack-tripleo-heat-templates-0.8.6-94.el7ost.noarch
python-heatclient-0.6.0-1.el7ost.noarch
openstack-heat-engine-2015.1.2-4.el7ost.noarch
openstack-heat-templates-0-0.8.20150605git.el7ost.noarch
openstack-heat-api-cloudwatch-2015.1.2-4.el7ost.noarch

Steps to Reproduce:
1. Deploy 7.0 with no repos installed on the overcloud nodes:

openstack overcloud deploy \
    --templates ~/templates/my-overcloud \
    --control-scale 1 --compute-scale 1 --ceph-storage-scale 0 \
    --ntp-server clock.redhat.com \
    --libvirt-type qemu \
    -e ~/templates/my-overcloud/environments/network-isolation.yaml \
    -e ~/templates/network-environment.yaml

2. Update the undercloud to 7.2 and attempt to update the overcloud:

/usr/bin/yes '' | openstack overcloud update stack overcloud -i \
    --templates ~/templates/my-overcloud \
    -e ~/templates/my-overcloud/overcloud-resource-registry-puppet.yaml \
    -e ~/templates/my-overcloud/environments/network-isolation.yaml \
    -e ~/templates/network-environment.yaml \
    -e ~/templates/my-overcloud/environments/updates/update-from-keystone-admin-internal-api.yaml \
    -e ~/templates/param-updates.yaml

Result: update finished with status FAILED

"deploy_stderr": "cat: /var/lib/tripleo/installed-packages/*: No such file or directory\nThere are no enabled repos.\n Run \"yum repolist all\" to see the repos you have.\n You can enable repos with yum-config-manager --enable <repo>\n",

3. Manually add the 7.2 repos on the overcloud nodes
4. Rerun the update command

Actual results:
The update command fails after trying a couple of times to remove the breakpoint on overcloud-compute-0 (see attached 2nd.update.output). The yum update appears to have already run on the overcloud-compute-0 node, since no package updates are available there.

The stack_status_reason is:

resources.Controller: Stack overcloud-Controller-5ldijz5uvxog already has an action (UPDATE) in progress

Indeed, there seem to be nested stacks in UPDATE_IN_PROGRESS while the overcloud stack is UPDATE_FAILED.

heat stack-list -n | grep PROGRESS
| a4f9c92d-cbd3-427b-8829-f3913f187cae | overcloud-Controller-5ldijz5uvxog | UPDATE_IN_PROGRESS | 2015-12-16T16:49:28Z | e48ee6e8-e167-4a84-91e3-04978ec3520e |
| 2b03d6ba-e222-4565-a3f3-2d2b2744ec83 | overcloud-Controller-5ldijz5uvxog-0-spbwkx4yxzeo | UPDATE_IN_PROGRESS | 2015-12-16T16:49:31Z | a4f9c92d-cbd3-427b-8829-f3913f187cae |

heat stack-list
+--------------------------------------+------------+---------------+----------------------+
| id                                   | stack_name | stack_status  | creation_time        |
+--------------------------------------+------------+---------------+----------------------+
| e48ee6e8-e167-4a84-91e3-04978ec3520e | overcloud  | UPDATE_FAILED | 2015-12-16T16:49:12Z |
+--------------------------------------+------------+---------------+----------------------+

Expected results:
The update process completes OK.
The workaround is to wait until the IN_PROGRESS update times out and then rerun the update.
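The wait in that workaround could be scripted roughly as follows. This is only a sketch: the helper names `count_in_progress` and `wait_for_idle` and the 60-second default polling interval are made up for illustration, and the only Heat command used is the `heat stack-list -n` call already shown in the report.

```shell
#!/bin/bash
# Sketch only: helper names and the polling interval are illustrative assumptions.

# Read `heat stack-list -n` output on stdin and print how many stacks
# (including nested ones) are still in an *_IN_PROGRESS state.
count_in_progress() {
    # grep -c prints 0 but exits non-zero when nothing matches; mask that.
    grep -c '_IN_PROGRESS' || true
}

# Poll until no stack is reported as *_IN_PROGRESS.
# $1: seconds between polls (default 60).
wait_for_idle() {
    local interval="${1:-60}" listing
    while true; do
        # Give up if the listing itself fails, rather than treating it as idle.
        listing="$(heat stack-list -n)" || return 1
        if [ "$(printf '%s\n' "$listing" | count_in_progress)" -eq 0 ]; then
            return 0
        fi
        sleep "$interval"
    done
}
```

Once `wait_for_idle` returns successfully on the undercloud, the same `openstack overcloud update stack` command from step 2 can be rerun. The loop has no upper bound of its own; it relies on Heat eventually timing out the stuck update.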
Note that you *should* be able to speed up the workaround by restarting heat-engine on the undercloud, as that puts the nested stacks into the FAILED state at startup. I say "should" because this was notoriously buggy in the past, though we believe it's working now.

I'm going to close this as a duplicate, since it's a known problem.

*** This bug has been marked as a duplicate of bug 1253773 ***
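For reference, the heat-engine restart mentioned above might look something like the sketch below. It assumes heat-engine runs on the undercloud as the systemd unit `openstack-heat-engine` (that name is not stated in this report and may differ on your install), and the function name is made up for illustration. Whether the nested stacks actually move to FAILED depends on the behaviour described in the comment, so check the output rather than assuming it.

```shell
#!/bin/bash
# Sketch only. Assumption: heat-engine is managed by systemd under this unit
# name; override HEAT_ENGINE_UNIT if your undercloud uses a different one.
HEAT_ENGINE_UNIT="${HEAT_ENGINE_UNIT:-openstack-heat-engine}"

# Restart heat-engine, then list every stack (nested included) that is
# either still *_IN_PROGRESS or has ended up *_FAILED.
restart_heat_engine_and_list() {
    sudo systemctl restart "$HEAT_ENGINE_UNIT" || return 1
    heat stack-list -n | grep -E '_IN_PROGRESS|_FAILED'
}
```

If the listing still shows UPDATE_IN_PROGRESS entries after the restart, fall back to waiting for the timeout before rerunning the update.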