
Bug 1292212

Summary: Updating a failed stack fails with Stack already has an action (UPDATE) in progress.
Product: Red Hat OpenStack
Component: rhosp-director
Version: 7.0 (Kilo)
Target Release: 7.0 (Kilo)
Status: CLOSED DUPLICATE
Severity: urgent
Priority: unspecified
Hardware: Unspecified
OS: Unspecified
Reporter: Marius Cornea <mcornea>
Assignee: chris alfonso <calfonso>
QA Contact: yeylon <yeylon>
CC: ggillies, hbrock, mburns, rhel-osp-director-maint, srevivo, zbitter
Doc Type: Bug Fix
Clone Of:
: 1301511
Last Closed: 2016-01-07 19:35:37 UTC
Type: Bug
Attachments: 2nd.update.output (flags: none)

Description Marius Cornea 2015-12-16 18:18:04 UTC
Created attachment 1106501: 2nd.update.output

Description of problem:
I'm trying to update a 7.0 environment and simulate a failure during the update. To do this, I left the overcloud nodes without any repos installed. The first update attempt failed as expected. I then installed the repos on the nodes and reran the update command, but the stack still ended up in a failed state.

Version-Release number of selected component (if applicable):
openstack-heat-api-2015.1.2-4.el7ost.noarch
openstack-heat-api-cfn-2015.1.2-4.el7ost.noarch
openstack-heat-common-2015.1.2-4.el7ost.noarch
openstack-tripleo-heat-templates-0.8.6-94.el7ost.noarch
python-heatclient-0.6.0-1.el7ost.noarch
openstack-heat-engine-2015.1.2-4.el7ost.noarch
openstack-heat-templates-0-0.8.20150605git.el7ost.noarch
openstack-heat-api-cloudwatch-2015.1.2-4.el7ost.noarch


Steps to Reproduce:
1. Deploy 7.0 with the overcloud nodes having no repos installed
openstack overcloud deploy \
    --templates ~/templates/my-overcloud \
    --control-scale 1 --compute-scale 1 --ceph-storage-scale 0 \
    --ntp-server clock.redhat.com \
    --libvirt-type qemu \
    -e ~/templates/my-overcloud/environments/network-isolation.yaml \
    -e ~/templates/network-environment.yaml

2. Update the undercloud to 7.2 and attempt to update the overcloud
/usr/bin/yes '' | openstack overcloud update stack overcloud -i \
         --templates ~/templates/my-overcloud \
         -e ~/templates/my-overcloud/overcloud-resource-registry-puppet.yaml \
         -e ~/templates/my-overcloud/environments/network-isolation.yaml \
         -e ~/templates/network-environment.yaml \
         -e ~/templates/my-overcloud/environments/updates/update-from-keystone-admin-internal-api.yaml \
         -e ~/templates/param-updates.yaml

Result: update finished with status FAILED
 "deploy_stderr": "cat: /var/lib/tripleo/installed-packages/*: No such file or directory\nThere are no enabled repos.\n Run \"yum repolist all\" to see the repos you have.\n You can enable repos with yum-config-manager --enable <repo>\n", 

3. Manually add the 7.2 repos on the overcloud nodes

4. Rerun the update command

Actual results:
The update command fails after trying several times to remove the breakpoint on overcloud-compute-0 (see attached 2nd.update.output). The yum update did run on the overcloud-compute-0 node, as there are no package updates left available.

The stack_status_reason is "resources.Controller: Stack overcloud-Controller-5ldijz5uvxog already has an action (UPDATE) in progress". Indeed, there are nested stacks in UPDATE_IN_PROGRESS while the overcloud stack is UPDATE_FAILED.

heat stack-list -n | grep PROGRESS
| a4f9c92d-cbd3-427b-8829-f3913f187cae | overcloud-Controller-5ldijz5uvxog                                                                             | UPDATE_IN_PROGRESS | 2015-12-16T16:49:28Z | e48ee6e8-e167-4a84-91e3-04978ec3520e |
| 2b03d6ba-e222-4565-a3f3-2d2b2744ec83 | overcloud-Controller-5ldijz5uvxog-0-spbwkx4yxzeo                                                              | UPDATE_IN_PROGRESS | 2015-12-16T16:49:31Z | a4f9c92d-cbd3-427b-8829-f3913f187cae |

heat stack-list
+--------------------------------------+------------+---------------+----------------------+
| id                                   | stack_name | stack_status  | creation_time        |
+--------------------------------------+------------+---------------+----------------------+
| e48ee6e8-e167-4a84-91e3-04978ec3520e | overcloud  | UPDATE_FAILED | 2015-12-16T16:49:12Z |
+--------------------------------------+------------+---------------+----------------------+
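The mismatch shown above (parent stack UPDATE_FAILED, nested stacks still UPDATE_IN_PROGRESS) can be spotted with a small filter over the `heat stack-list -n` output. This is only a sketch: the function name in_progress_stacks is made up for illustration, and the column positions are assumed from the rows quoted in this report.

```shell
#!/bin/sh
# Print "<stack_name> <stack_status>" for every stack whose status contains
# _IN_PROGRESS. Reads `heat stack-list -n` table output on stdin.
# ASSUMPTION: when splitting on "|", the stack name is field 3 and the
# status is field 4, as in the rows quoted in this report.
in_progress_stacks() {
    awk -F'|' '
        $4 ~ /_IN_PROGRESS/ {
            name = $3; status = $4
            # Trim the padding that the table layout adds around each cell.
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            gsub(/^[ \t]+|[ \t]+$/, "", status)
            print name, status
        }'
}

# Usage on the undercloud (not run here):
#   heat stack-list -n | in_progress_stacks
```

Border and header rows are skipped because their fourth field never contains _IN_PROGRESS.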
 
Expected results:
The update process completes OK.

Comment 2 Marius Cornea 2015-12-17 08:01:25 UTC
The workaround is to wait until the IN_PROGRESS update times out and then
rerun the update.
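A rough sketch of that wait, assuming it is enough to poll `heat stack-list -n` until nothing reports an IN_PROGRESS status. The function name wait_until_idle, the poll count, and the interval are all made up for illustration. The listing command is passed in as arguments so the loop can be tried against a stub first.

```shell
#!/bin/sh
# Poll until no stack reports an *_IN_PROGRESS status, or give up.
#   $1: maximum number of polls
#   $2: seconds to sleep between polls
#   remaining args: command that prints the stack list
#                   (on the undercloud: heat stack-list -n)
wait_until_idle() {
    max_polls=$1; interval=$2; shift 2
    i=0
    while [ "$i" -lt "$max_polls" ]; do
        # A failing list command counts as "not idle", so an auth or API
        # error is never mistaken for an empty result.
        if out=$("$@") && ! printf '%s\n' "$out" | grep -q '_IN_PROGRESS'; then
            return 0    # nothing in progress any more
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    return 1            # still in progress after max_polls checks
}

# Usage (not run here); 120 polls at 60 s are arbitrary values:
#   wait_until_idle 120 60 heat stack-list -n && echo "safe to rerun the update"
```

Once the function returns 0, the update command from step 2 can be rerun.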

Comment 3 Zane Bitter 2016-01-07 19:35:37 UTC
Note that you *should* be able to speed up the workaround by restarting heat-engine on the undercloud, as that puts the nested stacks into the FAILED state at startup. I say "should" because this was notoriously buggy in the past, though we believe it's working now.
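For reference, the restart could look something like the sketch below. It assumes the systemd unit on the undercloud is named openstack-heat-engine, matching the openstack-heat-engine package listed in the description; verify the unit name on your system. The SYSTEMCTL variable is a hypothetical hook added here only to allow a dry run.

```shell
#!/bin/sh
# Restart heat-engine on the undercloud so that, at startup, it moves nested
# stacks that are stuck in IN_PROGRESS to FAILED.
# ASSUMPTION: the systemd unit is named openstack-heat-engine.
# SYSTEMCTL is a hypothetical override, e.g. SYSTEMCTL=echo for a dry run.
restart_heat_engine() {
    ${SYSTEMCTL:-sudo systemctl} restart openstack-heat-engine
}

# Usage (not run here):
#   restart_heat_engine
#   heat stack-list -n | grep PROGRESS   # should print nothing if the restart helped
```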

I'm going to close this as a duplicate, since it's a known problem.

*** This bug has been marked as a duplicate of bug 1253773 ***