rhel-osp-director: Update from 7.0: Heat stack times out with no resources IN_PROGRESS after engine restart

Environment:
python-heatclient-0.6.0-1.el7ost.noarch
openstack-heat-api-2015.1.2-8.el7ost.noarch
heat-cfntools-1.2.8-2.el7.noarch
openstack-heat-templates-0-0.8.20150605git.el7ost.noarch
instack-undercloud-2.1.2-39.el7ost.noarch
openstack-heat-engine-2015.1.2-8.el7ost.noarch
openstack-tripleo-heat-templates-0.8.6-117.el7ost.noarch
openstack-heat-api-cloudwatch-2015.1.2-8.el7ost.noarch
openstack-heat-api-cfn-2015.1.2-8.el7ost.noarch
openstack-heat-common-2015.1.2-8.el7ost.noarch

Steps to reproduce:
1. Deploy 7.0 with:
   openstack overcloud deploy --templates --control-scale 3 --compute-scale 2 --ceph-storage-scale 1 --neutron-network-type vxlan --neutron-tunnel-types vxlan --ntp-server x.x.x.x --timeout 90 -e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e network-environment.yaml
2. Attempt to update the deployment.
3. The update will fail due to a timeout. To work around this, raise the rpc_response_timeout value to 600 in /etc/heat/heat.conf (https://bugzilla.redhat.com/show_bug.cgi?id=1305947) and restart the heat engine.
4. Resume the overcloud update.

Result:
The deployment times out with this message:
"ERROR: openstack ERROR: Authentication failed. Please try again with option --include-password or export HEAT_INCLUDE_PASSWORD=1"

Expected result:
The update should complete successfully.
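For anyone reproducing step 3, it can help to confirm the value heat-engine will actually read before resuming the update. The following is only a minimal sketch: the function name is hypothetical, it assumes the option lives in the [DEFAULT] section of /etc/heat/heat.conf, and it assumes a fallback of 60 seconds when the option is unset or commented out.

```python
import configparser


def rpc_response_timeout(conf_text, default=60):
    """Return rpc_response_timeout from the [DEFAULT] section of heat.conf text.

    The fallback of 60 seconds is an assumption for when the option is unset.
    """
    # strict=False tolerates repeated options, which can occur in service configs.
    parser = configparser.RawConfigParser(strict=False)
    parser.read_string(conf_text)
    return parser.getint("DEFAULT", "rpc_response_timeout", fallback=default)


if __name__ == "__main__":
    with open("/etc/heat/heat.conf") as conf_file:
        print(rpc_response_timeout(conf_file.read()))
```

The sketch only reads the file; editing the option and restarting the heat engine are still done by hand as in step 3.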
Correction, the initially deployed version is 7.2GA.
The Controller-0 and Controller-1 nested stacks were the only resources still in progress when the stack timed out:

[stack@instack ~]$ heat resource-list -n5 overcloud|grep -v COMPLE
+---------------+--------------------------------------+-------------------------+-----------------+----------------------+-----------------+
| resource_name | physical_resource_id                 | resource_type           | resource_status | updated_time         | parent_resource |
+---------------+--------------------------------------+-------------------------+-----------------+----------------------+-----------------+
| Controller    | 280e7277-1646-4b51-8fa7-d7b50cb0310e | OS::Heat::ResourceGroup | UPDATE_FAILED   | 2016-02-10T21:30:47Z |                 |
| 1             | 348fbeed-1521-439c-8dc0-85de4197a438 | OS::TripleO::Controller | UPDATE_FAILED   | 2016-02-10T21:30:59Z | Controller      |
| 0             | b0997bf0-dc0c-4380-ac4b-e101658a6c02 | OS::TripleO::Controller | UPDATE_FAILED   | 2016-02-10T21:31:44Z | Controller      |
+---------------+--------------------------------------+-------------------------+-----------------+----------------------+-----------------+
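The grep filter above can also be done on saved output, which keeps the column names attached to each row. This is a rough sketch, not part of the original report: the function name is hypothetical, and it assumes the ASCII table layout shown above, with a resource_status column whose finished states end in _COMPLETE.

```python
def unfinished_resources(table_text):
    """Return rows of a `heat resource-list` table whose status is not *_COMPLETE."""
    header = None
    rows = []
    for raw_line in table_text.splitlines():
        line = raw_line.strip()
        # Skip border lines (+----+) and anything that is not a table row.
        if not (line.startswith("|") and line.endswith("|")):
            continue
        cells = [cell.strip() for cell in line[1:-1].split("|")]
        if header is None:
            header = cells  # the first row holds the column names
            continue
        row = dict(zip(header, cells))
        if not row.get("resource_status", "").endswith("_COMPLETE"):
            rows.append(row)
    return rows
```

For example, passing the text saved from heat resource-list -n5 overcloud returns one dictionary per resource that is still in progress or failed, keyed by the column names.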
Analysis of the log shows that this is actually a duplicate of bug 1290950. However, I'm going to leave this open for the moment because there are reports of a similar failure mode *not* involving a heat-engine restart, which presumably couldn't have the same cause.
The other issue was unrelated, so closing this as a duplicate.

*** This bug has been marked as a duplicate of bug 1290950 ***