Bug 1372844

Summary: rhel-osp-director: Scaling of computes fails "UPDATE_FAILED NotFound_Remote: resources[0]: Software config with id a9c8e15c-edce-401f-a2d9-932bed21cfb5 not found"
Product: Red Hat OpenStack
Component: openstack-heat
Version: 10.0 (Newton)
Status: CLOSED ERRATA
Severity: high
Priority: high
Target Milestone: rc
Target Release: 10.0 (Newton)
Hardware: Unspecified
OS: Unspecified
Keywords: Triaged
Reporter: Alexander Chuzhoy <sasha>
Assignee: Steve Baker <sbaker>
QA Contact: Alexander Chuzhoy <sasha>
CC: dbecker, jcoufal, jschluet, jslagle, mburns, morazi, rhel-osp-director-maint, sasha, sbaker, shardy, srevivo, zbitter
Fixed In Version: openstack-heat-7.0.0-0.20160907124808.21e49dc.el7ost
Last Closed: 2016-12-14 15:56:43 UTC
Type: Bug
Attachments: heat logs from the undercloud (flags: none)

Description Alexander Chuzhoy 2016-09-02 21:59:44 UTC
Environment:
instack-undercloud-5.0.0-0.20160818065636.41ef775.el7ost.noarch
openstack-puppet-modules-9.0.0-0.20160802183056.8c758d6.el7ost.noarch
openstack-tripleo-heat-templates-5.0.0-0.20160823140311.72404b.1.el7ost.noarch

Steps to reproduce:
1. Deploy overcloud with:
openstack overcloud deploy --templates --control-scale 3 --compute-scale 1 --neutron-network-type vxlan --neutron-tunnel-types vxlan --ntp-server clock.redhat.com --timeout 90 -e /usr/share/openstack-tripleo-heat-templates/environments/puppet-pacemaker.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e network-environment.yaml --ceph-storage-scale 1

2. Attempt to scale computes with:
openstack overcloud deploy --templates --control-scale 3 --compute-scale 2 --neutron-network-type vxlan --neutron-tunnel-types vxlan --ntp-server clock.redhat.com --timeout 90 -e /usr/share/openstack-tripleo-heat-templates/environments/puppet-pacemaker.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e network-environment.yaml --ceph-storage-scale 1
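
The two commands above appear to differ only in the value passed to --compute-scale. A minimal Python sketch that makes this explicit by tokenizing both commands and listing the tokens that differ; the `BASE` template and the `changed_tokens` helper are illustrative names, not part of any OpenStack tool.

```python
import shlex

# Arguments shared by both runs, copied from the steps above.
# Only the --compute-scale value is parameterized.
BASE = (
    "openstack overcloud deploy --templates --control-scale 3 "
    "--compute-scale {computes} --neutron-network-type vxlan "
    "--neutron-tunnel-types vxlan --ntp-server clock.redhat.com --timeout 90 "
    "-e /usr/share/openstack-tripleo-heat-templates/environments/puppet-pacemaker.yaml "
    "-e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml "
    "-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml "
    "-e network-environment.yaml --ceph-storage-scale 1"
)


def changed_tokens(before, after):
    """Return (index, old, new) for every token that differs between two commands."""
    old, new = shlex.split(before), shlex.split(after)
    if len(old) != len(new):
        raise ValueError("commands have different numbers of tokens")
    return [(i, a, b) for i, (a, b) in enumerate(zip(old, new)) if a != b]


initial = BASE.format(computes=1)  # step 1: initial deployment
scaled = BASE.format(computes=2)   # step 2: scale-out attempt
diff = changed_tokens(initial, scaled)
print(diff)
```

Because every other argument is unchanged, the failure is triggered by a plain scale-out update rather than by a change in templates or environment files.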

The scale-out fails with "UPDATE_FAILED NotFound_Remote: resources[0]: Software config with id a9c8e15c-edce-401f-a2d9-932bed21cfb5 not found".

Result:
2016-09-02 18:45:48 [ObjectStorageAllNodesDeployment]: UPDATE_IN_PROGRESS state changed
2016-09-02 18:45:48 [overcloud-CephStorageAllNodesDeployment-w4aac2qgjxyt]: UPDATE_IN_PROGRESS Stack UPDATE started
2016-09-02 18:45:49 [ControllerAllNodesDeployment]: UPDATE_IN_PROGRESS state changed
2016-09-02 18:45:49 [0]: UPDATE_IN_PROGRESS state changed
2016-09-02 18:45:49 [overcloud-ControllerAllNodesDeployment-x4zjnyh3qsxe]: UPDATE_IN_PROGRESS Stack UPDATE started
2016-09-02 18:45:50 [0]: UPDATE_FAILED NotFound_Remote: resources[0]: Software config with id a9c8e15c-edce-401f-a2d9-932bed21cfb5 not found
Traceback (most recent call last):

  File "/usr/lib/python2.7/site-packages/heat/common/context.py", line 424, in wrapped
    return func(self, ctx, *ar
2016-09-02 18:45:50 [2]: UPDATE_IN_PROGRESS state changed
2016-09-02 18:45:50 [2]: UPDATE_FAILED NotFound_Remote: resources[2]: Software config with id a9c8e15c-edce-401f-a2d9-932bed21cfb5 not found
Traceback (most recent call last):

  File "/usr/lib/python2.7/site-packages/heat/common/context.py", line 424, in wrapped
    return func(self, ctx, *ar
2016-09-02 18:45:50 [overcloud-CephStorageAllNodesDeployment-w4aac2qgjxyt]: UPDATE_FAILED NotFound_Remote: resources[0]: Software config with id a9c8e15c-edce-401f-a2d9-932bed21cfb5 not found
Traceback (most recent call last):

  File "/usr/lib/python2.7/site-packages/heat/common/context.py", line 424, in wrapped
    return func(self, ctx, *ar
2016-09-02 18:45:50 [overcloud-ControllerAllNodesDeployment-x4zjnyh3qsxe]: UPDATE_FAILED NotFound_Remote: resources[2]: Software config with id a9c8e15c-edce-401f-a2d9-932bed21cfb5 not found
Traceback (most recent call last):

  File "/usr/lib/python2.7/site-packages/heat/common/context.py", line 424, in wrapped
    return func(self, ctx, *ar
2016-09-02 18:45:50 [ComputeAllNodesDeployment]: UPDATE_IN_PROGRESS state changed
2016-09-02 18:45:51 [overcloud-ComputeAllNodesDeployment-l4sohum2k4aq]: UPDATE_IN_PROGRESS Stack UPDATE started
2016-09-02 18:45:51 [BlockStorageAllNodesDeployment]: UPDATE_IN_PROGRESS state changed
2016-09-02 18:45:51 [1]: CREATE_IN_PROGRESS state changed
2016-09-02 18:45:52 [CephStorageAllNodesDeployment]: UPDATE_FAILED resources.CephStorageAllNodesDeployment: NotFound_Remote: resources[0]: Software config with id a9c8e15c-edce-401f-a2d9-932bed21cfb5 not found
Traceback (most recent call last):

  File "/usr/lib/python2.7/site-packages/heat/common/context.py", line 424,
2016-09-02 18:45:52 [ObjectStorageAllNodesDeployment]: UPDATE_FAILED UPDATE aborted
2016-09-02 18:45:52 [ControllerAllNodesDeployment]: UPDATE_FAILED UPDATE aborted
2016-09-02 18:45:52 [ComputeAllNodesDeployment]: UPDATE_FAILED UPDATE aborted
2016-09-02 18:45:52 [BlockStorageAllNodesDeployment]: UPDATE_FAILED UPDATE aborted
2016-09-02 18:45:52 [overcloud]: UPDATE_FAILED resources.CephStorageAllNodesDeployment: NotFound_Remote: resources[0]: Software config with id a9c8e15c-edce-401f-a2d9-932bed21cfb5 not found
Traceback (most recent call last):

  File "/usr/lib/python2.7/site-packages/heat/common/context.py", line 424,
2016-09-02 18:45:53 [0]: UPDATE_IN_PROGRESS state changed
2016-09-02 18:45:53 [0]: UPDATE_FAILED NotFound_Remote: resources[0]: Software config with id a9c8e15c-edce-401f-a2d9-932bed21cfb5 not found
Traceback (most recent call last):

  File "/usr/lib/python2.7/site-packages/heat/common/context.py", line 424, in wrapped
    return func(self, ctx, *ar
Stack overcloud UPDATE_FAILED
Heat Stack update failed.
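
The event output above mixes progress events, real errors, "UPDATE aborted" follow-on failures, and truncated tracebacks. A minimal Python sketch for separating them, assuming each event line has the form `<timestamp> [<resource>]: <STATUS> <reason>` as seen in the output; the regular expressions and the `summarize_failures` helper are illustrative, not part of Heat or the openstack client.

```python
import re

# Assumed event format, inferred from the output above:
#   <timestamp> [<resource>]: <STATUS> <reason>
EVENT_RE = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"\[(?P<resource>[^\]]+)\]: (?P<status>[A-Z_]+)(?: (?P<reason>.*))?$"
)
CONFIG_ID_RE = re.compile(r"Software config with id ([0-9a-f-]{36}) not found")

# A few lines copied from the output above.
SAMPLE = """\
2016-09-02 18:45:49 [0]: UPDATE_IN_PROGRESS state changed
2016-09-02 18:45:50 [0]: UPDATE_FAILED NotFound_Remote: resources[0]: Software config with id a9c8e15c-edce-401f-a2d9-932bed21cfb5 not found
Traceback (most recent call last):
2016-09-02 18:45:52 [ControllerAllNodesDeployment]: UPDATE_FAILED UPDATE aborted
2016-09-02 18:45:52 [overcloud]: UPDATE_FAILED resources.CephStorageAllNodesDeployment: NotFound_Remote: resources[0]: Software config with id a9c8e15c-edce-401f-a2d9-932bed21cfb5 not found
"""


def summarize_failures(log_text):
    """Return (errored, aborted, missing_config_ids) for an event log."""
    errored, aborted, missing_ids = [], [], set()
    for line in log_text.splitlines():
        match = EVENT_RE.match(line)
        if not match or not match.group("status").endswith("_FAILED"):
            continue  # skip progress events and traceback lines
        reason = match.group("reason") or ""
        if reason.endswith("aborted"):
            aborted.append(match.group("resource"))
        else:
            errored.append(match.group("resource"))
        missing_ids.update(CONFIG_ID_RE.findall(reason))
    return errored, aborted, missing_ids


errored, aborted, missing_ids = summarize_failures(SAMPLE)
print(errored, aborted, missing_ids)
```

Run against the full output, a summary like this shows that every explicit error names the same software config id, while the remaining failures are aborts caused by the first error.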


Expected result:
Successful scale of computes.

Comment 2 Alexander Chuzhoy 2016-09-02 22:02:06 UTC
Created attachment 1197322
heat logs from the undercloud

Comment 3 James Slagle 2016-09-13 20:15:55 UTC
Steve, could you have a look at this one and do some initial triage (or assign it to someone who can)? I've not seen this error before, but it looks like it might be a Heat bug rather than a problem with tripleo-heat-templates.

Comment 4 Steve Baker 2016-09-13 21:14:36 UTC
I'll take a look

Comment 5 Steve Baker 2016-09-13 21:28:14 UTC
This sounds a lot like the issue that was fixed in this change: https://review.openstack.org/#/c/360122/

Before I investigate further, can you confirm whether your Heat has this fix?

Comment 6 Steve Baker 2016-09-13 21:30:10 UTC
Attaching the upstream bug, which looks quite similar

Comment 7 Alexander Chuzhoy 2016-09-15 17:30:00 UTC
Steve,
the fix isn't there.

Comment 8 James Slagle 2016-09-20 16:52:53 UTC
Steve, can we get the fix downstream? It may get picked up in the next import; otherwise a patch will need to be proposed.

Comment 9 Steve Baker 2016-09-20 23:25:41 UTC
The current OSP-10 puddle has the following package, which includes the fixes for this:

  openstack-heat-engine-7.0.0-0.20160907124808.21e49dc.el7ost.noarch

If you still see this issue with this or later versions then we'll need to investigate.
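
One way to check whether an installed snapshot build is at least as new as the build named above is to compare the upstream version and the 14-digit snapshot timestamp embedded in the release string. This is only a rough Python sketch under that assumption: it is not RPM's own version comparison, it handles only snapshot-style releases like the one shown, and the `build_key` and `has_fix` helpers are illustrative names.

```python
import re

# Build named in the comment above as containing the fix.
FIXED_BUILD = "openstack-heat-engine-7.0.0-0.20160907124808.21e49dc.el7ost.noarch"


def build_key(nvra):
    """Split name-version-release.arch and return (version tuple, snapshot timestamp)."""
    rest, _arch = nvra.rsplit(".", 1)
    _name, version, release = rest.rsplit("-", 2)
    snapshot = re.search(r"\b(\d{14})\b", release)
    if snapshot is None:
        # Assumption: only snapshot builds are supported by this sketch.
        raise ValueError("no 14-digit snapshot timestamp in release: " + release)
    return tuple(int(part) for part in version.split(".")), snapshot.group(1)


def has_fix(installed_nvra, fixed_nvra=FIXED_BUILD):
    """True if the installed build is the fixed build or a later snapshot."""
    return build_key(installed_nvra) >= build_key(fixed_nvra)


# Hypothetical older build string, made up purely for illustration.
older = "openstack-heat-engine-7.0.0-0.20160823000000.abcdef0.el7ost.noarch"
print(has_fix(FIXED_BUILD), has_fix(older))
```

The installed package string would typically come from a query such as `rpm -q openstack-heat-engine` on the undercloud.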

Comment 11 Alexander Chuzhoy 2016-09-27 20:15:51 UTC
Verified:

Environment:
openstack-heat-engine-7.0.0-0.20160907124808.21e49dc.el7ost.noarch
openstack-heat-common-7.0.0-0.20160907124808.21e49dc.el7ost.noarch

I was able to scale the deployment out by one additional compute node, live-migrate a running instance to it, and launch an instance directly on the new node.

Comment 15 errata-xmlrpc 2016-12-14 15:56:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-2948.html