Bug 1372844 - rhel-osp-director: Scaling of computes fails " UPDATE_FAILED NotFound_Remote: resources[0]: Software config with id a9c8e15c-edce-401f-a2d9-932bed21cfb5 not found"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-heat
Version: 10.0 (Newton)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: 10.0 (Newton)
Assignee: Steve Baker
QA Contact: Alexander Chuzhoy
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-09-02 21:59 UTC by Alexander Chuzhoy
Modified: 2016-12-14 15:56 UTC
CC List: 12 users

Fixed In Version: openstack-heat-7.0.0-0.20160907124808.21e49dc.el7ost
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-12-14 15:56:43 UTC


Attachments
heat logs from the undercloud (5.85 MB, application/x-gzip)
2016-09-02 22:02 UTC, Alexander Chuzhoy


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2016:2948 normal SHIPPED_LIVE Red Hat OpenStack Platform 10 enhancement update 2016-12-14 19:55:27 UTC
OpenStack gerrit 360122 None None None 2016-09-20 16:53:15 UTC
Launchpad 1616550 None None None 2016-09-13 21:30:10 UTC

Description Alexander Chuzhoy 2016-09-02 21:59:44 UTC
Environment:
instack-undercloud-5.0.0-0.20160818065636.41ef775.el7ost.noarch
openstack-puppet-modules-9.0.0-0.20160802183056.8c758d6.el7ost.noarch
openstack-tripleo-heat-templates-5.0.0-0.20160823140311.72404b.1.el7ost.noarch

Steps to reproduce:
1. Deploy overcloud with:
openstack overcloud deploy --templates --control-scale 3 --compute-scale 1 --neutron-network-type vxlan --neutron-tunnel-types vxlan --ntp-server clock.redhat.com --timeout 90 -e /usr/share/openstack-tripleo-heat-templates/environments/puppet-pacemaker.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e network-environment.yaml --ceph-storage-scale 1

2. Attempt to scale computes with:
openstack overcloud deploy --templates --control-scale 3 --compute-scale 2 --neutron-network-type vxlan --neutron-tunnel-types vxlan --ntp-server clock.redhat.com --timeout 90 -e /usr/share/openstack-tripleo-heat-templates/environments/puppet-pacemaker.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e network-environment.yaml --ceph-storage-scale 1
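The two commands above differ only in the --compute-scale value. A minimal sketch that makes this explicit, assuming a hypothetical helper named build_deploy_cmd (it is not part of the report or of any tool) that prints the command instead of running it. Every flag and path is copied from the steps above:

```shell
#!/usr/bin/env bash
# build_deploy_cmd is a hypothetical helper: it prints the deploy command
# from the reproduction steps, varying only the --compute-scale value.
build_deploy_cmd() {
    local compute_scale="$1"
    local env_dir=/usr/share/openstack-tripleo-heat-templates/environments
    local cmd=(
        openstack overcloud deploy --templates
        --control-scale 3 --compute-scale "$compute_scale"
        --neutron-network-type vxlan --neutron-tunnel-types vxlan
        --ntp-server clock.redhat.com --timeout 90
        -e "$env_dir/puppet-pacemaker.yaml"
        -e "$env_dir/storage-environment.yaml"
        -e "$env_dir/network-isolation.yaml"
        -e network-environment.yaml
        --ceph-storage-scale 1
    )
    # Join the array with single spaces and print it on one line.
    printf '%s\n' "${cmd[*]}"
}
```

Here build_deploy_cmd 1 prints the command from step 1 and build_deploy_cmd 2 prints the one from step 2, so the output can be reviewed before it is executed on the undercloud.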

The scale-up fails with: "UPDATE_FAILED NotFound_Remote: resources[0]: Software config with id a9c8e15c-edce-401f-a2d9-932bed21cfb5 not found"

Result:
2016-09-02 18:45:48 [ObjectStorageAllNodesDeployment]: UPDATE_IN_PROGRESS state changed
2016-09-02 18:45:48 [overcloud-CephStorageAllNodesDeployment-w4aac2qgjxyt]: UPDATE_IN_PROGRESS Stack UPDATE started
2016-09-02 18:45:49 [ControllerAllNodesDeployment]: UPDATE_IN_PROGRESS state changed
2016-09-02 18:45:49 [0]: UPDATE_IN_PROGRESS state changed
2016-09-02 18:45:49 [overcloud-ControllerAllNodesDeployment-x4zjnyh3qsxe]: UPDATE_IN_PROGRESS Stack UPDATE started
2016-09-02 18:45:50 [0]: UPDATE_FAILED NotFound_Remote: resources[0]: Software config with id a9c8e15c-edce-401f-a2d9-932bed21cfb5 not found
Traceback (most recent call last):

  File "/usr/lib/python2.7/site-packages/heat/common/context.py", line 424, in wrapped
    return func(self, ctx, *ar
2016-09-02 18:45:50 [2]: UPDATE_IN_PROGRESS state changed
2016-09-02 18:45:50 [2]: UPDATE_FAILED NotFound_Remote: resources[2]: Software config with id a9c8e15c-edce-401f-a2d9-932bed21cfb5 not found
Traceback (most recent call last):

  File "/usr/lib/python2.7/site-packages/heat/common/context.py", line 424, in wrapped
    return func(self, ctx, *ar
2016-09-02 18:45:50 [overcloud-CephStorageAllNodesDeployment-w4aac2qgjxyt]: UPDATE_FAILED NotFound_Remote: resources[0]: Software config with id a9c8e15c-edce-401f-a2d9-932bed21cfb5 not found
Traceback (most recent call last):

  File "/usr/lib/python2.7/site-packages/heat/common/context.py", line 424, in wrapped
    return func(self, ctx, *ar
2016-09-02 18:45:50 [overcloud-ControllerAllNodesDeployment-x4zjnyh3qsxe]: UPDATE_FAILED NotFound_Remote: resources[2]: Software config with id a9c8e15c-edce-401f-a2d9-932bed21cfb5 not found
Traceback (most recent call last):

  File "/usr/lib/python2.7/site-packages/heat/common/context.py", line 424, in wrapped
    return func(self, ctx, *ar
2016-09-02 18:45:50 [ComputeAllNodesDeployment]: UPDATE_IN_PROGRESS state changed
2016-09-02 18:45:51 [overcloud-ComputeAllNodesDeployment-l4sohum2k4aq]: UPDATE_IN_PROGRESS Stack UPDATE started
2016-09-02 18:45:51 [BlockStorageAllNodesDeployment]: UPDATE_IN_PROGRESS state changed
2016-09-02 18:45:51 [1]: CREATE_IN_PROGRESS state changed
2016-09-02 18:45:52 [CephStorageAllNodesDeployment]: UPDATE_FAILED resources.CephStorageAllNodesDeployment: NotFound_Remote: resources[0]: Software config with id a9c8e15c-edce-401f-a2d9-932bed21cfb5 not found
Traceback (most recent call last):

  File "/usr/lib/python2.7/site-packages/heat/common/context.py", line 424,
2016-09-02 18:45:52 [ObjectStorageAllNodesDeployment]: UPDATE_FAILED UPDATE aborted
2016-09-02 18:45:52 [ControllerAllNodesDeployment]: UPDATE_FAILED UPDATE aborted
2016-09-02 18:45:52 [ComputeAllNodesDeployment]: UPDATE_FAILED UPDATE aborted
2016-09-02 18:45:52 [BlockStorageAllNodesDeployment]: UPDATE_FAILED UPDATE aborted
2016-09-02 18:45:52 [overcloud]: UPDATE_FAILED resources.CephStorageAllNodesDeployment: NotFound_Remote: resources[0]: Software config with id a9c8e15c-edce-401f-a2d9-932bed21cfb5 not found
Traceback (most recent call last):

  File "/usr/lib/python2.7/site-packages/heat/common/context.py", line 424,
2016-09-02 18:45:53 [0]: UPDATE_IN_PROGRESS state changed
2016-09-02 18:45:53 [0]: UPDATE_FAILED NotFound_Remote: resources[0]: Software config with id a9c8e15c-edce-401f-a2d9-932bed21cfb5 not found
Traceback (most recent call last):

  File "/usr/lib/python2.7/site-packages/heat/common/context.py", line 424, in wrapped
    return func(self, ctx, *ar
Stack overcloud UPDATE_FAILED
Heat Stack update failed.
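Every failure in the output above names the same software config ID. A small sketch for pulling that out of a saved copy of the output, assuming the event lines keep the "DATE TIME [resource]: STATUS ..." shape shown above. The function names failed_config_ids and failed_resources and the file name deploy.log are hypothetical:

```shell
#!/usr/bin/env bash
# failed_config_ids: read stack event output on stdin and print each distinct
# software config ID that is reported as "not found".
failed_config_ids() {
    grep -o 'Software config with id [0-9a-f-]* not found' \
        | awk '{print $5}' \
        | LC_ALL=C sort -u
}

# failed_resources: read the same output and print the bracketed resource
# name of every UPDATE_FAILED event line, de-duplicated.
failed_resources() {
    sed -n 's/^[0-9-]* [0-9:]* \[\([^]]*\)]: UPDATE_FAILED.*/\1/p' \
        | LC_ALL=C sort -u
}
```

For example, failed_config_ids < deploy.log lists the missing config IDs, and failed_resources < deploy.log lists the resources and nested stacks that reported UPDATE_FAILED.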


Expected result:
Successful scale of computes.

Comment 2 Alexander Chuzhoy 2016-09-02 22:02:06 UTC
Created attachment 1197322
heat logs from the undercloud

Comment 3 James Slagle 2016-09-13 20:15:55 UTC
Steve, could you have a look at this one and do some initial triage (or assign it to someone who can)? I've not seen this error before, but it seems like it might be a Heat bug rather than a problem with tripleo-heat-templates.

Comment 4 Steve Baker 2016-09-13 21:14:36 UTC
I'll take a look

Comment 5 Steve Baker 2016-09-13 21:28:14 UTC
This sounds a lot like something that was fixed in this change: https://review.openstack.org/#/c/360122/

Before I investigate this further, can you confirm that your heat has this fix?

Comment 6 Steve Baker 2016-09-13 21:30:10 UTC
Attaching the upstream bug, which looks quite similar

Comment 7 Alexander Chuzhoy 2016-09-15 17:30:00 UTC
Steve,
the fix isn't there.

Comment 8 James Slagle 2016-09-20 16:52:53 UTC
Steve, can we get the fix downstream? It may be picked up in the next import; otherwise a patch will need to be proposed.

Comment 9 Steve Baker 2016-09-20 23:25:41 UTC
The current OSP-10 puddle includes the following package, which contains the fixes for this:

  openstack-heat-engine-7.0.0-0.20160907124808.21e49dc.el7ost.noarch

If you still see this issue with this or a later version, we'll need to investigate further.
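A rough way to check whether an installed build is at least as new as the one named above, as a sketch only. It compares the 14-digit snapshot timestamp embedded in the package name, which is a heuristic that assumes both builds come from the same 7.0.0-0.<timestamp> series; it is not how rpm itself orders versions. The names FIXED_NVR, snapshot_ts and has_fix are hypothetical:

```shell
#!/usr/bin/env bash
# Build named in this comment as containing the fix.
FIXED_NVR='openstack-heat-engine-7.0.0-0.20160907124808.21e49dc.el7ost.noarch'

# snapshot_ts: print the first 14-digit snapshot timestamp found in a
# package name, or nothing if there is none.
snapshot_ts() {
    printf '%s\n' "$1" | grep -oE '[0-9]{14}' | head -n 1
}

# has_fix: succeed if the given package's snapshot timestamp is at least
# as new as the one in FIXED_NVR (heuristic, see the assumptions above).
has_fix() {
    local have want
    have="$(snapshot_ts "$1")"
    want="$(snapshot_ts "$FIXED_NVR")"
    [ -n "$have" ] && [ "$have" -ge "$want" ]
}
```

On the undercloud this could be used as has_fix "$(rpm -q openstack-heat-engine)" && echo "fix should be present", with the result treated as a hint rather than proof.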

Comment 11 Alexander Chuzhoy 2016-09-27 20:15:51 UTC
Verified:

Environment:
openstack-heat-engine-7.0.0-0.20160907124808.21e49dc.el7ost.noarch
openstack-heat-common-7.0.0-0.20160907124808.21e49dc.el7ost.noarch

I was able to scale the deployment by one additional compute node, live-migrate a running instance to it, and launch an instance directly on the new node.

Comment 15 errata-xmlrpc 2016-12-14 15:56:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-2948.html

