Bug 1313674 - Failing to update the existing overcloud adding more compute nodes.
Summary: Failing to update the existing overcloud adding more compute nodes.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 7.0 (Kilo)
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: 10.0 (Newton)
Assignee: Jiri Stransky
QA Contact: Omri Hochman
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-03-02 08:06 UTC by Francesco Vollero
Modified: 2023-09-14 03:18 UTC (History)

Fixed In Version: openstack-tripleo-heat-templates-5.0.0-0.5.0rc3.el7ost
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-12-14 15:25:11 UTC
Target Upstream Version:
Embargoed:


Attachments
heat-engine (14.65 KB, text/plain)
2016-03-02 08:06 UTC, Francesco Vollero
nova-conductor (6.80 KB, text/plain)
2016-03-02 08:08 UTC, Francesco Vollero
nova-api (18.69 KB, text/plain)
2016-03-02 08:08 UTC, Francesco Vollero
grep of a failing request (2.76 KB, text/plain)
2016-03-02 08:12 UTC, Francesco Vollero


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2016:2948 0 normal SHIPPED_LIVE Red Hat OpenStack Platform 10 enhancement update 2016-12-14 19:55:27 UTC

Description Francesco Vollero 2016-03-02 08:06:50 UTC
Created attachment 1132164 [details]
heat-engine

Description of problem:

When I try to update an existing overcloud, I hit the most hated error in Heat: UPDATE_FAILED.

I am running these deployments on physical hardware, with the director on a KVM instance that has 4 vcores and 32 GB of RAM, and with engine_num_workers set to 8 as suggested.

If the deployment is executed from scratch, it succeeds without any issues, but if it is an update, it always fails (in all 15 of my attempts so far).

I collected the logs from heat-engine, nova-conductor and neutron, and the problem seems to be 'related' to Neutron, but that is just speculation.


Version-Release number of selected component (if applicable):
7.3

How reproducible:
Run a deployment to update the number of compute nodes.

Steps to Reproduce:
Follow the same steps as for creating a normal deployment.

Actual results:
UPDATE_FAILED
Stack failed with status: resources.Compute: ResourceInError: resources[2].resources.NovaCompute: Went to status ERROR due to "Message: Unknown, Code: Unknown"
ERROR: openstack Heat Stack update failed.
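
The status line above nests several identifiers (resource group, error class, member index, inner resource). As a minimal sketch for triage, the following Python splits it into those parts. The regular expression and the `parse_failure` helper are hypothetical and written only against the message in this report; they are not part of Heat or any OpenStack client, and other Heat errors may be shaped differently.

```python
import re

# Status line copied verbatim from the "Actual results" section of this report.
MESSAGE = (
    'Stack failed with status: resources.Compute: ResourceInError: '
    'resources[2].resources.NovaCompute: '
    'Went to status ERROR due to "Message: Unknown, Code: Unknown"'
)

# Hypothetical pattern: it matches this report's message, not every Heat error.
PATTERN = re.compile(
    r'resources\.(?P<group>\w+): (?P<error>\w+): '
    r'resources\[(?P<index>\d+)\]\.resources\.(?P<resource>\w+): '
    r'(?P<reason>.*)'
)


def parse_failure(message):
    """Return the failing group, error class, member index, resource and reason.

    Returns None when the message does not match the assumed shape.
    """
    match = PATTERN.search(message)
    if match is None:
        return None
    fields = match.groupdict()
    fields['index'] = int(fields['index'])
    return fields


failure = parse_failure(MESSAGE)
print(failure)
```

Read this way, the failure points at the `NovaCompute` resource of member 2 in the `Compute` group, which narrows where to look in the nova logs attached to this bug.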


Expected results:
Deployment succeeded

Additional info:

Comment 2 Francesco Vollero 2016-03-02 08:08:02 UTC
Created attachment 1132165 [details]
nova-conductor

Comment 3 Francesco Vollero 2016-03-02 08:08:57 UTC
Created attachment 1132166 [details]
nova-api

Comment 4 Francesco Vollero 2016-03-02 08:12:37 UTC
Created attachment 1132167 [details]
grep of a failing request

Comment 5 Ryan Brown 2016-03-02 19:18:42 UTC
Extended nova logs: http://chunk.io/f/c1078acec8ee4286995813db6e020481

Comment 6 Ryan Brown 2016-03-02 19:24:55 UTC
It looks to me like a source of this problem is in Neutron - sometimes a 404 can indicate not enough floating IPs. The next step would be to get the neutron logs to see what's causing the 404.

Comment 7 Ryan Brown 2016-03-03 16:23:56 UTC
Neutron logs http://chunk.io/f/7da633892d50410aa17dec6963afbc41

Comment 8 Francesco Vollero 2016-03-04 10:57:46 UTC
The floating IP range contains 20 addresses, while the number of nodes is at most 9.
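
The capacity check described here can be sketched in a few lines of Python. The start and end addresses below are hypothetical documentation addresses chosen to give a range of 20, since the report does not state the real range; only the two counts (20 and 9) come from the comment, and `range_size` is an illustrative helper.

```python
import ipaddress


def range_size(start, end):
    """Count the addresses in an inclusive IPv4 range."""
    first = int(ipaddress.IPv4Address(start))
    last = int(ipaddress.IPv4Address(end))
    if last < first:
        raise ValueError("end address precedes start address")
    return last - first + 1


# Hypothetical range of 20 addresses; the real range is not given in the report.
pool = range_size("192.0.2.10", "192.0.2.29")
nodes = 9  # maximum node count stated in the comment

print(pool, nodes, pool >= nodes)
```

Under these assumptions the pool is more than twice the node count, which is why address exhaustion looks like an unlikely explanation for the 404.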

Comment 9 Mike Burns 2016-04-07 21:14:44 UTC
This bug did not make the OSP 8.0 release.  It is being deferred to OSP 10.

Comment 10 Jaromir Coufal 2016-10-11 13:19:45 UTC
Francesco, are you still experiencing similar issues? This seems like a one-off.

Comment 11 James Slagle 2016-10-14 16:34:08 UTC
setting TestOnly

Comment 13 Marius Cornea 2016-11-22 13:47:08 UTC
Moving this to verified since scaling out with an additional compute node completes OK on OSP10.

Comment 16 errata-xmlrpc 2016-12-14 15:25:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-2948.html

Comment 17 Red Hat Bugzilla 2023-09-14 03:18:47 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days

