|Summary:||Failing to update the existing overcloud adding more compute nodes.|
|Product:||Red Hat OpenStack||Reporter:||Francesco Vollero <fvollero>|
|Component:||openstack-tripleo-heat-templates||Assignee:||Jiri Stransky <jstransk>|
|Status:||CLOSED ERRATA||QA Contact:||Omri Hochman <ohochman>|
|Version:||7.0 (Kilo)||CC:||astellwa, dbecker, fvollero, jcoufal, jraju, jslagle, mburns, mcornea, mgandolf, morazi, rhel-osp-director-maint, riontel, rybrown|
|Target Milestone:||rc||Keywords:||TestOnly, Triaged|
|Target Release:||10.0 (Newton)||Flags:||jcoufal:
|Fixed In Version:||openstack-tripleo-heat-templates-5.0.0-0.5.0rc3.el7ost||Doc Type:||Bug Fix|
|Doc Text:||Story Points:||---|
|Last Closed:||2016-12-14 15:25:11 UTC||Type:||Bug|
|oVirt Team:||---||RHEL 7.3 requirements from Atomic Host:|
Description Francesco Vollero 2016-03-02 08:06:50 UTC
Created attachment 1132164 [details] heat-engine

Description of problem:
When I try to update an existing overcloud, I hit the most hated error in Heat: UPDATE_FAILED. I am running these deployments on physical hardware, with a director that has 4 vcores (running on a KVM instance) and 32 GB of RAM, and with engine_num_workers set to 8 as suggested. A deployment executed from scratch succeeds without any issues, but an update always fails (in all 15 of my attempts). I collected the logs from heat-engine, nova-conductor, and neutron; the problem seems to be related to neutron, but that is just speculation.

Version-Release number of selected component (if applicable):
7.3

How reproducible:
Run a deployment to update the number of compute nodes.

Steps to Reproduce:
The same as for creating a normal deployment.

Actual results:
UPDATE_FAILED
Stack failed with status: resources.Compute: ResourceInError: resources.resources.NovaCompute: Went to status ERROR due to "Message: Unknown, Code: Unknown"
ERROR: openstack Heat Stack update failed.

Expected results:
Deployment succeeds.

Additional info:
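The status string in the actual results packs several pieces of information into one line: the outer resource group, the exception type, the nested resource, and the reported reason. As a rough aid for triage, the following Python sketch splits that string into its parts. The regular expression and the function name `parse_stack_status` are assumptions made for illustration, fitted only to the single message quoted in this report; other Heat failures may be worded differently.

```python
import re
from typing import Dict, Optional

# Assumed pattern, derived only from the one status string quoted in this report.
STATUS_PATTERN = re.compile(
    r'^(?P<group>[\w.]+): '
    r'(?P<exception>\w+): '
    r'(?P<resource>[\w.]+): '
    r'Went to status (?P<status>\w+) due to "(?P<reason>.*)"$'
)


def parse_stack_status(message: str) -> Optional[Dict[str, str]]:
    """Split a status string into named parts, or return None if it does not match."""
    match = STATUS_PATTERN.match(message.strip())
    return match.groupdict() if match else None


# The status string exactly as reported in the actual results.
reported = (
    'resources.Compute: ResourceInError: resources.resources.NovaCompute: '
    'Went to status ERROR due to "Message: Unknown, Code: Unknown"'
)

parts = parse_stack_status(reported)
if parts is not None:
    for name, value in parts.items():
        print(f"{name}: {value}")
```

Here the nested resource is `resources.resources.NovaCompute` and the reason carries no detail ("Message: Unknown, Code: Unknown"), which is why the nova and neutron logs are the next place to look.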
Comment 2 Francesco Vollero 2016-03-02 08:08:02 UTC
Created attachment 1132165 [details] nova-conductor
Comment 4 Francesco Vollero 2016-03-02 08:12:37 UTC
Created attachment 1132167 [details] grep of a failing request
Comment 5 Ryan Brown 2016-03-02 19:18:42 UTC
Extended nova logs: http://chunk.io/f/c1078acec8ee4286995813db6e020481
Comment 6 Ryan Brown 2016-03-02 19:24:55 UTC
It looks to me like a source of this problem is in Neutron - sometimes a 404 can indicate not enough floating IPs. The next step would be to get the neutron logs to see what's causing the 404.
Comment 7 Ryan Brown 2016-03-03 16:23:56 UTC
Neutron logs http://chunk.io/f/7da633892d50410aa17dec6963afbc41
Comment 8 Francesco Vollero 2016-03-04 10:57:46 UTC
The floating IP range contains 20 addresses, while the number of nodes is at most 9.
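The capacity check behind this statement can be sketched in Python with the standard `ipaddress` module. The start and end addresses below are placeholders from a documentation range, not the values used in this deployment; only the counts (20 addresses, at most 9 nodes) come from this report, and the function names are hypothetical.

```python
import ipaddress


def pool_size(start: str, end: str) -> int:
    """Count the addresses in the inclusive range start..end."""
    first = ipaddress.ip_address(start)
    last = ipaddress.ip_address(end)
    if last < first:
        raise ValueError("range end precedes range start")
    return int(last) - int(first) + 1


def has_capacity(start: str, end: str, nodes: int) -> bool:
    """Return True if the range holds at least one address per node."""
    return pool_size(start, end) >= nodes


# Placeholder range of 20 addresses (assumption, not the real pool).
RANGE_START = "192.0.2.10"
RANGE_END = "192.0.2.29"
MAX_NODES = 9  # maximum node count stated in this report

print(pool_size(RANGE_START, RANGE_END))
print(has_capacity(RANGE_START, RANGE_END, MAX_NODES))
```

With 20 addresses and at most 9 nodes, the check passes, so simple exhaustion of the range would not by itself explain the 404.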
Comment 9 Mike Burns 2016-04-07 21:14:44 UTC
This bug did not make the OSP 8.0 release. It is being deferred to OSP 10.
Comment 10 Jaromir Coufal 2016-10-11 13:19:45 UTC
Francesco, are you still experiencing similar issues? This seems like a one-off.
Comment 11 James Slagle 2016-10-14 16:34:08 UTC
Comment 13 Marius Cornea 2016-11-22 13:47:08 UTC
Moving this to verified since scaling out with an additional compute node completes OK on OSP 10.
Comment 16 errata-xmlrpc 2016-12-14 15:25:11 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHEA-2016-2948.html