Created attachment 1262050
conductor CPU usage during deploy

Description of problem:
Currently Ironic Conductor is deployed as a single process. It is routinely seen pegging a core during overcloud deploys, leading to slow deployments. workers_pool_size = 100 is set, but that doesn't affect the process count.

Version-Release number of selected component (if applicable):
RHOP 10 2017-03-03.1 puddle

How reproducible:
Happens with every OC deploy attempt

Steps to Reproduce:
1. Deploy OC with defaults on UC

Actual results:
ironic-conductor is limited to one process and pegs a core

Expected results:
TripleO should deploy with multiple processes of ironic-conductor, or we should look at what is causing this high utilization and resolve the bottleneck there.

Additional info:
This is a 51 node deploy (total 50 nodes in ironic).
We've just merged a patch into Newton upstream that deals with Ironic using increasing amounts of CPU over time. It may be worth trying out, as the same fix could be what is required here: https://review.openstack.org/#/c/451459/
I can confirm that this behavior isn't being seen on OSP 11 GA.
This issue is no longer seen in OSP 11. We can verify this based on Sai's comments in https://bugzilla.redhat.com/show_bug.cgi?id=1431270#c4 and on the real-time performance data Sai also shared with me.
The problem fixed for this bug was an issue in ironic that caused the conductor's CPU usage to increase over time. The fix is present in openstack-ironic-common-6.2.3-1.el7ost.noarch.rpm. The original report also mentions workers_pool_size; this isn't relevant to the increasing CPU usage over time.
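For anyone wondering why a pool size of 100 does not change the process count, here is a rough sketch. It uses Python's standard-library ThreadPoolExecutor as a stand-in and is not Ironic's actual executor; observe_pool and POOL_SIZE are hypothetical names for illustration only. The assumption is simply that a worker pool lives inside the one conductor process, so raising its size adds workers but not processes.

```python
import os
import threading
from concurrent.futures import ThreadPoolExecutor

# Illustrative value only; it mirrors workers_pool_size = 100 from the report.
POOL_SIZE = 100


def observe_pool(pool_size):
    """Run one task per worker and record which process and thread ran it.

    Hypothetical helper: a stdlib thread pool stands in for the conductor's
    worker pool, it is not Ironic code.
    """
    # The barrier holds every task until all workers have started, so the
    # pool is forced to spin up pool_size threads at once.
    barrier = threading.Barrier(pool_size, timeout=30)

    def task(_):
        barrier.wait()
        return os.getpid(), threading.get_ident()

    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        results = list(pool.map(task, range(pool_size)))

    pids = {pid for pid, _ in results}
    thread_ids = {tid for _, tid in results}
    return pids, thread_ids


if __name__ == "__main__":
    pids, thread_ids = observe_pool(POOL_SIZE)
    print("distinct processes:", len(pids))
    print("distinct worker threads:", len(thread_ids))
```

Every worker reports the same PID, which is consistent with what the report describes: a larger pool gives more concurrent workers inside one process, not more ironic-conductor processes.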
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:1592