Bug 1431270 - Ironic Conductor is pegged during OC Deploy
Summary: Ironic Conductor is pegged during OC Deploy
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-ironic
Version: 10.0 (Newton)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: z3
Target Release: 10.0 (Newton)
Assignee: Lucas Alvares Gomes
QA Contact: mlammon
URL:
Whiteboard: scale_lab
Depends On:
Blocks:
Reported: 2017-03-10 19:52 UTC by Sai Sindhur Malleni
Modified: 2017-06-28 15:37 UTC (History)

Fixed In Version: openstack-ironic-6.2.3-1.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-06-28 15:37:57 UTC
Target Upstream Version:
Embargoed:


Attachments
conductor CPU usage during deploy. (292.70 KB, image/png)
2017-03-10 19:52 UTC, Sai Sindhur Malleni


Links
System ID Private Priority Status Summary Last Updated
Launchpad 1672457 0 None None None 2017-03-17 13:40:21 UTC
OpenStack gerrit 451459 0 None MERGED Copy and append to static lists 2020-07-27 19:19:31 UTC
Red Hat Product Errata RHBA-2017:1592 0 normal SHIPPED_LIVE openstack-ironic bug fix advisory 2017-06-28 19:13:53 UTC

Description Sai Sindhur Malleni 2017-03-10 19:52:04 UTC
Created attachment 1262050
conductor CPU usage during deploy.

Description of problem:
Currently Ironic Conductor is deployed as a single process. It is routinely seen pegging a core during overcloud deploys, leading to slow deployments.
workers_pool_size = 100 is set, but that doesn't affect the process count.
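
As a rough illustration of why a larger pool does not change the process count: a worker pool lives inside one process, however many workers it holds. The sketch below uses Python's standard ThreadPoolExecutor purely as a stand-in for the conductor's worker pool (an assumption, not the conductor's actual pool implementation), and the helper name pids_seen_by_pool is hypothetical.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def pids_seen_by_pool(pool_size, tasks):
    """Run `tasks` trivial jobs on a pool and return the set of PIDs that ran them."""
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        # Every job simply reports the PID of the process it is running in.
        return set(pool.map(lambda _: os.getpid(), range(tasks)))


if __name__ == "__main__":
    pids = pids_seen_by_pool(pool_size=100, tasks=1000)
    # All workers belong to the same process, so only one PID shows up.
    print(len(pids), pids == {os.getpid()})
```

Raising pool_size allows more jobs to be in flight at once, but every job still reports the same PID.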

Version-Release number of selected component (if applicable):
RHOP 10 2017-03-03.1 puddle

How reproducible:
Happens with every OC deploy attempt

Steps to Reproduce:
1. Deploy OC with defaults on UC

Actual results:
ironic-conductor is limited to one process and pegs a core

Expected results:
TripleO should deploy with multiple ironic-conductor processes, or we should look at what is causing this high utilization and resolve the bottleneck there.

Additional info:

Comment 1 Sai Sindhur Malleni 2017-03-10 19:54:07 UTC
This is a 51 node deploy (total 50 nodes in ironic).

Comment 2 Derek Higgins 2017-03-30 15:24:56 UTC
We've just merged a patch into Newton upstream that deals with Ironic using increasing amounts of CPU over time. It may be worth trying out, as the same fix could be what is required here.

https://review.openstack.org/#/c/451459/

Comment 4 Sai Sindhur Malleni 2017-06-12 14:48:58 UTC
I can confirm that this behavior isn't being seen on OSP 11 GA.

Comment 5 mlammon 2017-06-12 14:55:17 UTC
This issue is no longer seen in OSP 11. Based on Sai's comment in https://bugzilla.redhat.com/show_bug.cgi?id=1431270#c4 and the real-time performance data Sai also shared with me, we can verify this.

Comment 6 Derek Higgins 2017-06-12 16:24:34 UTC
The problem fixed for this bug was an issue in ironic that caused the conductor's CPU usage to increase over time. The fix is present in openstack-ironic-common-6.2.3-1.el7ost.noarch.rpm.

The original report also mentions workers_pool_size; this isn't relevant to the increasing CPU usage over time.
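
For readers unfamiliar with this class of bug, here is a minimal sketch based only on the title of the merged patch ("Copy and append to static lists"). It is not the actual ironic code: BASE_STEPS, steps_buggy and steps_fixed are hypothetical names. The idea is that appending to a shared ("static") list in place makes it grow on every call, so anything that walks the list gets slower the longer the process runs. Copying before appending keeps the shared list at a constant size.

```python
# Hypothetical module-level ("static") list shared by every caller.
BASE_STEPS = ["power_off", "deploy"]


def steps_buggy(extra):
    """Append to the shared list in place: BASE_STEPS grows on every call."""
    steps = BASE_STEPS  # alias of the shared list, not a copy
    steps.append(extra)
    return steps


def steps_fixed(extra):
    """Copy the shared list first, then append: BASE_STEPS never changes."""
    steps = list(BASE_STEPS)  # shallow copy
    steps.append(extra)
    return steps
```

With the buggy variant, a long-running process that calls it once per periodic task ends up iterating over an ever longer list, which would show up as CPU usage that increases over time.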

Comment 8 errata-xmlrpc 2017-06-28 15:37:57 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1592

