Bug 1393450 - Ironic conductor consumes all available memory on undercloud causing it to go into swap
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-ironic
Version: 10.0 (Newton)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 10.0 (Newton)
Assignee: RHOS Maint
QA Contact: mlammon
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-11-09 15:10 UTC by Sai Sindhur Malleni
Modified: 2017-04-26 09:04 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-04-26 09:04:25 UTC
Target Upstream Version:


Attachments
Ironic and Mistral Logs (1.80 MB, application/x-gzip)
2016-11-09 15:12 UTC, Sai Sindhur Malleni
Ironic-conductor memory usage (45.48 KB, image/png)
2016-11-15 15:39 UTC, Sai Sindhur Malleni
Undercloud memory swapping (33.55 KB, image/png)
2016-11-15 15:40 UTC, Sai Sindhur Malleni

Description Sai Sindhur Malleni 2016-11-09 15:10:12 UTC
Description of problem: Deployed the latest build that passed phase 1 (2016-11-04.2) and ran some Rally benchmarks. Everything seemed fine, including undercloud resource utilization. However, with the undercloud and overcloud idle, ironic-conductor abruptly grew to about 93 GB of memory, consuming all available memory and causing the undercloud to swap. There is nothing notable in the ironic logs at that time (16:00 UTC on 2016-11-08), except that ironic itself also complains about having no free memory, as can be seen here: http://pastebin.test.redhat.com/428458.

Below is the link to the Grafana dashboard showing the sudden ironic-conductor memory spike:
http://norton.perf.lab.eng.rdu.redhat.com:3000/dashboard/snapshot/05rRcnI3560IH40bpBUSwkFG8OMkN9Dc

The entire set of dashboards showing undercloud resource utilization and OpenStack process resource utilization can be found at:

http://norton.perf.lab.eng.rdu.redhat.com:3000/dashboard/db/openstack-general-system-performance?var-Cloud=sai-latest&var-NodeType=*&var-Node=undercloud&var-Interface=interface-br-ctlplane&var-Disk=disk-sda&var-cpus0=All&var-cpus00=All&from=1478613866952&to=1478667866000

Version-Release number of selected component (if applicable):
RHOP 10 Build 2016-11-04.2

How reproducible: Happened once; memory usage stayed high afterwards.


Steps to Reproduce:
1. Deploy undercloud
2. Deploy overcloud
3. Run Rally benchmarks, then leave the undercloud and overcloud idle

Actual results: There's a sudden spike in ironic-conductor memory.


Expected results: ironic-conductor memory usage should stay stable on an idle undercloud and should not push it into swap.


Additional info: Ironic and mistral logs attached

Comment 1 Sai Sindhur Malleni 2016-11-09 15:12:49 UTC
Created attachment 1218971
Ironic and Mistral Logs

Comment 2 Sai Sindhur Malleni 2016-11-14 15:35:39 UTC
FWIW, restarting ironic-conductor puts things back into a sane state.

Comment 3 Sai Sindhur Malleni 2016-11-15 15:39:36 UTC
Created attachment 1220867
Ironic-conductor memory usage

Comment 4 Sai Sindhur Malleni 2016-11-15 15:40:00 UTC
Created attachment 1220868
Undercloud memory swapping

Comment 5 Dmitry Tantsur 2017-04-25 09:58:39 UTC
Resetting assignees, as these people no longer work in our team.

Sai, do you still experience this issue?

Comment 6 Sai Sindhur Malleni 2017-04-25 15:42:21 UTC
Dmitry,

I have only seen this that one time. I haven't seen or reproduced the issue since.

Comment 7 Dmitry Tantsur 2017-04-26 09:04:25 UTC
Thanks. Please feel free to reopen if you see it again. Sorry for not reacting sooner.
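
If the spike does recur, a timestamped record of ironic-conductor's resident memory would provide the data this report lacked. Below is a minimal sketch of such a sampler, assuming a Linux undercloud where /proc is readable. The function names, the 60-second interval, and the 8 GiB threshold are illustrative assumptions and are not part of Ironic or of this report.

```python
import os
import time


def parse_vmrss_kb(status_text):
    """Return the VmRSS value in kB from the text of /proc/<pid>/status, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # Line format: "VmRSS:     123456 kB"
            return int(line.split()[1])
    return None  # kernel threads have no VmRSS line


def find_pids(name_fragment):
    """Return PIDs whose command line contains name_fragment."""
    pids = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/cmdline", "rb") as f:
                cmdline = f.read().replace(b"\0", b" ").decode(errors="replace")
        except OSError:
            continue  # process exited or is not readable
        if name_fragment in cmdline:
            pids.append(int(entry))
    return pids


def sample_rss_kb(name_fragment):
    """Return {pid: rss_kb} for every process matching name_fragment."""
    samples = {}
    for pid in find_pids(name_fragment):
        try:
            with open(f"/proc/{pid}/status") as f:
                rss = parse_vmrss_kb(f.read())
        except OSError:
            continue
        if rss is not None:
            samples[pid] = rss
    return samples


def monitor(name_fragment, limit_kb, interval_s, iterations):
    """Print one timestamped line per matching process, every interval_s seconds."""
    for _ in range(iterations):
        stamp = time.strftime("%Y-%m-%dT%H:%M:%S")
        for pid, rss in sample_rss_kb(name_fragment).items():
            flag = " OVER_LIMIT" if rss > limit_kb else ""
            print(f"{stamp} pid={pid} rss_kb={rss}{flag}", flush=True)
        time.sleep(interval_s)
```

For example, `monitor("ironic-conductor", 8 * 1024 * 1024, 60, 1440)` would log one sample per minute for a day and mark any sample above the assumed 8 GiB threshold. Redirecting the output to a file gives a timeline that can be attached when reopening the bug.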

