Bug 1393450

Summary: Ironic conductor consumes all available memory on undercloud causing it to go into swap
Product: Red Hat OpenStack
Reporter: Sai Sindhur Malleni <smalleni>
Component: openstack-ironic
Assignee: RHOS Maint <rhos-maint>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: mlammon
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 10.0 (Newton)
CC: dtantsur, mburns, rhel-osp-director-maint, smalleni, srevivo
Target Milestone: ---
Keywords: Triaged, ZStream
Target Release: 10.0 (Newton)
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-04-26 09:04:25 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Attachments:
Ironic and Mistral Logs (flags: none)
Ironic-conductor memory usage (flags: none)
Undercloud memory swapping (flags: none)

Description Sai Sindhur Malleni 2016-11-09 15:10:12 UTC
Description of problem: Deployed the latest build that passed phase 1 (2016-11-04.2) and ran some Rally benchmarks. Everything seemed fine, including undercloud resource utilization. However, with the undercloud and overcloud idle, ironic-conductor abruptly grew to about 93 GB of memory, consuming all available memory and causing the undercloud to swap. The ironic logs show nothing notable at that time (16:00 UTC on 2016-11-08) other than complaints about no free memory, as can be seen at http://pastebin.test.redhat.com/428458.

Below is the link to the Grafana dashboard snapshot showing the sudden ironic-conductor memory spike:
http://norton.perf.lab.eng.rdu.redhat.com:3000/dashboard/snapshot/05rRcnI3560IH40bpBUSwkFG8OMkN9Dc

The entire set of dashboards showing undercloud resource utilization and OpenStack process resource utilization can be found at:

http://norton.perf.lab.eng.rdu.redhat.com:3000/dashboard/db/openstack-general-system-performance?var-Cloud=sai-latest&var-NodeType=*&var-Node=undercloud&var-Interface=interface-br-ctlplane&var-Disk=disk-sda&var-cpus0=All&var-cpus00=All&from=1478613866952&to=1478667866000

Version-Release number of selected component (if applicable):
RHOSP 10 Build 2016-11-04.2

How reproducible: Happened once; ironic-conductor then remained in that state.


Steps to Reproduce:
1. Deploy undercloud
2. Deploy overcloud

Actual results: There's a sudden spike in ironic-conductor memory.


Expected results: This sort of sudden memory surge leading to undercloud swapping shouldn't happen.


Additional info: Ironic and Mistral logs are attached.
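
To catch a recurrence of this kind of growth, the resident memory of the conductor process could be sampled from /proc. The following is only a minimal sketch, assuming a Linux undercloud. The function names and the limit are hypothetical and are not part of ironic or of this report.

```python
import re


def parse_vmrss_kib(status_text):
    """Return VmRSS in KiB from the text of /proc/<pid>/status, or None if absent."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def exceeds_limit(status_text, limit_gib):
    """True if the resident set size is above limit_gib (hypothetical threshold)."""
    rss_kib = parse_vmrss_kib(status_text)
    return rss_kib is not None and rss_kib > limit_gib * 1024 * 1024


def read_status(pid):
    """Read /proc/<pid>/status for a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as handle:
        return handle.read()
```

Calling exceeds_limit(read_status(pid), limit_gib) periodically with the ironic-conductor PID would flag the process once its resident memory passes the chosen limit.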

Comment 1 Sai Sindhur Malleni 2016-11-09 15:12:49 UTC
Created attachment 1218971
Ironic and Mistral Logs

Comment 2 Sai Sindhur Malleni 2016-11-14 15:35:39 UTC
FWIW, restarting ironic-conductor puts things back into a sane state.

Comment 3 Sai Sindhur Malleni 2016-11-15 15:39:36 UTC
Created attachment 1220867
Ironic-conductor memory usage

Comment 4 Sai Sindhur Malleni 2016-11-15 15:40:00 UTC
Created attachment 1220868
Undercloud memory swapping

Comment 5 Dmitry Tantsur 2017-04-25 09:58:39 UTC
Resetting assignees, as these people no longer work in our team.

Sai, do you still experience this issue?

Comment 6 Sai Sindhur Malleni 2017-04-25 15:42:21 UTC
Dmitry,

I have only seen this that one time. I haven't seen or reproduced the issue since.

Comment 7 Dmitry Tantsur 2017-04-26 09:04:25 UTC
Thanks. Please feel free to reopen if you see it again. Sorry for not reacting sooner.