Bug 1414479
Summary: | [scale] - vdsm minor memory leak - 3 MB in 12 hours (running on a host with 111 VMs) | ||
---|---|---|---|
Product: | [oVirt] vdsm | Reporter: | Eldad Marciano <emarcian> |
Component: | Core | Assignee: | Francesco Romani <fromani> |
Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | eberman |
Severity: | low | Docs Contact: | |
Priority: | unspecified | ||
Version: | 4.18.17 | CC: | bugs, emarcian, fromani, mperina, nsoffer, oourfali, tjelinek, ybronhei, ykaul |
Target Milestone: | ovirt-4.2.0 | Flags: | rule-engine: ovirt-4.2+ |
Target Release: | --- | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | | Doc Type: | If docs needed, set a value |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2017-04-24 15:38:19 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | Virt | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description (Eldad Marciano, 2017-01-18 15:46:32 UTC)
---

Comment 2 (Yaniv Kaul):

Is the rate correlated with the number of VMs? How much is VDSM consuming overall? Does restarting VDSM bring it back to the initial value?

---

(In reply to Yaniv Kaul from comment #2)

> Is the rate correlated with the number of VMs?

We need to test it to provide an answer. But once all the VMs were populated and running, VDSM started leaking; in other words, the leakage may come from VDSM monitoring.

> How much is VDSM consuming overall?

182 MB (the last sample from the test duration). After 3 days:

    -=>>ps -eo pid,rss | grep 15189 | awk '{print $2 / 1024}'
    194.59

VDSM resident memory utilization over the test period: min 79 MB, max 185 MB, avg 173 MB, last 182 MB.

> Does restarting VDSM bring it back to the initial value?

No, it produces roughly the same memory footprint as before the restart:

    -=>>ps -eo pid,rss | grep 15189 | awk '{print $2 / 1024}'
    194.59

After restart:

    -=>>ps -eo pid,rss,cmd | grep '/usr/bin/python2 /usr/share/vdsm/vdsm' | awk '{print $2 / 1024}'
    187.422

This brings us back to your first comment: the initial memory footprint is correlated with the number of VMs:

- no VMs = 79 MB
- 111 VMs = 187 MB

But that does not change the fact that VDSM has a leak.

---

Moving to Virt as it seems related to the number of VMs. Francesco, can you explore this one? Move back to Infra if you feel it is an infra issue.

---

Comment 5 (Tomas Jelinek):

Well, it is hard to see from this sample whether this is even a leak, especially considering that VDSM is using about 200 MB here; a 3 MB difference can be caused by lots of factors.

Could you please let it run for about a week and take a sample every 12 hours (more often would be better), to see whether it really grows the whole time or just goes up and down?

If it indeed grows, please provide all relevant logs so we can look at the issue.

---

(In reply to Tomas Jelinek from comment #5)

> Could you please let it run for about a week and take a sample every 12 hours (more often would be better)?

Sure, we can do so. A week seems like too much, though; we can't occupy the lab for that period. Let's start with a weekend, and if that's not enough we can move forward.

---

Comment 7 (Tomas Jelinek):

Well, not sure if one weekend will help, but let's try. Putting needinfo back.

---

Comment 8 (Eldad Marciano):

(In reply to Tomas Jelinek from comment #7)
> Well, not sure if one weekend will help, but let's try. Putting needinfo back.

Please set target release, severity, priority.

---

(In reply to Eldad Marciano from comment #8)
> Please set target release, severity, priority.

I cannot set any of these until I get some data from which I can assess whether it is a bug and how severe it is.

---

Comment 10 (Yaniv Kaul):

Please update the bug with the latest findings.

---

(In reply to Yaniv Kaul from comment #10)
> Please update the bug with the latest findings.

Currently it's targeted to 4.2, so we will hit it there. If you think it needs higher priority, let's coordinate it for 4.1.x.
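For reference, the `ps -eo pid,rss | grep <pid>` pipelines used above are fragile: the bare `grep 15189` can match unrelated PIDs or other columns. A minimal sketch of a sampler that could produce the 12-hour RSS series requested in comment #5; the `rss_mb` helper name, the `pgrep` pattern, and the log path are assumptions for illustration, not part of vdsm itself:

```shell
#!/bin/sh
# rss_mb: print a process's resident set size in MB, given its PID.
# `ps -o rss= -p PID` selects exactly that PID, avoiding the fragile
# `ps -e | grep PID` pattern shown in the comments above.
rss_mb() {
  ps -o rss= -p "$1" | awk '{printf "%.2f\n", $1 / 1024}'
}

# One timestamped sample line. On a real host the PID would come from
# the vdsm command line seen above, e.g. (hypothetical pattern):
#   pid=$(pgrep -of '/usr/bin/python2 /usr/share/vdsm/vdsm')
# Here we sample this shell's own PID so the sketch is self-contained.
echo "$(date -u +%FT%TZ) $(rss_mb $$) MB"

# To build the 12-hour series, the sample line could be run from cron,
# appending to a log (path is an assumption):
#   0 */12 * * * /usr/local/bin/sample-vdsm-rss >> /var/log/vdsm-rss.log
```

Plotting or diffing the resulting log over a week would show whether RSS grows monotonically (a leak) or merely oscillates, which is the question comment #5 raises.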