Bug 1158108 - [scale] VDSM leaks small amount of memory (~300KiB/h)
Summary: [scale] VDSM leaks small amount of memory (~300KiB/h)
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: oVirt
Classification: Retired
Component: vdsm
Version: 3.5
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.5.3
Assignee: Petr Horáček
QA Contact: Eldad Marciano
URL:
Whiteboard: network
Duplicates: 1220154 (view as bug list)
Depends On:
Blocks: 1230644
 
Reported: 2014-10-28 15:18 UTC by Daniel Helgenberger
Modified: 2016-02-10 19:38 UTC (History)
CC List: 26 users

Fixed In Version: vdsm-4.16.16-1.el6ev
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-06-15 08:38:39 UTC
oVirt Team: Network


Attachments (Terms of Use)
strace showing mprotect during what looks like the libnl socket calls (1.62 MB, text/plain)
2015-03-23 19:03 UTC, John Taylor
no flags Details
valgrind report for a reproducer script (839.74 KB, text/plain)
2015-03-27 17:31 UTC, Dan Kenigsberg
no flags Details
vdsm mem usage graph (195.87 KB, image/png)
2015-05-31 11:36 UTC, Eldad Marciano
no flags Details


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 39416 0 master MERGED vdsm: Fix memory leak in netlink/link.py Never
oVirt gerrit 40134 0 ovirt-3.5 MERGED vdsm: Fix memory leak in netlink/link.py Never
oVirt gerrit 40383 0 ovirt-3.5 MERGED netlink: fix PEP8 in netlink memory leak patch Never
oVirt gerrit 40488 0 ovirt-3.5 MERGED netlink: fix PEP8 missing line in netlink memory leak patch Never
oVirt gerrit 40812 0 master MERGED netlink: rtnl_put_link after rtnl_link_get_kernel Never

Description Daniel Helgenberger 2014-10-28 15:18:03 UTC
Description of problem:
The vdsm process grows with a rate of about 300KiB/h.

Version-Release number of selected component (if applicable):
Starting with latest 3.3.1 [1] up to 3.5
CentOS 6.5
vdsm-4.13.3-3.el6.x86_64 ... vdsm-4.16.7-1.gitdb83943.el6.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Deploy a host
2. Run some VMs (though it is unrelated, as it seems to grow no matter what)
3. Watch VDSM RSS size

Actual results:
Constantly growing at a rate of ~300KiB/h

Expected results:
RSS size stays the same (well, growing linearly with the number of VMs).

Additional info:
This might be negligible, but I think we at least need to keep track of it. Also, it still amounts to something, as hosts run for a long time.
This is a different issue and seems not in any way related to the m2crypto bug BZ1157749.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1147148#c15

Comment 1 Chris Adams 2015-03-13 13:40:16 UTC
I'm seeing this too, although my RSS seems to be growing faster.  I just checked one node, and in 5 minutes I saw the vdsm RSS increase by 952K (VSZ did not change at all).

Comment 2 John Taylor 2015-03-23 19:01:31 UTC
I want to add that I'm also seeing this on both a CentOS 7 host and a Fedora 20 host that are part of a 3.5 cluster (oVirt 3.5.1, vdsm 4.16.10). I do not see it on a Fedora 20 host that is in a 3.4 cluster (vdsm 4.14.11.2).

Note that I see the memory growth over time even for a host that is in maintenance (i.e. not running any VMs).

I've tried to debug by adding the memory profiling patch, but dowser didn't show anything obvious as far as object counts.  

But taking another approach, I used pmap to dump the memory map over time, and tried to correlate the ones that show increase in rss to a strace of the vdsm process.

What I've come up with is that it looks like there's something going on in sampling._get_interfaces_and_samples -> ipwrapper.getLinks -> netlink.iter_links, because there are suspicious mprotect calls to regions of memory that show growing RSS in the pmap, and the strace *looks* like it is from the _nl_link stuff.


I've attached the strace in trace_showing_mprotect.txt to try to show what I'm talking about.
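The per-process RSS sampling described above can also be scripted. A minimal sketch (Linux-only, reading VmRSS from /proc; the helper name is mine, not part of vdsm):

```python
import os
import time

def read_rss_kib(pid):
    """Return the VmRSS of `pid` in KiB, or None if unavailable."""
    with open('/proc/%d/status' % pid) as f:
        for line in f:
            if line.startswith('VmRSS:'):
                return int(line.split()[1])   # second field is the KiB value
    return None

if __name__ == '__main__':
    # Sample our own RSS twice to show the call shape; point `pid` at the
    # vdsm process instead to watch for the steady growth reported here.
    pid = os.getpid()
    first = read_rss_kib(pid)
    time.sleep(0.2)
    print(read_rss_kib(pid) - first)   # delta in KiB over the interval
```

Logging this once a minute against the vdsm PID gives the same kind of time series as the ps loops quoted in the comments below, but per region of interest it can be combined with pmap output as described above.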

Comment 3 John Taylor 2015-03-23 19:03:33 UTC
Created attachment 1005558 [details]
strace showing mprotect during what looks like the libnl socket calls

Comment 4 Dan Kenigsberg 2015-03-25 10:11:15 UTC
Tomas, could you take a look at whether libnl3 is leaking? Or could it be something wrong that we do in our python-ctypes wrapper thereof?

https://gerrit.ovirt.org/gitweb?p=vdsm.git;a=blob;f=lib/vdsm/netlink/link.py;h=0aafdb667018267530e02db21a40b160161b3322;hb=HEAD#l41

Comment 5 Dan Kenigsberg 2015-03-25 12:38:28 UTC
Meni, do you notice this kind of leak on your systems? (Please check whether RSS is rising over hours, in a steady state.)

Comment 6 Thomas Haller 2015-03-25 13:30:10 UTC
(In reply to Dan Kenigsberg from comment #4)
> Tomas, could you take a look if libnl3 is leaking? Or could it be something
> wrong that we do in our python-ctypes wrapper thereof?
> 
> https://gerrit.ovirt.org/gitweb?p=vdsm.git;a=blob;f=lib/vdsm/netlink/link.py;
> h=0aafdb667018267530e02db21a40b160161b3322;hb=HEAD#l41

I don't see anything wrong with it yet.

How could I reproduce this? I am not familiar with vdsm.

Or could you reproduce it with valgrind? Something like

   valgrind \
      --leak-check=full \
      --log-file="$LOGFILE" \
      $PROGRAM

Comment 7 Kristaps Tigeris 2015-03-27 13:48:53 UTC
I want to add that I'm seeing similar behavior of the vdsm process on CentOS 6.5 and CentOS 6.6 servers. VDSM version 4.16.10-8. All part of an oVirt 3.5.1 cluster.

However, for me there is no constant growth of memory usage all the time. We are monitoring the VDSM process memory usage every minute, and we can see that every 2 hours 25-30 minutes the VDSM process memory usage grows by 60-70MiB. This pattern repeats at that interval. This is quite a lot more than 300KiB/h.

Once the VDSM process gets restarted, its memory usage gets back down to around 4GiB, while before the restart we have seen it grow to as much as 14GiB, and the same pattern of rising memory keeps repeating.

One more thing: this keeps happening regardless of whether new VMs are being migrated to the host or not.

Comment 8 Dan Kenigsberg 2015-03-27 17:31:18 UTC
Created attachment 1007405 [details]
valgrind report for a reproducer script

Tomas, Darrell Budic sent this valgrind report for the following script (placed under /usr/share/vdsm, with vdsm installed)

    #!/usr/bin/python

    from time import sleep
    from virt.sampling import _get_interfaces_and_samples

    while True:
        _get_interfaces_and_samples()
        sleep(0.2)

can you assist with it?

Comment 9 Darrell 2015-03-27 17:42:08 UTC
I'll just note that the script leaks at a similar rate to vdsmd itself in my environment (this valgrind dump is from a server with 10 VMs and 17 interfaces between them). Causing _get_interfaces_and_samples to return an empty or static dict effectively eliminates the leak in my testing as well. Happy to provide additional info or further testing on this one.

Comment 10 John Taylor 2015-03-28 14:25:02 UTC
from my email to the users list

I think there are, at least for the purpose of this discussion, 3 leaks:
1. the M2Crypto leak
2. a slower leak
3. a large leak that's not M2Crypto related that's part of sampling

My efforts have been around finding the source of my larger leak, which
I think is #3. I had disabled SSL, so I knew that M2Crypto
isn't/shouldn't be the problem as in bz1147148, and SSL is beside the
point as the leak happens with a deactivated host. It's part of sampling, which
always runs.

What I've found, after trying to get the smallest reproducer, is that
it's not the netlink.iter_links that I commented on in [1] that is the
problem. Rather, in the _get_interfaces_and_samples loop there is the call to
create an InterfaceSample, and that has getLinkSpeed() which, for VLANs,
ends up calling ipwrapper.getLink, and that calls
netlink.get_link(name).

netlink.get_link(name) *is* the source of my big leak. This is vdsm
4.16.10, so it is [2], and it's been changed in master for the removal of
support for libnl v1, so it might not be a problem anymore.
 
def get_link(name):
    """Returns the information dictionary of the name specified link."""
    with _pool.socket() as sock:
        with _nl_link_cache(sock) as cache:
            link = _rtnl_link_get_by_name(cache, name)
            if not link:
                raise IOError(errno.ENODEV, '%s is not present in the system' %
                              name)
            return _link_info(cache, link)


The libnl documentation note at [3] says, for the rtnl_link_get_by_name function:
"Attention
    The reference counter of the returned link object will be incremented. Use rtnl_link_put() to release the reference."

So I took that hint and made a change that does the rtnl_link_put() in
get_link(name), and it looks like it works for me.

diff oldnetlink.py netlink.py
67d66
<             return _link_info(cache, link)
68a68,70
>             li = _link_info(cache, link)
>             _rtnl_link_put(link)
>             return li
333a336,337
> 
> _rtnl_link_put  = _none_proto(('rtnl_link_put', LIBNL_ROUTE))


[1] https://bugzilla.redhat.com/show_bug.cgi?id=1158108
[2]
https://gerrit.ovirt.org/gitweb?p=vdsm.git;a=blob;f=lib/vdsm/netlink.py;h=afae5cecb5ce701d00fb8f019ec92b3331a39036;hb=5608cfdf43db9186dabac4b2a779f9557e798968
[3]
http://www.infradead.org/~tgr/libnl/doc/api/group__link.html#ga1d583e4f0b43c89d854e5e681a529fad
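To make the failure mode concrete, here is a self-contained toy sketch (not libnl; the class and function names are mine) of the get/put reference-counting discipline: the leaky variant mirrors the old get_link, and the fixed variant mirrors the diff above by releasing the reference in a finally block.

```python
class LinkCache(object):
    """Toy stand-in for a libnl link cache that counts outstanding refs."""
    def __init__(self):
        self.refs = {}                      # link name -> live references

    def get_by_name(self, name):
        # Like rtnl_link_get_by_name: returns the link *and* bumps its
        # reference counter, which the caller must release.
        self.refs[name] = self.refs.get(name, 0) + 1
        return name

    def put(self, link):
        # Like rtnl_link_put: releases one reference.
        self.refs[link] -= 1

def get_link_leaky(cache, name):
    link = cache.get_by_name(name)
    return {'name': link}                   # reference never released

def get_link_fixed(cache, name):
    link = cache.get_by_name(name)
    try:
        return {'name': link}
    finally:
        cache.put(link)                     # always released, as in the diff
```

Each call to get_link_leaky leaves one reference behind, which matches the per-sampling-cycle growth described above; get_link_fixed returns the counter to zero on every call, even if building the result raises.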

Comment 11 Thomas Haller 2015-04-08 10:40:52 UTC
Clearing the need-info flag from my side; I agree with John Taylor's analysis.

Comment 12 Yaniv Lavi 2015-04-08 12:01:08 UTC
Should this be on POST?

Comment 13 Petr Horáček 2015-04-08 12:08:34 UTC
(In reply to Yaniv Dary from comment #12)
> Should this be on POST?

There is a promising-looking patch, so probably yes.

Comment 14 Petr Horáček 2015-04-08 12:12:45 UTC
Taking my POST back; the patch is only on the master branch.

Comment 15 Petr Horáček 2015-04-08 12:23:49 UTC
Sorry for the confusion and spam; it deserves its POST.

Comment 16 Sven Kieske 2015-05-11 09:42:14 UTC
what's the status here? will this get into 3.5.3 at least?

is there anything left to be done (coding)?

thanks!

Comment 17 Petr Horáček 2015-05-11 10:31:28 UTC
All required code should be merged.

However we need to test it for memory leaks. John, could you test it again please?

Comment 18 John Taylor 2015-05-12 03:44:42 UTC
I can confirm that the patch for the 3.5 branch corrects the memory leak problem I commented on in comment #10 (although I should probably point out that the "memory leak problem" here was a larger one than the small one reported originally on this BZ, so it probably should have been another BZ, but we were kind of scurrying).

I can *not* confirm the patch for the master branch [1]. It has been reverted in commit b92fa03326b805d7ab67a9c1269990681b02e213 [2]. I'm fairly certain it shouldn't have been reverted completely, as the rtnl_link_get_kernel calls introduce *another* place where memory allocated by libnl has to be returned with _put (see [3] for an example of the get/put). Do you agree, Petr?


[1] https://gerrit.ovirt.org/#/c/39416
[2] https://gerrit.ovirt.org/#/c/40571/
[3] http://www.infradead.org/~tgr/libnl/doc/route.html#link_direct_lookup

Comment 19 Petr Horáček 2015-05-12 11:11:49 UTC
Yay! You are right. In the libnl source code there is a `put`-related @attention note for rtnl_link_get_by_name but not for rtnl_link_get_kernel. However, according to the documentation we have to handle this function too. Thanks very much!

Comment 20 Eyal Edri 2015-05-14 13:06:43 UTC
Moving back to MODIFIED; it's not part of vt15.

Comment 21 Oved Ourfali 2015-05-14 13:08:30 UTC
*** Bug 1220154 has been marked as a duplicate of this bug. ***

Comment 22 Michael Burman 2015-05-26 07:26:33 UTC
Hi Dan,

VDSM still leaks a small amount of memory (~20KiB/h). I have 5 VMs running on the server.

[root@zeus-vds1 ~]# while sleep 60; do date; ps -o rsz= 11640; done

Mon May 25 16:12:36 IDT 2015  60348
Mon May 25 16:35:36 IDT 2015  60364
Mon May 25 16:36:36 IDT 2015  60384
Mon May 25 16:41:36 IDT 2015  60380
Mon May 25 16:42:36 IDT 2015  60384
Mon May 25 17:15:37 IDT 2015  60384
Mon May 25 17:16:37 IDT 2015  60372
Mon May 25 17:42:37 IDT 2015  60384
Mon May 25 17:43:37 IDT 2015  60388
Mon May 25 17:55:38 IDT 2015  60388
Mon May 25 17:56:38 IDT 2015  60404
Mon May 25 17:57:38 IDT 2015  60408
Mon May 25 18:15:38 IDT 2015  60408
Mon May 25 18:16:38 IDT 2015  60396
Mon May 25 18:19:38 IDT 2015  60396
Mon May 25 18:20:38 IDT 2015  60404
Mon May 25 18:21:38 IDT 2015  60408
Mon May 25 18:26:38 IDT 2015  60408
Tue May 26 08:15:53 IDT 2015  60452
Tue May 26 10:23:55 IDT 2015  60452

What do you think Dan?
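For reference, the average growth rate implied by the first and last samples of a ps loop like the one above can be computed directly. A small sketch (using the endpoint samples quoted in this comment; the rate over this particular window is what the code prints):

```python
from datetime import datetime

def kib_per_hour(t0, rss0, t1, rss1):
    """Growth rate in KiB/h between two (timestamp, RSS-in-KiB) samples."""
    hours = (t1 - t0).total_seconds() / 3600.0
    return (rss1 - rss0) / hours

# First and last samples from the ps loop above (RSS in KiB).
rate = kib_per_hour(datetime(2015, 5, 25, 16, 12, 36), 60348,
                    datetime(2015, 5, 26, 10, 23, 55), 60452)
print('%.1f KiB/h' % rate)
```

Two endpoint samples only bound the average, of course; a slope fitted over all samples smooths out the page-granular jumps visible in the raw numbers.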

Comment 23 Dan Kenigsberg 2015-05-26 09:32:16 UTC
I think that there might be another memory leak. The big netlink one is sealed.

Comment 24 Eldad Marciano 2015-05-26 10:27:44 UTC
Patch 40134 tested on top of vt14.3; it looks like vdsm now leaks just less than ~400KiB instead of 2MiB.

Comment 25 Eldad Marciano 2015-05-26 10:28:56 UTC
(In reply to Eldad Marciano from comment #24)
> patch 40134 tested on top of vt14.3, looks like vdsm leaks now just less
> than ~400k, instead of 2mb.

The test case was 48h CHO, with 57 running VMs, same as the previous test.

Comment 26 Eldad Marciano 2015-05-31 11:36:36 UTC
Created attachment 1032824 [details]
vdsm mem usage graph

We monitored the vdsmd process for 3 days; even with the patch, vdsmd is still leaking.
We are monitoring the resident memory of the vdsmd process every 1 minute.
At some point it looks like we have a peak of almost 2MiB.
See the attached graph.

Comment 27 Dan Kenigsberg 2015-05-31 16:21:06 UTC
Thanks, Eldad. Can you specify the conditions on the monitored host? How many VMs are running? How many storage domains?

Can you find vdsm.log and supervdsm.log from the morning of May 30, the time of the 2MiB spike?

Comment 28 Eldad Marciano 2015-06-01 13:58:25 UTC
Test case for the monitored host:
- 1 storage domain
- 57 VMs

About the logs: there are many zipped logs for May 30, so I think it's better to take it offline and I'll give you access.

Comment 30 Dan Kenigsberg 2015-06-07 16:34:16 UTC
I'm afraid I don't find anything in particular happening circa 30 May 07:39. The 2MiB surge at that time may be an allocation of a new slab by Python's memory manager.

We'd need much more research to find the source of the remaining leak (e.g. does it go on when Engine is not polling? Is it proportional to the number of VMs? The number of storage domains? Network interfaces?), and it probably belongs in a fresh new bug.

Comment 31 Eldad Marciano 2015-06-11 08:52:02 UTC
Verified on top of vt15.1.
As Dan mentioned, another bug and further research are required, with several different test cases such as:
- Engine off/on.
- VM capacity.

Comment 32 Sandro Bonazzola 2015-06-15 08:38:39 UTC
This is an automated message.
oVirt 3.5.3 has been released on June 15th 2015 and should include the fix for this BZ. Moving to closed current release.

