Bug 595490

Summary: [RHEL6]: Slow memory leak in libvirtd udev backend
Product: Red Hat Enterprise Linux 6
Reporter: Chris Lalancette <clalance>
Component: libvirt
Assignee: Alex Jia <ajia>
Status: CLOSED CURRENTRELEASE
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Docs Contact:
Priority: low
Version: 6.0
CC: ajia, berrange, dallan, hbrock, mjenner, xen-maint, xhu, zhpeng
Target Milestone: rc
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version: libvirt-0_8_1-10_el6
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2010-11-11 14:48:20 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
monitor memory usage before creating node device (flags: none)
monitor memory usage after creating node device (flags: none)
monitor memory usage after destroying node device (flags: none)
The details of test (flags: none)
test script (flags: none)
test log (flags: none)
virtual_hba.xml (flags: none)
test log on RHEL6 RC3 (flags: none)
mem monitor log (flags: none)

Description Chris Lalancette 2010-05-24 19:19:57 UTC
Description of problem:
As reported in launchpad here:

https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/571093

There are a couple of slow memory leaks in the udev node device backend.  The first one is fairly minor: a NodeDeviceDef structure is not freed on an error path.  The other one is more serious: the code forgets to drop its reference to a udev device after adding it, so memory slowly fills up with udev resources.

A patch was posted to upstream libvirt by Nigel Jones:

https://www.redhat.com/archives/libvir-list/2010-May/msg00920.html

I believe we should pull some variant of this patch into RHEL-6.  However, Dave Allan (who originally wrote the code) has some concerns over this approach, and wants to review it in more detail before we do that.

Comment 1 Dave Allan 2010-05-29 03:23:05 UTC
After reviewing Nigel's patch, I think it's the right approach.  I modified it slightly, found one additional leak and submitted it upstream:

https://www.redhat.com/archives/libvir-list/2010-May/msg01206.html

IMO, this needs to be fixed in 6.0, since the rate of leak is dependent on udev activity.

Comment 2 Martin Jenner 2010-06-01 15:02:32 UTC
qa_ack provided.

(10:50:42 AM) dallan: mjenner: testing is straightforward: start libvirt, monitor memory usage, add and remove devices

Comment 3 Dave Allan 2010-06-01 15:07:19 UTC
I used iscsi login/logout in a loop, but any host activity that creates udev add/remove events will do.

Comment 5 Dave Allan 2010-06-23 21:29:28 UTC
libvirt-0_8_1-10_el6 has been built in RHEL-6-candidate with the fix.

Dave

Comment 7 Alex Jia 2010-07-09 08:46:13 UTC
(In reply to comment #2)
> qa_ack provided.
> 
> (10:50:42 AM) dallan: mjenner: testing is straightforward: start libvirt,
> monitor memory usage, add and remove devices    

Hi Martin,
I am not sure whether the following test method is valid for this bug. I used top to check memory usage, but I don't know how much change in memory usage is normal. The test steps are:

1. Install a rhel6 host machine with HBA cards
2. Use top tool to monitor used memory
3. Create new virtual node device from xml
4. Repeat 2
5. Destroy the virtual node device 
6. Repeat 2


The details are attached.

Comment 8 Alex Jia 2010-07-09 08:49:01 UTC
Created attachment 430565 [details]
monitor memory usage before creating node device

Comment 9 Alex Jia 2010-07-09 08:49:41 UTC
Created attachment 430568 [details]
monitor memory usage after creating node device

Comment 10 Alex Jia 2010-07-09 08:50:34 UTC
Created attachment 430569 [details]
monitor memory usage after destroying node device

Comment 11 Alex Jia 2010-07-09 08:51:25 UTC
Created attachment 430570 [details]
The details of test

Comment 12 Dave Allan 2010-07-09 16:10:54 UTC
I used:

ps -C libvirtd -o rss,size,vsize

to monitor memory usage.  Note that you may still see some increase in these values for a little while, so the test strategy should be to do an extended run (more than 6 hours) of creating and destroying devices in a loop.  If you see the numbers stabilize, there's no leak.  A leak will cause them to increase without bound.
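
The monitoring approach above can be sketched as a small Python script. This is only an illustration, not the script used in this bug: it runs the same ps command, and the helper names (parse_ps_output, sample_libvirtd, monitor, looks_stable) and the 1024 KiB tolerance are assumptions made for the example. The stability check is a rough heuristic that compares the peak RSS of the second half of the run with the peak of the first half.

```python
import subprocess
import time

# Same command as in the comment above; ps prints a header line first.
PS_CMD = ["ps", "-C", "libvirtd", "-o", "rss,size,vsize"]


def parse_ps_output(text):
    """Return (rss, size, vsize) from the first all-numeric row, or None."""
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3 and all(f.isdigit() for f in fields):
            return tuple(int(f) for f in fields)
    return None  # header only: libvirtd is not running


def sample_libvirtd():
    """Run ps once and return the parsed sample, or None."""
    result = subprocess.run(PS_CMD, capture_output=True, text=True)
    return parse_ps_output(result.stdout)


def monitor(duration_s=6 * 3600, interval_s=60):
    """Sample libvirtd for duration_s seconds and return the RSS series."""
    rss_series = []
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        sample = sample_libvirtd()
        if sample is not None:
            rss_series.append(sample[0])
        time.sleep(interval_s)
    return rss_series


def looks_stable(rss_samples, tolerance_kb=1024):
    """Assumed heuristic: the peak RSS of the second half of the run may
    exceed the peak of the first half by at most tolerance_kb."""
    half = len(rss_samples) // 2
    if half == 0:
        raise ValueError("need at least two samples")
    return max(rss_samples[half:]) - max(rss_samples[:half]) <= tolerance_kb
```

Calling looks_stable(monitor()) while devices are created and destroyed in a separate loop gives a first impression; the raw series should still be inspected by eye, since the tolerance is arbitrary.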

Comment 13 Alex Jia 2010-07-13 03:41:01 UTC
(In reply to comment #12)
> I used:
> 
> ps -C libvirtd -o rss,size,vsize
> 
> to monitor memory usage.  Note that you may still see some increase in these
> values for a little while, so the test strategy should be to do an extended run
> (more than 6 hours) of creating and destroying devices in a loop.  If you see
> the numbers stabilize, there's no leak.  A leak will cause them to increase
> without bound.    

Hi Dave,
I retested the bug according to your suggestion, and the test result looks fine:
RSS ranged from 9468 to 11560, with SZ=457100 and VSZ=609080.


Test strategy:
total time: 8h
interval: 2s 

In the loop, one round of creating and destroying a node device executes every 6s, so the number of rounds is 28800/6 = 4800. Please see the attachment.
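
The round count can be checked with a short Python sketch. It is not the attached test script: planned_cycles and run_cycles are hypothetical names, and the create and destroy callables stand in for whatever actually creates and destroys the node device (for example, wrappers around virsh nodedev-create and virsh nodedev-destroy).

```python
import time


def planned_cycles(total_hours, cycle_seconds):
    """Number of create/destroy rounds that fit into the run."""
    total_seconds = total_hours * 3600  # 8h -> 28800s
    return total_seconds // cycle_seconds


def run_cycles(create, destroy, cycles, cycle_seconds):
    """Run `cycles` rounds, pacing each round to about cycle_seconds.

    create() returns a device handle and destroy(handle) removes it;
    both are placeholders supplied by the caller.
    """
    completed = 0
    for _ in range(cycles):
        started = time.monotonic()
        device = create()
        destroy(device)
        completed += 1
        # Sleep out the rest of the round, if any time is left.
        remaining = cycle_seconds - (time.monotonic() - started)
        if remaining > 0:
            time.sleep(remaining)
    return completed
```

With the numbers from this test, planned_cycles(8, 6) gives 4800 rounds.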


# cat /etc/redhat-release 
Red Hat Enterprise Linux Server release 6.0 Beta (Santiago)
# uname -a
Linux intel-e5530-8-4.englab.englab.nay.redhat.com 2.6.32-25.el6.x86_64 #1 SMP Mon May 10 17:30:22 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
# rpm -q libvirt
libvirt-0.8.1-13.el6.x86_64

Comment 14 Alex Jia 2010-07-13 03:42:24 UTC
Created attachment 431339 [details]
test script

Comment 15 Alex Jia 2010-07-13 03:43:23 UTC
Created attachment 431340 [details]
test log

Comment 16 Alex Jia 2010-07-13 03:44:37 UTC
The bug has been fixed on RHEL6-beta(2.6.32-25.el6.x86_64) with libvirt-0.8.1-13.el6.x86_64.

Comment 17 Alex Jia 2010-07-13 05:13:24 UTC
Created attachment 431348 [details]
virtual_hba.xml

Comment 18 Dave Allan 2010-07-13 15:20:55 UTC
Hi Alex,

That test result looks good.  How many LUNs were visible to the vHBA you created, btw?

Dave

Comment 19 xhu 2010-09-17 02:55:48 UTC
I have tested it on RHEL6 RC3 and the test log can be seen in the attachments:
kernel-2.6.32-71.el6.x86_64
libvirt-0.8.1-27.el6.x86_64
qemu-kvm-0.12.1.2-2.113.el6.x86_64

Comment 20 xhu 2010-09-17 02:56:54 UTC
Created attachment 447889 [details]
test log on RHEL6 RC3

Comment 21 Dave Allan 2010-09-17 18:09:38 UTC
That test result looks fine.

Comment 22 releng-rhel@redhat.com 2010-11-11 14:48:20 UTC
Red Hat Enterprise Linux 6.0 is now available and should resolve
the problem described in this bug report. This report is therefore being closed
with a resolution of CURRENTRELEASE. You may reopen this bug report if the
solution does not work for you.

Comment 23 zhpeng 2012-02-20 05:50:49 UTC
I tested it with libvirt-0.9.10-1.el6.x86_64 and it seems to leak memory.
The test log is attached.

Comment 24 zhpeng 2012-02-20 05:56:14 UTC
Created attachment 564300 [details]
mem monitor log

Comment 25 zhpeng 2012-04-26 05:35:26 UTC
Well, I tested it again with valgrind, and it looks good.
So, please ignore comment 23 and comment 24.