Red Hat Bugzilla – Bug 595490
[RHEL6]: Slow memory leak in libvirtd udev backend
Last modified: 2012-04-26 01:35:26 EDT
Description of problem:
As reported in launchpad here:
There are a couple of slow memory leaks in the udev node device backend. The first one is pretty minor: a NodeDeviceDef structure is not freed on an error path. The other one is more serious: after adding a udev device, its reference is never dropped, meaning that we slowly fill up memory with udev resources.
A patch was posted to upstream libvirt by Nigel Jones:
I believe we should pull some variant of this patch into RHEL-6. However, Dave Allan (who originally wrote the code) has some concerns over this approach, and wants to review it in more detail before we do that.
After reviewing Nigel's patch, I think it's the right approach. I modified it slightly, found one additional leak and submitted it upstream:
IMO, this needs to be fixed in 6.0, since the rate of leak is dependent on udev activity.
(10:50:42 AM) dallan: mjenner: testing is straightforward: start libvirt, monitor memory usage, add and remove devices
I used iscsi login/logout in a loop, but any host activity that creates udev add/remove events will do.
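The login/logout loop could be scripted along these lines (a sketch, not my exact script: the `TARGET`/`PORTAL`/`ROUNDS` variables are knobs I've added, and the loop is skipped unless iscsiadm is installed and a target is supplied):

```shell
#!/bin/sh
# Drive udev add/remove events by logging an iSCSI session in and out
# in a loop. Any host activity that creates udev events works as well.
TARGET=${TARGET:-}           # iSCSI target IQN (must be supplied)
PORTAL=${PORTAL:-}           # optional portal, e.g. 192.0.2.10:3260
ROUNDS=${ROUNDS:-4800}       # ~8h at one round per 6s
i=0
if command -v iscsiadm >/dev/null 2>&1 && [ -n "$TARGET" ]; then
    while [ "$i" -lt "$ROUNDS" ]; do
        iscsiadm -m node -T "$TARGET" ${PORTAL:+-p "$PORTAL"} --login
        sleep 3
        iscsiadm -m node -T "$TARGET" ${PORTAL:+-p "$PORTAL"} --logout
        sleep 3
        i=$((i + 1))
    done
else
    echo "iscsiadm not available or TARGET unset; skipping"
fi
```

Invoke with something like `TARGET=iqn.2010-05.com.example:t1 sh iscsi-loop.sh` (the IQN is a placeholder) while monitoring libvirtd's memory in another terminal.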
libvirt-0_8_1-10_el6 has been built in RHEL-6-candidate with the fix.
(In reply to comment #2)
> qa_ack provided.
> (10:50:42 AM) dallan: mjenner: testing is straightforward: start libvirt,
> monitor memory usage, add and remove devices
I am not sure whether the following test method is valid for this bug. I used the top tool to check memory usage, but I don't know how much change is normal for memory usage. The test steps are:
1. Install a RHEL6 host machine with HBA cards
2. Use the top tool to monitor used memory
3. Create a new virtual node device from XML
4. Repeat step 2
5. Destroy the virtual node device
6. Repeat step 2
The details are attached.
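Steps 3 to 5 can be put in a loop to generate repeated udev add/remove events. A sketch (not the exact script used: `DEVXML` and the parsing of nodedev-create's output are assumptions, and the loop is skipped if virsh is absent or no device XML is given):

```shell
#!/bin/sh
# Repeatedly create and destroy a virtual node device (e.g. a vHBA)
# from its XML definition, so udev sees paired add/remove events.
DEVXML=${DEVXML:-}           # path to the node device XML (must be supplied)
ROUNDS=${ROUNDS:-4800}       # ~8h at one round per 6s
i=0
if command -v virsh >/dev/null 2>&1 && [ -n "$DEVXML" ]; then
    while [ "$i" -lt "$ROUNDS" ]; do
        # Assumes output like "Node device scsi_hostN created from ...";
        # the device name is the third field.
        name=$(virsh nodedev-create "$DEVXML" | awk '{print $3; exit}')
        [ -n "$name" ] || { echo "nodedev-create failed" >&2; break; }
        sleep 3
        virsh nodedev-destroy "$name"
        sleep 3
        i=$((i + 1))
    done
else
    echo "virsh not available or DEVXML unset; skipping"
fi
```

Run it as e.g. `DEVXML=vhba.xml sh nodedev-loop.sh` (the filename is a placeholder) while watching libvirtd's memory usage.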
Created attachment 430565 [details]
monitor memory usage before creating node device
Created attachment 430568 [details]
monitor memory usage after creating node device
Created attachment 430569 [details]
monitor memory usage after destroying node device
Created attachment 430570 [details]
The details of test
I used:
ps -C libvirtd -o rss,size,vsize
to monitor memory usage. Note that you may still see some increase in these values for a little while, so the test strategy should be to do an extended run (more than 6 hours) of creating and destroying devices in a loop. If you see the numbers stabilize, there's no leak. A leak will cause them to increase without bound.
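A minimal polling sketch along those lines (the `PROC`/`INTERVAL`/`SAMPLES` variables are knobs I've added; the short defaults are for a quick smoke test, with the extended-run values noted in comments):

```shell
#!/bin/sh
# Sample libvirtd's RSS/SIZE/VSIZE at a fixed interval. A real leak shows
# up as unbounded growth; early-run growth that then stabilizes is fine.
PROC=${PROC:-libvirtd}
INTERVAL=${INTERVAL:-1}      # use 60 for a real run
SAMPLES=${SAMPLES:-5}        # use 480 (8 hours at 60s) for a real run
i=0
while [ "$i" -lt "$SAMPLES" ]; do
    # Trailing '=' on each column suppresses the header line.
    ps -C "$PROC" -o rss=,size=,vsize= || echo "no $PROC process"
    i=$((i + 1))
    sleep "$INTERVAL"
done
```

Redirect the output to a log file and plot or eyeball the columns after the run; a bounded RSS range over thousands of create/destroy rounds indicates no leak.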
(In reply to comment #12)
> I used:
> ps -C libvirtd -o rss,size,vsize
> to monitor memory usage. Note that you may still see some increase in these
> values for a little while, so the test strategy should be to do an extended run
> (more than 6 hours) of creating and destroying devices in a loop. If you see
> the numbers stabilize, there's no leak. A leak will cause them to increase
> without bound.
I retested the bug according to your suggestion, and the test result looks fine:
the RSS ranged from 9468 to 11560, with SZ=457100 and VSZ=609080.
total time: 8h
The loop executes one round of node device creation and destruction every 6s, so the number of rounds = 28800/6 = 4800. Please see the attachment.
# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.0 Beta (Santiago)
# uname -a
Linux intel-e5530-8-4.englab.englab.nay.redhat.com 2.6.32-25.el6.x86_64 #1 SMP Mon May 10 17:30:22 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
# rpm -q libvirt
Created attachment 431339 [details]
Created attachment 431340 [details]
The bug has been fixed on RHEL6 beta (2.6.32-25.el6.x86_64) with libvirt-0.8.1-13.el6.x86_64.
Created attachment 431348 [details]
That test result looks good. How many LUNs were visible to the vHBA you created, btw?
I have tested it on RHEL6 RC3 and the test log can be seen in the attachments:
Created attachment 447889 [details]
test log on RHEL6 RC3
That test result looks fine.
Red Hat Enterprise Linux 6.0 is now available and should resolve
the problem described in this bug report. This report is therefore being closed
with a resolution of CURRENTRELEASE. You may reopen this bug report if the
solution does not work for you.
I tested it with libvirt-0.9.10-1.el6.x86_64 and it seems like it leaked memory.
The test log is attached.
Created attachment 564300 [details]
mem monitor log
Well, I tested it again with valgrind, and it looks good.
So please ignore comment 23 and comment 24.