Bug 606318

Summary: libvirt: logfile FD leak on domain startup
Product: [Fedora] Fedora Reporter: Tom Horsley <horsley1953>
Component: libvirtAssignee: Daniel Veillard <veillard>
Status: CLOSED CURRENTRELEASE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium Docs Contact:
Priority: low    
Version: 13CC: berrange, clalance, crobinso, itamar, jforbes, laine, veillard, virt-maint
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2010-07-28 13:07:33 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
the ls -l of libvirtd's /proc/pid/fd directory none

Description Tom Horsley 2010-06-21 12:14:39 UTC
Description of problem:
I see stuff like this in my testbed logs:

2010-06-21 07:56:13 starting: ./scripts/xen/bootup-kvm rhel4u4i

error: Failed to start domain rhel4u4i
error: cannot create pipe: Too many open files

That bootup-kvm script simply ssh's to the host named "xen" and
does a virsh start command (along with a bunch of error checking).

If I go to the host and look at /proc/<libvirtd pid>/fd I see 1018
open file descriptors, most pointing to log files for vms that are
no longer running.

Sometimes VMs crash under test, or the network goes screwy, and the
testbed has to resort to "virsh destroy" to make the VM go away. Since
the host has been up for about 12 days, I suspect it is the virsh destroy
that leaks the descriptor (but haven't tried to prove it).

Version-Release number of selected component (if applicable):
libvirt-client-0.7.7-4.fc13.x86_64
libvirt-0.7.7-4.fc13.x86_64
libvirt-python-0.7.7-4.fc13.x86_64


How reproducible:
Seems likely to be reproducible, but I haven't tried restarting everything
and waiting another 12 days yet :-).


Steps to Reproduce:
1. constant starting and shutting down of many different KVM machines
2. eventually libvirtd runs out of file descriptors.
3.
  
Actual results:
Out of file descriptors

Expected results:
Clean up after machines after shutdown or destruction

Additional info:
As long as everything is hosed now, I'm going to go ahead and get the latest
updates and reboot my host machines, so perhaps a newer libvirt will fix
this.

Comment 1 Tom Horsley 2010-06-21 12:15:31 UTC
Created attachment 425608 [details]
the ls -l of libvirtd's /proc/pid/fd directory

Comment 2 Tom Horsley 2010-06-21 12:40:25 UTC
Well, I updated and still see these versions on the system, so I
guess I already had the latest:

libvirt-client-0.7.7-4.fc13.x86_64
libvirt-0.7.7-4.fc13.x86_64
libvirt-python-0.7.7-4.fc13.x86_64

I've just rebooted to get new kernel active, and before ever
starting a VM, I see these file descriptors:

lr-x------ 1 root root 64 Jun 21 08:34 0 -> /dev/null
l-wx------ 1 root root 64 Jun 21 08:34 1 -> /dev/null
lrwx------ 1 root root 64 Jun 21 08:34 10 -> socket:[11232]
lrwx------ 1 root root 64 Jun 21 08:34 11 -> socket:[11228]
lrwx------ 1 root root 64 Jun 21 08:34 12 -> socket:[11246]
l-wx------ 1 root root 64 Jun 21 08:34 13 -> /proc/mtrr
l-wx------ 1 root root 64 Jun 21 08:34 14 -> /proc/mtrr
l-wx------ 1 root root 64 Jun 21 08:34 15 -> /proc/mtrr
l-wx------ 1 root root 64 Jun 21 08:34 16 -> /proc/mtrr
l-wx------ 1 root root 64 Jun 21 08:34 17 -> /proc/mtrr
l-wx------ 1 root root 64 Jun 21 08:34 18 -> /proc/mtrr
l-wx------ 1 root root 64 Jun 21 08:34 19 -> /proc/mtrr
l-wx------ 1 root root 64 Jun 21 08:34 2 -> /dev/null
l-wx------ 1 root root 64 Jun 21 08:34 20 -> /proc/mtrr
l-wx------ 1 root root 64 Jun 21 08:34 21 -> /proc/mtrr
l-wx------ 1 root root 64 Jun 21 08:34 22 -> /proc/mtrr
l-wx------ 1 root root 64 Jun 21 08:34 23 -> /proc/mtrr
l-wx------ 1 root root 64 Jun 21 08:34 24 -> /proc/mtrr
l-wx------ 1 root root 64 Jun 21 08:34 25 -> /proc/mtrr
l-wx------ 1 root root 64 Jun 21 08:34 26 -> /proc/mtrr
l-wx------ 1 root root 64 Jun 21 08:34 27 -> /proc/mtrr
l-wx------ 1 root root 64 Jun 21 08:34 28 -> /proc/mtrr
l-wx------ 1 root root 64 Jun 21 08:34 29 -> /proc/mtrr
lr-x------ 1 root root 64 Jun 21 08:34 3 -> pipe:[11222]
l-wx------ 1 root root 64 Jun 21 08:34 30 -> /proc/mtrr
l-wx------ 1 root root 64 Jun 21 08:34 31 -> /proc/mtrr
l-wx------ 1 root root 64 Jun 21 08:34 32 -> /proc/mtrr
l-wx------ 1 root root 64 Jun 21 08:34 33 -> /proc/mtrr
l-wx------ 1 root root 64 Jun 21 08:34 34 -> /proc/mtrr
l-wx------ 1 root root 64 Jun 21 08:34 35 -> /proc/mtrr
l-wx------ 1 root root 64 Jun 21 08:34 36 -> /proc/mtrr
l-wx------ 1 root root 64 Jun 21 08:34 37 -> /proc/mtrr
l-wx------ 1 root root 64 Jun 21 08:34 38 -> /proc/mtrr
l-wx------ 1 root root 64 Jun 21 08:34 39 -> /proc/mtrr
lrwx------ 1 root root 64 Jun 21 08:34 4 -> socket:[11236]
l-wx------ 1 root root 64 Jun 21 08:34 40 -> /proc/mtrr
l-wx------ 1 root root 64 Jun 21 08:34 41 -> /proc/mtrr
l-wx------ 1 root root 64 Jun 21 08:34 42 -> /proc/mtrr
l-wx------ 1 root root 64 Jun 21 08:34 43 -> /proc/mtrr
l-wx------ 1 root root 64 Jun 21 08:34 44 -> /proc/mtrr
l-wx------ 1 root root 64 Jun 21 08:34 45 -> /proc/mtrr
l-wx------ 1 root root 64 Jun 21 08:34 46 -> /proc/mtrr
l-wx------ 1 root root 64 Jun 21 08:34 47 -> /proc/mtrr
l-wx------ 1 root root 64 Jun 21 08:34 48 -> /proc/mtrr
l-wx------ 1 root root 64 Jun 21 08:34 49 -> /proc/mtrr
l-wx------ 1 root root 64 Jun 21 08:34 5 -> pipe:[11222]
l-wx------ 1 root root 64 Jun 21 08:34 50 -> /proc/mtrr
l-wx------ 1 root root 64 Jun 21 08:34 51 -> /proc/mtrr
l-wx------ 1 root root 64 Jun 21 08:34 52 -> /proc/mtrr
l-wx------ 1 root root 64 Jun 21 08:34 53 -> /proc/mtrr
l-wx------ 1 root root 64 Jun 21 08:34 54 -> /proc/mtrr
lr-x------ 1 root root 64 Jun 21 08:34 55 -> inotify
lr-x------ 1 root root 64 Jun 21 08:34 6 -> pipe:[11223]
l-wx------ 1 root root 64 Jun 21 08:34 7 -> pipe:[11223]
lrwx------ 1 root root 64 Jun 21 08:34 8 -> socket:[11224]
lrwx------ 1 root root 64 Jun 21 08:34 9 -> socket:[11226]

I don't have any idea what /proc/mtrr is, but does it really need 42
copies?

Comment 3 Cole Robinson 2010-06-21 13:11:47 UTC
I think /proc/mtrr is used by netcf, which is used by libvirt. Is netcf fully updated? Laine, see the previous comment, user is reporting a ton of open /proc/mtrr FDs.

Comment 4 Tom Horsley 2010-06-21 14:14:17 UTC
I just did a yum update, so everything should be up to date. I'm showing
netcf-libs-0.1.6-1.fc13.x86_64

But the /proc/mtrr FDs don't bother me as much as the nearly 1000 qemu
log files that are shown left open in the attachment showing the original
ls -l of libvirtd's file descriptors.

Comment 5 Tom Horsley 2010-06-21 18:28:29 UTC
I just did an experiment:

virsh start <machine>
wait a few seconds
virsh destroy <machine>

Every time you repeat that sequence, a new file descriptor appears
in the libvirtd process pointing at the qemu log file for <machine>.

If you do the exact same machine 10 times in a row, you'll get 10 new
file descriptors for log files for the same machine.

Comment 6 Cole Robinson 2010-06-21 18:43:21 UTC
Sweet, thanks for narrowing it down Tom. Sounds like this commit might fix the issue:

http://libvirt.org/git/?p=libvirt.git;a=commit;h=caad0a878337b52c4453444bb227b74cce6e6d5f

Comment 7 Tom Horsley 2010-07-28 12:49:48 UTC
Just to let you know, the newest update:

libvirt-0.8.2-1.fc13.x86_64

does seem to fix the primary problem - no more log file descriptor leaks
(there are still lots of /proc/mtrr instances, but they don't seem to
be leaking, there are just lots of them).

Comment 8 Cole Robinson 2010-07-28 13:07:33 UTC
Cool, since 0.8.2 is in stable now, we can close this. I think the /proc/mtrr instances have something to do with netcf and the apps they launch.