Bug 877095
Summary:          libvirt doesn't clean up open files for device assignment
Product:          Red Hat Enterprise Linux 6
Version:          6.4
Component:        libvirt
Status:           CLOSED ERRATA
Severity:         unspecified
Priority:         unspecified
Target Milestone: rc
Hardware:         Unspecified
OS:               Unspecified
Reporter:         Alex Williamson <alex.williamson>
Assignee:         Jiri Denemark <jdenemar>
QA Contact:       Virtualization Bugs <virt-bugs>
CC:               acathrow, ajia, berrange, dyasny, dyuan, eblake, mzhan, rwu, weizhan, whuang
Fixed In Version: libvirt-0.10.2-11.el6
Doc Type:         Bug Fix
Type:             Bug
Last Closed:      2013-02-21 07:26:52 UTC
Bug Blocks:       886216
Description (Alex Williamson, 2012-11-15 16:55:46 UTC):

Looking at the libvirt process, it has the config file for the assigned device opened 997 times. So there are at least two problems: 1) libvirt isn't closing the file when it's unused, and 2) libvirt doesn't scale if it holds even one fd open per guest for the duration of the device (imagine 1000 VMs, each with 1 assigned device).

Additional info:

# cat hostdev.xml
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x00' slot='0x19' function='0x0'/>
  </source>
</hostdev>

#!/bin/sh

i=0
while (true); do
    virsh attach-device rhel6 hostdev.xml --persistent
    sleep 10
    DRV=$(basename $(readlink -f /sys/bus/pci/devices/0000\:00\:19.0/driver/))
    if [ "$DRV" != "pci-stub" ]; then
        exit 1
    fi
    virsh detach-device rhel6 hostdev.xml --persistent
    sleep 10
    DRV=$(basename $(readlink -f /sys/bus/pci/devices/0000\:00\:19.0/driver/))
    if [ "$DRV" != "e1000e" ]; then
        exit 1
    fi
    i=$(( i + 1 ))
    echo count $i
done
Comment 1:

Libvirt already has to hold at least 1 fd per guest for the monitor; if you plan to scale to a large number of guests, then you have to edit /etc/libvirt/qemu.conf to bump up max_processes and max_files to match. But that said, you are correct that libvirt should not leak fds that are no longer needed.

Comment 2:

(In reply to comment #1)
> Libvirt already has to hold at least 1 fd per guest for the monitor; if you
> plan to scale to a large number of guests, then you have to edit
> /etc/libvirt/qemu.conf to bump up max_processes and max_files to match. But
> that said, you are correct that libvirt should not leak fds that are no
> longer needed.

Thanks, I didn't know about max_files. However, max_files only seems to change the qemu process limit, not the libvirtd limit.
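As context for comment 1's advice: both limits are plain key = value settings in /etc/libvirt/qemu.conf. A minimal sketch follows; the zeros are the usual "leave the system default in place" values, and real numbers depend on your host. Note these limits apply to the QEMU processes that libvirtd spawns, not to libvirtd itself, which is exactly the distinction comment 2 points out.

# /etc/libvirt/qemu.conf (illustrative)
# Limits applied to each QEMU process spawned by libvirtd; 0 means
# "do not override the system default". Raise max_files when guests
# will hold many fds (e.g. many assigned devices).
max_processes = 0
max_files = 0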
Comment 3:

> Looking at the libvirt process, it has the config file for the assigned device
> opened 997 times. So there are at least two problems: 1) libvirt isn't closing
> the file when it's unused, and 2) libvirt doesn't scale if it holds even one fd
> open per guest for the duration of the device (imagine 1000 VMs, each with 1
> assigned device).

Assuming you're referring to the /sys/bus/pci/devices/XXXXXXXXXX/config file handles, libvirt should *not* be holding them open long term. When attaching a hostdev during initial startup, we use virCommandTransferFD, which should pass the fd down to QEMU while closing it in the parent. When attaching a hostdev during hotplug, we use qemuMonitorAddDeviceWithFd() to pass a copy over the QEMU monitor, and then (AFAICT) we VIR_FORCE_CLOSE the fd in all code paths. So I'm not immediately spotting where our flaw is :-(
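To make the pattern in comment 3 concrete, here is a minimal sketch of "open, hand the fd to the child, close it in the parent" using plain POSIX calls. This is illustrative C, not libvirt's actual code: virCommandTransferFD and VIR_FORCE_CLOSE are the real libvirt helpers that encapsulate this logic, and the QEMU option name below is made up.

#include <sys/types.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

static int give_fd_to_child(const char *path)
{
    int fd = open(path, O_RDWR);
    pid_t pid;

    if (fd < 0)
        return -1;

    pid = fork();
    if (pid == 0) {
        /* Child: the fd stays open across exec (no O_CLOEXEC here),
         * so QEMU inherits it; "-config-fd" is a made-up option name
         * standing in for however the fd number gets communicated. */
        char arg[16];
        snprintf(arg, sizeof(arg), "%d", fd);
        execlp("qemu-kvm", "qemu-kvm", "-config-fd", arg, (char *)NULL);
        _exit(127);
    }

    /* Parent: QEMU owns the fd now; close our copy or it leaks.
     * Skipping this close on some code path is exactly the failure
     * mode this bug tracks. */
    close(fd);
    return pid < 0 ? -1 : 0;
}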
Comment 5:

(In reply to comment #0)
> Additional info:
>
> # cat hostdev.xml
> <hostdev mode='subsystem' type='pci' managed='yes'>
>   <source>
>     <address domain='0x0000' bus='0x00' slot='0x19' function='0x0'/>
>   </source>
> </hostdev>
> [reproducer script as in the description]

Hi Alex,
I just tried this test with your hostdev.xml configuration and shell script (it ran to "count 501") on libvirt-0.10.2-9.el6.x86_64 (I assume this version does not yet have a patch for it), but it seems I haven't met your issue.

[root@201 ~]# sh 877095.sh    <--- the same as your shell script
xxx
count 500
Device attached successfully
Device detached successfully
count 501
Device attached successfully
Device detached successfully
^C
[root@201 ~]# ls -l /proc/`pidof qemu-kvm`/fd
total 0
lrwx------. 1 qemu qemu 64 Nov 26 14:23 0 -> /dev/null
l-wx------. 1 qemu qemu 64 Nov 26 14:23 1 -> /var/log/libvirt/qemu/myRHEL6.log
lrwx------. 1 qemu qemu 64 Nov 26 14:23 10 -> /mnt/myRHEL6
lrwx------. 1 qemu qemu 64 Nov 26 14:23 11 -> anon_inode:[signalfd]
lrwx------. 1 qemu qemu 64 Nov 26 14:23 12 -> anon_inode:kvm-vcpu
lrwx------. 1 qemu qemu 64 Nov 26 14:23 13 -> socket:[114259924]
lrwx------. 1 qemu qemu 64 Nov 26 14:23 14 -> socket:[114259922]
lrwx------. 1 qemu qemu 64 Nov 26 14:23 15 -> anon_inode:[eventfd]
lrwx------. 1 qemu qemu 64 Nov 26 14:23 16 -> anon_inode:[eventfd]
lrwx------. 1 qemu qemu 64 Nov 26 14:23 17 -> anon_inode:[signalfd]
lrwx------. 1 qemu qemu 64 Nov 26 14:23 18 -> socket:[114259925]
lrwx------. 1 qemu qemu 64 Nov 26 14:23 19 -> socket:[114259927]
l-wx------. 1 qemu qemu 64 Nov 26 14:23 2 -> /var/log/libvirt/qemu/myRHEL6.log
lrwx------. 1 qemu qemu 64 Nov 26 14:23 23 -> /dev/net/tun
lrwx------. 1 qemu qemu 64 Nov 26 14:23 3 -> socket:[114259910]
lrwx------. 1 qemu qemu 64 Nov 26 14:23 4 -> /dev/ptmx
lrwx------. 1 qemu qemu 64 Nov 26 14:23 5 -> socket:[114259915]
lrwx------. 1 qemu qemu 64 Nov 26 14:23 6 -> /dev/kvm
lrwx------. 1 qemu qemu 64 Nov 26 14:23 7 -> anon_inode:kvm-vm
lr-x------. 1 qemu qemu 64 Nov 26 14:23 8 -> pipe:[114259918]
l-wx------. 1 qemu qemu 64 Nov 26 14:23 9 -> pipe:[114259918]

Alex, I think 501 iterations should be enough to reproduce it, but unfortunately everything is okay for me; any advice about it?
Thanks in advance, Alex

Comment 6:

(In reply to comment #5)
> [root@201 ~]# ls -l /proc/`pidof qemu-kvm`/fd
                               ^^^^^^^^
It's a libvirtd problem, not qemu-kvm. Look at the fds opened by libvirtd.

Comment 7:

(In reply to comment #6)
> It's a libvirtd problem, not qemu-kvm. Look at the fds opened by libvirtd.

Alex, thanks for pointing this out; I will test it again.

I reproduced this problem on libvirt-0.10.2-10.el6.x86_64:

[root@201 ~]# sh 877095.sh
XXX
Device attached successfully
Device detached successfully
count 16
^C
[root@201 ~]# ll /proc/`pidof libvirtd`/fd
total 0
lr-x------. 1 root root 64 Nov 27 16:25 0 -> /dev/null
l-wx------. 1 root root 64 Nov 27 16:25 1 -> /dev/null
l-wx------. 1 root root 64 Nov 27 16:25 10 -> pipe:[118817327]
lrwx------. 1 root root 64 Nov 27 16:25 11 -> socket:[118817328]
lrwx------. 1 root root 64 Nov 27 16:25 12 -> socket:[118817349]
lrwx------. 1 root root 64 Nov 27 16:25 13 -> socket:[118817358]
lrwx------. 1 root root 64 Nov 27 16:25 14 -> socket:[118817490]
lrwx------. 1 root root 64 Nov 27 16:25 15 -> socket:[118817420]
l-wx------. 1 root root 64 Nov 27 16:25 16 -> /proc/mtrr
lrwx------. 1 root root 64 Nov 27 16:25 17 -> socket:[118817875]
lrwx------. 1 root root 64 Nov 27 16:25 18 -> /var/run/libvirt/network/nwfilter.leases
lrwx------. 1 root root 64 Nov 27 16:28 19 -> socket:[119746477]
l-wx------. 1 root root 64 Nov 27 16:25 2 -> /dev/null
lrwx------. 1 root root 64 Nov 27 16:28 20 -> socket:[119746478]
lrwx------. 1 root root 64 Nov 27 16:28 21 -> socket:[119746479]
lrwx------. 1 root root 64 Nov 27 16:28 22 -> /sys/devices/pci0000:00/0000:00:19.0/config
lrwx------. 1 root root 64 Nov 27 16:28 24 -> socket:[119742094]
lrwx------. 1 root root 64 Nov 27 16:28 25 -> socket:[119741984]
lrwx------. 1 root root 64 Nov 27 16:28 28 -> /sys/devices/pci0000:00/0000:00:19.0/config
lrwx------. 1 root root 64 Nov 27 16:34 29 -> /sys/devices/pci0000:00/0000:00:19.0/config
lr-x------. 1 root root 64 Nov 27 16:25 3 -> /dev/urandom
lrwx------. 1 root root 64 Nov 27 16:34 30 -> /sys/devices/pci0000:00/0000:00:19.0/config
lrwx------. 1 root root 64 Nov 27 16:34 31 -> /sys/devices/pci0000:00/0000:00:19.0/config
lrwx------. 1 root root 64 Nov 27 16:34 32 -> /sys/devices/pci0000:00/0000:00:19.0/config
lrwx------. 1 root root 64 Nov 27 16:34 33 -> /sys/devices/pci0000:00/0000:00:19.0/config
lrwx------. 1 root root 64 Nov 27 16:34 34 -> /sys/devices/pci0000:00/0000:00:19.0/config
lrwx------. 1 root root 64 Nov 27 16:34 35 -> /sys/devices/pci0000:00/0000:00:19.0/config
lrwx------. 1 root root 64 Nov 27 16:34 36 -> /sys/devices/pci0000:00/0000:00:19.0/config
lrwx------. 1 root root 64 Nov 27 16:34 37 -> /sys/devices/pci0000:00/0000:00:19.0/config
lrwx------. 1 root root 64 Nov 27 16:34 38 -> /sys/devices/pci0000:00/0000:00:19.0/config
lrwx------. 1 root root 64 Nov 27 16:34 39 -> /sys/devices/pci0000:00/0000:00:19.0/config
l-wx------. 1 root root 64 Nov 27 16:25 4 -> /var/log/libvirt/libvirtd.log
lrwx------. 1 root root 64 Nov 27 16:34 40 -> /sys/devices/pci0000:00/0000:00:19.0/config
lrwx------. 1 root root 64 Nov 27 16:34 41 -> /sys/devices/pci0000:00/0000:00:19.0/config
lrwx------. 1 root root 64 Nov 27 16:34 42 -> /sys/devices/pci0000:00/0000:00:19.0/config
l-wx------. 1 root root 64 Nov 27 16:25 5 -> /var/run/libvirtd.pid
lrwx------. 1 root root 64 Nov 27 16:25 6 -> socket:[118817451]
lr-x------. 1 root root 64 Nov 27 16:25 7 -> pipe:[118817326]
l-wx------. 1 root root 64 Nov 27 16:25 8 -> pipe:[118817326]
lr-x------. 1 root root 64 Nov 27 16:25 9 -> pipe:[118817327]

Comment 8:

I was able to partially reproduce the issue. The config file remains open after attach-device and even detach-device, but as soon as I call attach-device again, the originally opened fd gets closed and a new one is opened. In other words, I can only see one open file descriptor for the same config file at a time.

To confirm it's just bad luck rather than a different bug, could you run the attached systemtap script? When everything is ready for the test loop, run the following command:

stap -d /usr/sbin/libvirtd --ldd ./unclosed-fds.stp libvirtd

wait until it says "Tracking unclosed fds..." and run the reproducer script (limiting the number of iterations may be a good idea). Once the reproducer stops, abort the systemtap script with Ctrl-C. The script will produce a lot of data, but I'm mainly interested in the data following "Unclosed fds:", which is printed after the script gets Ctrl-C.

Created attachment 654929 [details]
unclosed-fds.stp
Comment 9:

Created attachment 655169 [details]
Full log from stap script

Full log attached, but here's the end result from a 3-loop run:
Unclosed fds:
2607: "/sys/bus/pci/devices/0000:00:19.0/config" still open as 26 from:
__open_nocancel+0x24 [libpthread-2.12.so]
pciDeviceFileIterate+0x29e [libvirt.so.0.10.2]
pciDeviceFileIterate+0x652 [libvirt.so.0.10.2]
pciResetDevice+0x17d [libvirt.so.0.10.2]
qemuDomainSnapshotDiscardAll+0x234d1 [libvirtd]
qemuDomainSnapshotDiscardAll+0xf1f9 [libvirtd]
qemuDomainSnapshotDiscardAll+0xf76e [libvirtd]
_init+0x437c4 [libvirtd]
virDomainAttachDeviceFlags+0x11f [libvirt.so.0.10.2]
_init+0x16759 [libvirtd]
virNetServerProgramDispatch+0x462 [libvirt.so.0.10.2]
virNetServerA
2607: "/sys/bus/pci/devices/0000:00:19.0/config" still open as 27 from:
__open_nocancel+0x24 [libpthread-2.12.so]
pciDeviceFileIterate+0x29e [libvirt.so.0.10.2]
pciDeviceFileIterate+0x652 [libvirt.so.0.10.2]
pciResetDevice+0x17d [libvirt.so.0.10.2]
qemuDomainSnapshotDiscardAll+0x234d1 [libvirtd]
qemuDomainSnapshotDiscardAll+0xf1f9 [libvirtd]
qemuDomainSnapshotDiscardAll+0xf76e [libvirtd]
_init+0x437c4 [libvirtd]
virDomainAttachDeviceFlags+0x11f [libvirt.so.0.10.2]
_init+0x16759 [libvirtd]
virNetServerProgramDispatch+0x462 [libvirt.so.0.10.2]
virNetServerA
2607: "/sys/bus/pci/devices/0000:00:19.0/config" still open as 28 from:
__open_nocancel+0x24 [libpthread-2.12.so]
pciDeviceFileIterate+0x29e [libvirt.so.0.10.2]
pciDeviceFileIterate+0x652 [libvirt.so.0.10.2]
pciResetDevice+0x17d [libvirt.so.0.10.2]
qemuDomainSnapshotDiscardAll+0x234d1 [libvirtd]
qemuDomainSnapshotDiscardAll+0xf1f9 [libvirtd]
qemuDomainSnapshotDiscardAll+0xf76e [libvirtd]
_init+0x437c4 [libvirtd]
virDomainAttachDeviceFlags+0x11f [libvirt.so.0.10.2]
_init+0x16759 [libvirtd]
virNetServerProgramDispatch+0x462 [libvirt.so.0.10.2]
virNetServerA
Comment 10:

Hmm, the stack traces are quite confused. But anyway, I was finally able to reproduce this issue completely, i.e., with an increasing number of unclosed file descriptors:

Unclosed fds:
2618: "/sys/bus/pci/devices/0000:07:06.0/config" still open as 24 from:
__open_nocancel+0x24 [libpthread-2.12.so]
pciOpenConfig+0x6e [libvirt.so.0.10.2]
pciInitDevice+0x12 [libvirt.so.0.10.2]
pciResetDevice+0x17d [libvirt.so.0.10.2]
qemuPrepareHostdevPCIDevices+0x2d1 [libvirtd]
qemuDomainAttachHostPciDevice+0x99 [libvirtd]
qemuDomainAttachHostDevice+0x1ae [libvirtd]
qemuDomainModifyDeviceFlags+0x884 [libvirtd]
virDomainAttachDeviceFlags+0x11f [libvirt.so.0.10.2]
remoteDispatchDomainAttachDeviceFlagsHelper+0x109 [libvirtd]
virNetServerProgramDispatch+0x462 [
2618: "/sys/bus/pci/devices/0000:07:06.0/config" still open as 26 from:
__open_nocancel+0x24 [libpthread-2.12.so]
pciOpenConfig+0x6e [libvirt.so.0.10.2]
pciInitDevice+0x12 [libvirt.so.0.10.2]
pciResetDevice+0x17d [libvirt.so.0.10.2]
qemuPrepareHostdevPCIDevices+0x2d1 [libvirtd]
qemuDomainAttachHostPciDevice+0x99 [libvirtd]
qemuDomainAttachHostDevice+0x1ae [libvirtd]
qemuDomainModifyDeviceFlags+0x884 [libvirtd]
virDomainAttachDeviceFlags+0x11f [libvirt.so.0.10.2]
remoteDispatchDomainAttachDeviceFlagsHelper+0x109 [libvirtd]
virNetServerProgramDispatch+0x462 [
2618: "/sys/bus/pci/devices/0000:07:06.0/config" still open as 27 from:
__open_nocancel+0x24 [libpthread-2.12.so]
pciOpenConfig+0x6e [libvirt.so.0.10.2]
pciInitDevice+0x12 [libvirt.so.0.10.2]
pciResetDevice+0x17d [libvirt.so.0.10.2]
qemuPrepareHostdevPCIDevices+0x2d1 [libvirtd]
qemuDomainAttachHostPciDevice+0x99 [libvirtd]
qemuDomainAttachHostDevice+0x1ae [libvirtd]
qemuDomainModifyDeviceFlags+0x884 [libvirtd]
virDomainAttachDeviceFlags+0x11f [libvirt.so.0.10.2]
remoteDispatchDomainAttachDeviceFlagsHelper+0x109 [libvirtd]
virNetServerProgramDispatch+0x462 [
...

Comment 11:

Patches sent upstream for review:
https://www.redhat.com/archives/libvir-list/2012-December/msg00060.html
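The clean traces in comment 10 point at the attach path: pciResetDevice calls pciInitDevice, which opens the device's config file via pciOpenConfig, and nothing closes that fd before the temporary device object goes away. As an illustration only (schematic C; the struct and function bodies here are invented, not libvirt's actual code), the bug pattern reduces to:

#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Schematic only: the names echo pciInitDevice/pciOpenConfig/
 * pciResetDevice, but the types and bodies are made up. */
struct pci_device {
    char path[256];   /* .../config in sysfs */
    int config_fd;    /* -1 until the init step caches an open fd */
};

static int pci_init_device(struct pci_device *dev)
{
    dev->config_fd = open(dev->path, O_RDWR);   /* like pciOpenConfig */
    return dev->config_fd < 0 ? -1 : 0;
}

static int pci_reset_device(struct pci_device *dev)
{
    if (dev->config_fd < 0 && pci_init_device(dev) < 0)
        return -1;
    /* ... issue the reset through dev->config_fd ... */
    return 0;
}

static void pci_free_device(struct pci_device *dev)
{
    /* The leak: if a temporary device object built for attach/detach
     * is dropped without reaching this close, libvirtd keeps one more
     * open config fd per hotplug. Guaranteeing the close on every
     * path is the shape of the fix. */
    if (dev->config_fd >= 0)
        close(dev->config_fd);
    free(dev);
}

Whether the actual upstream patches take exactly this shape is not shown here; the series linked in comment 11 has the details.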
Comment 12:

Verified on libvirt-0.10.2-11.el6.x86_64.

1) Update the domain name and the PCI bus/slot/function numbers to match your host:

# cat bug.sh
#!/bin/sh

i=0
while (true); do
    virsh attach-device w host.xml --persistent
    sleep 10
    DRV=$(basename $(readlink -f /sys/bus/pci/devices/0000\:03\:10.4/driver/))
    if [ "$DRV" != "pci-stub" ]; then
        exit 1
    fi
    virsh detach-device w host.xml --persistent
    sleep 10
    DRV=$(basename $(readlink -f /sys/bus/pci/devices/0000\:03\:10.4/driver/))
    if [ "$DRV" != "igbvf" ]; then
        exit 1
    fi
    i=$(( i + 1 ))
    echo count $i
done

2) # cat host.xml
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x03' slot='0x10' function='0x4'/>
  </source>
</hostdev>

3) Run bug.sh:

# sh -x bug.sh
+ i=0
+ true
+ virsh attach-device w host.xml --persistent
Device attached successfully
+ sleep 10
+++ readlink -f /sys/bus/pci/devices/0000:03:10.4/driver/
++ basename /sys/bus/pci/drivers/pci-stub
+ DRV=pci-stub
+ '[' pci-stub '!=' pci-stub ']'
+ virsh detach-device w host.xml --persistent
Device detached successfully
+ sleep 10
+++ readlink -f /sys/bus/pci/devices/0000:03:10.4/driver/
++ basename /sys/bus/pci/drivers/igbvf
+ DRV=igbvf
+ '[' igbvf '!=' igbvf ']'
+ i=1
+ echo count 1
count 1
+ true
+ virsh attach-device w host.xml --persistent
Device attached successfully
...

4) Check the files opened by libvirtd:

# ll /proc/`pidof libvirtd`/fd
total 0
lr-x------. 1 root root 64 Dec 6 14:20 0 -> /dev/null
l-wx------. 1 root root 64 Dec 6 14:20 1 -> /dev/null
l-wx------. 1 root root 64 Dec 6 14:20 10 -> pipe:[28888]
lrwx------. 1 root root 64 Dec 6 14:20 11 -> socket:[28889]
lrwx------. 1 root root 64 Dec 6 14:20 12 -> socket:[28892]
lrwx------. 1 root root 64 Dec 6 14:20 13 -> socket:[28894]
lrwx------. 1 root root 64 Dec 6 14:20 14 -> socket:[28897]
l-wx------. 1 root root 64 Dec 6 14:20 15 -> /proc/mtrr
lrwx------. 1 root root 64 Dec 6 14:20 16 -> socket:[28899]
lrwx------. 1 root root 64 Dec 6 14:20 17 -> socket:[28916]
lrwx------. 1 root root 64 Dec 6 14:20 18 -> socket:[30555]
lrwx------. 1 root root 64 Dec 6 14:20 19 -> socket:[30561]
l-wx------. 1 root root 64 Dec 6 14:20 2 -> /dev/null
lrwx------. 1 root root 64 Dec 6 14:20 20 -> socket:[30562]
lrwx------. 1 root root 64 Dec 6 14:20 21 -> socket:[46361]
lrwx------. 1 root root 64 Dec 6 14:20 22 -> socket:[46366]
lrwx------. 1 root root 64 Dec 6 14:20 23 -> socket:[46367]
lrwx------. 1 root root 64 Dec 6 14:20 25 -> socket:[46770]
lrwx------. 1 root root 64 Dec 6 14:20 26 -> socket:[46751]
lr-x------. 1 root root 64 Dec 6 14:20 3 -> /dev/urandom
l-wx------. 1 root root 64 Dec 6 14:20 4 -> /var/log/libvirtd.log
l-wx------. 1 root root 64 Dec 6 14:20 5 -> /var/run/libvirtd.pid
lrwx------. 1 root root 64 Dec 6 14:20 6 -> socket:[28898]
lr-x------. 1 root root 64 Dec 6 14:20 7 -> pipe:[28887]
l-wx------. 1 root root 64 Dec 6 14:20 8 -> pipe:[28887]
lr-x------. 1 root root 64 Dec 6 14:20 9 -> pipe:[28888]

There is no leftover fd that should have been closed, such as:

lrwx------. 1 root root 64 Nov 27 16:34 35 -> /sys/devices/pci0000:00/0000:00:19.0/config

Comment 13:

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0276.html