Description of problem: When passing file descriptors into either virDomainCreateXMLWithFiles or virDomainCreateWithFiles, they are dup'ed but never closed after they're sent to libvirtd. The virNetClientProgramCall function dup's the file descriptors and adds them to a virNetMessagePtr, however when the file descriptors are eventually sent in the virNetClientIOWriteMessage function they are removed from the msg but not closed. How reproducible: Using libvirt client to connect to libvirtd with lxc driver, 100% reproducible. Steps to Reproduce: 1. fd = open("/dev/null", O_RDWR); 2. virDomainCreateXMLWithFiles(domain, xml, 1, &fd, 0); 3. close(fd); Actual results: lsof will show that /dev/null still has an open file descriptor. Expected results: lsof should show that /dev/null is closed. Additional info: I did the following change to virNetClientIOWriteMessage which fixed the problem for me, but haven't done a full investigation into where else it could be leaking. static ssize_t virNetClientIOWriteMessage(virNetClientPtr client, virNetClientCallPtr thecall) { ssize_t ret = 0; if (thecall->msg->bufferOffset < thecall->msg->bufferLength) { ret = virNetSocketWrite(client->sock, thecall->msg->buffer + thecall->msg->bufferOffset, thecall->msg->bufferLength - thecall->msg->bufferOffset); if (ret <= 0) return ret; thecall->msg->bufferOffset += ret; } if (thecall->msg->bufferOffset == thecall->msg->bufferLength) { size_t i; for (i = thecall->msg->donefds; i < thecall->msg->nfds; i++) { int rv; if ((rv = virNetSocketSendFD(client->sock, thecall->msg->fds[i])) < 0) return -1; if (rv == 0) /* Blocking */ return 0; + VIR_FORCE_CLOSE(thecall->msg->fds[i]); thecall->msg->donefds++; } thecall->msg->donefds = 0; thecall->msg->bufferOffset = thecall->msg->bufferLength = 0; VIR_FREE(thecall->msg->fds); VIR_FREE(thecall->msg->buffer); if (thecall->expectReply) thecall->mode = VIR_NET_CLIENT_MODE_WAIT_RX; else thecall->mode = VIR_NET_CLIENT_MODE_COMPLETE; } return ret;
Sorry this didn't receive a timely response, and thanks for trying to fix this. I don't really know how that API is expected to be used... do you know if this is still an issue with more recent libvirt?
Yup - it's still an issue, we recently moved up to v1.3.2 and the problem was still present there (we also applied the above patch to fix it). We use the 'with files' API to create LXC containers with various file descriptors passed in for things like logging pipes, wayland sockets, IPC mechanism, etc. FWIW - we've recently discovered another bug with the RPC code regarding passing file descriptors, causing corrupt RPC messages, I'll raise a separate bug for that shortly.
I dug into this a bunch, and I think your patch is correct and probably the best we can do at the moment. Sent upstream with a bunch of comments: http://www.redhat.com/archives/libvir-list/2016-April/msg01618.html
This is not a bug and we don't leak any FD here. In fact, this is expected behavior. You start an LXC domain with FD passed to this domain so it's expected. I've tried it too and as you can see from those outputs: from ps: 4418 ? Sl 0:00 /usr/libexec/libvirt_lxc --name test --console 24 --passfd 23 --security=selinux --handshake 27 4419 ? S 0:00 \_ /bin/sh from lsof: sh 4419 root 3u REG 253,2 0 394757 /tmp/test.txt and this is the XML of that domain: <domain type='lxc'> <name>test</name> <memory>500000</memory> <os> <type>exe</type> <init>/bin/sh</init> </os> <vcpu>1</vcpu> <clock offset='utc'/> <on_poweroff>destroy</on_poweroff> <on_reboot>restart</on_reboot> <on_crash>destroy</on_crash> <devices> <emulator>/usr/libexec/libvirt_lxc</emulator> <console type='pty' /> </devices> </domain> The 4419 process is actually the /bin/sh process defined in the domain XML.
Reopening per further list discussion: http://www.redhat.com/archives/libvir-list/2016-April/msg01820.html
Pushed upstream now, thanks for the report! commit 5ba48584fbc5079c0ddbc9e9a52c96d7bcef0761 Author: Ben Gray <ben.r.gray> Date: Sat Apr 23 18:38:21 2016 -0400 rpc: Don't leak fd via CreateXMLWithFiles