Bug 1159766 - File descriptor leak when calling virDomainCreateXMLWithFiles or virDomainCreateWithFiles.
Summary: File descriptor leak when calling virDomainCreateXMLWithFiles or virDomainCre...
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: Virtualization Tools
Classification: Community
Component: libvirt
Version: unspecified
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: Libvirt Maintainers
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2014-11-03 09:49 UTC by Ben Gray
Modified: 2016-04-29 10:48 UTC (History)
5 users (show)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2016-04-29 10:48:48 UTC
Embargoed:


Attachments (Terms of Use)

Description Ben Gray 2014-11-03 09:49:46 UTC
Description of problem:
When passing file descriptors into either virDomainCreateXMLWithFiles or virDomainCreateWithFiles, they are dup'ed but never closed after they're sent to libvirtd.

The virNetClientProgramCall function dup's the file descriptors and adds them to a virNetMessagePtr, however when the file descriptors are eventually sent in the virNetClientIOWriteMessage function they are removed from the msg but not closed.


How reproducible:
Using libvirt client to connect to libvirtd with lxc driver, 100% reproducible.

Steps to Reproduce:
1. fd = open("/dev/null", O_RDWR);
2. virDomainCreateXMLWithFiles(domain, xml, 1, &fd, 0);
3. close(fd);

Actual results:
lsof will show that /dev/null still has an open file descriptor.

Expected results:
lsof should show that /dev/null is closed.


Additional info:
I did the following change to virNetClientIOWriteMessage which fixed the problem for me, but haven't done a full investigation into where else it could be leaking.


 static ssize_t
 virNetClientIOWriteMessage(virNetClientPtr client,
                           virNetClientCallPtr thecall)
 {
    ssize_t ret = 0;

    if (thecall->msg->bufferOffset < thecall->msg->bufferLength) {
        ret = virNetSocketWrite(client->sock,
                                thecall->msg->buffer + thecall->msg->bufferOffset,
                                thecall->msg->bufferLength - thecall->msg->bufferOffset);
        if (ret <= 0)
            return ret;

        thecall->msg->bufferOffset += ret;
    }

    if (thecall->msg->bufferOffset == thecall->msg->bufferLength) {
        size_t i;
        for (i = thecall->msg->donefds; i < thecall->msg->nfds; i++) {
            int rv;
            if ((rv = virNetSocketSendFD(client->sock, thecall->msg->fds[i])) < 0)
                return -1;
            if (rv == 0) /* Blocking */
                return 0;
+           VIR_FORCE_CLOSE(thecall->msg->fds[i]);
            thecall->msg->donefds++;
        }
        thecall->msg->donefds = 0;
        thecall->msg->bufferOffset = thecall->msg->bufferLength = 0;
        VIR_FREE(thecall->msg->fds);
        VIR_FREE(thecall->msg->buffer);
        if (thecall->expectReply)
            thecall->mode = VIR_NET_CLIENT_MODE_WAIT_RX;
        else
            thecall->mode = VIR_NET_CLIENT_MODE_COMPLETE;
    }

    return ret;

Comment 1 Cole Robinson 2016-04-10 19:57:11 UTC
Sorry this didn't receive a timely response, and thanks for trying to fix this. I don't really know how that API is expected to be used... do you know if this is still an issue with more recent libvirt?

Comment 2 Ben Gray 2016-04-11 07:41:15 UTC
Yup - it's still an issue, we recently moved up to v1.3.2 and the problem was still present there (we also applied the above patch to fix it).

We use the 'with files' API to create LXC containers with various file descriptors passed in for things like logging pipes, wayland sockets, IPC mechanism, etc.

FWIW - we've recently discovered another bug with the RPC code regarding passing file descriptors, causing corrupt RPC messages, I'll raise a separate bug for that shortly.

Comment 3 Cole Robinson 2016-04-23 23:03:20 UTC
I dug into this a bunch, and I think your patch is correct and probably the best we can do at the moment. Sent upstream with a bunch of comments:

http://www.redhat.com/archives/libvir-list/2016-April/msg01618.html

Comment 4 Pavel Hrdina 2016-04-27 16:17:26 UTC
This is not a bug and we don't leak any FD here.  In fact, this is expected behavior.  You start an LXC domain with FD passed to this domain so it's expected.  I've tried it too and as you can see from those outputs:

from ps:

 4418 ?        Sl     0:00 /usr/libexec/libvirt_lxc --name test --console 24 --passfd 23 --security=selinux --handshake 27
 4419 ?        S      0:00  \_ /bin/sh

from lsof:

 sh      4419 root    3u   REG  253,2        0 394757 /tmp/test.txt

and this is the XML of that domain:

<domain type='lxc'>                                                              
  <name>test</name>                                                              
  <memory>500000</memory>                                                        
  <os>                                                                           
    <type>exe</type>                                                             
    <init>/bin/sh</init>                                                         
  </os>                                                                          
  <vcpu>1</vcpu>                                                                 
  <clock offset='utc'/>                                                          
  <on_poweroff>destroy</on_poweroff>                                             
  <on_reboot>restart</on_reboot>                                                 
  <on_crash>destroy</on_crash>                                                   
  <devices>                                                                      
    <emulator>/usr/libexec/libvirt_lxc</emulator>                                
    <console type='pty' />                                                       
  </devices>                                                                     
</domain>

The 4419 process is actually the /bin/sh process defined in the domain XML.

Comment 5 Cole Robinson 2016-04-27 16:52:08 UTC
Reopening per further list discussion:

http://www.redhat.com/archives/libvir-list/2016-April/msg01820.html

Comment 6 Cole Robinson 2016-04-29 10:48:48 UTC
Pushed upstream now, thanks for the report!

commit 5ba48584fbc5079c0ddbc9e9a52c96d7bcef0761
Author: Ben Gray <ben.r.gray>
Date:   Sat Apr 23 18:38:21 2016 -0400

    rpc: Don't leak fd via CreateXMLWithFiles


Note You need to log in before you can comment on or make changes to this bug.