Bug 913774

Summary: libguestfs: could not create appliance through libvirt when URI is qemu:///system or running as root (which causes system to be used implicitly)
Product: [Community] Virtualization Tools Reporter: Dan Prince <dprince>
Component: libvirt Assignee: Libvirt Maintainers <libvirt-maint>
Status: CLOSED DEFERRED QA Contact:
Severity: high Docs Contact:
Priority: unspecified    
Version: unspecified CC: berrange, crobinso, dprince, mbooth, rbalakri, rjones, unicell
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-04-09 22:36:12 UTC Type: Bug

Description Dan Prince 2013-02-22 01:43:55 UTC
Description of problem:

I get the following error in OpenStack Nova's compute.log file when trying to inject files into images via libguestfs when attach mode is set with:

self.handle.set_attach_method('libvirt:qemu:///system')

2013-02-21 10:48:55.924 ERROR nova.compute.manager [req-5cbb03b2-6645-4df5-a21d-965893a3a059 6dcaa7505f9340268bcc7b0f1f11f651 d0c6d5dec6474bd68c880061060199bc] [instance: 1a6883e8-3d95-4474-8d58-6d642e116f71] Error: ['Traceback (most recent call last):\n', '  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 771, in _run_instance\n    injected_files, admin_password)\n', '  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1040, in _spawn\n    block_device_info)\n', '  File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 1480, in spawn\n    admin_pass=admin_password)\n', '  File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 1857, in _create_image\n    mandatory=(\'files\',))\n', '  File "/usr/lib/python2.7/site-packages/nova/virt/disk/api.py", line 304, in inject_data\n    fs.setup()\n', '  File "/usr/lib/python2.7/site-packages/nova/virt/disk/vfs/guestfs.py", line 111, in setup\n    {\'imgfile\': self.imgfile, \'e\': e})\n', 'NovaException: Error mounting /var/lib/nova/instances/1a6883e8-3d95-4474-8d58-6d642e116f71/disk with libguestfs (could not create appliance through libvirt: internal error process exited while connecting to monitor: connect(unix:/tmp/libguestfsRajfrp/console.sock): Permission denied\nchardev: opening backend "socket" failed\n [code=1 domain=10])\n']

NOTE: This is not currently the default attach mode for OpenStack Nova but we would like it to be since Nova uses qemu:///system as the default libvirt URI.

Version-Release number of selected component (if applicable):

 libguestfs version 1.20.1-3.fc18.x86_64

 libvirt version 0.10.2.3-1.fc18.x86_64

How reproducible:

 This always happens if I use attach mode 'libvirt:qemu:///system' for libguestfs with OpenStack Nova.

Comment 1 Richard W.M. Jones 2013-02-22 13:30:26 UTC
Another report that was similar (in heat).  Notice that the
user is running libguestfs as root, which IIUC would cause libvirt
to try to connect to qemu:///system (implicitly).

    sudo -E heat-jeos -y create F17-x86_64-gold -d
    DEBUG:Debug level logging enabled
    DEBUG:libvirt bridge name is virbr0
    DEBUG:Libvirt type is kvm
    DEBUG:Name: F17-x86_64-gold, UUID: 868bc92a-1048-46b5-8f4d-1571ca4f6073
    ...
    ...
    libguestfs: trace: set_verbose true
    libguestfs: trace: set_verbose = 0
    libguestfs: create: flags = 0, handle = 0x2af8e50
    DEBUG:Adding ISO image /var/lib/oz/isos/Fedora17x86_64-iso.iso
    libguestfs: trace: add_drive "/var/lib/oz/isos/Fedora17x86_64-iso.iso" "readonly:true" "format:raw"
    libguestfs: trace: add_drive = 0
    DEBUG:Launching guestfs
    libguestfs: trace: launch
    libguestfs: trace: get_tmpdir
    libguestfs: trace: get_tmpdir = "/tmp"
    libguestfs: libvirt version = 10002
    libguestfs: [00000ms] connect to libvirt
    libguestfs: [00005ms] get libvirt capabilities
    libguestfs: [00742ms] parsing capabilities XML
    libguestfs: [00742ms] build appliance
    libguestfs: command: run: febootstrap-supermin-helper
    libguestfs: command: run: \ --verbose
    libguestfs: command: run: \ -f checksum
    libguestfs: command: run: \ /usr/lib64/guestfs/supermin.d
    libguestfs: command: run: \ x86_64
    supermin helper [00000ms] whitelist = (not specified), host_cpu = x86_64, kernel = (null), initrd = (null), appliance = (null)
    supermin helper [00000ms] inputs[0] = /usr/lib64/guestfs/supermin.d
    checking modpath /lib/modules/3.6.10-4.fc18.x86_64 is a directory
    picked vmlinuz-3.6.10-4.fc18.x86_64 because modpath /lib/modules/3.6.10-4.fc18.x86_64 exists
    checking modpath /lib/modules/3.7.4-204.fc18.x86_64 is a directory
    picked vmlinuz-3.7.4-204.fc18.x86_64 because modpath /lib/modules/3.7.4-204.fc18.x86_64 exists
    supermin helper [00000ms] finished creating kernel
    supermin helper [00000ms] visiting /usr/lib64/guestfs/supermin.d
    supermin helper [00000ms] visiting /usr/lib64/guestfs/supermin.d/base.img
    supermin helper [00000ms] visiting /usr/lib64/guestfs/supermin.d/daemon.img
    supermin helper [00000ms] visiting /usr/lib64/guestfs/supermin.d/hostfiles
    supermin helper [00023ms] visiting /usr/lib64/guestfs/supermin.d/init.img
    supermin helper [00023ms] visiting /usr/lib64/guestfs/supermin.d/udev-rules.img
    supermin helper [00023ms] adding kernel modules
    supermin helper [00046ms] finished creating appliance
    libguestfs: checksum of existing appliance: dda1e9feee07461eb36d11467ba5f0c2616dd92b19871973bc4c4667c3860bfa
    libguestfs: trace: get_cachedir
    libguestfs: trace: get_cachedir = "/var/tmp"
    libguestfs: command: run: qemu-img
    libguestfs: command: run: \ create
    libguestfs: command: run: \ -f qcow2
    libguestfs: command: run: \ -b /var/tmp/.guestfs-0/root.5770
    libguestfs: command: run: \ -o backing_fmt=raw
    libguestfs: command: run: \ /tmp/libguestfsKY8kTY/snapshot1
    Formatting '/tmp/libguestfsKY8kTY/snapshot1', fmt=qcow2 size=4294967296 backing_file='/var/tmp/.guestfs-0/root.5770' backing_fmt='raw' encryption=off cluster_size=65536 lazy_refcounts=off
    libguestfs: command: run: qemu-img
    libguestfs: command: run: \ create
    libguestfs: command: run: \ -f qcow2
    libguestfs: command: run: \ -b /var/lib/oz/isos/Fedora17x86_64-iso.iso
    libguestfs: command: run: \ -o backing_fmt=raw
    libguestfs: command: run: \ /tmp/libguestfsKY8kTY/snapshot2
    Formatting '/tmp/libguestfsKY8kTY/snapshot2', fmt=qcow2 size=3834642432 backing_file='/var/lib/oz/isos/Fedora17x86_64-iso.iso' backing_fmt='raw' encryption=off cluster_size=65536 lazy_refcounts=off
    libguestfs: set_socket_create_context: getcon failed: Invalid argument
    libguestfs: clear_socket_create_context: setsockcreatecon (NULL) failed: Invalid argument
    libguestfs: [00821ms] create libvirt XML
    libguestfs: trace: get_cachedir
    libguestfs: trace: get_cachedir = "/var/tmp"
    libguestfs: libvirt XML:\n<?xml version="1.0"?>\n<domain type="kvm" xmlns:qemu="http://libvirt.org/schemas/domain/qemu/1.0">\n <name>guestfs-rdzmql9qfeib6t63</name>\n <memory unit="MiB">500</memory>\n <currentMemory unit="MiB">500</currentMemory>\n <vcpu>1</vcpu>\n <clock offset="utc"/>\n <os>\n <type>hvm</type>\n <kernel>/var/tmp/.guestfs-0/kernel.5770</kernel>\n <initrd>/var/tmp/.guestfs-0/initrd.5770</initrd>\n <cmdline>panic=1 console=ttyS0 udevtimeout=600 no_timer_check acpi=off printk.time=1 cgroup_disable=memory root=/dev/sdb selinux=0 guestfs_verbose=1 TERM=xterm</cmdline>\n </os>\n <on_reboot>destroy</on_reboot>\n <devices>\n <controller type="scsi" index="0" model="virtio-scsi"/>\n <disk device="disk" type="file">\n <source file="/tmp/libguestfsKY8kTY/snapshot2"/>\n <target dev="sda" bus="scsi"/>\n <driver name="qemu" type="qcow2"/>\n <address type="drive" controller="0" bus="0" target="0" unit="0"/>\n </disk>\n <disk type="file" device="disk">\n <source file="/tmp/libguestfsKY8kTY/snapshot1"/>\n <target dev="sdb" bus="scsi"/>\n <driver name="qemu" type="qcow2" cache="unsafe"/>\n <address type="drive" controller="0" bus="0" target="1" unit="0"/>\n <shareable/>\n </disk>\n <serial type="unix">\n <source mode="connect" path="/tmp/libguestfsKY8kTY/console.sock"/>\n <target port="0"/>\n </serial>\n <channel type="unix">\n <source mode="connect" path="/tmp/libguestfsKY8kTY/guestfsd.sock"/>\n <target type="virtio" name="org.libguestfs.channel.0"/>\n </channel>\n </devices>\n <qemu:commandline>\n <qemu:env name="TMPDIR" value="/var/tmp"/>\n </qemu:commandline>\n</domain>\n
    libguestfs: [00822ms] launch libvirt guest
    libguestfs: clear_socket_create_context: setsockcreatecon (NULL) failed: Invalid argument
    libguestfs: trace: launch = -1 (error)
    INFO:Cleaning up after install
    Usage: heat-jeos <command> [options] [args]
     
    Commands:
     
    list Prepare a template ready for Oz
     
    create Create a JEOS image from a template
     
    help <command> Output help for one of the commands below
     
    ERROR:ERROR: could not create appliance through libvirt: internal error process exited while connecting to monitor: connect(unix:/tmp/libguestfsKY8kTY/console.sock): Permission denied
    chardev: opening backend "socket" failed
    [code=1 domain=10]
    libguestfs: trace: close
    libguestfs: closing guestfs handle 0x2af8e50 (state 0)
    libguestfs: command: run: rm
    libguestfs: command: run: \ -rf /tmp/libguestfsKY8kTY

Comment 2 Richard W.M. Jones 2013-02-22 13:33:09 UTC
Adding Richard Harman to this bug who seemed to encounter
something pretty similar (see bug 909619 description).

Comment 3 Richard W.M. Jones 2013-02-28 14:42:27 UTC
A one-line reproducer for this bug is:

  LIBGUESTFS_ATTACH_METHOD=libvirt:qemu:///system libguestfs-test-tool

On Fedora 18:

 - didn't try non-root because it needs PolicyKit changes
 - fails as root

On Fedora Rawhide:

 - fails as non-root
 - OK as root

In both cases the error is a variation on:

libguestfs: error: could not create appliance through libvirt: internal error process exited while connecting to monitor: qemu-system-x86_64: -chardev socket,id=charserial0,path=/home/rjones/d/libguestfs/tmp/libguestfsVCZ6Up/console.sock: Failed to connect to socket: Permission denied
chardev: opening backend "socket" failed
 [code=1 domain=10]

Comment 4 Richard W.M. Jones 2013-03-05 17:09:24 UTC
I can no longer reproduce the Fedora 18 problem, but the
non-root Rawhide case is fairly clear.  The good news: it's
not SELinux!

When you run libguestfs as non-root with a qemu:///system URI,
what happens is that the systemwide libvirtd starts qemu as
'qemu.qemu'.  The sockets that it needs to open for writing have
default permissions, so for example if your umask is 022 then
it would be srwxr-xr-x.  Since 'qemu.qemu' is in the "other"
category, it won't be able to open the sockets and you'll get
permission denied.
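
The permission arithmetic above can be checked with a minimal Python sketch. It assumes Linux semantics, where a Unix socket file is created with mode 0777 masked by the umask and connect() needs write permission on the socket file. The function names are made up for this illustration and are not part of libguestfs or libvirt.

```python
import os
import socket
import stat
import tempfile


def socket_mode_under_umask(umask):
    """Bind a Unix socket under the given umask and return its permission bits."""
    old_umask = os.umask(umask)
    try:
        with tempfile.TemporaryDirectory() as tmpdir:
            path = os.path.join(tmpdir, "console.sock")
            with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
                sock.bind(path)  # creates the socket file on disk
                return stat.S_IMODE(os.stat(path).st_mode)
    finally:
        os.umask(old_umask)  # always restore the caller's umask


def other_can_connect(mode):
    """Assumption: connecting needs write permission for the 'other' class."""
    return bool(mode & stat.S_IWOTH)


mode = socket_mode_under_umask(0o022)
print(stat.filemode(stat.S_IFSOCK | mode), other_can_connect(mode))
```

With umask 022 the "other" write bit is cleared, which is the case a process running as 'qemu.qemu' falls into when the socket belongs to an ordinary user.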

There isn't any good way to solve this: You probably wouldn't
want the sockets to be 0777.  I think this is something that can
only be truly solved by having libvirtd open the sockets and pass
the file descriptor over to qemu.
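
As an illustration of the file descriptor passing idea, here is a minimal single-process Python sketch (Python 3.9+ for socket.send_fds and socket.recv_fds). It is not how libvirt or qemu implement anything: the socketpair merely stands in for some channel between a side that has permission to open the socket path and a side that does not, and all function names are hypothetical.

```python
import os
import socket
import tempfile


def connect_and_hand_over(path, channel):
    """Side with access: connect to the socket at `path`, then send the fd."""
    conn = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        conn.connect(path)
        socket.send_fds(channel, [b"fd"], [conn.fileno()])
    finally:
        conn.close()  # the receiver holds its own duplicate of the fd


def receive_connection(channel):
    """Side without access: gets a connected socket, never opens the path."""
    _msg, fds, _flags, _addr = socket.recv_fds(channel, 16, 1)
    return socket.socket(fileno=fds[0])


def demo():
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "console.sock")
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as listener:
            listener.bind(path)
            listener.listen(1)
            opener_end, worker_end = socket.socketpair()
            with opener_end, worker_end:
                connect_and_hand_over(path, opener_end)
                with receive_connection(worker_end) as handed_over:
                    handed_over.sendall(b"hello")
                peer, _ = listener.accept()
                with peer:
                    return peer.recv(16)
```

Because the receiving side only ever sees an already connected descriptor, the permissions on the socket file no longer matter to it, which is why this approach would avoid the need for mode 0777.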

However, I have added some debugging to libguestfs which should
make it easier in future to find out what's really going on with
this bug in all the different cases:

https://github.com/libguestfs/libguestfs/commit/ba08a51094409811d0bd01a7af5ec596a8640cc1

I will include this debugging patch in stable-1.20.

Comment 5 Daniel Berrangé 2013-03-06 13:22:13 UTC
I've run the libguestfs-test-tool example above and got the following

 F18
  - Run as root: pass
  - Run as non-root: fail

 Rawhide:
  - Run as root: pass
  - Run as non-root: fail

Not sure why the behaviour would be different yet though. In both cases the appliance ends up running as 'qemu:qemu' so whether the files were originally owned by root:root or  fred:fred, 'qemu:qemu' should face the same access control issues.

Comment 6 Richard W.M. Jones 2013-03-06 13:38:56 UTC
The patch mentioned in comment 4 is now in libguestfs 1.20.3,
waiting to go into updates-testing in Fedora 18:

https://admin.fedoraproject.org/updates/libguestfs-1.20.3-1.fc18

If you enable debug (which is always enabled for libguestfs-test-tool)
then it will print out the permissions and SELinux labels on the
appliance and the sockets directory, eg you'll see something like
this:

libguestfs: libvirt XML: [lots of XML]
libguestfs: command: run: ls
libguestfs: command: run: \ -a
libguestfs: command: run: \ -l
libguestfs: command: run: \ -Z /home/rjones/d/libguestfs/tmp/.guestfs-1000
libguestfs: drwxr-xr-x. rjones rjones unconfined_u:object_r:user_tmp_t:s0 .
libguestfs: drwxrwxr-x. rjones rjones system_u:object_r:tmp_t:s0       ..
libguestfs: -rwxr-xr-x. rjones rjones unconfined_u:object_r:user_tmp_t:s0 checksum
libguestfs: -rw-r--r--. rjones rjones unconfined_u:object_r:user_tmp_t:s0 initrd
libguestfs: -rw-r--r--. rjones rjones unconfined_u:object_r:user_tmp_t:s0 initrd.29438
libguestfs: -rw-r--r--. rjones rjones unconfined_u:object_r:user_tmp_t:s0 kernel
libguestfs: -rw-r--r--. rjones rjones unconfined_u:object_r:user_tmp_t:s0 kernel.29438
libguestfs: -rw-r--r--. rjones rjones unconfined_u:object_r:user_tmp_t:s0 root
libguestfs: -rw-r--r--. rjones rjones unconfined_u:object_r:user_tmp_t:s0 root.29438
libguestfs: command: run: ls
libguestfs: command: run: \ -a
libguestfs: command: run: \ -l
libguestfs: command: run: \ -Z /home/rjones/d/libguestfs/tmp/libguestfs8y6JiL
libguestfs: drwxr-xr-x. rjones rjones unconfined_u:object_r:user_tmp_t:s0 .
libguestfs: drwxrwxr-x. rjones rjones system_u:object_r:tmp_t:s0       ..
libguestfs: srwxrwxr-x. rjones rjones unconfined_u:object_r:user_tmp_t:s0 console.sock
libguestfs: srwxrwxr-x. rjones rjones unconfined_u:object_r:user_tmp_t:s0 guestfsd.sock
libguestfs: -rw-r--r--. rjones rjones unconfined_u:object_r:user_tmp_t:s0 snapshot1
libguestfs: [23352ms] launch libvirt guest

From that it should be pretty obvious why qemu.qemu cannot
access the socket.
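
To spell out the reasoning, the following Python sketch applies the usual owner/group/other check to one line of the `ls -l` style output above. It is a simplification under stated assumptions: it ignores root's override, ACLs, SELinux and directory traversal, and the function name and arguments are invented for this example.

```python
def can_write(ls_mode, owner, group, user, user_groups):
    """Return True if `user` has write permission according to the mode string.

    ls_mode is the first column of `ls -l`, e.g. "srwxrwxr-x." (a trailing
    SELinux dot is ignored). Exactly one permission class applies:
    owner first, then group, then other.
    """
    perms = ls_mode[1:10]
    if user == owner:
        triplet = perms[0:3]
    elif group in user_groups:
        triplet = perms[3:6]
    else:
        triplet = perms[6:9]
    return triplet[1] == "w"


# console.sock from the listing above: owned by rjones:rjones, mode srwxrwxr-x
print(can_write("srwxrwxr-x.", "rjones", "rjones", "qemu", {"qemu"}))
print(can_write("srwxrwxr-x.", "rjones", "rjones", "rjones", {"rjones"}))
```

The 'qemu' user is neither the owner nor a member of the 'rjones' group, so only the "other" bits (r-x) apply and the write needed to connect is refused.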

Without making those sockets mode 0777 which would be very
undesirable, I don't think there is a good way to solve this
within libguestfs.  Since we're not running as root, we cannot
chgrp the sockets.

libvirt should be opening the resources and passing file
descriptors to qemu, which IIRC was discussed upstream
(maybe just discussed for disk images, but IMHO it should do
it for sockets and everything else too).  Alternatively, perhaps
we could have a flag in the URI to tell libvirt not to chown
the qemu process (like qemu:///system?chown=0).

Comment 7 Richard W.M. Jones 2013-03-06 13:40:58 UTC
(In reply to comment #5)
> Not sure why the behaviour would be different yet though. In both cases the
> appliance ends up running as 'qemu:qemu' so whether the files were
> originally owned by root:root or  fred:fred, 'qemu:qemu' should face the
> same access control issues.

On this last point, the reason the behaviour is different is
because when libguestfs *is* running as root, it chgrp's the
sockets to root.qemu, mode 0660:

https://github.com/libguestfs/libguestfs/blob/master/src/launch-libvirt.c#L311

This is a horrible compromise because we don't know what user
libvirt will actually use (without parsing libvirtd.conf).
libvirt could make this sort of information available to callers.
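
For readers who do not want to read the C source, the root-only workaround described above amounts to roughly the following Python sketch. This is not the code in launch-libvirt.c; the helper names are hypothetical, and the group name "qemu" is the same guess the comment calls a compromise.

```python
import grp
import os


def lookup_gid(group_name):
    """Resolve a group name to a gid; raises KeyError if the group is missing."""
    return grp.getgrnam(group_name).gr_gid


def share_with_group(path, gid):
    """Hand group ownership of `path` to `gid` and restrict it to mode 0660."""
    os.chown(path, -1, gid)  # -1 leaves the owner unchanged
    os.chmod(path, 0o660)    # owner and group may read/write, others get nothing
```

Running as root, a caller would do something like `share_with_group(sock_path, lookup_gid("qemu"))`. A non-root caller can only chgrp to a group it already belongs to, which is why the same trick is unavailable in the non-root case.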

Comment 8 Daniel Berrangé 2013-03-06 13:45:55 UTC
(In reply to comment #7)
> (In reply to comment #5)
> > Not sure why the behaviour would be different yet though. In both cases the
> > appliance ends up running as 'qemu:qemu' so whether the files were
> > originally owned by root:root or  fred:fred, 'qemu:qemu' should face the
> > same access control issues.
> 
> On this last point, the reason the behaviour is different is
> because when libguestfs *is* running as root, it chgrp's the
> sockets to root.qemu, mode 0660:
> 
> https://github.com/libguestfs/libguestfs/blob/master/src/launch-libvirt.c#L311
> 
> This is a horrible compromise because we don't know what user
> libvirt will actually use (without parsing libvirtd.conf).
> libvirt could make this sort of information available to callers.

Ah ha, that's the bit of the puzzle I was missing. So really if we pretend that hack didn't exist, we have an issue we need to solve in general for both root & non-root usage.

Comment 9 Richard W.M. Jones 2013-07-15 11:55:05 UTC
This is a libvirt bug.  Can't be solved in libguestfs, libvirt
needs to set up the ownership of the sockets correctly, or else
open the sockets and pass the file descriptor to qemu.

Comment 10 Cole Robinson 2016-04-09 22:36:12 UTC
It doesn't sound like there's any functional impact remaining; maybe it was worked around in the libguestfs code. If this is still an issue in libvirt, please reopen and retitle the bug, and summarize exactly what libvirt change is expected.