Bug 1431112 - Failure to start guest when /dev contains a mount point that is a file rather than directory
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Michal Privoznik
QA Contact: yafu
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-03-10 12:17 UTC by Daniel Berrangé
Modified: 2018-06-04 02:58 UTC (History)
CC List: 6 users

Fixed In Version: libvirt-3.7.0-1.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-04-10 10:42:33 UTC
Target Upstream Version:
Embargoed:


Attachments
ns_exec.c (1.33 KB, text/x-csrc), 2017-04-07 06:39 UTC, yafu


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHEA-2018:0704 | 0 | None | None | None | 2018-04-10 10:43:05 UTC

Description Daniel Berrangé 2017-03-10 12:17:46 UTC
Description of problem:
When setting up private filesystem namespaces for a guest, libvirt tries to preserve all sub-mounts under /dev. It does this by creating a temporary directory in /var/lib/libvirt and moving the mount point there. MS_MOVE, however, requires that the source and target be the same type of inode, i.e. if the source mount is a file, then the target of MS_MOVE must be a file too. Libvirt code currently assumes all mounts are directories.

This creates a failure when running libvirtd inside a docker container, because docker creates a file /dev/termination-log that is a file mount, not a directory mount.

libvirt:  error : Unable to move /dev/termination-log mount to /var/run/libvirt/qemu/kube-a4ad338b-369f-4d1a-90c3-e11a276983dc.termination-log: Invalid argument')
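
To illustrate the MS_MOVE constraint described above, here is a minimal sketch (not libvirt's actual code) that creates the move target as the same kind of inode as the source before moving the mount. The helper names make_move_target() and move_mount_preserving_type() are hypothetical, and the mount() call needs CAP_SYS_ADMIN.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/mount.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create `dst` so that it matches the inode type of
 * `src`. A directory source gets a directory target; anything else gets a
 * plain file. Returns 0 on success, -1 on failure with errno set.
 */
int make_move_target(const char *src, const char *dst)
{
    struct stat sb;

    assert(src != NULL && dst != NULL);

    if (stat(src, &sb) < 0)
        return -1;

    if (S_ISDIR(sb.st_mode)) {
        /* An already existing target directory is fine. */
        if (mkdir(dst, 0700) < 0 && errno != EEXIST)
            return -1;
        return 0;
    }

    /* File mount: the MS_MOVE target must be a file as well. */
    int fd = open(dst, O_WRONLY | O_CREAT, 0600);
    if (fd < 0)
        return -1;
    return close(fd);
}

/* Hypothetical wrapper: prepare the target, then move the mount onto it. */
int move_mount_preserving_type(const char *src, const char *dst)
{
    if (make_move_target(src, dst) < 0)
        return -1;
    return mount(src, dst, NULL, MS_MOVE, NULL);
}
```

With a target created this way, moving a file mount such as /dev/termination-log should no longer fail with "Invalid argument" because of a directory/file mismatch.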


Version-Release number of selected component (if applicable):
libvirt-3.1.0-2.el7

How reproducible:
Always

Steps to Reproduce:
1. touch /mnt/demo
2. touch /dev/demo
3. mount --bind /mnt/demo /dev/demo
4. virsh start $GUEST

Actual results:
Error message trying to move /dev/demo

Expected results:
Guest starts

Additional info:

Comment 2 Michal Privoznik 2017-03-13 12:38:30 UTC
Patch proposed upstream:

https://www.redhat.com/archives/libvir-list/2017-March/msg00528.html

Comment 5 yafu 2017-04-06 09:41:51 UTC
I can reproduce the bug with libvirt-3.1.0-2.el7.x86_64

Verification passes with libvirt-3.2.0-1.el7.x86_64.
Test steps:
1. touch /mnt/demo
2. touch /dev/demo
3. mount --bind /mnt/demo /dev/demo
4. Start a guest; the guest starts correctly:
   # virsh start rhel7.3
5. Compile the ns_exec.c from the attachment:
   # gcc -o ns_exec ns_exec.c
6. Create a script that lists the files in the /dev directory:
   # vim check.sh
   #!/bin/bash
   ls -lRZ /dev
7. Check that /dev/demo is mounted correctly in the /dev of the qemu process's mount namespace:
   # ./ns_exec /proc/`pidof qemu-kvm`/ns/mnt `pwd`/check.sh | grep -i demo
   -rw-r--r--. root root unconfined_u:object_r:etc_runtime_t:s0 demo
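
The contents of the ns_exec.c attachment are not reproduced in this report. As a rough sketch only, a helper with the same command-line shape (namespace file first, then the program to run) could be built around setns(2). The function name enter_ns_and_exec() is hypothetical, and this is an assumption about how such a tool might work, not the attached source.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Join the namespace referred to by ns_path (for example
 * /proc/<pid>/ns/mnt), then replace the current process with argv[0].
 * Returns -1 on failure; does not return on success.
 */
int enter_ns_and_exec(const char *ns_path, char *const argv[])
{
    int fd = open(ns_path, O_RDONLY | O_CLOEXEC);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    /* nstype 0 accepts any namespace type; joining usually needs root. */
    if (setns(fd, 0) < 0) {
        perror("setns");
        close(fd);
        return -1;
    }
    close(fd);

    /* execv() does no PATH lookup, so argv[0] should be an absolute path. */
    execv(argv[0], argv);
    perror("execv");    /* only reached if execv() failed */
    return -1;
}
```

A main() that passes argv[1] as ns_path and &argv[2] as the command vector would match the invocation used in step 7.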

Comment 6 yafu 2017-04-07 06:39:07 UTC
Created attachment 1269577 [details]
ns_exec.c

Comment 7 yafu 2017-06-12 07:32:00 UTC
Hi, Michal,

I found that the guest fails to start when a directory or file is mounted under a preserved mount point. I am not sure whether this needs to be fixed. Could you please check it? Thanks a lot.

Test steps:
1. touch /mnt/demo
2. touch /dev/shm/demo
3. mount --bind /mnt/demo /dev/shm/demo
4. Start a guest:
   # virsh start q35
   error: Failed to start domain q35
   error: internal error: Process exited prior to exec: libvirt: QEMU Driver error : Unable to stat: /dev/shm/test: No such file or directory
5. The guest starts successfully when namespace creation for the qemu process is disabled.

Comment 8 Michal Privoznik 2017-06-12 16:03:04 UTC
Oh, thank you for catching that. I've proposed the patches online:

https://www.redhat.com/archives/libvir-list/2017-June/msg00504.html

Comment 11 yafu 2017-06-15 10:06:10 UTC
Hi, Michal,

When I attach a disk whose source file is in the /dev directory, the attach fails with this error:
# virsh attach-device rhel7.3 disk.xml 
error: Failed to attach device from disk.xml
error: internal error: child reported: unable to set user and group to '107:107' on '/dev/test.img': No such file or directory

Could you please check it? Thanks a lot.

Test steps:
1. Prepare an image in /dev:
   # qemu-img create -f qcow2 /dev/test.img 10M

2. Prepare the disk XML:
   # cat disk.xml
   <disk type='file' device='disk'>
     <driver name='qemu' type='qcow2'/>
     <source file='/dev/test.img'/>
     <target dev='vdc' bus='virtio'/>
   </disk>

3. Attach the disk to a running guest:
   # virsh attach-device full-73 disk.xml
   error: Failed to attach device from disk.xml
   error: internal error: child reported: unable to set user and group to '107:107' on '/dev/test.img': No such file or directory

4. The disk with its source file in /dev attaches successfully when namespaces are disabled.

Comment 12 Michal Privoznik 2017-06-15 12:13:06 UTC
(In reply to yafu from comment #11)

Yeah, this is a bug in our code. The namespace implementation expects that anything under /dev is a device, not a regular file. So libvirt got confused here and didn't create the file in the namespace.

Comment 13 yafu 2017-06-16 04:38:46 UTC
(In reply to Michal Privoznik from comment #12)


Thanks, Michal. I filed a bug to track this issue:
https://bugzilla.redhat.com/show_bug.cgi?id=1462060

Comment 14 Michal Privoznik 2017-06-16 12:43:29 UTC
I've just pushed the patches upstream:

commit 6451b55ec3d801bb03e912b0811408cf82cfc880 (HEAD -> master, origin/master, origin/HEAD, qemu_ns)
Author:     Michal Privoznik <mprivozn>
AuthorDate: Mon Jun 12 16:44:45 2017 +0200
Commit:     Michal Privoznik <mprivozn>
CommitDate: Fri Jun 16 14:38:49 2017 +0200

    qemuDomainGetPreservedMounts: Fix suffixes for corner cases
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1431112
    
    Imagine a FS mounted on /dev/blah/blah2. Our process of creating
    suffix for temporary location where all the mounted filesystems
    are moved is very simplistic. We want:
    
    /var/run/libvirt/qemu/$domName.$suffix\
    
    where $suffix is just the mount point path stripped of the "/dev/"
    prefix. For instance:
    
    /var/run/libvirt/qemu/fedora.mqueue  for /dev/mqueue
    /var/run/libvirt/qemu/fedora.pts     for /dev/pts
    
    and so on. Now if we plug /dev/blah/blah2 into the example we see
    some misbehaviour:
    
    /var/run/libvirt/qemu/fedora.blah/blah2
    
    Well, misbehaviour if /dev/blah/blah2 is a file, because in that
    case we call virFileTouch() instead of virFileMakePath().
    The solution is to replace all the slashes in the suffix with say
    dots. That way we don't have to care about nested directories.
    IOW, the result we want for given example is:
    
    /var/run/libvirt/qemu/fedora.blah.blah2
    
    Signed-off-by: Michal Privoznik <mprivozn>
    Reviewed-by: John Ferlan <jferlan>

commit cdd9205dfffa3aaed935446a41f0d2dd1357c268
Author:     Michal Privoznik <mprivozn>
AuthorDate: Mon Jun 12 16:28:03 2017 +0200
Commit:     Michal Privoznik <mprivozn>
CommitDate: Fri Jun 16 14:38:23 2017 +0200

    qemuDomainGetPreservedMounts: Prune nested mount points
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1431112
    
    There can be nested mount points. For instance /dev/shm/blah can
    be a mount point and /dev/shm too. It doesn't make much sense to
    return the former path because callers preserve the latter (and
    with that the former too). Therefore prune nested mount points.
    
    Signed-off-by: Michal Privoznik <mprivozn>
    Reviewed-by: John Ferlan <jferlan>

commit 6ab3e2f6c4c665efdddb313ac9ecd80bf9c67670
Author:     Michal Privoznik <mprivozn>
AuthorDate: Mon Jun 12 17:46:30 2017 +0200
Commit:     Michal Privoznik <mprivozn>
CommitDate: Fri Jun 16 14:29:12 2017 +0200

    qemuDomainBuildNamespace: Clean up temp files
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1431112
    
    After 290a00e41d we know how to deal with file mount points.
    However, when cleaning up the temporary location for preserved
    mount points we are still calling rmdir(). This won't fly for
    files. We need to call unlink(). Now, since we don't really care
    if the cleanup succeeded or not (it's the best effort anyway), we
    can call both rmdir() and unlink() without need for
    differentiation between files and directories.
    
    Signed-off-by: Michal Privoznik <mprivozn>
    Reviewed-by: John Ferlan <jferlan>

v3.4.0-125-g6451b55ec
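
The suffix handling and the pruning of nested mount points described in the commit messages above can be sketched roughly as follows. This is an illustration under assumptions, not the code from those commits: the function names preserved_mount_path() and prune_nested_mounts() are hypothetical, and the sketch works on plain C strings rather than libvirt's own helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "<state_dir>/<dom_name>.<suffix>", where <suffix> is mount_point
 * without its "/dev/" prefix and with every '/' replaced by '.'.
 * Returns 0 on success, -1 on a bad prefix or a too-small buffer.
 */
int preserved_mount_path(const char *state_dir, const char *dom_name,
                         const char *mount_point, char *out, size_t out_len)
{
    const char *prefix = "/dev/";
    size_t prefix_len = strlen(prefix);

    assert(state_dir && dom_name && mount_point && out);

    if (strncmp(mount_point, prefix, prefix_len) != 0 ||
        mount_point[prefix_len] == '\0')
        return -1;

    int n = snprintf(out, out_len, "%s/%s.", state_dir, dom_name);
    if (n < 0 || (size_t)n >= out_len)
        return -1;

    size_t base = (size_t)n;
    int m = snprintf(out + base, out_len - base, "%s",
                     mount_point + prefix_len);
    if (m < 0 || (size_t)m >= out_len - base)
        return -1;

    /* Flatten only the suffix, so nested paths need no directories. */
    for (char *p = out + base; *p != '\0'; p++) {
        if (*p == '/')
            *p = '.';
    }
    return 0;
}

/* True when `path` lies strictly below `parent`, compared per component. */
static bool is_nested_under(const char *path, const char *parent)
{
    size_t len = strlen(parent);
    return strncmp(path, parent, len) == 0 && path[len] == '/';
}

/*
 * Drop every entry that is nested under another entry in the list.
 * Compacts the array in place, keeps the original order, and returns
 * the number of entries kept.
 */
size_t prune_nested_mounts(const char **mounts, size_t count)
{
    size_t kept = 0;

    for (size_t i = 0; i < count; i++) {
        bool nested = false;
        for (size_t j = 0; j < count && !nested; j++)
            nested = (j != i) && is_nested_under(mounts[i], mounts[j]);
        if (!nested)
            mounts[kept++] = mounts[i];
    }
    return kept;
}
```

For example, /dev/blah/blah2 with domain name "fedora" yields /var/run/libvirt/qemu/fedora.blah.blah2, and a list containing both /dev/shm/blah and /dev/shm keeps only /dev/shm.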

Comment 16 Antenore Gatta 2017-11-13 14:16:45 UTC
I just wanted to add that this also happens with overlayfs, for example:

overlaid                 3.7G  1.6G  2.1G  44% /dev/shm/asd-antenore/home/antenore/lotus/notes/data

This mount point is created automatically by anything-sync-daemon.

Comment 17 yafu 2017-12-14 12:00:18 UTC
Verification passes with libvirt-3.9.0-6.el7.x86_64.

Test steps:
Scenario 1: Start a guest when a directory or file is mounted under a preserved mount point:
1. # touch /tmp/demo
2. # touch /dev/shm/demo
3. # mount --bind /tmp/demo /dev/shm/demo
4. # virsh start avocado-vt-vm1
Domain avocado-vt-vm1 started


Scenario 2: Start a guest when /dev contains a mount point that is a file:
1. # touch /tmp/demo
2. # touch /dev/demo
3. # mount --bind /tmp/demo /dev/demo
4. # virsh start avocado-vt-vm1
Domain avocado-vt-vm1 started

Scenario 3: Start a guest when /dev/shm contains an overlaid mount point:
1. Install asd and edit /etc/asd.conf:
   WHATTOSYNC=('/foo/bar')
   VOLATILE="/dev/shm"
   USE_OVERLAYFS="yes"

2. Create /foo/bar:
   # mkdir -p /foo/bar
3. Restart the asd service:
   # systemctl restart asd
4. Check the mount point:
   # df -h
   overlaid                            5.7G   30M  5.7G   1% /dev/shm/asd-root/foo/bar
5. Start the guest:
   # virsh start avocado-vt-vm1
   Domain avocado-vt-vm1 started

Comment 21 errata-xmlrpc 2018-04-10 10:42:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:0704

