Bug 921135

Summary:	qemu: could not load kernel ... Permission denied
Product:	Red Hat Enterprise Linux 7	Reporter:	Dave Allan <dallan>
Component:	libvirt	Assignee:	Jiri Denemark <jdenemar>
Status:	CLOSED ERRATA	QA Contact:	Virtualization Bugs <virt-bugs>
Severity:	unspecified	Docs Contact:
Priority:	unspecified
Version:	7.0	CC:	berrange, clalancette, cwei, dyuan, itamar, jdenemar, jforbes, laine, libvirt-maint, mprivozn, mzhan, rbalakri, rjones, veillard, virt-maint, yafu, zhanghongming
Target Milestone:	rc
Target Release:	---
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:	libvirt-1.3.1-1.el7	Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:	871196	Environment:
Last Closed:	2016-11-03 18:05:39 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:	1269975
Bug Blocks:	910269, 910270, 922891

Description Dave Allan 2013-03-13 14:40:21 UTC

+++ This bug was initially created as a clone of Bug #871196 +++

Description of problem:

This manifests itself as the libguestfs test suite
occasionally failing like this:

*stdin*:2: libguestfs: error: could not create appliance through libvirt: internal error process exited while connecting to monitor: 2012-10-29 21:27:55.614+0000: 20366: debug : virFileClose:72 : Closed fd 22
2012-10-29 21:27:55.614+0000: 20366: debug : virFileClose:72 : Closed fd 35
2012-10-29 21:27:55.630+0000: 20366: debug : virFileClose:72 : Closed fd 3
2012-10-29 21:27:55.653+0000: 20367: debug : virCommandHook:2143 : Hook is done 0
qemu: could not load kernel '/home/rjones/d/libguestfs/tmp/.guestfs-1000/kernel.20244': Permission denied
 [code=1 domain=10]

There are multiple parallel libguestfs instances running.
The best explanation of this I can come up with is that
libvirtd is unlabelling the <kernel/> element at the same
time that another qemu is trying to open it.

Version-Release number of selected component (if applicable):

libvirt-0.10.2-3.fc18.x86_64

How reproducible:

Very rare.

--- Additional comment from Richard W.M. Jones on 2013-03-13 07:12:22 EDT ---

Still happening with libguestfs upstream and recent libvirt.

libvirt-daemon-1.0.2-3.fc19.x86_64

A test program which demonstrates this and nearly always
fails for me is here:

https://github.com/libguestfs/libguestfs/blob/master/tests/parallel/test-parallel.c

Comment 1 Michal Privoznik 2013-03-13 14:47:31 UTC

I believe this can be solved with my "Keep original file label" patchset:

https://www.redhat.com/archives/libvir-list/2013-March/msg00497.html

I am solving this issue for DAC only for now, though. I am introducing a reference counter at the XATTR level to track how many times libvirt has labeled a file, so that the label is restored only at the last restore request. So if your explanation is right, the kernel image will be:

1. labeled due to domain A startup, refCount = 1,
2. labeled due to domain B startup, refCount = 2,
3. domain A shutdown will just decrease refCount to 1,
4. shutting down domain B will decrease refCount and since it's 0 now, the original file label is restored.
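
The four steps above can be sketched as a toy model. This is only an illustration of the reference-counting idea, not code from the patchset: the class name `LabelRefCounter`, the path `/tmp/kernel`, and the `root:root` / `qemu:qemu` labels are all hypothetical, and an in-memory dict stands in for the XATTR storage.

```python
class LabelRefCounter:
    """Toy model of per-file label reference counting (not libvirt code)."""

    def __init__(self):
        # path -> [original_label, ref_count]; a dict stands in for XATTRs here
        self._state = {}

    def set_label(self, labels, path, new_label):
        """Label `path`, remembering the original label on first use."""
        if path not in self._state:
            self._state[path] = [labels[path], 0]
        self._state[path][1] += 1
        labels[path] = new_label
        return self._state[path][1]

    def restore_label(self, labels, path):
        """Drop one reference; restore the original label only at zero."""
        original, count = self._state[path]
        count -= 1
        if count == 0:
            labels[path] = original
            del self._state[path]
        else:
            self._state[path][1] = count
        return count


labels = {"/tmp/kernel": "root:root"}              # hypothetical path and labels
rc = LabelRefCounter()
rc.set_label(labels, "/tmp/kernel", "qemu:qemu")   # domain A starts, refCount = 1
rc.set_label(labels, "/tmp/kernel", "qemu:qemu")   # domain B starts, refCount = 2
rc.restore_label(labels, "/tmp/kernel")            # domain A stops, refCount = 1
still_labeled = labels["/tmp/kernel"]              # domain B can still read the file
rc.restore_label(labels, "/tmp/kernel")            # domain B stops, refCount = 0
restored = labels["/tmp/kernel"]                   # original label is back
```

In this model the file keeps the qemu label for as long as any domain still references it, which is the property that would close the race.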

Comment 3 Dave Allan 2013-07-12 20:05:50 UTC

(In reply to Michal Privoznik from comment #1)
> I believe this can be solved with my "Keep original file label" patchset:
> 
> https://www.redhat.com/archives/libvir-list/2013-March/msg00497.html

Any motion on this patchset?

Comment 4 Michal Privoznik 2013-07-15 07:48:44 UTC

Seems like we've got upstream agreement on design. However, the implementation won't be trivial (as it'll involve sanlock/virtlockd and nothing involving sanlock is trivial, is it? :) ). It's on my TODO list though.

Comment 6 Michal Privoznik 2013-12-11 16:02:28 UTC

Rich,

until I have something useful to send upstream, I think this workaround may be sufficient for you: just copy the kernel image for each domain that is about to run under different permissions.

It would be best if qemu allowed us to pass every file (including kernel and initrd) via FD. Then this bug would instantly go away.

Comment 7 Richard W.M. Jones 2013-12-12 08:45:26 UTC

We do have a separate copy of the kernel for every UID already.

The problem is we have a multi-threaded program running multiple
libvirt domains from different threads.  Each thread has the same
UID & PID [obviously] and thus uses the same kernel file, called
$TMPDIR/.guestfs-$UID/kernel.$PID

However libvirt labels the kernel before each domain runs and
unlabels it when the domain exits.  Since multiple domains are
starting and stopping in the same program, there is a race between
one thread labelling the kernel and another (closing domain) thread
unlabelling the kernel.
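
The interleaving described above can be replayed deterministically in a small model. This is a sketch under stated assumptions, not libvirt or libguestfs code: file ownership stands in for the security label, the owner names and the `can_open` helper are hypothetical, and the "threads" are just comments marking which thread would perform each step.

```python
# Toy model of the race: file ownership stands in for the security label.
QEMU, ROOT = "qemu", "root"   # hypothetical owner names for illustration


def can_open(owner):
    """qemu can read the kernel only while it carries the qemu label."""
    return owner == QEMU


kernel = {"owner": ROOT}      # the single kernel file shared by all threads
events = []

# Thread 1: domain A starts; libvirt labels the kernel for qemu.
kernel["owner"] = QEMU
# Thread 2: libvirt labels the kernel for domain B (already qemu, no change).
kernel["owner"] = QEMU
# Thread 1: domain A exits and libvirt restores the original label.
kernel["owner"] = ROOT
# Thread 2: qemu for domain B only now tries to open the kernel.
events.append("ok" if can_open(kernel["owner"]) else "Permission denied")
```

With this ordering the second domain fails to open the kernel, which matches the "Permission denied" message in the original report; any ordering where domain B opens the file before domain A's restore would succeed, which would explain why the failure is intermittent.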

Comment 8 Michal Privoznik 2013-12-12 14:39:35 UTC

(In reply to Richard W.M. Jones from comment #7)
> We do have a separate copy of the kernel for every UID already.
> 

What about a separate kernel for each domain?

Comment 9 Richard W.M. Jones 2013-12-12 14:47:21 UTC

Sure, it's possible.  We would have to copy the kernel because we can't
just link it (as labels are per-inode), and the kernel is 5 MB and
we launch 12 domains in parallel, so that's ~60 MB.

Comment 15 Richard W.M. Jones 2016-01-14 09:44:05 UTC

*** Bug 1298124 has been marked as a duplicate of this bug. ***

Comment 16 Richard W.M. Jones 2016-01-14 09:44:09 UTC

*** Bug 1298122 has been marked as a duplicate of this bug. ***

Comment 17 Jiri Denemark 2016-01-15 10:12:38 UTC

This specific case can be fixed even without handling bug 547546. Kernel/initrd files are essentially read-only shareable images and thus should be handled in the same way. Patch sent upstream for review: https://www.redhat.com/archives/libvir-list/2016-January/msg00675.html

Comment 19 Jiri Denemark 2016-01-15 10:27:24 UTC

Pushed upstream as v1.3.1-rc2-2-g68acc70:

commit 68acc701bd449481e3206723c25b18fcd3d261b7
Author: Jiri Denemark <jdenemar>
Date:   Fri Jan 15 10:55:58 2016 +0100

    security: Do not restore kernel and initrd labels
    
    Kernel/initrd files are essentially read-only shareable images and thus
    should be handled in the same way. We already use the appropriate label
    for kernel/initrd files when starting a domain, but when a domain gets
    destroyed we would remove the labels which would make other running
    domains using the same files very unhappy.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=921135
    
    Signed-off-by: Jiri Denemark <jdenemar>
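
The behaviour change described in the commit message can be modelled roughly as follows. This is a toy sketch, not the actual patch: the function names, the paths, and the `shared_ro` / `private:<name>` / `original` label strings are hypothetical, and a dict stands in for on-disk labels.

```python
# Toy model only: a dict stands in for on-disk labels; this is not libvirt code.

def start_domain(labels, domain):
    """Apply a shared read-only label to kernel/initrd and a private one to disks."""
    for path in domain["boot_files"]:
        labels[path] = "shared_ro"
    for path in domain["disks"]:
        labels[path] = "private:" + domain["name"]


def destroy_domain(labels, domain, original="original"):
    """Restore disk labels only; kernel/initrd labels are left in place."""
    for path in domain["disks"]:
        labels[path] = original
    # Deliberately no loop over domain["boot_files"]: another running
    # domain may still be reading the same kernel/initrd.


labels = {"/k/vmlinuz": "original", "/k/initrd": "original",
          "/d/a.img": "original", "/d/b.img": "original"}
dom_a = {"name": "a", "boot_files": ["/k/vmlinuz", "/k/initrd"], "disks": ["/d/a.img"]}
dom_b = {"name": "b", "boot_files": ["/k/vmlinuz", "/k/initrd"], "disks": ["/d/b.img"]}

start_domain(labels, dom_a)
start_domain(labels, dom_b)
destroy_domain(labels, dom_a)   # domain b keeps running with a readable kernel
```

The trade-off in this approach is that the kernel and initrd keep their shared label after the last domain exits, in exchange for never pulling the label out from under a running domain.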

Comment 20 yafu 2016-01-29 07:05:07 UTC

Hi, jdenemar, I tried to reproduce the issue using test-virt-alignment-scan-guests.sh from libguestfs (https://github.com/libguestfs/libguestfs/blob/master/align/test-virt-alignment-scan-guests.sh; the script starts up lots of parallel libvirt instances), and the libvirtd process hung because of a deadlock caused by labelling the same file in parallel.
Could you please help me check whether this issue is the same as this bug? Thank you very much.

test version:
libvirt-1.2.17-13.el7_2.2.x86_64
qemu-kvm-rhev-2.3.0-31.el7_2.5.x86_64

steps to reproduce:
1. Run test-virt-alignment-scan-guests.sh from libguestfs in a loop:
# while true ; do ./test-virt-alignment-scan-guests.sh ; done

2. Watch the output of 'virsh list' at the same time:
# watch virsh list

3. After about 20 hours, the libvirtd daemon hung and the output of 'virsh list' no longer changed.

4. Use gdb to print the libvirtd process backtrace:
(gdb) bt
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007fec0cafed02 in _L_lock_791 () from /lib64/libpthread.so.0
#2  0x00007fec0cafec08 in __GI___pthread_mutex_lock (mutex=mutex@entry=0x7febec1d92b0) at pthread_mutex_lock.c:64
#3  0x00007fec0f491ba5 in virMutexLock (m=m@entry=0x7febec1d92b0) at util/virthread.c:89
#4  0x00007fec0f47922e in virObjectLock (anyobj=anyobj@entry=0x7febec1d92a0) at util/virobject.c:323
#5  0x00007fec0f60db1c in virSecurityManagerSetSocketLabel (mgr=0x7febec1d92a0, vm=vm@entry=0x7febd800bb20) at security/security_manager.c:431
#6  0x00007fec0f60ade3 in virSecurityStackSetSocketLabel (mgr=<optimized out>, vm=0x7febd800bb20) at security/security_stack.c:456
#7  0x00007fec0f60db29 in virSecurityManagerSetSocketLabel (mgr=0x7febec1d91f0, vm=0x7febd800bb20) at security/security_manager.c:432
#8  0x00007febf66e896e in qemuProcessHook (data=0x7febfee740f0) at qemu/qemu_process.c:3227
#9  0x00007fec0f43f8fa in virExec (cmd=cmd@entry=0x7febd8007a70) at util/vircommand.c:692
#10 0x00007fec0f442267 in virCommandRunAsync (cmd=cmd@entry=0x7febd8007a70, pid=pid@entry=0x0) at util/vircommand.c:2429
#11 0x00007fec0f442616 in virCommandRun (cmd=cmd@entry=0x7febd8007a70, exitstatus=exitstatus@entry=0x0) at util/vircommand.c:2261
#12 0x00007febf66f053d in qemuProcessStart (conn=conn@entry=0x7febe0008110, driver=driver@entry=0x7febec11d7b0, vm=<optimized out>, asyncJob=asyncJob@entry=0, 
    migrateFrom=migrateFrom@entry=0x0, stdin_fd=stdin_fd@entry=-1, stdin_path=stdin_path@entry=0x0, snapshot=snapshot@entry=0x0, vmop=vmop@entry=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, 
    flags=flags@entry=5) at qemu/qemu_process.c:4859
#13 0x00007febf673d7df in qemuDomainCreateXML (conn=0x7febe0008110, xml=<optimized out>, flags=<optimized out>) at qemu/qemu_driver.c:1768
#14 0x00007fec0f520a11 in virDomainCreateXML (conn=0x7febe0008110, 
    xmlDesc=0x7febd8000cb0 "<?xml version=\"1.0\"?>\n<domain type=\"kvm\" xmlns:qemu=\"http://libvirt.org/schemas/domain/qemu/1.0\">\n  <name>guestfs-fxyf5s4hmc0iipn6</name>\n  <memory unit=\"MiB\">500</memory>\n  <currentMemory unit=\"MiB\">"..., flags=2) at libvirt-domain.c:180
#15 0x00007fec1018411a in remoteDispatchDomainCreateXML (server=0x7fec11e1bb60, msg=0x7fec11e362f0, ret=0x7febd800d0b0, args=0x7febd80087f0, rerr=0x7febfee74c30, client=0x7fec11e3e320)
    at remote_dispatch.h:3754
#16 remoteDispatchDomainCreateXMLHelper (server=0x7fec11e1bb60, client=0x7fec11e3e320, msg=0x7fec11e362f0, rerr=0x7febfee74c30, args=0x7febd80087f0, ret=0x7febd800d0b0)
    at remote_dispatch.h:3732
#17 0x00007fec0f59c3c2 in virNetServerProgramDispatchCall (msg=0x7fec11e362f0, client=0x7fec11e3e320, server=0x7fec11e1bb60, prog=0x7fec11e30000) at rpc/virnetserverprogram.c:437
#18 virNetServerProgramDispatch (prog=0x7fec11e30000, server=server@entry=0x7fec11e1bb60, client=0x7fec11e3e320, msg=0x7fec11e362f0) at rpc/virnetserverprogram.c:307
#19 0x00007fec0f59763d in virNetServerProcessMsg (msg=<optimized out>, prog=<optimized out>, client=<optimized out>, srv=0x7fec11e1bb60) at rpc/virnetserver.c:135
#20 virNetServerHandleJob (jobOpaque=<optimized out>, opaque=0x7fec11e1bb60) at rpc/virnetserver.c:156
#21 0x00007fec0f4924f5 in virThreadPoolWorker (opaque=opaque@entry=0x7fec11e10de0) at util/virthreadpool.c:145
#22 0x00007fec0f491a18 in virThreadHelper (data=<optimized out>) at util/virthread.c:206
#23 0x00007fec0cafcdc5 in start_thread (arg=0x7febfee75700) at pthread_create.c:308
#24 0x00007fec0c82a1cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Comment 21 Jiri Denemark 2016-01-29 08:52:11 UTC

See bug 1302965

Comment 22 Jiri Denemark 2016-01-29 10:01:21 UTC

Actually, this backtrace is different from 1302965; this one looks like you were lucky and fork() was called when another thread had security manager locked. Please, file a separate bug for that. Don't forget to include debug logs from libvirtd.

Comment 23 yafu 2016-01-29 11:26:32 UTC

(In reply to Jiri Denemark from comment #22)
> Actually, this backtrace is different from 1302965; this one looks like you
> were lucky and fork() was called when another thread had security manager
> locked. Please, file a separate bug for that. Don't forget to include debug
> logs from libvirtd.


Thanks for your quick reply. 
Filed another bug for the issue described in comment #20:
https://bugzilla.redhat.com/show_bug.cgi?id=1303031

Comment 25 yafu 2016-04-13 10:33:21 UTC

Hi, Jiri,
I tried to verify this bug with libvirt-1.3.3-1.el7.x86_64. Could you please check whether the steps below are enough to verify this bug? Thanks a lot.

Steps:
1. Define a guest with direct kernel boot:
# virsh edit rhel7
...
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.2.0'>hvm</type>
    <kernel>/images/vmlinuz</kernel>
    <initrd>/images/initrd.img</initrd>
    <cmdline>method=http://download.englab.nay.redhat.com/pub/rhel/rel-eng/RHEL-7.2-20150820.0/compose/Server/x86_64/os/</cmdline>
    <boot dev='hd'/>
  </os>
...

2. Start the guest:
# virsh start rhel7

3. Check the labels of the kernel and initrd files:
# ll -Z /images/vmlinuz
-rw-rw-r--. qemu qemu system_u:object_r:virt_content_t:s0 /images/vmlinuz
# ll -Z /images/initrd.img
-rw-rw-r--. qemu qemu system_u:object_r:virt_content_t:s0 /images/initrd.img

4. Destroy the guest:
# virsh destroy rhel7

5. Check the labels of the kernel and initrd files again; the labels have not been restored:
# ll -Z /images/vmlinuz
-rw-rw-r--. qemu qemu system_u:object_r:virt_content_t:s0 /images/vmlinuz
# ll -Z /images/initrd.img
-rw-rw-r--. qemu qemu system_u:object_r:virt_content_t:s0 /images/initrd.img

As shown above, the labels of the kernel and initrd files are not restored when a domain gets destroyed.
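
The before/after comparison in these steps could be automated along these lines. This is a minimal sketch, assuming a Linux host where the SELinux context is exposed through the `security.selinux` extended attribute; the helper names `selinux_label` and `changed_labels` are hypothetical, and the snapshot values are copied from the `ll -Z` output in this comment.

```python
import os


def selinux_label(path):
    """Read the SELinux context of `path` from its security.selinux xattr (Linux only)."""
    raw = os.getxattr(path, "security.selinux")
    return raw.rstrip(b"\x00").decode()


def changed_labels(before, after):
    """Return the paths whose label differs between two {path: label} snapshots."""
    return sorted(p for p in before if before[p] != after.get(p))


# Snapshots as reported above: taken while the guest runs and after 'virsh destroy'.
running = {
    "/images/vmlinuz": "system_u:object_r:virt_content_t:s0",
    "/images/initrd.img": "system_u:object_r:virt_content_t:s0",
}
destroyed = dict(running)   # with the fix, destroying the guest changes nothing
unexpected = changed_labels(running, destroyed)
```

On a real host the snapshots would be built with something like `{p: selinux_label(p) for p in paths}` at steps 3 and 5; an empty `unexpected` list then corresponds to the fixed behaviour.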

Comment 26 yafu 2016-04-15 03:13:10 UTC

Reproduced this bug with libvirt-1.2.17-13.el7.x86_64.

Following the steps in comment 25, when the guest gets destroyed, the labels of the kernel and initrd files are restored:
# ll -Z /images/vmlinuz
-rw-rw-r--. root root system_u:object_r:default_t:s0   /images/vmlinuz
# ll -Z /images/initrd.img
-rw-rw-r--. root root system_u:object_r:default_t:s0   /images/initrd.img

Comment 27 Jiri Denemark 2016-04-29 14:38:14 UTC

Yes, the steps from comment 25 are sufficient for verifying the bug.

Comment 28 yafu 2016-08-31 03:01:32 UTC

Verified: passes with libvirt-2.0.0-6.el7.x86_64.

Comment 30 errata-xmlrpc 2016-11-03 18:05:39 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-2577.html