Bug 1586377 - Fail to start VM on v4.4.0-118-ga9884d7
Summary: Fail to start VM on v4.4.0-118-ga9884d7
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: Virtualization Tools
Classification: Community
Component: libvirt
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Assignee: Libvirt Maintainers
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-06-06 06:22 UTC by Han Han
Modified: 2023-03-16 01:55 UTC
CC List: 7 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2023-03-16 01:55:43 UTC
Embargoed:


Attachments
The guest xml/libvirtd.conf/guest log/libvirtd.log (38.67 KB, application/x-gzip)
2018-06-06 06:22 UTC, Han Han

Description Han Han 2018-06-06 06:22:50 UTC
Created attachment 1448122
The guest xml/libvirtd.conf/guest log/libvirtd.log

Description of problem:
As in the subject: the VM fails to start with libvirt v4.4.0-118-ga9884d7.

Version-Release number of selected component (if applicable):
libvirt v4.4.0-118-ga9884d7
qemu-kvm-rhev-2.12.0-3.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Download the latest libvirt source code via git, then compile and install the RPMs
# ./configure --prefix=/usr && make rpm && rpm -Uvh --force ~/rpmbuild/RPMS/x86_64/libvirt*

2. Restart libvirtd and virtlogd

3. Try to start a VM
# virsh start RHEL8
error: Failed to start domain RHEL8
error: internal error: qemu unexpectedly closed the monitor: stnet0,id=net0,mac=52:54:00:d4:c4:27,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev spicevmc,id=charchannel0,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 -spice port=5900,addr=127.0.0.1,disable-ticketing,image-compression=off,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pci.0,addr=0x2 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=1 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=o

Actual results:
As above

Expected results:
The VM starts.

Additional info:
The guest xml/libvirtd.conf/guest log/libvirtd.log is in the attachment.
It is likely a fatal regression. Please debug and fix it soon.

Comment 1 Han Han 2018-06-06 06:29:59 UTC
BTW, the same VM works well on libvirt-v4.4.0-2-g67c56f6

Comment 2 Daniel Berrangé 2018-06-06 09:07:19 UTC
I can't reproduce a failure myself and the logs don't show anything particularly interesting to report.

Could you try to git bisect the problem, eg start by doing:

  git bisect good 67c56f6
  git bisect bad a9884d7

and then test each step to find out where it breaks.
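As a rough illustration of what the bisect steps above do, here is a minimal Python sketch of the underlying binary search. It assumes a linear history ordered from oldest to newest, which real git history is not always; `first_bad`, the `is_bad` callback, and the intermediate commit names `c1`..`c4` are hypothetical placeholders. Only the two endpoint hashes come from this report.

```python
def first_bad(commits, is_bad):
    """Return the first bad commit in a linear history.

    commits is ordered oldest to newest; commits[0] is assumed known-good
    and commits[-1] is assumed known-bad, mirroring
    `git bisect good` / `git bisect bad`.
    """
    lo, hi = 0, len(commits) - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid   # the breakage is at mid or earlier
        else:
            lo = mid   # the breakage is after mid
    return commits[hi]


# Hypothetical history between the good and bad builds from this report.
history = ["67c56f6", "c1", "c2", "c3", "c4", "a9884d7"]

# Stand-in for "build, install, try to start the VM"; here c3 introduces the bug.
culprit = first_bad(history, lambda c: history.index(c) >= 3)
```

Each call to `is_bad` corresponds to one manual build-and-test round, so the number of rounds grows with the logarithm of the number of commits in the range.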

Comment 3 Daniel Berrangé 2018-06-06 13:52:58 UTC
Actually before bothering with that, can you confirm whether you have SELinux in "enforcing" mode.  If so, I guess that putting it in "permissive" mode will probably fix it.  In which case we'll need to find whatever AVC is logged, and file a bug report against SELinux policy to have it fixed.

Comment 4 Han Han 2018-06-07 02:22:26 UTC
(In reply to Daniel Berrange from comment #3)
> Actually before bothering with that, can you confirm whether you have
> SELinux in "enforcing" mode.  If so, I guess that putting it in "permissive"
> mode will probably fix it.  In which case we'll need to find whatever AVC is
> logged, and file a bug report against SELinux policy to have it fixed.

Well, SELinux was in "enforcing" mode when the bug happened, and the bug does not reproduce in "permissive" mode.
The strangest part is that I can't find anything like "deny" or "avc" in audit.log or with ausearch...
I will use git bisect to find where it breaks.

Comment 5 Han Han 2018-06-07 04:09:14 UTC
git bisect result:

The bug reproduces starting with this commit:

commit 30fb2276d88b275dc2aad6ddd28c100d944b59a5
Author: Daniel P. Berrangé <berrange>
Date:   Wed Mar 14 12:16:11 2018 +0000

    qemu: support passing pre-opened UNIX socket listen FD
    
    There is a race condition when spawning QEMU where libvirt has spawned
    QEMU but the monitor socket is not yet open. Libvirt has to repeatedly
    try to connect() to QEMU's monitor until eventually it succeeds, or
    times out. We use kill() to check if QEMU is still alive so we avoid
    waiting a long time if QEMU exited, but having a timeout at all is still
    unpleasant.
    
    With QEMU 2.12 we can pass in a pre-opened FD for UNIX domain or TCP
    sockets. If libvirt has called bind() and listen() on this FD, then we
    have a guarantee that libvirt can immediately call connect() and
    succeed without any race.
    
    Although we only really care about this for the monitor socket and agent
    socket, this patch does FD passing for all UNIX socket based character
    devices since there appears to be no downside to it.
    
    We don't do FD passing for TCP sockets, however, because it is only
    possible to pass a single FD, while some hostnames may require listening
    on multiple FDs to cover IPv4 and IPv6 concurrently.
    
    Reviewed-by: John Ferlan <jferlan>
    Signed-off-by: Daniel P. Berrangé <berrange>
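
The idea in the commit message above can be sketched in a few lines of Python. This is only an illustration of the socket behaviour, not libvirt's implementation: the names `preopened_listen_socket`, `demo`, and the `monitor.sock` path are hypothetical, and both ends live in one process here, whereas in the real case the listening FD would be inherited by QEMU, which then calls accept().

```python
import os
import socket
import tempfile


def preopened_listen_socket(path):
    """Create a UNIX stream socket that is already bound and listening."""
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    srv.bind(path)
    srv.listen(1)
    return srv


def demo():
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "monitor.sock")  # hypothetical socket path
        srv = preopened_listen_socket(path)

        # connect() succeeds on the first try because the socket is already
        # listening; the kernel queues the connection until accept() is called.
        client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        client.connect(path)
        client.sendall(b"ping")

        # The side that owns the listening FD accepts later and still sees the data.
        conn, _ = srv.accept()
        data = conn.recv(4)

        conn.close()
        client.close()
        srv.close()
        return data


result = demo()
```

Because bind() and listen() happen before the peer ever runs, there is no window in which connect() can fail with a missing or unready socket, so no retry loop or timeout is needed.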

Comment 6 Ján Tomko 2018-06-07 06:15:22 UTC
You might need to enable reporting of dontaudit rules to see the violation, see: semanage-dontaudit(8)

Comment 7 Han Han 2018-06-07 06:32:20 UTC
Thanks for Ján's advice. After turning off dontaudit, I found these AVCs in audit.log:
type=AVC msg=audit(1528352823.310:3434): avc:  denied  { read write } for  pid=2011 comm="qemu-kvm" path="socket:[633251]" dev="sockfs" ino=633251 scontext=system_u:system_r:svirt_t:s0:c498,c609 tcontext=system_u:system_r:virtd_t:s0-s0:c0.c1023 tclass=unix_stream_socket
type=SYSCALL msg=audit(1528352823.310:3434): arch=c000003e syscall=59 success=yes exit=0 a0=7fcc5055d0f0 a1=7fcc5055f6c0 a2=7fcc5055c250 a3=8 items=0 ppid=1 pid=2011 auid=4294967295 uid=107 gid=107 euid=107 suid=107 fsuid=107 egid=107 sgid=107 fsgid=107 tty=(none) ses=4294967295 comm="qemu-kvm" exe="/usr/libexec/qemu-kvm" subj=system_u:system_r:svirt_t:s0:c498,c609 key=(null)
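
To make the denial above easier to read, here is a small Python sketch that splits the AVC record into its fields. The `parse_avc` helper and its regular expressions are assumptions written for this one record and are not part of any audit tooling; the input line is copied from the log above.

```python
import re

AVC_LINE = (
    'type=AVC msg=audit(1528352823.310:3434): avc:  denied  { read write } '
    'for  pid=2011 comm="qemu-kvm" path="socket:[633251]" dev="sockfs" '
    'ino=633251 scontext=system_u:system_r:svirt_t:s0:c498,c609 '
    'tcontext=system_u:system_r:virtd_t:s0-s0:c0.c1023 tclass=unix_stream_socket'
)


def parse_avc(line):
    """Return the key=value fields of an AVC record plus the denied permissions."""
    pairs = re.findall(r'(\w+)=("[^"]*"|\S+)', line)
    fields = {key: value.strip('"') for key, value in pairs}
    perms = re.search(r"\{([^}]*)\}", line)
    fields["permissions"] = perms.group(1).split() if perms else []
    return fields


avc = parse_avc(AVC_LINE)
# The SELinux type is the third colon-separated part of a context.
source_type = avc["scontext"].split(":")[2]
target_type = avc["tcontext"].split(":")[2]
```

Read this way, the record says that a process in the `svirt_t` domain (qemu-kvm) was denied `read write` on a `unix_stream_socket` labeled `virtd_t`, that is, a socket still carrying libvirtd's label.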

Comment 8 Daniel Berrangé 2018-06-07 08:20:15 UTC
Hmm, so I think we need to set the right selinux context on the FD in libvirt before we leak it into QEMU.

Comment 9 Han Han 2023-03-16 01:55:43 UTC
Closing since there has been no update for a long time.

