Bug 1586377
| Summary: | Fail to start VM on v4.4.0-118-ga9884d7 | ||||||
|---|---|---|---|---|---|---|---|
| Product: | [Community] Virtualization Tools | Reporter: | Han Han <hhan> | ||||
| Component: | libvirt | Assignee: | Libvirt Maintainers <libvirt-maint> | ||||
| Status: | CLOSED DEFERRED | QA Contact: | |||||
| Severity: | high | Docs Contact: | |||||
| Priority: | high | ||||||
| Version: | unspecified | CC: | berrange, dyuan, hhan, jtomko, libvirt-maint, xuzhang, yafu | ||||
| Target Milestone: | --- | Keywords: | Regression, TestBlocker | ||||
| Target Release: | --- | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2023-03-16 01:55:43 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Attachments: |
BTW, the same VM works well on libvirt-v4.4.0-2-g67c56f6.

I can't reproduce the failure myself, and the logs don't show anything particularly interesting to report. Could you try to git bisect the problem, e.g. start by doing:

    git bisect good 67c56f6
    git bisect bad a9884d7

and then test each step to find out where it breaks.

Actually, before bothering with that, can you confirm whether you have SELinux in "enforcing" mode? If so, I guess that putting it in "permissive" mode will probably fix it. In which case we'll need to find whatever AVC is logged, and file a bug report against SELinux policy to have it fixed.

(In reply to Daniel Berrange from comment #3)
> Actually before bothering with that, can you confirm whether you have
> SELinux in "enforcing" mode. If so, I guess that putting it in "permissive"
> mode will probably fix it. In which case we'll need to find whatever AVC is
> logged, and file a bug report against SELinux policy to have it fixed.

Well, my SELinux was in "enforcing" mode when the bug happened, and the bug does not reproduce in "permissive" mode. The strangest thing is that I don't find anything like "deny" or "avc" in audit.log or with ausearch...
I will use git bisect to find where it breaks.

git bisect result:
Bug reproduced since this commit:
commit 30fb2276d88b275dc2aad6ddd28c100d944b59a5
Author: Daniel P. Berrangé <berrange>
Date: Wed Mar 14 12:16:11 2018 +0000
qemu: support passing pre-opened UNIX socket listen FD
There is a race condition when spawning QEMU where libvirt has spawned
QEMU but the monitor socket is not yet open. Libvirt has to repeatedly
try to connect() to QEMU's monitor until eventually it succeeds, or
times out. We use kill() to check if QEMU is still alive so we avoid
waiting a long time if QEMU exited, but having a timeout at all is still
unpleasant.
With QEMU 2.12 we can pass in a pre-opened FD for UNIX domain or TCP
sockets. If libvirt has called bind() and listen() on this FD, then we
have a guarantee that libvirt can immediately call connect() and
succeed without any race.
Although we only really care about this for the monitor socket and agent
socket, this patch does FD passing for all UNIX socket based character
devices since there appears to be no downside to it.
We don't do FD passing for TCP sockets, however, because it is only
possible to pass a single FD, while some hostnames may require listening
on multiple FDs to cover IPv4 and IPv6 concurrently.
Reviewed-by: John Ferlan <jferlan>
Signed-off-by: Daniel P. Berrangé <berrange>
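
The idea in the commit message above can be illustrated with a small stand-alone sketch. This is not libvirt's C implementation, and it does not use QEMU's actual chardev options. It is a Python stand-in under stated assumptions: the parent plays the role of libvirt, a child Python process plays the role of QEMU, and the names `spawn_with_listen_fd` and `CHILD` are hypothetical. It shows that once the parent has called bind() and listen() on the socket, its connect() succeeds immediately, even if the child has not yet reached accept().

```python
import os
import socket
import subprocess
import sys
import tempfile

# Stand-in for QEMU: adopts the inherited listening FD and accepts one client.
CHILD = r"""
import socket, sys
srv = socket.socket(fileno=int(sys.argv[1]))  # wrap the inherited FD
conn, _ = srv.accept()
conn.sendall(b"ready")
conn.close()
"""

def spawn_with_listen_fd(path):
    """Bind and listen in the parent, then hand the listening FD to a child."""
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    srv.bind(path)
    srv.listen(1)
    fd = srv.fileno()
    # pass_fds keeps this FD open (same number) in the child process.
    child = subprocess.Popen(
        [sys.executable, "-c", CHILD, str(fd)], pass_fds=(fd,)
    )
    # No retry loop and no timeout: the socket is already listening, so the
    # kernel queues the connection until the child calls accept().
    cli = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    cli.connect(path)
    srv.close()  # the child holds its own reference to the listening socket
    return child, cli

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        child, cli = spawn_with_listen_fd(os.path.join(tmp, "monitor.sock"))
        print(cli.recv(5))
        cli.close()
        child.wait()
```

Note that in this sketch the socket is created by the parent, so under SELinux it would carry the parent's label rather than the child's, which is the situation the AVC later in this report points at.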
You might need to enable reporting of dontaudit rules to see the violation, see: semanage-dontaudit(8)

Thanks for Ján's advice. After turning off dontaudit, I found these AVCs in audit.log:
type=AVC msg=audit(1528352823.310:3434): avc: denied { read write } for pid=2011 comm="qemu-kvm" path="socket:[633251]" dev="sockfs" ino=633251 scontext=system_u:system_r:svirt_t:s0:c498,c609 tcontext=system_u:system_r:virtd_t:s0-s0:c0.c1023 tclass=unix_stream_socket
type=SYSCALL msg=audit(1528352823.310:3434): arch=c000003e syscall=59 success=yes exit=0 a0=7fcc5055d0f0 a1=7fcc5055f6c0 a2=7fcc5055c250 a3=8 items=0 ppid=1 pid=2011 auid=4294967295 uid=107 gid=107 euid=107 suid=107 fsuid=107 egid=107 sgid=107 fsgid=107 tty=(none) ses=4294967295 comm="qemu-kvm" exe="/usr/libexec/qemu-kvm" subj=system_u:system_r:svirt_t:s0:c498,c609 key=(null)
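
For readers less used to audit records, the AVC line above can be picked apart with a few lines of Python. This is only a rough sketch: the helper names `parse_avc` and `context_type` are hypothetical, and the regular expressions are assumptions based on the record shown here, not a full parser for the audit log format.

```python
import re

# The AVC record from this report, split across lines for readability.
SAMPLE = (
    'type=AVC msg=audit(1528352823.310:3434): avc: denied { read write } '
    'for pid=2011 comm="qemu-kvm" path="socket:[633251]" dev="sockfs" '
    'ino=633251 scontext=system_u:system_r:svirt_t:s0:c498,c609 '
    'tcontext=system_u:system_r:virtd_t:s0-s0:c0.c1023 '
    'tclass=unix_stream_socket'
)

def context_type(context):
    """Return the type field (third component) of user:role:type:level."""
    parts = context.split(":")
    return parts[2] if len(parts) > 2 else None

def parse_avc(record):
    """Return the key fields of one AVC denial, or None if it is not one."""
    denied = re.search(r"avc:\s+denied\s+\{\s*([^}]*?)\s*\}", record)
    if denied is None:
        return None
    # Collect key=value pairs; values are either quoted or run to whitespace.
    fields = dict(re.findall(r'(\w+)=("[^"]*"|\S+)', record))
    return {
        "permissions": denied.group(1).split(),
        "comm": fields.get("comm", "").strip('"'),
        "source_type": context_type(fields.get("scontext", "")),
        "target_type": context_type(fields.get("tcontext", "")),
        "tclass": fields.get("tclass"),
    }

if __name__ == "__main__":
    print(parse_avc(SAMPLE))
```

Read this way, the record says that a process running as `svirt_t` (qemu-kvm) was denied read and write on a `unix_stream_socket` labelled `virtd_t`, i.e. a socket that still carries libvirtd's label.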
Hmm, so I think we need to set the right SELinux context on the FD in libvirt before we leak it into QEMU.

Closing this bug since there has been no update for a long time.
Created attachment 1448122 [details]
The guest xml/libvirtd.conf/guest log/libvirtd.log

Description of problem:
As subject

Version-Release number of selected component (if applicable):
libvirt v4.4.0-118-ga9884d7
qemu-kvm-rhev-2.12.0-3.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Download the latest libvirt source code via git, then compile and install the RPMs
# ./configure --prefix=/usr && make rpm && rpm -Uvh --force ~/rpmbuild/RPMS/x86_64/libvirt*
2. Restart libvirtd and virtlogd
3. Try to start a VM
# virsh start RHEL8
error: Failed to start domain RHEL8
error: internal error: qemu unexpectedly closed the monitor: stnet0,id=net0,mac=52:54:00:d4:c4:27,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev spicevmc,id=charchannel0,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 -spice port=5900,addr=127.0.0.1,disable-ticketing,image-compression=off,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pci.0,addr=0x2 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=1 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=o

Actual results:
As above

Expected results:
The VM starts

Additional info:
The guest xml/libvirtd.conf/guest log/libvirtd.log is in the attachment.
This is likely a fatal regression; please debug and fix it soon.