Description of problem:
cockpit-machine's Fedora gating tests picked up a QEMU crash in Rawhide [1]. This coincides with the upload of QEMU 6.0.0-rc2 [2]. This happens when trying to attach a network interface to a running VM. Our tests do this through libvirt, so the reproducer uses that.

Version-Release number of selected component (if applicable):
libvirt-daemon-7.2.0-1.fc35.x86_64
qemu-system-x86-core-6.0.0-0.1.rc2.fc35.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Create some simple VM. We use this one:

# virsh dumpxml subVmTest1
<domain type='qemu'>
  <name>subVmTest1</name>
  <uuid>c6fbcaf5-7b34-4990-ad1f-1a2bed761301</uuid>
  <memory unit='KiB'>131072</memory>
  <currentMemory unit='KiB'>131072</currentMemory>
  <vcpu placement='static'>1</vcpu>
  <os>
    <type arch='x86_64' machine='pc-i440fx-6.0'>hvm</type>
    <boot dev='hd'/>
    <boot dev='network'/>
  </os>
  <features>
    <acpi/>
  </features>
  <cpu mode='custom' match='exact' check='none'>
    <model fallback='forbid'>qemu64</model>
  </cpu>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/bin/qemu-system-x86_64</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/var/lib/libvirt/images/subVmTest1-2.img'/>
      <target dev='vda' bus='virtio'/>
      <serial>SECOND</serial>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </disk>
    <controller type='scsi' index='0' model='virtio-scsi'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </controller>
    <controller type='usb' index='0' model='piix3-uhci'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'/>
    <controller type='virtio-serial' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </controller>
    <interface type='network'>
      <mac address='52:54:00:76:50:a9'/>
      <source network='default'/>
      <model type='rtl8139'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <serial type='file'>
      <source path='/var/log/libvirt/console-subVmTest1.log'/>
      <target type='isa-serial' port='0'>
        <model name='isa-serial'/>
      </target>
    </serial>
    <console type='file'>
      <source path='/var/log/libvirt/console-subVmTest1.log'/>
      <target type='serial' port='0'/>
    </console>
    <channel type='unix'>
      <target type='virtio' name='org.qemu.guest_agent.0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <graphics type='spice' autoport='yes' listen='127.0.0.1'>
      <listen type='address' address='127.0.0.1'/>
      <image compression='off'/>
    </graphics>
    <audio id='1' type='spice'/>
    <video>
      <model type='vga' vram='16384' heads='1' primary='yes'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </memballoon>
  </devices>
</domain>

But there is really nothing special to it. It should be running:

virsh list --all
# output:
# Id Name State
#----------------------------
# 4 subVmTest1 running

2. Create a libvirt network interface:

cat > net.xml <<EOF
<network>
  <name>test_network</name>
  <forward mode='nat'/>
  <bridge name='virbr1' stp='on' delay='0'/>
  <mac address='52:54:00:bc:93:8e'/>
  <ip address='192.168.123.1' netmask='255.255.255.0'>
    <dhcp>
      <range start='192.168.123.2' end='192.168.123.254'/>
    </dhcp>
  </ip>
</network>
EOF
virsh net-define net.xml
virsh net-start test_network
virsh net-list --all
# output:
# Name State Autostart Persistent
#-------------------------------------------------
# default active yes yes
# test_network active no yes

3. Try to attach the network to the domain/VM:

virsh attach-interface subVmTest1 network test_network

Actual results:
3. fails with

# error: Failed to attach interface
# error: Unable to read from monitor: Connection reset by peer

QEMU crashes, and the domain stops:

# virsh list --all
Id Name State
-----------------------------
- subVmTest1 shut off

Journal output:

audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-hostnamed comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
audit: BPF prog-id=902 op=UNLOAD
audit: BPF prog-id=901 op=UNLOAD
systemd[1]: systemd-hostnamed.service: Deactivated successfully.
NetworkManager[556]: <info> [1618465365.4157] manager: (vnet2): new Tun device (/org/freedesktop/NetworkManager/Devices/91)
kernel: virbr1: port 1(vnet2) entered blocking state
kernel: virbr1: port 1(vnet2) entered disabled state
audit: ANOM_PROMISCUOUS dev=vnet2 prom=256 old_prom=0 auid=4294967295 uid=0 gid=0 ses=4294967295
kernel: device vnet2 entered promiscuous mode
kernel: virbr1: port 1(vnet2) entered blocking state
kernel: virbr1: port 1(vnet2) entered listening state
systemd-udevd[51680]: Using default interface naming scheme 'v247'.
audit[50999]: VIRT_RESOURCE pid=50999 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:virtd_t:s0-s0:c0.c1023 msg='virt=qemu resrc=net reason=open vm="subVmTest1" uuid=c6fbcaf5-7b34-4990-ad1f-1a2bed761301 net=52:54:00:a5:f8:c0 path="/dev/net/tun" rdev=0A:C8 exe="/usr/sbin/libvirtd" hostname=? addr=? terminal=? res=success'
systemd-udevd[51680]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
kernel: qemu-system-x86[51622]: segfault at 71 ip 0000559045bc4589 sp 00007ffefa7a7db0 error 4 in qemu-system-x86_64[559045b37000+4a5000]
kernel: Code: 00 00 41 8b 94 24 20 15 01 00 49 89 ee 85 d2 74 14 41 80 bc 24 0e 15 01 00 00 75 09 41 89 d6 41 29 d7 49 01 ee 49 8b 44 24 20 <80> 78 71 00 75 81 48 8d 7c 24 20 49 63 cf 48 8d 74 24 18 4c 89 f2
audit[51622]: ANOM_ABEND auid=4294967295 uid=107 gid=107 ses=4294967295 subj=system_u:system_r:svirt_tcg_t:s0:c16,c645 pid=51622 comm="qemu-system-x86" exe="/usr/bin/qemu-system-x86_64" sig=11 res=1
kernel: virbr1: port 1(vnet2) entered disabled state
kernel: device vnet2 left promiscuous mode
kernel: virbr1: port 1(vnet2) entered disabled state
audit: ANOM_PROMISCUOUS dev=vnet2 prom=0 old_prom=256 auid=4294967295 uid=107 gid=107 ses=4294967295
kernel: virbr0: port 1(vnet1) entered disabled state
kernel: device vnet1 left promiscuous mode
kernel: virbr0: port 1(vnet1) entered disabled state
audit: ANOM_PROMISCUOUS dev=vnet1 prom=0 old_prom=256 auid=4294967295 uid=107 gid=107 ses=4294967295
systemd[1]: machine-qemu\x2d2\x2dsubVmTest1.scope: Deactivated successfully.
systemd[1]: machine-qemu\x2d2\x2dsubVmTest1.scope: Consumed 10.027s CPU time.
libvirtd[50999]: Unable to read from monitor: Connection reset by peer
audit[50999]: VIRT_RESOURCE pid=50999 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:virtd_t:s0-s0:c0.c1023 msg='virt=qemu resrc=net reason=attach vm="subVmTest1" uuid=c6fbcaf5-7b34-4990-ad1f-1a2bed761301 old-net="?" new-net="52:54:00:a5:f8:c0" exe="/usr/sbin/libvirtd" hostname=? addr=? terminal=? res=failed'
libvirtd[50999]: ethtool ioctl error on vnet2: No such device
libvirtd[50999]: Failed to remove network backend for netdev hostnet1
libvirtd[50999]: ethtool ioctl error on vnet2: No such device
systemd-machined[12154]: Machine qemu-2-subVmTest1 terminated.
NetworkManager[556]: <info> [1618465365.5090] device (vnet2): released from master device virbr1
NetworkManager[556]: <info> [1618465365.5338] device (vnet1): state change: activated -> unmanaged (reason 'unmanaged', sys-iface-state: 'removed')
NetworkManager[556]: <info> [1618465365.5418] device (vnet1): released from master device virbr0
libvirtd[50999]: ethtool ioctl error on vnet2: No such device
systemd[1]: Starting Network Manager Script Dispatcher Service...
systemd[1]: Started Network Manager Script Dispatcher Service.
audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
libvirtd[50999]: ethtool ioctl error on vnet2: No such device
libvirtd[50999]: unable to open '/sys/fs/cgroup/machine.slice/machine-qemu\x2d2\x2dsubVmTest1.scope/': No such file or directory
libvirtd[50999]: Failed to remove cgroup for subVmTest1

Expected results:
Attaching network interface works.

Additional info:
[1] https://osci-jenkins-1.ci.fedoraproject.org/job/fedora-ci/job/dist-git-pipeline/job/master/36822/testReport/(root)/tests/
[2] https://bodhi.fedoraproject.org/updates/FEDORA-2021-a873b164fa
It would be incredibly useful to get a stack trace. There may be a core dump collected by coredumpctl, i.e.:

# coredumpctl list

If there is one, note down its PID and then:

# coredumpctl gdb PID
(gdb) t a a bt

You may need to install some debuginfo packages to get good symbols, but gdb will guide you.
I already checked coredumpctl; there was no core dump (otherwise the journal would have had a minimal stack trace). The test environment is just a regular current Fedora Rawhide cloud image, but this doesn't look very sensitive to the image configuration around it.

Is there some way to see what libvirt sends to qemu's monitor when doing the attach-interface command? Then the reproducer could be redone in terms of pure qemu.
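Even without a core dump, the kernel's segfault line in the journal narrows down where the crash happened: subtracting the mapping base from the faulting instruction pointer gives an offset into the qemu-system-x86_64 mapping. A minimal sketch of that arithmetic, using the two values from the journal above (the function name fault_offset is made up for illustration):

```shell
# fault_offset IP BASE: print the offset of the faulting instruction
# relative to the start of the mapping named in the kernel's segfault line.
# Both arguments are hex strings without a 0x prefix, as the kernel prints them.
fault_offset() {
    printf '0x%x\n' $(( 0x$1 - 0x$2 ))
}

# Values taken from the journal line:
#   segfault at 71 ip 0000559045bc4589 ... in qemu-system-x86_64[559045b37000+4a5000]
fault_offset 0000559045bc4589 559045b37000
```

The offset is relative to the start of that mapping, not necessarily to the start of the ELF file, so resolving it to a function with debuginfo installed may need adjusting by the file offset at which that segment is mapped. Treat this as a rough pointer rather than a replacement for a real backtrace.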
It's kind of tedious, but:

https://libvirt.org/kbase/debuglogs.html

Basically it involves editing /etc/libvirt/libvirtd.conf to enable debugging and restarting libvirtd. You should then get (extremely copious, in my experience) debug logs, and the information should be in there ... somewhere.
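Once debug logging is enabled, the monitor traffic can be filtered out of the noise. A rough sketch, assuming the log ends up in /var/log/libvirt/libvirtd.log and that libvirt tags outgoing monitor commands with QEMU_MONITOR_SEND_MSG; both are assumptions to check against your libvirt version and log settings, and monitor_cmds is a hypothetical helper name:

```shell
# monitor_cmds: read libvirtd debug log lines on stdin and print only the
# part of each matching line from the QEMU_MONITOR_SEND_MSG tag onwards.
# The tag name is an assumption about libvirt's debug log format.
monitor_cmds() {
    grep -o 'QEMU_MONITOR_SEND_MSG.*'
}

# Usage (log path is an assumption; adjust to your log_outputs setting):
#   monitor_cmds < /var/log/libvirt/libvirtd.log
```

Running the attach-interface step right after restarting libvirtd keeps the log short, so the commands sent for the hotplug should be the last few lines printed.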
This is still broken in rc4, unfortunately. I sent a patch upstream. I'm duping this to an earlier-filed bug.

https://lists.gnu.org/archive/html/qemu-devel/2021-04/msg04119.html

*** This bug has been marked as a duplicate of bug 1948959 ***