Note the 'D' in:

cagney 14175 36.9 2.1 3446808 355444 ? D 12:18 88:09 /usr/bin/qemu-system-x86_64 -machine accel=kvm -name guest=l.east,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-489-l.east/master-key.aes -machine pc-0.15,accel=kvm,usb=off,dump-guest-core=off -m 512 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid 2b47a9c0-ab71-47e3-a818-b812e68fdd46 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-489-l.east/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0xa -drive file=/home/libreswan/pool/l.east.qcow2,format=qcow2,if=none,id=drive-virtio-disk0,cache=writeback -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0xb,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -fsdev local,security_model=none,id=fsdev-fs0,path=/home/libreswan/wip-lswlog/testing -device virtio-9p-pci,id=fs0,fsdev=fsdev-fs0,mount_tag=testing,bus=pci.0,addr=0x3 -fsdev local,security_model=none,id=fsdev-fs1,path=/home/libreswan/wip-lswlog -device virtio-9p-pci,id=fs1,fsdev=fsdev-fs1,mount_tag=swansource,bus=pci.0,addr=0x4 -fsdev local,security_model=none,id=fsdev-fs2,path=/tmp -device virtio-9p-pci,id=fs2,fsdev=fsdev-fs2,mount_tag=tmp,bus=pci.0,addr=0x5 -netdev tap,fd=28,id=hostnet0,vhost=on,vhostfd=30 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=12:00:00:dc:bc:ff,bus=pci.0,addr=0x6 -netdev tap,fd=31,id=hostnet1,vhost=on,vhostfd=32 -device virtio-net-pci,netdev=hostnet1,id=net1,mac=12:00:00:64:64:23,bus=pci.0,addr=0x8 -netdev tap,fd=33,id=hostnet2,vhost=on,vhostfd=34 -device virtio-net-pci,netdev=hostnet2,id=net2,mac=12:00:00:32:64:23,bus=pci.0,addr=0x9 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev spicevmc,id=charchannel0,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -spice port=5901,addr=127.0.0.1,disable-ticketing,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0xc -object rng-random,id=objrng0,filename=/dev/random -device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0x7 -msg timestamp=on

$ sudo virsh dominfo l.east
Id:             489
Name:           l.east
UUID:           2b47a9c0-ab71-47e3-a818-b812e68fdd46
OS Type:        hvm
State:          in shutdown
CPU(s):         1
CPU time:       5289.4s
Max memory:     524288 KiB
Used memory:    524288 KiB
Persistent:     yes
Autostart:      disable
Managed save:   no
Security model: none
Security DOI:   0

$ uname -a
Linux bernard 4.14.8-300.fc27.x86_64 #1 SMP Wed Dec 20 19:00:18 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

$ rpm -q kernel
kernel-4.14.5-300.fc27.x86_64
kernel-4.14.8-300.fc27.x86_64

$ rpm -q qemu-kvm
qemu-kvm-2.11.0-4.fc27.x86_64

This happens with the 4.14.8-300 kernel but not with the 4.14.5-300 kernel.

As usual, this can be reproduced by running libreswan's test-suite (which is rebooting domains hundreds of times). See: https://libreswan.org/wiki/Test_Suite
I guess it's hanging in a system call but without knowing what syscall it's hard to say more. The only thing which is going to help us is if you can dump a stack trace of the qemu process when it gets into this state. See: http://blog.kevac.org/2013/02/uninterruptible-sleep-d-state.html
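For reference, here is a minimal sketch of one way to grab such a stack trace without SysRq. It assumes the kernel exposes /proc/<pid>/stack (normally readable by root only); the function names filter_d_state and dump_blocked_stacks are made up for illustration.

```shell
#!/bin/sh
# Read "PID STAT" pairs on stdin and print the PIDs whose state
# starts with D (uninterruptible sleep).
filter_d_state() {
    awk '$2 ~ /^D/ { print $1 }'
}

# Hypothetical helper: print the kernel stack of every D-state task.
# Assumption: /proc/<pid>/stack is available and we are running as root.
dump_blocked_stacks() {
    for pid in $(ps -eo pid=,stat= | filter_d_state); do
        echo "== pid $pid =="
        cat "/proc/$pid/stack" 2>/dev/null || echo "(stack not readable)"
    done
}
```

Running dump_blocked_stacks as root while the qemu process is stuck should show which kernel function it is sleeping in.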
Attempt #1

[21703.440546] sysrq: SysRq : This sysrq operation is disabled.
Attempt #2

[root@bernard ~]# echo "1" > /proc/sys/kernel/sysrq
[root@bernard ~]# cat !$
cat /proc/sys/kernel/sysrq
1
[root@bernard ~]# echo w > /proc/sysrq-trigger
[root@bernard ~]# dmesg -c
[22052.776488] sysrq: SysRq : Show Blocked State
[22052.776499] task PC stack pid father
[22052.776849] qemu-system-x86 D 0 11335 1 0x00000004
[22052.776857] Call Trace:
[22052.776870] __schedule+0x239/0x860
[22052.776876] schedule+0x2c/0x80
[22052.776886] vhost_net_ubuf_put_and_wait+0x60/0x90 [vhost_net]
[22052.776894] ? finish_wait+0x80/0x80
[22052.776901] vhost_net_ioctl+0x532/0x900 [vhost_net]
[22052.776909] ? kmem_cache_free+0x1ba/0x1e0
[22052.776915] ? __dentry_kill+0x115/0x150
[22052.776919] ? dput.part.23+0x18d/0x1c0
[22052.776926] do_vfs_ioctl+0xa5/0x600
[22052.776933] ? ____fput+0xe/0x10
[22052.776939] SyS_ioctl+0x79/0x90
[22052.776946] entry_SYSCALL_64_fastpath+0x1a/0xa5
[22052.776951] RIP: 0033:0x7f47b684f817
[22052.776954] RSP: 002b:00007ffdde355b48 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[22052.776959] RAX: ffffffffffffffda RBX: 00007f47a542b0f0 RCX: 00007f47b684f817
[22052.776963] RDX: 00007ffdde355b50 RSI: 000000004008af30 RDI: 0000000000000020
[22052.776966] RBP: 0000000000000000 R08: 000055794a8b94b0 R09: 000055794a8b4b12
[22052.776969] R10: 0000000000000002 R11: 0000000000000246 R12: 0000000000000000
[22052.776972] R13: 000055794ca622d0 R14: 00007f47a542b090 R15: 0000000000000001
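The register dump carries a bit more detail. ORIG_RAX is 0x10 (syscall 16, which is ioctl on x86_64), and by the x86_64 syscall convention RDI and RSI hold the first two arguments, so the call was made on file descriptor 0x20 (32) with request 0x4008af30. A small sketch that splits the request into the usual Linux _IOC fields; decode_ioctl is a made-up name, and the bit layout is assumed to be the generic one used on x86.

```shell
#!/bin/sh
# Split an ioctl request number into its _IOC fields.
# Assumed layout (asm-generic/ioctl.h as used on x86):
#   bits 30-31 direction (1 = write, 2 = read), bits 16-29 argument size,
#   bits 8-15 type ("magic"), bits 0-7 command number.
decode_ioctl() {
    req=$(( $1 ))
    printf 'dir=%d size=%d type=0x%02x nr=0x%02x\n' \
        $(( (req >> 30) & 0x3 )) \
        $(( (req >> 16) & 0x3fff )) \
        $(( (req >> 8) & 0xff )) \
        $(( req & 0xff ))
}

decode_ioctl 0x4008af30   # prints: dir=1 size=8 type=0xaf nr=0x30
```

Type 0xaf, number 0x30, write direction and an 8-byte argument looks like VHOST_NET_SET_BACKEND from linux/vhost.h, which would fit the vhost_net_ioctl frame in the trace.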
Interesting, something related to vhost-net, and I notice that you're using 3 x vhost-net network interfaces in your guest. You could try turning vhost-net off and seeing if that makes a difference but based on your stack trace that would seem to be the cause of the problem.
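To check how many vhost-backed interfaces a given qemu process has, something along these lines might help. This is only a sketch: count_vhost_netdevs is a hypothetical helper, and it simply counts arguments containing vhost=on in a command line read from stdin (either space-separated as in ps output, or NUL-separated as in /proc/<pid>/cmdline).

```shell
#!/bin/sh
# Count the qemu arguments that enable vhost (vhost=on) in a command
# line read from stdin. NUL separators (as found in /proc/<pid>/cmdline)
# and spaces are both treated as argument boundaries.
count_vhost_netdevs() {
    tr '\000' ' ' | tr ' ' '\n' | grep -c 'vhost=on'
}
```

For example, `count_vhost_netdevs < /proc/14175/cmdline` would be expected to report 3 for the command line shown in the original report.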
BTW next step is to examine all the commits in the 4.14 stable branch of the kernel and see if any of them are likely causing a regression with vhost-net.
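A rough sketch of how that commit review could start, assuming a local clone of the stable kernel tree that carries the upstream v4.14.x tags. The helper names kernel_tag and vhost_commits_between are made up, and Fedora kernels can carry extra patches, so the upstream range is only an approximation of what changed between the two packages.

```shell
#!/bin/sh
# Map an rpm kernel version such as kernel-4.14.5-300.fc27.x86_64
# (or 4.14.5-300.fc27.x86_64) to the upstream tag name v4.14.5.
kernel_tag() {
    v=${1#kernel-}
    printf 'v%s\n' "${v%%-*}"
}

# Hypothetical helper: list commits touching drivers/vhost between the
# last good and first bad kernel.
# $1 = path to the kernel clone, $2 = good version, $3 = bad version.
vhost_commits_between() {
    git -C "$1" log --oneline \
        "$(kernel_tag "$2")..$(kernel_tag "$3")" -- drivers/vhost
}
```

For this report the call would be something like `vhost_commits_between ~/linux-stable kernel-4.14.5-300.fc27.x86_64 kernel-4.14.8-300.fc27.x86_64` (the clone path is a placeholder). Widening the path list to the tun/tap drivers may be worthwhile if nothing in drivers/vhost stands out.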
(In reply to Richard W.M. Jones from comment #4)
> Interesting, something related to vhost-net, and I notice that you're
> using 3 x vhost-net network interfaces in your guest.

Yes, and multiple domains are also sharing these virtual networks. l.east just happens to be the first domain being told to reboot (and had "successfully" rebooted - but note bug 1374918 - for several hours before locking up). I'll look at my logs and see if anything jumps out.

> You could try turning vhost-net off and seeing if that makes a difference
> but based on your stack trace that would seem to be the cause of the problem.
Kernel kernel-4.14.11-300.fc27.x86_64 doesn't appear to be broken.