Yesterday, pretty much all of the cockpit-machines tests started to fail in rawhide. I bisected this to the qemu 9.1.1-2 → 9.2.0-0.1.rc1 upgrade (https://bodhi.fedoraproject.org/updates/FEDORA-2024-d9af925186), which causes a crash.

Reproducible: Always

Steps to Reproduce:

virsh net-start default
virt-install --memory 50 --pxe --virt-type qemu --os-variant alpinelinux3.8 --disk none --wait 0 --name test1
# should be running now
virsh list --all
virsh domstats test1
# now it's stopped
virsh list --all

Actual Results:

The VM crashes. /var/log/libvirt/qemu/test1.log shows:

qemu-system-x86_64: ../qapi/qobject-output-visitor.c:95: qobject_output_add_obj: Assertion `name' failed.
2024-11-29 08:31:13.738+0000: shutting down, reason=crashed

Expected Results:

The VM keeps running and domstats completes successfully.

Versions:

qemu-system-x86-core-9.2.0-0.1.rc1.fc42.x86_64
libvirt-daemon-driver-qemu-10.9.0-1.fc42.x86_64 (but libvirt didn't change)
I was able to reproduce this. Note that it's the 'domstats' command which causes the crash in qemu.
Richard: Yes, I suppose I didn't point that out explicitly. Indeed it's domstats.
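For completeness: under the hood this is the QMP qom-get command reading the balloon device's guest-stats property (visible in the backtrace below), so it can also be triggered without going through libvirt's stats collection. The QOM path in this one-liner is a guess on my part and will differ depending on how the balloon device was created; virsh qom-list can help locate the right one:

virsh qemu-monitor-command test1 '{"execute": "qom-get", "arguments": {"path": "/machine/peripheral-anon/device[1]", "property": "guest-stats"}}'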
Thread 5 (Thread 0x7ff0224006c0 (LWP 3862699) "qemu-system-x86"):
#0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
#1  0x0000564a0a26ff62 in qemu_futex_wait (val=<optimized out>, f=<optimized out>) at /usr/src/debug/qemu-9.2.0-0.1.rc1.fc42.x86_64/include/qemu/futex.h:29
#2  qemu_event_wait (ev=ev@entry=0x564a0b386b48 <rcu_call_ready_event>) at ../util/qemu-thread-posix.c:464
#3  0x0000564a0a27a666 in call_rcu_thread (opaque=<optimized out>) at ../util/rcu.c:278
#4  0x0000564a0a26f455 in qemu_thread_start (args=0x564a355cdf80) at ../util/qemu-thread-posix.c:541
#5  0x00007ff0237ee797 in start_thread (arg=<optimized out>) at pthread_create.c:447
#6  0x00007ff02387278c in __GI___clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78

Thread 4 (Thread 0x7ff0212006c0 (LWP 3862700) "IO mon_iothread"):
#0  0x00007ff023864f70 in __GI_ppoll (fds=fds@entry=0x7fefc8000cf0, nfds=nfds@entry=3, timeout=<optimized out>, timeout@entry=0x0, sigmask=sigmask@entry=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:42
#1  0x00007ff023d818c0 in ppoll (__fds=0x7fefc8000cf0, __nfds=3, __timeout=0x0, __ss=0x0) at /usr/include/bits/poll2.h:101
#2  g_main_context_poll_unlocked (priority=2147483647, context=0x564a35644030, timeout_usec=<optimized out>, fds=0x7fefc8000cf0, n_fds=3) at ../glib/gmain.c:4579
#3  g_main_context_iterate_unlocked.isra.0 (context=0x564a35644030, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at ../glib/gmain.c:4257
#4  0x00007ff023d273f7 in g_main_loop_run (loop=0x564a356441f0) at ../glib/gmain.c:4464
#5  0x0000564a0a13bae9 in iothread_run (opaque=0x564a35926850) at ../iothread.c:70
#6  0x0000564a0a26f455 in qemu_thread_start (args=0x564a35644210) at ../util/qemu-thread-posix.c:541
#7  0x00007ff0237ee797 in start_thread (arg=<optimized out>) at pthread_create.c:447
#8  0x00007ff023872594 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:100

Thread 3 (Thread 0x7fefcec006c0 (LWP 3862701) "CPU 0/TCG"):
#0  0x00007ff0237eae69 in __futex_abstimed_wait_common64 (private=0, futex_word=0x564a3564593c, expected=0, op=393, abstime=0x0, cancel=true) at futex-internal.c:57
#1  __futex_abstimed_wait_common (futex_word=futex_word@entry=0x564a3564593c, expected=expected@entry=0, clockid=clockid@entry=0, abstime=abstime@entry=0x0, private=private@entry=0, cancel=cancel@entry=true) at futex-internal.c:87
#2  0x00007ff0237eaeef in __GI___futex_abstimed_wait_cancelable64 (futex_word=futex_word@entry=0x564a3564593c, expected=expected@entry=0, clockid=clockid@entry=0, abstime=abstime@entry=0x0, private=private@entry=0) at futex-internal.c:139
#3  0x00007ff0237ed8b9 in __pthread_cond_wait_common (cond=0x564a35645910, mutex=0x564a35645910, clockid=0, abstime=0x0) at pthread_cond_wait.c:503
#4  ___pthread_cond_wait (cond=0x564a35645910, mutex=mutex@entry=0x564a0b3598e0 <bql>) at pthread_cond_wait.c:618
#5  0x0000564a0a26f9ca in qemu_cond_wait_impl (cond=<optimized out>, mutex=0x564a0b3598e0 <bql>, file=0x564a0a3e9bff "../system/cpus.c", line=462) at ../util/qemu-thread-posix.c:225
#6  0x0000564a09ec8e63 in qemu_wait_io_event (cpu=cpu@entry=0x564a3592eda0) at ../system/cpus.c:462
#7  0x00007ff0212af211 in mttcg_cpu_thread_fn (arg=0x564a3592eda0) at ../accel/tcg/tcg-accel-ops-mttcg.c:118
#8  0x0000564a0a26f455 in qemu_thread_start (args=0x564a35991110) at ../util/qemu-thread-posix.c:541
#9  0x00007ff0237ee797 in start_thread (arg=<optimized out>) at pthread_create.c:447
#10 0x00007ff023872594 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:100

Thread 2 (Thread 0x7fefcda006c0 (LWP 3862702) "SPICE Worker"):
#0  0x00007ff023864f70 in __GI_ppoll (fds=fds@entry=0x7fefb0000b70, nfds=nfds@entry=2, timeout=<optimized out>, timeout@entry=0x7fefcd9fee70, sigmask=sigmask@entry=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:42
#1  0x00007ff023d818c0 in ppoll (__fds=0x7fefb0000b70, __nfds=2, __timeout=0x7fefcd9fee70, __ss=0x0) at /usr/include/bits/poll2.h:101
#2  g_main_context_poll_unlocked (priority=2147483647, context=0x564a3682b2f0, timeout_usec=<optimized out>, fds=0x7fefb0000b70, n_fds=2) at ../glib/gmain.c:4579
#3  g_main_context_iterate_unlocked.isra.0 (context=0x564a3682b2f0, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at ../glib/gmain.c:4257
#4  0x00007ff023d273f7 in g_main_loop_run (loop=0x7fefb0000cb0) at ../glib/gmain.c:4464
#5  0x00007ff021a7957a in red_worker_main (arg=0x564a3682b240) at /usr/src/debug/spice-0.15.1-6.fc41.x86_64/server/red-worker.cpp:1021
#6  0x00007ff0237ee797 in start_thread (arg=<optimized out>) at pthread_create.c:447
#7  0x00007ff023872594 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:100

Thread 1 (Thread 0x7ff022bd7700 (LWP 3862697) "qemu-system-x86"):
#0  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#1  0x00007ff0237f0793 in __pthread_kill_internal (threadid=<optimized out>, signo=6) at pthread_kill.c:78
#2  0x00007ff023797d0e in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3  0x00007ff02377f942 in __GI_abort () at abort.c:79
#4  0x00007ff02377f85e in __assert_fail_base (fmt=0x7ff023933cb0 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x564a0a3fd38f "name", file=file@entry=0x564a0a46ff28 "../qapi/qobject-output-visitor.c", line=line@entry=95, function=function@entry=0x564a0a54b240 <__PRETTY_FUNCTION__.6> "qobject_output_add_obj") at assert.c:94
#5  0x00007ff02378fe47 in __assert_fail (assertion=assertion@entry=0x564a0a3fd38f "name", file=file@entry=0x564a0a46ff28 "../qapi/qobject-output-visitor.c", line=line@entry=95, function=function@entry=0x564a0a54b240 <__PRETTY_FUNCTION__.6> "qobject_output_add_obj") at assert.c:103
#6  0x0000564a09cd2560 in qobject_output_add_obj (qov=qov@entry=0x564a36dbf7e0, name=name@entry=0x0, value=<optimized out>) at ../qapi/qobject-output-visitor.c:95
#7  0x0000564a0a260547 in qobject_output_type_uint64 (v=0x564a36dbf7e0, name=0x0, obj=<optimized out>, errp=<optimized out>) at ../qapi/qobject-output-visitor.c:163
#8  0x0000564a0a03b731 in balloon_stats_get_all (obj=<optimized out>, v=0x564a35e016d0, name=<optimized out>, opaque=<optimized out>, errp=0x7ffec773ebb0) at ../hw/virtio/virtio-balloon.c:258
#9  0x0000564a0a0d7142 in object_property_get (obj=0x564a368d2200, name=0x564a368d3bd0 "guest-stats", v=v@entry=0x564a35e016d0, errp=errp@entry=0x7ffec773ec20) at ../qom/object.c:1435
#10 0x0000564a0a0d71f8 in property_get_alias (obj=<optimized out>, v=<optimized out>, name=<optimized out>, opaque=0x564a368d3bb0, errp=0x7ffec773ec20) at ../qom/object.c:2779
#11 0x0000564a0a0d7142 in object_property_get (obj=obj@entry=0x564a368c9c60, name=name@entry=0x564a364c4790 "guest-stats", v=v@entry=0x564a36dbf7e0, errp=errp@entry=0x7ffec773ecc0) at ../qom/object.c:1435
#12 0x0000564a0a0daca3 in object_property_get_qobject (obj=0x564a368c9c60, name=0x564a364c4790 "guest-stats", errp=errp@entry=0x7ffec773ecc0) at ../qom/qom-qobject.c:40
#13 0x0000564a0a1e416e in qmp_qom_get (path=<optimized out>, property=<optimized out>, errp=errp@entry=0x7ffec773ecc0) at ../qom/qom-qmp-cmds.c:89
#14 0x0000564a0a23c8cb in qmp_marshal_qom_get (args=0x7fefc8005a00, ret=0x7ff022bd3e98, errp=0x7ff022bd3e90) at qapi/qapi-commands-qom.c:130
#15 0x0000564a0a262222 in do_qmp_dispatch_bh (opaque=0x7ff022bd3ea0) at ../qapi/qmp-dispatch.c:128
#16 0x0000564a0a282776 in aio_bh_poll (ctx=ctx@entry=0x564a35640620) at ../util/async.c:219
#17 0x0000564a0a26c61b in aio_dispatch (ctx=0x564a35640620) at ../util/aio-posix.c:424
#18 0x0000564a0a2824f6 in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at ../util/async.c:361
#19 0x00007ff023d2136c in g_main_dispatch (context=0x564a356c1980) at ../glib/gmain.c:3348
#20 g_main_context_dispatch_unlocked (context=context@entry=0x564a356c1980) at ../glib/gmain.c:4197
#21 0x00007ff023d21635 in g_main_context_dispatch (context=0x564a356c1980) at ../glib/gmain.c:4185
#22 0x0000564a0a283c28 in glib_pollfds_poll () at ../util/main-loop.c:287
#23 os_host_main_loop_wait (timeout=0) at ../util/main-loop.c:310
#24 main_loop_wait (nonblocking=nonblocking@entry=0) at ../util/main-loop.c:589
#25 0x0000564a09ed35f0 in qemu_main_loop () at ../system/runstate.c:835
#26 0x0000564a0a1e7372 in qemu_default_main () at ../system/main.c:37
#27 0x00007ff023781248 in __libc_start_call_main (main=main@entry=0x564a09cda440 <main>, argc=argc@entry=87, argv=argv@entry=0x7ffec773f0a8) at ../sysdeps/nptl/libc_start_call_main.h:58
#28 0x00007ff02378130b in __libc_start_main_impl (main=0x564a09cda440 <main>, argc=87, argv=0x7ffec773f0a8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffec773f098) at ../csu/libc-start.c:360
#29 0x0000564a09cda935 in _start ()
An update of the Linux headers introduced new virtio-balloon stats, which broke existing code in QEMU: https://lists.nongnu.org/archive/html/qemu-devel/2024-11/msg05455.html

I'm patching Fedora now, and the fix ought to be upstream for the 9.2.0 GA release.
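For anyone wondering how a header update can end in an abort: balloon_stats_get_all() in hw/virtio/virtio-balloon.c visits every stat ID up to VIRTIO_BALLOON_S_NR, taking each member name from a lookup table. When the new headers grow VIRTIO_BALLOON_S_NR but the name table isn't extended to match, the lookup yields NULL, and the QAPI output visitor's `name' assertion fires. A minimal sketch of that failure mode (illustrative only, not the actual QEMU code; the count and the stat_names table are simplified):

/* Illustrative sketch of the failure mode, not the actual QEMU code. */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Pretend the updated kernel headers bumped the stat count... */
#define VIRTIO_BALLOON_S_NR 13

/* ...but the name table was written against the old count, so the
 * entries for the new stat IDs are implicitly NULL. */
static const char *stat_names[VIRTIO_BALLOON_S_NR] = {
    "stat-swap-in", "stat-swap-out", "stat-major-faults", "stat-minor-faults",
    "stat-free-memory", "stat-total-memory", "stat-available-memory",
    "stat-disk-caches", "stat-htlb-pgalloc", "stat-htlb-pgfail",
    /* new IDs from the header update: no names added here */
};

/* Stand-in for the QAPI output visitor: every member must be named. */
static void visit_stat(const char *name, uint64_t val)
{
    assert(name);   /* <- the "Assertion `name' failed" from the log */
    printf("%s = %llu\n", name, (unsigned long long)val);
}

int main(void)
{
    /* Like balloon_stats_get_all(), loop over all known stat IDs... */
    for (int i = 0; i < VIRTIO_BALLOON_S_NR; i++) {
        /* ...and abort on the first ID whose name was never filled in. */
        visit_stat(stat_names[i], 0);
    }
    return 0;
}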
Thanks Richard and Daniel, that was light-speed bug fixing! ♥
@berrange This bug can take a VM down via the unprivileged, read-only command 'virsh -r domstats'. I think we should request a CVE for it.
(In reply to Han Han from comment #6)
> @berrange This bug can take a VM down via the unprivileged, read-only
> command 'virsh -r domstats'. I think we should request a CVE for it.

This bug was never present in any upstream release of QEMU, nor in a release of Fedora. It only existed in Fedora rawhide for a short time because we shipped a pre-release snapshot of QEMU code. As such it doesn't justify a CVE from the Fedora or upstream POV.
This bug appears to have been reported against 'rawhide' during the Fedora Linux 42 development cycle. Changing version to 42.