Bug 2329448

Summary: qemu 9.1.1.2 → 9.2.0-0.1.rc1 crashes with ../qapi/qobject-output-visitor.c:95: qobject_output_add_obj: Assertion `name' failed.
Product: Fedora
Component: qemu
Version: 42
Status: NEW
Severity: urgent
Priority: unspecified
Hardware: Unspecified
OS: Linux
Keywords: Regression, Security
Reporter: Martin Pitt <mpitt>
Assignee: Fedora Virtualization Maintainers <virt-maint>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: berrange, cfergeau, crobinso, hhan, mcascell, pbonzini, philmd, rjones, suraj.ghimire7, virt-maint
URL: https://artifacts.dev.testing-farm.io/6c318378-1eb8-4502-8722-4835a1596af8/
Whiteboard: CockpitTest

Description Martin Pitt 2024-11-29 08:35:56 UTC
Yesterday, pretty much all of the cockpit-machines tests started to fail in rawhide. I bisected this to the qemu 9.1.1.2 → 9.2.0-0.1.rc1 upgrade (https://bodhi.fedoraproject.org/updates/FEDORA-2024-d9af925186), which causes a crash.

Reproducible: Always

Steps to Reproduce:
virsh net-start default
virt-install --memory 50 --pxe --virt-type qemu --os-variant alpinelinux3.8 --disk none --wait 0 --name test1

# should be running now
virsh list --all

virsh domstats test1

# now it's stopped
virsh list --all

Actual Results:  
The VM crashes. /var/log/libvirt/qemu/test1.log shows:

qemu-system-x86_64: ../qapi/qobject-output-visitor.c:95: qobject_output_add_obj: Assertion `name' failed.
2024-11-29 08:31:13.738+0000: shutting down, reason=crashed


Expected Results:  
The VM keeps running and virsh domstats returns statistics successfully.

Versions:
qemu-system-x86-core-9.2.0-0.1.rc1.fc42.x86_64
libvirt-daemon-driver-qemu-10.9.0-1.fc42.x86_64 (but libvirt didn't change)

Comment 1 Richard W.M. Jones 2024-11-29 09:01:52 UTC
I was able to reproduce this. Note that it's the 'domstats' command which causes the crash in qemu.

Comment 2 Martin Pitt 2024-11-29 09:04:21 UTC
Richard: Yes, I suppose I didn't point that out explicitly. Indeed it's domstats.

Comment 3 Richard W.M. Jones 2024-11-29 09:12:11 UTC
Thread 5 (Thread 0x7ff0224006c0 (LWP 3862699) "qemu-system-x86"):
#0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
#1  0x0000564a0a26ff62 in qemu_futex_wait (val=<optimized out>, f=<optimized out>) at /usr/src/debug/qemu-9.2.0-0.1.rc1.fc42.x86_64/include/qemu/futex.h:29
#2  qemu_event_wait (ev=ev@entry=0x564a0b386b48 <rcu_call_ready_event>) at ../util/qemu-thread-posix.c:464
#3  0x0000564a0a27a666 in call_rcu_thread (opaque=<optimized out>) at ../util/rcu.c:278
#4  0x0000564a0a26f455 in qemu_thread_start (args=0x564a355cdf80) at ../util/qemu-thread-posix.c:541
#5  0x00007ff0237ee797 in start_thread (arg=<optimized out>) at pthread_create.c:447
#6  0x00007ff02387278c in __GI___clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78

Thread 4 (Thread 0x7ff0212006c0 (LWP 3862700) "IO mon_iothread"):
#0  0x00007ff023864f70 in __GI_ppoll (fds=fds@entry=0x7fefc8000cf0, nfds=nfds@entry=3, timeout=<optimized out>, timeout@entry=0x0, sigmask=sigmask@entry=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:42
#1  0x00007ff023d818c0 in ppoll (__fds=0x7fefc8000cf0, __nfds=3, __timeout=0x0, __ss=0x0) at /usr/include/bits/poll2.h:101
#2  g_main_context_poll_unlocked (priority=2147483647, context=0x564a35644030, timeout_usec=<optimized out>, fds=0x7fefc8000cf0, n_fds=3) at ../glib/gmain.c:4579
#3  g_main_context_iterate_unlocked.isra.0 (context=0x564a35644030, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at ../glib/gmain.c:4257
#4  0x00007ff023d273f7 in g_main_loop_run (loop=0x564a356441f0) at ../glib/gmain.c:4464
#5  0x0000564a0a13bae9 in iothread_run (opaque=0x564a35926850) at ../iothread.c:70
#6  0x0000564a0a26f455 in qemu_thread_start (args=0x564a35644210) at ../util/qemu-thread-posix.c:541
#7  0x00007ff0237ee797 in start_thread (arg=<optimized out>) at pthread_create.c:447
#8  0x00007ff023872594 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:100

Thread 3 (Thread 0x7fefcec006c0 (LWP 3862701) "CPU 0/TCG"):
#0  0x00007ff0237eae69 in __futex_abstimed_wait_common64 (private=0, futex_word=0x564a3564593c, expected=0, op=393, abstime=0x0, cancel=true) at futex-internal.c:57
#1  __futex_abstimed_wait_common (futex_word=futex_word@entry=0x564a3564593c, expected=expected@entry=0, clockid=clockid@entry=0, abstime=abstime@entry=0x0, private=private@entry=0, cancel=cancel@entry=true) at futex-internal.c:87
#2  0x00007ff0237eaeef in __GI___futex_abstimed_wait_cancelable64 (futex_word=futex_word@entry=0x564a3564593c, expected=expected@entry=0, clockid=clockid@entry=0, abstime=abstime@entry=0x0, private=private@entry=0) at futex-internal.c:139
#3  0x00007ff0237ed8b9 in __pthread_cond_wait_common (cond=0x564a35645910, mutex=0x564a35645910, clockid=0, abstime=0x0) at pthread_cond_wait.c:503
#4  ___pthread_cond_wait (cond=0x564a35645910, mutex=mutex@entry=0x564a0b3598e0 <bql>) at pthread_cond_wait.c:618
#5  0x0000564a0a26f9ca in qemu_cond_wait_impl (cond=<optimized out>, mutex=0x564a0b3598e0 <bql>, file=0x564a0a3e9bff "../system/cpus.c", line=462) at ../util/qemu-thread-posix.c:225
#6  0x0000564a09ec8e63 in qemu_wait_io_event (cpu=cpu@entry=0x564a3592eda0) at ../system/cpus.c:462
#7  0x00007ff0212af211 in mttcg_cpu_thread_fn (arg=0x564a3592eda0) at ../accel/tcg/tcg-accel-ops-mttcg.c:118
#8  0x0000564a0a26f455 in qemu_thread_start (args=0x564a35991110) at ../util/qemu-thread-posix.c:541
#9  0x00007ff0237ee797 in start_thread (arg=<optimized out>) at pthread_create.c:447
#10 0x00007ff023872594 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:100

Thread 2 (Thread 0x7fefcda006c0 (LWP 3862702) "SPICE Worker"):
#0  0x00007ff023864f70 in __GI_ppoll (fds=fds@entry=0x7fefb0000b70, nfds=nfds@entry=2, timeout=<optimized out>, timeout@entry=0x7fefcd9fee70, sigmask=sigmask@entry=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:42
#1  0x00007ff023d818c0 in ppoll (__fds=0x7fefb0000b70, __nfds=2, __timeout=0x7fefcd9fee70, __ss=0x0) at /usr/include/bits/poll2.h:101
#2  g_main_context_poll_unlocked (priority=2147483647, context=0x564a3682b2f0, timeout_usec=<optimized out>, fds=0x7fefb0000b70, n_fds=2) at ../glib/gmain.c:4579
#3  g_main_context_iterate_unlocked.isra.0 (context=0x564a3682b2f0, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at ../glib/gmain.c:4257
#4  0x00007ff023d273f7 in g_main_loop_run (loop=0x7fefb0000cb0) at ../glib/gmain.c:4464
#5  0x00007ff021a7957a in red_worker_main (arg=0x564a3682b240) at /usr/src/debug/spice-0.15.1-6.fc41.x86_64/server/red-worker.cpp:1021
#6  0x00007ff0237ee797 in start_thread (arg=<optimized out>) at pthread_create.c:447
#7  0x00007ff023872594 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:100

Thread 1 (Thread 0x7ff022bd7700 (LWP 3862697) "qemu-system-x86"):
#0  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#1  0x00007ff0237f0793 in __pthread_kill_internal (threadid=<optimized out>, signo=6) at pthread_kill.c:78
#2  0x00007ff023797d0e in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3  0x00007ff02377f942 in __GI_abort () at abort.c:79
#4  0x00007ff02377f85e in __assert_fail_base (fmt=0x7ff023933cb0 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x564a0a3fd38f "name", file=file@entry=0x564a0a46ff28 "../qapi/qobject-output-visitor.c", line=line@entry=95, function=function@entry=0x564a0a54b240 <__PRETTY_FUNCTION__.6> "qobject_output_add_obj") at assert.c:94
#5  0x00007ff02378fe47 in __assert_fail (assertion=assertion@entry=0x564a0a3fd38f "name", file=file@entry=0x564a0a46ff28 "../qapi/qobject-output-visitor.c", line=line@entry=95, function=function@entry=0x564a0a54b240 <__PRETTY_FUNCTION__.6> "qobject_output_add_obj") at assert.c:103
#6  0x0000564a09cd2560 in qobject_output_add_obj (qov=qov@entry=0x564a36dbf7e0, name=name@entry=0x0, value=<optimized out>) at ../qapi/qobject-output-visitor.c:95
#7  0x0000564a0a260547 in qobject_output_type_uint64 (v=0x564a36dbf7e0, name=0x0, obj=<optimized out>, errp=<optimized out>) at ../qapi/qobject-output-visitor.c:163
#8  0x0000564a0a03b731 in balloon_stats_get_all (obj=<optimized out>, v=0x564a35e016d0, name=<optimized out>, opaque=<optimized out>, errp=0x7ffec773ebb0) at ../hw/virtio/virtio-balloon.c:258
#9  0x0000564a0a0d7142 in object_property_get (obj=0x564a368d2200, name=0x564a368d3bd0 "guest-stats", v=v@entry=0x564a35e016d0, errp=errp@entry=0x7ffec773ec20) at ../qom/object.c:1435
#10 0x0000564a0a0d71f8 in property_get_alias (obj=<optimized out>, v=<optimized out>, name=<optimized out>, opaque=0x564a368d3bb0, errp=0x7ffec773ec20) at ../qom/object.c:2779
#11 0x0000564a0a0d7142 in object_property_get (obj=obj@entry=0x564a368c9c60, name=name@entry=0x564a364c4790 "guest-stats", v=v@entry=0x564a36dbf7e0, errp=errp@entry=0x7ffec773ecc0) at ../qom/object.c:1435
#12 0x0000564a0a0daca3 in object_property_get_qobject (obj=0x564a368c9c60, name=0x564a364c4790 "guest-stats", errp=errp@entry=0x7ffec773ecc0) at ../qom/qom-qobject.c:40
#13 0x0000564a0a1e416e in qmp_qom_get (path=<optimized out>, property=<optimized out>, errp=errp@entry=0x7ffec773ecc0) at ../qom/qom-qmp-cmds.c:89
#14 0x0000564a0a23c8cb in qmp_marshal_qom_get (args=0x7fefc8005a00, ret=0x7ff022bd3e98, errp=0x7ff022bd3e90) at qapi/qapi-commands-qom.c:130
#15 0x0000564a0a262222 in do_qmp_dispatch_bh (opaque=0x7ff022bd3ea0) at ../qapi/qmp-dispatch.c:128
#16 0x0000564a0a282776 in aio_bh_poll (ctx=ctx@entry=0x564a35640620) at ../util/async.c:219
#17 0x0000564a0a26c61b in aio_dispatch (ctx=0x564a35640620) at ../util/aio-posix.c:424
#18 0x0000564a0a2824f6 in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at ../util/async.c:361
#19 0x00007ff023d2136c in g_main_dispatch (context=0x564a356c1980) at ../glib/gmain.c:3348
#20 g_main_context_dispatch_unlocked (context=context@entry=0x564a356c1980) at ../glib/gmain.c:4197
#21 0x00007ff023d21635 in g_main_context_dispatch (context=0x564a356c1980) at ../glib/gmain.c:4185
#22 0x0000564a0a283c28 in glib_pollfds_poll () at ../util/main-loop.c:287
#23 os_host_main_loop_wait (timeout=0) at ../util/main-loop.c:310
#24 main_loop_wait (nonblocking=nonblocking@entry=0) at ../util/main-loop.c:589
#25 0x0000564a09ed35f0 in qemu_main_loop () at ../system/runstate.c:835
#26 0x0000564a0a1e7372 in qemu_default_main () at ../system/main.c:37
#27 0x00007ff023781248 in __libc_start_call_main (main=main@entry=0x564a09cda440 <main>, argc=argc@entry=87, argv=argv@entry=0x7ffec773f0a8) at ../sysdeps/nptl/libc_start_call_main.h:58
#28 0x00007ff02378130b in __libc_start_main_impl (main=0x564a09cda440 <main>, argc=87, argv=0x7ffec773f0a8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffec773f098) at ../csu/libc-start.c:360
#29 0x0000564a09cda935 in _start ()
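
Frames #6-#8 tell the story: balloon_stats_get_all passes name=0x0 through qobject_output_type_uint64 into qobject_output_add_obj, which insists on a non-NULL name whenever it is adding a member to a dict. Paraphrased from ../qapi/qobject-output-visitor.c (simplified sketch, not verbatim; current_container() is a hypothetical stand-in for the visitor's internal stack access):

/* Paraphrase of qobject_output_add_obj(): when the container currently
 * being built is a dict, every member must carry a name. */
static void qobject_output_add_obj(QObjectOutputVisitor *qov,
                                   const char *name, QObject *value)
{
    QObject *cur = current_container(qov);  /* hypothetical accessor */

    if (cur && qobject_type(cur) == QTYPE_QDICT) {
        assert(name);   /* line 95: fires because name == NULL */
        qdict_put_obj(qobject_to(QDict, cur), name, value);
    } else if (cur) {
        qlist_append_obj(qobject_to(QList, cur), value);
    }
}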

Comment 4 Daniel Berrangé 2024-11-29 14:04:42 UTC
An update of the linux headers introduced new stats, which broke existing code in QEMU.

https://lists.nongnu.org/archive/html/qemu-devel/2024-11/msg05455.html

I'm patching Fedora now, and the fix ought to be upstream for the 9.2.0 GA release.
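
The failure pattern described here can be sketched as a standalone C program (a hypothetical reconstruction: the IDs and names are modeled on QEMU's virtio-balloon code but simplified, and the assumption is that the stat-name table lagged behind the updated header's stat count; see the linked thread for the actual fix):

#include <assert.h>
#include <stdio.h>

enum {
    S_SWAP_IN,      /* long-known stat IDs ... */
    S_SWAP_OUT,
    S_MAJFLT,
    S_NEW_STAT,     /* ID added by the linux-headers update */
    S_NR            /* bumped along with it */
};

static const char *stat_names[S_NR] = {
    [S_SWAP_IN]  = "stat-swap-in",
    [S_SWAP_OUT] = "stat-swap-out",
    [S_MAJFLT]   = "stat-major-faults",
    /* no initializer for S_NEW_STAT, so stat_names[S_NEW_STAT] == NULL */
};

/* Stand-in for visit_type_uint64() feeding the QObject output visitor. */
static void visit_stat(const char *name, unsigned long long value)
{
    assert(name);   /* the equivalent of qobject-output-visitor.c:95 */
    printf("%s = %llu\n", name, value);
}

int main(void)
{
    unsigned long long stats[S_NR] = { 0 };

    /* Iterating up to the *new* S_NR walks past the last named entry
     * and hands a NULL name to the visitor, aborting the process. */
    for (int i = 0; i < S_NR; i++) {
        visit_stat(stat_names[i], stats[i]);
    }
    return 0;
}

Running it aborts on the first unnamed stat with `Assertion `name' failed`, matching the log above.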

Comment 5 Martin Pitt 2024-11-29 14:19:23 UTC
Thanks Richard and Daniel, that was light-speed bug fixing! ♥

Comment 6 Han Han 2025-01-15 09:43:16 UTC
@berrange This bug can take a VM down via the unprivileged command 'virsh -r domstats'. I think we should request a CVE for it.

Comment 7 Daniel Berrangé 2025-01-15 10:09:32 UTC
(In reply to Han Han from comment #6)
> @berrange This bug can take a VM down via the unprivileged command
> 'virsh -r domstats'. I think we should request a CVE for it.

This bug was never present in any upstream release of QEMU, nor in any release of Fedora. It only existed in Fedora rawhide for a short time because we shipped a pre-release snapshot of QEMU code. As such it doesn't justify a CVE from either a Fedora or an upstream POV.

Comment 8 Aoife Moloney 2025-02-26 13:18:22 UTC
This bug appears to have been reported against 'rawhide' during the Fedora Linux 42 development cycle.
Changing version to 42.