RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets there. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September, as per pre-agreed dates. Bugs against the components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. Failing that, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The e-mail creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and tagged with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", will have a little "two-footprint" icon next to it, and will direct you to the "RHEL project" in Red Hat Jira (issue links are of the form "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). The same link will be available in a blue banner at the top of the page informing you that the bug has been migrated.
Bug 2124756 - qemu-kvm crashed when starting guest with cputune + kvm dirty-ring enabled
Summary: qemu-kvm crashed when starting guest with cputune + kvm dirty-ring enabled
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: qemu-kvm
Version: 9.1
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Peter Xu
QA Contact: Li Xiaohui
URL:
Whiteboard:
Depends On: 2180898
Blocks:
 
Reported: 2022-09-07 04:13 UTC by yafu
Modified: 2023-11-07 09:15 UTC (History)
18 users

Fixed In Version: qemu-kvm-8.0.0-1.el9
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-11-07 08:26:38 UTC
Type: Bug
Target Upstream Version:
Embargoed:
pm-rhel: mirror+


Attachments: (none)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker RHELPLAN-133438 0 None None None 2022-09-07 12:53:34 UTC
Red Hat Product Errata RHSA-2023:6368 0 None None None 2023-11-07 08:27:17 UTC

Description yafu 2022-09-07 04:13:27 UTC
Description of problem:
qemu-kvm crashed when starting guest with hugepage + cpu hotpluggable + kvm dirty-ring enabled

Version-Release number of selected component (if applicable):
libvirt-8.5.0-5.el9.x86_64
qemu-kvm-7.0.0-12.el9.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Edit the guest XML with hugepage + cpu hotpluggable + kvm dirty-ring enabled settings:
# virsh edit r9
<domain>
...
<memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB'/>
    </hugepages>
  </memoryBacking>
<vcpus>
    <vcpu id='0' enabled='yes' hotpluggable='no' order='1'/>
    <vcpu id='1' enabled='yes' hotpluggable='no' order='2'/>
    <vcpu id='2' enabled='yes' hotpluggable='no' order='3'/>
    <vcpu id='3' enabled='yes' hotpluggable='no' order='4'/>
    <vcpu id='4' enabled='yes' hotpluggable='no' order='5'/>
    <vcpu id='5' enabled='yes' hotpluggable='no' order='6'/>
    <vcpu id='6' enabled='yes' hotpluggable='no' order='7'/>
    <vcpu id='7' enabled='yes' hotpluggable='no' order='8'/>
    <vcpu id='8' enabled='no' hotpluggable='yes'/>
    <vcpu id='9' enabled='no' hotpluggable='yes'/>
    <vcpu id='10' enabled='no' hotpluggable='yes'/>
    <vcpu id='11' enabled='no' hotpluggable='yes'/>
    <vcpu id='12' enabled='no' hotpluggable='yes'/>
    <vcpu id='13' enabled='no' hotpluggable='yes'/>
    <vcpu id='14' enabled='no' hotpluggable='yes'/>
    <vcpu id='15' enabled='no' hotpluggable='yes'/>
  </vcpus>
 <features>
    <acpi/>
    <apic/>
    <pae/>
    <kvm>
      <poll-control state='on'/>
      <pv-ipi state='off'/>
      <dirty-ring state='on' size='4096'/>
    </kvm>
</features>
...
</domain>

2. Start the guest:
# virsh start r9
error: Failed to start domain 'r9'
error: internal error: qemu unexpectedly closed the monitor: qemu-kvm: ../accel/kvm/kvm-all.c:737: uint32_t kvm_dirty_ring_reap_one(KVMState *, CPUState *): Assertion `dirty_gfns && ring_size' failed.

3. Check the coredump file:
# coredumpctl list
TIME                          PID UID GID SIG     COREFILE EXE                     SIZE
Tue 2022-09-06 23:01:06 EDT 38568 107 107 SIGABRT present  /usr/libexec/qemu-kvm 747.2K
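(The matching core can be opened with "coredumpctl gdb 38568"; in step 4 below, "t a a bt" is gdb shorthand for "thread apply all bt".)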

4. Check the backtrace:
(gdb) t a a bt

Thread 5 (Thread 0x7fac3bbff640 (LWP 44860)):
#0  futex_wait (private=0, expected=2, futex_word=0x5623367c1688 <qemu_global_mutex>) at ../sysdeps/nptl/futex-internal.h:146
#1  __GI___lll_lock_wait (futex=futex@entry=0x5623367c1688 <qemu_global_mutex>, private=0) at lowlevellock.c:50
#2  0x00007facce410c22 in lll_mutex_lock_optimized (mutex=0x5623367c1688 <qemu_global_mutex>) at pthread_mutex_lock.c:49
#3  ___pthread_mutex_lock (mutex=0x5623367c1688 <qemu_global_mutex>) at pthread_mutex_lock.c:89
#4  0x000056233611ad7f in qemu_mutex_lock_impl (mutex=0x5623367c1688 <qemu_global_mutex>, file=0x80 <error: Cannot access memory at address 0x80>, line=2) at ../util/qemu-thread-posix.c:80
#5  0x0000562335eec972 in kvm_vcpu_thread_fn (arg=0x5623384d0300) at ../softmmu/cpus.c:503
#6  0x000056233611bbfa in qemu_thread_start (args=0x5623384e0250) at ../util/qemu-thread-posix.c:556
#7  0x00007facce40d802 in start_thread (arg=<optimized out>) at pthread_create.c:443
#8  0x00007facce3ad314 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:100

Thread 4 (Thread 0x7faccaf8a640 (LWP 44859)):
#0  0x00007facce4b071f in __GI___poll (fds=0x7facbc0035d0, nfds=3, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:29
#1  0x00007facce72359c in g_main_context_poll (priority=<optimized out>, n_fds=3, fds=0x7facbc0035d0, timeout=<optimized out>, context=0x5623384bbb00) at ../glib/gmain.c:4434
#2  g_main_context_iterate.constprop.0 (context=0x5623384bbb00, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at ../glib/gmain.c:4126
#3  0x00007facce6ce463 in g_main_loop_run (loop=0x5623382d3220) at ../glib/gmain.c:4329
#4  0x0000562335f3ba9f in iothread_run (opaque=0x562338364640) at ../iothread.c:74
#5  0x000056233611bbfa in qemu_thread_start (args=0x5623382d12f0) at ../util/qemu-thread-posix.c:556
#6  0x00007facce40d802 in start_thread (arg=<optimized out>) at pthread_create.c:443
#7  0x00007facce3ad314 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:100

Thread 3 (Thread 0x7faccc390640 (LWP 44851)):
#0  0x00007facce481845 in __GI___clock_nanosleep (clock_id=clock_id@entry=0, flags=flags@entry=0, req=req@entry=0x7faccc38f5e0, rem=rem@entry=0x7faccc38f5d0) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:48
#1  0x00007facce4863f7 in __GI___nanosleep (req=req@entry=0x7faccc38f5e0, rem=rem@entry=0x7faccc38f5d0) at ../sysdeps/unix/sysv/linux/nanosleep.c:25
#2  0x00007facce6f2ef7 in g_usleep (microseconds=<optimized out>) at ../glib/gtimer.c:277
#3  0x000056233612781a in call_rcu_thread (opaque=<optimized out>) at ../util/rcu.c:253
#4  0x000056233611bbfa in qemu_thread_start (args=0x56233827f4e0) at ../util/qemu-thread-posix.c:556
#5  0x00007facce40d802 in start_thread (arg=<optimized out>) at pthread_create.c:443
#6  0x00007facce3ad450 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

Thread 2 (Thread 0x7faccd395f00 (LWP 44708)):
#0  __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x5623367c1640 <qemu_cpu_cond+40>) at futex-internal.c:57
#1  __futex_abstimed_wait_common (futex_word=futex_word@entry=0x5623367c1640 <qemu_cpu_cond+40>, expected=expected@entry=0, clockid=clockid@entry=0, abstime=abstime@entry=0x0, private=private@entry=0, cancel=cancel@entry=true) at futex-internal.c:87
#2  0x00007facce40a3ff in __GI___futex_abstimed_wait_cancelable64 (futex_word=futex_word@entry=0x5623367c1640 <qemu_cpu_cond+40>, expected=expected@entry=0, clockid=clockid@entry=0, abstime=abstime@entry=0x0, private=private@entry=0) at futex-internal.c:139
#3  0x00007facce40cba0 in __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x5623367c1688 <qemu_global_mutex>, cond=0x5623367c1618 <qemu_cpu_cond>) at pthread_cond_wait.c:504
#4  ___pthread_cond_wait (cond=0x5623367c1618 <qemu_cpu_cond>, mutex=0x5623367c1688 <qemu_global_mutex>) at pthread_cond_wait.c:619
#5  0x000056233611b25f in qemu_cond_wait_impl (cond=0x5623367c1640 <qemu_cpu_cond+40>, mutex=0x189, file=0x0, line=-834624614) at ../util/qemu-thread-posix.c:195
#6  0x0000562335c89d37 in qemu_init_vcpu (cpu=0x5623384d0300) at ../softmmu/cpus.c:643
#7  0x0000562335d62b99 in x86_cpu_realizefn (dev=0x5623384d0300, errp=0x7faccd391ba0) at ../target/i386/cpu.c:6554
#8  0x0000562335efbdb1 in device_set_realized (obj=0x5623384d0300, value=true, errp=0x7faccd391bc8) at ../hw/core/qdev.c:531
#9  0x0000562335f06119 in property_set_bool (obj=0x5623384d0300, v=<optimized out>, name=<optimized out>, opaque=0x5623382f29a0, errp=0x7faccd391bc8) at ../qom/object.c:2273
#10 0x0000562335f017de in object_property_set (obj=0x5623384d0300, name=0x5623361c4aaf "realized", v=0x5623384dfaf0, errp=0x7faccd391bc8) at ../qom/object.c:1408
#11 0x0000562335f09aac in object_property_set_qobject (obj=0x5623384d0300, name=0x5623361c4aaf "realized", value=0x5623384bdce0, errp=0x5623367f1d48 <error_fatal>) at ../qom/qom-qobject.c:28
#12 0x0000562335f04647 in object_property_set_bool (obj=0x5623384d0300, name=0x5623361c4aaf "realized", value=true, errp=0x5623367f1d48 <error_fatal>) at ../qom/object.c:1477
#13 0x0000562335d2f750 in x86_cpu_new (x86ms=<optimized out>, apic_id=0, errp=0x5623367f1d48 <error_fatal>) at ../hw/core/qdev.c:333
#14 0x0000562335d2f86a in x86_cpus_init (x86ms=0x5623384318c0, default_cpu_version=<optimized out>) at ../hw/i386/x86.c:128
#15 0x0000562335d35b86 in pc_q35_init (machine=0x5623384318c0) at ../hw/i386/pc_q35.c:182
#16 0x0000562335babe77 in machine_run_board_init (machine=0x5623384318c0) at ../hw/core/machine.c:1416
#17 0x0000562335c9313d in qmp_x_exit_preconfig (errp=<optimized out>) at ../softmmu/vl.c:2665
#18 0x0000562335c9b34e in qemu_init (argc=<optimized out>, argv=0x7ffd3c6dde88, envp=<optimized out>) at ../softmmu/vl.c:3785
#19 0x0000562335b1f7bd in main (argc=914101824, argv=0x189, envp=0x0) at ../softmmu/main.c:49

Thread 1 (Thread 0x7facc9944640 (LWP 44855)):
#0  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#1  0x00007facce40f5b3 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at pthread_kill.c:78
#2  0x00007facce3c2ce6 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3  0x00007facce3967f3 in __GI_abort () at abort.c:79
#4  0x00007facce39671b in __assert_fail_base (fmt=<optimized out>, assertion=<optimized out>, file=<optimized out>, line=<optimized out>, function=<optimized out>) at assert.c:92
#5  0x00007facce3bbc66 in __GI___assert_fail (assertion=0x5623361c262f "dirty_gfns && ring_size", file=0x5623361c21af "../accel/kvm/kvm-all.c", line=737, function=0x5623361c2647 "uint32_t kvm_dirty_ring_reap_one(KVMState *, CPUState *)") at assert.c:101
#6  0x0000562335ee5ccb in kvm_dirty_ring_reap_locked (s=0x5623384947d0) at ../accel/kvm/kvm-all.c:737
#7  0x0000562335ee591a in kvm_dirty_ring_reap (s=0x5623384947d0) at ../accel/kvm/kvm-all.c:810
#8  kvm_dirty_ring_reaper_thread (data=0x5623384947d0) at ../accel/kvm/kvm-all.c:1473
#9  0x000056233611bbfa in qemu_thread_start (args=0x5623382d0570) at ../util/qemu-thread-posix.c:556
#10 0x00007facce40d802 in start_thread (arg=<optimized out>) at pthread_create.c:443
#11 0x00007facce3ad314 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:100



Actual results:
qemu-kvm crashed when starting guest with hugepage + cpu hotpluggable + kvm dirty-ring enabled

Expected results:
Guest should start successfully.

Additional info:

Comment 2 Li Xiaohui 2022-09-07 12:41:18 UTC
Tested on qemu-kvm-7.0.0-12.el9.x86_64 && libvirt-8.5.0-5.el9.x86_64, and found:
1. can reproduce this bug with a rhel 8.7 (q35+seabios) guest; can't reproduce with a rhel 9.1 (q35+ovmf) guest;
2. can reproduce through libvirt with dirty-ring and cputune configured; it is not specific to hugepages and hotplugged vcpus;
3. can't reproduce on the qemu side with qemu commands similar to those libvirt generates
**************************************************
  <vcpu placement='static' current='8'>16</vcpu>
  <cputune>
    <shares>2048</shares>
    <period>1000000</period>
    <quota>3000</quota>
    <global_period>1000000</global_period>
    <global_quota>4000</global_quota>
    <emulator_period>1000000</emulator_period>
    <emulator_quota>5000</emulator_quota>
    <vcpusched vcpus='0' scheduler='batch'/>
    <vcpusched vcpus='1' scheduler='batch'/>
    <vcpusched vcpus='2' scheduler='batch'/>
    <vcpusched vcpus='3' scheduler='batch'/>
    <vcpusched vcpus='4' scheduler='batch'/>
    <vcpusched vcpus='5' scheduler='batch'/>
    <vcpusched vcpus='6' scheduler='batch'/>
    <vcpusched vcpus='7' scheduler='batch'/>
    <vcpusched vcpus='8' scheduler='batch'/>
    <vcpusched vcpus='9' scheduler='batch'/>
    <vcpusched vcpus='10' scheduler='batch'/>
    <vcpusched vcpus='11' scheduler='batch'/>
    <vcpusched vcpus='12' scheduler='batch'/>
    <vcpusched vcpus='13' scheduler='batch'/>
    <vcpusched vcpus='14' scheduler='batch'/>
    <vcpusched vcpus='15' scheduler='batch'/>
  </cputune>
  <features>
    <acpi/>
    <apic/>
    <kvm>
      <poll-control state='on'/>
      <pv-ipi state='off'/>
      <dirty-ring state='on' size='4096'/>
    </kvm>
  </features>
*************************************************


I found that the qemu command lines libvirt generates are the same with or without cputune (a quick way to double-check this is sketched below). So I think we need to understand what cputune actually does now.
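(A sketch of one way to compare; the XML file names here are hypothetical:
# virsh domxml-to-native qemu-argv --xml r9-with-cputune.xml > with.txt
# virsh domxml-to-native qemu-argv --xml r9-without-cputune.xml > without.txt
# diff with.txt without.txt
Since cputune is applied through cgroups and scheduler settings rather than qemu options, no difference is expected.)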

Nana, do you know how cputune works? Please help needinfo the relevant dev or libvirt QE if you don't know, thanks in advance.

Comment 3 Li Xiaohui 2022-09-07 12:42:08 UTC
The qemu command line that libvirt generates:
/usr/libexec/qemu-kvm \
-name guest=rhel870,debug-threads=on \
-S \
-object '{"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain-1-rhel870/master-key.aes"}' \
-machine pc-q35-rhel9.0.0,usb=off,dump-guest-core=off,memory-backend=pc.ram \
-accel kvm,dirty-ring-size=4096 \
-cpu Skylake-Server-IBRS,ss=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,clflushopt=on,umip=on,pku=on,md-clear=on,stibp=on,arch-capabilities=on,ssbd=on,xsaves=on,ibpb=on,ibrs=on,amd-stibp=on,amd-ssbd=on,rsba=on,skip-l1dfl-vmentry=on,pschange-mc-no=on,kvm-poll-control=on,kvm-pv-ipi=off \
-m 2048 \
-object '{"qom-type":"memory-backend-ram","id":"pc.ram","size":2147483648}' \
-overcommit mem-lock=off \
-smp 8,maxcpus=16,sockets=16,cores=1,threads=1 \
-uuid e18770d8-fb31-4e95-8e22-603608acfe40 \
-no-user-config \
-nodefaults \
-chardev socket,id=charmonitor,fd=23,server=on,wait=off \
-mon chardev=charmonitor,id=monitor,mode=control \
-rtc base=utc,driftfix=slew \
-global kvm-pit.lost_tick_policy=delay \
-no-hpet \
-no-shutdown \
-global ICH9-LPC.disable_s3=1 \
-global ICH9-LPC.disable_s4=1 \
-boot strict=on \
-device '{"driver":"pcie-root-port","port":16,"chassis":1,"id":"pci.1","bus":"pcie.0","multifunction":true,"addr":"0x2"}' \
-device '{"driver":"pcie-root-port","port":17,"chassis":2,"id":"pci.2","bus":"pcie.0","addr":"0x2.0x1"}' \
-device '{"driver":"pcie-root-port","port":18,"chassis":3,"id":"pci.3","bus":"pcie.0","addr":"0x2.0x2"}' \
-device '{"driver":"pcie-root-port","port":19,"chassis":4,"id":"pci.4","bus":"pcie.0","addr":"0x2.0x3"}' \
-device '{"driver":"pcie-root-port","port":20,"chassis":5,"id":"pci.5","bus":"pcie.0","addr":"0x2.0x4"}' \
-device '{"driver":"pcie-root-port","port":21,"chassis":6,"id":"pci.6","bus":"pcie.0","addr":"0x2.0x5"}' \
-device '{"driver":"pcie-root-port","port":22,"chassis":7,"id":"pci.7","bus":"pcie.0","addr":"0x2.0x6"}' \
-device '{"driver":"pcie-root-port","port":23,"chassis":8,"id":"pci.8","bus":"pcie.0","addr":"0x2.0x7"}' \
-device '{"driver":"pcie-root-port","port":24,"chassis":9,"id":"pci.9","bus":"pcie.0","multifunction":true,"addr":"0x3"}' \
-device '{"driver":"pcie-root-port","port":25,"chassis":10,"id":"pci.10","bus":"pcie.0","addr":"0x3.0x1"}' \
-device '{"driver":"pcie-root-port","port":26,"chassis":11,"id":"pci.11","bus":"pcie.0","addr":"0x3.0x2"}' \
-device '{"driver":"pcie-root-port","port":27,"chassis":12,"id":"pci.12","bus":"pcie.0","addr":"0x3.0x3"}' \
-device '{"driver":"pcie-root-port","port":28,"chassis":13,"id":"pci.13","bus":"pcie.0","addr":"0x3.0x4"}' \
-device '{"driver":"pcie-root-port","port":29,"chassis":14,"id":"pci.14","bus":"pcie.0","addr":"0x3.0x5"}' \
-device '{"driver":"qemu-xhci","p2":15,"p3":15,"id":"usb","bus":"pci.2","addr":"0x0"}' \
-device '{"driver":"virtio-scsi-pci","id":"scsi0","bus":"pci.3","addr":"0x0"}' \
-device '{"driver":"virtio-serial-pci","id":"virtio-serial0","bus":"pci.4","addr":"0x0"}' \
-blockdev '{"driver":"file","filename":"/mnt/xiaohli/rhel870-64-virtio-scsi.qcow2","aio":"threads","node-name":"libvirt-1-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"}' \
-blockdev '{"node-name":"libvirt-1-format","read-only":false,"cache":{"direct":true,"no-flush":false},"driver":"qcow2","file":"libvirt-1-storage","backing":null}' \
-device '{"driver":"scsi-hd","bus":"scsi0.0","channel":0,"scsi-id":0,"lun":0,"device_id":"drive-scsi0-0-0-0","drive":"libvirt-1-format","id":"scsi0-0-0-0","bootindex":1,"write-cache":"on"}' \
-netdev tap,fd=24,vhost=on,vhostfd=26,id=hostnet0 \
-device '{"driver":"virtio-net-pci","netdev":"hostnet0","id":"net0","mac":"f4:8e:38:c3:83:12","bus":"pci.1","addr":"0x0"}' \
-chardev pty,id=charserial0 \
-device '{"driver":"isa-serial","chardev":"charserial0","id":"serial0","index":0}' \
-chardev socket,id=charchannel0,fd=22,server=on,wait=off \
-device '{"driver":"virtserialport","bus":"virtio-serial0.0","nr":1,"chardev":"charchannel0","id":"channel0","name":"org.qemu.guest_agent.0"}' \
-device '{"driver":"usb-tablet","id":"input0","bus":"usb.0","port":"1"}' \
-audiodev '{"id":"audio1","driver":"none"}' \
-vnc 0.0.0.0:0,audiodev=audio1 \
-device '{"driver":"VGA","id":"video0","vgamem_mb":16,"bus":"pcie.0","addr":"0x1"}' \
-device '{"driver":"virtio-balloon-pci","id":"balloon0","bus":"pci.5","addr":"0x0"}' \
-object '{"qom-type":"rng-random","id":"objrng0","filename":"/dev/urandom"}' \
-device '{"driver":"virtio-rng-pci","rng":"objrng0","id":"rng0","bus":"pci.6","addr":"0x0"}' \
-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \
-msg timestamp=on

Comment 4 liunana 2022-09-08 03:33:42 UTC
(In reply to Li Xiaohui from comment #2)
> Test on qemu-kvm-7.0.0-12.el9.x86_64 && libvirt-8.5.0-5.el9.x86_64, found:
> 1. reproduce this bug with rhel 8.7 (q35+seabios) guest, can't reproduce
> with rhel 9.1 (q35+ovmf) guest;
> 2. reproduce through libvirt with dirty-ring and cputune configured, not
> specific to hugepage and hotplugged vcpus;
> 3. can't reproduce on qemu side with similar qemu commands that libvirt
> parses
> **************************************************
>   <vcpu placement='static' current='8'>16</vcpu>
>   <cputune>
>     <shares>2048</shares>
>     <period>1000000</period>
>     <quota>3000</quota>
>     <global_period>1000000</global_period>
>     <global_quota>4000</global_quota>
>     <emulator_period>1000000</emulator_period>
>     <emulator_quota>5000</emulator_quota>
>     <vcpusched vcpus='0' scheduler='batch'/>
>     <vcpusched vcpus='1' scheduler='batch'/>
>     <vcpusched vcpus='2' scheduler='batch'/>
>     <vcpusched vcpus='3' scheduler='batch'/>
>     <vcpusched vcpus='4' scheduler='batch'/>
>     <vcpusched vcpus='5' scheduler='batch'/>
>     <vcpusched vcpus='6' scheduler='batch'/>
>     <vcpusched vcpus='7' scheduler='batch'/>
>     <vcpusched vcpus='8' scheduler='batch'/>
>     <vcpusched vcpus='9' scheduler='batch'/>
>     <vcpusched vcpus='10' scheduler='batch'/>
>     <vcpusched vcpus='11' scheduler='batch'/>
>     <vcpusched vcpus='12' scheduler='batch'/>
>     <vcpusched vcpus='13' scheduler='batch'/>
>     <vcpusched vcpus='14' scheduler='batch'/>
>     <vcpusched vcpus='15' scheduler='batch'/>
>   </cputune>
>   <features>
>     <acpi/>
>     <apic/>
>     <kvm>
>       <poll-control state='on'/>
>       <pv-ipi state='off'/>
>       <dirty-ring state='on' size='4096'/>
>     </kvm>
>   </features>
> *************************************************
> 
> 
> I found that with or without cputune, the qemu command lines that libvirt
> parses are same. So I think we need to know the cputune now.
> 
> Nana, do you know the cputune?

Hi,

The optional cputune element provides details regarding the CPU tunable parameters for the domain.
refer to: https://libvirt.org/formatdomain.html#cpu-tuning

QEMU doesn't have such a parameter on its command line; we can use the tools 'taskset -pc 0-7 $qemu-pid' and 'chrt -p -b 0 $qemu-pid' separately to do the same work (a minimal sketch follows below).
We can try this first.
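(A minimal sketch, assuming a hypothetical QEMU PID of 38568:
# taskset -pc 0-7 38568   # pin the QEMU process to host CPUs 0-7
# chrt -p -b 0 38568      # switch it to the SCHED_BATCH scheduling policy
)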

There are also some other cgroup configurations that we can't set directly on the qemu command line; if needed, I can dig further to check whether they can be set with external tools.

Add lhuang to cc list in case I miss something here.
Please correct me if I'm wrong.


Best regards
Nana Liu

> And please help needinfo the relevant dev or
> libvirt qe if you don't know, thanks in advance.

Comment 5 Li Xiaohui 2022-09-08 04:23:02 UTC
Hi Yiqian,
Please help provide the corresponding qemu command lines or shell commands for:
 <cputune>
    <shares>2048</shares>
    <period>1000000</period>
    <quota>3000</quota>
    <global_period>1000000</global_period>
    <global_quota>4000</global_quota>
    <emulator_period>1000000</emulator_period>
    <emulator_quota>5000</emulator_quota>
    ...
  </cputune>

Thanks.

Comment 7 Li Xiaohui 2022-09-22 13:30:26 UTC
Guest fails to start, with a call trace, if cgroup limits are configured:
1. Boot the guest with the CPU options below; other qemu options are the same as in Comment 3:
-smp 16,sockets=16,cores=1,threads=1 \
2. Execute the commands below on the host:
# taskset -pc 0-7 $qemu-pid
# chrt -p -b 0 $qemu-pid
3. Configure cgroup v2:
# cd /sys/fs/cgroup
# mkdir blue
# echo 5000 > /sys/fs/cgroup/blue/cpu.max
# echo $qemu-pid > /sys/fs/cgroup/blue/cgroup.procs
4. Continue the guest through HMP:
(qemu) cont


Actual result:
After step 4 and a long wait (roughly an hour, give or take), the guest tries to start but finally hits a core dump and fails to start. I can't get the dump info.


Yiqian, can you help check whether the above configuration is right?
If it is, we shall go on to confirm whether this issue is the same as this bug.

Comment 8 Li Xiaohui 2022-09-22 13:32:00 UTC
Sorry, correcting one command in Comment 7, Step 2:
# taskset -pc 0-15 $qemu-pid

Comment 9 Yiqian Wei 2022-09-23 02:32:23 UTC
(In reply to Li Xiaohui from comment #7)
> Guest fails to start with some call trace if configure cgroup:
> 1.Boot guest with below CPU commands, other qemu commands same as Comment 3:
> -smp 16,sockets=16,cores=1,threads=1 \
> 2.Execute below commands on host 
> # taskset -pc 0-7 $qemu-pid
> # chrt -p -b 0 $qemu-pid
> 3.Config cgroupV2:
> # cd /sys/fs/cgroup
> # mkdir blue
> # echo 5000 > /sys/fs/cgroup/blue/cpu.max
> # echo $qemu-pid > /sys/fs/cgroup/blue/cgroup.procs
> 4.Cont guest through hmp:
> (qemu) cont
> 
> 
> Actual result:
> After step 4, wait long times(may be one hour left and right), guest try to
> start, but finally hit core dump, fail to start. I can't get the dump info.
> 
> 
> Yiqian, can you help check if the above configures are right? 

The above configuration is right.

> If right, we shall go on to confirm whether this issue is the same as this
> bug.

Comment 10 Li Xiaohui 2022-09-23 09:00:59 UTC
Hi all, 
If I change the quota to be equal to the period value, the guest starts successfully:
  <cputune>
    <shares>2048</shares>
    <period>1000000</period>
    <quota>1000000</quota>
    <global_period>1000000</global_period>
    <global_quota>1000000</global_quota>
    <emulator_period>1000000</emulator_period>
    <emulator_quota>1000000</emulator_quota>
  .....
  </cputune>

Comment 11 yisun 2022-09-23 09:07:44 UTC
As https://docs.kernel.org/admin-guide/cgroup-v2.html says, cpu.max has the following meaning:

cpu.max
A read-write two value file which exists on non-root cgroups. The default is “max 100000”.
The maximum bandwidth limit. It’s in the following format:
$MAX $PERIOD
which indicates that the group may consume upto $MAX in each $PERIOD duration. “max” for $MAX indicates no limit. If only one number is written, $MAX is updated.

So in the previous test we used:
<cputune>
    <shares>2048</shares>
    <period>1000000</period>
    <quota>3000</quota>
    <global_period>1000000</global_period>
    <global_quota>4000</global_quota>
    <emulator_period>1000000</emulator_period>
    <emulator_quota>5000</emulator_quota>
...

For example, the global_period/global_quota is 1000000/4000, which means the VM process can use only 4000/1000000 = 0.4% of the CPU capacity. This may be why the VM process was killed.
If the crash is due to this, this may not be an issue.
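(For illustration, a minimal cgroup v2 sketch of that bandwidth math; the cgroup name and PID here are hypothetical:
# mkdir /sys/fs/cgroup/blue
# echo "4000 1000000" > /sys/fs/cgroup/blue/cpu.max   # $MAX=4000, $PERIOD=1000000 -> 0.4% CPU
# echo 38568 > /sys/fs/cgroup/blue/cgroup.procs       # throttle the QEMU process
Note that writing a single number, as in Comment 7's "echo 5000 > cpu.max", updates only $MAX and leaves $PERIOD unchanged.)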

Comment 12 Li Xiaohui 2022-09-23 09:21:37 UTC
Hi Pavel, Peter, can you help check whether we can close this bug as not a bug, per Comment 10 and Comment 11?

Comment 13 Li Xiaohui 2022-09-26 07:46:08 UTC
As there has been no response so far, I would recommend closing this as not a bug, per Comment 11.


Please contact me if you have any questions, or reopen the bug if anyone thinks it's worth fixing. Thanks.

Comment 14 Peter Xu 2022-09-27 15:09:53 UTC
Xiaohui,

Sorry, I missed the message and only noticed it again after the 3-day reminder.

Regardless of the specific cgroup configuration, I think you're right that QEMU shouldn't crash. In this case, the cgroup limits probably make the threads run in a particular order that triggers a crash of qemu we can otherwise hardly reproduce. Here in comment 0 we're reaping the dirty ring of a vcpu during its creation, and that's illegal.

I'll post a patch for this shortly. The bug can be re-opened, but with low priority, as long as customers cannot trigger it with any sane setup.

Comment 21 Li Xiaohui 2023-02-27 11:37:59 UTC
Hi Yan, do we have a libvirt test case that corresponds to this bug? If yes, please help add the Polarion link for it, thanks.

Comment 22 yafu 2023-02-28 03:45:37 UTC
(In reply to Li Xiaohui from comment #21)
> Hi Yan, do we have libvirt case that corresponds to this bug? If yes, please
> help add the polarion link of this case, thanks

Hi Xiaohui,

Sorry, there are no test cases in Polarion for this bug.

Comment 23 Li Xiaohui 2023-04-04 08:31:51 UTC
Hi Peter, will we fix this bug in RHEL 9.3.0? From your Comment 15, it seems upstream has fixed this issue.

If we're targeting 9.3.0, please help set the ITR (and also the DTM if you know the fix plan). Thanks in advance

Comment 24 Peter Xu 2023-04-04 12:29:39 UTC
Hi, Xiaohui,

(In reply to Li Xiaohui from comment #23)
> Hi Peter, will we fix this bug on RHEL 9.3.0 since from your Comment 15,
> seems upstream has fixed this issue?

Unfortunately upstream hasn't yet merged the fix.

Let me needinfo Paolo for that.

> 
> If we're targeting 9.3.0, please help set the ITR (also DTM if you know the
> fix plan). Thanks  in advance

I'll update the entries once the upstream plan consolidates.

Thanks,
Peter

Comment 26 Li Xiaohui 2023-04-05 03:34:21 UTC
Thank you all for the update

Comment 27 Peter Xu 2023-04-05 14:23:16 UTC
The patch is merged and will be in 8.0-rc3.

56adee407f kvm: dirty-ring: Fix race with vcpu creation

We'll get the fix automatically in 2-3 weeks.

Xiaohui, I don't know how to mark this bug, but I think it should be TestOnly after our upcoming c9s/rhel9.3 rebase to upstream qemu 8.0.0. Could you help update the corresponding fields?

I do know how to set the ITR, so I did. Thanks.

Comment 28 Li Xiaohui 2023-04-06 06:20:10 UTC
Hi Peter,

(In reply to Peter Xu from comment #27)
> Patch merged and will be in 8.0-rc3.
> 
> 56adee407f kvm: dirty-ring: Fix race with vcpu creation
> 
> We'll get the fix in 2-3 weeks automatically 
> 
> Xiaohui, I don't know how to mark this bug, but I think it should be
> TestOnly after our upcoming c9s/rhel9.3 rebase to upstream qemu 8.0.0. 
> Could you help to update corresponding fields?

I have added the qemu 8.0.0 rebase bug 2180898 to the Depends On field.

Based on Bug 2180898, I would set DTM 10 and ITM 12. Feel free to correct them if they're wrong.

> 
> I know how to set ITR so I did.  Thanks.

I have marked qa_ack+; can you help set devel_ack, please?

Comment 29 Li Xiaohui 2023-04-06 06:38:21 UTC
Sorry, I forgot that this bug is hard to reproduce through qemu alone, though libvirt reproduces it easily.

I found the libvirt rebase bug 2175785 for RHEL 9.3.0, but its DTM is 20, which is too late for QE to test.
So I would update the ITM from 12 to 16. In the meantime, I would do basic dirty-ring tests via qemu
and try to find out whether there are any regression issues.
We can also mark this bug verified if the fix really solves it when testing through qemu, or once the libvirt rebase build comes out.

Comment 30 Peter Xu 2023-04-06 13:03:58 UTC
Xiaohui,

(In reply to Li Xiaohui from comment #29)
> So I would update ITM from 12 to 16. During these time, would do the basic
> tests about dirty-ring via qemu 

Since there'll be no patch to backport, please feel free to set them with your best judgement.

(In reply to Li Xiaohui from comment #28)
> I have marked qa_ack+, can you help set devel_ack please?

Done.  Thanks!

Comment 31 Yanan Fu 2023-04-24 10:34:05 UTC
QE bot (pre-verify): Set 'Verified:Tested,SanityOnly' as gating/tier1 tests pass.

Comment 34 Li Xiaohui 2023-04-28 11:37:38 UTC
Hi Peter, I can now boot the guest through libvirt with the same configuration as in Comment 2, which previously reproduced this bug.
The libvirt and qemu versions are libvirt-9.2.0-1.el9.x86_64 and qemu-kvm-8.0.0-1.el9.x86_64.

[root@hp-dl385g10-14 home]# virsh start rhel880 
Domain 'rhel880' started
[root@hp-dl385g10-14 home]# virsh list --all
 Id   Name      State
-------------------------
 1    rhel880   running


The guest crashed during startup, though. I think it's an expected result, right?

Comment 36 Peter Xu 2023-04-28 12:55:38 UTC
(In reply to Li Xiaohui from comment #34)
> The guest would crash when start up. I think it's an expected result, right?

Not really. The fix should prevent QEMU from crashing; it should have no impact on the guest OS. If QEMU would have crashed before and now it doesn't, then I assume the current issue (at least as we understood it) is fixed, but something else might be wrong.

Maybe there's some specific config that fails the VM boot after our 8.0 rebase? I'd start with the simplest VM config with which the guest can still boot (using the same VM image, just to make sure the image is fine), then grow the config until the guest crash hits.

Comment 37 Li Xiaohui 2023-06-19 10:45:17 UTC
Per Comment 34 and Comment 36, we can mark this bug verified, as the product issue from the Description has been fixed.

I tried again on the latest qemu-kvm-8.0.0-5.el9.x86_64; the guest hangs in the boot stage if I boot the VM with the libvirt configuration below:
  <cputune>
    <shares>2048</shares>
    <period>1000000</period>
    <quota>3000</quota>
    <global_period>1000000</global_period>
    <global_quota>4000</global_quota>
    <emulator_period>1000000</emulator_period>
    <emulator_quota>5000</emulator_quota>
    <vcpusched vcpus='0' scheduler='batch'/>
    <vcpusched vcpus='1' scheduler='batch'/>
    <vcpusched vcpus='2' scheduler='batch'/>
    <vcpusched vcpus='3' scheduler='batch'/>
    <vcpusched vcpus='4' scheduler='batch'/>
    <vcpusched vcpus='5' scheduler='batch'/>
    <vcpusched vcpus='6' scheduler='batch'/>
    <vcpusched vcpus='7' scheduler='batch'/>
  </cputune>

Note: the qemu command lines are identical with or without the above cputune configuration.

I will file a new bug if needed.

Comment 38 Li Xiaohui 2023-06-25 07:13:54 UTC
Hi Yan, Yi
Per Comment 37, we still can't boot the VM successfully on RHEL 9.3.0 with the above cputune configured.

Can you help confirm whether it's a bug, and file one if needed?
I am not familiar with cputune, and the issue is not easy to reproduce through qemu.

Comment 39 yafu 2023-06-25 09:40:29 UTC
(In reply to Li Xiaohui from comment #38)
> Hi Yan, Yi
> Per Comment 37, we still can't boot VM successfully on RHEL 9.3.0 with above
> cputune configured.
> 
> Can you help confirm if it's a bug and file one if needed? 
> I am not familiar with cputune and the issue is not easy to reproduce
> through qemu.

Hi Xiaohui,

I cannot reproduce the issue. The guest boots successfully with:
host kernel: kernel-5.14.0-325.el9.x86_64
guest kernel: kernel-5.14.0-331.el9.x86_64
libvirt-9.3.0-2.el9.x86_64
qemu-kvm-8.0.0-5.el9.x86_64

Comment 40 yalzhang@redhat.com 2023-07-17 03:26:30 UTC
Confirmed with Luyao: the behavior in comment 37 is acceptable. The quota-related parameters specify the maximum allowed vCPU bandwidth; refer to https://libvirt.org/formatdomain.html#cpu-tuning. We have to be very careful when setting these parameters, since low quota settings can make the guest extremely slow, which looks like a hang.
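(Applying the math from Comment 11 to the Comment 37 configuration: quota/period = 3000/1000000 = 0.3% of a CPU per vCPU, and global_quota/global_period = 4000/1000000 = 0.4% for the whole VM, so a boot that appears hung is really just an extremely throttled one.)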

Comment 42 errata-xmlrpc 2023-11-07 08:26:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: qemu-kvm security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:6368

