Bug 1607768

Summary: qemu aborts when starting a guest with a large number of iothreads
Product: Red Hat Enterprise Linux 7 Reporter: yafu <yafu>
Component: qemu-kvm-rhev Assignee: Stefan Hajnoczi <stefanha>
Status: CLOSED ERRATA QA Contact: Xueqiang Wei <xuwei>
Severity: medium Docs Contact:
Priority: medium    
Version: 7.6 CC: aliang, chayang, coli, ddepaula, juzhang, mrezanin, ngu, stefanha, virt-maint, yafu
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: qemu-kvm-rhev-2.12.0-20.el7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1687541 Environment:
Last Closed: 2019-08-22 09:18:48 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1649160, 1687541    
Attachments:
Description Flags
coredump file none

Description yafu 2018-07-24 08:47:09 UTC
Description of problem:
qemu aborts when a guest is started with a large number of iothreads.

Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.12.0-7.el7.x86_64
libvirt-4.5.0-3.el7.x86_64

How reproducible:


Steps to Reproduce:
1. Edit the guest XML to set a large number of iothreads:
# virsh edit iommu1
<iothreads>10000</iothreads>

2. Start the guest:
# virsh start iommu1
error: Failed to start domain iommu1
error: internal error: qemu unexpectedly closed the monitor: qemu-kvm: util/qemu-thread-posix.c:131: qemu_cond_destroy: Assertion `cond->initialized' failed.

3. Check the backtrace of the qemu process:
(gdb) bt
#0  0x00007f9004f95207 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:55
#1  0x00007f9004f968f8 in __GI_abort () at abort.c:90
#2  0x00007f9004f8e026 in __assert_fail_base (fmt=0x7f90050e8ea0 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x5608a7020480 "cond->initialized", file=file@entry=0x5608a7020454 "util/qemu-thread-posix.c", line=line@entry=131, function=function@entry=0x5608a70206d0 <__PRETTY_FUNCTION__.18631> "qemu_cond_destroy") at assert.c:92
#3  0x00007f9004f8e0d2 in __GI___assert_fail (assertion=assertion@entry=0x5608a7020480 "cond->initialized", file=file@entry=0x5608a7020454 "util/qemu-thread-posix.c", line=line@entry=131, function=function@entry=0x5608a70206d0 <__PRETTY_FUNCTION__.18631> "qemu_cond_destroy") at assert.c:101
#4  0x00005608a6e8d26b in qemu_cond_destroy (cond=cond@entry=0x5608a8f50168) at util/qemu-thread-posix.c:131
#5  0x00005608a6c5f163 in iothread_instance_finalize (obj=<optimized out>) at iothread.c:138
#6  0x00005608a6dae852 in object_unref (type=<optimized out>, obj=0x5608a8f500e0) at qom/object.c:462
#7  0x00005608a6dae852 in object_unref (data=0x5608a8f500e0) at qom/object.c:476
#8  0x00005608a6dae852 in object_unref (obj=obj@entry=0x5608a8f500e0) at qom/object.c:924
#9  0x00005608a6db182d in user_creatable_add_type (type=type@entry=0x5608a8bcc3c0 "iothread", id=id@entry=0x5608a8bcc3a0 "iothread505", qdict=qdict@entry=0x5608a8df6000, v=v@entry=
    0x5608a8de22d0, errp=errp@entry=0x7ffe05197230) at qom/object_interfaces.c:107
#10 0x00005608a6db1a76 in user_creatable_add_opts (opts=opts@entry=0x5608a8bc94f0, errp=errp@entry=0x7ffe05197230) at qom/object_interfaces.c:137
#11 0x00005608a6db1bf8 in user_creatable_add_opts_foreach (opaque=0x5608a6c641a0 <object_create_initial>, opts=0x5608a8bc94f0, errp=<optimized out>) at qom/object_interfaces.c:161
#12 0x00005608a6e9a73a in qemu_opts_foreach (list=<optimized out>, func=0x5608a6db1bb0 <user_creatable_add_opts_foreach>, opaque=opaque@entry=0x5608a6c641a0 <object_create_initial>, errp=errp@entry=0x0)
    at util/qemu-option.c:1104
#13 0x00005608a6b315aa in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4392


Actual results:
qemu aborts when the guest is started with a large number of iothreads.

Expected results:
qemu should not abort when a guest is started with a large number of iothreads.

Additional info:

Comment 3 aihua liang 2018-07-27 07:18:16 UTC
Hi, yafu

  I tested with 1000 iothreads and did not hit your issue.
  kernel version: 3.10.0-918.el7.x86_64
  qemu-kvm-rhev version: qemu-kvm-rhev-2.12.0-8.el7.x86_64

  Please confirm the following:
   1. What is your test scenario: iommu + iothreads, or only a large number of iothreads?
   2. If your test scenario includes iommu, did you configure iommu on both the host and the guest cmdline?

Comment 4 yafu 2018-07-30 01:12:16 UTC
(In reply to aihua liang from comment #3)

Iommu is not enabled. The issue can be reproduced with 10000 iothreads.

Comment 5 aihua liang 2018-08-02 03:46:00 UTC
Yes, I can reproduce it with 10000 iothreads.
 
Test env:
  kernel version: 3.10.0-926.el7.x86_64
  qemu-kvm-rhev version:qemu-kvm-rhev-2.12.0-8.el7.x86_64

Test steps:
 1. Start a rhel7.6 guest with 10000 iothreads, using the command line below:
    /usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm1'  \
    -sandbox off  \
    -machine pc  \
    -nodefaults \
    -device VGA,bus=pci.0,addr=0x2  \
    -chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpmonitor1-20180730-052157-8R2ZuIGd,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control  \
    -chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/monitor-catch_monitor-20180730-052157-8R2ZuIGd,server,nowait \
    -mon chardev=qmp_id_catch_monitor,mode=control \
    -device pvpanic,ioport=0x505,id=idPX2yVp  \
    -chardev socket,id=serial_id_serial0,path=/var/tmp/serial-serial0-20180730-052157-8R2ZuIGd,server,nowait \
    -device isa-serial,chardev=serial_id_serial0  \
    -chardev socket,id=seabioslog_id_20180730-052157-8R2ZuIGd,path=/var/tmp/seabios-20180730-052157-8R2ZuIGd,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20180730-052157-8R2ZuIGd,iobase=0x402 \
    -device nec-usb-xhci,id=usb1,bus=pci.0,addr=0x3 \
    -drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=none,format=qcow2,file=/home/rhel76-64-virtio.qcow2 \
    -device virtio-blk-pci,id=image1,drive=drive_image1,bus=pci.0 \
    -device virtio-net-pci,mac=9a:be:bf:c0:c1:c2,id=idZKQtxs,vectors=4,netdev=idvsIEDa,bus=pci.0  \
    -netdev tap,id=idvsIEDa,vhost=on \
    -m 4096  \
    -smp 2,maxcpus=2,cores=1,threads=1,sockets=2  \
    -cpu 'Penryn' \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1  \
    -vnc :0  \
    -rtc base=utc,clock=host,driftfix=slew  \
    -boot menu=on,strict=off,order=cdn  \
    -no-shutdown \
    -enable-kvm \
    -monitor stdio \
    -object iothread,id=iothread0 \
    -object iothread,id=iothread1 \
    -object iothread,id=iothread2 \
    -object iothread,id=iothread3 \
    -object iothread,id=iothread4 \
    -object iothread,id=iothread5 \
    ....
    -object iothread,id=iothread9999 \
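
The elided -object options above do not need to be typed by hand. A minimal sketch for generating them, assuming a POSIX shell; the function name gen_iothread_args is illustrative and is not taken from the original test script:

```shell
#!/bin/sh
# Hypothetical helper: print N "-object iothread,id=iothreadK" options,
# one per line, for K = 0 .. N-1.
gen_iothread_args() {
    count="$1"
    i=0
    while [ "$i" -lt "$count" ]; do
        printf '%s\n' "-object iothread,id=iothread$i"
        i=$((i + 1))
    done
}

# Example: show the first and last of 10000 generated options.
gen_iothread_args 10000 | sed -n '1p;$p'
```

The output can be spliced into the command line with an unquoted $(gen_iothread_args 10000), which relies on the shell splitting the output into separate words.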

Test result:
 QEMU core dumps:
   [root@intel-3323-24-2 home]# sh ta.txt
Failed to create epoll instance: Too many open filesqemu-kvm: util/qemu-thread-posix.c:131: qemu_cond_destroy: Assertion `cond->initialized' failed.
ta.txt: line 10031: 10845 Aborted                 (core dumped) /usr/libexec/qemu-kvm -name 'avocado-vt-vm1' -sandbox off ....


The coredump file is attached.

Comment 6 aihua liang 2018-08-02 03:46:52 UTC
Created attachment 1472254 [details]
coredump file

Comment 7 aihua liang 2018-08-02 06:07:01 UTC
The minimum number of iothreads that causes qemu to core dump is 4085.

Starting a guest with 4085 or more iothreads makes qemu core dump.
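
The "Too many open files" message in comment 5 points at the per-process file descriptor limit, so the threshold is likely to depend on that limit on the test host. A minimal sketch for comparing the limit with the number of descriptors a process has open, assuming Linux with /proc mounted; the function name count_open_fds is illustrative:

```shell
#!/bin/sh
# Hypothetical helper: count the file descriptors currently open in a
# process by listing /proc/<pid>/fd (Linux-specific).
count_open_fds() {
    ls "/proc/$1/fd" | wc -l
}

# Soft limit on open files for the current shell; child processes such as
# qemu-kvm inherit it.
echo "soft nofile limit: $(ulimit -n)"

# Descriptors open in this shell itself, as a demonstration.
echo "open fds in shell $$: $(count_open_fds $$)"
```

Passing the PID of a running qemu-kvm process instead of $$ shows how close that process is to the limit.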

Comment 8 Stefan Hajnoczi 2018-11-28 15:16:57 UTC
I will backport the following fix:

commit 14a2d11825ddc37d6547a80704ae6450e9e376c7
Author: Marc-André Lureau <marcandre.lureau>
Date:   Tue Aug 21 12:07:16 2018 +0200

    iothread: fix crash with invalid properties

Comment 10 Miroslav Rezanina 2018-12-06 12:39:27 UTC
Fix included in qemu-kvm-rhev-2.12.0-20.el7

Comment 12 Xueqiang Wei 2018-12-10 03:07:38 UTC
Tested on qemu-kvm-rhev-2.12.0-20.el7 and did not hit this issue, so setting the bug status to VERIFIED.


Versions:
Host:
kernel-3.10.0-957.el7.x86_64
qemu-kvm-rhev-2.12.0-20.el7

Guest:
kernel-3.10.0-957.el7.x86_64



# ulimit -n 102400

Started the guest with 10000 iothreads; it works well.

# sh bug_1607768.sh 
QEMU 2.12.0 monitor - type 'help' for more information
(qemu) c
(qemu) system_reset      
(qemu) system_powerdown
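
The ulimit step above changes the limit for the whole login shell. A minimal sketch that scopes the change to a single run instead, assuming a POSIX shell; the function name run_with_nofile is illustrative:

```shell
#!/bin/sh
# Hypothetical helper: run a command with a different open-file limit.
# The subshell keeps the change from leaking into the calling shell.
run_with_nofile() {
    limit="$1"
    shift
    (
        ulimit -n "$limit" || exit 1
        "$@"
    )
}

# Example: the child sees the limit set for it; the caller's limit is unchanged.
run_with_nofile 64 sh -c 'echo "child limit: $(ulimit -n)"'
echo "caller limit: $(ulimit -n)"
```

For the verification run this would be run_with_nofile 102400 sh bug_1607768.sh. Raising the limit above the current hard limit requires root.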

Comment 13 Stefan Hajnoczi 2019-01-02 15:38:05 UTC
*** Bug 1622963 has been marked as a duplicate of this bug. ***

Comment 17 Danilo de Paula 2019-03-12 16:34:33 UTC
* my last comment is related to the slow train.
There will be a fast train update for 8.0.1. The branch is ready anyway.

Comment 19 errata-xmlrpc 2019-08-22 09:18:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:2553