Bug 1525868

Summary: Guest hit core dump with both IO throttling and data plane
Product: Red Hat Enterprise Linux 7 Reporter: Yongxue Hong <yhong>
Component: qemu-kvm-rhev Assignee: Stefan Hajnoczi <stefanha>
Status: CLOSED ERRATA QA Contact: Gu Nini <ngu>
Severity: high Docs Contact:
Priority: high    
Version: 7.5 CC: aliang, chayang, coli, juzhang, knoel, lmiksik, michen, mrezanin, ngu, qzhang, virt-maint, yilzhang
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: qemu-kvm-rhev-2.10.0-16.el7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-04-11 00:55:27 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Yongxue Hong 2017-12-14 09:39:03 UTC
Description of problem:
QEMU hits a segmentation fault and dumps core when a guest is booted with both IO throttling and data plane (iothread) enabled.

Version-Release number of selected component (if applicable):
3.10.0-823.el7.x86_64
qemu-kvm-rhev-2.10.0-12.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot a guest with both IO throttling and data plane enabled,
e.g.:
/usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm1'  \
    -sandbox off  \
    -machine pc  \
    -nodefaults  \
    -vga std \
    -rtc base=utc,clock=host,driftfix=slew \
    -device virtio-serial-pci,id=virtio_serial_pci0,bus=pci.0,addr=03,disable-legacy=off,disable-modern=on \
    -chardev socket,id=console0,path=/tmp/console0,server,nowait \
    -device virtserialport,chardev=console0,name=console0,id=console0,bus=virtio_serial_pci0.0 \
    -chardev socket,id=serial0,path=/tmp/serial0,server,nowait \
    -device isa-serial,chardev=serial0,id=serial0 \
    -device nec-usb-xhci,id=usb1,multifunction=on,bus=pci.0,addr=11 \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=04,disable-legacy=off,disable-modern=on,iothread=iothread0 \
    -object iothread,id=iothread0 \
    -drive id=drive_image1,if=none,cache=none,format=qcow2,snapshot=off,file=/home/xianwang/jeos-25-64.qcow2,iops=100,bps=100 \
    -device scsi-hd,id=image1,drive=drive_image1,bus=virtio_scsi_pci0.0,bootindex=0 \
    -netdev tap,vhost=on,id=idlkwV8e,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown \
    -device virtio-net-pci,mac=9a:7b:7c:7d:7e:7f,id=idtlLxAk,vectors=4,netdev=idlkwV8e,bus=pci.0,addr=05,disable-legacy=off,disable-modern=on \
    -m 4G  \
    -smp 4  \
    -cpu SandyBridge \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=2  \
    -device usb-kbd,id=usb-kbd1,bus=usb1.0,port=3 \
    -device usb-mouse,id=usb-mouse1,bus=usb1.0,port=4 \
    -qmp tcp:0:6666,server,nowait \
    -vnc :9 \
    -rtc base=localtime,clock=vm,driftfix=slew  \
    -boot order=cdn,once=c,menu=off,strict=off \
    -monitor stdio \
    -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=06 \

Actual results:
The guest fails to boot; QEMU receives SIGSEGV and dumps core.

GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-107.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
<http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/libexec/qemu-kvm...Reading symbols from
/usr/lib/debug/usr/libexec/qemu-kvm.debug...done.
done.
(gdb) r
Starting program: /usr/libexec/qemu-kvm -name avocado-vt-vm1 -sandbox
off -machine pc -nodefaults -vga std -rtc
base=utc,clock=host,driftfix=slew -device
virtio-serial-pci,id=virtio_serial_pci0,bus=pci.0,addr=03,disable-legacy=off,disable-modern=on
-chardev socket,id=console0,path=/tmp/console0,server,nowait -device
virtserialport,chardev=console0,name=console0,id=console0,bus=virtio_serial_pci0.0
-chardev socket,id=serial0,path=/tmp/serial0,server,nowait -device
isa-serial,chardev=serial0,id=serial0 -device
nec-usb-xhci,id=usb1,multifunction=on,bus=pci.0,addr=11 -device
virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=04,disable-legacy=off,disable-modern=on,iothread=iothread0
-object iothread,id=iothread0 -drive
id=drive_image1,if=none,cache=none,format=qcow2,snapshot=off,file=/home/xianwang/jeos-25-64.qcow2,iops=100,bps=100
-device
scsi-hd,id=image1,drive=drive_image1,bus=virtio_scsi_pci0.0,bootindex=0
-netdev
tap,vhost=on,id=idlkwV8e,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown
-device
virtio-net-pci,mac=9a:7b:7c:7d:7e:7f,id=idtlLxAk,vectors=4,netdev=idlkwV8e,bus=pci.0,addr=05,disable-legacy=off,disable-modern=on
-m 4G -smp 4 -cpu SandyBridge -device
usb-tablet,id=usb-tablet1,bus=usb1.0,port=2 -device
usb-kbd,id=usb-kbd1,bus=usb1.0,port=3 -device
usb-mouse,id=usb-mouse1,bus=usb1.0,port=4 -qmp tcp:0:6666,server,nowait
-vnc :9 -rtc base=localtime,clock=vm,driftfix=slew -boot
order=cdn,once=c,menu=off,strict=off -monitor stdio -device
virtio-balloon-pci,id=balloon0,bus=pci.0,addr=06
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0x7fffe67f2700 (LWP 12985)]
[New Thread 0x7fffe5ff1700 (LWP 12986)]
Detaching after fork from child process 12990.
[New Thread 0x7fffe57f0700 (LWP 12997)]
QEMU 2.10.0 monitor - type 'help' for more information
(qemu) [New Thread 0x7fffe49e9700 (LWP 13000)]
[New Thread 0x7fffe41e8700 (LWP 13001)]
[New Thread 0x7fffe39e7700 (LWP 13002)]
[New Thread 0x7fffe31e6700 (LWP 13004)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffe5ff1700 (LWP 12986)]
throttle_group_restart_queue_entry (opaque=0x7fffffffd5e0) at block/throttle-groups.c:376
376        ThrottleState *ts = tgm->throttle_state;
Missing separate debuginfos, use: debuginfo-install
boost-system-1.53.0-27.el7.x86_64 boost-thread-1.53.0-27.el7.x86_64
bzip2-libs-1.0.6-13.el7.x86_64 celt051-0.5.1.3-8.el7.x86_64
cyrus-sasl-lib-2.1.26-22.el7.x86_64 elfutils-libelf-0.170-2.el7.x86_64
elfutils-libs-0.170-2.el7.x86_64 glib2-2.54.2-1.el7.x86_64
glibc-2.17-217.el7.x86_64 glusterfs-api-3.8.4-52.el7.x86_64
glusterfs-libs-3.8.4-52.el7.x86_64 gmp-6.0.0-15.el7.x86_64
gnutls-3.3.26-9.el7.x86_64 gperftools-libs-2.6.1-1.el7.x86_64
keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-17.el7.x86_64
libacl-2.2.51-14.el7.x86_64 libaio-0.3.109-13.el7.x86_64
libattr-2.4.46-13.el7.x86_64 libblkid-2.23.2-46.el7.x86_64
libcacard-2.5.2-2.el7.x86_64 libcap-2.22-9.el7.x86_64
libcom_err-1.42.9-10.el7.x86_64 libcurl-7.29.0-45.el7.x86_64
libffi-3.0.13-18.el7.x86_64 libgcc-4.8.5-22.el7.x86_64
libgcrypt-1.5.3-14.el7.x86_64 libgpg-error-1.12-3.el7.x86_64
libibverbs-15-1.el7.x86_64 libidn-1.28-4.el7.x86_64
libiscsi-1.9.0-7.el7.x86_64 libjpeg-turbo-1.2.90-5.el7.x86_64
libmount-2.23.2-46.el7.x86_64 libnl3-3.2.28-4.el7.x86_64
libpng-1.5.13-7.el7_2.x86_64 librados2-0.94.5-2.el7.x86_64
librbd1-0.94.5-2.el7.x86_64 librdmacm-15-1.el7.x86_64
libseccomp-2.3.1-3.el7.x86_64 libselinux-2.5-12.el7.x86_64
libssh2-1.4.3-10.el7_2.1.x86_64 libstdc++-4.8.5-22.el7.x86_64
libtasn1-4.10-1.el7.x86_64 libusbx-1.0.21-1.el7.x86_64
libuuid-2.23.2-46.el7.x86_64 lz4-1.7.5-2.el7.x86_64
lzo-2.06-8.el7.x86_64 nettle-2.7.1-8.el7.x86_64 nspr-4.17.0-1.el7.x86_64
nss-3.34.0-0.1.beta1.el7.x86_64
nss-softokn-freebl-3.34.0-0.2.beta1.el7.x86_64
nss-util-3.34.0-0.1.beta1.el7.x86_64 numactl-libs-2.0.9-7.el7.x86_64
openldap-2.4.44-9.el7.x86_64 openssl-libs-1.0.2k-8.el7.x86_64
opus-1.0.2-6.el7.x86_64 p11-kit-0.23.5-3.el7.x86_64
pcre-8.32-17.el7.x86_64 pixman-0.34.0-1.el7.x86_64
snappy-1.1.0-3.el7.x86_64 spice-server-0.14.0-2.el7.x86_64
systemd-libs-219-46.el7.x86_64 usbredir-0.7.1-2.el7.x86_64
xz-libs-5.2.2-1.el7.x86_64 zlib-1.2.7-17.el7.x86_64
(gdb) bt
#0  throttle_group_restart_queue_entry (opaque=0x7fffffffd5e0) at block/throttle-groups.c:376
#1  0x0000555555ad050a in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at util/coroutine-ucontext.c:79
#2  0x00007fffed89afa0 in ?? () from /lib64/libc.so.6
#3  0x00007fffffffce50 in ?? ()
#4  0x0000000000000000 in ?? ()
(gdb)

Expected results:
Guest boots up successfully.

Additional info:
The crash also reproduces on a POWER8 (ppc64le) host:

[root@ibm-p8-rhevm-14 home]# sh yhong.sh
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-107.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "ppc64le-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/libexec/qemu-kvm...Reading symbols from /usr/lib/debug/usr/libexec/qemu-kvm.debug...done.
done.
(gdb) r
Starting program: /usr/libexec/qemu-kvm -name avocado-vt-vm1 -sandbox off -nodefaults -machine pseries-rhel7.5.0 -vga std -uuid 8aeab7e2-f341-4f8c-80e8-59e2968d85c2 -device virtio-serial-pci,id=virtio_serial_pci0,bus=pci.0,addr=03 -device virtio-scsi-pci,id=scsi1,bus=pci.0,addr=0x4 -device spapr-vscsi,id=scsi2 -chardev socket,id=console0,path=/tmp/console0,server,nowait -device spapr-vty,chardev=console0 -device nec-usb-xhci,id=usb1,bus=pci.0,addr=05 -object iothread,id=iothread0 -drive file=/home/rhel75-ppc64le-virtio-scsi.qcow2,format=qcow2,if=none,cache=none,id=drive_blk1,werror=stop,rerror=stop,bps=512000,iops=100 -device virtio-blk-pci,drive=drive_blk1,id=blk-disk1,bootindex=0,bus=pci.0,addr=06,iothread=iothread0 -drive file=/home/r1.qcow2,format=qcow2,if=none,cache=none,id=drive_data1,werror=stop,rerror=stop -device virtio-blk-pci,drive=drive_data1,id=blk-data,bus=pci.0,addr=07 -device virtio-net-pci,mac=9a:7b:7c:7d:7e:72,id=id9HRc5V,vectors=4,netdev=idjlQN53,bus=pci.0,addr=10 -netdev tap,id=idjlQN53,vhost=off,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown -m 4G -smp 4 -device usb-kbd -device usb-mouse -qmp tcp:0:8881,server,nowait -vnc :1 -msg timestamp=on -rtc base=localtime,clock=vm,driftfix=slew -monitor stdio -boot order=cdn,once=c,menu=on,strict=off -enable-kvm
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0x3fffb5c3ead0 (LWP 54090)]
[New Thread 0x3fffb543ead0 (LWP 54091)]
Detaching after fork from child process 54095.
[New Thread 0x3fffb4b2ead0 (LWP 54107)]
QEMU 2.10.0 monitor - type 'help' for more information
(qemu) [New Thread 0x3fffb377ead0 (LWP 54165)]
[New Thread 0x3fffb2f5ead0 (LWP 54167)]
[New Thread 0x3fffb273ead0 (LWP 54170)]
[New Thread 0x3fffb1f1ead0 (LWP 54172)]
[New Thread 0x3ffeb01aead0 (LWP 54173)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x3fffb543ead0 (LWP 54091)]
0x0000000035fad9c4 in throttle_group_schedule_timer (tgm=0x371b2000, is_write=208) at block/throttle-groups.c:253
253        if (tg->any_timer_armed[is_write]) {
Missing separate debuginfos, use: debuginfo-install bzip2-libs-1.0.6-13.el7.ppc64le cyrus-sasl-gssapi-2.1.26-23.el7.ppc64le cyrus-sasl-lib-2.1.26-23.el7.ppc64le cyrus-sasl-md5-2.1.26-23.el7.ppc64le elfutils-libelf-0.170-3.el7.ppc64le elfutils-libs-0.170-3.el7.ppc64le glib2-2.54.2-2.el7.ppc64le glibc-2.17-220.el7.ppc64le gmp-6.0.0-15.el7.ppc64le gnutls-3.3.26-9.el7.ppc64le gperftools-libs-2.6.1-1.el7.ppc64le keyutils-libs-1.5.8-3.el7.ppc64le krb5-libs-1.15.1-18.el7.ppc64le libaio-0.3.109-13.el7.ppc64le libattr-2.4.46-13.el7.ppc64le libcap-2.22-9.el7.ppc64le libcom_err-1.42.9-11.el7.ppc64le libcurl-7.29.0-45.el7.ppc64le libdb-5.3.21-22.el7.ppc64le libfdt-1.4.3-1.el7.ppc64le libffi-3.0.13-18.el7.ppc64le libgcc-4.8.5-25.el7.ppc64le libgcrypt-1.5.3-14.el7.ppc64le libgpg-error-1.12-3.el7.ppc64le libibverbs-15-2.el7.ppc64le libidn-1.28-4.el7.ppc64le libiscsi-1.9.0-7.el7.ppc64le libnl3-3.2.28-4.el7.ppc64le libpng-1.5.13-7.el7_2.ppc64le librdmacm-15-2.el7.ppc64le libseccomp-2.3.1-3.el7.ppc64le libselinux-2.5-12.el7.ppc64le libssh2-1.4.3-10.el7_2.1.ppc64le libstdc++-4.8.5-25.el7.ppc64le libtasn1-4.10-1.el7.ppc64le libusbx-1.0.21-1.el7.ppc64le lzo-2.06-8.el7.ppc64le nettle-2.7.1-8.el7.ppc64le nspr-4.17.0-1.el7.ppc64le nss-3.34.0-1.el7.ppc64le nss-softokn-freebl-3.34.0-1.el7.ppc64le nss-util-3.34.0-1.el7.ppc64le numactl-libs-2.0.9-7.el7.ppc64le openldap-2.4.44-10.el7.ppc64le openssl-libs-1.0.2k-10.el7.ppc64le p11-kit-0.23.5-3.el7.ppc64le pcre-8.32-17.el7.ppc64le pixman-0.34.0-1.el7.ppc64le snappy-1.1.0-3.el7.ppc64le systemd-libs-219-49.el7.ppc64le xz-libs-5.2.2-1.el7.ppc64le zlib-1.2.7-17.el7.ppc64le
(gdb) bt
#0  0x0000000035fad9c4 in throttle_group_schedule_timer (tgm=0x371b2000, is_write=208) at block/throttle-groups.c:253
#1  0x0000000035fadd38 in schedule_next_request (tgm=0x370a0540, is_write=208) at block/throttle-groups.c:307
#2  0x0000000035fadeb0 in throttle_group_restart_queue_entry (opaque=<optimized out>) at block/throttle-groups.c:387
#3  0x000000003606d7a8 in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at util/coroutine-ucontext.c:79
#4  0x00003fffb702367c in makecontext () from /lib64/libc.so.6
#5  0x0000000000000000 in ?? ()
(gdb)

Comment 1 Miroslav Rezanina 2018-01-08 07:36:34 UTC
Fix included in qemu-kvm-rhev-2.10.0-16.el7
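
To check whether a host already carries the fixed build, the installed version-release can be compared against 2.10.0-16.el7. The sketch below uses GNU `sort -V` for the comparison. That is only an approximation of RPM's own version ordering, so treat the result as a hint. The `has_fix` helper is a hypothetical name, not part of any package.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given version-release string is at
# least the fixed build named in this report.
FIXED="2.10.0-16.el7"

has_fix() {
    # sort -V (GNU coreutils) orders version strings; if FIXED sorts first
    # (or equals the input), the given build is new enough.
    lowest=$(printf '%s\n%s\n' "$FIXED" "$1" | sort -V | head -n 1)
    [ "$lowest" = "$FIXED" ]
}

# Usage on a host (queries the installed package):
#   has_fix "$(rpm -q --qf '%{VERSION}-%{RELEASE}' qemu-kvm-rhev)" && echo "fix present"
```

The affected build from the description, 2.10.0-12.el7, sorts below the fixed build, so `has_fix` fails for it.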

Comment 5 errata-xmlrpc 2018-04-11 00:55:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:1104