Bug 1003819
| Summary: | System-reset makes qemu core dump after migrating an "s3-state" guest w/ spice&qxl. | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Qian Guo <qiguo> |
| Component: | qemu-kvm | Assignee: | Gerd Hoffmann <kraxel> |
| Status: | CLOSED DUPLICATE | QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | high | Docs Contact: | |
| Priority: | medium | ||
| Version: | 7.0 | CC: | hhuang, juzhang, mazhang, qiguo, qzhang, rbalakri, rhod, rmainz, virt-bugs, virt-maint, xutian |
| Target Milestone: | rc | Keywords: | TestOnly |
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2014-10-30 08:46:26 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 1054077 | ||
| Bug Blocks: | 923626 | ||
There's a call trace on the dst host after qemu-kvm core dumped:

    [98393.475354] WARNING: at net/core/dev.c:5011 rollback_registered_many+0x1e2/0x210()
    [98393.475356] Modules linked in: tcp_lp rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache nfsd auth_rpcgss nfs_acl lockd sunrpc vhost_net macvtap macvlan tun bnep bluetooth fuse xt_CHECKSUM bridge stp llc ebtable_nat nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6table_nat nf_nat_ipv6 ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables openvswitch vxlan ip_tunnel gre sg snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device iTCO_wdt igb coretemp kvm_intel iTCO_vendor_support kvm e1000e i2c_i801 snd_pcm snd_page_alloc snd_timer snd hp_wmi sparse_keymap
    [98393.475421] rfkill crc32_pclmul lpc_ich dca crc32c_intel soundcore ghash_clmulni_intel mfd_core ptp pps_core wmi shpchp serio_raw microcode mperf pcspkr uinput xfs libcrc32c sr_mod sd_mod cdrom crc_t10dif i915 video i2c_algo_bit drm_kms_helper ahci drm libahci libata i2c_core dm_mirror dm_region_hash dm_log dm_mod
    [98393.475454] CPU: 4 PID: 14411 Comm: qemu-kvm Tainted: G W -------------- 3.10.0-15.el7.x86_64 #1
    [98393.475457] Hardware name: Hewlett-Packard HP Compaq 8200 Elite MT PC/1495, BIOS J01 v02.15 11/10/2011
    [98393.475459] 0000000000000009 ffff8801ee419b30 ffffffff815fa8cc ffff8801ee419b68
    [98393.475464] ffffffff81060711 ffff8801249e0000 ffff8801ee419bb0 ffff8801ee419bb0
    [98393.475468] ffff88021bc800c0 ffff8801e4974400 ffff8801ee419b78 ffffffff810607ea
    [98393.475473] Call Trace:
    [98393.475480] [<ffffffff815fa8cc>] dump_stack+0x19/0x1b
    [98393.475487] [<ffffffff81060711>] warn_slowpath_common+0x61/0x80
    [98393.475489] [<ffffffff810607ea>] warn_slowpath_null+0x1a/0x20
    [98393.475492] [<ffffffff814ee8e2>] rollback_registered_many+0x1e2/0x210
    [98393.475494] [<ffffffff814ee941>] rollback_registered+0x31/0x40
    [98393.475497] [<ffffffff814ef9f8>] unregister_netdevice_queue+0x48/0x90
    [98393.475509] [<ffffffffa0677312>] __tun_detach+0x112/0x2b0 [tun]
    [98393.475513] [<ffffffffa06774dd>] tun_chr_close+0x2d/0x50 [tun]
    [98393.475517] [<ffffffff8119e6a9>] __fput+0xe9/0x270
    [98393.475520] [<ffffffff8119e8ee>] ____fput+0xe/0x10
    [98393.475524] [<ffffffff810820a4>] task_work_run+0xc4/0xe0
    [98393.475527] [<ffffffff81066025>] do_exit+0x2b5/0xa20
    [98393.475530] [<ffffffff8106680f>] do_group_exit+0x3f/0xa0
    [98393.475535] [<ffffffff81074eeb>] get_signal_to_deliver+0x1cb/0x5d0
    [98393.475539] [<ffffffff81011408>] do_signal+0x48/0x5a0
    [98393.475543] [<ffffffff810119d0>] do_notify_resume+0x70/0xa0
    [98393.475547] [<ffffffff81609292>] int_signal+0x12/0x17
    [98393.475549] ---[ end trace 8b1af66abfed498d ]---

Looks similar to bug 1021324. Can you retest with qemu-kvm-1.5.3-12.el7.x86_64 (or newer) please?

Hi, Gerd

Reproduced with:

    # rpm -q qemu-kvm-rhev
    qemu-kvm-rhev-1.5.3-13.el7.x86_64

host/guest kernel: kernel-3.10.0-42.el7.x86_64

After migration and system_reset, qemu-kvm core dumped:

    (qemu) qemu-kvm: /builddir/build/BUILD/qemu-1.5.3/hw/display/qxl.c:1114: qxl_check_state: Assertion `!spice_display_running || ((&ram->cmd_ring)->cons == (&ram->cmd_ring)->prod)' failed.
    Aborted

The dst host also hit the call trace; the core dump and call trace messages are the same as in comment #0. See bug #1021324: the core dump messages are similar, so this seems to be the same bug.

Thanks,
Qian Guo

http://patchwork.ozlabs.org/patch/299331/
http://patchwork.ozlabs.org/patch/299329/
http://patchwork.ozlabs.org/patch/299330/

upstream commits:
7cc6a25fe94b430cb5a041bcb19d7d854b4e99a7
b50f3e42b9438e033074222671c0502ecfeba82c
75c70e37bc4a6bdc394b4d1b163fe730abb82c72

Most likely same as bug 1054077.

bug 1054077 was fixed in qemu-kvm-1.5.3-71.el7, please retest with that build (or newer).

Hi Qian,
Could you re-test this issue?
Best Regards,
Junyi

(In reply to Gerd Hoffmann from comment #16)
> bug 1054077 was fixed in qemu-kvm-1.5.3-71.el7, please retest with that
> build (or newer).

Tested this scenario with qemu-kvm-rhev-2.1.2-5.el7.x86_64 and qemu-kvm-1.5.3-77.el7.x86_64; both work well.

qemu cli:

    # /usr/libexec/qemu-kvm -cpu Penryn -enable-kvm -m 4096 -smp 4,sockets=1,cores=4,threads=1 -name rhel7base -drive file=/mnt/rhel7u1/rhel7u1cp1.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,werror=stop,rerror=stop,aio=native -device virtio-blk-pci,drive=drive-virtio-disk0,id=virtio-disk0 -boot menu=on -monitor stdio -netdev tap,id=hostnet0,ifname=guest1,script=/etc/qemu-ifup,vhost=on,queues=4 -device virtio-net,netdev=hostnet0,mac=54:52:1b:35:3c:16,id=test,mq=on,vectors=9 -nodefaults -nodefconfig -spice disable-ticketing,port=5930,seamless-migration=on -vga qxl -global qxl-vga.vram_size=67108864 -device virtio-balloon-pci,id=balloon1 -qmp tcp:0:4446,server,nowait -device intel-hda,id=hda1 -device hda-duplex -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -serial unix:/tmp/qiguo,server,nowait

So the latest build has fixed this bug.

> So the latest build has fixed this bug.

Good, closing as 1054077 dup then.

*** This bug has been marked as a duplicate of bug 1054077 ***
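
For anyone re-running this scenario in a scripted way: the qemu command line above opens a QMP socket with `-qmp tcp:0:4446,server,nowait`, so the reset can also be issued over QMP rather than typed into the hmp monitor. The following is only a minimal sketch, not part of the original test procedure. The helper names (`qmp_command`, `read_reply`, `qmp_system_reset`) and the default host `127.0.0.1` are assumptions made for illustration; the port is taken from the command line above.

```python
import json
import socket


def qmp_command(name: str) -> bytes:
    """Encode a QMP command without arguments as one JSON line."""
    return (json.dumps({"execute": name}) + "\r\n").encode("utf-8")


def read_reply(stream) -> dict:
    """Return the next command reply, skipping asynchronous event messages."""
    while True:
        line = stream.readline()
        if not line:
            # If qemu aborts during the reset, the socket is closed instead.
            raise ConnectionError("QMP connection closed before a reply arrived")
        msg = json.loads(line)
        if "return" in msg or "error" in msg:
            return msg


def qmp_system_reset(host: str = "127.0.0.1", port: int = 4446) -> dict:
    """Negotiate capabilities, then send system_reset (host is an assumption)."""
    with socket.create_connection((host, port), timeout=10) as sock:
        stream = sock.makefile("rwb")
        json.loads(stream.readline())  # greeting sent by the server: {"QMP": ...}
        stream.write(qmp_command("qmp_capabilities"))
        stream.flush()
        read_reply(stream)
        stream.write(qmp_command("system_reset"))
        stream.flush()
        return read_reply(stream)
```

Calling `qmp_system_reset()` on the dst host after migration should return the reply dictionary on a fixed build; on an affected build the likely outcome is a `ConnectionError`, since qemu aborts while handling the reset.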
Description of problem:
A RHEL7 guest w/ spice and a qxl device is put into S3 state and then migrated. The guest cannot be resumed afterwards (an existing issue), and when a system_reset is attempted via hmp, qemu dumps core.

Version-Release number of selected component (if applicable):
kernel:

    # uname -r
    3.10.0-15.el7.x86_64
    # rpm -q qemu-kvm
    qemu-kvm-1.5.3-2.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot the guest on the src host:

    # /usr/libexec/qemu-kvm -cpu Penryn -enable-kvm -m 4096 -smp 4,sockets=1,cores=4,threads=1 -name rhel7base -drive file=/mnt/rhel7cp1.qcow2_v3,if=none,id=drive-virtio-disk0,format=qcow2,werror=stop,rerror=stop,aio=native -device virtio-blk-pci,drive=drive-virtio-disk0,id=virtio-disk0 -boot menu=on -monitor stdio -netdev tap,id=hostnet0,ifname=guest1,script=/etc/ovs-ifup,downscript=/etc/ovs-ifdown,vhost=on,queues=4 -device virtio-net,netdev=hostnet0,mac=54:52:1b:35:3c:16,id=test,mq=on,vectors=9 -nodefaults -nodefconfig -spice disable-ticketing,port=5930,seamless-migration=on -vga qxl -global qxl-vga.vram_size=67108864 -device virtio-balloon-pci,id=balloon1 -qmp tcp:0:4446,server,nowait -device intel-hda,id=hda1 -device hda-duplex -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -serial unix:/tmp/qiguo,server,nowait

2. Do S3 in the guest:

    # pm-suspend

3. Migrate this guest to the dst host. The guest then stalls and cannot be resumed.
4. After migration, try to reset the guest via hmp:

    # system_reset

Actual results:
qemu core dumped w/ the following qxl/spice information:

    (qemu) qemu-kvm: /builddir/build/BUILD/qemu-1.5.3/hw/display/qxl.c:1114: qxl_check_state: Assertion `!spice_display_running || ((&ram->cmd_ring)->cons == (&ram->cmd_ring)->prod)' failed.

    Program received signal SIGABRT, Aborted.
    0x00007ffff32e4999 in raise () from /lib64/libc.so.6
    Missing separate debuginfos, use: debuginfo-install alsa-lib-1.0.27.2-1.el7.x86_64 celt051-0.5.1.3-6.el7.x86_64 cyrus-sasl-lib-2.1.26-9.el7.x86_64 cyrus-sasl-md5-2.1.26-9.el7.x86_64 cyrus-sasl-plain-2.1.26-9.el7.x86_64 cyrus-sasl-scram-2.1.26-9.el7.x86_64 dbus-libs-1.6.12-4.el7.x86_64 flac-libs-1.3.0-2.el7.x86_64 glib2-2.36.3-2.el7.x86_64 glibc-2.17-21.el7.x86_64 glusterfs-3.4.0.15rhs-1.el7.x86_64 gmp-5.1.1-2.el7.x86_64 gnutls-3.1.13-1.el7.x86_64 gsm-1.0.13-9.el7.x86_64 json-c-0.11-1.el7.x86_64 keyutils-libs-1.5.5-4.el7.x86_64 krb5-libs-1.11.3-8.el7.x86_64 libICE-1.0.8-5.el7.x86_64 libSM-1.2.1-5.el7.x86_64 libX11-1.6.0-1.el7.x86_64 libXau-1.0.8-1.el7.x86_64 libXext-1.3.2-1.el7.x86_64 libXi-1.7.2-1.el7.x86_64 libXtst-1.2.2-1.el7.x86_64 libaio-0.3.109-9.el7.x86_64 libasyncns-0.8-5.el7.x86_64 libattr-2.4.46-10.el7.x86_64 libcap-2.22-6.el7.x86_64 libcom_err-1.42.8-2.el7.x86_64 libdb-5.3.21-11.el7.x86_64 libgcc-4.8.1-6.el7.x86_64 libgcrypt-1.5.3-1.el7.x86_64 libgpg-error-1.11-1.el7.x86_64 libiscsi-1.7.0-6.el7.x86_64 libjpeg-turbo-1.2.90-2.el7.x86_64 libogg-1.3.0-5.el7.x86_64 libpng-1.5.13-2.el7.x86_64 libseccomp-2.1.0-0.el7.x86_64 libselinux-2.1.13-16.el7.x86_64 libsndfile-1.0.25-7.el7.x86_64 libtasn1-3.3-1.el7.x86_64 libusbx-1.0.15-2.el7.x86_64 libuuid-2.23.2-2.el7.x86_64 libvorbis-1.3.3-4.el7.x86_64 libxcb-1.9-3.el7.x86_64 nettle-2.6-2.el7.x86_64 nspr-4.10-3.el7.x86_64 nss-3.15.1-2.el7.x86_64 nss-softokn-freebl-3.15.1-2.el7.x86_64 nss-util-3.15.1-2.el7.x86_64 openssl-libs-1.0.1e-15.el7.x86_64 p11-kit-0.18.5-1.el7.x86_64 pcre-8.32-7.el7.x86_64 pixman-0.30.0-1.el7.x86_64 pulseaudio-libs-3.0-10.el7.x86_64 spice-server-0.12.4-1.el7.x86_64 tcp_wrappers-libs-7.6-75.el7.x86_64 usbredir-0.6-3.el7.x86_64 zlib-1.2.7-10.el7.x86_64
    (gdb) bt
    #0 0x00007ffff32e4999 in raise () from /lib64/libc.so.6
    #1 0x00007ffff32e60a8 in abort () from /lib64/libc.so.6
    #2 0x00007ffff32dd906 in __assert_fail_base () from /lib64/libc.so.6
    #3 0x00007ffff32dd9b2 in __assert_fail () from /lib64/libc.so.6
    #4 0x000055555575971d in qxl_check_state (d=<optimized out>) at /usr/src/debug/qemu-1.5.3/hw/display/qxl.c:1114
    #5 0x000055555575a025 in qxl_reset_state (d=d@entry=0x555556713e20) at /usr/src/debug/qemu-1.5.3/hw/display/qxl.c:1122
    #6 0x000055555575b35b in qxl_hard_reset (d=0x555556713e20, loadvm=0) at /usr/src/debug/qemu-1.5.3/hw/display/qxl.c:1159
    #7 0x000055555563ecd9 in qdev_reset_one (dev=dev@entry=0x555556713e20, opaque=opaque@entry=0x0) at hw/core/qdev.c:227
    #8 0x000055555563e3d0 in qdev_walk_children (dev=dev@entry=0x555556713e20, devfn=devfn@entry=0x55555563ecc0 <qdev_reset_one>, busfn=busfn@entry=0x55555563ccc0 <qbus_reset_one>, opaque=opaque@entry=0x0) at hw/core/qdev.c:376
    #9 0x000055555563e46d in qdev_reset_all (dev=dev@entry=0x555556713e20) at hw/core/qdev.c:243
    #10 0x0000555555682c3d in pci_device_reset (dev=0x555556713e20) at hw/pci/pci.c:180
    #11 0x0000555555682df2 in pci_bus_reset (bus=0x5555566aed70) at hw/pci/pci.c:226
    #12 0x0000555555682e39 in pcibus_reset (qbus=<optimized out>) at hw/pci/pci.c:233
    #13 0x000055555563e4b0 in qbus_walk_children (bus=bus@entry=0x5555566aed70, devfn=devfn@entry=0x55555563ecc0 <qdev_reset_one>, busfn=busfn@entry=0x55555563ccc0 <qbus_reset_one>, opaque=opaque@entry=0x0) at hw/core/qdev.c:353
    #14 0x000055555563e3fa in qdev_walk_children (dev=<optimized out>, devfn=devfn@entry=0x55555563ecc0 <qdev_reset_one>, busfn=busfn@entry=0x55555563ccc0 <qbus_reset_one>, opaque=opaque@entry=0x0) at hw/core/qdev.c:383
    #15 0x000055555563e4da in qbus_walk_children (bus=<optimized out>, devfn=0x55555563ecc0 <qdev_reset_one>, busfn=0x55555563ccc0 <qbus_reset_one>, opaque=0x0) at hw/core/qdev.c:360
    #16 0x000055555573143d in qemu_devices_reset () at vl.c:1883
    #17 qemu_system_reset (report=report@entry=true) at vl.c:1892
    #18 0x00005555555c5474 in main_loop_should_exit () at vl.c:2026
    #19 main_loop () at vl.c:2064
    #20 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4451

Expected results:
No core dump, and the guest can reboot successfully.

Additional info:
When testing w/ std&spice, there is no such issue.
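
To make the failed assertion from the actual results easier to read: the condition `!spice_display_running || cons == prod` only holds at reset time if either the spice display is stopped or the command ring has no unconsumed entries. Below is a small Python model of just that boolean condition. It is not QEMU code; the `Ring` class and the function name are hypothetical, and the producer/consumer roles in the comments are an assumption about how the ring is used.

```python
from dataclasses import dataclass


@dataclass
class Ring:
    """Hypothetical stand-in for a producer/consumer ring such as cmd_ring."""
    prod: int = 0  # advanced by the producer (assumed: the guest driver)
    cons: int = 0  # advanced by the consumer (assumed: the display side)

    def is_empty(self) -> bool:
        # Same comparison as in the assertion text: cons == prod.
        return self.cons == self.prod


def reset_state_is_valid(spice_display_running: bool, cmd_ring: Ring) -> bool:
    """Model of the asserted condition: !running || ring is empty."""
    return (not spice_display_running) or cmd_ring.is_empty()


# Commands still pending while the display is running: the assertion would fail.
print(reset_state_is_valid(True, Ring(prod=3, cons=1)))   # False
# Same ring, but display stopped: the condition holds.
print(reset_state_is_valid(False, Ring(prod=3, cons=1)))  # True
# Display running and ring drained: the condition holds.
print(reset_state_is_valid(True, Ring(prod=2, cons=2)))   # True
```

Read this way, the abort in `qxl_check_state` indicates that on the dst host the display was considered running while the command ring still held unconsumed entries when the reset arrived.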