Bug 1003819

Summary:	system_reset makes qemu dump core after migrating an S3-state guest w/ spice & qxl
Product:	Red Hat Enterprise Linux 7	Reporter:	Qian Guo <qiguo>
Component:	qemu-kvm	Assignee:	Gerd Hoffmann <kraxel>
Status:	CLOSED DUPLICATE	QA Contact:	Virtualization Bugs <virt-bugs>
Severity:	high	Docs Contact:
Priority:	medium
Version:	7.0	CC:	hhuang, juzhang, mazhang, qiguo, qzhang, rbalakri, rhod, rmainz, virt-bugs, virt-maint, xutian
Target Milestone:	rc	Keywords:	TestOnly
Target Release:	---
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2014-10-30 08:46:26 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:	1054077
Bug Blocks:	923626

Description Qian Guo 2013-09-03 09:45:38 UTC

Description of problem:
A RHEL7 guest with spice and a qxl device is put into S3 state and then migrated. The guest cannot be resumed afterwards (an existing issue), but when system_reset is then issued via HMP, qemu dumps core.

Version-Release number of selected component (if applicable):
kernel:
# uname -r
3.10.0-15.el7.x86_64
# rpm -q qemu-kvm
qemu-kvm-1.5.3-2.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot the guest on the src host:
#/usr/libexec/qemu-kvm -cpu Penryn -enable-kvm -m 4096 -smp 4,sockets=1,cores=4,threads=1 -name rhel7base  -drive file=/mnt/rhel7cp1.qcow2_v3,if=none,id=drive-virtio-disk0,format=qcow2,werror=stop,rerror=stop,aio=native -device virtio-blk-pci,drive=drive-virtio-disk0,id=virtio-disk0 -boot menu=on -monitor stdio -netdev tap,id=hostnet0,ifname=guest1,script=/etc/ovs-ifup,downscript=/etc/ovs-ifdown,vhost=on,queues=4 -device virtio-net,netdev=hostnet0,mac=54:52:1b:35:3c:16,id=test,mq=on,vectors=9 -nodefaults -nodefconfig -spice disable-ticketing,port=5930,seamless-migration=on -vga qxl -global qxl-vga.vram_size=67108864   -device virtio-balloon-pci,id=balloon1 -qmp tcp:0:4446,server,nowait -device intel-hda,id=hda1 -device hda-duplex -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -serial unix:/tmp/qiguo,server,nowait

2. Enter S3 from inside the guest:
# pm-suspend

3. Migrate the guest to the dst host; the guest then stalls and cannot be resumed.

4. After migration, try to reset the guest via HMP:
# system_reset
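
Steps 3 and 4 can also be driven over the QMP socket that the command line above opens with "-qmp tcp:0:4446,server,nowait". The following is only a minimal sketch under assumptions: the migration URI and the destination QMP port are not given in this report, so any values passed in are hypothetical, and the helper names are made up for illustration. Step 2 (pm-suspend) still has to be run inside the guest.

```python
import json
import socket

QMP_PORT = 4446  # taken from "-qmp tcp:0:4446,server,nowait" in step 1


def qmp_command(name, **arguments):
    """Serialize one QMP command as a single JSON line."""
    message = {"execute": name}
    if arguments:
        message["arguments"] = arguments
    return json.dumps(message) + "\r\n"


def source_commands(dst_uri):
    """Step 3: sent to the src qemu. dst_uri is an assumption, e.g. 'tcp:<dst>:<port>'."""
    return [
        qmp_command("qmp_capabilities"),  # leave capabilities negotiation mode
        qmp_command("migrate", uri=dst_uri),
    ]


def destination_commands():
    """Step 4: sent to the dst qemu once migration has finished."""
    return [
        qmp_command("qmp_capabilities"),
        qmp_command("system_reset"),
    ]


def send_all(host, port, commands):
    """Send commands to a live QMP socket and return the raw reply lines.

    Simplification: one line is read per command, so asynchronous QMP
    events arriving in between would be mistaken for replies.
    """
    replies = []
    with socket.create_connection((host, port)) as sock:
        reader = sock.makefile("rb")
        replies.append(reader.readline())  # QMP greeting banner
        for command in commands:
            sock.sendall(command.encode("utf-8"))
            replies.append(reader.readline())
    return replies
```

The migrate command returns before the migration itself is done, so a real script would poll query-migrate on the src side before sending system_reset to the dst side.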

Actual results:
qemu dumps core with the following qxl/spice message:

(qemu) qemu-kvm: /builddir/build/BUILD/qemu-1.5.3/hw/display/qxl.c:1114: qxl_check_state: Assertion `!spice_display_running || ((&ram->cmd_ring)->cons == (&ram->cmd_ring)->prod)' failed.

Program received signal SIGABRT, Aborted.
0x00007ffff32e4999 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install alsa-lib-1.0.27.2-1.el7.x86_64 celt051-0.5.1.3-6.el7.x86_64 cyrus-sasl-lib-2.1.26-9.el7.x86_64 cyrus-sasl-md5-2.1.26-9.el7.x86_64 cyrus-sasl-plain-2.1.26-9.el7.x86_64 cyrus-sasl-scram-2.1.26-9.el7.x86_64 dbus-libs-1.6.12-4.el7.x86_64 flac-libs-1.3.0-2.el7.x86_64 glib2-2.36.3-2.el7.x86_64 glibc-2.17-21.el7.x86_64 glusterfs-3.4.0.15rhs-1.el7.x86_64 gmp-5.1.1-2.el7.x86_64 gnutls-3.1.13-1.el7.x86_64 gsm-1.0.13-9.el7.x86_64 json-c-0.11-1.el7.x86_64 keyutils-libs-1.5.5-4.el7.x86_64 krb5-libs-1.11.3-8.el7.x86_64 libICE-1.0.8-5.el7.x86_64 libSM-1.2.1-5.el7.x86_64 libX11-1.6.0-1.el7.x86_64 libXau-1.0.8-1.el7.x86_64 libXext-1.3.2-1.el7.x86_64 libXi-1.7.2-1.el7.x86_64 libXtst-1.2.2-1.el7.x86_64 libaio-0.3.109-9.el7.x86_64 libasyncns-0.8-5.el7.x86_64 libattr-2.4.46-10.el7.x86_64 libcap-2.22-6.el7.x86_64 libcom_err-1.42.8-2.el7.x86_64 libdb-5.3.21-11.el7.x86_64 libgcc-4.8.1-6.el7.x86_64 libgcrypt-1.5.3-1.el7.x86_64 libgpg-error-1.11-1.el7.x86_64 libiscsi-1.7.0-6.el7.x86_64 libjpeg-turbo-1.2.90-2.el7.x86_64 libogg-1.3.0-5.el7.x86_64 libpng-1.5.13-2.el7.x86_64 libseccomp-2.1.0-0.el7.x86_64 libselinux-2.1.13-16.el7.x86_64 libsndfile-1.0.25-7.el7.x86_64 libtasn1-3.3-1.el7.x86_64 libusbx-1.0.15-2.el7.x86_64 libuuid-2.23.2-2.el7.x86_64 libvorbis-1.3.3-4.el7.x86_64 libxcb-1.9-3.el7.x86_64 nettle-2.6-2.el7.x86_64 nspr-4.10-3.el7.x86_64 nss-3.15.1-2.el7.x86_64 nss-softokn-freebl-3.15.1-2.el7.x86_64 nss-util-3.15.1-2.el7.x86_64 openssl-libs-1.0.1e-15.el7.x86_64 p11-kit-0.18.5-1.el7.x86_64 pcre-8.32-7.el7.x86_64 pixman-0.30.0-1.el7.x86_64 pulseaudio-libs-3.0-10.el7.x86_64 spice-server-0.12.4-1.el7.x86_64 tcp_wrappers-libs-7.6-75.el7.x86_64 usbredir-0.6-3.el7.x86_64 zlib-1.2.7-10.el7.x86_64
(gdb) bt
#0  0x00007ffff32e4999 in raise () from /lib64/libc.so.6
#1  0x00007ffff32e60a8 in abort () from /lib64/libc.so.6
#2  0x00007ffff32dd906 in __assert_fail_base () from /lib64/libc.so.6
#3  0x00007ffff32dd9b2 in __assert_fail () from /lib64/libc.so.6
#4  0x000055555575971d in qxl_check_state (d=<optimized out>) at /usr/src/debug/qemu-1.5.3/hw/display/qxl.c:1114
#5  0x000055555575a025 in qxl_reset_state (d=d@entry=0x555556713e20) at /usr/src/debug/qemu-1.5.3/hw/display/qxl.c:1122
#6  0x000055555575b35b in qxl_hard_reset (d=0x555556713e20, loadvm=0) at /usr/src/debug/qemu-1.5.3/hw/display/qxl.c:1159
#7  0x000055555563ecd9 in qdev_reset_one (dev=dev@entry=0x555556713e20, opaque=opaque@entry=0x0) at hw/core/qdev.c:227
#8  0x000055555563e3d0 in qdev_walk_children (dev=dev@entry=0x555556713e20, devfn=devfn@entry=0x55555563ecc0 <qdev_reset_one>, 
    busfn=busfn@entry=0x55555563ccc0 <qbus_reset_one>, opaque=opaque@entry=0x0) at hw/core/qdev.c:376
#9  0x000055555563e46d in qdev_reset_all (dev=dev@entry=0x555556713e20) at hw/core/qdev.c:243
#10 0x0000555555682c3d in pci_device_reset (dev=0x555556713e20) at hw/pci/pci.c:180
#11 0x0000555555682df2 in pci_bus_reset (bus=0x5555566aed70) at hw/pci/pci.c:226
#12 0x0000555555682e39 in pcibus_reset (qbus=<optimized out>) at hw/pci/pci.c:233
#13 0x000055555563e4b0 in qbus_walk_children (bus=bus@entry=0x5555566aed70, devfn=devfn@entry=0x55555563ecc0 <qdev_reset_one>, 
    busfn=busfn@entry=0x55555563ccc0 <qbus_reset_one>, opaque=opaque@entry=0x0) at hw/core/qdev.c:353
#14 0x000055555563e3fa in qdev_walk_children (dev=<optimized out>, devfn=devfn@entry=0x55555563ecc0 <qdev_reset_one>, busfn=busfn@entry=0x55555563ccc0 <qbus_reset_one>, 
    opaque=opaque@entry=0x0) at hw/core/qdev.c:383
#15 0x000055555563e4da in qbus_walk_children (bus=<optimized out>, devfn=0x55555563ecc0 <qdev_reset_one>, busfn=0x55555563ccc0 <qbus_reset_one>, opaque=0x0)
    at hw/core/qdev.c:360
#16 0x000055555573143d in qemu_devices_reset () at vl.c:1883
#17 qemu_system_reset (report=report@entry=true) at vl.c:1892
#18 0x00005555555c5474 in main_loop_should_exit () at vl.c:2026
#19 main_loop () at vl.c:2064
#20 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4451
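
The assertion in frame #4 can be read as: whenever the spice display is running, the qxl command ring must be empty (cons == prod). The toy model below only illustrates that condition. It is not the QEMU code: the Ring class and the function names are made up here, and only the boolean expression is taken from the assertion message above.

```python
from dataclasses import dataclass


@dataclass
class Ring:
    """Toy stand-in for a ring with producer/consumer indices."""
    prod: int = 0  # advanced when a command is queued
    cons: int = 0  # advanced when a command is consumed


def ring_is_empty(ring):
    """True when every queued command has been consumed."""
    return ring.cons == ring.prod


def check_state_holds(spice_display_running, cmd_ring):
    """Mirror of: !spice_display_running || (cons == prod)."""
    return (not spice_display_running) or ring_is_empty(cmd_ring)


# Display running and commands still pending: the condition is violated,
# which corresponds to the abort seen in the backtrace.
print(check_state_holds(True, Ring(prod=5, cons=3)))
```

So at the time of system_reset on the dst host, the display was flagged as running while the command ring still held unconsumed entries.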


Expected results:
No core dump, and the guest reboots successfully.

Additional info:
When testing with std & spice, the issue does not occur.

Comment 1 Qian Guo 2013-09-03 09:49:27 UTC

There is a call trace on the dst host after qemu-kvm dumps core:
[98393.475354] WARNING: at net/core/dev.c:5011 rollback_registered_many+0x1e2/0x210()
[98393.475356] Modules linked in: tcp_lp rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache nfsd auth_rpcgss nfs_acl lockd sunrpc vhost_net macvtap macvlan tun bnep bluetooth fuse xt_CHECKSUM bridge stp llc ebtable_nat nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6table_nat nf_nat_ipv6 ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables openvswitch vxlan ip_tunnel gre sg snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device iTCO_wdt igb coretemp kvm_intel iTCO_vendor_support kvm e1000e i2c_i801 snd_pcm snd_page_alloc snd_timer snd hp_wmi sparse_keymap
[98393.475421]  rfkill crc32_pclmul lpc_ich dca crc32c_intel soundcore ghash_clmulni_intel mfd_core ptp pps_core wmi shpchp serio_raw microcode mperf pcspkr uinput xfs libcrc32c sr_mod sd_mod cdrom crc_t10dif i915 video i2c_algo_bit drm_kms_helper ahci drm libahci libata i2c_core dm_mirror dm_region_hash dm_log dm_mod
[98393.475454] CPU: 4 PID: 14411 Comm: qemu-kvm Tainted: G        W   --------------   3.10.0-15.el7.x86_64 #1
[98393.475457] Hardware name: Hewlett-Packard HP Compaq 8200 Elite MT PC/1495, BIOS J01 v02.15 11/10/2011
[98393.475459]  0000000000000009 ffff8801ee419b30 ffffffff815fa8cc ffff8801ee419b68
[98393.475464]  ffffffff81060711 ffff8801249e0000 ffff8801ee419bb0 ffff8801ee419bb0
[98393.475468]  ffff88021bc800c0 ffff8801e4974400 ffff8801ee419b78 ffffffff810607ea
[98393.475473] Call Trace:
[98393.475480]  [<ffffffff815fa8cc>] dump_stack+0x19/0x1b
[98393.475487]  [<ffffffff81060711>] warn_slowpath_common+0x61/0x80
[98393.475489]  [<ffffffff810607ea>] warn_slowpath_null+0x1a/0x20
[98393.475492]  [<ffffffff814ee8e2>] rollback_registered_many+0x1e2/0x210
[98393.475494]  [<ffffffff814ee941>] rollback_registered+0x31/0x40
[98393.475497]  [<ffffffff814ef9f8>] unregister_netdevice_queue+0x48/0x90
[98393.475509]  [<ffffffffa0677312>] __tun_detach+0x112/0x2b0 [tun]
[98393.475513]  [<ffffffffa06774dd>] tun_chr_close+0x2d/0x50 [tun]
[98393.475517]  [<ffffffff8119e6a9>] __fput+0xe9/0x270
[98393.475520]  [<ffffffff8119e8ee>] ____fput+0xe/0x10
[98393.475524]  [<ffffffff810820a4>] task_work_run+0xc4/0xe0
[98393.475527]  [<ffffffff81066025>] do_exit+0x2b5/0xa20
[98393.475530]  [<ffffffff8106680f>] do_group_exit+0x3f/0xa0
[98393.475535]  [<ffffffff81074eeb>] get_signal_to_deliver+0x1cb/0x5d0
[98393.475539]  [<ffffffff81011408>] do_signal+0x48/0x5a0
[98393.475543]  [<ffffffff810119d0>] do_notify_resume+0x70/0xa0
[98393.475547]  [<ffffffff81609292>] int_signal+0x12/0x17
[98393.475549] ---[ end trace 8b1af66abfed498d ]---

Comment 3 Gerd Hoffmann 2013-11-05 12:53:34 UTC

Looks similar to bug 1021324.
Can you retest with qemu-kvm-1.5.3-12.el7.x86_64 (or newer) please?

Comment 4 Qian Guo 2013-11-06 06:46:30 UTC

Hi, Gerd

Reproduced with 
# rpm -q qemu-kvm-rhev
qemu-kvm-rhev-1.5.3-13.el7.x86_64

host/guest kernel : kernel-3.10.0-42.el7.x86_64

After migration and system_reset, qemu-kvm dumps core:
(qemu) qemu-kvm: /builddir/build/BUILD/qemu-1.5.3/hw/display/qxl.c:1114: qxl_check_state: Assertion `!spice_display_running || ((&ram->cmd_ring)->cons == (&ram->cmd_ring)->prod)' failed.
Aborted


The call trace is also hit on the dst host; the core dump and call trace messages are the same as in comment #0.

See bug #1021324: the core dump messages are similar, so it seems to be the same bug.


Thanks,

Qian Guo

Comment 12 Gerd Hoffmann 2013-12-10 08:34:23 UTC

http://patchwork.ozlabs.org/patch/299331/
http://patchwork.ozlabs.org/patch/299329/
http://patchwork.ozlabs.org/patch/299330/

Comment 14 Gerd Hoffmann 2014-05-23 08:57:12 UTC

upstream commits:
7cc6a25fe94b430cb5a041bcb19d7d854b4e99a7
b50f3e42b9438e033074222671c0502ecfeba82c
75c70e37bc4a6bdc394b4d1b163fe730abb82c72

Comment 15 Gerd Hoffmann 2014-09-02 11:06:05 UTC

Most likely same as bug 1054077.

Comment 16 Gerd Hoffmann 2014-10-27 09:47:50 UTC

bug 1054077 was fixed in qemu-kvm-1.5.3-71.el7, please retest with that build (or newer).
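
To check quickly whether an installed build is at or past the fixed one, the "rpm -q" output can be compared against 1.5.3-71. This is a rough sketch with made-up helper names; it uses plain tuple comparison rather than RPM's own version comparison, and treating qemu-kvm-rhev builds the same way as qemu-kvm builds is an assumption.

```python
import re

FIXED = ((1, 5, 3), 71)  # qemu-kvm-1.5.3-71.el7, from the comment above

_NVR = re.compile(
    r"^(?P<name>.+)-(?P<version>\d+(?:\.\d+)*)-(?P<release>\d+)\.el\d+"
)


def parse_nvr(nvr):
    """Split e.g. 'qemu-kvm-1.5.3-77.el7.x86_64' into (name, version, release)."""
    match = _NVR.match(nvr)
    if match is None:
        raise ValueError("unrecognized package string: %r" % nvr)
    version = tuple(int(part) for part in match.group("version").split("."))
    return match.group("name"), version, int(match.group("release"))


def has_fix(nvr, fixed=FIXED):
    """Approximate 'that build (or newer)' by comparing (version, release)."""
    _name, version, release = parse_nvr(nvr)
    return (version, release) >= fixed


print(has_fix("qemu-kvm-1.5.3-2.el7.x86_64"))
```

The build from the original report (1.5.3-2) compares as older than the fix, while a 1.5.3-77 build compares as newer.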

Comment 17 juzhang 2014-10-28 00:48:52 UTC

Hi Qian,

Could you re-test this issue?

Best Regards,
Junyi

Comment 18 Qian Guo 2014-10-30 08:22:44 UTC

(In reply to Gerd Hoffmann from comment #16)
> bug 1054077 was fixed in qemu-kvm-1.5.3-71.el7, please retest with that
> build (or newer).

Tested this scenario with qemu-kvm-rhev-2.1.2-5.el7.x86_64 and qemu-kvm-1.5.3-77.el7.x86_64; both work well.

qemu cli:
# /usr/libexec/qemu-kvm -cpu Penryn -enable-kvm -m 4096 -smp 4,sockets=1,cores=4,threads=1 -name rhel7base  -drive file=/mnt/rhel7u1/rhel7u1cp1.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,werror=stop,rerror=stop,aio=native -device virtio-blk-pci,drive=drive-virtio-disk0,id=virtio-disk0 -boot menu=on -monitor stdio -netdev tap,id=hostnet0,ifname=guest1,script=/etc/qemu-ifup,vhost=on,queues=4 -device virtio-net,netdev=hostnet0,mac=54:52:1b:35:3c:16,id=test,mq=on,vectors=9 -nodefaults -nodefconfig -spice disable-ticketing,port=5930,seamless-migration=on -vga qxl -global qxl-vga.vram_size=67108864   -device virtio-balloon-pci,id=balloon1 -qmp tcp:0:4446,server,nowait -device intel-hda,id=hda1 -device hda-duplex -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -serial unix:/tmp/qiguo,server,nowait

So the latest build has fixed this bug.

Comment 19 Gerd Hoffmann 2014-10-30 08:46:26 UTC

> So the latest build has fixed this bug.

Good, closing as 1054077 dup then.

*** This bug has been marked as a duplicate of bug 1054077 ***