868256 – [Spice] Guest aborted on the dst host when migrating guest during reboot

Bug 868256 - [Spice] Guest aborted on the dst host when migrating guest during reboot

Keywords:
Status:	CLOSED EOL
Alias:	None
Product:	Red Hat Enterprise Linux 6
Classification:	Red Hat
Component:	qemu-kvm
Sub Component:
Version:	6.4
Hardware:	Unspecified
OS:	Unspecified
Priority:	low
Severity:	low
Target Milestone:	rc
Target Release:	---
Assignee:	David Blechter
QA Contact:	Virtualization Bugs
Docs Contact:
URL:
Whiteboard:
Duplicates (1):	867816
Depends On:
Blocks:

Reported:	2012-10-19 11:00 UTC by Qunfang Zhang
Modified:	2016-08-02 15:26 UTC
CC List:	18 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2016-08-02 15:26:31 UTC
Target Upstream Version:
Embargoed:

Attachments
the source qemu logs. (213.90 KB, text/plain) 2012-11-27 11:08 UTC, Sibiao Luo
the destination qemu log. (5.60 KB, text/plain) 2012-11-27 11:09 UTC, Sibiao Luo

Description Qunfang Zhang 2012-10-19 11:00:12 UTC

Description of problem:
Boot a guest with spice and migrate it to the dst host while the guest is rebooting. The guest always aborts on the dst host side, even though the migration has already finished.
Re-tested with vnc; did not reproduce.

Version-Release number of selected component (if applicable):
kernel-2.6.32-331.el6.x86_64
qemu-kvm-0.12.1.2-2.327.el6.x86_64
seabios-0.6.1.2-25.el6.x86_64
spice-server-0.12.0-1.el6.x86_64

How reproducible:
2/2

Steps to Reproduce:
1. Boot a guest with spice.
(gdb) r  -M rhel6.4.0 -cpu Conroe -enable-kvm -m 2048 -smp 2,sockets=2,cores=1,threads=1 -enable-kvm -name rhel6.4-64 -uuid feebc8fd-f8b0-4e75-abc3-e63fcdb67170 -smbios type=1,manufacturer='Red Hat',product='RHEV Hypervisor',version=el6,serial=koTUXQrb,uuid=feebc8fd-f8b0-4e75-abc3-e63fcdb67170 -k en-us -rtc base=localtime,clock=host,driftfix=slew -no-kvm-pit-reinjection -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device usb-tablet,id=input0 -drive file=/mnt/rhel5.9-64-virtio.qcow2,if=none,id=disk0,format=qcow2,werror=stop,rerror=stop,aio=native -device virtio-blk-pci,drive=disk0,id=disk0,scsi=off,bus=pci.0,addr=0x3,bootindex=1 -drive file=/mnt/boot.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev tap,id=hostnet0,vhost=on,fd=6 6<>/dev/tap6 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:10:1A:4A:25:28,bus=pci.0,addr=0x4  -monitor stdio -qmp tcp:0:6666,server,nowait -boot c -chardev socket,path=/tmp/isa-serial,server,nowait,id=isa1 -device isa-serial,chardev=isa1,id=isa-serial1 -drive if=none,id=drive-fdc0-0-0,readonly=on,format=raw -global isa-fdc.driveA=drive-fdc0-0-0 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5  -chardev socket,id=charchannel0,path=/tmp/serial-socket,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.rhevm.vdsm -chardev spicevmc,id=charchannel1,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 -chardev socket,path=/tmp/foo,server,nowait,id=foo -device virtconsole,chardev=foo,id=console0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 -spice seamless-migration=on,port=5930,password=redhat -global qxl-vga.vram_size=33554432 -k en-us -vga qxl -device usb-ehci,id=ehci,bus=pci.0,addr=0x7 -device usb-storage,drive=drive-usb-0-0,id=usb-0-0,removable=on,bus=ehci.0,port=1 -drive 
file=/mnt/usb.qcow2,if=none,id=drive-usb-0-0,media=disk,format=qcow2,cache=none,werror=stop,rerror=stop,aio=native 

2. Connect guest desktop with 'remote-viewer'
#remote-viewer spice://$src_host_ip:5930

3. Boot the guest on dst host with listening mode.

4. On source host:
(qemu) __com.redhat_spice_migrate_info $dst_host_ip 5930
main_channel_migrate_src_complete: 
main_channel_client_handle_migrate_connected: client 0x7ffff8a3df80 connected: 1 seamless 1

5. Inside guest: #reboot

6. Migrate guest immediately after step 4.
(qemu) migrate -d tcp:$dst_host_ip:5800
  
Actual results:
After the migration finishes, the guest aborts on the dst host.

Expected results:
Guest works well on dst host after migration.

Additional info:
Log on dst host:

(/usr/bin/gdb:25657): Spice-CRITICAL **: red_memslots.c:94:validate_virt: virtual address out of range
    virt=0x800048f5f290+0xbf slot_id=0 group_id=1
    slot=0x7fff57c00000-0x7fff5bc00000 delta=0x7fff57c00000
Detaching after fork from child process 25864.

Program received signal SIGABRT, Aborted.
[Switching to Thread 0x7fffe53fd700 (LWP 25672)]
0x00007ffff574c8a5 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install alsa-lib-1.0.22-3.el6.x86_64 celt051-0.5.1.3-0.el6.x86_64 cyrus-sasl-lib-2.1.23-13.el6.x86_64 cyrus-sasl-md5-2.1.23-13.el6.x86_64 cyrus-sasl-plain-2.1.23-13.el6.x86_64 db4-4.7.25-17.el6.x86_64 dbus-libs-1.2.24-5.el6_1.x86_64 flac-1.2.1-6.1.el6.x86_64 glib2-2.22.5-7.el6.x86_64 glibc-2.12-1.80.el6.x86_64 gnutls-2.8.5-4.el6_2.2.x86_64 keyutils-libs-1.4-4.el6.x86_64 krb5-libs-1.9-33.el6.x86_64 libICE-1.0.6-1.el6.x86_64 libSM-1.1.0-7.1.el6.x86_64 libX11-1.3-2.el6.x86_64 libXau-1.0.5-1.el6.x86_64 libXext-1.1-3.el6.x86_64 libXi-1.3-3.el6.x86_64 libXtst-1.0.99.2-3.el6.x86_64 libaio-0.3.107-10.el6.x86_64 libasyncns-0.8-1.1.el6.x86_64 libcom_err-1.41.12-12.el6.x86_64 libgcrypt-1.4.5-9.el6_2.2.x86_64 libgpg-error-1.7-4.el6.x86_64 libjpeg-6b-46.el6.x86_64 libogg-1.1.4-2.1.el6.x86_64 libselinux-2.0.94-5.3.el6.x86_64 libsndfile-1.0.20-5.el6.x86_64 libtasn1-2.3-3.el6_2.1.x86_64 libuuid-2.17.2-12.7.el6.x86_64 libvorbis-1.2.3-4.el6_2.1.x86_64 libxcb-1.5-1.el6.x86_64 nss-softokn-freebl-3.12.9-11.el6.x86_64 openssl-1.0.0-20.el6_2.5.x86_64 pixman-0.18.4-1.el6_0.1.x86_64 pulseaudio-libs-0.9.21-13.el6.x86_64 tcp_wrappers-libs-7.6-57.el6.x86_64 zlib-1.2.3-27.el6.x86_64
(gdb) 
(gdb) 
(gdb) bt
#0  0x00007ffff574c8a5 in raise () from /lib64/libc.so.6
#1  0x00007ffff574e085 in abort () from /lib64/libc.so.6
#2  0x00007ffff5fa5c35 in spice_logv (log_domain=0x7ffff601cc4e "Spice", 
    log_level=SPICE_LOG_LEVEL_CRITICAL, strloc=0x7ffff60204ba "red_memslots.c:94", 
    function=0x7ffff602059f "validate_virt", 
    format=0x7ffff60202c8 "virtual address out of range\n    virt=0x%lx+0x%x slot_id=%d group_id=%d\n    slot=0x%lx-0x%lx delta=0x%lx", args=0x7fffe53fc6d0) at log.c:109
#3  0x00007ffff5fa5d6a in spice_log (log_domain=<value optimized out>, 
    log_level=<value optimized out>, strloc=<value optimized out>, function=<value optimized out>, 
    format=<value optimized out>) at log.c:123
#4  0x00007ffff5f66403 in validate_virt (info=<value optimized out>, virt=140738712433296, slot_id=0, 
    add_size=191, group_id=1) at red_memslots.c:90
#5  0x00007ffff5f66553 in get_virt (info=<value optimized out>, addr=<value optimized out>, 
    add_size=<value optimized out>, group_id=1, error=0x7fffe53fc87c) at red_memslots.c:142
#6  0x00007ffff5f68510 in red_get_native_drawable (slots=0x7fff501d3f20, group_id=1, 
    red=0x7fff50260b50, addr=<value optimized out>, flags=0) at red_parse_qxl.c:940
#7  red_get_drawable (slots=0x7fff501d3f20, group_id=1, red=0x7fff50260b50, 
    addr=<value optimized out>, flags=0) at red_parse_qxl.c:1111
#8  0x00007ffff5f810bb in red_process_commands (worker=0x7fff500008c0, ring_is_empty=0x7fffe53fcaac, 
    max_pipe_size=50) at red_worker.c:4900
#9  0x00007ffff5f84dfb in flush_display_commands (worker=0x7fff500008c0) at red_worker.c:9338
#10 flush_all_qxl_commands (worker=0x7fff500008c0) at red_worker.c:9421
#11 0x00007ffff5f85bc0 in dev_destroy_surfaces (opaque=<value optimized out>, 
    payload=<value optimized out>) at red_worker.c:10813
#12 handle_dev_destroy_surfaces (opaque=<value optimized out>, payload=<value optimized out>)
    at red_worker.c:10842
#13 0x00007ffff5f63cc7 in dispatcher_handle_single_read (dispatcher=0x7ffff94109e8) at dispatcher.c:139
#14 dispatcher_handle_recv_read (dispatcher=0x7ffff94109e8) at dispatcher.c:162
#15 0x00007ffff5f8488e in red_worker_main (arg=<value optimized out>) at red_worker.c:11782
#16 0x00007ffff7740851 in start_thread () from /lib64/libpthread.so.0
#17 0x00007ffff580167d in clone () from /lib64/libc.so.6
(gdb)

Comment 1 Qunfang Zhang 2012-10-19 11:04:17 UTC

Host A:
processor	: 3
vendor_id	: GenuineIntel
cpu family	: 6
model		: 23
model name	: Intel(R) Core(TM)2 Quad CPU    Q9550  @ 2.83GHz
stepping	: 10
cpu MHz		: 2826.041
cache size	: 6144 KB
physical id	: 0
siblings	: 4
core id		: 3
cpu cores	: 4
apicid		: 3
initial apicid	: 3
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 xsave lahf_lm dts tpr_shadow vnmi flexpriority
bogomips	: 5652.08
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:


Host B:

processor	: 3
vendor_id	: GenuineIntel
cpu family	: 6
model		: 42
model name	: Intel(R) Core(TM) i5-2400 CPU @ 3.10GHz
stepping	: 7
cpu MHz		: 1600.000
cache size	: 6144 KB
physical id	: 0
siblings	: 4
core id		: 3
cpu cores	: 4
apicid		: 6
initial apicid	: 6
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dts tpr_shadow vnmi flexpriority ept vpid
bogomips	: 6185.74
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

Comment 3 Orit Wasserman 2012-10-23 13:43:55 UTC

*** Bug 867816 has been marked as a duplicate of this bug. ***

Comment 4 Sibiao Luo 2012-10-30 08:46:02 UTC

*** Bug 871306 has been marked as a duplicate of this bug. ***

Comment 5 Yonit Halperin 2012-11-26 16:47:55 UTC

Hi Sibiao,

Please also attach the full qemu logs of the src and the destination, preferably with spice debug messages (export SPICE_DEBUG_LEVEL=5).

Thanks,
Yonit.

Comment 6 Sibiao Luo 2012-11-27 01:53:05 UTC

(In reply to comment #5)
> Hi Sibiao,
> 
> please also attach the full qemu logs of the src and the destination,
> preferably with a spice debug messages (export SPICE_DEBUG_LEVEL=5).
> 
OK, no problem. I will try it and post the results here.

Comment 7 Sibiao Luo 2012-11-27 11:07:42 UTC

(In reply to comment #5)
> please also attach the full qemu logs of the src and the destination,
> preferably with a spice debug messages (export SPICE_DEBUG_LEVEL=5).
> 
I tried the rhel6.4 guest 5 times with the comment #0 steps, but did not reproduce this issue, so I used the steps from bug 871306 to reproduce it.

host info:
kernel-2.6.32-342.el6.x86_64
qemu-kvm-0.12.1.2-2.327.el6.x86_64
spice-server-0.12.0-1.el6.x86_64
spice-gtk-0.14-5.el6.x86_64
seabios-0.6.1.2-25.el6.x86_64
guest info:
windows_7_ultimate_sp1_x64

steps:
the same as bug 871306

qemu-kvm command line:
eg: /usr/libexec/qemu-kvm -M rhel6.4.0 -cpu SandyBridge -enable-kvm -m 2048 -smp 4,sockets=2,cores=2,threads=1 -usb -device usb-tablet,id=input0 -name sluo_acpi -uuid 990ea161-6b67-47b2-b803-19fb01d30d30 -rtc base=localtime,clock=host,driftfix=slew -drive file=/mnt/windows_7_ultimate_sp1_x64.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none,werror=stop,rerror=stop -device virtio-blk-pci,drive=drive-virtio-disk0,id=virtio_drive,bus=pci.0,addr=0x3,bootindex=1 -netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup -device virtio-net-pci,netdev=hostnet0,id=virtio-net-pci0,mac=08:2E:5F:0A:0D:B1,bus=pci.0,addr=0x4 -device usb-ehci,id=ehci,addr=0x5 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 -spice port=5931,disable-ticketing,seamless-migration=on -vga qxl -global qxl-vga.vram_size=67108864 -device intel-hda,id=sound0,bus=pci.0,addr=0x7 -drive file=/mnt/my-data-disk.qcow2,if=none,id=drive-ide0-0-0,format=qcow2,cache=none,werror=stop,rerror=stop -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -boot menu=on -monitor stdio -incoming tcp:0:5888 2> /home/log.txt

Results:
After the migration finished, the guest aborted on the dst qemu. I will attach the full qemu logs of the src and the destination as you requested.
(qemu) info status 
VM status: paused (incoming-migration)
(qemu) [Thread 0x7ffff0062700 (LWP 15603) exited]
[New Thread 0x7ffff0062700 (LWP 15617)]
[New Thread 0x7fffe53fb700 (LWP 15618)]
[New Thread 0x7fff43fff700 (LWP 15619)]
[New Thread 0x7fff435fe700 (LWP 15620)]
id 0, group 0, virt start 0, virt end ffffffffffffffff, generation 0, delta 0
Detaching after fork from child process 15621.

Program received signal SIGABRT, Aborted.
[Switching to Thread 0x7fffe65fd700 (LWP 15608)]
0x00007ffff57498a5 in raise () from /lib64/libc.so.6

(gdb) bt
#0  0x00007ffff57498a5 in raise () from /lib64/libc.so.6
#1  0x00007ffff574b085 in abort () from /lib64/libc.so.6
#2  0x00007ffff5fa3c35 in spice_logv (log_domain=0x7ffff601ac4e "Spice", log_level=SPICE_LOG_LEVEL_CRITICAL, strloc=0x7ffff601e4ba "red_memslots.c:94", 
    function=0x7ffff601e59f "validate_virt", 
    format=0x7ffff601e2c8 "virtual address out of range\n    virt=0x%lx+0x%x slot_id=%d group_id=%d\n    slot=0x%lx-0x%lx delta=0x%lx", args=0x7fffe65fc890) at log.c:109
#3  0x00007ffff5fa3d6a in spice_log (log_domain=<value optimized out>, log_level=<value optimized out>, strloc=<value optimized out>, function=<value optimized out>, 
    format=<value optimized out>) at log.c:123
#4  0x00007ffff5f64403 in validate_virt (info=<value optimized out>, virt=0, slot_id=1, add_size=3145728, group_id=1) at red_memslots.c:90
#5  0x00007ffff5f64553 in get_virt (info=<value optimized out>, addr=<value optimized out>, add_size=<value optimized out>, group_id=1, error=0x7fffe65fca7c)
    at red_memslots.c:142
#6  0x00007ffff5f79717 in dev_create_primary_surface (worker=0x7fff440008c0, surface_id=<value optimized out>, surface=...) at red_worker.c:10976
#7  0x00007ffff5f79cf3 in handle_dev_create_primary_surface_async (opaque=<value optimized out>, payload=<value optimized out>) at red_worker.c:11187
#8  0x00007ffff5f61cc7 in dispatcher_handle_single_read (dispatcher=0x7ffff944b9e8) at dispatcher.c:139
#9  dispatcher_handle_recv_read (dispatcher=0x7ffff944b9e8) at dispatcher.c:162
#10 0x00007ffff5f8288e in red_worker_main (arg=<value optimized out>) at red_worker.c:11782
#11 0x00007ffff7740851 in start_thread () from /lib64/libpthread.so.0
#12 0x00007ffff57ff90d in clone () from /lib64/libc.so.6
(gdb) q

Comment 8 Sibiao Luo 2012-11-27 11:08:53 UTC

Created attachment 652609
the source qemu logs.

Comment 9 Sibiao Luo 2012-11-27 11:09:28 UTC

Created attachment 652610
the destination qemu log.

Comment 10 Yonit Halperin 2012-11-27 19:32:44 UTC

(In reply to comment #7)
Thanks Sibiao,

The crash you hit is equivalent to bug #874574.
However, this bug is specific to RHEL 5.9 guests, when the qxl device is in compatibility mode.


Comment 11 David Blechter 2012-11-27 22:35:50 UTC

Low priority for RHEL 5.9 guest -> moving to 6.5 and removing the blocker flag.

Comment 13 Sibiao Luo 2012-11-28 02:09:51 UTC

(In reply to comment #10)
> (In reply to comment #7)
> Thanks Sibiao,
> 
> The crash you hit is equivalent to bug #874574.
> However this bug is specific to Rhel5.9 guests, when the qxl device is in
> compatibility mode.
> 
Yes, thanks for the reminder. That is why I tested 5 times but did not hit it: I did not notice that this bug is specific to the rhel5.9 guest. Should I retest with a rhel5.9 guest to provide detailed logs?
Btw, as you said that bug #871306 is not a duplicate of this bug #868256, could you help me check whether it is a duplicate of bug #874574? Then I can decide whether to reopen it or close it as a duplicate of bug #874574.

Best Regards.
sluo

Comment 14 Yonit Halperin 2012-11-28 15:46:39 UTC

(In reply to comment #13)
> (In reply to comment #10)
> > (In reply to comment #7)
> > Thanks Sibiao,
> > 
> > The crash you hit is equivalent to bug #874574.
> > However this bug is specific to Rhel5.9 guests, when the qxl device is in
> > compatibility mode.
> > 
> Yes, thanks for the reminder. That is why I tested 5 times but did not hit
> it: I did not notice that this bug is specific to the rhel5.9 guest. Should I
> retest with a rhel5.9 guest to provide detailed logs?
> Btw, as you said that bug #871306 is not a duplicate of this bug #868256,
> could you help me check whether it is a duplicate of bug #874574? Then I can
> decide whether to reopen it or close it as a duplicate of bug #874574.
> 
> Best Regards.
> sluo

Hi,

bug #871306 matches bug #874574, so you can change it to be a duplicate of it.
It would be great if you could supply logs for the 5.9 guest. Also, please increase the qxl debug level by adding "-global qxl-vga.debug=1".

Thanks,
Yonit.

Comment 15 Yonit Halperin 2012-11-30 20:59:54 UTC

This bug happens under the following conditions:
(1) Rhel5 qxl driver (or 0.4 Windows driver, for windows guest)
(2) migration occurs while qxl device is in VGA mode

The cause of the bug is that the devram memslot is added with the wrong delta
in qxl_create_memslots:

qxl_add_memslot(d, i, 0, QXL_SYNC);

instead of:

qxl_add_memslot(d, i, d->pci.io_regions[QXL_RAM_RANGE_INDEX].addr, QXL_SYNC);
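
For readers unfamiliar with memslots, here is a minimal sketch of why the delta matters. It assumes a simplified model in which a host address is the guest offset plus the slot's delta, and the result must fall inside the slot's host range. The struct and both function names are hypothetical illustrations, not the spice-server or qemu definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified memslot; not the spice-server structure. */
struct memslot {
    uint64_t virt_start; /* first host virtual address covered by the slot */
    uint64_t virt_end;   /* end of the host range covered by the slot */
    uint64_t delta;      /* added to a guest offset to get a host address */
};

/* Translate a guest offset into a host virtual address (assumed model). */
uint64_t slot_translate(const struct memslot *s, uint64_t guest_off)
{
    return guest_off + s->delta;
}

/* True if [virt, virt + size) lies inside the slot's host range. */
bool slot_contains(const struct memslot *s, uint64_t virt, uint32_t size)
{
    return virt >= s->virt_start &&
           virt <= s->virt_end &&
           size <= s->virt_end - virt;
}
```

With the slot bounds from the log in comment #0 (0x7fff57c00000 to 0x7fff5bc00000), the logged address 0x800048f5f290 with size 0xbf fails slot_contains. In this model, an offset translated with delta 0 also fails, because it never reaches the slot's host range.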

Comment 16 Yonit Halperin 2012-12-03 18:59:08 UTC

(In reply to comment #15)
> This bug happens under the following conditions
> (1) Rhel5 qxl driver (or 0.4 Windows driver, for windows guest)
> (2) migration occurs while qxl device is in VGA mode
> 
> The cause of the bug is that the devram memslot is added with the wrong
> delta
> in qxl_create_memslots:
> 
> qxl_add_memslot(d, i, 0, QXL_SYNC);
> 
> instead of:
> 
> qxl_add_memslot(d, i, d->pci.io_regions[QXL_RAM_RANGE_INDEX].addr, QXL_SYNC);

Actually, the real problem is that the cmd ring is not empty while we are in VGA mode. Gerd, can you please have a look at this?
When we enter VGA mode on the src side, the ring is empty. When I checked the ring state again in the pre_save routine, it was not empty.

Comment 17 Gerd Hoffmann 2012-12-04 08:04:35 UTC

Guest bug?  There is nothing which prevents the guest from sticking commands into the ring in vga mode, even though qxl will never ever process them.  Although I can't see how migration changes that, on the destination host qxl should ignore the ring in vga mode too.  And when the guest enters compat mode qxl should reinitialize the memory slot properly ...

Comment 18 Yonit Halperin 2012-12-04 13:52:00 UTC

(In reply to comment #17)
> Guest bug?  There is nothing which prevents the guest from sticking commands
> into the ring in vga mode, even though qxl will never ever process them. 
> Although I can't see how migration changes that, on the destination host qxl
> should ignore the ring in vga mode too.  And when the guest enters compat
> mode qxl should reinitialize the memory slot properly ...

Before destroying the primary surface, or all the surfaces (as part of a hard_reset), the mode changes from VGA to UNDEFINED. Then, when spice-server reads commands, it receives the ones that are on the devram.
This didn't raise errors before migration, because the devram memslot had already been added. However, when migrating in VGA mode, the devram memslot is not added.

Comment 19 Gerd Hoffmann 2012-12-04 16:16:07 UTC

Ok, so the sequence to trigger this is:

  (1) boot guest
  (2) start x11 (vga -> compat)
  (3) ask guest to reboot
  (4) stop x11 (compat ->vga, but leaving commands in the ring)
  (5) live migrate here
  (6) reset system, including qxl (vga -> undefined, spice reads
              stray commands and crashes due to missing memslot)

Correct?

Comment 20 Yonit Halperin 2012-12-05 16:09:23 UTC

(In reply to comment #19)
> Ok,  So the sequence to trigger this is:
> 
>   (1) boot guest
>   (2) start x11 (vga -> compat)
>   (3) ask guest to reboot
>   (4) stop x11 (compat ->vga, but leaving commands in the ring)
>   (5) live migrate here
>   (6) reset system, including qxl (vga -> undefined, spice reads
>               stray commands and crashes due to missing memslot)
> 
> Correct?

Almost correct: In step 4, when moving to vga mode, the ring is empty.
It gets filled somehow during vga mode.
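
The trigger sequence, including the correction to step 4, can be sketched as a toy state model. Everything here (the enum, the struct, and the toy_* functions) is a hypothetical illustration of the discussion, not qemu or spice-server code. A false return from toy_reset stands in for the abort seen in the backtraces.

```c
#include <stdbool.h>

/* Hypothetical toy model of the device state discussed in this thread. */
enum qxl_mode { MODE_UNDEFINED, MODE_VGA, MODE_COMPAT };

struct toy_qxl {
    enum qxl_mode mode;
    int ring_cmds;    /* commands pending in the command ring */
    bool devram_slot; /* is the devram memslot registered? */
};

/* Step 2: X11 starts (vga -> compat); the devram memslot gets added. */
void toy_start_x11(struct toy_qxl *d)
{
    d->mode = MODE_COMPAT;
    d->devram_slot = true;
}

/* Step 4: X11 stops (compat -> vga); the ring is empty at this point. */
void toy_stop_x11(struct toy_qxl *d)
{
    d->mode = MODE_VGA;
    d->ring_cmds = 0;
}

/* Commands appear in the ring during VGA mode (cause not yet known). */
void toy_stray_commands(struct toy_qxl *d, int n)
{
    d->ring_cmds += n;
}

/* Step 5: after migrating in VGA mode the devram memslot is missing. */
void toy_migrate(struct toy_qxl *d)
{
    if (d->mode == MODE_VGA)
        d->devram_slot = false;
}

/* Step 6: reset (vga -> undefined); pending commands are then read.
 * Returns false where the real server would hit the out-of-range abort. */
bool toy_reset(struct toy_qxl *d)
{
    d->mode = MODE_UNDEFINED;
    bool ok = (d->ring_cmds == 0) || d->devram_slot;
    d->ring_cmds = 0;
    return ok;
}
```

In this model the reset only fails when both conditions hold: stray commands are in the ring and the migration happened in VGA mode. Dropping either one makes toy_reset return true.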

Comment 21 Gerd Hoffmann 2012-12-06 07:58:13 UTC

Should we just reinitialize the rings in qxl_hard_reset?
I think that would be a good idea anyway, and it should fix this issue too.

Oh, I see we already do that, but after going into undefined mode.  So just moving up the qxl_reset_state() should do it (but needs careful testing to make sure we don't have other unwanted side effects).
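
A rough sketch of the reordering idea, under the assumption that leaving VGA mode causes the server to read whatever is still in the ring. The struct and the toy_* functions are hypothetical and do not reproduce the actual qxl_hard_reset or qxl_reset_state bodies, which are not shown in this thread.

```c
#include <stdbool.h>

/* Hypothetical toy device; bad_reads counts commands read without a slot. */
struct toy_dev {
    int ring_cmds;    /* commands pending in the ring */
    bool devram_slot; /* is the devram memslot registered? */
    int bad_reads;    /* reads that would fail address validation */
};

/* Entering undefined mode: the server drains the ring (assumed model). */
void toy_enter_undefined(struct toy_dev *d)
{
    if (d->ring_cmds > 0 && !d->devram_slot)
        d->bad_reads += d->ring_cmds;
    d->ring_cmds = 0;
}

/* Stand-in for reinitializing the rings. */
void toy_reset_rings(struct toy_dev *d)
{
    d->ring_cmds = 0;
}

/* Ordering as described above: rings are reset only after the mode change. */
void toy_hard_reset_current(struct toy_dev *d)
{
    toy_enter_undefined(d);
    toy_reset_rings(d);
}

/* Proposed ordering: reset the rings first, then change mode. */
void toy_hard_reset_proposed(struct toy_dev *d)
{
    toy_reset_rings(d);
    toy_enter_undefined(d);
}
```

With three stray commands and no devram memslot, the first ordering records three bad reads in this model and the second records none. Whether the real reordering has other side effects is exactly what the careful testing mentioned above would need to establish.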

Comment 22 Ademar Reis 2013-03-25 15:46:46 UTC

Is this really a regression? Please confirm that RHEL6.3 is not affected.

Comment 23 Qunfang Zhang 2013-03-28 07:42:51 UTC

Hi, Ademar
I re-tested with the same steps and command line with a rhel5.9 guest, on both rhel6.3 and rhel6.4 hosts; both can reproduce the bug. The Regression keyword was probably set earlier because the bug did not reproduce on rhel6.3 with a "rhel6" guest.
Now I will clear the Regression keyword.

I tested on rhel6.4 and rhel6.3 hosts with the following versions just now; both can reproduce:
(1) RHEL6.3 host:
kernel-2.6.32-279.9.1.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.295.el6_3.2.x86_64

(2) RHEL6.4 host:
2.6.32-358.2.1.el6.x86_64
qemu-kvm-0.12.1.2-2.355.el6.x86_64

The bt log on rhel6.3 host:
(gdb) 
#0  0x00007ffff57768a5 in raise () from /lib64/libc.so.6
#1  0x00007ffff5778085 in abort () from /lib64/libc.so.6
#2  0x00007ffff5f8cb67 in validate_virt (info=<value optimized out>, virt=<value optimized out>, 
    slot_id=<value optimized out>, add_size=<value optimized out>, group_id=<value optimized out>) at red_memslots.c:86
#3  0x00007ffff5f8cc0c in get_virt (info=<value optimized out>, addr=<value optimized out>, 
    add_size=<value optimized out>, group_id=1) at red_memslots.c:125
#4  0x00007ffff5f8e51e in red_get_native_drawable (slots=0x7fffeceedaa8, group_id=1, red=0x7fff540c25d0, 
    addr=<value optimized out>, flags=0) at red_parse_qxl.c:770
#5  red_get_drawable (slots=0x7fffeceedaa8, group_id=1, red=0x7fff540c25d0, addr=<value optimized out>, flags=0)
    at red_parse_qxl.c:925
#6  0x00007ffff5fa24e8 in red_process_commands (worker=0x7fffecd1a6c0, ring_is_empty=0x7fffecd1a4fc, max_pipe_size=50)
    at red_worker.c:4839
#7  0x00007ffff5fa4dcb in flush_display_commands (worker=0x7fffecd1a6c0) at red_worker.c:9156
#8  flush_all_qxl_commands (worker=0x7fffecd1a6c0) at red_worker.c:9239
#9  0x00007ffff5fa5d39 in dev_destroy_surfaces (opaque=0x7fffecd1a6c0, payload=<value optimized out>) at red_worker.c:10535
#10 handle_dev_destroy_surfaces (opaque=0x7fffecd1a6c0, payload=<value optimized out>) at red_worker.c:10564
#11 0x00007ffff5f8aaf3 in dispatcher_handle_single_read (dispatcher=0x7ffff92b3aa8) at dispatcher.c:120
#12 dispatcher_handle_recv_read (dispatcher=0x7ffff92b3aa8) at dispatcher.c:143
#13 0x00007ffff5fa4abc in red_worker_main (arg=<value optimized out>) at red_worker.c:11335
#14 0x00007ffff7748851 in start_thread () from /lib64/libpthread.so.0
#15 0x00007ffff582b67d in clone () from /lib64/libc.so.6
(gdb)

Comment 28 Andrei Stepanov 2016-08-02 14:24:30 UTC

As I understand it, this bug is against the QXL device in Qemu, in particular how a RHEL5 guest uses the QXL device.
Previous posts show that this bug is not reproducible with RHEL6.x guests.
David Blechter: please confirm that we support RHEL5 as a guest system.
