Bug 962954 - qemu 1.4.1-1.fc19: migrate to file crash from spice_server_migrate_end
Summary: qemu 1.4.1-1.fc19: migrate to file crash from spice_server_migrate_end
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: qemu
Version: 19
Hardware: x86_64
OS: Unspecified
unspecified
high
Target Milestone: ---
Assignee: Fedora Virtualization Maintainers
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: abrt_hash:34d471f5ccd721fb0eb770d05d8...
: 965452 (view as bug list)
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2013-05-14 20:21 UTC by Zeeshan Ali
Modified: 2016-09-20 01:42 UTC (History)
18 users (show)

Fixed In Version: qemu-1.4.1-3.fc19
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-06-10 18:33:20 UTC
Type: ---
Embargoed:


Attachments (Terms of Use)
File: backtrace (63.60 KB, text/plain)
2013-05-14 20:21 UTC, Zeeshan Ali
no flags Details
File: cgroup (139 bytes, text/plain)
2013-05-14 20:21 UTC, Zeeshan Ali
no flags Details
File: core_backtrace (1.42 KB, text/plain)
2013-05-14 20:21 UTC, Zeeshan Ali
no flags Details
File: dso_list (8.23 KB, text/plain)
2013-05-14 20:21 UTC, Zeeshan Ali
no flags Details
File: environ (308 bytes, text/plain)
2013-05-14 20:21 UTC, Zeeshan Ali
no flags Details
File: limits (1.29 KB, text/plain)
2013-05-14 20:21 UTC, Zeeshan Ali
no flags Details
File: maps (50.99 KB, text/plain)
2013-05-14 20:21 UTC, Zeeshan Ali
no flags Details
File: open_fds (1.24 KB, text/plain)
2013-05-14 20:21 UTC, Zeeshan Ali
no flags Details
File: proc_pid_status (932 bytes, text/plain)
2013-05-14 20:21 UTC, Zeeshan Ali
no flags Details
File: var_log_messages (1.29 KB, text/plain)
2013-05-14 20:21 UTC, Zeeshan Ali
no flags Details

Description Zeeshan Ali 2013-05-14 20:21:00 UTC
Description of problem:
Tried to (managed) save a libvirt domain from gnome-boxes.

Version-Release number of selected component:
qemu-system-x86-1.4.1-1.fc19

Additional info:
reporter:       libreport-2.1.4
backtrace_rating: 4
cmdline:        /usr/bin/qemu-system-x86_64 -machine accel=kvm -name fedora18-2 -S -M pc-1.2 -cpu Westmere,+rdtscp,+pcid,+pdcm,+xtpr,+tm2,+est,+smx,+vmx,+ds_cpl,+monitor,+dtes64,+pclmuldq,+pbe,+tm,+ht,+ss,+acpi,+ds,+vme -m 1152 -smp 4,sockets=1,cores=2,threads=2 -uuid 444f6b31-e7f0-c6f4-5d72-3dbbe8ac837a -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/home/zeenix/.config/libvirt/qemu/lib/fedora18-2.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -no-kvm-pit-reinjection -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x5.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x5 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x5.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x5.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x6 -drive file=/home/zeenix/.local/share/gnome-boxes/images/fedora18-2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive if=none,id=drive-ide0-1-0,readonly=on,format=raw -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev user,id=hostnet0 -device rtl8139,netdev=hostnet0,id=net0,mac=52:54:00:6a:44:79,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev spicevmc,id=charchannel0,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 -device usb-tablet,id=input0 -spice port=5900,addr=127.0.0.1,disable-ticketing,image-compression=off,seamless-migration=on -vga qxl -global qxl-vga.ram_size=67108864 -global qxl-vga.vram_size=67108864 -device AC97,id=sound0,bus=pci.0,addr=0x4 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1 -chardev spicevmc,id=charredir2,name=usbredir -device usb-redir,chardev=charredir2,id=redir2 -chardev spicevmc,id=charredir3,name=usbredir -device usb-redir,chardev=charredir3,id=redir3 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8
crash_function: spice_logv
executable:     /usr/bin/qemu-system-x86_64
kernel:         3.9.1-301.fc19.x86_64
runlevel:       N 5
uid:            500
xsession_errors: 

Truncated backtrace:
Thread no. 1 (9 frames)
 #2 spice_logv at log.c:109
 #3 spice_log at log.c:123
 #4 red_client_destroy at red_channel.c:1711
 #5 reds_client_disconnect at reds.c:561
 #6 reds_disconnect at reds.c:589
 #7 spice_server_migrate_end at reds.c:4400
 #8 notifier_list_notify at util/notify.c:39
 #9 migrate_fd_completed at migration.c:294
 #10 buffered_file_thread at migration.c:716

Comment 1 Zeeshan Ali 2013-05-14 20:21:05 UTC
Created attachment 747903 [details]
File: backtrace

Comment 2 Zeeshan Ali 2013-05-14 20:21:08 UTC
Created attachment 747904 [details]
File: cgroup

Comment 3 Zeeshan Ali 2013-05-14 20:21:10 UTC
Created attachment 747905 [details]
File: core_backtrace

Comment 4 Zeeshan Ali 2013-05-14 20:21:13 UTC
Created attachment 747906 [details]
File: dso_list

Comment 5 Zeeshan Ali 2013-05-14 20:21:16 UTC
Created attachment 747907 [details]
File: environ

Comment 6 Zeeshan Ali 2013-05-14 20:21:18 UTC
Created attachment 747908 [details]
File: limits

Comment 7 Zeeshan Ali 2013-05-14 20:21:21 UTC
Created attachment 747909 [details]
File: maps

Comment 8 Zeeshan Ali 2013-05-14 20:21:23 UTC
Created attachment 747910 [details]
File: open_fds

Comment 9 Zeeshan Ali 2013-05-14 20:21:26 UTC
Created attachment 747911 [details]
File: proc_pid_status

Comment 10 Zeeshan Ali 2013-05-14 20:21:28 UTC
Created attachment 747912 [details]
File: var_log_messages

Comment 11 Zeeshan Ali 2013-05-16 14:34:59 UTC
Setting severity to high as this means Boxes users can't save their VMs on exit.

Comment 12 Zeeshan Ali 2013-05-16 15:48:30 UTC
I can't reproduce this against qemu-1.4.1-3.fc19.

Comment 13 Christophe Fergeau 2013-05-16 15:54:14 UTC
You probably want to add karma to the update then.

Comment 14 Cole Robinson 2013-05-20 19:21:00 UTC
Actually I can reproduce this consistently even with latest f19 package. Guest is F18, gnome shell, logged into the guest with a simple echo loop running in a terminal there. Do 'virsh managed-save f18' and it crashes with 30 seconds.

Command line and output:

LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=spice TMPDIR=/var/tmp /usr/bin/qemu-kvm -name ztester-f18-migrate-qemu1.4 -S -machine pc-i440fx-1.4,accel=kvm,usb=off -cpu Opteron_G4,+nodeid_msr,+wdt,+skinit,+ibs,+osvw,+cr8legacy,+extapic,+cmp_legacy,+fxsr_opt,+mmxext,+osxsave,+monitor,+ht,+vme -m 2048 -smp 2,sockets=2,cores=1,threads=1 -uuid ecca58bd-f3b5-4538-abc3-e26399f9f3cb -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/ztester-f18-migrate-qemu1.4.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive file=/mnt/data/devel/images/ztester-f18-migrate-overlay.qcow2,if=none,id=drive-virtio-disk0,format=qcow2 -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=25,id=hostnet0,vhost=on,vhostfd=26 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:40:ee:0b,bus=pci.0,addr=0x3 -chardev spicevmc,id=charchannel0,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 -chardev pty,id=charconsole0 -device virtconsole,chardev=charconsole0,id=console0 -device usb-tablet,id=input0 -spice port=5900,addr=127.0.0.1,disable-ticketing,seamless-migration=on -vga qxl -global qxl-vga.ram_size=67108864 -global qxl-vga.vram_size=67108864 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7 -snapshot
Domain id=2 is tainted: custom-argv
char device redirected to /dev/pts/3 (label charconsole0)
main_channel_link: add main channel client
main_channel_handle_parsed: net test: latency 11.708000 ms, bitrate 9706161137 bps (9256.516587 Mbps)
inputs_connect: inputs channel client create
red_dispatcher_set_cursor_peer: 
main_channel_handle_parsed: agent start
__spice_char_device_write_buffer_get: internal buf is not available
__spice_char_device_write_buffer_get: internal buf is not available
__spice_char_device_write_buffer_get: internal buf is not available
__spice_char_device_write_buffer_get: internal buf is not available
__spice_char_device_write_buffer_get: internal buf is not available
__spice_char_device_write_buffer_get: internal buf is not available
__spice_char_device_write_buffer_get: internal buf is not available
__spice_char_device_write_buffer_get: internal buf is not available
__spice_char_device_write_buffer_get: internal buf is not available
((null):22588): Spice-Warning **: reds.c:4399:spice_server_migrate_end: spice_server_migrate_info was not called, disconnecting clients
red_client_destroy: destroy client with #channels 6
((null):22588): Spice-ERROR **: red_channel.c:1711:red_client_destroy: assertion `pthread_equal(pthread_self(), client->thread_id)' failed



Relevant backtrace:


#0  0x00007fc27d92e0cd in read () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007fc2796addf1 in read (__nbytes=255, __buf=0x7fc137ffe840, __fd=<optimized out>) at /usr/include/bits/unistd.h:44
#2  spice_backtrace_gstack () at backtrace.c:100
#3  0x00007fc2796adf39 in spice_backtrace () at backtrace.c:131
#4  0x00007fc2796b5597 in spice_logv (log_domain=0x7fc27972b266 "Spice", log_level=SPICE_LOG_LEVEL_ERROR, strloc=0x7fc27972dfe0 "red_channel.c:1711", function=0x7fc27972e0f0 <__FUNCTION__.22513> "red_client_destroy", format=0x7fc27972b23e "assertion `%s' failed", args=args@entry=0x7fc137ffe9d8) at log.c:108
#5  0x00007fc2796b56e8 in spice_log (log_domain=log_domain@entry=0x7fc27972b266 "Spice", log_level=log_level@entry=SPICE_LOG_LEVEL_ERROR, strloc=strloc@entry=0x7fc27972dfe0 "red_channel.c:1711", function=function@entry=0x7fc27972e0f0 <__FUNCTION__.22513> "red_client_destroy", format=format@entry=0x7fc27972b23e "assertion `%s' failed") at log.c:123
#6  0x00007fc279674db0 in red_client_destroy (client=client@entry=0x7fc2841e4b80) at red_channel.c:1711
#7  0x00007fc279699a75 in reds_client_disconnect (client=0x7fc2841e4b80) at reds.c:561
#8  0x00007fc279699ff1 in reds_disconnect () at reds.c:589
#9  0x00007fc27969e7ff in spice_server_migrate_end (s=<optimized out>, completed=1) at reds.c:4400
#10 0x00007fc27fa27817 in notifier_list_notify ()
#11 0x00007fc27f89577d in buffered_file_thread ()
#12 0x00007fc27d927c53 in start_thread (arg=0x7fc137fff700) at pthread_create.c:308
#13 0x00007fc2789bcecd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113



Alon, Hans, Yonit, any thoughts?

Comment 15 Alon Levy 2013-05-20 21:10:55 UTC
Yes - seems that now the migration notification list callbacks are not done from the main thread, but from a separate thread, and it triggers an assert. Or at least for a migrate to file. A possible fix would be instead of doing an assert to rpc the client_destroy to the main thread via a dispatcher. We already have such a dispatcher, I never considered if we can write to the dispatcher pipe from different threads and assume atomicity, (i.e. thread1:write(<fd>, <buf1>, <len1>) is atomic wrt thread2:write(<fd>, <buf2>, <len2>))

Comment 16 Cole Robinson 2013-05-21 01:32:03 UTC
I dug a bit more, and doesn't look like qemu.git is affected. This patch fixes things:

commit bb1fadc444ff967554c41d96cb9dde110e8aece9
Author: Paolo Bonzini <pbonzini>
Date:   Fri Feb 22 17:36:21 2013 +0100

    migration: cleanup migration (including thread) in the iothread
    
    Perform final cleanup in a bottom half, and add joining the thread to
    the series of cleanup actions.
    
    migrate_fd_error remains for connection error, but it doesn't need
    to cleanup anything anymore.


But it's in the middle of a large series of migration cleanups so doesn't look like an easy backport.

Comment 17 Alon Levy 2013-05-22 01:10:32 UTC
*** Bug 965452 has been marked as a duplicate of this bug. ***

Comment 18 Alon Levy 2013-05-22 01:12:59 UTC
Is there any plan to rebase qemu for f19?

Comment 19 Paolo Bonzini 2013-05-22 11:22:14 UTC
Backporting that series is safe (at least it won't break anything outside migration :)).

You'd have to backport also the 4 patches ending at 90f8ae724a575861f093fbdbfd49a925bcfec327.

You could choose to stop the backport where the bug is fixed, or go on until  6f190a0641f5b06a462b62955c15c77b8fb3990c.  The series had a _lot_ of testing, and is safe.

Comment 20 Alon Levy 2013-05-23 03:28:59 UTC
I've done the rebase, scratch build is here:

 http://koji.fedoraproject.org/koji/taskinfo?taskID=5412961

I've started a non scratch build.

Comment 21 Alon Levy 2013-05-23 23:15:10 UTC
Finally managed to do a build

http://koji.fedoraproject.org/koji/taskinfo?taskID=5415645

<rant>
fedpkg doesn't support a http/https proxy, and I'm stuck behind one, finally installed fedpkg & all the prerequisites inside the VPN.
</rant>

Comment 22 Alon Levy 2013-06-03 13:48:37 UTC
Moving to RELEASE_PENDING, fix is in from 1.4.1-3 (currently last is 1.4.2-2).

Comment 23 Cole Robinson 2013-06-10 18:33:20 UTC
Since it's already queued for F19, closing as CURRENTRELEASE


Note You need to log in before you can comment on or make changes to this bug.