Bug 1004443 - Qemu crash during migration with reboot
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: spice-server
Version: 6.5
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Uri Lublin
QA Contact: Desktop QE
Depends On:
Blocks: 1016795
Reported: 2013-09-04 12:07 EDT by Marian Krcmarik
Modified: 2014-10-14 01:04 EDT
CC List: 5 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Bad cleanup of pending data from the client.
Consequence: QEMU would abort if a VM was migrated while it was being rebooted.
Fix: Clean up the pending client data appropriately to avoid the assertion and crash.
Result:
Story Points: ---
Clone Of:
: 1016795
Environment:
Last Closed: 2014-10-14 01:04:38 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
backtrace from core dump (4.20 KB, text/plain)
2013-09-04 12:08 EDT, Marian Krcmarik
autotest log (31.50 KB, text/plain)
2013-09-04 12:09 EDT, Marian Krcmarik


External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2014:1435 normal SHIPPED_LIVE spice-server bug fix update 2014-10-13 21:06:04 EDT

Description Marian Krcmarik 2013-09-04 12:07:13 EDT
Description of problem:
The qemu process is sometimes aborted during migration with reboot of a Spice session. This is caught by an automated test in the autotest virttest framework. The abort seems to happen right after the client migrate info is sent; see the attached autotest log or this snippet from the log:
09/03 15:22:13 DEBUG|qemu_monit:0526| Send command: __com.redhat_spice_migrate_info 192.168.122.1 3001 3201 "C=CZ,L=BRNO,O=SPICE,CN=192.168.122.1"
09/03 15:22:26 INFO |   aexpect:0816| [qemu output] (/usr/libexec/qemu-kvm:14005): SpiceWorker-Warning **: red_worker.c:11009:red_wait_outgoing_item: timeout
09/03 15:22:26 INFO |   aexpect:0816| [qemu output] (/usr/libexec/qemu-kvm:14005): SpiceWorker-ERROR **: red_worker.c:11473:dev_destroy_primary_surface: assertion `!worker->surfaces[surface_id].context.canvas' failed

I am attaching a backtrace from the core dump; the core dump itself can be provided as well if needed.

Version-Release number of selected component (if applicable):
spice-server-0.12.4-3.el6.x86_64
qemu-kvm-0.12.1.2-2.398.el6.x86_64

How reproducible:
1/2

Steps to Reproduce:
1. Establish Spice session to a RHEL6 VM
2. Start reboot of the VM
3. Immediately start migration.
An SSL connection was used for the main and inputs channels.

Actual results:
Qemu abort

Expected results:
Successful migration

Additional info:
Comment 1 Marian Krcmarik 2013-09-04 12:08:00 EDT
Created attachment 793748 [details]
backtrace from core dump
Comment 2 Marian Krcmarik 2013-09-04 12:09:30 EDT
Created attachment 793749 [details]
autotest log
Comment 4 Yonit Halperin 2013-09-10 09:39:03 EDT
Hi,

Can you attach the qemu log file, preferably with increased spice-debug level?
Does the abort occur between the client_migrate_info and migrate commands, or after the migrate command?
Comment 5 Marian Krcmarik 2013-09-10 11:15:32 EDT
(In reply to Yonit Halperin from comment #4)
> Hi,
> 
> Can you attach the qemu log file, preferably with increased spice-debug
> level?
> Does the abort occur between the client_migrate_info and migrate commands,
> or after the migrate command?

I would need to rerun the test and try to get a debug log. There is not much in the non-debug log; it only shows that the crash happens after client_migrate_info and before the migrate command, right after:
SpiceWorker-Warning **: red_worker.c:11009:red_wait_outgoing_item: timeout
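The ordering described here can also be checked mechanically from an autotest log. The following is a minimal sketch, assuming log lines in the format quoted in the description. The function name `abort_position` is hypothetical, and the pattern for the plain `migrate` monitor command is an assumption, since that command does not appear in the quoted excerpt.

```python
import re

# Patterns are assumptions based on the log excerpt quoted in the description.
MIGRATE_INFO = re.compile(r"Send command: \S*migrate_info\b")
MIGRATE = re.compile(r"Send command: migrate\b")  # assumed format, not seen in the excerpt
ASSERTION = re.compile(r"SpiceWorker-ERROR \*\*: .*assertion")


def abort_position(lines):
    """Report where the assertion appears relative to the migration commands."""
    seen_info = seen_migrate = False
    for line in lines:
        if MIGRATE_INFO.search(line):
            seen_info = True
        elif MIGRATE.search(line):
            seen_migrate = True
        elif ASSERTION.search(line):
            if seen_migrate:
                return "after migrate"
            if seen_info:
                return "between migrate_info and migrate"
            return "before migrate_info"
    return "no assertion found"


# The three lines quoted in the description of this bug.
sample = [
    '09/03 15:22:13 DEBUG|qemu_monit:0526| Send command: __com.redhat_spice_migrate_info 192.168.122.1 3001 3201 "C=CZ,L=BRNO,O=SPICE,CN=192.168.122.1"',
    "09/03 15:22:26 INFO |   aexpect:0816| [qemu output] (/usr/libexec/qemu-kvm:14005): SpiceWorker-Warning **: red_worker.c:11009:red_wait_outgoing_item: timeout",
    "09/03 15:22:26 INFO |   aexpect:0816| [qemu output] (/usr/libexec/qemu-kvm:14005): SpiceWorker-ERROR **: red_worker.c:11473:dev_destroy_primary_surface: assertion `!worker->surfaces[surface_id].context.canvas' failed",
]
print(abort_position(sample))  # prints "between migrate_info and migrate"
```

Running this over the full attached log would give the same classification only if the migrate command is logged in the assumed form.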
Comment 6 Yonit Halperin 2013-09-10 11:40:12 EDT
(In reply to Marian Krcmarik from comment #5)
> (In reply to Yonit Halperin from comment #4)
> > Hi,
> > 
> > Can you attach the qemu log file, preferably with increased spice-debug
> > level?
> > Does the abort occur between the client_migrate_info and migrate commands,
> > or after the migrate command?
> 
> I would need to rerun the test and try to get a debug log. There is not much in
> the non-debug log; it only shows that the crash happens after
> client_migrate_info and before the migrate command, right after:
> SpiceWorker-Warning **: red_worker.c:11009:red_wait_outgoing_item: timeout
Actually, this timeout explains the assert. However, the timeout occurred because a message could not be sent to the client for 30 seconds. We need to fix the bug that leads to the assert, but it is still unclear why the client was not responsive. If you manage to reproduce the bug, please attach the client log with an increased log level. The bug is not directly related to migration, unless client_migrate_info caused the client to be unresponsive for a while.
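To illustrate how a timeout can lead to the assertion, here is a toy model in Python. It is not spice-server code: the class, the function names, and the `cleanup_on_timeout` flag are all hypothetical, and it assumes that a surface's canvas is released only once no queued items for the client remain.

```python
class Surface:
    """Toy stand-in for a worker surface (hypothetical, not the real struct)."""

    def __init__(self):
        self.canvas = object()    # stands in for context.canvas
        self.pending = ["draw"]   # items still queued for the client


def wait_outgoing_item(surface, client_responsive):
    """Return True if the pending item was sent, False on timeout."""
    if client_responsive:
        surface.pending.clear()
        return True
    return False                  # models the logged "timeout" warning


def destroy_primary_surface(surface, cleanup_on_timeout):
    sent = wait_outgoing_item(surface, client_responsive=False)
    if not sent and cleanup_on_timeout:
        surface.pending.clear()   # drop data the client never consumed
    if not surface.pending:
        surface.canvas = None     # assumed: canvas is freed once nothing is queued
    assert surface.canvas is None, "canvas still present"


destroy_primary_surface(Surface(), cleanup_on_timeout=True)   # passes
try:
    destroy_primary_surface(Surface(), cleanup_on_timeout=False)
except AssertionError as exc:
    print("aborted:", exc)        # prints "aborted: canvas still present"
```

In this model the unresponsive client is the trigger, but the abort itself comes from the missing cleanup after the timeout, which is why the two issues can be handled separately.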
Comment 7 Yonit Halperin 2013-09-12 16:12:29 EDT
Patches have been posted to fix the abort (http://patchwork.freedesktop.org/patch/14639/). However, if this error is reproducible with a reboot after client_migrate_info (I couldn't reproduce it), another bug should be opened to investigate why the client display channel is not responsive.
Comment 10 errata-xmlrpc 2014-10-14 01:04:38 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1435.html
