Bug 1004443 - Qemu crash during migration with reboot
Summary: Qemu crash during migration with reboot
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: spice-server
Version: 6.5
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Uri Lublin
QA Contact: Desktop QE
URL:
Whiteboard:
Depends On:
Blocks: 1016795
Reported: 2013-09-04 16:07 UTC by Marian Krcmarik
Modified: 2014-10-14 05:04 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Bad cleanup of pending data from the client.
Consequence: QEMU would abort if a VM is migrated while it is being rebooted.
Fix: Clean up the pending client data appropriately in order to avoid the assertion/crash.
Result:
Clone Of:
Clones: 1016795
Environment:
Last Closed: 2014-10-14 05:04:38 UTC
Target Upstream Version:
Embargoed:


Attachments
backtrace from core dump (4.20 KB, text/plain)
2013-09-04 16:08 UTC, Marian Krcmarik
autotest log (31.50 KB, text/plain)
2013-09-04 16:09 UTC, Marian Krcmarik


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2014:1435 | 0 | normal | SHIPPED_LIVE | spice-server bug fix update | 2014-10-14 01:06:04 UTC

Description Marian Krcmarik 2013-09-04 16:07:13 UTC
Description of problem:
The qemu process sometimes aborts during migration with reboot of a Spice session. This is caught by an automated test in the autotest virttest framework. The abort seems to happen right after the client migrate info is sent; see the attached autotest log or this snippet from the log:
09/03 15:22:13 DEBUG|qemu_monit:0526| Send command: __com.redhat_spice_migrate_info 192.168.122.1 3001 3201 "C=CZ,L=BRNO,O=SPICE,CN=192.168.122.1"
09/03 15:22:26 INFO |   aexpect:0816| [qemu output] (/usr/libexec/qemu-kvm:14005): SpiceWorker-Warning **: red_worker.c:11009:red_wait_outgoing_item: timeout
09/03 15:22:26 INFO |   aexpect:0816| [qemu output] (/usr/libexec/qemu-kvm:14005): SpiceWorker-ERROR **: red_worker.c:11473:dev_destroy_primary_surface: assertion `!worker->surfaces[surface_id].context.canvas' failed
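To check other autotest runs for the same signature, a log can be scanned for the timeout warning followed by an assertion failure. This is only a minimal sketch: it assumes log lines shaped like the excerpt above, and the function name `find_spice_abort` and both regular expressions are illustrative, not part of autotest or spice-server.

```python
import re

# Patterns modelled on the two qemu output lines quoted above (assumed format).
TIMEOUT_RE = re.compile(
    r"SpiceWorker-Warning \*\*: red_worker\.c:\d+:red_wait_outgoing_item: timeout"
)
ASSERT_RE = re.compile(
    r"SpiceWorker-ERROR \*\*: red_worker\.c:\d+:(\w+): assertion `(.+)' failed"
)


def find_spice_abort(lines):
    """Return (function, assertion) for the first assertion failure that
    follows a red_wait_outgoing_item timeout, or None if none is found."""
    seen_timeout = False
    for line in lines:
        if TIMEOUT_RE.search(line):
            seen_timeout = True
            continue
        match = ASSERT_RE.search(line)
        if match and seen_timeout:
            return match.group(1), match.group(2)
    return None
```

Passing the lines of the attached autotest log to `find_spice_abort` would report the failing function and the asserted expression when the timeout precedes the assertion.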

I am attaching a backtrace from the core dump; the core dump itself can be provided as well if needed.

Version-Release number of selected component (if applicable):
spice-server-0.12.4-3.el6.x86_64
qemu-kvm-0.12.1.2-2.398.el6.x86_64

How reproducible:
1/2

Steps to Reproduce:
1. Establish Spice session to a RHEL6 VM
2. Start reboot of the VM
3. Immediately start migration.
An SSL connection was used for the main and inputs channels.
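The steps above could be driven over QMP roughly as sketched below. This is an assumption-laden illustration, not the autotest code: the log shows the RHEL 6 monitor command `__com.redhat_spice_migrate_info`, whereas the sketch uses the upstream QMP command `client_migrate_info`; the guest reboot is modelled with `system_reset`, although the test may reboot from inside the guest; and the helper name `build_repro_commands` and the destination URI are hypothetical.

```python
import json


def build_repro_commands(host, port, tls_port, cert_subject, dest_uri):
    """Build QMP messages for: reset the guest, pass the destination's
    SPICE details to the client, then start migration."""
    commands = [
        # Step 2: start a reboot (assumed here to be a hard reset).
        {"execute": "system_reset"},
        # Tell the SPICE client where to reconnect after migration.
        {
            "execute": "client_migrate_info",
            "arguments": {
                "protocol": "spice",
                "hostname": host,
                "port": port,
                "tls-port": tls_port,
                "cert-subject": cert_subject,
            },
        },
        # Step 3: start migration immediately.
        {"execute": "migrate", "arguments": {"uri": dest_uri}},
    ]
    return [json.dumps(cmd) for cmd in commands]


if __name__ == "__main__":
    # Host, ports, and subject come from the log excerpt; the URI is made up.
    for message in build_repro_commands(
        "192.168.122.1", 3001, 3201,
        "C=CZ,L=BRNO,O=SPICE,CN=192.168.122.1",
        "tcp:192.168.122.1:4444",
    ):
        print(message)
```

The function only builds the JSON messages; sending them to a QMP socket is left out so the sketch stays independent of any particular test harness.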

Actual results:
Qemu abort

Expected results:
Successful migration

Additional info:

Comment 1 Marian Krcmarik 2013-09-04 16:08:00 UTC
Created attachment 793748
backtrace from core dump

Comment 2 Marian Krcmarik 2013-09-04 16:09:30 UTC
Created attachment 793749
autotest log

Comment 4 Yonit Halperin 2013-09-10 13:39:03 UTC
Hi,

Can you attach the qemu log file, preferably with an increased spice-debug level?
Does the abort occur between the client_migrate_info and migrate commands, or after the migrate command?

Comment 5 Marian Krcmarik 2013-09-10 15:15:32 UTC
(In reply to Yonit Halperin from comment #4)
> Hi,
> 
> Can you attach the qemu log file, preferably with an increased spice-debug
> level?
> Does the abort occur between the client_migrate_info and migrate commands,
> or after the migrate command?

I would need to rerun the test and try to get a debug log. There is not much in the non-debug log; it only shows that the crash happens after the client_migrate_info command and before the migrate command, right after:
SpiceWorker-Warning **: red_worker.c:11009:red_wait_outgoing_item: timeout

Comment 6 Yonit Halperin 2013-09-10 15:40:12 UTC
(In reply to Marian Krcmarik from comment #5)
> (In reply to Yonit Halperin from comment #4)
> > Hi,
> > 
> > Can you attach the qemu log file, preferably with an increased spice-debug
> > level?
> > Does the abort occur between the client_migrate_info and migrate commands,
> > or after the migrate command?
> 
> I would need to rerun the test and try to get a debug log. There is not much
> in the non-debug log; it only shows that the crash happens after the
> client_migrate_info command and before the migrate command, right after:
> SpiceWorker-Warning **: red_worker.c:11009:red_wait_outgoing_item: timeout

Actually, this timeout explains the assert. However, the timeout occurred because a message could not be sent to the client for 30 seconds. We need to fix the bug that leads to the assert, but it is still unclear why the client was not responsive. If you manage to reproduce the bug, please attach the client log with an increased log level. The bug is not directly related to migration, unless client_migrate_info caused the client to be unresponsive for a while.
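
The link between the timeout and the assert can be pictured with a toy model. This is not spice-server code: the class, method, and parameter names are hypothetical, and the behaviour is inferred only from this comment and the Doc Text (pending client data was not cleaned up), not from the posted patches.

```python
class ToyWorker:
    """Toy model of a display worker; names do not mirror red_worker.c."""

    def __init__(self):
        self.canvas = object()                     # stands in for the surface canvas
        self.pending_items = ["draw-1", "draw-2"]  # data not yet sent to the client

    def wait_outgoing_items(self, client_responsive):
        """Flush pending items; return False on timeout (client not reading)."""
        if client_responsive:
            self.pending_items.clear()
            return True
        return False

    def destroy_primary_surface(self, client_responsive, drop_pending_on_timeout):
        flushed = self.wait_outgoing_items(client_responsive)
        if not flushed and drop_pending_on_timeout:
            # Assumed shape of the fix: discard what could not be sent.
            self.pending_items.clear()
        if not self.pending_items:
            # Nothing references the surface any more, so it can be released.
            self.canvas = None
        if self.canvas is not None:
            raise AssertionError("canvas still referenced by pending items")
```

With a responsive client the surface is released normally. With an unresponsive client and `drop_pending_on_timeout=False`, the final check fails, which corresponds to the abort in the log; with `drop_pending_on_timeout=True`, the pending data is discarded and the check passes.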

Comment 7 Yonit Halperin 2013-09-12 20:12:29 UTC
Patches have been posted to fix the abort (http://patchwork.freedesktop.org/patch/14639/). However, if this error is reproducible with a reboot after client_migrate_info (I couldn't reproduce it), another bug should be opened to investigate why the client display channel is not responsive.

Comment 10 errata-xmlrpc 2014-10-14 05:04:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1435.html

