Bug 952375 - migration: sockets to src-server are not closed after closing the session and swapping to dst-server?
Keywords:
Status: CLOSED DUPLICATE of bug 1024501
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: spice-gtk
Version: 6.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Christophe Fergeau
QA Contact: Desktop QE
URL:
Whiteboard:
Depends On: 1024501
Blocks:
 
Reported: 2013-04-15 19:46 UTC by Yonit Halperin
Modified: 2014-02-18 19:44 UTC (History)
CC List: 6 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-02-18 19:44:29 UTC
Target Upstream Version:
Embargoed:


Description Yonit Halperin 2013-04-15 19:46:45 UTC
Description of problem:

After qemu migration completes and the spice-migration data has been passed from the src to the client, the src-server waits until it detects that the client has disconnected from it, and then signals libvirt that it can close the src qemu. If a 10-second timeout expires first, the src-server stops waiting (and "migrate_timeout" appears in the spice-server log).
Sometimes (see "Hints" below), even though messages about disconnecting the channels appear in the spice-gtk log shortly after qemu migration completes, spice-server does not detect the disconnection, and migrate_timeout occurs.
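
The wait described above can be illustrated with a minimal sketch. This is not spice-server code: the function name, the use of select(), and the constant name are assumptions made for illustration; only the 10-second value comes from this report. The sketch waits until the peer has closed every connection, or gives up when the timeout expires, which corresponds to the "migrate_timeout" case.

```python
import select
import socket
import time

# Value quoted in this report; the name is hypothetical.
MIGRATE_TIMEOUT_SEC = 10.0


def wait_for_client_disconnect(conns, timeout=MIGRATE_TIMEOUT_SEC):
    """Return True if the peer closed every connection before the timeout.

    Returning False is the analogue of the "migrate_timeout" case.
    """
    pending = set(conns)
    deadline = time.monotonic() + timeout
    while pending:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        readable, _, _ = select.select(list(pending), [], [], remaining)
        for conn in readable:
            try:
                data = conn.recv(4096)
            except OSError:
                data = b""  # treat a reset connection as closed
            if not data:
                # An empty read means the peer closed its end.
                pending.discard(conn)
    return True
```

If the client logs a disconnect but never actually closes its socket, the server side never sees the empty read, and the function returns False once the timeout expires.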

This bug is related to bug #920205; the relevant logs can be found there.
The symptom seen in bug #920205 is a result of libvirt waiting for spice-migration to complete before starting the vm on the dest side. This should be fixed as well. Note: there is no need to play video/audio in order to trigger the bug. It is just easier to see it that way, but only when using libvirt. When not using libvirt, the vm on the dest side starts immediately when migration completes.

Hints:
(a) Marian is able to reproduce it only if there is more than one qxl device (though it is not necessary for more than one to be attached).
(b) mazhang was able to reproduce it only when the client and the server are not on the same subnet.
(c) I tested this on an f18 client with upstream versions. I was able to reproduce it when the client and hosts were on different networks, even with only one qxl device. I bisected spice-gtk; the patch that causes this is 8029bd002a0378f593ad9e35c2a07b6c040f9adf, which is not part of rhel6, but it might give some hint.

Comment 1 Marian Krcmarik 2013-05-03 09:42:43 UTC
It seems that one of the possible consequences is that seamless migration falls back to Switch host type when the timeout occurs.
A VM with more qxl devices where I can reproduce the problem often falls back to Switch host type of migration.

Snip from log:
((null):21593): Spice-Debug **: red_dispatcher.c:829:red_dispatcher_on_vm_start: 
((null):21593): Spice-Info **: reds.c:4465:spice_server_migrate_connect: 
((null):21593): Spice-Info **: reds.c:3486:reds_mig_started: 
((null):21593): Spice-Info **: reds.c:3588:migrate_timeout: 
main_channel_migrate_cancel_wait: client 0x7f38b5412050 cancel wait connect
main_channel_client_handle_migrate_connected: client 0x7f38b5412050 connected: 1 seamless 1
main_channel_client_handle_migrate_connected: client 0x7f38b5412050 MIGRATE_CANCEL
((null):21593): Spice-Info **: reds.c:4526:spice_server_migrate_start: 
((null):21593): Spice-Debug **: char_device.c:794:spice_char_device_stop: dev_state 0x7f38b5458a70
((null):21593): Spice-Debug **: red_dispatcher.c:818:red_dispatcher_on_vm_stop: 
((null):21593): SpiceWorker-Info **: red_worker.c:11118:handle_dev_stop: stop
((null):21593): SpiceWorker-Info **: red_worker.c:11118:handle_dev_stop: stop
((null):21593): SpiceWorker-Info **: red_worker.c:11118:handle_dev_stop: stop
((null):21593): SpiceWorker-Info **: red_worker.c:11118:handle_dev_stop: stop
((null):21593): Spice-Info **: reds.c:4552:spice_server_migrate_end: 
((null):21593): Spice-Info **: reds.c:3558:reds_mig_finished: 
main_channel_migrate_src_complete: 
main_channel_migrate_src_complete: client 0x7f38b5412050 SWITCH_HOST
main_channel_marshall_migrate_switch:
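
When checking many spice-server logs for this pattern, a small scan for the marker strings can help. The following is a rough sketch only: the function name and the returned keys are hypothetical, and the markers ("migrate_timeout", "MIGRATE_CANCEL", "SWITCH_HOST") are taken from the snippet above rather than from any documented log format.

```python
def summarize_migration_log(text):
    """Report which migration markers appear in a spice-server log excerpt.

    Matching is case-sensitive, so "main_channel_migrate_cancel_wait"
    does not count as a MIGRATE_CANCEL marker.
    """
    lines = text.splitlines()
    return {
        "timeout": any("migrate_timeout" in line for line in lines),
        "cancelled": any("MIGRATE_CANCEL" in line for line in lines),
        "switch_host": any("SWITCH_HOST" in line for line in lines),
    }
```

A log where all three flags are set matches the fallback to the Switch host type shown in the snippet.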

Comment 2 Yonit Halperin 2013-05-03 12:47:40 UTC
(In reply to comment #1)
> It seems that one of the possible consequences is that seamless migration
> falls back to Switch host type when the timeout occurs.
> A VM with more qxl devices where I can reproduce the problem often falls
> back to Switch host type of migration.
> 
This timeout is a different one: it waits before migration starts, on client_migrate_info, for the client to connect to the destination.
The client ack for connection to the destination arrived after the timeout expired.
> Snip from log:
> ((null):21593): Spice-Debug **:
> red_dispatcher.c:829:red_dispatcher_on_vm_start: 
> ((null):21593): Spice-Info **: reds.c:4465:spice_server_migrate_connect: 
> ((null):21593): Spice-Info **: reds.c:3486:reds_mig_started: 
> ((null):21593): Spice-Info **: reds.c:3588:migrate_timeout: 
> main_channel_migrate_cancel_wait: client 0x7f38b5412050 cancel wait connect
> main_channel_client_handle_migrate_connected: client 0x7f38b5412050
> connected: 1 seamless 
> 1
> main_channel_client_handle_migrate_connected: client 0x7f38b5412050
> MIGRATE_CANCEL
> ((null):21593): Spice-Info **: reds.c:4526:spice_server_migrate_start: 
> ((null):21593): Spice-Debug **: char_device.c:794:spice_char_device_stop:
> dev_state 0x7f38
> b5458a70
> ((null):21593): Spice-Debug **:
> red_dispatcher.c:818:red_dispatcher_on_vm_stop: 
> ((null):21593): SpiceWorker-Info **: red_worker.c:11118:handle_dev_stop: stop
> ((null):21593): SpiceWorker-Info **: red_worker.c:11118:handle_dev_stop: stop
> ((null):21593): SpiceWorker-Info **: red_worker.c:11118:handle_dev_stop: stop
> ((null):21593): SpiceWorker-Info **: red_worker.c:11118:handle_dev_stop: stop
> ((null):21593): Spice-Info **: reds.c:4552:spice_server_migrate_end: 
> ((null):21593): Spice-Info **: reds.c:3558:reds_mig_finished: 
> main_channel_migrate_src_complete: 
> main_channel_migrate_src_complete: client 0x7f38b5412050 SWITCH_HOST
> main_channel_marshall_migrate_switch:

Comment 3 RHEL Program Management 2013-10-14 03:49:12 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 4 Marc-Andre Lureau 2013-11-22 15:24:22 UTC
Since the patch causing this bug is 8029bd, there is a high probability that this bug is a duplicate of 1024501, where the sockets are not closed after migration.

Could you try with spice-gtk from git?

thanks

Comment 6 Marc-Andre Lureau 2014-02-18 18:50:57 UTC
Marian, Yonit being gone, I think it should be closed as dup of bug 1024501. Agree? thanks

Comment 7 Marian Krcmarik 2014-02-18 19:40:29 UTC
(In reply to Marc-Andre Lureau from comment #6)
> Marian, Yonit being gone, I think it should be closed as dup of bug 1024501.
> Agree? thanks

Well I am not sure, but if you believe so, go ahead. I was never really able to reproduce what Yonit meant I guess.

Comment 8 Marc-Andre Lureau 2014-02-18 19:44:29 UTC
thanks, closing as dup of 1024501 since it is the most likely reason (closing src fds).

*** This bug has been marked as a duplicate of bug 1024501 ***

