Bug 1489854

Summary: qemu: ports allocated via autoport for vnc/spice not tracked over daemon restart
Product: [Community] Virtualization Tools
Reporter: Guido Günther <agx>
Component: libvirt
Assignee: Libvirt Maintainers <libvirt-maint>
Status: CLOSED NEXTRELEASE
Severity: unspecified
Priority: unspecified
Version: unspecified
CC: crobinso, laine, libvirt-maint, rbalakri
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Linux
Last Closed: 2017-10-17 21:51:02 UTC
Type: Bug

Description Guido Günther 2017-09-08 13:35:12 UTC
Description of problem:
Starting new VMs fails after libvirtd is restarted while other VMs are already running.
This happens when the running VMs use autoport='yes' for their vnc / spice display.

Version-Release number of selected component (if applicable):
3.7.0

How reproducible:

Steps to Reproduce:
1. virsh start vm1
2. systemctl restart libvirtd
3. virsh start vm2

Actual results:
$ virsh start vm2
error: Failed to start domain sid
error: internal error: process exited while connecting to monitor: ((null):12206): Spice-Warning **: reds.c:2530:reds_init_socket: listen: Address already in use
2017-09-08T13:32:11.329200Z qemu-system-x86_64: failed to initialize spice server

Expected results:
vm2 gets started

The issue seems to be that auto-allocated vnc/spice ports are no longer tracked correctly across daemon restarts.
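
To check from outside libvirt whether a display port is really taken, a bind probe can help narrow things down. The following is only a minimal sketch and not libvirt's own allocator: the helper names `port_is_free` and `first_free_port` are made up for illustration, and the 5900-5999 range in the usage note is an assumed example range. Note that a probe like this is only as reliable as the kernel's EADDRINUSE reporting.

```python
import errno
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP bind to (host, port) succeeds, False if it is in use.

    Hypothetical helper for illustration; SO_REUSEADDR is deliberately not
    set, so any existing socket on the same address/port makes bind fail.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return False
            raise  # e.g. EACCES for privileged ports: not a "port busy" case
        return True


def first_free_port(start, end, host="127.0.0.1"):
    """Return the first port in [start, end] that passes the bind probe, or None."""
    for port in range(start, end + 1):
        if port_is_free(port, host):
            return port
    return None
```

Running `first_free_port(5900, 5999)` after step 2 and comparing the result with the port qemu reports as "Address already in use" would show whether the probe and the daemon disagree about which ports are taken.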

Comment 1 Laine Stump 2017-09-08 18:54:13 UTC
Surprisingly, this is apparently a kernel problem :-/ (see Bug 1432684)

That bug is a kernel bug filed against the Fedora 26 kernel, so it's unclear to me just how useful it is to mark this as a duplicate (especially since Guido doesn't (afaik) use the Fedora kernel :-). I'll leave changing the disposition of the BZ to someone who knows better than me.

Comment 2 Cole Robinson 2017-09-08 18:57:13 UTC
Let's leave this open, it can track the functional issue for upstream libvirt

Comment 3 Cole Robinson 2017-10-17 21:51:02 UTC
Some more info for anyone else watching: the main upstream kernel fix seems to be

commit cbb2fb5c72f48d3029c144be0f0e61da1c7bccf7
Author: Josef Bacik <jbacik>
Date:   Fri Sep 22 20:20:06 2017 -0400

    net: set tb->fast_sk_family
    
It doesn't seem to be in any stable release at the moment, but it is in 4.14.