Bug 1320331

Summary: Forcing TLS on Spice channels via defaultMode='secure' causes connection failure.
Product: [Fedora] Fedora Reporter: Jonas Jonsson <jonas>
Component: virt-manager Assignee: Cole Robinson <crobinso>
Status: CLOSED NEXTRELEASE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 24 CC: berrange, crobinso, jonas, virt-maint
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-06-18 14:44:12 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Attachments:
Description                               Flags
virt-manager --debug showing first issue  none
Dump of virsh xml                         none

Description Jonas Jonsson 2016-03-22 20:56:46 UTC
Description of problem:

I have a remote machine running Fedora 23 that I can connect to with virt-manager from my local machine. I've set up x509 certificates using Let's Encrypt. I can verify that it works by connecting with remote-viewer and SSH port forwarding.

However, when I disable none-TLS channels, virt-manager tries to connect to port "None" via SSH and Ncat.

Instead of a working console, I get the following error.

Error output from closed console: Ncat: Invalid port number "None". QUITTING.

I can totally understand that using both SSH and TLS is somewhat redundant, but virt-manager should at least give a more human-friendly hint about what is wrong.

Version-Release number of selected component (if applicable):
virt-manager-1.3.2-1.fc23.noarch

How reproducible:
Always

Steps to Reproduce:
1. Add defaultMode='secure' to the Spice graphics element in the virtual machine XML file
2. Try to connect with virt-manager
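
For illustration, step 1 can be sketched in Python with the standard library. This is only a sketch under assumptions: the domain XML below is a made-up, heavily trimmed example, and `force_secure_channels` is a hypothetical helper name, not part of libvirt or virt-manager. In practice the result would still need to be applied with something like "virsh define", and the guest restarted.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down domain XML; a real 'virsh dumpxml' is much longer.
DOMAIN_XML = """
<domain type='kvm'>
  <name>example-vm</name>
  <devices>
    <graphics type='spice' autoport='yes' listen='127.0.0.1'/>
  </devices>
</domain>
"""


def force_secure_channels(domain_xml):
    """Return the XML with defaultMode='secure' set on every Spice graphics device."""
    root = ET.fromstring(domain_xml)
    for graphics in root.iter("graphics"):
        if graphics.get("type") == "spice":
            graphics.set("defaultMode", "secure")
    return ET.tostring(root, encoding="unicode")


secured_xml = force_secure_channels(DOMAIN_XML)
print(secured_xml)
```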

Actual results:
Error output from closed console: Ncat: Invalid port number "None". QUITTING.


Expected results:
Working console.

Additional info:

Comment 1 Jonas Jonsson 2016-03-24 14:23:42 UTC
Sort of the same issue: if I change the Spice graphics device to listen on all interfaces and not only loopback, virt-manager will try to make a direct connection to the none-TLS port.

I think that virt-manager should try, in order, direct TLS, direct none-TLS, none-TLS over SSH and finally TLS over SSH. This would make it possible to use the TLS port with remote-viewer while still having a working console in virt-manager.
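
The proposed ordering could be sketched roughly as follows. This is a hypothetical illustration of the suggestion, not how virt-manager selects a connection; `Attempt`, `candidate_attempts` and the `listens_publicly` flag are names invented for this sketch.

```python
from typing import NamedTuple


class Attempt(NamedTuple):
    transport: str  # "direct" or "ssh"
    use_tls: bool


# Order proposed in the comment: direct TLS, direct none-TLS,
# none-TLS over SSH, and finally TLS over SSH.
PROPOSED_ORDER = [
    Attempt("direct", True),
    Attempt("direct", False),
    Attempt("ssh", False),
    Attempt("ssh", True),
]


def candidate_attempts(port, tls_port, listens_publicly):
    """Filter the proposed order down to attempts the guest config allows.

    port / tls_port are None when the graphics device does not expose them.
    listens_publicly is an assumed flag: True if the guest listens on a
    non-loopback address, so a direct connection is possible at all.
    """
    result = []
    for attempt in PROPOSED_ORDER:
        if attempt.use_tls and tls_port is None:
            continue
        if not attempt.use_tls and port is None:
            continue
        if attempt.transport == "direct" and not listens_publicly:
            continue
        result.append(attempt)
    return result


# TLS-only guest listening on all interfaces: direct TLS is tried first.
print(candidate_attempts(port=None, tls_port=5901, listens_publicly=True))
```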

Comment 2 Cole Robinson 2016-05-08 00:41:28 UTC
Thanks for the report. But I'm confused... What does none-TLS mean? What are you actually changing?

Can you provide the following:

virt-manager --debug when reproducing the issue
when the VM is running, grab the runtime XML with 'sudo virsh dumpxml $VMNAME'

Comment 3 Jonas Jonsson 2016-05-16 12:06:17 UTC
Created attachment 1157883 [details]
virt-manager --debug showing first issue

Comment 4 Jonas Jonsson 2016-05-16 12:06:48 UTC
Created attachment 1157884 [details]
Dump of virsh xml

Comment 5 Jonas Jonsson 2016-05-16 12:07:57 UTC
none-TLS is the default way to communicate without any changes. With defaultMode=secure TLS is enabled instead on the same port, 5900 by default.

I've attached the logs and the machine XML. The logs show how I connect to the running machine with the default config; it works fine. Then I change the configuration and run "virsh define Windows7_VOLVO.xml" to change the defaultMode parameter; it still works. After I restart the virtual machine, the error appears.

Comment 6 Cole Robinson 2016-05-16 20:37:15 UTC
Thanks for the info, I can reproduce now. The 'None' error is because we aren't handling the fact that port=None in this case, since only tlsPort is set.
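
A minimal sketch of how an unset port can end up as the literal string "None" in a command line. This is not virt-manager's actual code; `build_nc_command` and the XML fragment are assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical runtime XML fragment: only tlsPort is present, no port attribute.
GRAPHICS_XML = (
    "<graphics type='spice' tlsPort='5901' autoport='no' defaultMode='secure'/>"
)


def build_nc_command(graphics_xml, host="127.0.0.1"):
    """Naively build an nc command line from the plain (non-TLS) port."""
    graphics = ET.fromstring(graphics_xml)
    port = graphics.get("port")  # None, because the attribute is absent
    # String formatting turns None into the text "None" without complaint.
    return "nc %s %s" % (host, port)


print(build_nc_command(GRAPHICS_XML))  # nc 127.0.0.1 None
```

The missing attribute only surfaces when the external tool rejects the port, which matches the "Invalid port number" message in the report.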

But once I fix that it still doesn't work. Can you describe the setup you used to get remote-viewer working, so I can figure out why that works but what I'm trying locally does not?

Comment 7 Jonas Jonsson 2016-05-18 19:09:24 UTC
I connect with remote-viewer spice://remote-host.example.net?tls-port=5901.

I disabled auto-port for TLS and set the TLS port to 5901. The address was changed to all interfaces. All this was done using the virt-manager UI.

I've also opened the port in my local firewall using firewall-cmd --add-port=5901/tcp.

It seems that both remote-viewer and virt-manager try to make a direct connection if they use the TLS port. I suspect that is why it doesn't work in your case.

Comment 8 Cole Robinson 2016-05-18 21:02:08 UTC
Ah okay, so you were directly connecting, not using SSH forwarding. That makes more sense. SSH likely won't work for most (any?) TLS setups, because the hostname you are connecting to is important for the TLS verification. So on my host machine, the TLS cert is valid for 'colepc' but doesn't work if connecting to 127.0.0.1/localhost. And in the ssh tunnel case, the address that is relevant here is not the hostname in the libvirt URI but the address in the VM XML graphics listen= attribute.

So the SSH case basically requires that 1) the physical machine hosting the VM is configured to be a TLS client, and 2) the listen address advertised in the XML is the valid TLS hostname for the VM. But even when I tried to force that config, it wasn't working for some reason. Even if it did work, it would be a pretty esoteric setup.

So instead I changed virt-manager to error explicitly in this case and try to give a bit of info about what's going on:

commit 8c2adb83aefb6e1a42e0eaa75c9903faa24a6bc3
Author: Cole Robinson <crobinso>
Date:   Wed May 18 16:57:38 2016 -0400

    console: Error for more non-working graphical configs
    
    - If connecting remotely but graphics has no listen address,
        like the spice GL case.
    - Trying to connect to a TLS using VM over an ssh tunnel, it doesn't
        seem to work: https://bugzilla.redhat.com/show_bug.cgi?id=1320331

Comment 9 Jonas Jonsson 2016-05-19 15:13:59 UTC
I agree that running TLS over SSH is a bit of a corner case that is probably not worth supporting.

However, the mismatching hostname issue isn't that simple. The listen attribute can be set to 127.0.0.1 or 0.0.0.0 (using the GUI, that is). This means that for a TLS connection, this attribute would be 0.0.0.0. To be able to verify the certificate, it must get the hostname from somewhere else, i.e. the URI of the remote libvirt instance.

My simple test with a made-up hostname in /etc/hosts and connecting via that name using virt-manager gives me a TLS certificate error, and the logs also indicate that it tries to make a direct connection to the made-up name.

From a TLS perspective, I don't see why this wouldn't work. You are connecting to a machine called 'colepc' via SSH and the TLS certificate it presents is for a host called 'colepc'. The communication with SSH happens over stdin/stdout due to the use of nc, not a port forwarding involving TCP connections on your local machine.

Anyway, I've tried the latest git commit 8c2adb8 and it gives a much better error message so I think it's good for now.

Comment 10 Cole Robinson 2016-05-19 15:50:27 UTC
Yeah, I'm not really sure why it failed either. If you can come up with a working scheme over ssh, I'm happy to revisit this.

Comment 11 Cole Robinson 2016-06-18 14:44:12 UTC
These patches are being built for f24+ with the new virt-manager release. Not sure if I'm going to pull the new virt-manager into f23 yet; maybe give it some time to bake in f24/rawhide. Just closing this bug against F24 virt-manager. If you come up with a working config for comment #9/#10, please open a separate bug to discuss.