Bug 450677

Summary: Remote connection locks virt-manager with tick = 1 second
Product: Red Hat Enterprise Linux 5
Reporter: Alexander Todorov <atodorov>
Component: virt-manager
Assignee: Cole Robinson <crobinso>
Status: CLOSED WONTFIX
QA Contact: Virtualization Bugs <virt-bugs>
Severity: low
Docs Contact:
Priority: low
Version: 5.2
CC: bernie+fedora, berrange, sascha-web-bugzilla.redhat.com, sputhenp, xen-maint
Target Milestone: rc
Target Release: ---
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2009-10-13 16:44:35 UTC
Attachments:
virt-manager.log (flags: none)
strace -f -p $PID 2>&1 output (flags: none)

Description Alexander Todorov 2008-06-10 11:42:20 UTC
Description of problem:
virt-manager stops responding when trying to connect to a remote hypervisor over SSH.

Version-Release number of selected component (if applicable):
local machine: virt-manager-0.5.3-8.el5

remote machine:
libvirt-0.3.3-7.el5
nc-1.84-10.fc6


How reproducible:
Always

Steps to Reproduce:
1. Generate an SSH key for your local desktop user
2. Add the key to /root/.ssh/authorized_keys on the remote host
3. Verify that 'ssh root@remote-host' works without a password
4. Start virt-manager as an unprivileged user and add a new connection
5. Choose remote tunneling over SSH
6. Type the remote server hostname and click OK
  
Actual results:
virt-manager tries to connect to the remote host and stops responding. The UI
turns white.

Expected results:
A connection to the remote hypervisor is opened.

Additional info:
* When I follow the same steps but use the hostname of my local machine (it is
also running Xen as dom0), everything works. I'm able to connect to it and see
the running guests.

* When I do the above with the remote machine (i.e. two different machines),
things just stop working.

* The connection to the remote machine is fast enough for SSH and VNC to run
without problems.

* On the remote machine I don't have a firewall (i.e. iptables -X/-F)
* On the remote machine SELinux is in permissive mode

* On the local machine, virt-manager.log says:

-------------------------------------------------------------------------------
[Tue, 10 Jun 2008 13:38:11 virt-manager 14213] DEBUG (connect:116) Connection to open is xen+ssh://root.com/
[Tue, 10 Jun 2008 13:38:11 virt-manager 14213] DEBUG (connection:293) Scheduling background open thread for xen+ssh://root.com/
[Tue, 10 Jun 2008 13:38:11 virt-manager 14213] DEBUG (connection:300) Background thread is running
[Tue, 10 Jun 2008 13:38:14 virt-manager 14213] DEBUG (connection:329) Background open thread complete, scheduling notify
[Tue, 10 Jun 2008 13:38:14 virt-manager 14213] DEBUG (connection:338) Notifying open result
[Tue, 10 Jun 2008 13:38:16 virt-manager 14213] DEBUG (manager:408) About to append vm: Domain-0
[Tue, 10 Jun 2008 13:38:16 virt-manager 14213] DEBUG (manager:408) About to append vm: linux
[Tue, 10 Jun 2008 13:38:16 virt-manager 14213] DEBUG (manager:398) VM Domain-0 started
-------------------------------------------------------------------------------

* On the remote machine there is a domU named "linux".

Comment 1 RHEL Program Management 2008-06-10 11:44:20 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 2 Alexander Todorov 2008-06-10 11:46:00 UTC
When connecting to the local hypervisor using its full hostname, the log
shows:

--------------------------------------------------------------------------------
[Tue, 10 Jun 2008 13:45:03 virt-manager 14606] INFO (virt-manager:126) Application startup
[Tue, 10 Jun 2008 13:45:03 virt-manager 14653] DEBUG (engine:74) About to connect to uris ['xen+ssh://root.com/', 'xen+ssh://root.com/', 'xen:///']
[Tue, 10 Jun 2008 13:45:10 virt-manager 14653] DEBUG (connection:293) Scheduling background open thread for xen+ssh://root.com/
[Tue, 10 Jun 2008 13:45:10 virt-manager 14653] DEBUG (connection:300) Background thread is running
[Tue, 10 Jun 2008 13:45:11 virt-manager 14653] DEBUG (connection:329) Background open thread complete, scheduling notify
[Tue, 10 Jun 2008 13:45:11 virt-manager 14653] DEBUG (connection:338) Notifying open result
[Tue, 10 Jun 2008 13:45:11 virt-manager 14653] DEBUG (manager:408) About to append vm: Domain-0
[Tue, 10 Jun 2008 13:45:11 virt-manager 14653] DEBUG (manager:408) About to append vm: debian
[Tue, 10 Jun 2008 13:45:11 virt-manager 14653] DEBUG (manager:408) About to append vm: boinc
[Tue, 10 Jun 2008 13:45:11 virt-manager 14653] DEBUG (manager:408) About to append vm: enc
[Tue, 10 Jun 2008 13:45:11 virt-manager 14653] DEBUG (manager:398) VM Domain-0 started
--------------------------------------------------------------------------------

"redbull" is the hostname of my machine

Comment 4 Cole Robinson 2009-02-13 19:55:55 UTC
Can you still reproduce this on 5.3? I can't manage to.

Comment 5 Alexander Todorov 2009-02-16 13:15:20 UTC
Created attachment 332031
virt-manager.log

Yes, I can reproduce it. I'm running RHEL 5.3 and the server is running 5.3 as well.
I'm connected over a VPN, if that matters.

Comment 6 Alexander Todorov 2009-02-16 13:17:31 UTC
Created attachment 332032
strace -f -p $PID 2>&1 output

strace output with virt-manager-0.5.3-10.el5

Comment 7 Cole Robinson 2009-03-12 17:36:43 UTC
A couple things to try:

Does 'virsh --connect xen+ssh://root.host/' work? Is there any ssh prompt or anything to accept the key?

Do you see any output if you run virt-manager from the cli with the --no-fork option?

Comment 8 Alexander Todorov 2009-03-13 08:28:14 UTC
(In reply to comment #7)
> A couple things to try:
> 
> does 'virsh --connect xen+ssh://root.host/' work? 

Yes. I'm able to connect and execute commands.

> Is there any ssh
> prompt or anything to accept the key?
> 

No prompt at all.

> Do you see any output if you run virt-manager from the cli with the --no-fork
> option?  

No output at all from virt-manager --no-fork

Comment 9 Cole Robinson 2009-03-18 21:12:55 UTC
Strange. The log says that it is actually connecting, so maybe the app is blocking somewhere random, or just trying to do too much. Can you try a couple of things?

- Go to Edit->Preferences, change the update interval to something like 5 seconds, see if that helps.

- Run virt-manager --no-fork from the command line. When the app locks up, hit ctrl-c from the command line as many times as necessary till the app stops. Post all the backtraces here. That should be a quick and dirty way to see what the app is stuck doing.

Comment 10 Alexander Todorov 2009-03-19 08:54:12 UTC
(In reply to comment #9)
> - Go to Edit->Preferences, change the update interval to something like 5
> seconds, see if that helps.

A 5-second refresh interval made it work, although slowly.
I was able to connect to the hypervisor and open the graphics console for a running guest. When trying to view guest details or interact with the graphics console (which uses VNC), the response is considerably slower than with a direct VNC connection to the hypervisor.

Comment 11 Cole Robinson 2009-04-23 15:45:36 UTC
Well, I'm not really sure what we can do here. Cutting down on the amount of polling we have to do on each tick interval is always a goal, but if only a small number of VMs is clogging the connection, it would require drastic changes to fix this.
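
To illustrate why even a couple of VMs can saturate a slow link, here is a rough back-of-the-envelope sketch. It assumes that every tick issues a fixed number of blocking remote calls per VM and that each call costs one network round trip. The function names, the call count per VM, and the round-trip time are hypothetical values chosen for illustration, not measurements from virt-manager.

```python
def poll_time_per_tick(num_vms, calls_per_vm, round_trip_s):
    """Estimated seconds spent in blocking remote calls during one tick."""
    return num_vms * calls_per_vm * round_trip_s


def ui_blocked_fraction(num_vms, calls_per_vm, round_trip_s, tick_s):
    """Fraction of each tick the main loop spends blocked, capped at 1.0."""
    busy = poll_time_per_tick(num_vms, calls_per_vm, round_trip_s)
    return min(1.0, busy / tick_s)


# Assumed numbers: 2 VMs, 5 calls per VM, 150 ms round trip over a VPN.
if __name__ == "__main__":
    for tick in (1.0, 5.0):
        frac = ui_blocked_fraction(2, 5, 0.150, tick)
        print(f"tick={tick:.0f}s -> UI blocked about {frac:.0%} of the time")
```

Under these assumed numbers, polling takes longer than a 1 second tick, so the UI never gets a chance to repaint, while a 5 second tick leaves the UI usable but sluggish. That is at least consistent with what comment 10 describes.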

Comment 12 Bernie Innocenti 2009-06-12 23:45:56 UTC
virt-manager 0.7.0-5 or hg head is hanging for me too when connecting to a remote host running libvirt 0.6.4.

I initially thought it was a timing issue because I was seeing continuous I/O on the socket, but when I tried with virsh, I saw this:

virsh # list
 Id Name                 State
----------------------------------
  3 asterisk             running
 11 buildslave-debian-squeeze-64bit running

virsh # status
error: unknown command: 'status'
virsh # dominfo asterisk
Id:             3
Name:           asterisk
UUID:           c1cd7129-a3fd-0f94-97cc-5ce936417d88
OS Type:        hvm
State:          running
CPU(s):         2
CPU time:       56965.1s
Max memory:     524288 kB
Used memory:    524288 kB
Autostart:      enable
error: server closed connection

???  The server died?


Hope this helps.

Comment 13 Bernie Innocenti 2009-06-13 00:08:07 UTC
Increasing the delay to 10 seconds "cured" the issue for me, but the whole GUI becomes very unresponsive to the point of being unusable.

It seems we need to adopt a different strategy to handle connections with high latency without blocking the GTK event loop.

Comment 15 Cole Robinson 2009-06-14 20:51:51 UTC
I don't really think this is a blocker: comments 12 and 13 are actually against F11/Rawhide, and a bug was filed appropriately as #505693.

The original reporter only experienced this issue with remote connections over a VPN, which I wouldn't consider a huge issue, especially when X11 forwarding is likely a valid workaround.

There isn't really any magic fix for these types of issues either. We basically have to:

- Move non-critical polling into background threads (VM stats)
- Use libvirt domain events rather than poll every tick

Both require development and have nontrivial risks of regression, since they are in a critical code path.
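
As a rough illustration of the first item, the following is a minimal sketch of moving a slow polling call onto a background thread so that the UI thread only ever does a non-blocking read. It is not virt-manager's actual code: `BackgroundPoller`, `fetch_stats`, and `drain` are hypothetical names, and a real GTK application would hand the results back to the main loop through an idle callback rather than by calling `drain()` directly.

```python
import queue
import threading


class BackgroundPoller:
    """Runs a slow fetch function off the main thread and queues its results."""

    def __init__(self, fetch_stats, interval_s):
        self._fetch_stats = fetch_stats  # hypothetical blocking remote call
        self._interval_s = interval_s
        self._results = queue.Queue()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def _run(self):
        while not self._stop.is_set():
            # The slow network round trips happen here, not in the UI thread.
            self._results.put(self._fetch_stats())
            # wait() returns early if stop() is called during the pause.
            self._stop.wait(self._interval_s)

    def drain(self):
        """Called from the UI thread: return queued results without blocking."""
        items = []
        while True:
            try:
                items.append(self._results.get_nowait())
            except queue.Empty:
                return items
```

With this shape, a slow connection delays how fresh the stats are but does not stall repainting, because the only call made from the UI side returns immediately whether or not new data has arrived.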

Comment 16 Alexander Todorov 2009-06-15 07:50:53 UTC
(In reply to comment #15)

> The original reporter only experienced this issue with remote connections over
> a VPN, which I wouldn't consider a huge issue, especially when X11 forwarding
> is likely a valid workaround.

For the record: X11 forwarding is slow in my case and is not an option.

Comment 18 Cole Robinson 2009-10-13 16:44:35 UTC
The latest virt-manager (0.8.0) has most of the polling moved into a background thread, and I've had many reports of improved usability over a slow connection. However, backporting this is nontrivial and risky from a regression standpoint.

Since there haven't been any customer/partner issues regarding this deficiency, I don't think it is worth trying to fix with a backport for 5.5 or RHEL5 in general. Thanks to everyone for following up, but closing as WONTFIX.