Bug 1271519

Summary: Error message "Unable to connect to the graphic server 127.0.0.1:xxxx" pops up
Product: Red Hat Enterprise Linux 7 Reporter: Xiaodai Wang <xiaodwan>
Component: virt-viewer Assignee: Virt Viewer Maint <virt-viewer-maint>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium Docs Contact:
Priority: medium    
Version: 7.2 CC: cfergeau, dblechte, fidencio, jjongsma, juzhou, mxie, mzhan, pgrunt, rbalakri, tzheng, xiaodwan
Target Milestone: rc   
Target Release: 7.3   
Hardware: x86_64   
OS: Unspecified   
Whiteboard:
Fixed In Version: virt-viewer-2.0-7.el7 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-11-04 01:11:47 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description                                  Flags
error message box                            none
virt-viewer connect to guest successfully    none
virt-viewer connect to guest failure         none

Description Xiaodai Wang 2015-10-14 08:17:29 UTC
Created attachment 1082727 [details]
error message box

Description of problem:
Error message "Unable to connect to the graphic server 127.0.0.1:xxxx" pops up.

Version-Release number of selected component (if applicable):
virt-viewer-2.0-6.el7.x86_64
libvirt-1.2.17-13.el7.x86_64

How reproducible:
80%

Steps to Reproduce:
1. Use virt-viewer to connect to a guest that is shut down.
# G_MESSAGES_DEBUG=all virt-viewer rhel7.2-0924 --wait

2. Stop libvirtd service on host.
# service libvirtd stop

3. Start libvirtd service on host again.
# service libvirtd start
4. Start the guest.
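
The steps above can be collected into a small script. This is only a sketch under a few assumptions: the guest name is the one from step 1, `virsh start` is assumed for step 4 (the report does not say how the guest was started), the guest is assumed to be shut down already, and the `run` and `reproduce` helpers, the `DRY_RUN` switch, and the 2-second pause are made up for illustration. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch of the reproduction steps. Set DRY_RUN=0 to really run them (as root).
DOMAIN="${DOMAIN:-rhel7.2-0924}"   # guest name taken from step 1
DRY_RUN="${DRY_RUN:-1}"            # hypothetical switch: 1 = print only

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

reproduce() {
    # Step 1: start virt-viewer in the background, waiting for the guest.
    if [ "$DRY_RUN" = "0" ]; then
        G_MESSAGES_DEBUG=all virt-viewer "$DOMAIN" --wait &
    else
        echo "+ G_MESSAGES_DEBUG=all virt-viewer $DOMAIN --wait &"
    fi
    run sleep 2                  # arbitrary pause so virt-viewer can connect to libvirtd
    run service libvirtd stop    # step 2
    run service libvirtd start   # step 3
    run virsh start "$DOMAIN"    # step 4 (assumed command)
}

reproduce
```

Since the failure shows up as a dialog box, the script cannot detect it; it only automates the sequence so it can be repeated several times.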

Actual results:
Error message "Unable to connect to the graphic server 127.0.0.1:xxxx" pops up.

Expected results:
virt-viewer should connect to the guest correctly.

Additional info:
I attached the debug logs for both success and failure.

Comment 1 Xiaodai Wang 2015-10-14 08:18:56 UTC
Created attachment 1082729 [details]
virt-viewer connect to guest successfully

Comment 2 Xiaodai Wang 2015-10-14 08:19:32 UTC
Created attachment 1082730 [details]
virt-viewer connect to guest failure

Comment 4 Jonathon Jongsma 2015-10-14 16:18:51 UTC
The fact that you have these warnings even on the successful run makes me think that you are using a non-standard configuration of some sort:

** (virt-viewer:4874): WARNING **: Couldn't connect to accessibility bus: Failed to connect to socket /tmp/dbus-3yGBLPcl8U: Connection refused

(virt-viewer:5058): virt-viewer-DEBUG: Error: Failed to connect socket to '/var/run/libvirt/libvirt-sock-ro': No such file or directory



I have not been able to reproduce the failure here, and I don't see any of those warnings above. Can you give a little more information on your setup?

Is the client running on the same machine as the host? If not, what OS is the host?

Comment 5 Xiaodai Wang 2015-10-15 02:31:36 UTC
Yes, the client is running on the same machine as the host. 
The problem is easy to hit on the first attempt. After that, repeating the same steps rarely reproduces it; I needed about 8 to 12 tries to hit it again. So please try it several times.

Comment 6 Christophe Fergeau 2015-10-15 08:40:09 UTC
Is this being run as root?

Comment 7 Christophe Fergeau 2015-10-15 09:05:44 UTC
Actually, I can reproduce this on a 7.2 system. For what it's worth, building virt-viewer from 7.2 and running it on a Fedora box leads to a different kind of failure (repeated "cannot connect to libvirt socket" messages during libvirtd downtime).

Comment 8 Xiaodai Wang 2015-10-15 09:53:04 UTC
(In reply to Christophe Fergeau from comment #6)
> Is this being run as root?

yes

Comment 9 Christophe Fergeau 2015-10-15 13:24:33 UTC
Some experiments I've run


This triggers the bug:
$ virsh destroy $domain_name
$ virt-viewer $domain_name --wait
$ systemctl stop libvirtd.service && systemctl start libvirtd.service
$
$ virsh start $domain_name


This does not trigger the bug:
$ virsh destroy $domain_name
$ virt-viewer $domain_name --wait
$ systemctl stop libvirtd.service && systemctl start libvirtd.service
$
$ virsh start $other_domain && virsh destroy $other_domain && sleep 1 && virsh start $domain_name


This triggers the bug:
$ virsh destroy $domain_name
$ virt-viewer $domain_name --wait
$ systemctl stop libvirtd.service && systemctl start libvirtd.service
$
$ sleep 10 && virsh start $domain_name


Adding a 1 second g_usleep in virt_viewer_session_spice_open_host() right before the spice_session_connect() call also makes the bug go away in all cases.


Another thing to keep in mind is that after a libvirt restart, virt-viewer does some polling to detect domain startup and connect to it rather than using libvirt domain events. See virt_viewer_start_reconnect_poll() and virt_viewer_connect_timer() and VirtViewer::domain_event. If I force domain_event to always be -1 in virt_viewer_connect(), I can sometimes reproduce this issue without a libvirtd restart. Why it's more reliably reproducible with a libvirtd restart, I don't know.

My current theory is that by polling the libvirt domain, we get the connection information before the domain is fully ready (SPICE ports not initialized yet), and thus the connection fails. Maybe always using libvirt events instead of falling back to polling would fix this bug.
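
To illustrate the theory above (connecting before the port is ready), here is a rough shell sketch of a "retry until ready" helper. It is not what virt-viewer does and not the fix that was applied; `wait_until_ready` and its arguments are hypothetical names, and the probe is whatever command the caller picks that exits non-zero while the SPICE port is not yet accepting connections.

```shell
#!/bin/sh
# wait_until_ready ATTEMPTS DELAY_SECONDS PROBE...
# Runs PROBE up to ATTEMPTS times, sleeping DELAY_SECONDS after each failed
# try except the last. Returns 0 as soon as PROBE succeeds, 1 otherwise.
wait_until_ready() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}
```

For example, `wait_until_ready 10 1 some_port_probe` would try a (hypothetical) `some_port_probe` command once per second for up to ten attempts. This is in the same spirit as the 1 second g_usleep experiment, except that it stops waiting as soon as the probe succeeds.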

Comment 10 Christophe Fergeau 2015-10-15 15:44:00 UTC
https://www.redhat.com/archives/virt-tools-list/2015-October/msg00043.html avoids this bug: it makes sure we don't keep polling when connecting to a recent (0.8 or newer) libvirtd.
As only connections to a VM made through the polling code have this bug (I think), this patch will prevent the buggy situation from occurring.
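
As a side note on "0.8 or newer": libvirt reports its version as a single number, major * 1000000 + minor * 1000 + release. The sketch below only illustrates that arithmetic; how the patch itself performs the check is not shown in this report, `libvirt_version_number` and `is_recent_libvirt` are hypothetical helper names, and treating "0.8" as 0.8.0 is an assumption.

```shell
#!/bin/sh
# libvirt_version_number MAJOR.MINOR.RELEASE
# Prints major * 1000000 + minor * 1000 + release (missing parts count as 0).
libvirt_version_number() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the version string on dots
    IFS=$old_ifs
    echo $(( ${1:-0} * 1000000 + ${2:-0} * 1000 + ${3:-0} ))
}

# is_recent_libvirt VERSION
# Succeeds when VERSION is at least 0.8.0 (encoded as 8000, an assumption).
is_recent_libvirt() {
    [ "$(libvirt_version_number "$1")" -ge 8000 ]
}
```

With this encoding, libvirt 1.2.17 from the report becomes 1002017, well above the threshold.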

Comment 12 Fabiano Fidêncio 2016-03-15 12:26:45 UTC
Here is the relevant commit for virt-viewer:
c1e72c1be177544e5a718303dcb88412568f46d9

Comment 14 Xiaodai Wang 2016-04-14 09:22:29 UTC
I can reproduce it with these versions:
virt-viewer-2.0-6.el7.x86_64
libvirt-1.2.17-13.el7.x86_64

I then updated only virt-viewer to virt-viewer-2.0-7.el7.x86_64 and retried several times. The issue did not occur.

Finally, I updated libvirt to the latest version, libvirt-1.3.3-1.el7.x86_64, and retried several times. The issue still did not occur.

Moving the bug from ON_QA to VERIFIED.

Comment 16 errata-xmlrpc 2016-11-04 01:11:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2229.html