Bug 816451

Summary: virt-manager disconnects from libvirt when using spice+qxl: Failed to send keepalive response to client 0x7ff010004410
Product: Red Hat Enterprise Linux 6
Reporter: Geyang Kong <gkong>
Component: libvirt
Assignee: Martin Kletzander <mkletzan>
Status: CLOSED NEXTRELEASE
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Priority: medium
Version: 6.3
CC: acathrow, an.euroford, berrange, crobinso, dallan, dyasny, dyuan, jwu, mjenner, mzhan, rwu, sgordon, whuang, yupzhang, zpeng
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Last Closed: 2012-09-05 05:50:08 UTC
Type: Bug
Attachments:
Description | Flags
virt-manager debug information | none
libvirtd's log with log level=1 | none
virt-manager's log | none
client.log | none
new-libvirtd log file | none
libvirtd's log and client's log | none
logfiles | none

Description Geyang Kong 2012-04-26 06:00:01 UTC
Created attachment 580362 [details]
virt-manager debug information

Description of problem:
  Virt-manager lost connection to libvirt while installing a spice qxl guest.

Version-Release number of selected component (if applicable):
virt-manager-0.9.0-12.el6.x86_64
libvirt-0.9.10-14.el6.x86_64
qemu-kvm-0.12.1.2-2.275.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Launch virt-manager
2. Start the "Create a new virtual machine" wizard.
3. Follow the wizard to step 4, then check "Customize configuration before install".
4. Click "Finish", then change the display device to spice and the video device to qxl.
5. Click "Begin Installation".

Actual results:
1. A while after step 5, virt-manager loses its connection to libvirt. virt-manager --debug reports an error like the following:
------------------------------------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/share/virt-manager/virtManager/engine.py", line 440, in _tick
    conn.tick()
  File "/usr/share/virt-manager/virtManager/connection.py", line 1418, in tick
    self.hostinfo = self.vmm.getInfo()
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 3171, in getInfo
    if ret is None: raise libvirtError ('virNodeGetInfo() failed', conn=self)
libvirtError: Cannot write data: Broken pipe
-------------------------------------------------------------------------------------------------

Expected results:
1. Guest can be installed successfully and work normally.

Additional info:

Comment 2 Cole Robinson 2012-05-02 20:12:45 UTC
Hmm, is libvirtd crashing? There's a mention of a keepalive message in the logs but not sure what the result of a failure is there.

Comment 3 Geyang Kong 2012-05-03 05:13:51 UTC
I think libvirtd keeps running, because when I hit this error I still get the following output in a terminal:
------------------------------------------------------
[root@KP-T2 libvirt]# service libvirtd status
libvirtd (pid  4715) is running...
------------------------------------------------------
The following is the entire content of the libvirtd.log file; it is short, so I did not add it as an attachment.
------------------------------------------------------
2012-05-03 05:05:21.026+0000: 4715: info : libvirt version: 0.9.10, package: 14.el6 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2012-04-24-00:57:34, x86-004.build.bos.redhat.com)
2012-05-03 05:05:21.026+0000: 4715: warning : virKeepAliveTimer:182 : No response from client 0x1c9bb50 after 5 keepalive messages in 31 seconds
2012-05-03 05:05:21.048+0000: 4715: warning : qemuMonitorJSONHandleGraphics:675 : missing auth scheme in graphics event
2012-05-03 05:05:21.049+0000: 4715: warning : qemuMonitorJSONHandleGraphics:675 : missing auth scheme in graphics event
2012-05-03 05:05:21.049+0000: 4715: warning : qemuMonitorJSONHandleGraphics:675 : missing auth scheme in graphics event
2012-05-03 05:05:21.049+0000: 4715: warning : qemuMonitorJSONHandleGraphics:675 : missing auth scheme in graphics event
2012-05-03 05:05:21.050+0000: 4715: warning : qemuMonitorJSONHandleGraphics:675 : missing auth scheme in graphics event
2012-05-03 05:05:21.051+0000: 4715: warning : qemuMonitorJSONHandleGraphics:675 : missing auth scheme in graphics event
------------------------------------------------------
The libvirtd service's PID didn't change, so I think libvirtd was running the whole time but virt-manager thought libvirtd was dead.
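
To pick such keepalive timeouts out of a longer libvirtd.log, the warning can be matched with a regular expression. This is only a minimal sketch: the pattern is derived from the single virKeepAliveTimer line quoted above, and the function name is made up for illustration.

```python
import re

# Pattern based on the one virKeepAliveTimer warning quoted in this comment;
# other libvirt versions may word the message differently.
KEEPALIVE_RE = re.compile(
    r"virKeepAliveTimer:\d+ : No response from client (?P<client>0x[0-9a-f]+) "
    r"after (?P<count>\d+) keepalive messages in (?P<seconds>\d+) seconds"
)


def find_keepalive_timeouts(log_text):
    """Return (client, count, seconds) for each keepalive timeout warning."""
    results = []
    for line in log_text.splitlines():
        match = KEEPALIVE_RE.search(line)
        if match:
            results.append(
                (
                    match.group("client"),
                    int(match.group("count")),
                    int(match.group("seconds")),
                )
            )
    return results


# Two lines copied from the log above.
sample = (
    "2012-05-03 05:05:21.026+0000: 4715: warning : virKeepAliveTimer:182 : "
    "No response from client 0x1c9bb50 after 5 keepalive messages in 31 seconds\n"
    "2012-05-03 05:05:21.048+0000: 4715: warning : "
    "qemuMonitorJSONHandleGraphics:675 : missing auth scheme in graphics event\n"
)
print(find_keepalive_timeouts(sample))
```

Only the keepalive warning is reported; the qemuMonitorJSONHandleGraphics warnings are ignored.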

Comment 4 Geyang Kong 2012-05-08 02:54:43 UTC
Created attachment 582830 [details]
libvirtd's log with log level=1

Comment 5 Geyang Kong 2012-05-08 02:56:07 UTC
Created attachment 582831 [details]
virt-manager's log

Comment 6 Martin Kletzander 2012-05-09 09:46:39 UTC
As far as I can see from the logs, virt-manager stopped sending keepalive messages after the last one, sent at 16:47:30.656, and was therefore properly disconnected after 30 seconds.

Most likely virt-manager stopped calling libvirt's event loop after the spice connection to the host; however, the virt-manager log does not contain libvirt's debug information. If this is enough for you, you can re-assign the bug to virt-manager. If not, I'd need libvirt's client debug log together with virt-manager's log to confirm it. You can get it by running virt-manager like this:

LIBVIRT_DEBUG=1 LIBVIRT_LOG_OUTPUTS="1:file:client.log" virt-manager --debug

This writes the full debug log to the file "client.log".
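
When the reproduction is scripted, the same two environment variables can be set from Python. This is a rough sketch: the variable names and values come from the command above, while the helper names are hypothetical.

```python
import os
import subprocess


def build_debug_env(log_file="client.log", base_env=None):
    """Return a copy of the environment with libvirt client debugging enabled."""
    env = dict(os.environ if base_env is None else base_env)
    env["LIBVIRT_DEBUG"] = "1"
    # Same value as in the shell command: level 1, written to a file.
    env["LIBVIRT_LOG_OUTPUTS"] = "1:file:%s" % log_file
    return env


def run_virt_manager_with_debug(log_file="client.log"):
    """Start virt-manager --debug with the debug environment.

    Defined for illustration only; nothing in this sketch calls it.
    """
    return subprocess.call(["virt-manager", "--debug"], env=build_debug_env(log_file))


print(build_debug_env("client.log", base_env={}))
```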

However, as I mentioned earlier, this looks to me like a clear virt-manager bug, so if this information is enough for you, I'd recommend re-assigning it to virt-manager right away.

Comment 7 Daniel Berrangé 2012-05-09 14:11:11 UTC
Please tell us what the guest OS is and which version of the QXL driver RPM is installed in the guest.

Comment 10 Geyang Kong 2012-05-11 02:49:07 UTC
Created attachment 583700 [details]
client.log

Comment 11 Geyang Kong 2012-05-11 03:01:25 UTC
I have attached client.log. My guest OS is RHEL 6.2, and I get the following output in the guest terminal:
[root@localhost Desktop]# rpm -qa | grep qxl
xorg-x11-drv-qxl-0.0.14-10.el6_2.x86_64

Comment 12 Stephen Gordon 2012-06-07 14:02:42 UTC
For what it's worth I also get this behaviour using the upstream version in Fedora 17:

xorg-x11-drv-qxl-0.0.22-0.fc17.x86_64

Comment 14 Dave Allan 2012-07-18 15:26:45 UTC
Cole, you moved this from virt-manager to libvirt.  Given the error message about keepalive loss, what makes you think that this isn't virt-manager?  I don't want to get into finger pointing, so what can we do to nail down where the problem lies?

Comment 15 Cole Robinson 2012-07-18 16:47:37 UTC
(In reply to comment #14)
> Cole, you moved this from virt-manager to libvirt.  Given the error message
> about keepalive loss, what makes you think that this isn't virt-manager?  I
> don't want to get into finger pointing, so what can we do to nail down where
> the problem lies?

In virt-manager we don't touch any of the libvirt keepalive APIs, so anything mentioning them is likely something libvirt is doing internally.

Granted, the root cause could be virt-manager, but given that the past few months have seen a whole bunch of keepalive-related fixes in libvirt upstream, I'm pretty sure this isn't virt-manager's fault.

Comment 16 Martin Kletzander 2012-07-18 20:58:19 UTC
It may not seem like it, but it is enough to have virEventRegisterDefaultImpl in the code and then get stuck for 30 seconds without calling virEventRunDefaultImpl, and poof, the connection is gone.
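
The timing can be illustrated with a toy model. This is a sketch and not libvirt's actual code: it assumes a keepalive interval of 5 seconds and a count of 5 (consistent with the "5 keepalive messages in 31 seconds" warning logged earlier), and it assumes the connection is dropped one interval after the last unanswered request, which gives roughly 30 seconds of silence. The function and parameter names are made up.

```python
def first_disconnect(activity_times, end, interval=5, count=5):
    """Return when a silent client would be dropped, or None if it never is.

    activity_times: sorted times (seconds since connecting) at which the
        client ran its event loop and so answered pending keepalives.
    end: time at which the observation stops.
    Simplified model: after `count` unanswered requests sent `interval`
    seconds apart, the next timer tick closes the connection.
    """
    if interval <= 0:
        # Assumption: a non-positive interval means no keepalive requests.
        return None
    timeout = (count + 1) * interval
    previous = 0.0
    for t in list(activity_times) + [end]:
        if t - previous >= timeout:
            return previous + timeout
        previous = t
    return None


# Event loop serviced every 10 seconds: the connection survives.
print(first_disconnect([10, 20, 30, 40, 50], end=60))
# Event loop stalls after t=15: dropped 30 seconds later.
print(first_disconnect([5, 10, 15], end=60))
```

In this model any gap of (count + 1) * interval seconds between event loop iterations is enough to lose the connection, no matter how healthy the client was before.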

However, Cole may be right about the keepalive code. We had some problems with that. To help us get closer to the epicenter of the problem, could you please try to reproduce this with keepalives turned off? Using the latest build might help as well.

You can turn them off server-side by setting keepalive_interval=-1 in libvirtd.conf and restarting libvirtd.
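
To double-check that the setting actually made it into the file before restarting, the value can be read back. This is a minimal sketch that assumes simple "key = value" lines with "#" comments; the helper name and the sample text are invented for illustration.

```python
def read_keepalive_interval(conf_text):
    """Return the last uncommented keepalive_interval value, or None if unset."""
    value = None
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "=" not in line:
            continue
        key, _, val = line.partition("=")
        if key.strip() == "keepalive_interval":
            value = int(val.strip())
    return value


# Hypothetical excerpt, not copied from a real libvirtd.conf.
sample_conf = """
#keepalive_interval = 5
log_level = 1
keepalive_interval = -1
"""
interval = read_keepalive_interval(sample_conf)
print(interval, "server-side keepalives off" if interval == -1 else "keepalives on")
```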

Thanks, Martin

Comment 17 Geyang Kong 2012-07-19 02:07:14 UTC
It looks like the issue is still there. I retried with libvirt-0.9.10-21.el6_3.3.x86_64, with keepalive_interval=-1 and log_level=1 set, and the issue still occurs. The new log file is attached as new-libvirtd.log.zip.

Comment 18 Geyang Kong 2012-07-19 02:08:04 UTC
Created attachment 599035 [details]
new-libvirtd log file

Comment 19 Martin Kletzander 2012-07-23 12:05:35 UTC
Could you please attach the client log as well? If you no longer have it from the last run, please attach both again so that the client log matches the server log. Thanks.

Comment 20 Geyang Kong 2012-07-24 02:25:46 UTC
Created attachment 599888 [details]
libvirtd's log and client's log

New log files attached.

Comment 21 Martin Kletzander 2012-07-24 14:44:03 UTC
(In reply to comment #20)
Thanks, but you apparently attached /var/log/libvirt/libvirtd.log and /var/log/libvirt/qemu/T3.log. What we need instead of the second log is the debug output of virt-manager, including libvirt's debug info (as explained earlier in comment #6).

Comment 22 Geyang Kong 2012-07-25 02:02:24 UTC
Created attachment 600201 [details]
logfiles

Client.log and libvirtd.log files attached.

Comment 23 Martin Kletzander 2012-08-31 15:13:17 UTC
I'm sorry, but I still can't find anything wrong in libvirt. Could you check whether this problem still occurs with the latest libvirt/virt-manager/qemu-kvm packages?
Thanks, Martin.

Comment 24 Geyang Kong 2012-09-03 07:28:14 UTC
Unfortunately I cannot reproduce this bug anymore with the builds listed here:
libvirt-0.10.1-1.el6.x86_64
qemu-kvm-0.12.1.2-2.308.el6.x86_64
virt-manager-0.9.0-14.el6.x86_64

Comment 25 Martin Kletzander 2012-09-03 08:54:38 UTC
(In reply to comment #24)
Since this might have been fixed in libvirt or qemu-kvm, I suggest one of these things:
 1) try to reproduce it while upgrading only one of the packages, then bisect until you find the version in which it was fixed. Then we'd have to bisect the commits in the repo to find the root cause.
 2) close this bug as NEXTRELEASE

I lean toward the latter, obviously, but if you want to figure out what the particular problem was, you can try the first option (I'd like to help, but I have been unable to reproduce it since it was first reported).
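
The bisection in option 1 could be sketched as a binary search over an ordered list of builds. This assumes the bug reproduces on the oldest build, does not reproduce on the newest, and stays fixed once fixed. The build labels and the reproduces() check below are hypothetical stand-ins for installing a build and retrying the steps by hand.

```python
def first_fixed_build(builds, reproduces):
    """Return the first build on which the bug no longer reproduces.

    builds: ordered oldest to newest; the bug must reproduce on builds[0]
        and must not reproduce on builds[-1].
    reproduces: callable returning True if the bug shows up on a build.
    """
    low, high = 0, len(builds) - 1  # builds[low] is bad, builds[high] is good
    while high - low > 1:
        mid = (low + high) // 2
        if reproduces(builds[mid]):
            low = mid
        else:
            high = mid
    return builds[high]


# Hypothetical build labels; pretend the fix landed in build-4.
builds = ["build-1", "build-2", "build-3", "build-4", "build-5", "build-6"]


def reproduces(build):
    return int(build.split("-")[1]) < 4


print(first_fixed_build(builds, reproduces))
```

With N builds this needs about log2(N) manual reproductions rather than N, which matters when each attempt means a guest installation.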

Comment 26 Geyang Kong 2012-09-03 11:09:49 UTC
The last build on which I can reproduce this bug is:

[root@KP-T1 ~]# rpm -qa | grep libvirt
libvirt-python-0.9.10-21.el6_3.4.x86_64
libvirt-0.9.10-21.el6_3.4.x86_64
libvirt-client-0.9.10-21.el6_3.4.x86_64
libvirt-debuginfo-0.9.10-21.el6_3.4.x86_64

After upgrading libvirt to libvirt-0.10.0-0rc1.el6.x86_64.rpm, the issue is gone. I didn't modify my qemu-kvm packages; they stayed at qemu-kvm-0.12.1.2-2.295.el6.x86_64 throughout.

Comment 27 Dave Allan 2012-09-04 14:03:56 UTC
I am generally in favor of finding root cause, but in this case given that the behavior was 100% reproducible and now does not reproduce, I'm ok with closing without determining which commit fixed it.

Comment 28 Martin Kletzander 2012-09-05 05:50:08 UTC
Based on comment #27, I'm closing this bug. Feel free to reopen it if the issue reappears.
If you want to continue with the search for the commit that fixed this, feel free to contact me by mail or IRC, I'll be glad to help with that.