Bug 816451
Summary: | virt-manager disconnects from libvirt when using spice+qxl: Failed to send keepalive response to client 0x7ff010004410 | | |
---|---|---|---
Product: | Red Hat Enterprise Linux 6 | Reporter: | Geyang Kong <gkong>
Component: | libvirt | Assignee: | Martin Kletzander <mkletzan>
Status: | CLOSED NEXTRELEASE | QA Contact: | Virtualization Bugs <virt-bugs>
Severity: | medium | Docs Contact: | |
Priority: | medium | | |
Version: | 6.3 | CC: | acathrow, an.euroford, berrange, crobinso, dallan, dyasny, dyuan, jwu, mjenner, mzhan, rwu, sgordon, whuang, yupzhang, zpeng
Target Milestone: | rc | | |
Target Release: | --- | | |
Hardware: | x86_64 | | |
OS: | Linux | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: | |
Last Closed: | 2012-09-05 05:50:08 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Attachments: | | | |
Hmm, is libvirtd crashing? There's a mention of a keepalive message in the logs, but I'm not sure what the result of a failure is there.

I think libvirtd is still running, since when I hit this error I get the following output from the terminal:

------------------------------------------------------

[root@KP-T2 libvirt]# service libvirtd status
libvirtd (pid 4715) is running...

------------------------------------------------------

The following is everything in the libvirtd.log file; it was short, so I didn't add it as an attachment.

------------------------------------------------------

2012-05-03 05:05:21.026+0000: 4715: info : libvirt version: 0.9.10, package: 14.el6 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2012-04-24-00:57:34, x86-004.build.bos.redhat.com)
2012-05-03 05:05:21.026+0000: 4715: warning : virKeepAliveTimer:182 : No response from client 0x1c9bb50 after 5 keepalive messages in 31 seconds
2012-05-03 05:05:21.048+0000: 4715: warning : qemuMonitorJSONHandleGraphics:675 : missing auth scheme in graphics event
2012-05-03 05:05:21.049+0000: 4715: warning : qemuMonitorJSONHandleGraphics:675 : missing auth scheme in graphics event
2012-05-03 05:05:21.049+0000: 4715: warning : qemuMonitorJSONHandleGraphics:675 : missing auth scheme in graphics event
2012-05-03 05:05:21.049+0000: 4715: warning : qemuMonitorJSONHandleGraphics:675 : missing auth scheme in graphics event
2012-05-03 05:05:21.050+0000: 4715: warning : qemuMonitorJSONHandleGraphics:675 : missing auth scheme in graphics event
2012-05-03 05:05:21.051+0000: 4715: warning : qemuMonitorJSONHandleGraphics:675 : missing auth scheme in graphics event

------------------------------------------------------

The libvirtd service's PID didn't change, so I think libvirtd was running the whole time but virt-manager thought libvirtd was dead.

Created attachment 582830 [details]
libvirtd's log with log level=1
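For anyone triaging similar logs, here is a minimal sketch for pulling the keepalive timeout warnings out of a libvirtd.log. It is not part of libvirt; the function name `find_keepalive_timeouts` and the regular expression are assumptions modelled only on the single `virKeepAliveTimer` line quoted above, so other libvirt versions may word the message differently.

```python
import re

# Pattern modelled on the virKeepAliveTimer warning quoted in this bug.
# Assumption: other libvirt versions may format the message differently.
KEEPALIVE_RE = re.compile(
    r"^(?P<timestamp>\S+ \S+): (?P<pid>\d+): warning : virKeepAliveTimer:\d+ : "
    r"No response from client (?P<client>0x[0-9a-f]+) "
    r"after (?P<count>\d+) keepalive messages in (?P<seconds>\d+) seconds"
)


def find_keepalive_timeouts(log_text):
    """Return one dict per keepalive timeout warning found in log_text."""
    hits = []
    for line in log_text.splitlines():
        match = KEEPALIVE_RE.match(line.strip())
        if match:
            hit = match.groupdict()
            hit["count"] = int(hit["count"])
            hit["seconds"] = int(hit["seconds"])
            hits.append(hit)
    return hits


# Two lines copied from the log excerpt in this bug.
SAMPLE = """\
2012-05-03 05:05:21.026+0000: 4715: warning : virKeepAliveTimer:182 : No response from client 0x1c9bb50 after 5 keepalive messages in 31 seconds
2012-05-03 05:05:21.048+0000: 4715: warning : qemuMonitorJSONHandleGraphics:675 : missing auth scheme in graphics event
"""

for hit in find_keepalive_timeouts(SAMPLE):
    print(hit["timestamp"], hit["client"], hit["count"], hit["seconds"])
```

Only the first sample line matches; the `qemuMonitorJSONHandleGraphics` warnings are ignored, which makes it easier to line up the disconnect time with the client-side log.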
Created attachment 582831 [details]
virt-manager's log
As far as I can see from the logs, virt-manager stopped sending keepalive messages after the last one was sent at 16:47:30.656, and thus was properly disconnected after 30 seconds. I'd most likely say that virt-manager stopped calling libvirt's event loop after the spice connection to the host; however, the virt-manager log does not include libvirt's debug information. If this is enough for you, you can re-assign it to virt-manager; if not, I'd need libvirt's client debug log together with virt-manager's to confirm it. You can get it by running virt-manager like this:

LIBVIRT_DEBUG=1 LIBVIRT_LOG_OUTPUTS="1:file:client.log" virt-manager --debug

This will write the full debug log to the file "client.log". However, as I mentioned earlier, this looks to me like a clear virt-manager bug, and if this information is enough for you, I'd recommend re-assigning it to virt-manager right away.

Please tell us what the guest OS is, and what version of the QXL driver RPM is installed in the guest.

Created attachment 583700 [details]
client.log
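If the debug run needs to be scripted rather than typed, the same two environment variables from the command above can be set from Python. This is only a sketch: the helper names `debug_env` and `debug_command_line` are made up for illustration, and the block only prints the command instead of launching virt-manager.

```python
import os
import shlex


def debug_env(log_path="client.log", base_env=None):
    """Return a copy of the environment with libvirt client debugging enabled."""
    env = dict(os.environ if base_env is None else base_env)
    env["LIBVIRT_DEBUG"] = "1"
    env["LIBVIRT_LOG_OUTPUTS"] = "1:file:" + log_path
    return env


def debug_command_line(log_path="client.log"):
    """Render the equivalent shell one-liner for copy and paste."""
    env = debug_env(log_path, base_env={})
    assignments = " ".join(
        "{}={}".format(name, shlex.quote(value)) for name, value in env.items()
    )
    return assignments + " virt-manager --debug"


print(debug_command_line())
# To actually launch it (not done here), something like:
#   subprocess.call(["virt-manager", "--debug"], env=debug_env())
```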
I have attached the client.log. My guest OS is RHEL6.2, and I get the following output from the guest terminal:

[root@localhost Desktop]# rpm -qa | grep qxl
xorg-x11-drv-qxl-0.0.14-10.el6_2.x86_64

For what it's worth, I also get this behaviour using the upstream version in Fedora 17: xorg-x11-drv-qxl-0.0.22-0.fc17.x86_64

Cole, you moved this from virt-manager to libvirt. Given the error message about keepalive loss, what makes you think that this isn't virt-manager? I don't want to get into finger pointing, so what can we do to nail down where the problem lies?

(In reply to comment #14)
> Cole, you moved this from virt-manager to libvirt. Given the error message
> about keepalive loss, what makes you think that this isn't virt-manager? I
> don't want to get into finger pointing, so what can we do to nail down where
> the problem lies?

In virt-manager we don't touch any of the libvirt keepalive APIs, so anything mentioning them is likely something libvirt is doing internally. Granted, the root cause could be virt-manager, but given that the past few months have had a whole bunch of keepalive-related fixes in libvirt upstream, I'm pretty sure this isn't virt-manager's fault.

It may not seem like it, but it is enough to have virEventRegisterDefaultImpl in the code and then get stuck for 30 seconds without calling virEventRunDefaultImpl and poof, the connection is gone. However, Cole may be right about the keepalive code. We had some problems with that. To help us get closer to the epicenter of the problem, could you please try to reproduce this with keepalives turned off? Having the latest build might help as well. You can turn them off server-side by setting keepalive_interval=-1 in libvirtd.conf and restarting libvirtd. Thanks, Martin

It looks like the issue is still there. I retried with libvirt-0.9.10-21.el6_3.3.x86_64 and set keepalive_interval=-1 and log_level=1, and the issue still occurs. The new log file is attached as new-libvirtd.log.zip.

Created attachment 599035 [details]
new-libvirtd log file
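To make the "stuck for 30 seconds" remark concrete, here is a rough sketch of the server-side arithmetic. It assumes keepalive_interval = 5 and keepalive_count = 5 (neither value is stated in this bug, although they match the "5 keepalive messages in 31 seconds" warning in the server log) and the approximation interval * (count + 1) given in the libvirtd.conf comments. The function names are hypothetical, and the sketch models only the server side, not keepalives initiated by the client.

```python
def keepalive_timeout(interval, count):
    """Approximate seconds of client silence before the server drops the connection.

    Returns None when interval is -1, meaning the server never sends
    keepalive requests and so never times the client out this way.
    """
    if interval == -1:
        return None
    return interval * (count + 1)


def survives(stall_seconds, interval=5, count=5):
    """True if a client whose event loop is stalled for stall_seconds stays connected.

    The defaults of 5 and 5 are assumptions, not values taken from this bug.
    """
    timeout = keepalive_timeout(interval, count)
    return timeout is None or stall_seconds < timeout


print(keepalive_timeout(5, 5))        # seconds of silence tolerated with the assumed defaults
print(survives(10), survives(31))     # a short stall vs. a stall past the timeout
print(survives(31, interval=-1))      # server-side keepalive disabled
```

With the assumed defaults, a client that stops servicing its event loop for longer than about 30 seconds is dropped, which is the scenario described above. Setting keepalive_interval=-1 removes that server-side timer, so a disconnect that still happens afterwards has to come from somewhere else.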
Could you please attach the log from the client as well? If you don't have it from the last time, then attach both so the client log matches the server log, thanks.

Created attachment 599888 [details]
libvirtd's log and client's log
New log files attached.
(In reply to comment #20)
Thanks, but you apparently attached /var/log/libvirt/libvirtd.log and /var/log/libvirt/qemu/T3.log. What we need instead of the second log is the debug output of virt-manager (with libvirt debug info as well), as explained earlier in comment #6.

Created attachment 600201 [details]
logfiles
Client.log and libvirtd.log files attached.
I'm sorry, but I still can't find anything wrong in libvirt. Could you check whether this problem remains with the latest packages of libvirt/virt-manager/qemu-kvm? Thanks, Martin.

Unfortunately I cannot reproduce this bug anymore with the builds listed here:
libvirt-0.10.1-1.el6.x86_64
qemu-kvm-0.12.1.2-2.308.el6.x86_64
virt-manager-0.9.0-14.el6.x86_64

(In reply to comment #24)
Since this might have been fixed in libvirt or qemu-kvm, I suggest one of these things:
1) Try to reproduce it with only one of the packages and then bisect until you find the version it was fixed in. Then we'd have to bisect the commits in the repo to find the root cause.
2) Close this bug as NEXTRELEASE.
I lean towards the latter, obviously, but in case you want to figure out what the particular problem was, you can try the first option (I'd like to help, but I have been unable to reproduce it since this was first reported).

The last build on which I can reproduce this bug is:
[root@KP-T1 ~]# rpm -qa | grep libvirt
libvirt-python-0.9.10-21.el6_3.4.x86_64
libvirt-0.9.10-21.el6_3.4.x86_64
libvirt-client-0.9.10-21.el6_3.4.x86_64
libvirt-debuginfo-0.9.10-21.el6_3.4.x86_64
After upgrading libvirt to libvirt-0.10.0-0rc1.el6.x86_64.rpm, the issue is gone. I didn't modify my qemu-kvm packages; qemu-kvm stayed at qemu-kvm-0.12.1.2-2.295.el6.x86_64 throughout.

I am generally in favor of finding the root cause, but in this case, given that the behavior was 100% reproducible and now does not reproduce, I'm OK with closing without determining which commit fixed it.

Based on comment #27, I'm closing this bug; feel free to reopen it if the issue turns out not to be fixed. If you want to continue the search for the commit that fixed this, feel free to contact me by mail or IRC; I'll be glad to help with that.
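For reference, the bisection suggested in option 1 could be sketched roughly as below. Nobody in this bug ran it. The helper `first_fixed_build`, the `reproduces` callback and the intermediate build labels are all hypothetical; only the first (known bad) and last (known good) builds come from the comments above, and the example outcome is made up purely to exercise the code.

```python
def first_fixed_build(builds, reproduces):
    """Binary-search an ordered list of builds for the first one without the bug.

    builds: oldest to newest; builds[0] must reproduce the bug and
            builds[-1] must not.
    reproduces: callable returning True if the bug still occurs on a build
                (in practice a manual install-and-test step).
    """
    if len(builds) < 2:
        raise ValueError("need at least one bad and one good build")
    lo, hi = 0, len(builds) - 1  # invariant: builds[lo] is bad, builds[hi] is good
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if reproduces(builds[mid]):
            lo = mid
        else:
            hi = mid
    return builds[hi]


# Endpoints are from this bug; the three middle labels are placeholders.
BUILDS = [
    "libvirt-0.9.10-21.el6_3.4",      # last build where the bug reproduced
    "hypothetical-intermediate-1",
    "hypothetical-intermediate-2",
    "hypothetical-intermediate-3",
    "libvirt-0.10.0-0rc1.el6",        # first build reported as working
]

# Stand-in for manual testing: pretend the fix landed in intermediate-2.
STILL_BROKEN = {"libvirt-0.9.10-21.el6_3.4", "hypothetical-intermediate-1"}

print(first_fixed_build(BUILDS, lambda build: build in STILL_BROKEN))
```

The same loop applies to commits once the fixed package version is known, although `git bisect` with good and bad swapped would be the more usual tool at that stage.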
Created attachment 580362 [details]
virt-manager debug information

Description of problem:
Virt-manager lost its connection to libvirt while installing a spice qxl guest.

Version-Release number of selected component (if applicable):
virt-manager-0.9.0-12.el6.x86_64
libvirt-0.9.10-14.el6.x86_64
qemu-kvm-0.12.1.2-2.275.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Launch virt-manager.
2. Start the "Create a new virtual machine" wizard.
3. Follow the wizard to step 4, then check "Customize configuration before install".
4. Click "Finish", change the display device to spice, and change the video device to qxl.
5. Click "Begin Installation".

Actual results:
1. After step 5, after waiting a while, virt-manager loses its connection to libvirt. virt-manager --debug shows an error like the following:

------------------------------------------------------------------------------------------------

Traceback (most recent call last):
  File "/usr/share/virt-manager/virtManager/engine.py", line 440, in _tick
    conn.tick()
  File "/usr/share/virt-manager/virtManager/connection.py", line 1418, in tick
    self.hostinfo = self.vmm.getInfo()
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 3171, in getInfo
    if ret is None: raise libvirtError ('virNodeGetInfo() failed', conn=self)
libvirtError: Cannot write data: Broken pipe

-------------------------------------------------------------------------------------------------

Expected results:
1. The guest can be installed successfully and works normally.

Additional info: