Bug 796451
Summary: | Virsh hangs when connecting to local qemu-kvm (FC16 running as VMware guest)
---|---
Product: | [Fedora] Fedora
Component: | libvirt
Version: | 16
Hardware: | x86_64
OS: | Linux
Status: | CLOSED ERRATA
Severity: | high
Priority: | unspecified
Reporter: | Matt <matt>
Assignee: | Osier Yang <jyang>
QA Contact: | Fedora Extras Quality Assurance <extras-qa>
CC: | ajia, berrange, clalancette, crobinso, dallan, dougsland, itamar, jforbes, laine, libvirt-maint, matt, orthostatic, veillard, virt-maint
Fixed In Version: | libvirt-0.9.6-5.fc16
Doc Type: | Bug Fix
Last Closed: | 2012-03-17 23:45:09 UTC
Can you provide a backtrace of all the libvirtd threads with bt -a when this problem is occurring? And when you reproduce the hang, is dmidecode running?

    ps axwww | grep dmide

Hi,

1. Attachment created: backtrace of libvirtd attached. I did not fully understand your instructions, but I hope this is the information that you require. Let me know if there's anything more that you want. The gdb commands that I used are in the attachment.

2. Results of ps axwww | grep dmide:

    1484 ? S 0:00 /usr/sbin/dmidecode -q -t 0,1,4,17

Matt

Created attachment 565125: libvirtd backtrace (all threads)
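For readers who want to capture the same kind of all-thread backtrace non-interactively, here is a minimal sketch. It assumes gdb is installed and that the usual gdb command for dumping every thread, `thread apply all bt`, is what was meant by the request above. The helper name `gdb_all_threads_argv` and the PID in the usage note are hypothetical and not taken from this report.

```python
def gdb_all_threads_argv(pid):
    """Build a gdb command line that prints a backtrace of every thread in `pid`.

    Uses gdb's -p (attach to PID), -batch (exit when done) and -ex (run a
    command) options. This only builds the argument list; it does not run gdb.
    """
    pid = int(pid)
    if pid <= 0:
        raise ValueError("pid must be a positive integer")
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]
```

The resulting list can be passed to `subprocess.run(gdb_all_threads_argv(1234))`, typically as root and with the libvirt debuginfo packages installed so that the frames are symbolized.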
Yeah, I've heard of this issue before, the dmidecode hang in VMware guests. I think there's a patch upstream for it. Eric, do you know more about this?

Matt, that's what I was looking for. I have the same thought Cole did, which is that this is dmidecode related. Are you willing to try building upstream libvirt to see if it makes the problem go away? I'm not convinced it's fixed upstream yet, but if you can repro this at will and test builds I'm sure we can figure it out.

Sure Dave. Can you provide me some high-level instructions, or point me to a site that might have something similar?

Thanks,
Matt

bug 783453 is another example of a dmidecode hang; F16 does not (yet) have the two patches mentioned in that bug:

    commit 06b9c5b9231ef4dbd4b5ff69564305cd4f814879
    Author: Michal Privoznik <mprivozn>
    Date:   Tue Jan 3 18:40:55 2012 +0100

        virCommand: Properly handle POLLHUP

        It is a good practise to set revents to zero before doing any poll().
        Moreover, we should check if event we waited for really occurred or
        if any of fds we were polling on didn't encountered hangup.

    commit d19149dda888d36cea58b6cdf7446f98bd1bf734
    Author: Laszlo Ersek <lersek>
    Date:   Tue Jan 24 15:55:19 2012 +0100

        virCommandProcessIO(): make poll() usage more robust

        POLLIN and POLLHUP are not mutually exclusive. Currently the following
        seems possible: the child writes 3K to its stdout or stderr pipe, and
        immediately closes it. We get POLLIN|POLLHUP (I'm not sure that's
        possible on Linux, but SUSv4 seems to allow it). We read 1K and throw
        away the rest.

But it is not certain whether those two patches are all that's needed, or whether we need yet a third patch backported to the F16 build.

After a bit of investigation, I am currently building the fc17 version of libvirt from the source RPM. During the build of libvirt-0.9.10-1 from the fc17 source repo, the test for virsh-all hung. It seems that dmidecode was the issue again: the build continued once I had terminated the dmidecode process.
Once the new RPM was installed (and once I had disabled TLS auth :) ), the problem is solved. Both virsh and virt-manager connect without issue.

P.S. There was a sanlock >= 0.8 dependency that I ignored for now, as I don't have shared storage.

So now the question is, are the two patches Eric mentioned sufficient, or is there some other required commit? Osier, I'm about to go offline for the day, would you mind spinning an F16 test build with just the two patches and see if it still fixes the problem?

(In reply to comment #13)
> So now the question is, are the two patches Eric mentioned sufficient, or is
> there some other required commit? Osier, I'm about to go offline for the day,
> would you mind spinning an F16 test build with just the two patches and see if
> it still fixes the problem?

Let me do it.

(In reply to comment #14)
> (In reply to comment #13)
> > So now the question is, are the two patches Eric mentioned sufficient, or is
> > there some other required commit? Osier, I'm about to go offline for the day,
> > would you mind spinning an F16 test build with just the two patches and see if
> > it still fixes the problem?
>
> Let me do it.

Tested by installing VMware Workstation 8 with an fc16 guest: the problem was resolved with exactly those two patches applied in the testing build.

libvirt-0.9.6-5.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/libvirt-0.9.6-5.fc16

Package libvirt-0.9.6-5.fc16:
* should fix your issue,
* was pushed to the Fedora 16 testing repository,
* should be available at your local mirror within two days.

Update it with:

    # su -c 'yum update --enablerepo=updates-testing libvirt-0.9.6-5.fc16'

as soon as you are able to. Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2012-3067/libvirt-0.9.6-5.fc16
then log in and leave karma (feedback).

libvirt-0.9.6-5.fc16 has been pushed to the Fedora 16 stable repository.
If problems still persist, please make note of it in this bug report.

I apologize for the noise, devs. I'm posting this to benefit those searching for RHEL solutions to this very problem. :)

This problem with libvirt exists in RHEL 6.2, and I stumbled upon it while preparing for RHCSA/RHCE recertification. My study environment consists of VMware Workstation 8.0.2-591240 and RHEL 6.2.

This is fixed in RHEL 6.3 beta as of 2012/04/25.
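The two commit messages quoted earlier in this thread describe a general rule for reading a child's pipe with poll(): POLLIN and POLLHUP can arrive together, so the reader should keep reading until end-of-file rather than stopping at the first hangup. The following is a minimal Python sketch of that rule on Linux, not the libvirt code itself. The function name `drain_child_output` and the 1 KiB chunk size are assumptions chosen to mirror the "writes 3K, we read 1K" scenario from the second commit message.

```python
import os
import select
import subprocess
import sys


def drain_child_output(argv, chunk_size=1024):
    """Run `argv` and return every byte it wrote to stdout, using poll()."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE)
    fd = proc.stdout.fileno()
    poller = select.poll()
    poller.register(fd, select.POLLIN)

    chunks = []
    done = False
    while not done:
        for _fd, revents in poller.poll():
            if revents & (select.POLLIN | select.POLLHUP):
                # Treat POLLIN and POLLHUP alike: there may still be buffered
                # data after the writer has closed its end of the pipe.
                data = os.read(fd, chunk_size)
                if data:
                    chunks.append(data)
                    continue
            # An empty read means EOF; anything else here is POLLERR/POLLNVAL.
            poller.unregister(fd)
            done = True

    proc.stdout.close()
    proc.wait()
    return b"".join(chunks)
```

The loop only stops once a read returns no data, so a child that writes 3 KiB and exits immediately is still read in full, in three 1 KiB chunks.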
Created attachment 565119: gdb backtrace of virsh

Description of problem:
Using FC16 as a VMware Workstation 8 guest with Intel VT-x virtualisation so that I can test KVM. After installing libvirt & qemu-kvm I am unable to connect to the local hypervisor with virsh (or virt-manager, for that matter). Running the fallback Gnome desktop environment and the latest updates. I have tried disabling auth (set to none) in libvirtd.conf and disabling SELinux (setenforce 0). Also tried as both a standard user and root.

Version-Release number of selected component (if applicable):
* FC16 stock with all updates (also tested with testing updates)
* Kernel 3.2.6-3.fc16.x86_64
* libvirt 0.9.6-4.fc16

How reproducible:
Have reproduced on another system, using a fresh FC16 install as a VMware Workstation 8 guest. Same results.

Steps to Reproduce:
1. Install FC16 as VMware guest with Intel VT-x virtualisation
2. Install qemu-kvm & libvirt
3. Type virsh --connect qemu:///system

Actual results:
Process hangs until ^C

Expected results:
Virsh prompt connected to local hypervisor

Additional info:
In the hope that it is useful, I have attached a gdb backtrace taken while it is hanging. I ran debuginfo-install libvirt, then:

    virsh --connect qemu:///system &
    gdb attach [processid]
    backtrace

See attachment for backtrace
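When reproducing a hang like the one in the steps above, it can help to run the command under a timeout so a test script does not block until ^C. Below is a minimal sketch using only the Python standard library. The helper name `run_with_timeout` is hypothetical, and the timeout value is an arbitrary choice for illustration.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout):
    """Return the command's stdout, or None if it did not finish in `timeout` seconds."""
    try:
        result = subprocess.run(
            argv,
            stdin=subprocess.DEVNULL,  # never wait on terminal input
            capture_output=True,
            text=True,
            timeout=timeout,           # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return None
    return result.stdout
```

On an affected system, a call such as `run_with_timeout(["virsh", "--connect", "qemu:///system", "list"], timeout=10)` would be expected to return None rather than a domain list, although that has not been verified here.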