Bug 559755 - libvirt can't connect to kvm hypervisor on a certain machine at rhts with RHEL5.5-Server-20100117.0 tree.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: libvirt
Version: 5.5
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Assignee: Jiri Denemark
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2010-01-28 21:41 UTC by Gurhan Ozen
Modified: 2010-03-30 08:09 UTC (History)

Fixed In Version: libvirt-0.6.3-32.el5
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-03-30 08:09:09 UTC
Target Upstream Version:
Embargoed:


Attachments
libvirtd debug log file when running kvm (1.33 KB, text/plain)
2010-01-28 21:41 UTC, Gurhan Ozen
libvirtd debug log file when running xen (2.11 KB, text/plain)
2010-01-28 21:42 UTC, Gurhan Ozen
Make numa errors non-fatal (3.96 KB, patch)
2010-02-05 12:58 UTC, Jiri Denemark


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2010:0205 | 0 | normal | SHIPPED_LIVE | libvirt bug fix and enhancement update | 2010-03-29 12:27:37 UTC

Description Gurhan Ozen 2010-01-28 21:41:48 UTC
Created attachment 387442 [details]
libvirtd debug log file when running kvm

Description of problem:
This issue seems to be specific to one particular machine, as the RHEL5.5-Server-20100117.0 tree has been used before on other machines.
When the host is running the kvm hypervisor, the libvirtd process is not able to connect to the hypervisor. It can, however, connect to the Xen hypervisor.
I ran the libvirtd process with debugging on under both hypervisors and will attach both logs to this bug. The only command run was virsh list, after booting the machine into each hypervisor.
The host is sun-x4440-01.rhts.eng.bos.redhat.com. I will fire off more jobs of the same tree to see whether this can be replicated on other boxes, to find some consistency.

Version-Release number of selected component (if applicable):
[root@sun-x4440-01 ~]# uname -a
Linux sun-x4440-01.rhts.eng.bos.redhat.com 2.6.18-185.el5 #1 SMP Thu Jan 14 16:44:40 EST 2010 x86_64 x86_64 x86_64 GNU/Linux
[root@sun-x4440-01 ~]# rpm -qa | egrep "kvm|qemu"
etherboot-zroms-kvm-5.4.4-13.el5
kvm-qemu-img-83-147.el5
kmod-kvm-83-147.el5
kvm-83-147.el5
[root@sun-x4440-01 ~]# 


How reproducible:
Very. 

Steps to Reproduce:
1. Log in to sun-x4440-01.rhts.eng.bos.redhat.com machine
2. Boot into kvm hypervisor, try to use libvirt.
3. Do the same running xen hypervisor. 
  
Actual results:


Expected results:


Additional info:

Comment 1 Gurhan Ozen 2010-01-28 21:42:54 UTC
Created attachment 387443 [details]
libvirtd debug log file when running xen

Comment 2 Daniel Veillard 2010-02-02 15:23:36 UTC
16:28:15.953: info : Received unexpected signal 17
means a SIGCHLD, i.e. a forked process terminated.
Somehow qemu-kvm seems to fail to start on that machine. Make 100% sure
the CPU has virtualization support and that it is enabled in the BIOS; this could
be the problem (Xen doesn't need hardware virt support, so it works fine).

Daniel
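
The CPU check suggested above can be sketched in C. This is a minimal illustration, not part of libvirt: the helper names flags_have_hw_virt() and cpu_has_hw_virt() are hypothetical, and it assumes a Linux host where /proc/cpuinfo lists "vmx" (Intel) or "svm" (AMD) in the "flags" line. The flag only shows that the CPU is capable; whether the feature is enabled in the BIOS still has to be checked separately.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if a whitespace-separated flag list
 * contains "vmx" or "svm" as a whole word, 0 otherwise. */
static int flags_have_hw_virt(const char *flags)
{
    const char *p = flags;

    assert(flags != NULL);
    while (*p) {
        size_t len = strcspn(p, " \t\n");   /* length of current token */

        if (len == 3 &&
            (strncmp(p, "vmx", 3) == 0 || strncmp(p, "svm", 3) == 0))
            return 1;
        p += len;
        p += strspn(p, " \t\n");            /* skip separators */
    }
    return 0;
}

/* Hypothetical helper: scans a cpuinfo-style file.
 * Returns 1 if a "flags" line advertises vmx/svm, 0 if not,
 * and -1 if the file cannot be opened. */
static int cpu_has_hw_virt(const char *cpuinfo_path)
{
    char line[8192];
    int found = 0;
    FILE *fp = fopen(cpuinfo_path, "r");

    if (!fp)
        return -1;
    while (fgets(line, sizeof line, fp)) {
        if (strncmp(line, "flags", 5) == 0 && flags_have_hw_virt(line)) {
            found = 1;
            break;
        }
    }
    fclose(fp);
    return found;
}
```

Calling cpu_has_hw_virt("/proc/cpuinfo") on the affected host would be one quick way to rule out a CPU without virtualization extensions before looking at libvirt itself.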

Comment 3 Gurhan Ozen 2010-02-02 20:52:15 UTC
I'll try to reserve the machine and see if I can get into the BIOS over the serial console. However, doesn't virt-install give an appropriate error when it finds out that the machine isn't virtualization capable?

Also, it looks like the same host was able to install an hvm guest, though under Xen of course, with a rhel5.3 host on Jan 22nd 2010. It's recipe 325186 on job: http://rhts.redhat.com/cgi-bin/rhts/jobs.cgi?id=119719&type=Single

Is there anything else you'd like me to look at when the machine is reserved?

Thanks!

Comment 4 Daniel Veillard 2010-02-03 08:53:53 UTC
Check the kernel logs for potential errors, and check whether manually launching qemu-kvm works fine.

Daniel

Comment 5 Gurhan Ozen 2010-02-03 23:14:51 UTC
(In reply to comment #4)
> kernel logs potential errors  and check if manually launching qemu-kvm works
> fine,
> 
> Daniel    

Hi Daniel, 
I don't see anything interesting in the kernel logs. Here is what might be relevant:

Feb  3 17:27:40 sun-x4440-01 kernel: kvm: 5497: cpu0 unimplemented perfctr wrmsr: 0xc0010004 data 0x0
Feb  3 17:27:40 sun-x4440-01 kernel: kvm: 5497: cpu0 unimplemented perfctr wrmsr: 0xc0010000 data 0x130076
Feb  3 17:27:40 sun-x4440-01 kernel: kvm: 5497: cpu0 unimplemented perfctr wrmsr: 0xc0010004 data 0xffffffffffdce742
Feb  3 17:27:40 sun-x4440-01 kernel: kvm: 5497: cpu0 unimplemented perfctr wrmsr: 0xc0010000 data 0x530076
Feb  3 17:32:14 sun-x4440-01 kernel: kvm: 5514: cpu0 unimplemented perfctr wrmsr: 0xc0010004 data 0x0
Feb  3 17:32:14 sun-x4440-01 kernel: kvm: 5514: cpu0 unimplemented perfctr wrmsr: 0xc0010000 data 0x130076
Feb  3 17:32:15 sun-x4440-01 kernel: kvm: 5514: cpu0 unimplemented perfctr wrmsr: 0xc0010004 data 0xffffffffffdce742
Feb  3 17:32:15 sun-x4440-01 kernel: kvm: 5514: cpu0 unimplemented perfctr wrmsr: 0xc0010000 data 0x530076


Other than this, as suspected, the machine does have hardware virtualization support and it is enabled. I have booted the machine into the Xen hypervisor and was able to install an hvm guest under it.
Furthermore, qemu-kvm can be launched manually:
/usr/libexec/qemu-kvm /var/lib/libvirt/images/latestrhel5_x86_64_hvm_guest.img

does actually work and boots into the virtual machine.

However, libvirtd still doesn't recognize the kvm hypervisor:
[root@sun-x4440-01 ~]# virsh list
error: could not connect to hypervisor
error: failed to connect to the hypervisor

The machine is reserved, if you would like to jump in and poke around.

Comment 6 Jiri Denemark 2010-02-04 10:01:33 UTC
I poked around a bit and starting libvirtd with LIBVIRT_DEBUG=1 shows:

...
04:51:25.728: debug : virEventAddTimeoutImpl:208 : Adding timer 1 with -1 ms freq
04:51:25.728: debug : virEventAddTimeoutImpl:216 : Used 0 timeout slots, adding 10 more
04:51:25.728: debug : virEventInterruptLocked:635 : Skip interrupt, 0 0
libnuma: Warning: /sys not mounted or invalid. Assuming one node: No such file or directory
libvir: QEMU error : out of memory

The warning is a numactl error fixed in numactl-0.9.8-11.el5. The real problem is the "out of memory" error, which is bogus. In reality it is numa_node_to_cpus() failing in virCapsInitNUMA(), which is called from qemudCapsInit(). Upstream, we made virCapsInitNUMA() failures non-fatal; perhaps this is the reason?
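
How a NUMA query failure can surface as "out of memory" can be sketched as follows. This is a simplified illustration of the shared-error-label pattern, not the libvirt source: alloc_caps(), query_numa(), the enum values and both init functions are hypothetical stand-ins, and the int arguments only simulate failures.

```c
#include <stddef.h>

/* Hypothetical result codes for the illustration. */
enum init_error { INIT_OK = 0, INIT_NO_MEMORY, INIT_NUMA_FAILED };

/* Hypothetical steps; a non-zero argument simulates a failure. */
static int alloc_caps(int fail) { return fail ? -1 : 0; }
static int query_numa(int fail) { return fail ? -1 : 0; }

/* Every failure jumps to the same label, so a NUMA query failure
 * is reported with the allocation-failure code. */
static enum init_error init_shared_label(int alloc_fail, int numa_fail)
{
    if (alloc_caps(alloc_fail) < 0)
        goto no_memory;
    if (query_numa(numa_fail) < 0)
        goto no_memory;             /* misreported as out of memory */
    return INIT_OK;

no_memory:
    return INIT_NO_MEMORY;
}

/* Each failure keeps its own code, so the caller sees the real cause. */
static enum init_error init_distinct(int alloc_fail, int numa_fail)
{
    if (alloc_caps(alloc_fail) < 0)
        return INIT_NO_MEMORY;
    if (query_numa(numa_fail) < 0)
        return INIT_NUMA_FAILED;
    return INIT_OK;
}
```

With a simulated NUMA failure, init_shared_label(0, 1) yields INIT_NO_MEMORY while init_distinct(0, 1) yields INIT_NUMA_FAILED, which mirrors the misleading message seen in the log.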

Comment 7 Daniel Veillard 2010-02-04 10:15:31 UTC
Yeah,

    if (virCapsInitNUMA(caps) < 0)
        goto no_memory;

is really bogus. We should do it like upstream:

    if (nodeCapsInitNUMA(caps) < 0) {
        virCapabilitiesFreeNUMAInfo(caps);
        VIR_WARN0("Failed to query host NUMA topology, disabling NUMA capabilities");
    }

With the change in libnuma, this is probably worth considering a blocker for
5.5. I'm requesting flags.

Daniel
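
The difference between the two snippets above can be shown with a small self-contained sketch. It is an illustration under stated assumptions, not the actual patch: struct host_caps, probe_numa(), caps_init_fatal() and caps_init_nonfatal() are hypothetical names, and the fail argument only simulates the NUMA query failing.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the host capabilities object. */
struct host_caps {
    int numa_nodes;   /* 0 means no NUMA information is advertised */
    int warnings;     /* number of warnings emitted during init */
};

/* Hypothetical probe; fail != 0 simulates the NUMA query failing. */
static int probe_numa(struct host_caps *caps, int fail)
{
    assert(caps != NULL);
    if (fail)
        return -1;
    caps->numa_nodes = 4;   /* arbitrary value for the illustration */
    return 0;
}

/* Old behaviour: a NUMA failure aborts driver initialisation. */
static int caps_init_fatal(struct host_caps *caps, int fail)
{
    if (probe_numa(caps, fail) < 0)
        return -1;
    return 0;
}

/* Upstream-style behaviour: drop the NUMA data, warn, and carry on. */
static int caps_init_nonfatal(struct host_caps *caps, int fail)
{
    if (probe_numa(caps, fail) < 0) {
        caps->numa_nodes = 0;
        caps->warnings++;
        fprintf(stderr, "warning: failed to query host NUMA topology, "
                        "disabling NUMA capabilities\n");
    }
    return 0;
}
```

In this sketch the driver still initialises when the probe fails, just without NUMA capabilities, which is the behaviour the upstream change is described as having.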

Comment 8 Jiri Denemark 2010-02-05 12:58:24 UTC
Created attachment 389071 [details]
Make numa errors non-fatal

According to my testing with a package that has this patch applied (http://brewweb.devel.redhat.com/brew/taskinfo?taskID=2244890), this patch fixes the issue.

I observed some strange behavior which is not really connected to this bug. The numa call which fails during qemu driver initialization succeeds afterwards when running ``virsh capabilities'', although it gives totally confusing results. This could be worth filing a bug against numactl after some more investigation.

Comment 10 Gunannan Ren 2010-02-22 05:33:26 UTC
The bug has been fixed in libvirt-0.6.3-32.el5.

After upgrading to libvirt-0.6.3-32.el5, virsh could connect to the kvm hypervisor on
the machine with hostname sun-x4440-01.rhts.eng.bos.redhat.com on RHTS.

Comment 13 errata-xmlrpc 2010-03-30 08:09:09 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2010-0205.html

