Created attachment 387442 [details]
libvirtd debug log file when running kvm

Description of problem:
This issue seems to be specific to one particular machine, as the RHEL5.5-Server-20100117.0 tree has been used before on other machines. When the host is running the kvm hypervisor, the libvirtd process is not able to connect to the hypervisor. It can, however, connect to the Xen hypervisor. I ran the libvirtd process with debugging on under both hypervisors and will be attaching both logs to this bug. All that was done was a virsh list command after booting the machine into each hypervisor. The host is sun-x4440-01.rhts.eng.bos.redhat.com. I will fire off more jobs of the same tree to see if this can be replicated on other boxes, to find some consistency.

Version-Release number of selected component (if applicable):
[root@sun-x4440-01 ~]# uname -a
Linux sun-x4440-01.rhts.eng.bos.redhat.com 2.6.18-185.el5 #1 SMP Thu Jan 14 16:44:40 EST 2010 x86_64 x86_64 x86_64 GNU/Linux
[root@sun-x4440-01 ~]# rpm -qa | egrep "kvm|qemu"
etherboot-zroms-kvm-5.4.4-13.el5
kvm-qemu-img-83-147.el5
kmod-kvm-83-147.el5
kvm-83-147.el5
[root@sun-x4440-01 ~]#

How reproducible:
Very.

Steps to Reproduce:
1. Log in to the sun-x4440-01.rhts.eng.bos.redhat.com machine.
2. Boot into the kvm hypervisor and try to use libvirt.
3. Do the same running the xen hypervisor.

Actual results:

Expected results:

Additional info:
Created attachment 387443 [details] libvirtd debug log file when running xen
The log line

16:28:15.953: info : Received unexpected signal 17

means a SIGCHLD, i.e. a forked process terminated. Somehow qemu-kvm seems to fail to start on that machine. Make 100% sure the CPU has virtualization support and that it is enabled in the BIOS; this could be the problem (Xen doesn't need hardware virt support, so it works fine).

Daniel
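For the CPU check suggested above, one option is to look for the vmx (Intel) or svm (AMD) flag in the "flags" line of /proc/cpuinfo. Below is a minimal sketch of that lookup; the function names are made up for illustration and are not part of libvirt or any other library. Note that the flag only shows that the CPU advertises the feature. Whether it is enabled in the BIOS still has to be confirmed separately, for example from the kernel log when the kvm modules load.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if `flag` appears as a whole word in a
 * /proc/cpuinfo "flags" line (substrings such as "vmx2" do not count). */
bool flags_line_has(const char *flags_line, const char *flag)
{
    assert(flags_line != NULL && flag != NULL);

    size_t len = strlen(flag);
    if (len == 0)
        return false;

    const char *p = flags_line;
    while ((p = strstr(p, flag)) != NULL) {
        /* The match must be delimited on both sides to be a real flag. */
        bool starts_word = (p == flags_line) || p[-1] == ' ' ||
                           p[-1] == '\t' || p[-1] == ':';
        bool ends_word = p[len] == '\0' || p[len] == ' ' ||
                         p[len] == '\t' || p[len] == '\n';
        if (starts_word && ends_word)
            return true;
        p += len;
    }
    return false;
}

/* Hypothetical helper: vmx is Intel VT-x, svm is AMD-V. */
bool cpu_advertises_hw_virt(const char *flags_line)
{
    return flags_line_has(flags_line, "vmx") ||
           flags_line_has(flags_line, "svm");
}
```

A caller would read /proc/cpuinfo line by line (for example with fgets) and pass each line that begins with "flags" to cpu_advertises_hw_virt().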
I'll try to reserve the machine and see if I can get into the BIOS over the serial console. However, doesn't virt-install give an appropriate error when it finds out that the machine isn't virtualization capable? Also, it looks like the same host was able to install an hvm guest, though under Xen of course, with a rhel5.3 host on Jan 22nd 2010. It's recipe 325186 on job: http://rhts.redhat.com/cgi-bin/rhts/jobs.cgi?id=119719&type=Single

Is there anything else you'd like me to look at when the machine is reserved? Thanks!
Check the kernel logs for potential errors, and check whether manually launching qemu-kvm works fine.

Daniel
(In reply to comment #4)
> kernel logs potential errors and check if manually launching qemu-kvm works
> fine,
>
> Daniel

Hi Daniel,

I don't see anything interesting in the kernel logs. Here is what might be relevant:

Feb 3 17:27:40 sun-x4440-01 kernel: kvm: 5497: cpu0 unimplemented perfctr wrmsr: 0xc0010004 data 0x0
Feb 3 17:27:40 sun-x4440-01 kernel: kvm: 5497: cpu0 unimplemented perfctr wrmsr: 0xc0010000 data 0x130076
Feb 3 17:27:40 sun-x4440-01 kernel: kvm: 5497: cpu0 unimplemented perfctr wrmsr: 0xc0010004 data 0xffffffffffdce742
Feb 3 17:27:40 sun-x4440-01 kernel: kvm: 5497: cpu0 unimplemented perfctr wrmsr: 0xc0010000 data 0x530076
Feb 3 17:32:14 sun-x4440-01 kernel: kvm: 5514: cpu0 unimplemented perfctr wrmsr: 0xc0010004 data 0x0
Feb 3 17:32:14 sun-x4440-01 kernel: kvm: 5514: cpu0 unimplemented perfctr wrmsr: 0xc0010000 data 0x130076
Feb 3 17:32:15 sun-x4440-01 kernel: kvm: 5514: cpu0 unimplemented perfctr wrmsr: 0xc0010004 data 0xffffffffffdce742
Feb 3 17:32:15 sun-x4440-01 kernel: kvm: 5514: cpu0 unimplemented perfctr wrmsr: 0xc0010000 data 0x530076

Other than this, as suspected, the machine does have hardware enabled virtualization and it's on. I have booted the machine into Xen hypervisor and was able to install hvm guest under Xen hypervisor. Furthermore, qemu-kvm is launchable manually:

/usr/libexec/qemu-kvm /var/lib/libvirt/images/latestrhel5_x86_64_hvm_guest.img

does actually work and boot into the virtual machine. However, libvirtd still doesn't recognize the kvm hypervisor:

[root@sun-x4440-01 ~]# virsh list
error: could not connect to hypervisor
error: failed to connect to the hypervisor

The machine is reserved, if you would like to jump in and poke around.
I poked around a bit and starting libvirtd with LIBVIRT_DEBUG=1 shows:

...
04:51:25.728: debug : virEventAddTimeoutImpl:208 : Adding timer 1 with -1 ms freq
04:51:25.728: debug : virEventAddTimeoutImpl:216 : Used 0 timeout slots, adding 10 more
04:51:25.728: debug : virEventInterruptLocked:635 : Skip interrupt, 0 0
libnuma: Warning: /sys not mounted or invalid. Assuming one node: No such file or directory
libvir: QEMU error : out of memory

The warning is a numactl error fixed in numactl-0.9.8-11.el5. The real problem is the "out of memory" error which is bogus. In reality it is numa_node_to_cpus() failing in virCapsInitNUMA() in qemudCapsInit().

In upstream, we made virCapsInitNUMA() failures non-fatal, perhaps this is the reason?
Yeah,

    if (virCapsInitNUMA(caps) < 0)
        goto no_memory;

is really bogus; we should do as upstream does:

    if (nodeCapsInitNUMA(caps) < 0) {
        virCapabilitiesFreeNUMAInfo(caps);
        VIR_WARN0("Failed to query host NUMA topology, disabling NUMA capabilities");
    }

With the change in libnuma, this is probably worth considering a blocker for 5.5. I'm requesting flags.

Daniel
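To make the difference between the two behaviours concrete, here is a small self-contained sketch of the pattern. It is not libvirt code: struct host_caps, the probe functions and both init functions are hypothetical stand-ins, and only the warning text is taken from the upstream snippet quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the driver's capabilities object. */
struct host_caps {
    bool numa_available;
    int  numa_nodes;
};

/* Hypothetical probe signature: 0 on success, -1 on failure. */
typedef int (*numa_probe_fn)(struct host_caps *caps);

/* Old behaviour: a NUMA probe failure aborts the whole initialization,
 * so the driver never registers and clients cannot connect. */
int caps_init_fatal(struct host_caps *caps, numa_probe_fn probe)
{
    assert(caps != NULL && probe != NULL);
    if (probe(caps) < 0)
        return -1;
    return 0;
}

/* Upstream-style behaviour: discard any partial NUMA data, warn, and
 * carry on without NUMA capabilities. */
int caps_init_tolerant(struct host_caps *caps, numa_probe_fn probe)
{
    assert(caps != NULL && probe != NULL);
    if (probe(caps) < 0) {
        caps->numa_available = false;
        caps->numa_nodes = 0;
        fprintf(stderr, "warning: Failed to query host NUMA topology, "
                        "disabling NUMA capabilities\n");
    }
    return 0;
}

/* Example probes used to exercise both paths. */
int probe_ok(struct host_caps *caps)
{
    caps->numa_available = true;
    caps->numa_nodes = 2;
    return 0;
}

int probe_broken(struct host_caps *caps)
{
    caps->numa_nodes = 1;   /* leaves partial data behind, then fails */
    return -1;
}
```

With probe_broken, caps_init_fatal() returns -1 while caps_init_tolerant() returns 0 with NUMA marked unavailable, which mirrors the change being proposed for the qemu driver.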
Created attachment 389071 [details]
Make numa errors non-fatal

According to my testing with a package that has this patch applied (http://brewweb.devel.redhat.com/brew/taskinfo?taskID=2244890), the patch fixes the issue.

I observed some slightly strange behavior which is not really connected to this bug. The numa call which fails during qemu driver initialization succeeds afterwards when running ``virsh capabilities'', although it gives totally confusing results. This could be worth filing a bug against numactl after some more investigation.
The bug has been fixed in libvirt-0.6.3-32.el5. After upgrading to libvirt-0.6.3-32.el5, virsh could connect to the kvm hypervisor on the machine with hostname sun-x4440-01.rhts.eng.bos.redhat.com on RHTS.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2010-0205.html