Bug 636347 - libvirt-0.8.4 fails to start VMs after upgrade
Summary: libvirt-0.8.4 fails to start VMs after upgrade
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: libvirt
Version: 5.6
Hardware: x86_64
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: 5.6
Assignee: Eric Blake
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On: 635857
Blocks: 636349
Reported: 2010-09-21 23:32 UTC by Eric Blake
Modified: 2011-01-13 23:16 UTC (History)

Fixed In Version: libvirt-0.8.2-6.el5
Doc Type: Bug Fix
Doc Text:
Clone Of: 635857
: 636349 (view as bug list)
Environment:
Last Closed: 2011-01-13 23:16:21 UTC
Target Upstream Version:
Embargoed:


Links
System: Red Hat Product Errata
ID: RHEA-2011:0060
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: libvirt bug fix and enhancement update
Last Updated: 2011-01-12 17:22:30 UTC

Description Eric Blake 2010-09-21 23:32:04 UTC
+++ This bug was initially created as a clone of Bug #635857 +++

Upstream commit d413e5d7 was applied after 0.8.3, but was backported to RHEL 5.6 as patch 150, so 0.8.2-5.el5 is also vulnerable.

Description of problem:
I just upgraded one of our servers to libvirt-0.8.4 but I can't use virsh or virt-manager anymore. Syslog says the following "error : get_cpu_value:88 : cannot open /sys/devices/system/cpu/cpu1/online: No such file or directory"
I only have /sys/devices/system/cpu/online which states the number of CPUs the server has, e.g. 0-7.

I suspected the kernel option "hotpluggable cpu support", which was not enabled on my host. After enabling it libvirt works again even though I still don't have "/sys/devices/system/cpu/cpu1/online".
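
For reference, the top-level "online" file mentioned above uses the kernel's cpu list format: comma-separated indices or ranges such as "0-7" or "0-3,8-11". Strictly speaking it lists which cpus are online rather than a count. The following is only a minimal sketch of how such a string could be counted; sketch_count_cpulist is a hypothetical name and is not part of libvirt, and nothing in this report says libvirt reads the file this way.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper (not libvirt code): count the cpus named in a
 * kernel cpu list string such as "0-7" or "0-3,8-11".  A trailing
 * newline is accepted.  Return -1 on malformed input.  Very large
 * ranges are not guarded against overflow in this sketch.  */
static int
sketch_count_cpulist(const char *list)
{
    const char *p = list;
    int count = 0;

    while (*p != '\0' && *p != '\n') {
        char *end;
        unsigned long first, last;

        /* Each entry must start with a decimal cpu index.  */
        if (!isdigit((unsigned char)*p))
            return -1;
        first = strtoul(p, &end, 10);
        last = first;
        p = end;

        /* An optional "-N" turns the entry into an inclusive range.  */
        if (*p == '-') {
            p++;
            if (!isdigit((unsigned char)*p))
                return -1;
            last = strtoul(p, &end, 10);
            p = end;
        }
        if (last < first)
            return -1;
        count += (int)(last - first + 1);

        /* A comma must be followed by another entry.  */
        if (*p == ',') {
            p++;
            if (!isdigit((unsigned char)*p))
                return -1;
        } else if (*p != '\0' && *p != '\n') {
            return -1;
        }
    }
    return count;
}
```

With the value quoted above, sketch_count_cpulist("0-7") counts eight cpus.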


I agree with danpb's analysis that commit d413e5d7 is the culprit.  I only tested that commit on a hot-plug enabled kernel.  So the fact that disabling hot-plugging makes /sys/devices/system/cpu/cpu<n>/ directories disappear makes sense, but was something I never encountered during my testing.

My patch only allowed a missing directory for cpu0 (since x86_64 systems with hot-unplug cpu support still disallow hot-unplugging cpu0).  The fix is to allow a missing directory for all possible cpus.





diff --git i/src/nodeinfo.c w/src/nodeinfo.c
index 65eeb24..3dac9f3 100644
--- i/src/nodeinfo.c
+++ w/src/nodeinfo.c
@@ -65,7 +65,8 @@ int linuxNodeInfoCPUPopulate(FILE *cpuinfo,
 /* Return the positive decimal contents of the given
  * CPU_SYS_PATH/cpu%u/FILE, or -1 on error.  If MISSING_OK and the
  * file could not be found, return 1 instead of an error; this is
- * because some machines cannot hot-unplug cpu0.  */
+ * because some machines cannot hot-unplug cpu0, or because
+ * hot-unplugging is disabled.  */
 static int
 get_cpu_value(unsigned int cpu, const char *file, bool missing_ok)
 {
@@ -113,7 +114,7 @@ cleanup:
 static int
 cpu_online(unsigned int cpu)
 {
-    return get_cpu_value(cpu, "online", cpu == 0);
+    return get_cpu_value(cpu, "online", true);
 }

 static unsigned long count_thread_siblings(unsigned int cpu)
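
To illustrate the behavior the patch relies on, here is a minimal, self-contained sketch of the MISSING_OK logic described in the comment above. It is not the actual body of get_cpu_value: the function names sketch_cpu_value and sketch_cpu_online and the extra ROOT parameter (standing in for CPU_SYS_PATH so that a fake tree can be used) are assumptions made for illustration only.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper (not libvirt's implementation): return the
 * non-negative decimal contents of ROOT/cpu<cpu>/FILE, or -1 on
 * error.  If MISSING_OK and the file does not exist, return 1
 * instead of an error.  */
static int
sketch_cpu_value(const char *root, unsigned int cpu, const char *file,
                 bool missing_ok)
{
    char path[4096];
    FILE *fp;
    int value = -1;
    int n;

    n = snprintf(path, sizeof(path), "%s/cpu%u/%s", root, cpu, file);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;

    fp = fopen(path, "r");
    if (fp == NULL) {
        /* A kernel without cpu hotplug support provides no per-cpu
         * "online" file, so ENOENT is tolerated when MISSING_OK.  */
        if (missing_ok && errno == ENOENT)
            return 1;
        return -1;
    }

    if (fscanf(fp, "%d", &value) != 1 || value < 0)
        value = -1;
    fclose(fp);
    return value;
}

/* Mirrors the patched cpu_online(): a missing file is acceptable
 * for every cpu, not only for cpu0.  */
static int
sketch_cpu_online(const char *root, unsigned int cpu)
{
    return sketch_cpu_value(root, cpu, "online", true);
}
```

Under these assumptions, sketch_cpu_online(root, 1) reports cpu1 as online when ROOT/cpu1/online is absent, whereas passing false for MISSING_OK (the pre-patch behavior for any cpu other than cpu0) yields -1 for the same tree.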

Comment 1 Jiri Denemark 2010-09-22 08:08:25 UTC
This shouldn't affect RHEL, since our kernels support hotpluggable CPUs, but it's an easy fix and makes libvirt run correctly if someone decides to configure their own kernel.

Comment 2 Daniel Veillard 2010-09-22 12:30:48 UTC
Yeah, it sounds like a good idea to have this backported to the 5.6 tree.

Daniel

Comment 4 Jiri Denemark 2010-09-23 14:59:49 UTC
Fix built in libvirt-0.8.2-6.el5

Comment 9 Daniel Berrangé 2010-10-28 09:23:10 UTC
You can probably test this by faking the sysfs file arrangement Eric mentions in the initial description, e.g.:

   mkdir /tmp/fakesysfs
   cd /tmp/fakesysfs
   echo "0-7" > online
   mount --bind /tmp/fakesysfs /sys/devices/system/cpu


After this, /sys/devices/system/cpu should contain one file called 'online', as per the bug description.

Comment 10 Eric Blake 2010-10-28 14:27:08 UTC
Thanks for the idea Daniel.  I can confirm that a bind-mount can be used to fake the removal of the hotplug files that libvirt looks for.  But you may also need to do:

mkdir -p /tmp/fakesysfs/cpu{0,1,2,3,4,5,6,7}

before the bind mount to make sure the cpu<N> directories are present but empty in the fake tree.

Comment 11 weizhang 2010-10-29 06:28:36 UTC
Thanks Daniel and Eric.
It can be verified on rhel5.6-x86_64-kvm.
Steps:

1. # cp -r /sys/devices/system/cpu/ /tmp/fakesysfs/
2. In each /tmp/fakesysfs/cpu<N> directory, remove the 'online' file.
3. # echo "0-7" > online   (run in /tmp/fakesysfs)
4. # mount --bind /tmp/fakesysfs /sys/devices/system/cpu
5. Run # virsh start vm
No error is reported, and the VM starts successfully.

# rpm -qa libvirt
libvirt-0.8.2-9.el5
# uname -r
2.6.18-228.el5
# rpm -qa |grep kvm
kmod-kvm-83-205.el5
kvm-83-205.el5
kvm-qemu-img-83-205.el5

Comment 13 errata-xmlrpc 2011-01-13 23:16:21 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2011-0060.html

