Bug 500564

Summary: KVM guest hangs during boot (occasionally)
Product: [Fedora] Fedora
Reporter: Richard W.M. Jones <rjones>
Component: qemu
Assignee: Glauber Costa <gcosta>
Status: CLOSED RAWHIDE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Priority: high
Version: rawhide
CC: berrange, clalance, dwmw2, ehabkost, gcosta, itamar, markmc, quintela, virt-maint
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2009-05-21 16:37:25 UTC
Bug Blocks: 498968    

Description Richard W.M. Jones 2009-05-13 10:15:41 UTC
The guest kernel hangs occasionally (1 time in 10 or less) during
boot with the latest KVM and guest kernel from Rawhide.  The
last kernel messages before the hang are:

  Dentry cache hash table entries: 65536 (order: 7, 524288 bytes)
  Inode-cache hash table entries: 32768 (order: 6, 262144 bytes)
  Mount-cache hash table entries: 256
  Initializing cgroup subsys ns
  Initializing cgroup subsys cpuacct
  Initializing cgroup subsys memory
  Initializing cgroup subsys devices
  Initializing cgroup subsys freezer
  Initializing cgroup subsys net_cls
  CPU: L1 I cache: 32K, L1 D cache: 32K
  CPU: L2 cache: 2048K
  CPU 0/0x0 -> Node 0
  SMP alternatives: switching to UP code
  ACPI: Core revision 20081204
  ftrace: converting mcount calls to 0f 1f 44 00 00
  ftrace: allocating 18877 entries in 149 pages
  Setting APIC routing to flat
  ..TIMER: vector=0x30 apic1=0 pin1=0 apic2=-1 pin2=-1
  CPU0: Intel QEMU Virtual CPU version 0.10.0 stepping 03

The full KVM command line is:

/usr/bin/qemu-kvm -drive file=test1.img -m 384 -no-reboot -kernel vmlinuz.rawhide.x86_64 -initrd initramfs.rawhide.x86_64.img -append 'panic=1 console=ttyS0 guestfs=10.0.2.4:6666 guestfs_verbose=1' -nographic -serial stdio -net channel,6666:unix:/tmp/libguestfsjm0kll/sock,server,nowait -net user,vlan=0 -net nic,model=virtio,vlan=0

KVM version:
qemu-kvm-0.10.50-3.kvm85.fc12.x86_64

Guest kernel version:
2.6.29.2-126.fc11.x86_64 (mockbuild.phx.redhat.com) (gcc version 4.4.0 20090427 (Red Hat 4.4.0-3) (GCC) ) #1 SMP Mon May 4 04:46:15 EDT 2009

Host kernel:
Linux intel-mb 2.6.29-0.258.2.3.rc8.git2.fc11.x86_64 #1 SMP Tue Mar 24 18:39:53 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux

Various discussions lead us to believe that this might be caused by
a mismatch between the qemu and KVM BIOS.  FWIW the BIOS images
that KVM is using are /usr/share/qemu/bios.bin and
/usr/share/qemu/vgabios.bin

Comment 1 Mark McLoughlin 2009-05-13 11:45:00 UTC
Most likely the KVM bochs BIOS needs updating:

http://www.redhat.com/archives/fedora-virt/2009-May/msg00061.html

Comment 2 Richard W.M. Jones 2009-05-13 11:50:11 UTC
I should add that my original estimate ("1 time in 10 or less") is poor.
It's more like 1 time in 100 in my test script (which just runs qemu-kvm
in a loop).  However it does occur more frequently in the libguestfs
checks (more like 1 in 10).  And during Koji builds, where I originally
found the problem, it seems more like 1 in 2.  I don't know exactly
which factor triggers this, though.
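
For anyone trying to reproduce this, a loop test of the kind described
above could be sketched roughly as follows.  This is not the actual test
script, only a minimal illustration: the function name count_hangs, the
timeout value, and the assumption that a guest which boots successfully
exits on its own are all hypothetical.

```python
import subprocess


def count_hangs(cmd, runs, timeout):
    """Run `cmd` `runs` times and count how many runs exceed `timeout` seconds.

    A run that does not finish within `timeout` is treated as a hang.
    This assumes (hypothetically) that a healthy run exits by itself.
    """
    hangs = 0
    for _ in range(runs):
        try:
            subprocess.run(
                cmd,
                stdin=subprocess.DEVNULL,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            # subprocess.run() kills the child before raising, so no
            # stuck process is left behind between iterations.
            hangs += 1
    return hangs


# Hypothetical usage; the argument list would be the full qemu-kvm command
# line from the description, and 120 seconds is an arbitrary guess:
#   hangs = count_hangs(["/usr/bin/qemu-kvm", "-m", "384", "-nographic"],
#                       runs=100, timeout=120)
#   print("%d hangs in 100 runs" % hangs)
```

With a hang rate near 1 in 100, a few hundred iterations are likely
needed before a single hang shows up, so the run count matters as much as
the timeout.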

Comment 3 Richard W.M. Jones 2009-05-13 11:55:03 UTC
(In reply to comment #1)
> Most likely the KVM bochs BIOS needs updating:
> 
> http://www.redhat.com/archives/fedora-virt/2009-May/msg00061.html  

I seem to be using the latest bochs / bochs-bios:

# rpm -qi bochs-bios
Name        : bochs-bios                   Relocations: (not relocatable)
Version     : 2.3.8                             Vendor: Fedora Project
Release     : 0.6.git04387139e3b.fc11       Build Date: Wed 11 Mar 2009 05:26:02 PM GMT
Install Date: Thu 26 Mar 2009 03:47:03 PM GMT      Build Host: x86-2.fedora.phx.redhat.com
Group       : Applications/Emulators        Source RPM: bochs-2.3.8-0.6.git04387139e3b.fc11.src.rpm
Size        : 327680                           License: LGPLv2+
Signature   : (none)
Packager    : Fedora Project
URL         : http://bochs.sourceforge.net/
Summary     : Bochs bios
Description :
Bochs BIOS is a free implementation of a x86 BIOS provided by the Bochs projects.
It can also be used in other emulators, such as QEMU

That's the latest one I can find in Koji anyway (build here:
http://koji.fedoraproject.org/koji/buildinfo?buildID=93823 )

Do we really still think this is a BIOS problem?  It may be a red herring.

Comment 4 Richard W.M. Jones 2009-05-21 16:37:25 UTC
I've retested this with:
  guest kernel 2.6.30-0.81.rc5.git1.fc12
  qemu-kvm-0.10.50-4.kvm86.fc12
and for whatever reason the bug has gone away.