Bug 644973 - On an AMD F14 host, running an F14 guest with 2 cores assigned hangs for "a long time" (several 10's of minutes) at start of boot
Status: CLOSED CURRENTRELEASE
Product: Fedora
Classification: Fedora
Component: qemu
Version: 14
Hardware: All Linux
Priority: low
Severity: high
Assigned To: Zachary Amsden
QA Contact: Fedora Extras Quality Assurance
Duplicates: 652489
Blocks: 651639 654912

Reported: 2010-10-20 13:19 EDT by Laine Stump
Modified: 2013-01-09 06:41 EST
Doc Type: Bug Fix
Clones: 651639
Last Closed: 2010-12-09 12:07:13 EST


Attachments
kernel info from /var/log/messages (60.31 KB, text/plain)
2010-10-20 13:19 EDT, Laine Stump

Description Laine Stump 2010-10-20 13:19:08 EDT
Created attachment 454618
kernel info from /var/log/messages

I have an AMD Thuban 1055T, which is a 6 core Phenom II, and have installed the F14 Beta from Sept 28, then updated from the updates-testing repo to Oct 19.

When I use virt-manager to create a guest with 1 vcpu assigned, and point it at the same F14 DVD ISO, it installs, then boots the OS with no issues.


If I change the config for that guest to have 2 vcpus assigned, or create a new guest with 2 vcpus and try to boot the install ISO, it hangs just after the screen is cleared following the initial SeaBIOS POST screen (i.e., before the "shades of blue" progress bar begins). This hang lasts a very long time before the boot process finally continues: I haven't yet caught the moment it resumes, but I did witness it hanging for at least 30 minutes. During this time, "top" on the host shows qemu using 124% of CPU time. (I've been unable to attach gdb to the qemu-kvm process to get a backtrace.)

None of my other guests have this problem (I've tried RHEL5, F13, and WinXP - they all work fine with 2 vcpus).

I also installed the same F14 Beta (then updated to Oct 19) on Intel Xeon 8-core hardware (an IBM Thinkstation), and an install of F14 on a 2-core guest completed with no problems.

I can provide login credentials and exclusive access to this particular machine if necessary.


Note that I have tried installing the upstream qemu-kvm and kvm packages on this same AMD machine; with those installed, I'm unable to get any F13 or F14 guest to boot properly, even with a single vcpu assigned (RHEL5 and WinXP still work fine with single or multiple vcpus), so the utility of a comparison or bisect is dubious.

Here is the qemu commandline that's issued by libvirt:


LC_ALL=C PATH=/sbin:/usr/sbin:/bin:/usr/bin QEMU_AUDIO_DRV=none /usr/bin/qemu-kvm -S -M pc-0.13 -cpu phenom,+wdt,+skinit,+osvw,+3dnowprefetch,+misalignsse,+sse4a,+abm,+cr8legacy,+extapic,+cmp_legacy,+lahf_lm,+rdtscp,+pdpe1gb,+popcnt,+cx16,+ht,+vme -enable-nesting -enable-kvm -m 2048 -smp 2,sockets=1,cores=6,threads=1 -name f14hvirt -uuid 24feae2d-3335-3dc3-297f-6ee826d6e634 -nodefconfig -nodefaults -chardev socket,id=monitor,path=/var/lib/libvirt/qemu/f14hvirt.monitor,server,nowait -mon chardev=monitor,mode=readline -rtc base=utc -boot c -drive file=/var/lib/libvirt/images/f14hvirt-1.img,if=none,id=drive-virtio-disk0,boot=on,format=raw -device virtio-blk-pci,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0 -drive if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -device virtio-net-pci,vlan=0,id=net0,mac=52:54:00:b3:e5:43,bus=pci.0,addr=0x3 -net tap,fd=45,vlan=0,name=hostnet0 -chardev pty,id=serial0 -device isa-serial,chardev=serial0 -usb -device usb-tablet,id=input0 -vnc 127.0.0.1:2 -vga cirrus -device AC97,id=sound0,bus=pci.0,addr=0x4 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 
char device redirected to /dev/pts/0

Note that a line like the following will also trigger the problem (i.e., with fewer CPU features specified):

LC_ALL=C PATH=/sbin:/usr/sbin:/bin:/usr/bin QEMU_AUDIO_DRV=none /usr/bin/qemu-kvm -S -M pc-0.13 -enable-kvm -m 900 -smp 2,sockets=2,cores=1,threads=1 -name f14alphatest -uuid 335e5300-e9a8-fd86-0986-748f02a8e69e -nodefconfig -nodefaults -chardev socket,id=monitor,path=/var/lib/libvirt/qemu/f14alphatest.monitor,server,nowait -mon chardev=monitor,mode=readline -rtc base=utc -no-reboot -boot d -drive file=/var/lib/libvirt/images/f14alphatest.img,if=none,id=drive-virtio-disk0,format=raw -device virtio-blk-pci,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0 -drive file=/dev/sr0,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -device virtio-net-pci,vlan=0,id=net0,mac=52:54:00:c0:62:6c,bus=pci.0,addr=0x3 -net tap,fd=54,vlan=0,name=hostnet0 -chardev pty,id=serial0 -device isa-serial,chardev=serial0 -usb -device usb-tablet,id=input0 -vnc 127.0.0.1:3 -vga cirrus -device AC97,id=sound0,bus=pci.0,addr=0x4 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
Comment 1 Avi Kivity 2010-10-21 06:52:30 EDT
Am interested in access.
Comment 2 Laine Stump 2010-10-21 08:09:43 EDT
Avi - the necessary info is in an IRC private chat I just sent. Contact me if it didn't show up.
Comment 3 Avi Kivity 2010-10-21 09:46:16 EDT
kvm.git with F14's qemu-kvm appears to work fine.
Comment 4 Avi Kivity 2010-10-21 09:47:41 EDT
qemu-kvm.git fails on kvm.git with a segmentation fault; this appears to be a different problem.
Comment 5 Avi Kivity 2010-10-21 11:01:02 EDT
Plain 2.6.35.6 fails, same way.
Comment 6 Avi Kivity 2010-10-21 11:12:51 EDT
2.6.36 works.  Bisecting.
Comment 7 Avi Kivity 2010-10-21 11:58:52 EDT
Works with -cpu ...,-kvmclock

Zachary, what do we need to backport to 2.6.35.6?  Marcelo, did you already send it?
Comment 8 Zachary Amsden 2010-10-21 22:13:48 EDT
Backport list is probably quite short, although it's unclear if this is a host or guest problem or a mix of both.

Avi, did you bisect the host's qemu or the guest?
Comment 9 Avi Kivity 2010-10-22 04:00:27 EDT
The bisection was irrelevant.  The findings are:

  2.6.35.6: fails
  2.6.35.6, -kvmclock: works
  2.6.36: works

Conclusion: 2.6.35.6 is missing some patches that went into 2.6.36.
Comment 10 Avi Kivity 2010-10-22 04:01:08 EDT
Kernel versions above are for host kernel.
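
For anyone who needs a stopgap on an affected host kernel, the finding above suggests masking kvmclock from the guest. The following is only a sketch of how that -cpu argument could be built: `without_kvmclock` is a hypothetical helper, not part of qemu or libvirt, and the CPU model string is an example taken loosely from the description.

```shell
#!/bin/sh
# Hypothetical helper: append "-kvmclock" to a qemu -cpu model string,
# unless the string already disables it.
without_kvmclock() {
    case ",$1," in
        *,-kvmclock,*) printf '%s\n' "$1" ;;
        *)             printf '%s,-kvmclock\n' "$1" ;;
    esac
}

# Example model string (assumption: any model accepted by -cpu should do).
cpu_arg="$(without_kvmclock 'phenom,+sse4a,+abm')"
echo "-cpu $cpu_arg"
```

The resulting string replaces the existing -cpu value on the qemu-kvm command line. This only hides the paravirtual clock from the guest; it does not fix the host kernel.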
Comment 11 Zachary Amsden 2010-10-25 20:15:33 EDT
These two patches just went to stable:


2.6.35-stable review patch.  If anyone has any objections, please let us know.

------------------

From: Marcelo Tosatti <mtosatti@redhat.com>

commit 58877679fd393d3ef71aa383031ac7817561463d upstream.

On reset, VMCB TSC should be set to zero.  Instead, code was setting
tsc_offset to zero, which passes through the underlying TSC.

Signed-off-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

2.6.35-stable review patch.  If anyone has any objections, please let us know.

------------------

From: Marcelo Tosatti <mtosatti@redhat.com>

commit 47008cd887c1836bcadda123ba73e1863de7a6c4 upstream.

The VMCB is reset whenever we receive a startup IPI, so Linux is setting
TSC back to zero happens very late in the boot process and destabilizing
the TSC.  Instead, just set TSC to zero once at VCPU creation time.

Why the separate patch?  So git-bisect is your friend.

Signed-off-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Comment 12 Laine Stump 2010-10-27 12:09:01 EDT
Zach - can you give me the right git incantation to get a kernel source tree with these two patches? I have the kvm kernel tree already, if that helps.
Comment 13 Zachary Amsden 2010-10-27 12:18:37 EDT
I believe you want to git-cherry-pick
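
A minimal sketch of what that could look like, using the two upstream commit IDs quoted in comment 11. Assumptions: the tree already contains both upstream commits and a `v2.6.35.6` tag (a kvm-only tree may need the stable tags fetched first), and the function and branch names are made up for illustration. Conflicts are possible and would need resolving by hand.

```shell
#!/bin/sh
# Upstream commit IDs as quoted in comment 11, in the order listed there.
TSC_RESET_FIX=58877679fd393d3ef71aa383031ac7817561463d
TSC_INIT_FIX=47008cd887c1836bcadda123ba73e1863de7a6c4

# Hypothetical helper: create a branch from a base tag and cherry-pick
# both fixes onto it.
#   $1 = path to the kernel git tree
#   $2 = base tag or commit (assumed default: v2.6.35.6)
backport_tsc_fixes() {
    tree="$1"
    base="${2:-v2.6.35.6}"
    git -C "$tree" checkout -b kvmclock-fix "$base" &&
    # -x records the original commit ID in each new commit message
    git -C "$tree" cherry-pick -x "$TSC_RESET_FIX" "$TSC_INIT_FIX"
}

# Usage (not run here): backport_tsc_fixes ~/src/linux v2.6.35.6
```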
Comment 14 digimer 2010-11-12 17:10:38 EST
*** Bug 652489 has been marked as a duplicate of this bug. ***
Comment 15 Justin M. Forbes 2010-12-09 12:07:13 EST
This should be resolved in kernel-2.6.35.9-64.fc14
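
A rough way to check whether a host is at or past that kernel, sketched under some assumptions: it compares only the upstream version part of `uname -r` (so it cannot tell the -64 build from an earlier 2.6.35.9 build), it relies on GNU `sort -V`, and `version_ge` is a helper name chosen for this example.

```shell
#!/bin/sh
# Succeeds if version $1 >= version $2, using version sort.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

fixed="2.6.35.9"            # upstream version of the fixed Fedora kernel
running="$(uname -r)"       # e.g. 2.6.35.6-45.fc14.x86_64
upstream="${running%%-*}"   # strip the distribution release suffix

if version_ge "$upstream" "$fixed"; then
    echo "host kernel $running is at or past $fixed"
else
    echo "host kernel $running predates $fixed; consider updating"
fi
```

For the exact package build, `rpm -q kernel` on the host is the more reliable check.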
Comment 16 Ian Kent 2010-12-14 22:40:39 EST
(In reply to comment #15)
> This should be resolved in kernel-2.6.35.9-64.fc14

Appears to have resolved the problem for me.
Thanks.
