Bug 525699
Summary: | x86_64 guest hang when set guest's cpu1 online on AMD host
---|---
Product: | Red Hat Enterprise Linux 5
Component: | kvm
Version: | 5.4
Hardware: | All
OS: | Linux
Status: | CLOSED ERRATA
Severity: | medium
Priority: | low
Target Milestone: | rc
Reporter: | Qunfang Zhang <qzhang>
Assignee: | Eduardo Habkost <ehabkost>
QA Contact: | Lawrence Lim <llim>
CC: | lihuang, ndai, ovirt-maint, tools-bugs, virt-maint, ykaul
Fixed In Version: | kvm-83-132.el5
Doc Type: | Bug Fix
Last Closed: | 2010-03-30 07:55:47 UTC
Bug Blocks: | 554506

Description
Qunfang Zhang
2009-09-25 10:06:54 UTC
(In reply to comment #0)
> Description of problem:
> Start a rhel5.4-x86_64 guest on an AMD host with multi-vcpu, on the terminal of
> guest, first "echo 0 > cpu1" then "echo 1 > cpu1", the guest hang or sometimes
> quit.

Sorry, that should be "echo 0 > /sys/devices/system/cpu/cpu1/online" then "echo 1 > /sys/devices/system/cpu/cpu1/online".

Could you send the sosreport output from the host machine?

I have tried to reproduce this on an AMD host with an x86_64 guest, using exactly the same kernel and kvm versions and the same qemu-kvm command line, and I couldn't reproduce it. How many CPUs does the host have? Could you send the sosreport output for the host where this was reproduced?

This bug can be reproduced on host amd-4450b-4-2.englab.nay.redhat.com. This host has 2 CPUs; see the attached sosreport.

Created attachment 364204
The sosreport of amd-4450b-4-2
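
For anyone retrying the reproduction, the corrected steps above can be sketched as a small shell function. This is only an illustration under stated assumptions: the function name `cycle_cpu` and the `SYSFS_CPU_ROOT` override are hypothetical and are not part of the original report; only the sysfs path and the two `echo` writes come from the thread.

```shell
#!/bin/sh
# Sketch of the reproduction steps from this report: take a guest CPU
# offline and bring it back online through sysfs.
# SYSFS_CPU_ROOT is a hypothetical override, added only so the function
# can be tried against a scratch directory instead of the real sysfs tree.
SYSFS_CPU_ROOT="${SYSFS_CPU_ROOT:-/sys/devices/system/cpu}"

# cycle_cpu N: write 0 and then 1 to cpuN/online, then print the final state.
cycle_cpu() {
    cpu="$1"
    ctl="$SYSFS_CPU_ROOT/cpu$cpu/online"

    # The control file is missing for CPUs that cannot be hot-unplugged,
    # and is only writable by root on a real system.
    if [ ! -w "$ctl" ]; then
        echo "cannot write $ctl" >&2
        return 1
    fi

    echo 0 > "$ctl" || return 1   # step 1: set the CPU offline
    echo 1 > "$ctl" || return 1   # step 2: set it online again (the hang was reported here)
    cat "$ctl"                    # print the state the kernel reports afterwards
}
```

Inside the guest, running `cycle_cpu 1` as root corresponds to the two commands quoted in the comment above. On an affected kvm build the second write is where the guest was reported to hang or quit, so the final `cat` may never run.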
The host has 2 CPUs but you are running a 4-vcpu guest. It is not recommended to run a guest with more vcpus than the number of available CPUs on the host. That shouldn't cause the "failed entry" error on the host, but it explains the "CPU stuck" message on the guest. Is the bug reproducible if you limit the number of guest vcpus to 2 (or use a host that has enough CPUs)?

Yes, the bug can be reproduced when the number of guest vcpus is 2 on an AMD host with 2 CPUs.

Status: I am debugging the issue on the machine where it can be reproduced. Booting of the CPU is failing. I didn't see the problem when I used -no-kvm-irqchip.

Created attachment 365890
Experimental patch for the issue

Attached an experimental fix for the issue. I need to test it with other guests and on a host running the latest upstream KVM before submitting it upstream.
Fix submitted and applied upstream: http://article.gmane.org/gmane.comp.emulators.kvm.devel/42168

Verified this bug in kvm-83-131.el5; the issue does not exist.
AMD: x86_64 guest passed, i386 guest passed, i386-PAE guest passed
Intel: x86_64 guest passed, i386 guest passed, i386-PAE guest passed

Verified in kvm-83-140.el5; this issue does not exist.
Host kernel: 2.6.18-182.el5
AMD: x86_64 passed, i386 passed, i386-PAE passed
Intel: x86_64 passed, i386 passed, i386-PAE passed

AMD host cpuinfo:
processor : 1
vendor_id : AuthenticAMD
cpu family : 15
model : 107
model name : AMD Athlon(tm) Dual Core Processor 5400B
stepping : 2
cpu MHz : 1000.000
cache size : 512 KB
physical id : 0
siblings : 2
core id : 1
cpu cores : 2
apicid : 1
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy misalignsse
bogomips : 2004.17
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc 100mhzsteps

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0271.html