Bug 689665
Summary: | Specifying the number of CPU cores fails with CPU models Nehalem, Penryn and Conroe | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 6 | Reporter: | Shaolong Hu <shu> | ||||||||
Component: | qemu-kvm | Assignee: | Eduardo Habkost <ehabkost> | ||||||||
Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> | ||||||||
Severity: | medium | Docs Contact: | |||||||||
Priority: | medium | ||||||||||
Version: | 6.1 | CC: | areis, bcao, chayang, ehabkost, flang, juzhang, lagarcia, michen, mishu, mkenneth, shuang, tburke, virt-maint, xwei | ||||||||
Target Milestone: | rc | ||||||||||
Target Release: | --- | ||||||||||
Hardware: | Unspecified | ||||||||||
OS: | Unspecified | ||||||||||
Whiteboard: | |||||||||||
Fixed In Version: | qemu-kvm-0.12.1.2-2.320.el6 | Doc Type: | Bug Fix | ||||||||
Doc Text: |
Cause: Some CPU models defined in qemu-kvm have a low "level" value (< 4). This includes the Conroe, Penryn and Nehalem models.
Consequence: The guest OS does not read the CPUID.4 leaf, which contains extra information about the CPU topology, and therefore does not recognize the three-level topology info (package;core;thread); only the two-level topology info (package;thread) is recognized.
Workaround: There are three possible workarounds:
1) Do not define multi-core vCPUs when using the Conroe, Penryn or Nehalem CPU models; or
2) Use the default cpu64-rhel6 CPU model (which has level=4) if a multi-core vCPU is required; or
3) Override the "level" parameter to a value >= 4 when using Conroe, Penryn or Nehalem, e.g. "-cpu Nehalem,level=4". Unfortunately, this cannot be done in the libvirt XML definition, but it can be done on the QEMU command line.
|
Story Points: | --- | ||||||||
Clone Of: | |||||||||||
: | 861209 (view as bug list) | Environment: | |||||||||
Last Closed: | 2013-02-21 07:30:37 UTC | Type: | --- | ||||||||
Bug Depends On: | 833152, 851245 | ||||||||||
Bug Blocks: | 833130, 903089 | ||||||||||
Attachments: |
|
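
The consequence described in the Doc Text can be illustrated with a minimal sketch. It is a simplified model of how a guest might derive cores and threads from the advertised CPUID level, not the actual logic of any particular guest OS. The function name `guest_visible_topology` is hypothetical, and the level value 2 in the usage note is taken from a later comment in this report that describes the advertised range for Nehalem as limited to 0000_0002.

```python
def guest_visible_topology(level, cores, threads):
    """Return (cores_per_package, threads_per_core) as a guest might see them.

    Simplified assumption: a guest only consults CPUID leaf 4 when the
    advertised maximum basic leaf ("level") is at least 4. Without leaf 4
    it cannot tell cores from threads, so every logical CPU in a package
    is counted as a thread of a single core.
    """
    logical_per_package = cores * threads
    if level >= 4:
        # Leaf 4 is readable: three-level topology (package;core;thread).
        return cores, threads
    # Leaf 4 is skipped: two-level topology (package;thread) only.
    return 1, logical_per_package
```

Under this model, `guest_visible_topology(2, 2, 2)` yields one core with four threads, which matches the cpu-z and /proc/cpuinfo results reported below for Penryn and Nehalem, while a level of 4 (as in cpu64-rhel6) yields the configured values.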
Description
Shaolong Hu
2011-03-22 03:40:55 UTC
A cpu cores / physical id / siblings issue seems to exist on an AMD host with rhel3.9, rhel4.9, rhel5.6-64, rhel6.0.z and rhel6.1 guest images. Two examples are listed here.

In the rhel4.9-32-virtio guest, the values of physical id, siblings and cpu cores seem wrong.

CLI:
/usr/libexec/qemu-kvm -M rhel6.1.0 -enable-kvm -m 4096 -smp 2,cores=4,threads=1,socket=2 -cpu Opteron_G3,-x2apic,+svm -name rhel4.9 ...-boot c -drive file=/mnt/images/RHEL-4.9-32-virtio.qcow2...

processor : 0
vendor_id : AuthenticAMD
cpu family : 15
model : 6
model name : AMD Opteron 23xx (Gen 3 Class Opteron)
stepping : 1
cpu MHz : 2295.061
cache size : 512 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht pni syscall nx lm pni popcnt
bogomips : 4598.44

processor : 1
vendor_id : AuthenticAMD
cpu family : 15
model : 6
model name : AMD Opteron 23xx (Gen 3 Class Opteron)
stepping : 1
cpu MHz : 2295.061
cache size : 512 KB
physical id : 0
siblings : 2
core id : 1
cpu cores : 2
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht pni syscall nx lm pni popcnt
bogomips : 4857.89

*********************************************************************
When booting a rhel3.9-32 guest with the same CLI, cat /proc/cpuinfo outputs the correct siblings and cpu cores numbers, but a wrong physical id in the guest.
processor : 0
vendor_id : AuthenticAMD
cpu family : 15
model : 6
model name : AMD Opteron 23xx (Gen 3 Class Opteron)
stepping : 1
cpu MHz : 2293.758
cache size : 512 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 4
runqueue : 0
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx lm
bogomips : 4561.30

processor : 1
vendor_id : AuthenticAMD
cpu family : 15
model : 6
model name : AMD Opteron 23xx (Gen 3 Class Opteron)
stepping : 1
cpu MHz : 2293.758
cache size : 512 KB
physical id : 0
siblings : 4
core id : 1
cpu cores : 4
runqueue : 1
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx lm
bogomips : 4574.41

******************************************************************************
Boot the win2008-32-virtio image with -smp 2,cores=4,threads=1,socket=2 -cpu host... and check the CPU with x86info; the family and CPU number do not seem right.

On the AMD host, I get this:
# ./x86info -f -a
x86info v1.29beta. Dave Jones 2001-2011
Feedback to <davej>.
Found 4 identical CPUs
Extended Family: 1 Extended Model: 0 Family: 15 Model: 2 Stepping: 3
CPU Model (x86info's best guess): Quad-Core Opteron/Phenom (DR-B3)
Processor name string (BIOS programmed): AMD Phenom(tm) 9600B Quad-Core Processor
Number of reporting banks : 6

In the win2008-32-virtio guest, I get:
Found 2 CPUs
Family: 16 Model: 2 Stepping: 3
CPU Model: Unknown CPU
Processor name string : AMD Phenom(tm) 9600B Quad-Core Processor

x86info also complains: "WARNING: Detected SMP, but unable to access cpuid driver.Used Uniprocessor CPU routines. Results inaccurate."

On the AMD host, issue x86info:
./x86info -f -a
x86info v1.29beta. Dave Jones 2001-2011
Feedback to <davej>.
Found 4 identical CPUs
Extended Family: 1 Extended Model: 0 Family: 15 Model: 2 Stepping: 3
CPU Model (x86info's best guess): Quad-Core Opteron/Phenom (DR-B3)
Processor name string (BIOS programmed): AMD Phenom(tm) 9600B Quad-Core Processor
Number of reporting banks : 6
...
*******************************************************
Issue cat /proc/cpuinfo on the same AMD host:
processor : 3
vendor_id : AuthenticAMD
cpu family : 16
model : 2
model name : AMD Phenom(tm) 9600B Quad-Core Processor
stepping : 3
cpu MHz : 1150.000
cache size : 512 KB
physical id : 0
siblings : 4
core id : 3
cpu cores : 4
apicid : 3
initial apicid : 3
fpu : yes
fpu_exception : yes
cpuid level : 5
...

The CPU family reported by x86info differs from the one reported by /proc/cpuinfo. So which one should I trust?

(In reply to comment #2)
> Seems cpu cores&physical id&siblings issue exists in AMD host with
> rhel3.9&rhel4.9&rhel5.6-64&rhel6.0.z&rhel6.1 guest image. Here lists two:
> In rhel4.9-32-virtio guest, value of physical id & siblings & cpu cores seems
> wrong.
> CLI:
> /usr/libexec/qemu-kvm -M rhel6.1.0 -enable-kvm -m 4096 -smp
> 2,cores=4,threads=1,socket=2 -cpu Opteron_G3,-x2apic,+svm -name rhel4.9
> ...-boot c -drive file=/mnt/images/RHEL-4.9-32-virtio.qcow2...

If I boot the image and change the n of option -smp from 2 to 8, the cpu topology (cpu cores, physical id, processor) is correct. So I guess that on an AMD host, the point is that n should be equal to cores*threads*sockets when booting the guest.

This issue also exists in a winxp guest.
1.
cmd: /usr/libexec/qemu-kvm -monitor stdio -drive file='/home/Auto/autotest/client/tests/kvm/images/winXP-32-virtio.qcow2',index=0,if=none,id=drive-virtio-disk1,media=disk,cache=none,format=qcow2,aio=native -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk1,id=virtio-disk1 -device virtio-net-pci,netdev=idOpKnu6,mac=9a:c7:11:c1:0e:1a,netdev=idOpKnu6,id=ndev00idOpKnu6,bus=pci.0,addr=0x3 -netdev tap,id=idOpKnu6,vhost=on,ifname='t0-131554-tp5H',script='/home/Auto/autotest/client/tests/kvm/scripts/qemu-ifup-switch',downscript='no' -m 4096 -smp 4,cores=2,threads=2,sockets=1 \
-cpu Penryn -spice port=8000,disable-ticketing -vga qxl -rtc base=localtime,clock=host,driftfix=none -boot order=cdn,once=c,menu=off -usbdevice tablet -enable-kvm
2. cpu-z result:
processor:1 cores:1 threads: 4

(In reply to comment #0)
> 1. Boot guest with:
>
> /usr/libexec/qemu-kvm -M rhel6.1.0 -cpu Nehalem -enable-kvm -smp
> cores=4,threads=2,sockets=2 -m 4G -name RHEL6.0-32-virtio-qcow2

Looks like the above smp arg # was truncated, what value was used?

> Expected results:
> ------------------
> CPU cores should be 4.

I believe the guest's mapping calculation is getting confused when smp# < sockets * cores. What is "x86info -a -f" telling you in the same case?

> Additional info:
> -----------------
> Westmere is fine, Penryn and Conroe have the same problem.

That's rather odd as qemu is exporting the topology in the same manner independent of cpu model. Although the guest could conceivably be interpreting the cpuid data differently depending on model. Could you recheck Westmere topology is indeed reported correctly by the guest compared to identical qemu CLI arguments for Nehalem/Penryn/Conroe?

(In reply to comment #4)
> If I boot image and change the n of option -smp from 2 to 8, cpu topology(cpu
> cores, physical id, processor) is correct. So I guess in AMD host, the point is
> n should be equal to cores*threads*sockets when booting guest.

That's true for intel cpus as well.
The internal relationship is:

sockets = smp / (cores * threads)

with missing/default values being calculated. However, this is suspicious as well, since changing smp from 2 to 8 should not cause the exported cpuid data to change. I'd hazard the guest is doing its own interpretation of the topology.

(In reply to comment #3)
> The CPU family x86info and /proc/cpuinfo report is different. So which one
> should I trust?

I trust the raw data dumped by x86info far more than the kernel's /proc/cpuinfo or even x86info's mode where it attempts to interpret the raw data.

(In reply to comment #6)
> (In reply to comment #0)
>
> > 1. Boot guest with:
> >
> > /usr/libexec/qemu-kvm -M rhel6.1.0 -cpu Nehalem -enable-kvm -smp
> > cores=4,threads=2,sockets=2 -m 4G -name RHEL6.0-32-virtio-qcow2
>
> Looks like the above smp arg # was truncated, what value was used?

It's not truncated; I don't specify it at all. In this case, it's equal to -smp 16,cores=4,threads=2,socket=2, right? I also tried specifying it, and the result is the same.

>
> > Expected results:
> > ------------------
> > CPU cores should be 4.
>
> I believe the guest's mapping calculation is getting confused when
> smp# < sockets * cores. What is "x86info -a -f" telling you in the
> same case?

x86info suggests that the total number of CPUs is right, which in this case is 16, but cpu cores=1.

>
> > Additional info:
> > -----------------
> > Westmere is fine, Penryn and Conroe have the same problem.
>
> That's rather odd as qemu is exporting the topology in the same
> manner independent of cpu model. Although the guest could conceivably
> be interpreting the cpuid data differently depending on model. Could
> you recheck Westmere topology is indeed reported correctly by the guest
> compared to identical qemu CLI arguments for Nehalem/Penryn/Conroe?
I tested this on 20 guests, including RHEL and Windows, 32 bit and 64 bit, using the same command line as in comment 0. As you asked, I rechecked it on an E5620 machine with -smp cores=2,threads=2,sockets=2, testing Westmere and Nehalem; the results are attached as x86info-westmere and x86info-Nehalem. I then tested -cpu Nehalem -smp 8,cores=2,threads=2,sockets=2; the result is x86info-Nehalem-2, which shows no difference from x86info-Nehalem.

>
> (In reply to comment #4)
>
> > If I boot image and change the n of option -smp from 2 to 8, cpu topology(cpu
> > cores, physical id, processor) is correct. So I guess in AMD host, the point is
> > n should be equal to cores*threads*sockets when booting guest.
>
> That's true for intel cpus as well. The internal relationship is:
>
> sockets = smp / (cores * threads)
>
> with missing/default values being calculated. However this is as
> well suspicions as smp 2 -> 8 should not cause the exported cpuid
> data to change. I'd hazard the guest is doing its own interpretation
> of the topology.
>
>
> (In reply to comment #3)
>
> > The CPU family x86info and /proc/cpuinfo report is different. So which one
> > should I trust?
>
> I trust the raw data dumped by x86info far more than the the kernel's
> /proc/cpuinfo or even x86info's mode where it attempts to interpret
> the raw data.

Created attachment 488348 [details]
x86info-westmere
Created attachment 488349 [details]
x86info-Nehalem
Created attachment 488350 [details]
x86info-Nehalem-2
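
The relationship quoted in the comments above, sockets = smp / (cores * threads) with missing values being calculated, can be sketched as follows. This is an illustrative simplification and not qemu's actual option parsing; the names `SmpConfig` and `resolve_smp` and the defaulting rules are assumptions made for the example.

```python
from typing import NamedTuple, Optional


class SmpConfig(NamedTuple):
    smp: int
    sockets: int
    cores: int
    threads: int
    consistent: bool  # True when smp == sockets * cores * threads


def resolve_smp(smp: Optional[int] = None, sockets: Optional[int] = None,
                cores: int = 1, threads: int = 1) -> SmpConfig:
    """Fill in a missing smp or sockets value (hypothetical defaulting rules)."""
    if smp is None:
        # No explicit vCPU count: derive it from the full topology.
        sockets = sockets if sockets is not None else 1
        smp = sockets * cores * threads
    elif sockets is None:
        # Apply the quoted relationship: sockets = smp / (cores * threads).
        sockets = max(smp // (cores * threads), 1)
    consistent = smp == sockets * cores * threads
    return SmpConfig(smp, sockets, cores, threads, consistent)
```

With these assumptions, `resolve_smp(sockets=2, cores=4, threads=2)` gives smp=16, in line with the exchange above, while `resolve_smp(smp=2, sockets=2, cores=4, threads=1)` is flagged as inconsistent because 2 is not equal to 2*4*1.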
(In reply to comment #8)
> (In reply to comment #6)
> > (In reply to comment #0)
> >
> > > 1. Boot guest with:
> > >
> > > /usr/libexec/qemu-kvm -M rhel6.1.0 -cpu Nehalem -enable-kvm -smp
> > > cores=4,threads=2,sockets=2 -m 4G -name RHEL6.0-32-virtio-qcow2
> >
> > Looks like the above smp arg # was truncated, what value was used?
>
> It's not truncated, i don't specify it at all, in this case, it's equal to -smp
> 16,cores=4,threads=2,socket=2, right?

Yes it is, just verifying the cli flags.

> X86info suggests that the total number of cpu is right, which in this case is
> 16, but cpu cores=1.

Yes, I'm able to reproduce your results for Nehalem as well as explain the difference for Westmere. In the case of Nehalem, the cpuid range advertised to the guest is limited to 0000_0002 even though the raw cpuid data is being fully exported. This is causing confusion for guest code which strictly interprets the raw cpuid data.

Tried on a Nehalem host with a Windows 7 64-bit guest; the issue is also triggered when the CLI overrides the vendor.
steps:
start VM:
1. <commandLine> -cpu <cpu_model>,vendor="AuthenticAMD" -smp 4,cores=2,threads=1,sockets=2
Actual results:
in cpu-z: Selection=4, cores=1, Threads=1
Additional info:
tried without vendor: Selection=2, cores=2, threads=1, works as expected.

Tried on a Penryn host with win7-32/64 guests; incorrect cpu info is shown.
cmd: -cpu host/Conroe/Penryn/ -smp 8,cores=4,threads=1,sockets=2
Inside the guest, cpu-z shows:

cpu-z results:              host   Conroe   Penryn
processor (should be 2):    2      2        2
cores (should be 4):        4      1        1
threads (should be 1):      4      4        4

host:
kernel-2.6.32-125.el6.x86_64
qemu-kvm-0.12.1.2-2.153.el6.x86_64
cpuinfo:
processor : 3
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66GHz
stepping : 10
cpu MHz : 2660.430
cache size : 3072 KB
physical id : 0
siblings : 4
core id : 3
cpu cores : 4

Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

New Contents:
Cause: some CPU models defined on qemu-kvm have a low "level" value (< 4). This includes the models: Conroe, Penryn, Nehalem.
Consequence: the guest OS won't read the CPUID.4 leaf, that contains extra information about the CPU topology, and won't recognize the three-level topology info (package;core;thread), only two-level topology info (package;thread) will be recognized.
Workaround: three possible workarounds:
1) Do not define multi-core vCPUs when using Conroe, Penryn, Nehalem CPU models; or
2) Use the default cpu64-rhel6 CPU model (that has level=4) if a multi-core vCPU is required; or
3) Override the "level" parameter to a value >=4 when using Conroe, Penryn or Nehalem. e.g.: "-cpu Nehalem,level=4". Unfortunately this is not possible to be done on the libvirt XML definition.

Technical note updated. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -5,4 +5,4 @@
 Workaround: three possible workarounds:
 1) Do not define multi-core vCPUs when using Conroe, Penryn, Nehalem CPU models; or
 2) Use the default cpu64-rhel6 CPU model (that has level=4) if a multi-core vCPU is required; or
-3) Override the "level" parameter to a value >=4 when using Conroe, Penryn or Nehalem. e.g.: "-cpu Nehalem,level=4". Unfortunately this is not possible to be done on the libvirt XML definition.
+3) Override the "level" parameter to a value >=4 when using Conroe, Penryn or Nehalem. e.g.: "-cpu Nehalem,level=4". Unfortunately this is not possible to be done on the libvirt XML definition, but it can be done on the Qemu command-line.

Tested on the following versions.
host:
# rpm -q kernel
kernel-2.6.32-279.el6.x86_64
# rpm -q qemu-kvm
qemu-kvm-0.12.1.2-2.307.el6.x86_64
guest:
kernel-2.6.32-303.el6.x86_64
steps:
1.boot guest with "-cpu Nehalem,-smp 4,sockets=1,cores=2,threads=2..."
2.in guest :
# cat /proc/cpuinfo |grep cores
cpu cores : 1
cpu cores : 1 ------>wrong value
cpu cores : 1
cpu cores : 1
[root@virtlab-66-84-200 ~]# cat /proc/cpuinfo |grep "physical id"|sort|uniq|wc -l
1
[root@virtlab-66-84-200 ~]# cat /proc/cpuinfo |grep siblings
siblings : 4
siblings : 4
siblings : 4
siblings : 4

Additional info: booting the guest with "-cpu SandyBridge,-smp 4,sockets=1,cores=2,threads=2" does not hit the problem.

Verified on: qemu-kvm-rhev-0.12.1.2-2.331.el6.x86_64

CMD: -cpu Nehalem -smp cores=4,threads=2,sockets=2
[root@localhost ~]# cat /proc/cpuinfo | grep "physical id" | wc -l
16
[root@localhost ~]# cat /proc/cpuinfo | grep "physical id" | sort | uniq
physical id : 0
physical id : 1
[root@localhost ~]# cat /proc/cpuinfo | grep "cpu cores" | uniq
cpu cores : 4
[root@localhost ~]# cat /proc/cpuinfo | grep "siblings" | uniq
siblings : 8

Testing with "-cpu Penryn -smp cores=4,threads=2,sockets=2" and "-cpu Conroe -smp cores=4,threads=2,sockets=2" also works well.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
http://rhn.redhat.com/errata/RHBA-2013-0527.html |
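
The grep-based verification steps used in the comments above could be combined into one check along these lines. This is only a sketch: the function name `summarize_cpuinfo` is hypothetical, and it assumes the "key : value" layout of /proc/cpuinfo shown throughout this report.

```python
def summarize_cpuinfo(text):
    """Summarize topology fields from /proc/cpuinfo-style text.

    Returns the number of logical processors, the number of distinct
    "physical id" values (sockets), and the distinct "cpu cores" and
    "siblings" values found.
    """
    processors = 0
    physical_ids = set()
    cpu_cores = set()
    siblings = set()
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "key : value" line (e.g. a blank separator)
        key, value = key.strip(), value.strip()
        if key == "processor":
            processors += 1
        elif key == "physical id":
            physical_ids.add(value)
        elif key == "cpu cores":
            cpu_cores.add(int(value))
        elif key == "siblings":
            siblings.add(int(value))
    return {
        "processors": processors,
        "sockets": len(physical_ids),
        "cpu_cores": sorted(cpu_cores),
        "siblings": sorted(siblings),
    }
```

Inside a guest, the contents of /proc/cpuinfo would be passed in as `text`. For the verified Nehalem case above (-smp cores=4,threads=2,sockets=2), the expected summary would be 16 processors, 2 sockets, cpu cores of 4 and siblings of 8.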