Bug 1135772

Summary:	Qemu should prevent the configuration of 'AMD & threads>1'
Product:	Red Hat Enterprise Linux 7	Reporter:	CongLi <coli>
Component:	qemu-kvm-rhev	Assignee:	Wei Huang (AMD) <wehuang>
Status:	CLOSED ERRATA	QA Contact:	Virtualization Bugs <virt-bugs>
Severity:	low	Docs Contact:
Priority:	low
Version:	7.2	CC:	ehabkost, hhuang, imammedo, juzhang, michen, mrezanin, scui, shuang, virt-maint, xuhan, xwei
Target Milestone:	rc
Target Release:	---
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:	qemu-2.3	Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:
Clones:	1135773 (view as bug list)		Environment:
Last Closed:	2015-12-04 16:18:09 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks:	1135773

Description CongLi 2014-08-31 12:02:46 UTC

Description of problem:
Qemu should prevent the configuration of 'AMD & threads>1'

As we know, AMD does not support hyper threading.
But when boot up a guest with the following CML, there is no error:
    -smp 16,cores=4,threads=2,sockets=2  \
    -cpu 'Opteron_G4',+kvm_pv_unhalt \

And according to the manpage of qemu-kvm, this operation should 
be not allowed, but the limitation is not implemented in QEMU,
For AMD CPU, QEMU calculates number of logical processors per package (cores)
as (cores * threads).

Check the guest, we can get "cores=8,threads=1,sockets=2":
2014-08-31 17:14:16: [root@unused ~]#
2014-08-31 17:14:20: lscpu | grep '^Core'
2014-08-31 17:14:20: Core(s) per socket:    8
2014-08-31 17:14:20: [root@unused ~]#
2014-08-31 17:14:20: echo $?
2014-08-31 17:14:20: 0
2014-08-31 17:14:20: [root@unused ~]#
2014-08-31 17:14:20: lscpu | grep '^Thread'
2014-08-31 17:14:20: Thread(s) per core:    1
2014-08-31 17:14:20: [root@unused ~]#
2014-08-31 17:14:20: echo $?
2014-08-31 17:14:20: 0
2014-08-31 17:14:20: [root@unused ~]#
2014-08-31 17:14:20: lscpu | grep -E '(^Socket|^CPU socket)'
2014-08-31 17:14:20: Socket(s):             2

Version-Release number of selected component (if applicable):
host:
kernel-3.10.0-147.el7.x86_64
qemu-kvm-rhev-2.1.0-3.el7.x86_64

guest:
kernel-3.10.0-146.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot a RHEL.7.1 guest on AMD host with:
    -smp 16,cores=4,threads=2,sockets=2  \
    -cpu 'Opteron_G4',+kvm_pv_unhalt \

2.
3.

Actual results:
guest can boot up successfully 

Expected results:
Qemu should prevent this configuration

Additional info:
1. QEMU CML:
/bin/qemu-kvm \
    -S  \
    -name 'virt-tests-vm1'  \
    -sandbox off  \
    -M pc  \
    -nodefaults  \
    -vga cirrus  \
    -chardev socket,id=qmp_id_qmpmonitor1,path=/tmp/monitor-qmpmonitor1-20140831-171332-NgOMgnq2,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control  \
    -chardev socket,id=serial_id_serial0,path=/tmp/serial-serial0-20140831-171332-NgOMgnq2,server,nowait \
    -device isa-serial,chardev=serial_id_serial0  \
    -chardev socket,id=seabioslog_id_20140831-171332-NgOMgnq2,path=/tmp/seabios-20140831-171332-NgOMgnq2,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20140831-171332-NgOMgnq2,iobase=0x402 \
    -device ich9-usb-uhci1,id=usb1,bus=pci.0,addr=03 \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=04 \
    -drive id=drive_image1,if=none,cache=none,snapshot=off,aio=native,file=/home/staf-kvm-devel/autotest-devel/client/tests/virt/shared/data/images/RHEL-Server-7.1-64-virtio.qcow2 \
    -device scsi-hd,id=image1,drive=drive_image1,serial=GY-\`pK \
    -device virtio-net-pci,mac=9a:1d:1e:1f:20:21,id=id33kJVK,vectors=4,netdev=idHJzv3I,bus=pci.0,addr=05  \
    -netdev tap,id=idHJzv3I,vhost=on,vhostfd=26,fd=25  \
    -m 16384  \
    -smp 16,cores=4,threads=2,sockets=2  \
    -cpu 'Opteron_G4',+kvm_pv_unhalt \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1  \
    -vnc :0  \
    -rtc base=utc,clock=host,driftfix=slew  \
    -boot order=cdn,once=c,menu=off  \
    -no-kvm-pit-reinjection \
    -enable-kvm

2. host info:
processor	: 23
vendor_id	: AuthenticAMD
cpu family	: 21
model		: 1
model name	: AMD Opteron(TM) Processor 6234                 
stepping	: 2
microcode	: 0x6000626
cpu MHz		: 2400.059
cache size	: 2048 KB
physical id	: 1
siblings	: 12
core id		: 5
cpu cores	: 6
apicid		: 75
initial apicid	: 43
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid amd_dcm aperfmperf pni pclmulqdq monitor ssse3 cx16 sse4_1 sse4_2 popcnt aes xsave avx lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 nodeid_msr topoext perfctr_core perfctr_nb arat cpb hw_pstate npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold
bogomips	: 4799.76
TLB size	: 1536 4K pages
clflush size	: 64
cache_alignment	: 64
address sizes	: 48 bits physical, 48 bits virtual
power management: ts ttp tm 100mhzsteps hwpstate cpb

Comment 3 Wei Huang (AMD) 2014-10-07 20:28:54 UTC

Confirmed. I have posted a patch to upstream qemu:

http://lists.nongnu.org/archive/html/qemu-devel/2014-10/msg00794.html

-Wei

Comment 4 Wei Huang (AMD) 2014-10-08 15:44:53 UTC

While the patch is being reviewed by the upstream, here are the detailed findings:

1. First of all, in my opinion this isn't a bug. Even though users can specifify "threads=n" for AMD CPU, QEMU converts such settings into a proper CPUID. Here are some examples:

** "-smp 8,cores=2,threads=2,sockets=2" will be converted to "2 sockets, 4 cores/socket" for guest VM;
** "-smp 8,cores=4,threads=2" will be converted to "1 socket, 8 cores/socket".

2. QEMU takes care of the conversion of threads=n to proper CPUID_0000_0001_EBX[LogicalProcCount] and CPUID_8000_0008_ECX[NC] for AMD CPU. Please refer to http://support.amd.com/TechDocs/25481.pdf for details. Per Paolo, the following patch was original patch to address this problem:

===========
commit 400281af34e5ee6aa9f5496b53d8f82c6fef9319
Author: Andre Przywara <andre.przywara>
Date:   Wed Aug 19 15:42:42 2009 +0200

    set CPUID bits to present cores and threads topology
    
    Controlled by the enhanced -smp option set the CPUID bits to present the
    guest the desired topology. This is vendor specific, but (with the exception
    of the CMP_LEGACY bit) not conflicting, so we set all bits everytime.
    There is no real multithreading support for AMD CPUs, so report cores
    instead.
    
    Signed-off-by: Andre Przywara <andre.przywara>
    Signed-off-by: Anthony Liguori <aliguori.com>
============

For this bug, the key question is: should QEMU presents CPUIDs strictly as specified by the command line or QEMU can tweak a
little bit on behalf of end-users? I think either way is fine. That is why I called it a non-bug.

-Wei

Comment 5 Eduardo Habkost 2014-10-08 17:46:00 UTC

(In reply to Wei Huang from comment #4)
> For this bug, the key question is: should QEMU presents CPUIDs strictly as
> specified by the command line or QEMU can tweak a
> little bit on behalf of end-users? I think either way is fine. That is why I
> called it a non-bug.

I believe QEMU should either: 1) do exactly what the user asked for; or 2) reject the configuration from the user; and never tweak the settings on behalf of the user.

But we have another question: is the current behavior really "tweaking"? The number of VCPUs per socket must match nr_cores*nr_threads, otherwise the guest will believe some VCPUs are on the wrong socket. We are just not telling the guest that the user also wanted the VCPUs inside each socket to be grouped as "threads" inside "cores" (to be more accurate, we not even hiding that information from the guest completely, as CPUID[4] has it. It is just that the guest chooses to ignore those bits because they are reserved on AMD docs).

And, even if the current behavior is considered "tweaking": keeping compatibility is a strong enough reason to not start rejecting configurations that work today, even if they look odd. Probably the best we can do is to generate a warning on libvirt and/or QEMU. Or, if we decide to really reject some configurations, we can do that only on newer machine-types.

Comment 6 Wei Huang (AMD) 2014-10-08 18:05:36 UTC

(In reply to Eduardo Habkost from comment #5)
> (In reply to Wei Huang from comment #4)
> > For this bug, the key question is: should QEMU presents CPUIDs strictly as
> > specified by the command line or QEMU can tweak a
> > little bit on behalf of end-users? I think either way is fine. That is why I
> > called it a non-bug.
> 
> I believe QEMU should either: 1) do exactly what the user asked for; or 2)
> reject the configuration from the user; and never tweak the settings on
> behalf of the user.
That is what the patch does.

> 
> But we have another question: is the current behavior really "tweaking"? The
> number of VCPUs per socket must match nr_cores*nr_threads, otherwise the
> guest will believe some VCPUs are on the wrong socket. We are just not
> telling the guest that the user also wanted the VCPUs inside each socket to
> be grouped as "threads" inside "cores" (to be more accurate, we not even

Current behavior still groups nr_cores*nr_threads on the same socket. But it doesn't honor the request of "threads inside cores". QEMU sliently drops the notion of threads and bumps up nr_cores instead.

> hiding that information from the guest completely, as CPUID[4] has it. It is
> just that the guest chooses to ignore those bits because they are reserved
> on AMD docs).

I think this behavior is legistimate. AMD choose to ignore it because HyperThreading doesn't exist on their CPUs.

> 
> And, even if the current behavior is considered "tweaking": keeping
> compatibility is a strong enough reason to not start rejecting
> configurations that work today, even if they look odd. Probably the best we
> can do is to generate a warning on libvirt and/or QEMU. Or, if we decide to
> really reject some configurations, we can do that only on newer
> machine-types.

Paolo has concerns regarding compability issue as well. If the patch was rejected by the upstream, I think your first suggestion would be a good choice (at least some level of warning). 

Regarding the new machine types, could you elaborate? I am not quite sure about this appraoch. Right now we have various AMD types (Opteron G1, Opteron G2, ...). A new machine type might really confuse users.

Comment 7 Eduardo Habkost 2014-10-08 18:12:49 UTC

(In reply to Wei Huang from comment #6)
> > And, even if the current behavior is considered "tweaking": keeping
> > compatibility is a strong enough reason to not start rejecting
> > configurations that work today, even if they look odd. Probably the best we
> > can do is to generate a warning on libvirt and/or QEMU. Or, if we decide to
> > really reject some configurations, we can do that only on newer
> > machine-types.
> 
> Paolo has concerns regarding compability issue as well. If the patch was
> rejected by the upstream, I think your first suggestion would be a good
> choice (at least some level of warning). 
> 
> Regarding the new machine types, could you elaborate? I am not quite sure
> about this appraoch. Right now we have various AMD types (Opteron G1,
> Opteron G2, ...). A new machine type might really confuse users.

Opteron_G[1234] are CPU models, not machine-types. machine-types are the mechanism we use to introduce guest-visible changes while keeping compatibility of existing configurations. See the QEMUMachine structs, compat_props arrays, and pc_compat_*() functions on hw/i386/pc_piix.c and hw/i386/pc_q35.c. For an example of a compatibility bit directly related to CPU topology, see enable_compat_apic_id_mode().

Comment 8 Wei Huang (AMD) 2014-10-22 14:45:00 UTC

Here is a quick update:

1) The consensus from QEMU community is to give a warning, instead of stopping QEMU. 
2) Based on the idea of (1), the proposal is to have warnings in two components: QEMU and virt-manager. 

The patch to print warning in QEMU has been accepted: http://lists.nongnu.org/archive/html/qemu-devel/2014-10/msg02275.html.

Comment 9 Wei Huang (AMD) 2014-10-27 16:06:38 UTC

The virt-manager patches have been posted to its mailing list:
https://www.redhat.com/archives/virt-tools-list/2014-October/msg00135.html

Comment 12 Xiaoqing Wei 2015-07-02 08:42:46 UTC

patch introduced by commit
e48638f target-i386: warns users when CPU threads>1 for non-Intel CPUs

[root@dhcp-11-50 qemu-kvm-rhev7]# rpm -q qemu-img-rhev
qemu-img-rhev-2.3.0-6.el7.x86_64
[root@dhcp-11-50 qemu-kvm-rhev7]# /usr/libexec/qemu-kvm -S -monitor stdio -smp 4,cores=1,threads=2,sockets=2 
QEMU 2.3.0 monitor - type 'help' for more information
(qemu) qemu-kvm: AMD CPU doesn't support hyperthreading. Please configure -smp options properly.
VNC server running on `::1:5900'

(qemu) q

Comment 14 errata-xmlrpc 2015-12-04 16:18:09 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2546.html