Bug 1238574 - qemu-kvm does not expose expected cpu topology to guest when wrong cpu topology is defined.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.2
Hardware: ppc64le
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Assignee: Thomas Huth
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks: RHV4.1PPC
 
Reported: 2015-07-02 07:56 UTC by Dan Zheng
Modified: 2016-11-07 20:26 UTC
19 users

Fixed In Version: qemu-kvm-rhev-2.5.0-1.el7
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-11-07 20:26:20 UTC
Target Upstream Version:


Attachments


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2016:2673 normal SHIPPED_LIVE qemu-kvm-rhev bug fix and enhancement update 2016-11-08 01:06:13 UTC

Description Dan Zheng 2015-07-02 07:56:16 UTC
Description of problem:

When a wrong (inconsistent) CPU topology is defined, qemu-kvm does not expose the expected CPU topology to the guest.

Test two scenarios:
A: -smp 4,sockets=1,cores=4,threads=2 
B: -smp 6,sockets=1,cores=3,threads=4 
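Both topologies are internally inconsistent: the product sockets * cores * threads does not match the -smp vCPU count. A minimal consistency check can be sketched as follows (illustrative Python, not QEMU code; the function name is made up for this example):

```python
def consistent(n, sockets, cores, threads):
    # An -smp line is self-consistent when the topology product
    # equals the requested vCPU count n.
    return sockets * cores * threads == n

print(consistent(4, sockets=1, cores=4, threads=2))  # scenario A: 1*4*2 = 8 != 4 -> False
print(consistent(6, sockets=1, cores=3, threads=4))  # scenario B: 1*3*4 = 12 != 6 -> False
```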

Version-Release number of selected component (if applicable):
libvirt-1.2.16-1.el7.ppc64le
qemu-kvm-rhev-2.3.0-5.el7.ppc64le
kernel-3.10.0-282.el7.ppc64le

How reproducible:
100%

Steps to Reproduce:

Scenario A:
1. Edit the XML with the settings below:
.......
<vcpu placement='static'>4</vcpu>
......
<cpu> <topology sockets='1' cores='4' threads='2'/> </cpu>

2. Check the qemu process; the parameters are set correctly by libvirt:
# ps -ef|grep qemu-kvm
qemu 21296 ... -smp 4,sockets=1,cores=4,threads=2 ......

3. Use the following formula to calculate the expected socket/cores/threads values (is this how current QEMU actually behaves?):
smp = n (argument to -smp)
nr_threads = smp / (cores * sockets)
nr_cores = cores
smp = n which is 4
nr_threads = 4 /(4 * 1) which is 1
nr_cores = cores which is 4
So the expected socket/cores/threads in guest is 1/4/1
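The reporter's assumed calculation can be sketched as follows (a hypothetical helper illustrating the formula above, not QEMU's actual behaviour; comment 3 below explains the real -smp semantics):

```python
def reporter_expected_topology(smp, sockets, cores):
    # Reporter's assumed formula: cores are taken as given, and the
    # threads-per-core value is re-derived from the total vCPU count.
    nr_threads = smp // (cores * sockets)
    return (sockets, cores, nr_threads)

print(reporter_expected_topology(4, sockets=1, cores=4))  # (1, 4, 1)
```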

4. Log in to the guest:
[root@localhost ~]# lscpu
Architecture: ppc64le
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 2
Core(s) per socket: 2
Socket(s): 1
NUMA node(s): 1
Model: IBM pSeries (emulated by qemu)
L1d cache: 64K
L1i cache: 32K
NUMA node0 CPU(s): 0-3

Actual result:
socket/cores/threads in guest is 1/2/2
The number of cores changed, which is unexpected.


Expected result:
socket/cores/threads in guest is 1/4/1


Scenario B:
1. Use the settings below:
.......
<vcpu placement='static'>6</vcpu>
......
<cpu> <topology sockets='1' cores='3' threads='4'/> </cpu>

2. Check the qemu command line; it is correct:
-smp 6,sockets=1,cores=3,threads=4 

3. Check in the guest:
[root@localhost ~]# lscpu
Architecture: ppc64le
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-5
Off-line CPU(s) list: 6,7
Thread(s) per core: 3
Core(s) per socket: 2
Socket(s): 1
NUMA node(s): 1
Model: IBM pSeries (emulated by qemu)
L1d cache: 64K
L1i cache: 32K
NUMA node0 CPU(s): 0-5

Actual result:
socket/cores/threads in guest is 1/2/3
CPU(s) :8
Off-line CPU(s) list: 6,7

Expected result:
CPU(s): 6
No offline CPU
socket/cores/threads in guest is 1/3/2



Additional information:
-- 
see attachment for libvirtd.log

Comment 2 Karen Noel 2015-07-06 10:44:16 UTC
Dan, Is this problem specific to ppc64le? Or, does it also happen on X86_64? 

Thanks.

Comment 3 David Gibson 2015-07-07 00:47:42 UTC
Your understanding of the -smp options is not correct.

  -smp n,sockets=s,cores=c,threads=t

n = total number of threads in the whole system
s = total number of sockets in the system
c = number of cores per socket
t = number of threads per core

If n != s * c * t, then n takes precedence over s.  So, qemu will allocate all the threads in the first core, then all the threads in the next core until the total number of threads is reached.  So, the expected topology is 1/2/2 which is what you see.

In scenario B, qemu will allocate all 4 threads in core 0, then 2 of the 4 threads in core 1 before running out of total threads.  Because the guest expects cores to have an equal number of threads, this is represented as 2 cores of 4 threads each, but the last two threads of core1 are offline as shown in the lscpu output.  I think "3 threads per core" value shown by lscpu is just lscpu being confused by the strange situation of having cores with a different number of online threads each.
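The allocation order described above can be modelled with a short sketch (an illustrative model of the pre-fix behaviour with made-up names, not QEMU's actual code):

```python
def place_threads(n, sockets, cores, threads):
    # Pre-fix behaviour per the explanation above: n takes precedence,
    # and vCPU threads fill each core in order until n are placed.
    placed = []
    for core in range(sockets * cores):
        for thread in range(threads):
            if len(placed) == n:
                return placed
            placed.append((core, thread))
    return placed

# Scenario A: 4 threads fill cores 0 and 1 completely -> guest sees 1/2/2.
print(place_threads(4, sockets=1, cores=4, threads=2))
# Scenario B: core 0 gets 4 threads, core 1 only 2 -> last 2 threads offline.
print(place_threads(6, sockets=1, cores=3, threads=4))
```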

Comment 4 Dan Zheng 2015-07-07 02:15:31 UTC
Re-tested the above two scenarios on x86. Both results are as expected.

Here is the bug tracking a similar issue on x86:
https://bugzilla.redhat.com/show_bug.cgi?id=689665

Version-Release number of selected component (if applicable):

libvirt-1.2.17-1.el7.x86_64
qemu-kvm-rhev-2.3.0-5.el7.x86_64
kernel-3.10.0-221.el7.x86_64

Scenario A: 
In guest:
[root@localhost ~]# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    1
Core(s) per socket:    4
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6

Scenario B: 
In guest:
[root@localhost ~]# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                6
On-line CPU(s) list:   0-5
Thread(s) per core:    2
Core(s) per socket:    3
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6

Comment 5 Dan Zheng 2015-07-07 02:30:51 UTC
(In reply to Karen Noel from comment #2)
> Dan, Is this problem specific to ppc64le? Or, does it also happen on X86_64? 
> 
> Thanks.

This problem is specific to ppc64le now. Similar problem on X86_64 has been fixed before. See https://bugzilla.redhat.com/show_bug.cgi?id=689665

Comment 6 Thomas Huth 2015-07-21 10:56:54 UTC
I just had a look at this bug ticket, and I wonder whether there is a "right" topology in this case at all ... I mean, if the user provides wrong input, there can hardly be a real right output, can there?

IMHO QEMU should simply exit with an error message when this misconfiguration occurs. Would that be an acceptable solution for you, too, Dan Zheng? Then we could try to provide a corresponding patch (which rejects wrong configurations) upstream first and then backport it if it is accepted.

Comment 7 Dan Zheng 2015-07-22 03:12:05 UTC
(In reply to Thomas Huth from comment #6)
> I just had a look at this bug ticket, and I wonder whether there is a
> "right" topology in this case at all ... I mean, if the user provides wrong
> input, there can hardly be a real right output, can it?
> 
> IMHO QEMU should simply exit with an error message when this
> misconfiguration occured. Would that be an acceptable solution for you, too,
> Dan Zheng? Then we could try to provide an according patch (which rejects
> wrong configurations) upstream first and then backport it if it is accepted.

It sounds reasonable. It is a good idea to reject this kind of wrong configuration. Thank you.

Comment 8 Thomas Huth 2015-07-22 08:12:44 UTC
Ok, I'll try to get a check for such wrong configurations into upstream...

Comment 9 Thomas Huth 2015-07-22 14:25:13 UTC
I've now suggested a patch upstream: http://lists.nongnu.org/archive/html/qemu-devel/2015-07/msg04549.html

Comment 11 David Gibson 2015-11-05 04:00:46 UTC
Thomas' patch is upstream in time for qemu-2.4, so we should get it into RHEL 7.3 with the first rebase.

Comment 13 Thomas Huth 2015-11-06 14:34:21 UTC
(In reply to David Gibson from comment #11)
> Thomas' patch is upstream in time for qemu-2.4, so we should get it it RHEL
> 7.3 with the first rebase.

As far as I can see, it will be in QEMU 2.5 ... QEMU 2.4 does not contain the patch yet.

Comment 14 David Gibson 2015-11-08 22:58:41 UTC
Ah, yes, sorry.  Off-by-one error in my thinking.

Comment 16 Xujun Ma 2016-06-03 06:32:21 UTC
Reproduced the issue on the old version:

Version-Release number of selected component (if applicable):
Guest kernel: 3.10.0-327.el7.ppc64le
qemu-kvm-rhev:qemu-kvm-rhev-2.3.0-31.el7_2.8.ppc64le
Host kernel:3.10.0-418.el7.ppc64le


Steps to Reproduce:

1. Start a guest with the command:
/usr/libexec/qemu-kvm \
 -name xuma-vm \
 -smp 4,sockets=1,cores=4,threads=2 \
 -m 8192 \
 -realtime mlock=off \
 -monitor stdio \
 -rtc base=localtime,clock=host \
 -boot strict=on \
 -vnc 0:59 \
 -qmp tcp:0:9999,server,nowait \
\
 -device virtio-scsi-pci,bus=pci.0,addr=0x5 \
\
 -device scsi-hd,id=scsi-hd0,drive=scsi-hd-dr0,bootindex=0 \
 -drive file=RHEL-7.2.qcow2,if=none,id=scsi-hd-dr0,format=qcow2,cache=none \
\
 -device virtio-net-pci,netdev=net0,id=nic0,mac=52:54:00:c4:e7:84 \
 -netdev tap,id=net0,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown,vhost=on \



2. Check the vCPUs in the guest with lscpu:
Architecture: ppc64le
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 2
Core(s) per socket: 2
Socket(s): 1
NUMA node(s): 1
Model: IBM pSeries (emulated by qemu)
L1d cache: 64K
L1i cache: 32K
NUMA node0 CPU(s): 0-3

3. Change the vCPU option to "-smp 6,sockets=1,cores=3,threads=4", then check in the guest:
Architecture: ppc64le
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-5
Off-line CPU(s) list: 6,7
Thread(s) per core: 3
Core(s) per socket: 2
Socket(s): 1
NUMA node(s): 1
Model: IBM pSeries (emulated by qemu)
L1d cache: 64K
L1i cache: 32K
NUMA node0 CPU(s): 0-5


Actual result:
socket/cores/threads in guest is 1/2/2 (scenario A)
socket/cores/threads in guest is 1/2/3 (scenario B)
Expected result:
socket/cores/threads in guest is 1/4/1 (scenario A)
socket/cores/threads in guest is 1/3/2 (scenario B)

Verified the issue on the latest build:
Version-Release number of selected component (if applicable):
Guest kernel: 3.10.0-327.el7.ppc64le
qemu-kvm-rhev:qemu-kvm-rhev-2.6.0-4.el7.ppc64le
Host kernel:3.10.0-418.el7.ppc64le


Steps to Verify:
1. Start a guest with the command:
/usr/libexec/qemu-kvm \
 -name xuma-vm \
 -smp 4,sockets=1,cores=4,threads=2 \
 -m 8192 \
 -realtime mlock=off \
 -monitor stdio \
 -rtc base=localtime,clock=host \
 -boot strict=on \
 -vnc 0:59 \
 -qmp tcp:0:9999,server,nowait \
\
 -device virtio-scsi-pci,bus=pci.0,addr=0x5 \
\
 -device scsi-hd,id=scsi-hd0,drive=scsi-hd-dr0,bootindex=0 \
 -drive file=RHEL-7.2.qcow2,if=none,id=scsi-hd-dr0,format=qcow2,cache=none \
\
 -device virtio-net-pci,netdev=net0,id=nic0,mac=52:54:00:c4:e7:84 \
 -netdev tap,id=net0,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown,vhost=on \

2. Change the vCPU option to "-smp 6,sockets=1,cores=3,threads=4", then run both commands:


Results: the guest cannot be started; qemu-kvm displays the following messages:
qemu-kvm: cpu topology: sockets (1) * cores (4) * threads (2) > maxcpus (4)
qemu-kvm: cpu topology: sockets (1) * cores (3) * threads (4) > maxcpus (6)

So the bug was fixed.
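The fixed behaviour amounts to a startup validation of the same shape as the error messages above. A minimal sketch (an illustrative re-implementation with an assumed function name, not the actual QEMU source):

```python
def validate_smp(maxcpus, sockets, cores, threads):
    # Reject topologies that describe more CPUs than -smp allows,
    # mirroring the error messages printed by the fixed qemu-kvm.
    if sockets * cores * threads > maxcpus:
        raise ValueError(
            "cpu topology: sockets (%d) * cores (%d) * threads (%d) "
            "> maxcpus (%d)" % (sockets, cores, threads, maxcpus))

# Both reported configurations are now rejected at startup:
for args in ((4, 1, 4, 2), (6, 1, 3, 4)):
    try:
        validate_smp(*args)
    except ValueError as err:
        print(err)
```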

Comment 18 errata-xmlrpc 2016-11-07 20:26:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2673.html

