Bug 1741151 - Option die-id of cpu device got wrong default value
Summary: Option die-id of cpu device got wrong default value
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: qemu-kvm
Version: 8.1
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: rc
Target Release: ---
Assignee: Eduardo Habkost
QA Contact: Yumei Huang
URL:
Whiteboard:
Depends On:
Blocks: 1771318
 
Reported: 2019-08-14 11:26 UTC by Yumei Huang
Modified: 2020-05-05 09:47 UTC (History)

Fixed In Version: qemu-kvm-4.2.0-4.module+el8.2.0+5220+e82621dc
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-05-05 09:47:43 UTC
Type: Bug
Target Upstream Version:
Embargoed:
knoel: mirror+



Description Yumei Huang 2019-08-14 11:26:45 UTC
Description of problem:
When a guest is booted with a cpu device and the die-id option is not specified, QEMU fails to start and prints an error message.

Version-Release number of selected component (if applicable):
qemu v4.1.0-rc5
kernel-4.18.0-130.el8.x86_64

How reproducible:
always

Steps to Reproduce:
1. Boot a guest with a cpu device:
# qemu-system-x86_64 -enable-kvm -smp 1,threads=2,cores=1,sockets=3,maxcpus=6 -cpu Haswell-noTSX \
-device Haswell-noTSX-x86_64-cpu,socket-id=0,core-id=0,thread-id=1,id=cpu1 


Actual results:
QEMU fails to start and prints the following error message:
qemu-system-x86_64: -device Haswell-noTSX-x86_64-cpu,socket-id=0,core-id=0,thread-id=1,id=cpu1: Invalid CPU die-id: 4294967295 must be in range 0:2
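
A note on the value in this message: 4294967295 is 2^32 - 1, which is what -1 looks like when a signed 32-bit integer is read as unsigned. The short Python sketch below only demonstrates that arithmetic. That QEMU uses -1 as a "not set" marker for die-id is an assumption here, not something this report confirms, and the names UNSET and as_uint32 are made up for illustration.

```python
# Assumed "not set" marker; this report does not confirm QEMU's internal value.
UNSET = -1


def as_uint32(value: int) -> int:
    """Reinterpret a signed 32-bit integer as an unsigned 32-bit integer."""
    return value & 0xFFFFFFFF


print(as_uint32(UNSET))  # prints 4294967295
```

If that assumption holds, the error above is a range check rejecting an unset die-id rather than a value supplied on the command line.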

Expected results:
QEMU starts successfully and assigns a valid default value for die-id.

Additional info:
If die-id=1 or die-id=2 is specified, QEMU still fails, and the error message makes no sense.

qemu-system-x86_64: -device Haswell-noTSX-x86_64-cpu,socket-id=0,core-id=0,thread-id=1,id=cpu1,die-id=1: Invalid CPU die-id: 1 must be in range 0:2

Comment 2 Yumei Huang 2019-08-15 07:02:18 UTC
Hit the issue downstream as well, with qemu-kvm-4.1.0-1.module+el8.1.0+3966+4a23dca1.

Comment 3 Yumei Huang 2019-08-15 08:37:23 UTC
Adding regression keyword since qemu-kvm-4.0.0-6.module+el8.1.0+3736+a2aefea3 doesn't have the problem.

Comment 4 Eduardo Habkost 2019-08-15 17:35:36 UTC
(In reply to Yumei Huang from comment #0)
> Additional info:
> If die-id=1 or die-id=2 is specified, QEMU still fails, and the error message makes no sense.

The only valid value for die-id is 0, with the topology above.

die-id=0 is mandatory because it appears in query-hotpluggable-cpus.  -device and device_add of CPU objects must always have the properties returned by query-hotpluggable-cpus.
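
As a rough illustration of that rule, a management layer could build the -device argument directly from one entry of the query-hotpluggable-cpus reply, so that no property is left out. The Python sketch below assumes a hypothetical helper name (device_arg) and a hand-written example entry for the topology in this report; the entry is not captured QEMU output.

```python
def device_arg(cpu_type: str, props: dict, dev_id: str) -> str:
    """Build a -device argument that carries every property in props."""
    parts = [cpu_type]
    parts += [f"{key}={value}" for key, value in props.items()]
    parts.append(f"id={dev_id}")
    return ",".join(parts)


# Hypothetical query-hotpluggable-cpus entry (illustrative, not real output).
slot = {
    "type": "Haswell-noTSX-x86_64-cpu",
    "props": {"socket-id": 0, "die-id": 0, "core-id": 0, "thread-id": 1},
}

print(device_arg(slot["type"], slot["props"], "cpu1"))
# prints Haswell-noTSX-x86_64-cpu,socket-id=0,die-id=0,core-id=0,thread-id=1,id=cpu1
```

Copying the props as returned, instead of listing them by hand, means a newly introduced property such as die-id is passed along automatically.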

> 
> qemu-system-x86_64: -device
> Haswell-noTSX-x86_64-cpu,socket-id=0,core-id=0,thread-id=1,id=cpu1,die-id=1:
> Invalid CPU die-id: 1 must be in range 0:2

This error message is incorrect, though.  The range is 0:0.  I will send a fix upstream.

Comment 5 Eduardo Habkost 2019-08-15 19:24:20 UTC
Fixes submitted upstream:
https://lore.kernel.org/qemu-devel/20190815183803.13346-1-ehabkost@redhat.com/

Comment 6 Eduardo Habkost 2019-08-28 10:27:23 UTC
The most serious CPU hotplug issue is tracked at bug 1741451.  This one is just about the incorrect error messages and can be addressed in 8.2.

Comment 8 Eduardo Habkost 2019-12-30 12:49:50 UTC
Upstream fix:

commit 2a0585e183d0b7c628638fa07a3ffee8db852d69
Author: Eduardo Habkost <ehabkost>
Date:   Thu Aug 15 15:38:01 2019 -0300

    pc: Fix error message on die-id validation
    
    The error message for die-id range validation is incorrect.  Example:
    
      $ qemu-system-x86_64 -smp 1,sockets=6,maxcpus=6 \
        -device qemu64-x86_64-cpu,socket-id=1,die-id=1,core-id=0,thread-id=0
      qemu-system-x86_64: -device qemu64-x86_64-cpu,socket-id=1,die-id=1,core-id=0,thread-id=0: \
        Invalid CPU die-id: 1 must be in range 0:5
    
    The actual range for die-id in this example is 0:0.
    
    Fix the error message to use smp_dies and print the correct range.
    
    Signed-off-by: Eduardo Habkost <ehabkost>
    Message-Id: <20190815183803.13346-2-ehabkost>
    Reviewed-by: Igor Mammedov <imammedo>
    Reviewed-by: Vanderson M. do Rosario <vandersonmr2>
    Reviewed-by: Michael S. Tsirkin <mst>
    Signed-off-by: Eduardo Habkost <ehabkost>
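
For readers who want to see the intended behaviour, here is a minimal Python sketch of the range check described in the commit message. It assumes the valid range is 0 to dies - 1 and that dies is 1 when '-smp' does not set it. The function name check_die_id is hypothetical and this is not the QEMU C code; only the message format is copied from the errors quoted in this bug. Negative values are left out on purpose.

```python
from typing import Optional


def check_die_id(die_id: int, dies: int = 1) -> Optional[str]:
    """Return an error message if die_id is out of range, else None.

    Sketch only: models non-negative die_id values, with dies assumed
    to default to 1 when it is not given.
    """
    if die_id < 0:
        raise ValueError("negative die-id is not modelled in this sketch")
    if die_id < dies:
        return None
    return f"Invalid CPU die-id: {die_id} must be in range 0:{dies - 1}"


print(check_die_id(1))  # prints Invalid CPU die-id: 1 must be in range 0:0
print(check_die_id(0))  # prints None
```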

Comment 9 Ademar Reis 2020-02-05 23:02:50 UTC
QEMU has been recently split into sub-components and as a one-time operation to avoid breakage of tools, we are setting the QEMU sub-component of this BZ to "General". Please review and change the sub-component if necessary the next time you review this BZ. Thanks

Comment 10 Yumei Huang 2020-03-05 08:00:45 UTC
Hi Eduardo,

There are still some issues with the error messages for die-id. I'm wondering if we should fix them in this bug or file a new one. Thanks.


Testing details:

qemu-kvm-4.2.0-13.module+el8.2.0+5898+fb4bceae
kernel-4.18.0-185.el8.x86_64

Case 1, explicitly specify dies in '-smp'

a) Set die-id to a positive value outside the valid range; the error message is as expected.

# /usr/libexec/qemu-kvm -cpu Cascadelake-Server -smp 1,maxcpus=4,sockets=2,dies=2 \
 -device Cascadelake-Server-x86_64-cpu,socket-id=1,die-id=4,core-id=0,thread-id=0 -monitor stdio
QEMU 4.2.0 monitor - type 'help' for more information
(qemu) qemu-kvm: -device Cascadelake-Server-x86_64-cpu,socket-id=1,die-id=4,core-id=0,thread-id=0: Invalid CPU die-id: 4 must be in range 0:1


b) Set die-id to a negative value; the error message is not right.

# /usr/libexec/qemu-kvm -cpu Cascadelake-Server -smp 1,maxcpus=4,sockets=2,dies=2 \
 -device Cascadelake-Server-x86_64-cpu,socket-id=1,die-id=-4,core-id=0,thread-id=0 -monitor stdio
QEMU 4.2.0 monitor - type 'help' for more information
(qemu) qemu-kvm: -device Cascadelake-Server-x86_64-cpu,socket-id=1,die-id=-4,core-id=0,thread-id=0: CPU die-id is not set


Case 2, don't set dies in '-smp'
  
a) Set die-id to a positive value outside the valid range; got the same kind of error message as when dies is set in '-smp'.

# /usr/libexec/qemu-kvm -cpu Cascadelake-Server -smp 1,maxcpus=4,sockets=4 -device Cascadelake-Server-x86_64-cpu,socket-id=1,die-id=2,core-id=0,thread-id=0 -monitor stdio -qmp tcp:0:3333,server,nowait
QEMU 4.2.0 monitor - type 'help' for more information
(qemu) qemu-kvm: -device Cascadelake-Server-x86_64-cpu,socket-id=1,die-id=2,core-id=0,thread-id=0: Invalid CPU die-id: 2 must be in range 0:0


b) Set die-id to a negative value; no error message is printed, and there is no die-id info in 'query-hotpluggable-cpus'.

# /usr/libexec/qemu-kvm -cpu Cascadelake-Server -smp 1,maxcpus=4,sockets=4 -device Cascadelake-Server-x86_64-cpu,socket-id=1,die-id=-4,core-id=0,thread-id=0 -monitor stdio  -qmp tcp:0:3333,server,nowait
QEMU 4.2.0 monitor - type 'help' for more information
(qemu) VNC server running on ::1:5901

{'execute': 'query-hotpluggable-cpus' } 
{"return": [
{"props": {"core-id": 0, "thread-id": 0, "socket-id": 3}, "vcpus-count": 1, "type": "Cascadelake-Server-x86_64-cpu"}, 
{"props": {"core-id": 0, "thread-id": 0, "socket-id": 2}, "vcpus-count": 1, "type": "Cascadelake-Server-x86_64-cpu"}, 
{"props": {"core-id": 0, "thread-id": 0, "socket-id": 1}, "vcpus-count": 1, "qom-path": "/machine/peripheral-anon/device[0]", "type": "Cascadelake-Server-x86_64-cpu"}, 
{"props": {"core-id": 0, "thread-id": 0, "socket-id": 0}, "vcpus-count": 1, "qom-path": "/machine/unattached/device[0]", "type": "Cascadelake-Server-x86_64-cpu"}
]}
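
The missing die-id can also be checked mechanically. The Python sketch below parses the reply shown above (trimmed to the fields that matter) and lists the slots whose props lack a given property. The helper name slots_missing_prop is made up for illustration.

```python
import json

# Reply from Case 2 b), trimmed to "props" and "qom-path".
reply = json.loads("""
{"return": [
  {"props": {"core-id": 0, "thread-id": 0, "socket-id": 3}},
  {"props": {"core-id": 0, "thread-id": 0, "socket-id": 2}},
  {"props": {"core-id": 0, "thread-id": 0, "socket-id": 1},
   "qom-path": "/machine/peripheral-anon/device[0]"},
  {"props": {"core-id": 0, "thread-id": 0, "socket-id": 0},
   "qom-path": "/machine/unattached/device[0]"}
]}
""")


def slots_missing_prop(qmp_reply: dict, prop: str = "die-id") -> list:
    """Return the props of every slot that does not report `prop`."""
    return [e["props"] for e in qmp_reply["return"] if prop not in e["props"]]


missing = slots_missing_prop(reply)
print(len(missing))  # prints 4
```

All four slots come back without die-id, including the one at socket-id 1 that was created with die-id=-4.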

Comment 11 Eduardo Habkost 2020-03-09 15:10:02 UTC
(In reply to Yumei Huang from comment #10)
> Hi Eduardo,
> 
> There are still some issues with the error messages for die-id. I'm wondering if we
> should fix them in this bug or file a new one. Thanks.
> 

Please open a separate BZ.  The original bug was about handling of valid configuration input (which affects customers).  Making error messages more useful when receiving invalid configuration input from management layer is very low priority (QEMU command line is not a customer-visible user interface).

Comment 12 Yumei Huang 2020-03-10 05:15:44 UTC
(In reply to Eduardo Habkost from comment #11)
> (In reply to Yumei Huang from comment #10)
> > Hi Eduardo,
> > 
> > There are still some issues with the error messages for die-id. I'm wondering if we
> > should fix them in this bug or file a new one. Thanks.
> > 
> 
> Please open a separate BZ.  The original bug was about handling of valid
> configuration input (which affects customers).  Making error messages more
> useful when receiving invalid configuration input from management layer is
> very low priority (QEMU command line is not a customer-visible user
> interface).

Thanks, I have filed bug 1811874 to track this.

Will move this one to verified once it's ON_QA.

Comment 17 errata-xmlrpc 2020-05-05 09:47:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2017

