Bug 2088311

Summary: Guest reports "CPU #5's llc-sibling CPU #0 is not on the same node! [node: 1 != 0]"
Product: Red Hat Enterprise Linux 9 Reporter: Lukáš Doktor <ldoktor>
Component: qemu-kvm Assignee: Igor Mammedov <imammedo>
qemu-kvm sub component: CPU Models QA Contact: Mario Casquero <mcasquer>
Status: CLOSED CURRENTRELEASE Docs Contact: Parth Shah <pashah>
Severity: medium    
Priority: medium CC: chayang, coli, gfialova, imammedo, jherrman, jinzhao, juzhang, mcasquer, mrezanin, nilal, pashah, pbonzini, virt-maint
Version: 9.0 Keywords: TestOnly, Triaged
Target Milestone: rc   
Target Release: 9.2   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: qemu-kvm-7.2.0-10 Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-11-19 07:28:44 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Libvirt xml file of a machine that emits this warning none

Description Lukáš Doktor 2022-05-19 07:46:11 UTC
Created attachment 1881213
Libvirt xml file of a machine that emits this warning

Description of problem:
I'm trying to map a NUMA topology to a guest, but the guest kernel always complains that CPUs on the other node are not on the same node as their sibling. A similar configuration worked well in RHEL 8.

Version-Release number of selected component (if applicable):
* Host & guest use the same OS version
* RHEL-9.0.0-20211026.10
* qemu-kvm-core-6.0.0-13.el9_b.5.x86_64 as well as various upstream qemus (6621441db50d5bae7e34dbd04bf3c57a27a71b32)
* libvirt-7.6.0-2.el9.x86_64

How reproducible:
* Always, tried various setups

Steps to Reproduce:
1. Get a bootable RHEL image
2. Adjust the provided xml file to match your hardware, but keep at least 2 numa nodes
3. Boot the system and check the guest serial console
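
For step 3, a saved copy of the serial console output can be scanned for the warning automatically. The following is a minimal sketch, assuming the console text has already been captured into a string; the helper name find_topology_warnings is hypothetical and not part of any existing tool.

```python
import re

# Matches the scheduler topology warning as it appears in the guest console.
WARNING_RE = re.compile(
    r"sched: CPU #(?P<cpu>\d+)'s (?P<kind>\w+)-sibling CPU #(?P<sibling>\d+) "
    r"is not on the same node! \[node: (?P<node>\d+) != (?P<sibling_node>\d+)\]"
)


def find_topology_warnings(console_text):
    """Return one dict per topology warning found in the console text."""
    hits = []
    for line in console_text.splitlines():
        match = WARNING_RE.search(line)
        if match:
            hits.append({
                "cpu": int(match["cpu"]),
                "kind": match["kind"],
                "sibling": int(match["sibling"]),
                "node": int(match["node"]),
                "sibling_node": int(match["sibling_node"]),
            })
    return hits


if __name__ == "__main__":
    sample = (
        "[    0.141952] sched: CPU #5's llc-sibling CPU #0 is not on the "
        "same node! [node: 1 != 0]. Ignoring dependency."
    )
    for hit in find_topology_warnings(sample):
        print(hit)
```

An empty result for a full boot log would mean the warning was not emitted.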

Actual results:
[    0.141952] ------------[ cut here ]------------
[    0.141952] sched: CPU #5's llc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency.
[    0.141952] WARNING: CPU: 5 PID: 0 at arch/x86/kernel/smpboot.c:421 topology_sane.isra.0+0x67/0x80
[    0.141952] Modules linked in:
[    0.141952] CPU: 5 PID: 0 Comm: swapper/5 Not tainted 5.14.0-1.7.1.el9.x86_64 #1
[    0.141952] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.15.0-29-g6a62e0cb0dfe-prebuilt.qemu.org 04/01/2014
[    0.141952] RIP: 0010:topology_sane.isra.0+0x67/0x80
[    0.141952] Code: 80 3d 3d 9c f2 01 00 75 f6 48 83 ec 08 4c 89 da 44 89 d6 48 c7 c7 30 b7 b1 b9 88 44 24 07 c6 05 1f 9c f2 01 01 e8 bc 64 99 00 <0f> 0b 0f b6 44 24 07 48 83 c4 08 c3 66 66 2e 0f 1f 84 00 00 00 00
[    0.141952] RSP: 0000:ffffa161431cfed0 EFLAGS: 00010086
[    0.141952] RAX: 0000000000000000 RBX: ffff8acdb5a11460 RCX: ffffffffba527a08
[    0.141952] RDX: 0000000000000000 RSI: 00000000ffff7fff RDI: ffffffffba267a00
[    0.141952] RBP: 0000000000000005 R08: 0000000000000000 R09: ffffa161431cfd08
[    0.141952] R10: ffffa161431cfd00 R11: ffffffffba5e7a48 R12: 0000000000000000
[    0.141952] R13: ffff8acb35811460 R14: 0000000000011010 R15: 0000000000000000
[    0.141952] FS:  0000000000000000(0000) GS:ffff8acdb5a00000(0000) knlGS:0000000000000000
[    0.141952] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.141952] CR2: 0000000000000000 CR3: 0000000315010001 CR4: 0000000000770ee0
[    0.141952] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    0.141952] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    0.141952] PKRU: 55555554
[    0.141952] Call Trace:
[    0.141952]  set_cpu_sibling_map+0x176/0x590
[    0.141952]  start_secondary+0x5b/0x150
[    0.141952]  secondary_startup_64_no_verify+0xc2/0xcb
[    0.141952] ---[ end trace 83f8500fc1ba2966 ]---

Expected results:
Should boot with no issues

Additional info:

Comment 1 Lukáš Doktor 2022-05-19 07:48:18 UTC
Including reply from Igor from preliminary check:

I can reproduce it with the following QEMU CLI:
 /usr/libexec/qemu-kvm \
        -machine pc-q35-rhel8.5.0,accel=kvm,usb=off,vmport=off,dump-guest-core=off \
        -cpu host,migratable=on,kvm-hint-dedicated=on,kvm-poll-control=on,host-cache-info=on,l3-cache=off \
        -m 4G \
        -smp 10,sockets=2,dies=1,cores=5,threads=1 \
        -object memory-backend-ram,id=ram-node0,size=2G -numa node,nodeid=0,memdev=ram-node0 \
        -object memory-backend-ram,id=ram-node1,size=2G -numa node,nodeid=1,memdev=ram-node1 \
        -numa cpu,node-id=0,socket-id=0 \
        -numa cpu,node-id=1,socket-id=1 \
        path_to_rhel9_disk_image

but the SRAT table in the guest contains only 8 vCPUs, and the warning we get complains about
a CPU that is not described in the SRAT.

[    0.011045] SRAT: PXM 0 -> APIC 0x00 -> Node 0
[    0.011047] SRAT: PXM 0 -> APIC 0x01 -> Node 0
[    0.011048] SRAT: PXM 0 -> APIC 0x02 -> Node 0
[    0.011049] SRAT: PXM 0 -> APIC 0x03 -> Node 0
[    0.011050] SRAT: PXM 0 -> APIC 0x04 -> Node 0
[    0.011051] SRAT: PXM 1 -> APIC 0x08 -> Node 1
[    0.011052] SRAT: PXM 1 -> APIC 0x09 -> Node 1
[    0.011052] SRAT: PXM 1 -> APIC 0x0a -> Node 1
[    0.011053] SRAT: PXM 1 -> APIC 0x0b -> Node 1
[    0.011054] SRAT: PXM 1 -> APIC 0x0c -> Node 1

Comment 3 Igor Mammedov 2022-05-25 08:41:27 UTC
(In reply to Lukáš Doktor from comment #1)
> Including reply from Igor from preliminary check:
> 
> I can reproduce it with the following QEMU CLI:
>  /usr/libexec/qemu-kvm \
>         -machine pc-q35-rhel8.5.0,accel=kvm,usb=off,vmport=off,dump-guest-core=off \
>         -cpu host,migratable=on,kvm-hint-dedicated=on,kvm-poll-control=on,host-cache-info=on,l3-cache=off \
>         -m 4G \
>         -smp 10,sockets=2,dies=1,cores=5,threads=1 \
>         -object memory-backend-ram,id=ram-node0,size=2G -numa node,nodeid=0,memdev=ram-node0 \
>         -object memory-backend-ram,id=ram-node1,size=2G -numa node,nodeid=1,memdev=ram-node1 \
>         -numa cpu,node-id=0,socket-id=0 \
>         -numa cpu,node-id=1,socket-id=1 \
>         path_to_rhel9_disk_image
> 
> but the SRAT table in the guest contains only 8 vCPUs, and the warning we get complains about
> a CPU that is not described in the SRAT.
> 
> [    0.011045] SRAT: PXM 0 -> APIC 0x00 -> Node 0
> [    0.011047] SRAT: PXM 0 -> APIC 0x01 -> Node 0
> [    0.011048] SRAT: PXM 0 -> APIC 0x02 -> Node 0
> [    0.011049] SRAT: PXM 0 -> APIC 0x03 -> Node 0
> [    0.011050] SRAT: PXM 0 -> APIC 0x04 -> Node 0
> [    0.011051] SRAT: PXM 1 -> APIC 0x08 -> Node 1
> [    0.011052] SRAT: PXM 1 -> APIC 0x09 -> Node 1
> [    0.011052] SRAT: PXM 1 -> APIC 0x0a -> Node 1
> [    0.011053] SRAT: PXM 1 -> APIC 0x0b -> Node 1
> [    0.011054] SRAT: PXM 1 -> APIC 0x0c -> Node 1

So the above turned out to be a red herring (the SRAT table is correct).

What is really happening is that the host's L3 cache info is passed through
as is, which confuses the guest. A possible fix is on the way:
https://www.mail-archive.com/qemu-devel@nongnu.org/msg890062.html
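
To illustrate why the SRAT entries quoted above are consistent, and how a passed-through cache-sharing width can still group CPUs from both sockets together, here is a minimal sketch. It assumes the usual x86 scheme where each topology level occupies ceil(log2(count)) bits of the APIC ID, and it uses a simplified model of how a guest kernel derives an LLC ID by shifting the APIC ID. The value of 16 threads sharing the host L3 is hypothetical, since the real host value is not given in this report, and the function names are illustrative only.

```python
def bits_for(count):
    """Number of ID bits needed to address `count` items, i.e. ceil(log2(count))."""
    return (count - 1).bit_length()


def apic_id(socket, core, thread, cores, threads):
    """Pack socket/core/thread into an APIC ID (single die assumed)."""
    thread_bits = bits_for(threads)
    core_bits = bits_for(cores)
    return (socket << (core_bits + thread_bits)) | (core << thread_bits) | thread


def llc_id(apic, threads_sharing_cache):
    """Simplified model: CPUs whose APIC IDs agree above the sharing width share an LLC."""
    return apic >> bits_for(threads_sharing_cache)


if __name__ == "__main__":
    # Guest topology from the reproducer: -smp 10,sockets=2,dies=1,cores=5,threads=1
    ids = [apic_id(s, c, 0, cores=5, threads=1) for s in range(2) for c in range(5)]
    print([hex(i) for i in ids])

    # Hypothetical host width of 16: both sockets collapse into one LLC group.
    print([llc_id(i, 16) for i in ids])
    # A width that fits the guest socket (5 cores, rounded up to 8): one group per socket.
    print([llc_id(i, 8) for i in ids])
```

With 5 cores per socket the core field takes 3 bits, so the second socket starts at APIC ID 0x08, which matches the SRAT listing. Under the hypothetical width of 16, CPU #5 (APIC 0x08) and CPU #0 (APIC 0x00) land in the same LLC group even though they belong to different nodes.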

Comment 13 Igor Mammedov 2022-11-02 11:22:52 UTC
upstream commits:
d7caf13b5f x86: cpu: fixup number of addressable IDs for logical processors sharing cache
efb3934adf x86: cpu: make sure number of addressable IDs for processor cores meets the spec
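
The two fields named in these commit titles live in CPUID leaf 4, register EAX: bits 25:14 hold the maximum number of addressable IDs for logical processors sharing the cache (minus 1), and bits 31:26 hold the maximum number of addressable IDs for processor cores in the package (minus 1). The sketch below decodes them and applies a fixup in the spirit of the titles. It is not QEMU's code: the function names, the rounding to a power of two, and the sample EAX value are assumptions made for illustration.

```python
def decode_leaf4_eax(eax):
    """Decode the CPUID leaf 4 EAX fields relevant to cache sharing."""
    return {
        "cache_type": eax & 0x1F,
        "cache_level": (eax >> 5) & 0x7,
        "threads_sharing": ((eax >> 14) & 0xFFF) + 1,
        "cores_per_package": ((eax >> 26) & 0x3F) + 1,
    }


def pow2ceil(n):
    """Smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


def fixup_leaf4_eax(eax, cores, threads):
    """Hypothetical fixup: make both ID counts fit the guest socket topology."""
    vcpus_per_socket = cores * threads
    # Core-ID field: round the guest core count up to a power of two.
    eax = (eax & ~(0x3F << 26)) | ((pow2ceil(cores) - 1) << 26)
    # Sharing field: only shrink it when the host value exceeds the guest socket.
    if decode_leaf4_eax(eax)["threads_sharing"] > vcpus_per_socket:
        eax = (eax & ~(0xFFF << 14)) | ((pow2ceil(vcpus_per_socket) - 1) << 14)
    return eax & 0xFFFFFFFF


if __name__ == "__main__":
    # Hypothetical host value: unified L3, 16 threads sharing, 16 core IDs.
    host_eax = 3 | (3 << 5) | (15 << 14) | (15 << 26)
    print(decode_leaf4_eax(host_eax))
    guest_eax = fixup_leaf4_eax(host_eax, cores=5, threads=1)
    print(hex(guest_eax), decode_leaf4_eax(guest_eax))
```

For the reproducer's 5-core, 1-thread sockets, the sketch reduces both counts from the hypothetical 16 to 8, so the sharing width no longer spans the APIC ID bit that distinguishes the two sockets.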

Comment 16 Igor Mammedov 2023-02-23 09:38:16 UTC
We have inherited the fix with the rebase to 7.2.

Comment 24 Mario Casquero 2023-03-02 13:22:29 UTC
Moving to VERIFIED based on comment 20 and comment 22

Comment 26 RHEL Program Management 2023-11-19 07:28:44 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.