Created attachment 1881213
Libvirt xml file of a machine that emits this warning

Description of problem:
I'm trying to map a NUMA topology to a guest, but the guest kernel always complains about CPUs on the other node not being on the same node. A similar configuration worked well in RHEL8.

Version-Release number of selected component (if applicable):
* Host & guest use the same OS version
* RHEL-9.0.0-20211026.10
* qemu-kvm-core-6.0.0-13.el9_b.5.x86_64 as well as various upstream qemus (6621441db50d5bae7e34dbd04bf3c57a27a71b32)
* libvirt-7.6.0-2.el9.x86_64

How reproducible:
* Always, tried various setups

Steps to Reproduce:
1. Get a bootable RHEL image
2. Adjust the provided xml file to match your hardware, but keep at least 2 NUMA nodes
3. Boot the system and check the guest serial console

Actual results:
[ 0.141952] ------------[ cut here ]------------
[ 0.141952] sched: CPU #5's llc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency.
[ 0.141952] WARNING: CPU: 5 PID: 0 at arch/x86/kernel/smpboot.c:421 topology_sane.isra.0+0x67/0x80
[ 0.141952] Modules linked in:
[ 0.141952] CPU: 5 PID: 0 Comm: swapper/5 Not tainted 5.14.0-1.7.1.el9.x86_64 #1
[ 0.141952] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.15.0-29-g6a62e0cb0dfe-prebuilt.qemu.org 04/01/2014
[ 0.141952] RIP: 0010:topology_sane.isra.0+0x67/0x80
[ 0.141952] Code: 80 3d 3d 9c f2 01 00 75 f6 48 83 ec 08 4c 89 da 44 89 d6 48 c7 c7 30 b7 b1 b9 88 44 24 07 c6 05 1f 9c f2 01 01 e8 bc 64 99 00 <0f> 0b 0f b6 44 24 07 48 83 c4 08 c3 66 66 2e 0f 1f 84 00 00 00 00
[ 0.141952] RSP: 0000:ffffa161431cfed0 EFLAGS: 00010086
[ 0.141952] RAX: 0000000000000000 RBX: ffff8acdb5a11460 RCX: ffffffffba527a08
[ 0.141952] RDX: 0000000000000000 RSI: 00000000ffff7fff RDI: ffffffffba267a00
[ 0.141952] RBP: 0000000000000005 R08: 0000000000000000 R09: ffffa161431cfd08
[ 0.141952] R10: ffffa161431cfd00 R11: ffffffffba5e7a48 R12: 0000000000000000
[ 0.141952] R13: ffff8acb35811460 R14: 0000000000011010 R15: 0000000000000000
[ 0.141952] FS: 0000000000000000(0000) GS:ffff8acdb5a00000(0000) knlGS:0000000000000000
[ 0.141952] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.141952] CR2: 0000000000000000 CR3: 0000000315010001 CR4: 0000000000770ee0
[ 0.141952] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 0.141952] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 0.141952] PKRU: 55555554
[ 0.141952] Call Trace:
[ 0.141952] set_cpu_sibling_map+0x176/0x590
[ 0.141952] start_secondary+0x5b/0x150
[ 0.141952] secondary_startup_64_no_verify+0xc2/0xcb
[ 0.141952] ---[ end trace 83f8500fc1ba2966 ]---

Expected results:
Should boot with no issues

Additional info:
Including reply from Igor from preliminary check:

I can reproduce it with the following QEMU CLI:
/usr/libexec/qemu-kvm \
  -machine pc-q35-rhel8.5.0,accel=kvm,usb=off,vmport=off,dump-guest-core=off \
  -cpu host,migratable=on,kvm-hint-dedicated=on,kvm-poll-control=on,host-cache-info=on,l3-cache=off \
  -m 4G \
  -smp 10,sockets=2,dies=1,cores=5,threads=1 \
  -object memory-backend-ram,id=ram-node0,size=2G -numa node,nodeid=0,memdev=ram-node0 \
  -object memory-backend-ram,id=ram-node1,size=2G -numa node,nodeid=1,memdev=ram-node1 \
  -numa cpu,node-id=0,socket-id=0 \
  -numa cpu,node-id=1,socket-id=1 \
  path_to_rhel9_disk_image

but the SRAT table in the guest contains only 8 vcpus, and the warning we get complains about a CPU that's not described in SRAT.

[ 0.011045] SRAT: PXM 0 -> APIC 0x00 -> Node 0
[ 0.011047] SRAT: PXM 0 -> APIC 0x01 -> Node 0
[ 0.011048] SRAT: PXM 0 -> APIC 0x02 -> Node 0
[ 0.011049] SRAT: PXM 0 -> APIC 0x03 -> Node 0
[ 0.011050] SRAT: PXM 0 -> APIC 0x04 -> Node 0
[ 0.011051] SRAT: PXM 1 -> APIC 0x08 -> Node 1
[ 0.011052] SRAT: PXM 1 -> APIC 0x09 -> Node 1
[ 0.011052] SRAT: PXM 1 -> APIC 0x0a -> Node 1
[ 0.011053] SRAT: PXM 1 -> APIC 0x0b -> Node 1
[ 0.011054] SRAT: PXM 1 -> APIC 0x0c -> Node 1
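For readers wondering why the APIC IDs in the SRAT log jump from 0x04 to 0x08: with 5 cores per socket, the core field of an x86 APIC ID needs 3 bits, so socket 1 starts at 8. The following is a minimal Python sketch of that layout, assuming the usual power-of-two field packing and ignoring the die level (dies=1 contributes no bits). The helper names bits_for and apic_ids are made up for this illustration and are not QEMU functions.

```python
def bits_for(count: int) -> int:
    """Bits needed to encode IDs 0..count-1 (0 bits when count is 1)."""
    return (count - 1).bit_length()


def apic_ids(sockets: int, cores: int, threads: int) -> list[int]:
    """Hypothetical helper: pack (socket, core, thread) into APIC IDs.

    Each level gets a power-of-two sized bit field, which is why the
    IDs are not contiguous when cores is not a power of two.
    """
    core_shift = bits_for(threads)             # thread field width
    pkg_shift = core_shift + bits_for(cores)   # thread + core field width
    ids = []
    for s in range(sockets):
        for c in range(cores):
            for t in range(threads):
                ids.append((s << pkg_shift) | (c << core_shift) | t)
    return ids


if __name__ == "__main__":
    # Topology from the -smp line: sockets=2, cores=5, threads=1
    print([hex(i) for i in apic_ids(2, 5, 1)])
```

For the -smp line above this gives 0x0..0x4 for socket 0 and 0x8..0xc for socket 1, matching the ten SRAT entries in the log.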
(In reply to Lukáš Doktor from comment #1)
> Including reply from Igor from preliminary check:
>
> I can reproduce it with following QEMU CLI:
> /usr/libexec/qemu-kvm \
>  -machine pc-q35-rhel8.5.0,accel=kvm,usb=off,vmport=off,dump-guest-core=off \
>  -cpu host,migratable=on,kvm-hint-dedicated=on,kvm-poll-control=on,host-cache-info=on,l3-cache=off \
>  -m 4G \
>  -smp 10,sockets=2,dies=1,cores=5,threads=1
>  -object memory-backend-ram,id=ram-node0,size=2G -numa node,nodeid=0,memdev=ram-node0 \
>  -object memory-backend-ram,id=ram-node1,size=2G -numa node,nodeid=1,memdev=ram-node1 \
>  -numa cpu,node-id=0,socket-id=0 \
>  -numa cpu,node-id=1,socket-id=1 \
>  path_to_rhel9_disk_image
>
> but SRAT table in guest contains only 8 vcpus, and warning we get complains
> about a cpu that's not described in SRAT.
>
> [ 0.011045] SRAT: PXM 0 -> APIC 0x00 -> Node 0
> [ 0.011047] SRAT: PXM 0 -> APIC 0x01 -> Node 0
> [ 0.011048] SRAT: PXM 0 -> APIC 0x02 -> Node 0
> [ 0.011049] SRAT: PXM 0 -> APIC 0x03 -> Node 0
> [ 0.011050] SRAT: PXM 0 -> APIC 0x04 -> Node 0
> [ 0.011051] SRAT: PXM 1 -> APIC 0x08 -> Node 1
> [ 0.011052] SRAT: PXM 1 -> APIC 0x09 -> Node 1
> [ 0.011052] SRAT: PXM 1 -> APIC 0x0a -> Node 1
> [ 0.011053] SRAT: PXM 1 -> APIC 0x0b -> Node 1
> [ 0.011054] SRAT: PXM 1 -> APIC 0x0c -> Node 1

So the above turned out to be a red herring (the SRAT table is correct). What is really happening is that the host's L3 cache info is passed through as is, which confuses the guest.

A possible fix is on the way:
https://www.mail-archive.com/qemu-devel@nongnu.org/msg890062.html
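To illustrate how a passed-through L3 description can produce the "llc-sibling ... not on the same node" warning, here is a rough Python model of how a guest might derive last-level-cache groups from APIC IDs. It assumes the guest drops the low APIC ID bits covered by the "logical processors sharing this cache" count reported in CPUID leaf 4, which is a simplification of the kernel logic. The host sharing count of 16 is a made-up example value, not taken from this report, and the helper names are hypothetical.

```python
def count_order(n: int) -> int:
    """Smallest k such that 2**k >= n."""
    return (n - 1).bit_length()


def llc_id(apic_id: int, threads_sharing: int) -> int:
    """Simplified model: CPUs whose APIC IDs match above the sharing
    bits are treated as siblings on the same last-level cache."""
    return apic_id >> count_order(threads_sharing)


# Guest APIC IDs and NUMA nodes as listed in the SRAT log.
guest_cpus = [(a, 0) for a in range(0x00, 0x05)] + \
             [(a, 1) for a in range(0x08, 0x0d)]

ASSUMED_HOST_SHARING = 16   # hypothetical host L3 sharing count
GUEST_CORES_PER_SOCKET = 5  # from -smp ...,cores=5,threads=1


def llc_groups(threads_sharing: int) -> dict[int, set[int]]:
    """Map each LLC id to the set of NUMA nodes its CPUs belong to."""
    groups: dict[int, set[int]] = {}
    for apic, node in guest_cpus:
        groups.setdefault(llc_id(apic, threads_sharing), set()).add(node)
    return groups


if __name__ == "__main__":
    print("host value :", llc_groups(ASSUMED_HOST_SHARING))
    print("guest value:", llc_groups(GUEST_CORES_PER_SOCKET))
```

With the assumed host value, all ten vCPUs land in one LLC group that spans both nodes, which is the kind of inconsistency the warning reports. With a count that matches the guest topology, each LLC group stays within a single node.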
Upstream commits:
d7caf13b5f x86: cpu: fixup number of addressable IDs for logical processors sharing cache
efb3934adf x86: cpu: make sure number of addressable IDs for processor cores meets the spec
We have inherited the fix with the rebase to 7.2.
Moving to VERIFIED based on comment 20 and comment 22