Description of problem: The setting fails when booting the guest with 128 numa nodes. Check "cat /sys/devices/system/node/possible" information, which is 0 while expected numa node is 0-127 Version-Release number of selected component (if applicable): Host Distro: RHEL-9.0.0-20220115.2 Host Kernel: kernel-5.14.0-42.el9.aarch64 Guest Kernel: kernel-5.14.0-39.el9.aarch64 qemu-kvm: qemu-kvm-6.2.0-3.el9 How reproducible: 100% Steps to Reproduce: 1. Boot guest with 128 numa nodes /usr/libexec/qemu-kvm \ -name 'avocado-vt-vm1' \ -sandbox on \ -blockdev node-name=file_aavmf_code,driver=file,filename=/usr/share/edk2/aarch64/QEMU_EFI-silent-pflash.raw,auto-read-only=on,discard=unmap \ -blockdev node-name=drive_aavmf_code,driver=raw,read-only=on,file=file_aavmf_code \ -blockdev node-name=file_aavmf_vars,driver=file,filename=/home/kvm_autotest_root/images/avocado-vt-vm1_rhel900-aarch64-virtio-scsi.qcow2_VARS.fd,auto-read-only=on,discard=unmap \ -blockdev node-name=drive_aavmf_vars,driver=raw,read-only=off,file=file_aavmf_vars \ -machine virt,gic-version=host,pflash0=drive_aavmf_code,pflash1=drive_aavmf_vars \ -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \ -device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0 \ -nodefaults \ -device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \ -device virtio-gpu-pci,bus=pcie-root-port-1,addr=0x0 \ -m 32768 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem0 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem1 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem2 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem3 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem4 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem5 \ -object 
memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem6 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem7 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem8 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem9 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem10 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem11 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem12 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem13 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem14 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem15 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem16 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem17 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem18 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem19 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem20 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem21 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem22 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem23 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem24 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem25 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem26 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem27 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem28 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem29 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem30 \ -object 
memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem31 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem32 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem33 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem34 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem35 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem36 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem37 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem38 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem39 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem40 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem41 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem42 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem43 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem44 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem45 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem46 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem47 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem48 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem49 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem50 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem51 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem52 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem53 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem54 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem55 \ -object 
memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem56 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem57 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem58 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem59 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem60 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem61 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem62 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem63 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem64 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem65 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem66 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem67 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem68 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem69 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem70 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem71 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem72 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem73 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem74 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem75 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem76 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem77 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem78 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem79 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem80 \ -object 
memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem81 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem82 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem83 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem84 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem85 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem86 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem87 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem88 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem89 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem90 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem91 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem92 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem93 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem94 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem95 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem96 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem97 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem98 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem99 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem100 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem101 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem102 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem103 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem104 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem105 \ 
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem106 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem107 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem108 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem109 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem110 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem111 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem112 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem113 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem114 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem115 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem116 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem117 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem118 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem119 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem120 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem121 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem122 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem123 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem124 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem125 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem126 \ -object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem127 \ -smp 6,maxcpus=6,cores=3,threads=1,sockets=2 \ -numa node,memdev=mem-mem0 \ -numa node,memdev=mem-mem1 \ -numa node,memdev=mem-mem2 \ -numa node,memdev=mem-mem3 \ -numa node,memdev=mem-mem4 \ -numa 
node,memdev=mem-mem5 \ -numa node,memdev=mem-mem6 \ -numa node,memdev=mem-mem7 \ -numa node,memdev=mem-mem8 \ -numa node,memdev=mem-mem9 \ -numa node,memdev=mem-mem10 \ -numa node,memdev=mem-mem11 \ -numa node,memdev=mem-mem12 \ -numa node,memdev=mem-mem13 \ -numa node,memdev=mem-mem14 \ -numa node,memdev=mem-mem15 \ -numa node,memdev=mem-mem16 \ -numa node,memdev=mem-mem17 \ -numa node,memdev=mem-mem18 \ -numa node,memdev=mem-mem19 \ -numa node,memdev=mem-mem20 \ -numa node,memdev=mem-mem21 \ -numa node,memdev=mem-mem22 \ -numa node,memdev=mem-mem23 \ -numa node,memdev=mem-mem24 \ -numa node,memdev=mem-mem25 \ -numa node,memdev=mem-mem26 \ -numa node,memdev=mem-mem27 \ -numa node,memdev=mem-mem28 \ -numa node,memdev=mem-mem29 \ -numa node,memdev=mem-mem30 \ -numa node,memdev=mem-mem31 \ -numa node,memdev=mem-mem32 \ -numa node,memdev=mem-mem33 \ -numa node,memdev=mem-mem34 \ -numa node,memdev=mem-mem35 \ -numa node,memdev=mem-mem36 \ -numa node,memdev=mem-mem37 \ -numa node,memdev=mem-mem38 \ -numa node,memdev=mem-mem39 \ -numa node,memdev=mem-mem40 \ -numa node,memdev=mem-mem41 \ -numa node,memdev=mem-mem42 \ -numa node,memdev=mem-mem43 \ -numa node,memdev=mem-mem44 \ -numa node,memdev=mem-mem45 \ -numa node,memdev=mem-mem46 \ -numa node,memdev=mem-mem47 \ -numa node,memdev=mem-mem48 \ -numa node,memdev=mem-mem49 \ -numa node,memdev=mem-mem50 \ -numa node,memdev=mem-mem51 \ -numa node,memdev=mem-mem52 \ -numa node,memdev=mem-mem53 \ -numa node,memdev=mem-mem54 \ -numa node,memdev=mem-mem55 \ -numa node,memdev=mem-mem56 \ -numa node,memdev=mem-mem57 \ -numa node,memdev=mem-mem58 \ -numa node,memdev=mem-mem59 \ -numa node,memdev=mem-mem60 \ -numa node,memdev=mem-mem61 \ -numa node,memdev=mem-mem62 \ -numa node,memdev=mem-mem63 \ -numa node,memdev=mem-mem64 \ -numa node,memdev=mem-mem65 \ -numa node,memdev=mem-mem66 \ -numa node,memdev=mem-mem67 \ -numa node,memdev=mem-mem68 \ -numa node,memdev=mem-mem69 \ -numa node,memdev=mem-mem70 \ -numa node,memdev=mem-mem71 \ 
-numa node,memdev=mem-mem72 \ -numa node,memdev=mem-mem73 \ -numa node,memdev=mem-mem74 \ -numa node,memdev=mem-mem75 \ -numa node,memdev=mem-mem76 \ -numa node,memdev=mem-mem77 \ -numa node,memdev=mem-mem78 \ -numa node,memdev=mem-mem79 \ -numa node,memdev=mem-mem80 \ -numa node,memdev=mem-mem81 \ -numa node,memdev=mem-mem82 \ -numa node,memdev=mem-mem83 \ -numa node,memdev=mem-mem84 \ -numa node,memdev=mem-mem85 \ -numa node,memdev=mem-mem86 \ -numa node,memdev=mem-mem87 \ -numa node,memdev=mem-mem88 \ -numa node,memdev=mem-mem89 \ -numa node,memdev=mem-mem90 \ -numa node,memdev=mem-mem91 \ -numa node,memdev=mem-mem92 \ -numa node,memdev=mem-mem93 \ -numa node,memdev=mem-mem94 \ -numa node,memdev=mem-mem95 \ -numa node,memdev=mem-mem96 \ -numa node,memdev=mem-mem97 \ -numa node,memdev=mem-mem98 \ -numa node,memdev=mem-mem99 \ -numa node,memdev=mem-mem100 \ -numa node,memdev=mem-mem101 \ -numa node,memdev=mem-mem102 \ -numa node,memdev=mem-mem103 \ -numa node,memdev=mem-mem104 \ -numa node,memdev=mem-mem105 \ -numa node,memdev=mem-mem106 \ -numa node,memdev=mem-mem107 \ -numa node,memdev=mem-mem108 \ -numa node,memdev=mem-mem109 \ -numa node,memdev=mem-mem110 \ -numa node,memdev=mem-mem111 \ -numa node,memdev=mem-mem112 \ -numa node,memdev=mem-mem113 \ -numa node,memdev=mem-mem114 \ -numa node,memdev=mem-mem115 \ -numa node,memdev=mem-mem116 \ -numa node,memdev=mem-mem117 \ -numa node,memdev=mem-mem118 \ -numa node,memdev=mem-mem119 \ -numa node,memdev=mem-mem120 \ -numa node,memdev=mem-mem121 \ -numa node,memdev=mem-mem122 \ -numa node,memdev=mem-mem123 \ -numa node,memdev=mem-mem124 \ -numa node,memdev=mem-mem125 \ -numa node,memdev=mem-mem126 \ -numa node,memdev=mem-mem127 \ -cpu 'host' \ -chardev socket,wait=off,path=/tmp/monitor-qmpmonitor1-20211219-205427-izDbxLsL,server=on,id=qmp_id_qmpmonitor1 \ -mon chardev=qmp_id_qmpmonitor1,mode=control \ -chardev socket,wait=off,path=/tmp/monitor-catch_monitor-20211219-205427-izDbxLsL,server=on,id=qmp_id_catch_monitor 
\ -mon chardev=qmp_id_catch_monitor,mode=control \ -serial unix:'/tmp/serial-serial0-20211219-205427-izDbxLsL',server=on,wait=off \ -device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \ -device qemu-xhci,id=usb1,bus=pcie-root-port-2,addr=0x0 \ -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \ -device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \ -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-3,addr=0x0 \ -blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/rhel900-aarch64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \ -blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \ -device scsi-hd,id=image1,drive=drive_image1,write-cache=on \ -device pcie-root-port,id=pcie-root-port-4,port=0x4,addr=0x1.0x4,bus=pcie.0,chassis=5 \ -device virtio-net-pci,mac=9a:48:8b:5c:b4:11,rombar=0,id=id53U1P3,netdev=idANq8Op,bus=pcie-root-port-4,addr=0x0 \ -netdev tap,id=idANq8Op,vhost=on \ -vnc :20 \ -rtc base=utc,clock=host,driftfix=slew \ -enable-kvm \ -device pcie-root-port,id=pcie-root-port-5,port=0x5,addr=0x1.0x5,bus=pcie.0,chassis=6 \ -device virtio-balloon-pci,id=balloon0,bus=pcie-root-port-5,addr=0x0 \ -device pcie-root-port,id=pcie_extra_root_port_0,multifunction=on,bus=pcie.0,addr=0x2,chassis=7 \ -device pcie-root-port,id=pcie_extra_root_port_1,addr=0x2.0x1,bus=pcie.0,chassis=8 \ -monitor stdio 2. 
Check "cat /sys/devices/system/node/possible" information --------------- hit this issue
# cat /sys/devices/system/node/possible
0
# lscpu
Architecture: aarch64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 6
On-line CPU(s) list: 0-5
Vendor ID: APM
BIOS Vendor ID: QEMU
BIOS Model name: virt-rhel9.0.0
Model: 2
Thread(s) per core: 1
Core(s) per socket: 3
Socket(s): 2
Stepping: 0x3
BogoMIPS: 80.00
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
NUMA:
NUMA node(s): 1
NUMA node0 CPU(s): 0-5

Actual results:
The setting fails when booting the guest with 128 NUMA nodes; the guest reports only node 0.

Expected results:
The guest reports NUMA nodes 0-127, consistent with the command line.

Additional info:
Booting the guest with 64 NUMA nodes succeeds.
Check the serial log:
2022-01-17 00:38:28: [ 0.000000] ACPI: SRAT: Node 62 PXM 62 [mem 0x420000000-0x42fffffff]
2022-01-17 00:38:28: [ 0.000000] ACPI: SRAT: Node 63 PXM 63 [mem 0x430000000-0x43fffffff]
2022-01-17 00:38:28: [ 0.000000] ACPI: SRAT: Too many proximity domains.
2022-01-17 00:38:28: [ 0.000000] ACPI: SRAT: SRAT not used.
Zhenyu, I don't think it's a bug. The result is correct and expected. The maximum number of NUMA nodes that the guest can support is statically defined by the kernel configuration option shown below. In RHEL9.0.0 the value is 6, which gives a limit of 1 << 6 = 64 nodes. When 128 NUMA nodes are detected from the SRAT table, all information in the ACPI table is dropped because the table is considered corrupted. Instead, the dummy node scheme, which has only one NUMA node, is used. That's why you see just one NUMA node in the corresponding sysfs file.

CONFIG_NODES_SHIFT=6
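To make the arithmetic concrete, here is a minimal sketch of the limit and the fallback described in this comment. It is a simplified model, not kernel code: the function names `max_numa_nodes` and `guest_visible_nodes` are made up for illustration, and the only facts taken from the kernel are that the node limit is 1 << CONFIG_NODES_SHIFT and that an oversized SRAT is discarded in favour of a single dummy node.

```python
def max_numa_nodes(nodes_shift: int) -> int:
    """Node limit implied by CONFIG_NODES_SHIFT (limit = 1 << shift)."""
    return 1 << nodes_shift


def guest_visible_nodes(srat_nodes: int, nodes_shift: int) -> int:
    """Simplified model of the fallback: too many SRAT nodes -> one dummy node.

    Hypothetical helper for illustration only; the real logic lives in the
    kernel's ACPI NUMA code and is more involved.
    """
    if srat_nodes > max_numa_nodes(nodes_shift):
        # Corresponds to "SRAT: Too many proximity domains" followed by
        # "SRAT not used" in the serial log.
        return 1
    return srat_nodes


if __name__ == "__main__":
    for requested in (64, 128):
        visible = guest_visible_nodes(requested, nodes_shift=6)
        print(f"requested={requested} visible={visible}")
```

With CONFIG_NODES_SHIFT=9, as mentioned for bug-1961072, the same model would give a limit of 512 nodes.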
(In reply to Guowen Shan from comment #1)
> Zhenyu, I don't think it's a bug. The result is correct and as expected at
> least.
> The maximal NUMA nodes, which can be supported on the guest, is staticly
> defined
> by the kernel configuration as below. In RHEL9.0, The value in RHEL9.0.0 is
> 64.
> When 128 NUMA nodes are detected from SRAT table, all information in the ACPI
> table is dropped because it's considered as being corrupted. Instead, the
> dummy
> node scheme, where only one NUMA node, is used. It's why you just see one
> NUMA
> node from the corresponding sysfs file.
>
> CONFIG_NODES_SHIFT=6

Hello Guowen,

Thanks for the feedback.
Do you know in which version the increased NUMA node limit is expected to be implemented?
According to bug-1961072 Comment 19, rhel9 GA will have CONFIG_NODES_SHIFT=9.

On the other hand, I found another issue: ARM requires more memory per node than x86 to boot the guest.
When I used -m 4096, numa nodes = 64, node size = 64M, the aarch64 guest hangs.
With the same configuration, x86_64 boots successfully.

Check the serial log:
2022-01-17 02:34:26: UEFI firmware starting.
2022-01-17 02:34:26: ASSERT [MemoryInit] /builddir/build/BUILD/edk2-e1999b264f1f/ArmVirtPkg/Library/QemuVirtMemInfoLib/QemuVirtMemInfoPeiLibConstructor.c(93): NewSize >= 0x08000000

When I used -m 4096, numa nodes = 32, node size = 128M, the aarch64 guest boots successfully.
(In reply to Zhenyu Zhang from comment #2) > Do you know in which version the increase of numa node is expected to be > implemented? > According to bug-1961072 Comment 19 > rhel9 GA will have CONFIG_NODES_SHIFT=9 The MR for that change https://gitlab.com/cki-project/kernel-ark/-/merge_requests/1333 appears to be stuck waiting on a review. I wouldn't count on the change getting merged in time for 9.0. In any case, until the change is made, please only test up to the maximum currently supported, which, as Gavin pointed out, is 64.
(In reply to Andrew Jones from comment #4)
> (In reply to Zhenyu Zhang from comment #2)
> > Do you know in which version the increase of numa node is expected to be
> > implemented?
> > According to bug-1961072 Comment 19
> > rhel9 GA will have CONFIG_NODES_SHIFT=9
>
> The MR for that change
>
> https://gitlab.com/cki-project/kernel-ark/-/merge_requests/1333
>
> appears to be stuck waiting on a review. I wouldn't count on the change
> getting merged in time for 9.0. In any case, until the change is made,
> please only test up to the maximum currently supported, which, as Gavin
> pointed out, is 64.

Hello drew,

Do you think the 'node size = 64M, aarch64 guest hang' mentioned in my comment #2 is expected behavior?
Could we use this bug to track it?
(In reply to Zhenyu Zhang from comment #5) > Hello drew, > > Do you think the 'node size = 64M, aarch64 guest hang' mentioned in my > comment #2 is as expected? > Could we use this bug tracker? While 64M is quite small and feels unrealistic to me, I think it's worth Gavin checking the code to see if there's room for improvement, at least in how the problem is handled. That would be low priority work though, so changing the test case to 128M or more per node would make sense to me.
(In reply to Andrew Jones from comment #6)
> (In reply to Zhenyu Zhang from comment #5)
> > Hello drew,
> >
> > Do you think the 'node size = 64M, aarch64 guest hang' mentioned in my
> > comment #2 is as expected?
> > Could we use this bug tracker?
>
> While 64M is quite small and feels unrealistic to me, I think it's worth
> Gavin checking the code to see if there's room for improvement, at least in
> how the problem is handled. That would be low priority work though, so
> changing the test case to 128M or more per node would make sense to me.

Thanks for the feedback. I have changed the bug title according to comment 6.
Zhenyu, could you please provide the complete command line to reproduce the node size and boot failure issue?
(In reply to Guowen Shan from comment #8) > Zhenyu, could you please provide complete command line to reproduce > the node size and boot failure issue? Hello Guowen, The complete cmd: /usr/libexec/qemu-kvm \ -name 'avocado-vt-vm1' \ -sandbox on \ -blockdev node-name=file_aavmf_code,driver=file,filename=/usr/share/edk2/aarch64/QEMU_EFI-silent-pflash.raw,auto-read-only=on,discard=unmap \ -blockdev node-name=drive_aavmf_code,driver=raw,read-only=on,file=file_aavmf_code \ -blockdev node-name=file_aavmf_vars,driver=file,filename=/home/kvm_autotest_root/images/avocado-vt-vm1_rhel900-aarch64-virtio-scsi.qcow2_VARS.fd,auto-read-only=on,discard=unmap \ -blockdev node-name=drive_aavmf_vars,driver=raw,read-only=off,file=file_aavmf_vars \ -machine virt,gic-version=host,pflash0=drive_aavmf_code,pflash1=drive_aavmf_vars \ -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \ -device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0 \ -nodefaults \ -device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \ -device virtio-gpu-pci,bus=pcie-root-port-1,addr=0x0 \ -m 4096 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem0 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem1 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem2 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem3 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem4 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem5 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem6 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem7 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem8 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem9 \ -object 
memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem10 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem11 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem12 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem13 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem14 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem15 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem16 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem17 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem18 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem19 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem20 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem21 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem22 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem23 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem24 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem25 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem26 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem27 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem28 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem29 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem30 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem31 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem32 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem33 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem34 \ -object 
memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem35 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem36 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem37 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem38 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem39 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem40 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem41 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem42 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem43 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem44 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem45 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem46 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem47 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem48 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem49 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem50 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem51 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem52 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem53 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem54 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem55 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem56 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem57 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem58 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem59 \ -object 
memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem60 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem61 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem62 \ -object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem63 \ -smp 6,maxcpus=6,cores=3,threads=1,sockets=2 \ -numa node,memdev=mem-mem0 \ -numa node,memdev=mem-mem1 \ -numa node,memdev=mem-mem2 \ -numa node,memdev=mem-mem3 \ -numa node,memdev=mem-mem4 \ -numa node,memdev=mem-mem5 \ -numa node,memdev=mem-mem6 \ -numa node,memdev=mem-mem7 \ -numa node,memdev=mem-mem8 \ -numa node,memdev=mem-mem9 \ -numa node,memdev=mem-mem10 \ -numa node,memdev=mem-mem11 \ -numa node,memdev=mem-mem12 \ -numa node,memdev=mem-mem13 \ -numa node,memdev=mem-mem14 \ -numa node,memdev=mem-mem15 \ -numa node,memdev=mem-mem16 \ -numa node,memdev=mem-mem17 \ -numa node,memdev=mem-mem18 \ -numa node,memdev=mem-mem19 \ -numa node,memdev=mem-mem20 \ -numa node,memdev=mem-mem21 \ -numa node,memdev=mem-mem22 \ -numa node,memdev=mem-mem23 \ -numa node,memdev=mem-mem24 \ -numa node,memdev=mem-mem25 \ -numa node,memdev=mem-mem26 \ -numa node,memdev=mem-mem27 \ -numa node,memdev=mem-mem28 \ -numa node,memdev=mem-mem29 \ -numa node,memdev=mem-mem30 \ -numa node,memdev=mem-mem31 \ -numa node,memdev=mem-mem32 \ -numa node,memdev=mem-mem33 \ -numa node,memdev=mem-mem34 \ -numa node,memdev=mem-mem35 \ -numa node,memdev=mem-mem36 \ -numa node,memdev=mem-mem37 \ -numa node,memdev=mem-mem38 \ -numa node,memdev=mem-mem39 \ -numa node,memdev=mem-mem40 \ -numa node,memdev=mem-mem41 \ -numa node,memdev=mem-mem42 \ -numa node,memdev=mem-mem43 \ -numa node,memdev=mem-mem44 \ -numa node,memdev=mem-mem45 \ -numa node,memdev=mem-mem46 \ -numa node,memdev=mem-mem47 \ -numa node,memdev=mem-mem48 \ -numa node,memdev=mem-mem49 \ -numa node,memdev=mem-mem50 \ -numa node,memdev=mem-mem51 \ -numa node,memdev=mem-mem52 \ -numa node,memdev=mem-mem53 \ -numa node,memdev=mem-mem54 \ 
-numa node,memdev=mem-mem55 \ -numa node,memdev=mem-mem56 \ -numa node,memdev=mem-mem57 \ -numa node,memdev=mem-mem58 \ -numa node,memdev=mem-mem59 \ -numa node,memdev=mem-mem60 \ -numa node,memdev=mem-mem61 \ -numa node,memdev=mem-mem62 \ -numa node,memdev=mem-mem63 \ -cpu 'host' \ -chardev socket,server=on,id=qmp_id_qmpmonitor1,path=/tmp/monitor-qmpmonitor1-20220117-203351-6r2MJ5Z5,wait=off \ -mon chardev=qmp_id_qmpmonitor1,mode=control \ -chardev socket,server=on,id=qmp_id_catch_monitor,path=/tmp/monitor-catch_monitor-20220117-203351-6r2MJ5Z5,wait=off \ -mon chardev=qmp_id_catch_monitor,mode=control \ -serial unix:'/tmp/serial-serial0-20220117-203351-6r2MJ5Z5',server=on,wait=off \ -device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \ -device qemu-xhci,id=usb1,bus=pcie-root-port-2,addr=0x0 \ -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \ -device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \ -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-3,addr=0x0 \ -blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/rhel900-aarch64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \ -blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \ -device scsi-hd,id=image1,drive=drive_image1,write-cache=on \ -device pcie-root-port,id=pcie-root-port-4,port=0x4,addr=0x1.0x4,bus=pcie.0,chassis=5 \ -device virtio-net-pci,mac=9a:c3:8a:ec:54:9d,rombar=0,id=idu0OOqB,netdev=id1YGvvJ,bus=pcie-root-port-4,addr=0x0 \ -netdev tap,id=id1YGvvJ,vhost=on \ -vnc :20 \ -rtc base=utc,clock=host,driftfix=slew \ -enable-kvm \ -device pcie-root-port,id=pcie_extra_root_port_0,multifunction=on,bus=pcie.0,addr=0x2,chassis=6 \ -device pcie-root-port,id=pcie_extra_root_port_1,addr=0x2.0x1,bus=pcie.0,chassis=7 \ -monitor stdio (qemu) info status VM status: running (qemu) info numa 64 nodes 
node 0 cpus: 0
node 0 size: 64 MB
node 0 plugged: 0 MB

2. Check the serial log: the guest hangs.

# nc -U /tmp/serial-serial0-20220117-203351-6r2MJ5Z5
2022-01-17 20:33:59: UEFI firmware starting.
2022-01-17 20:33:59: ASSERT [MemoryInit] /builddir/build/BUILD/edk2-e1999b264f1f/ArmVirtPkg/Library/QemuVirtMemInfoLib/QemuVirtMemInfoPeiLibConstructor.c(93): NewSize >= 0x08000000

Additional info:
Booting the guest with 64 NUMA nodes of 128M each succeeds.

-m 8192 \
-object memory-backend-ram,size=128M,prealloc=yes,policy=default,id=mem-mem0 \
....
-object memory-backend-ram,size=128M,prealloc=yes,policy=default,id=mem-mem63 \
BTW do you think the error in Comment 3 needs to be reported as a new bug to track it down?
(In reply to Zhenyu Zhang from comment #10) > BTW do you think the error in Comment 3 needs to be reported as a new bug to > track it down? > Since we already use this bugzilla to start the discussion, it's fine to reuse the current bugzilla to track the issue, mentioned in comment#3
I believe it's an EDK2 bug. The issue also exists in upstream QEMU and EDK2, so I guess we need an EDK2 developer to be involved here.

The following source file has a comment explaining why 128MB is required. However, the code misses at least one case: multiple memory devices in the device tree whose address ranges are contiguous. I assume the permanent PEI RAM will use up to 128MB. However, @NewSize is the size of the memory device whose base address is lowest. That is not the whole memory size of the machine, especially when multiple memory devices are contiguous in their address ranges.

edk2/ArmVirtPkg/Library/QemuVirtMemInfoLib/QemuVirtMemInfoPeiLibConstructor.c

RETURN_STATUS
EFIAPI
QemuVirtMemInfoPeiLibConstructor (
  VOID
  )
{
    :
  //
  // We need to make sure that the machine we are running on has at least
  // 128 MB of memory configured, and is currently executing this binary from
  // NOR flash. This prevents a device tree image in DRAM from getting
  // clobbered when our caller installs permanent PEI RAM, before we have a
  // chance of marking its location as reserved or copy it to a freshly
  // allocated block in the permanent PEI RAM in the platform PEIM.
  //
  ASSERT (NewSize >= SIZE_128MB);
    :
}

The issue can be reproduced with the following command line on upstream QEMU; booting gets stuck.

/home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 \
-accel kvm -machine virt,gic-version=host \
-cpu host -smp 8,sockets=2,cores=2,threads=2 \
-m 1024M,slots=16,maxmem=64G \
-object memory-backend-ram,id=mem0,size=64M \
-object memory-backend-ram,id=mem1,size=960M \
-numa node,nodeid=0,cpus=0-3,memdev=mem0 \
-numa node,nodeid=1,cpus=4-7,memdev=mem1 \
  :

The issue disappears when the following options are used instead.

-object memory-backend-ram,id=mem0,size=960M \
-object memory-backend-ram,id=mem1,size=64M \
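The behaviour described above can be modelled in a few lines of Python. This is only an illustrative sketch, not the EDK2 code: it assumes that the memory of the NUMA nodes is laid out back to back starting at GPA 1GiB in the order the nodes are given, and the helper names (`layout`, `passes_firmware_check`) are made up for this example.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB
SIZE_128MB = 128 * MIB  # same value as 0x08000000 in the ASSERT message


def layout(node_sizes_mib, base=GIB):
    """Place nodes back to back from `base` (assumed layout) as (base, size)."""
    nodes = []
    for size_mib in node_sizes_mib:
        nodes.append((base, size_mib * MIB))
        base += size_mib * MIB
    return nodes


def passes_firmware_check(memory_nodes):
    """Model of the assertion: only the lowest-based node's size is checked."""
    _, new_size = min(memory_nodes, key=lambda node: node[0])
    return new_size >= SIZE_128MB


failing = layout([64, 960])   # mem0=64M, mem1=960M: the boot gets stuck
working = layout([960, 64])   # mem0=960M, mem1=64M: the boot succeeds
print(passes_firmware_check(failing), passes_firmware_check(working))
```

Both layouts contain 1024M of contiguous RAM in total; only the size of the first node differs, which is what the check looks at.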
(In reply to Guowen Shan from comment #11)
> (In reply to Zhenyu Zhang from comment #10)
> > BTW do you think the error in Comment 3 needs to be reported as a new bug to
> > track it down?
> >
>
> Since we already use this bugzilla to start the discussion, it's fine to
> reuse
> the current bugzilla to track the issue, mentioned in comment#3
>

Zhenyu, please help create another bug to track the issue reported in comment#3 and assign it to me. I had thought the two issues shared the same root cause, but that is unlikely to be true.

Thanks,
Gavin
(In reply to Guowen Shan from comment #13)
> Zhenyu, please help to create another bug to track the issue reported from
> comment#3 and assign to me. I was thinking both two issues share same root
> cause, but it's unlikely true.
>
> Thanks,
> Gavin

Got it, I created the following bug to track the issue.

Bug 2041823 - [aarch64][numa] When there are at least 6 Numa nodes serial log shows 'arch topology borken'
> I assume the permanent PEI RAM will use up to 128MB. However, @NewSize is > the size of memory device whose base address is lowest. It means it's not > the whole memory size that the machine has, especially when multiple memory > devices are contiguous in their address ranges. > First of all, the issue can be reproduced with the following command lines > on upstream qemu, the booting is stuck. > > /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 \ > -accel kvm -machine virt,gic-version=host \ > -cpu host -smp 8,sockets=2,cores=2,threads=2 \ > -m 1024M,slots=16,maxmem=64G \ > -object memory-backend-ram,id=mem0,size=64M \ > -object memory-backend-ram,id=mem1,size=960M \ > -numa node,nodeid=0,cpus=0-3,memdev=mem0 \ > -numa node,nodeid=1,cpus=4-7,memdev=mem1 \ Why does this matter in the first place? Creating numa nodes which are that small looks rather pointless to me. I'd suggest to simply document that 128M is the smallest numa node size supported and be done with it (unless someone can come up with a good argument why such a configuration makes sense).
(In reply to Gerd Hoffmann from comment #15)
> I'd suggest to simply document that 128M is the smallest numa node size
> supported and be done with it (unless someone can come up with a good
> argument why such a configuration makes sense).

Hello Gerd,

I admit it may not be that important, but we should have a reasonable explanation. Both arm and x86_64 currently use a 4096-byte page size, yet x86 boots successfully with a node size of 32M, while ARM fails to boot even with double that size. Our goal is for ARM to gradually support the same functionality as the x86 platform.

-machine q35 \
-device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
-device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0 \
-nodefaults \
-device VGA,bus=pcie.0,addr=0x2 \
-smp 6,maxcpus=6,cores=3,threads=1,dies=1,sockets=2 \
-m 1024M,slots=16,maxmem=64G \
-object memory-backend-ram,id=mem0,size=64M \ -----------------------> boots successfully
-object memory-backend-ram,id=mem1,size=960M \
-numa node,nodeid=0,cpus=0-3,memdev=mem0 \
-numa node,nodeid=1,cpus=4-7,memdev=mem1 \
I agree with Gerd (comment 15); the ArmVirtQemu platform simply has this requirement that the "system RAM" starting at GPA 1GiB be at least 128MiB in size. This is a foundational design tenet of the ArmVirtQemu platform.

The PEI phase is in general not flexible enough to deal with any random physical RAM layout; there must be some RAM that is *always* available at a fixed address and in some previously-known minimum size. For ArmVirtQemu, that means 128MiB at 1GiB.

For the pc and q35 machine types, the same requirements exist, only the numbers differ -- we expect (IIRC) 128MiB in size just the same, and based at GPA 0 -- with the usual x86 RAM holes under 1MB and 4GB.

In ArmVirtQemu, "OvmfPkg/Fdt/HighMemDxe" handles *additional* memory areas. They are discovered from QEMU's device tree, and the RAM discovered in this way is made available to the firmware (and later to the OS) during the DXE phase. Note: *not before* the DXE phase; so any RAM discovered here is unavailable for use during PEI.

In "ArmVirtPkg/Library/QemuVirtMemInfoLib/QemuVirtMemInfoPeiLibConstructor.c", we have an explicit sanity check that the hard-coded memory base (1GiB) that we assume everywhere else in SEC and PEI, namely "PcdSystemMemoryBase", equals the lowest-address memory range exposed by QEMU in the DTB. If that assumption is shown incorrect, the platform must not continue booting. (For example, PcdCPUCoresStackBase = 1GiB + 0x7c000, per "ArmVirtQemu.dsc", and that constant is used at the earliest stage of guest boot, namely in SEC, ArmPlatformPkg/PrePeiCore.)

... I guess QemuVirtMemInfoPeiLibConstructor() [ArmVirtPkg/Library/QemuVirtMemInfoLib/QemuVirtMemInfoPeiLibConstructor.c] could be improved to build the largest contiguous RAM range that starts at the lowest address, and then check if [1GiB, 1GiB+128MiB) is a subset of that interval. But what's the point?
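For illustration, the possible improvement mentioned at the end of this comment (merge the contiguous RAM ranges that start at the lowest address, then test whether [1GiB, 1GiB+128MiB) lies inside the result) could look roughly like the Python sketch below. It is a model of the idea only, not a proposed firmware patch; the function names and the (base, size) tuple representation are assumptions made for this example.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB


def lowest_contiguous_range(memory_nodes):
    """Merge the (base, size) ranges that touch, starting at the lowest base.

    Returns the half-open interval (start, end) of that merged range.
    """
    ordered = sorted(memory_nodes)
    start, size = ordered[0]
    end = start + size
    for base, size in ordered[1:]:
        if base > end:          # a hole: the contiguous run stops here
            break
        end = max(end, base + size)
    return start, end


def covers_required_window(memory_nodes, base=GIB, size=128 * MIB):
    """True if [base, base + size) is a subset of the lowest contiguous range."""
    start, end = lowest_contiguous_range(memory_nodes)
    return start <= base and base + size <= end


# 64M followed directly by 960M: accepted by this relaxed check.
adjacent = [(GIB, 64 * MIB), (GIB + 64 * MIB, 960 * MIB)]
# 64M, then a hole before the next range: still rejected.
with_hole = [(GIB, 64 * MIB), (GIB + 128 * MIB, 960 * MIB)]
print(covers_required_window(adjacent), covers_required_window(with_hole))
```

Under this relaxed rule the 64M + 960M reproducer would pass, while a layout with a hole inside the first 128MiB would still be refused.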
(In reply to Laszlo Ersek from comment #17) > I agree with Gerd (comment 15); the ArmVirtQemu platform simply has this > requirement that the "system RAM" starting at GPA 1GiB be at least 128MiB in > size. This is a foundational design tenet of the ArmVirtQemu platform. Thanks for the updated information. I think this is a good explanation. If we recommend a minimum 128MiB node size for ARM, could we increase the prompt? Instead of hanging, we can check and report errors when qemu boot guests.
(In reply to Zhenyu Zhang from comment #18) > If we recommend a minimum 128MiB node size for ARM, could we increase the > prompt? > Instead of hanging, we can check and report errors when qemu boot guests. Not sure what you mean by "increase the prompt" -- if that's about raising the limit in QEMU, it seems OK to me. The firmware is very limited in reporting issues, let alone in recovering them. Especially in such an early stage as PEI. In such situations the firmware commonly logs an error and intentionally hangs.
(In reply to Laszlo Ersek from comment #19)
> Not sure what you mean by "increase the prompt" -- if that's about raising
> the limit in QEMU, it seems OK to me.

Sure I also think it would be very friendly to add a hint to QEMU. Of course, it's up to you to decide how to do it.

Messages like the following are very helpful to me: the guest is not booted, and the problem is reported directly by QEMU.

# /usr/libexec/qemu-kvm -cpu ''host -smp 64
qemu-kvm: warning: Number of SMP cpus requested (64) exceeds the recommended cpus supported by KVM (32)
Number of SMP cpus requested (64) exceeds the maximum cpus supported by KVM (32)
> Sure I also think it would be very friendly to add a hint to QEMU. Yes, having qemu check that and throw an error makes sense. Much easier to discover for the user what the problem is.
I don't think it's a good idea to have the check in QEMU, because QEMU and EDK2 would be bound together that way. It'd be better to keep them separate and detached. Gerd, is there any way to print an error message in EDK2 for this specific case?
(In reply to Guowen Shan from comment #22)
> Gerd, is there any way to print a error message in EDK2 for
> this specific case?

Laszlo answered that one already (comment 19):

> The firmware is very limited in reporting issues, let alone in recovering
> them. Especially in such an early stage as PEI. In such situations the
> firmware commonly logs an error and intentionally hangs.

qemu is in a *much* better position to print a helpful error message, specifically when it comes to giving the user hints on how the config should be fixed.
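To make the idea of an early, user-facing check concrete, here is a small Python sketch of a pre-launch validation over a QEMU argument list. It is not the QEMU patch and not QEMU code: the function names, the error text, and the rule that nodes without an explicit nodeid are numbered in order of appearance are all assumptions made for this example.

```python
MIB = 1024 * 1024
MIN_FIRST_NODE = 128 * MIB
SUFFIXES = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}


def parse_size(text):
    """Convert '64M', '1G' or a plain byte count to bytes."""
    if text[-1].upper() in SUFFIXES:
        return int(text[:-1]) * SUFFIXES[text[-1].upper()]
    return int(text)


def parse_props(value):
    """Split 'a=1,b=2' style option text into a dict (bare words are skipped)."""
    return dict(item.split("=", 1) for item in value.split(",") if "=" in item)


def first_node_error(argv):
    """Return a hint if NUMA node 0 is backed by less than 128MiB, else None."""
    backends, nodes, next_id = {}, {}, 0
    for opt, value in zip(argv, argv[1:]):
        props = parse_props(value)
        if opt == "-object" and value.startswith("memory-backend-"):
            if "id" in props and "size" in props:
                backends[props["id"]] = parse_size(props["size"])
        elif opt == "-numa" and value.startswith("node"):
            node_id = int(props.get("nodeid", next_id))  # assumed numbering
            next_id = node_id + 1
            if "memdev" in props:
                nodes[node_id] = props["memdev"]
    size = backends.get(nodes.get(0))
    if size is not None and size < MIN_FIRST_NODE:
        return ("NUMA node 0 (memdev=%s) has %dM; the firmware needs at least "
                "128M in the first node" % (nodes[0], size // MIB))
    return None


bad = ["-object", "memory-backend-ram,id=mem0,size=64M",
       "-object", "memory-backend-ram,id=mem1,size=960M",
       "-numa", "node,nodeid=0,cpus=0-3,memdev=mem0",
       "-numa", "node,nodeid=1,cpus=4-7,memdev=mem1"]
print(first_node_error(bad))
```

Run against the reproducer from this bug, the sketch returns a hint naming mem0 instead of letting the guest hang in the firmware; with the two backend sizes swapped it returns None.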
I'd also like to emphasize a general concept here (which I've come to learn way too late): the idea that the platform (QEMU) and the firmware (SeaBIOS, edk2) should be *completely independent* is wrong. They should be kept as independent as is *easily* doable, but not more independent than that.

Let's ask ourselves: if the same issue presented itself on the Hyper-V platform (meaning all of: hypervisor, management UI, and guest firmware), where would Microsoft implement the error?

(NB, for another example: While keeping ACPI generation in QEMU is a good thing, let's not forget the original problem that required that solution: the impossibility to treat SeaBIOS and QEMU together as a product, upstream. This split between the developer communities led to incredible development pains and the need for an elaborate information channel. We should have as few of those as possible.)
Ok. I've posted a QEMU patch for comments. Let's see what feedback I receive from the upstream community. https://lists.nongnu.org/archive/html/qemu-arm/2022-02/msg00445.html
Posted the v1 patch upstream for comments. We still think the best place to have the check is EDK2. Laszlo and Gerd, feel free to reassign this bug to yourselves if it's fine to provide the user-friendly error message in EDK2. https://lore.kernel.org/all/20220301114257.2bppjnjqj7dgxztc@sirius.home.kraxel.org/T/
I've replied in the qemu-devel thread.
I discussed this with QA (Zhenyu) and we agreed to close it as "won't fix". From the edk2 ASSERT output, we already know that the memory in the first NUMA node shouldn't be less than 128MB. So it's fine not to provide an explicit error message from the edk2 side.
Clearing NEEDINFO as discussion occurred on upstream and decision has been made to close the BZ.