Bug 2041323 - [aarch64][numa] When node size is less than 128M aarch64 guest hangs
Summary: [aarch64][numa] When node size is less than 128M aarch64 guest hangs
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: qemu-kvm
Version: 9.0
Hardware: aarch64
OS: Linux
Priority: low
Severity: low
Target Milestone: rc
Target Release: 9.1
Assignee: Guowen Shan
QA Contact: Zhenyu Zhang
URL:
Whiteboard:
Depends On:
Blocks: 1924294
 
Reported: 2022-01-17 06:13 UTC by Zhenyu Zhang
Modified: 2022-03-31 01:22 UTC
CC List: 10 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-03-15 08:02:20 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker RHELPLAN-108253 0 None None None 2022-01-17 06:19:14 UTC

Description Zhenyu Zhang 2022-01-17 06:13:54 UTC
Description of problem:
The NUMA configuration is not applied when booting the guest with 128 NUMA nodes.
"cat /sys/devices/system/node/possible" reports "0", while the expected range is "0-127".

Version-Release number of selected component (if applicable):
Host Distro: RHEL-9.0.0-20220115.2
Host Kernel: kernel-5.14.0-42.el9.aarch64
Guest Kernel: kernel-5.14.0-39.el9.aarch64
qemu-kvm: qemu-kvm-6.2.0-3.el9

How reproducible:
100%

Steps to Reproduce:
1. Boot guest with 128 numa nodes
/usr/libexec/qemu-kvm \
-name 'avocado-vt-vm1'  \
-sandbox on  \
-blockdev node-name=file_aavmf_code,driver=file,filename=/usr/share/edk2/aarch64/QEMU_EFI-silent-pflash.raw,auto-read-only=on,discard=unmap \
-blockdev node-name=drive_aavmf_code,driver=raw,read-only=on,file=file_aavmf_code \
-blockdev node-name=file_aavmf_vars,driver=file,filename=/home/kvm_autotest_root/images/avocado-vt-vm1_rhel900-aarch64-virtio-scsi.qcow2_VARS.fd,auto-read-only=on,discard=unmap \
-blockdev node-name=drive_aavmf_vars,driver=raw,read-only=off,file=file_aavmf_vars \
-machine virt,gic-version=host,pflash0=drive_aavmf_code,pflash1=drive_aavmf_vars \
-device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
-device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0  \
-nodefaults \
-device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
-device virtio-gpu-pci,bus=pcie-root-port-1,addr=0x0 \
-m 32768 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem0 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem1 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem2 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem3 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem4 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem5 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem6 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem7 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem8 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem9 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem10 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem11 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem12 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem13 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem14 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem15 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem16 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem17 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem18 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem19 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem20 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem21 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem22 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem23 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem24 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem25 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem26 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem27 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem28 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem29 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem30 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem31 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem32 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem33 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem34 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem35 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem36 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem37 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem38 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem39 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem40 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem41 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem42 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem43 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem44 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem45 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem46 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem47 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem48 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem49 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem50 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem51 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem52 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem53 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem54 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem55 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem56 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem57 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem58 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem59 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem60 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem61 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem62 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem63 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem64 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem65 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem66 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem67 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem68 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem69 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem70 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem71 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem72 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem73 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem74 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem75 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem76 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem77 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem78 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem79 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem80 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem81 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem82 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem83 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem84 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem85 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem86 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem87 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem88 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem89 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem90 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem91 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem92 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem93 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem94 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem95 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem96 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem97 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem98 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem99 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem100 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem101 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem102 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem103 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem104 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem105 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem106 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem107 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem108 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem109 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem110 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem111 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem112 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem113 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem114 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem115 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem116 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem117 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem118 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem119 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem120 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem121 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem122 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem123 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem124 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem125 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem126 \
-object memory-backend-ram,size=256M,prealloc=yes,policy=default,id=mem-mem127  \
-smp 6,maxcpus=6,cores=3,threads=1,sockets=2  \
-numa node,memdev=mem-mem0  \
-numa node,memdev=mem-mem1  \
-numa node,memdev=mem-mem2  \
-numa node,memdev=mem-mem3  \
-numa node,memdev=mem-mem4  \
-numa node,memdev=mem-mem5  \
-numa node,memdev=mem-mem6  \
-numa node,memdev=mem-mem7  \
-numa node,memdev=mem-mem8  \
-numa node,memdev=mem-mem9  \
-numa node,memdev=mem-mem10  \
-numa node,memdev=mem-mem11  \
-numa node,memdev=mem-mem12  \
-numa node,memdev=mem-mem13  \
-numa node,memdev=mem-mem14  \
-numa node,memdev=mem-mem15  \
-numa node,memdev=mem-mem16  \
-numa node,memdev=mem-mem17  \
-numa node,memdev=mem-mem18  \
-numa node,memdev=mem-mem19  \
-numa node,memdev=mem-mem20  \
-numa node,memdev=mem-mem21  \
-numa node,memdev=mem-mem22  \
-numa node,memdev=mem-mem23  \
-numa node,memdev=mem-mem24  \
-numa node,memdev=mem-mem25  \
-numa node,memdev=mem-mem26  \
-numa node,memdev=mem-mem27  \
-numa node,memdev=mem-mem28  \
-numa node,memdev=mem-mem29  \
-numa node,memdev=mem-mem30  \
-numa node,memdev=mem-mem31  \
-numa node,memdev=mem-mem32  \
-numa node,memdev=mem-mem33  \
-numa node,memdev=mem-mem34  \
-numa node,memdev=mem-mem35  \
-numa node,memdev=mem-mem36  \
-numa node,memdev=mem-mem37  \
-numa node,memdev=mem-mem38  \
-numa node,memdev=mem-mem39  \
-numa node,memdev=mem-mem40  \
-numa node,memdev=mem-mem41  \
-numa node,memdev=mem-mem42  \
-numa node,memdev=mem-mem43  \
-numa node,memdev=mem-mem44  \
-numa node,memdev=mem-mem45  \
-numa node,memdev=mem-mem46  \
-numa node,memdev=mem-mem47  \
-numa node,memdev=mem-mem48  \
-numa node,memdev=mem-mem49  \
-numa node,memdev=mem-mem50  \
-numa node,memdev=mem-mem51  \
-numa node,memdev=mem-mem52  \
-numa node,memdev=mem-mem53  \
-numa node,memdev=mem-mem54  \
-numa node,memdev=mem-mem55  \
-numa node,memdev=mem-mem56  \
-numa node,memdev=mem-mem57  \
-numa node,memdev=mem-mem58  \
-numa node,memdev=mem-mem59  \
-numa node,memdev=mem-mem60  \
-numa node,memdev=mem-mem61  \
-numa node,memdev=mem-mem62  \
-numa node,memdev=mem-mem63  \
-numa node,memdev=mem-mem64  \
-numa node,memdev=mem-mem65  \
-numa node,memdev=mem-mem66  \
-numa node,memdev=mem-mem67  \
-numa node,memdev=mem-mem68  \
-numa node,memdev=mem-mem69  \
-numa node,memdev=mem-mem70  \
-numa node,memdev=mem-mem71  \
-numa node,memdev=mem-mem72  \
-numa node,memdev=mem-mem73  \
-numa node,memdev=mem-mem74  \
-numa node,memdev=mem-mem75  \
-numa node,memdev=mem-mem76  \
-numa node,memdev=mem-mem77  \
-numa node,memdev=mem-mem78  \
-numa node,memdev=mem-mem79  \
-numa node,memdev=mem-mem80  \
-numa node,memdev=mem-mem81  \
-numa node,memdev=mem-mem82  \
-numa node,memdev=mem-mem83  \
-numa node,memdev=mem-mem84  \
-numa node,memdev=mem-mem85  \
-numa node,memdev=mem-mem86  \
-numa node,memdev=mem-mem87  \
-numa node,memdev=mem-mem88  \
-numa node,memdev=mem-mem89  \
-numa node,memdev=mem-mem90  \
-numa node,memdev=mem-mem91  \
-numa node,memdev=mem-mem92  \
-numa node,memdev=mem-mem93  \
-numa node,memdev=mem-mem94  \
-numa node,memdev=mem-mem95  \
-numa node,memdev=mem-mem96  \
-numa node,memdev=mem-mem97  \
-numa node,memdev=mem-mem98  \
-numa node,memdev=mem-mem99  \
-numa node,memdev=mem-mem100  \
-numa node,memdev=mem-mem101  \
-numa node,memdev=mem-mem102  \
-numa node,memdev=mem-mem103  \
-numa node,memdev=mem-mem104  \
-numa node,memdev=mem-mem105  \
-numa node,memdev=mem-mem106  \
-numa node,memdev=mem-mem107  \
-numa node,memdev=mem-mem108  \
-numa node,memdev=mem-mem109  \
-numa node,memdev=mem-mem110  \
-numa node,memdev=mem-mem111  \
-numa node,memdev=mem-mem112  \
-numa node,memdev=mem-mem113  \
-numa node,memdev=mem-mem114  \
-numa node,memdev=mem-mem115  \
-numa node,memdev=mem-mem116  \
-numa node,memdev=mem-mem117  \
-numa node,memdev=mem-mem118  \
-numa node,memdev=mem-mem119  \
-numa node,memdev=mem-mem120  \
-numa node,memdev=mem-mem121  \
-numa node,memdev=mem-mem122  \
-numa node,memdev=mem-mem123  \
-numa node,memdev=mem-mem124  \
-numa node,memdev=mem-mem125  \
-numa node,memdev=mem-mem126  \
-numa node,memdev=mem-mem127  \
-cpu 'host' \
-chardev socket,wait=off,path=/tmp/monitor-qmpmonitor1-20211219-205427-izDbxLsL,server=on,id=qmp_id_qmpmonitor1  \
-mon chardev=qmp_id_qmpmonitor1,mode=control \
-chardev socket,wait=off,path=/tmp/monitor-catch_monitor-20211219-205427-izDbxLsL,server=on,id=qmp_id_catch_monitor  \
-mon chardev=qmp_id_catch_monitor,mode=control  \
-serial unix:'/tmp/serial-serial0-20211219-205427-izDbxLsL',server=on,wait=off \
-device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
-device qemu-xhci,id=usb1,bus=pcie-root-port-2,addr=0x0 \
-device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
-device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
-device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-3,addr=0x0 \
-blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/rhel900-aarch64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
-blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
-device scsi-hd,id=image1,drive=drive_image1,write-cache=on \
-device pcie-root-port,id=pcie-root-port-4,port=0x4,addr=0x1.0x4,bus=pcie.0,chassis=5 \
-device virtio-net-pci,mac=9a:48:8b:5c:b4:11,rombar=0,id=id53U1P3,netdev=idANq8Op,bus=pcie-root-port-4,addr=0x0  \
-netdev tap,id=idANq8Op,vhost=on  \
-vnc :20  \
-rtc base=utc,clock=host,driftfix=slew \
-enable-kvm \
-device pcie-root-port,id=pcie-root-port-5,port=0x5,addr=0x1.0x5,bus=pcie.0,chassis=6 \
-device virtio-balloon-pci,id=balloon0,bus=pcie-root-port-5,addr=0x0 \
-device pcie-root-port,id=pcie_extra_root_port_0,multifunction=on,bus=pcie.0,addr=0x2,chassis=7 \
-device pcie-root-port,id=pcie_extra_root_port_1,addr=0x2.0x1,bus=pcie.0,chassis=8 \
-monitor stdio 
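For reference, the 128 repeated "-object"/"-numa" pairs above can be generated with a short shell loop instead of being written out by hand. This is a sketch; the variable names (NODES, SIZE, OBJ_ARGS, NUMA_ARGS) are mine, not from the report:

```shell
# Build the 128 memory-backend objects and the matching -numa node options.
NODES=128
SIZE=256M
OBJ_ARGS=""
NUMA_ARGS=""
for i in $(seq 0 $((NODES - 1))); do
  OBJ_ARGS="$OBJ_ARGS -object memory-backend-ram,size=$SIZE,prealloc=yes,policy=default,id=mem-mem$i"
  NUMA_ARGS="$NUMA_ARGS -numa node,memdev=mem-mem$i"
done
# Expand the variables (unquoted) into the qemu-kvm invocation:
#   /usr/libexec/qemu-kvm ... $OBJ_ARGS $NUMA_ARGS ...
echo "$OBJ_ARGS" | wc -w   # 128 "-object" flags plus 128 option strings
```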

2. Check "cat /sys/devices/system/node/possible" information  --------------- hit this issue
# cat /sys/devices/system/node/possible
0

# lscpu
Architecture:          aarch64
  CPU op-mode(s):      32-bit, 64-bit
  Byte Order:          Little Endian
CPU(s):                6
  On-line CPU(s) list: 0-5
Vendor ID:             APM
  BIOS Vendor ID:      QEMU
  BIOS Model name:     virt-rhel9.0.0
  Model:               2
  Thread(s) per core:  1
  Core(s) per socket:  3
  Socket(s):           2
  Stepping:            0x3
  BogoMIPS:            80.00
  Flags:               fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
NUMA:                  
  NUMA node(s):        1
  NUMA node0 CPU(s):   0-5


Actual results:
The guest comes up with a single NUMA node; the 128-node configuration is not applied.

Expected results:
The guest NUMA topology matches the command line, i.e. nodes 0-127 are present.

Additional info:
Booting the guest with 64 NUMA nodes succeeds.

Serial log excerpt:
2022-01-17 00:38:28: [    0.000000] ACPI: SRAT: Node 62 PXM 62 [mem 0x420000000-0x42fffffff]
2022-01-17 00:38:28: [    0.000000] ACPI: SRAT: Node 63 PXM 63 [mem 0x430000000-0x43fffffff]
2022-01-17 00:38:28: [    0.000000] ACPI: SRAT: Too many proximity domains.
2022-01-17 00:38:28: [    0.000000] ACPI: SRAT: SRAT not used.

Comment 1 Guowen Shan 2022-01-17 07:03:01 UTC
Zhenyu, I don't think this is a bug; the result is correct and expected.
The maximum number of NUMA nodes a guest can support is statically defined
by the kernel configuration shown below. In RHEL 9.0 the value is 64.
When 128 NUMA nodes are detected in the SRAT table, all information in the
ACPI table is dropped because it is considered corrupted, and the dummy
node scheme, with only one NUMA node, is used instead. That is why you see
just one NUMA node in the corresponding sysfs file.

   CONFIG_NODES_SHIFT=6
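The node cap follows from the config option as a power of two (MAX_NUMNODES = 1 << CONFIG_NODES_SHIFT). A quick sketch of the arithmetic, mine rather than from the report:

```shell
# shift=6 caps a guest at 64 nodes; the proposed shift=9 would allow 512.
for shift in 6 9; do
  echo "CONFIG_NODES_SHIFT=$shift -> max $((1 << shift)) NUMA nodes"
done
```

On a running guest the built-in value can usually be checked with `grep CONFIG_NODES_SHIFT /boot/config-$(uname -r)` (assuming the config file is installed, as it typically is on RHEL).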

Comment 2 Zhenyu Zhang 2022-01-17 07:47:08 UTC
(In reply to Guowen Shan from comment #1)
> Zhenyu, I don't think it's a bug. The result is correct and as expected at
> least.
> The maximal NUMA nodes, which can be supported on the guest, is staticly
> defined
> by the kernel configuration as below. In RHEL9.0, The value in RHEL9.0.0 is
> 64.
> When 128 NUMA nodes are detected from SRAT table, all information in the ACPI
> table is dropped because it's considered as being corrupted. Instead, the
> dummy
> node scheme, where only one NUMA node, is used. It's why you just see one
> NUMA
> node from the corresponding sysfs file.
> 
>    CONFIG_NODES_SHIFT=6

Hello Guowen,

Thanks for the feedback.
Do you know in which version the increase in NUMA nodes is expected to be implemented?
According to bug 1961072 comment 19,
RHEL 9 GA will have CONFIG_NODES_SHIFT=9.

On the other hand, I found another issue:
aarch64 requires more memory per node than x86 to boot the guest.

With -m 4096, 64 NUMA nodes, and a node size of 64M, the aarch64 guest hangs.
With the same configuration, x86_64 boots successfully.

Check the serial log:
2022-01-17 02:34:26: UEFI firmware starting.
2022-01-17 02:34:26: ASSERT [MemoryInit] /builddir/build/BUILD/edk2-e1999b264f1f/ArmVirtPkg/Library/QemuVirtMemInfoLib/QemuVirtMemInfoPeiLibConstructor.c(93): NewSize >= 0x08000000

With -m 4096, 32 NUMA nodes, and a node size of 128M, the aarch64 guest boots successfully.
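The firmware assert in the log above explains the 128M threshold: NewSize must be at least 0x08000000 bytes, which is exactly 128 MiB. A one-line conversion (mine, not from the report):

```shell
# Convert the EDK2 assert threshold (NewSize >= 0x08000000) to MiB.
MIN_NODE_BYTES=$((0x08000000))
echo "minimum node size: $((MIN_NODE_BYTES / 1024 / 1024)) MiB"
```

This matches the observed behavior: 64M nodes trip the assert, 128M nodes boot.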

Comment 4 Andrew Jones 2022-01-17 09:48:35 UTC
(In reply to Zhenyu Zhang from comment #2)
> Do you know in which version the increase of numa node is expected to be
> implemented?
> According to bug-1961072 Comment 19
> rhel9 GA will have CONFIG_NODES_SHIFT=9

The MR for that change

https://gitlab.com/cki-project/kernel-ark/-/merge_requests/1333

appears to be stuck waiting on a review. I wouldn't count on the change getting merged in time for 9.0. In any case, until the change is made, please only test up to the maximum currently supported, which, as Gavin pointed out, is 64.

Comment 5 Zhenyu Zhang 2022-01-17 10:58:05 UTC
(In reply to Andrew Jones from comment #4)
> (In reply to Zhenyu Zhang from comment #2)
> > Do you know in which version the increase of numa node is expected to be
> > implemented?
> > According to bug-1961072 Comment 19
> > rhel9 GA will have CONFIG_NODES_SHIFT=9
> 
> The MR for that change
> 
> https://gitlab.com/cki-project/kernel-ark/-/merge_requests/1333
> 
> appears to be stuck waiting on a review. I wouldn't count on the change
> getting merged in time for 9.0. In any case, until the change is made,
> please only test up to the maximum currently supported, which, as Gavin
> pointed out, is 64.

Hello drew,

Do you think the 'node size = 64M, aarch64 guest hang' mentioned in my comment #2 is expected behavior?
Could we track it in this bug?

Comment 6 Andrew Jones 2022-01-17 12:48:21 UTC
(In reply to Zhenyu Zhang from comment #5)
> Hello drew,
> 
> Do you think the 'node size = 64M, aarch64 guest hang' mentioned in my
> comment #2 is as expected? 
> Could we use this bug tracker?

While 64M is quite small and feels unrealistic to me, I think it's worth Gavin checking the code to see if there's room for improvement, at least in how the problem is handled. That would be low priority work though, so changing the test case to 128M or more per node would make sense to me.

Comment 7 Zhenyu Zhang 2022-01-17 23:52:04 UTC
(In reply to Andrew Jones from comment #6)
> (In reply to Zhenyu Zhang from comment #5)
> > Hello drew,
> > 
> > Do you think the 'node size = 64M, aarch64 guest hang' mentioned in my
> > comment #2 is as expected? 
> > Could we use this bug tracker?
> 
> While 64M is quite small and feels unrealistic to me, I think it's worth
> Gavin checking the code to see if there's room for improvement, at least in
> how the problem is handled. That would be low priority work though, so
> changing the test case to 128M or more per node would make sense to me.

Thanks for the feedback.
Changed the bug title according to comment 6.

Comment 8 Guowen Shan 2022-01-18 01:28:14 UTC
Zhenyu, could you please provide the complete command line to reproduce
the node-size boot failure?

Comment 9 Zhenyu Zhang 2022-01-18 01:52:03 UTC
(In reply to Guowen Shan from comment #8)
> Zhenyu, could you please provide complete command line to reproduce
> the node size and boot failure issue?

Hello Guowen,

The complete cmd:
/usr/libexec/qemu-kvm \
-name 'avocado-vt-vm1'  \
-sandbox on  \
-blockdev node-name=file_aavmf_code,driver=file,filename=/usr/share/edk2/aarch64/QEMU_EFI-silent-pflash.raw,auto-read-only=on,discard=unmap \
-blockdev node-name=drive_aavmf_code,driver=raw,read-only=on,file=file_aavmf_code \
-blockdev node-name=file_aavmf_vars,driver=file,filename=/home/kvm_autotest_root/images/avocado-vt-vm1_rhel900-aarch64-virtio-scsi.qcow2_VARS.fd,auto-read-only=on,discard=unmap \
-blockdev node-name=drive_aavmf_vars,driver=raw,read-only=off,file=file_aavmf_vars \
-machine virt,gic-version=host,pflash0=drive_aavmf_code,pflash1=drive_aavmf_vars \
-device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
-device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0  \
-nodefaults \
-device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
-device virtio-gpu-pci,bus=pcie-root-port-1,addr=0x0 \
-m 4096 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem0 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem1 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem2 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem3 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem4 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem5 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem6 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem7 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem8 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem9 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem10 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem11 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem12 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem13 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem14 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem15 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem16 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem17 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem18 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem19 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem20 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem21 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem22 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem23 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem24 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem25 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem26 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem27 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem28 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem29 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem30 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem31 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem32 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem33 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem34 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem35 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem36 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem37 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem38 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem39 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem40 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem41 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem42 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem43 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem44 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem45 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem46 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem47 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem48 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem49 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem50 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem51 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem52 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem53 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem54 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem55 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem56 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem57 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem58 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem59 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem60 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem61 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem62 \
-object memory-backend-ram,size=64M,prealloc=yes,policy=default,id=mem-mem63  \
-smp 6,maxcpus=6,cores=3,threads=1,sockets=2  \
-numa node,memdev=mem-mem0  \
-numa node,memdev=mem-mem1  \
-numa node,memdev=mem-mem2  \
-numa node,memdev=mem-mem3  \
-numa node,memdev=mem-mem4  \
-numa node,memdev=mem-mem5  \
-numa node,memdev=mem-mem6  \
-numa node,memdev=mem-mem7  \
-numa node,memdev=mem-mem8  \
-numa node,memdev=mem-mem9  \
-numa node,memdev=mem-mem10  \
-numa node,memdev=mem-mem11  \
-numa node,memdev=mem-mem12  \
-numa node,memdev=mem-mem13  \
-numa node,memdev=mem-mem14  \
-numa node,memdev=mem-mem15  \
-numa node,memdev=mem-mem16  \
-numa node,memdev=mem-mem17  \
-numa node,memdev=mem-mem18  \
-numa node,memdev=mem-mem19  \
-numa node,memdev=mem-mem20  \
-numa node,memdev=mem-mem21  \
-numa node,memdev=mem-mem22  \
-numa node,memdev=mem-mem23  \
-numa node,memdev=mem-mem24  \
-numa node,memdev=mem-mem25  \
-numa node,memdev=mem-mem26  \
-numa node,memdev=mem-mem27  \
-numa node,memdev=mem-mem28  \
-numa node,memdev=mem-mem29  \
-numa node,memdev=mem-mem30  \
-numa node,memdev=mem-mem31  \
-numa node,memdev=mem-mem32  \
-numa node,memdev=mem-mem33  \
-numa node,memdev=mem-mem34  \
-numa node,memdev=mem-mem35  \
-numa node,memdev=mem-mem36  \
-numa node,memdev=mem-mem37  \
-numa node,memdev=mem-mem38  \
-numa node,memdev=mem-mem39  \
-numa node,memdev=mem-mem40  \
-numa node,memdev=mem-mem41  \
-numa node,memdev=mem-mem42  \
-numa node,memdev=mem-mem43  \
-numa node,memdev=mem-mem44  \
-numa node,memdev=mem-mem45  \
-numa node,memdev=mem-mem46  \
-numa node,memdev=mem-mem47  \
-numa node,memdev=mem-mem48  \
-numa node,memdev=mem-mem49  \
-numa node,memdev=mem-mem50  \
-numa node,memdev=mem-mem51  \
-numa node,memdev=mem-mem52  \
-numa node,memdev=mem-mem53  \
-numa node,memdev=mem-mem54  \
-numa node,memdev=mem-mem55  \
-numa node,memdev=mem-mem56  \
-numa node,memdev=mem-mem57  \
-numa node,memdev=mem-mem58  \
-numa node,memdev=mem-mem59  \
-numa node,memdev=mem-mem60  \
-numa node,memdev=mem-mem61  \
-numa node,memdev=mem-mem62  \
-numa node,memdev=mem-mem63  \
-cpu 'host' \
-chardev socket,server=on,id=qmp_id_qmpmonitor1,path=/tmp/monitor-qmpmonitor1-20220117-203351-6r2MJ5Z5,wait=off  \
-mon chardev=qmp_id_qmpmonitor1,mode=control \
-chardev socket,server=on,id=qmp_id_catch_monitor,path=/tmp/monitor-catch_monitor-20220117-203351-6r2MJ5Z5,wait=off  \
-mon chardev=qmp_id_catch_monitor,mode=control  \
-serial unix:'/tmp/serial-serial0-20220117-203351-6r2MJ5Z5',server=on,wait=off \
-device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
-device qemu-xhci,id=usb1,bus=pcie-root-port-2,addr=0x0 \
-device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
-device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
-device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-3,addr=0x0 \
-blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/rhel900-aarch64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
-blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
-device scsi-hd,id=image1,drive=drive_image1,write-cache=on \
-device pcie-root-port,id=pcie-root-port-4,port=0x4,addr=0x1.0x4,bus=pcie.0,chassis=5 \
-device virtio-net-pci,mac=9a:c3:8a:ec:54:9d,rombar=0,id=idu0OOqB,netdev=id1YGvvJ,bus=pcie-root-port-4,addr=0x0  \
-netdev tap,id=id1YGvvJ,vhost=on  \
-vnc :20  \
-rtc base=utc,clock=host,driftfix=slew \
-enable-kvm \
-device pcie-root-port,id=pcie_extra_root_port_0,multifunction=on,bus=pcie.0,addr=0x2,chassis=6 \
-device pcie-root-port,id=pcie_extra_root_port_1,addr=0x2.0x1,bus=pcie.0,chassis=7 \
-monitor stdio 

(qemu) info status
VM status: running
(qemu) info numa
64 nodes
node 0 cpus: 0
node 0 size: 64 MB
node 0 plugged: 0 MB

2. check the serial log:    ----- guest hangs
# nc -U /tmp/serial-serial0-20220117-203351-6r2MJ5Z5
2022-01-17 20:33:59: UEFI firmware starting.
2022-01-17 20:33:59: ASSERT [MemoryInit] /builddir/build/BUILD/edk2-e1999b264f1f/ArmVirtPkg/Library/QemuVirtMemInfoLib/QemuVirtMemInfoPeiLibConstructor.c(93): NewSize >= 0x08000000


Additional info:
Booting the guest with 64 NUMA nodes of 128M each succeeds.

-m 8192 \
-object memory-backend-ram,size=128M,prealloc=yes,policy=default,id=mem-mem0 \
....
-object memory-backend-ram,size=128M,prealloc=yes,policy=default,id=mem-mem63  \

Comment 10 Zhenyu Zhang 2022-01-18 01:57:58 UTC
BTW do you think the error in Comment 3 needs to be reported as a new bug to track it down?

Comment 11 Guowen Shan 2022-01-18 06:35:41 UTC
(In reply to Zhenyu Zhang from comment #10)
> BTW do you think the error in Comment 3 needs to be reported as a new bug to
> track it down?
>

Since we already used this bugzilla to start the discussion, it's fine to reuse
the current bugzilla to track the issue mentioned in comment#3.

Comment 12 Guowen Shan 2022-01-18 07:58:01 UTC
I believe it's an EDK2 bug. The issue also exists in upstream QEMU and EDK2.
I guess we need an EDK2 developer to be involved here. In the following source
file, there is a comment explaining why 128MB is required. However, the code
misses at least one case: multiple memory devices in the device-tree that
are contiguous in terms of their address ranges.

I assume the permanent PEI RAM will use up to 128MB. However, @NewSize is
the size of the memory device whose base address is lowest. That means it is
not the whole memory size the machine has, especially when multiple memory
devices are contiguous in their address ranges.

  edk2/ArmVirtPkg/Library/QemuVirtMemInfoLib/QemuVirtMemInfoPeiLibConstructor.c

  RETURN_STATUS
  EFIAPI
  QemuVirtMemInfoPeiLibConstructor (
    VOID
    )
  {
    :
    //
    // We need to make sure that the machine we are running on has at least
    // 128 MB of memory configured, and is currently executing this binary from
    // NOR flash. This prevents a device tree image in DRAM from getting
    // clobbered when our caller installs permanent PEI RAM, before we have a
    // chance of marking its location as reserved or copy it to a freshly
    // allocated block in the permanent PEI RAM in the platform PEIM.
    //
    ASSERT (NewSize >= SIZE_128MB);
    :
}


First of all, the issue can be reproduced with the following command lines
on upstream qemu; the boot gets stuck.

  /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 \
  -accel kvm -machine virt,gic-version=host               \
  -cpu host -smp 8,sockets=2,cores=2,threads=2            \
  -m 1024M,slots=16,maxmem=64G                            \
  -object memory-backend-ram,id=mem0,size=64M             \
  -object memory-backend-ram,id=mem1,size=960M            \
  -numa node,nodeid=0,cpus=0-3,memdev=mem0                \
  -numa node,nodeid=1,cpus=4-7,memdev=mem1                \
    :

The issue disappears when the following command lines are used.

  -object memory-backend-ram,id=mem0,size=960M              \
  -object memory-backend-ram,id=mem1,size=64M               \

Comment 13 Guowen Shan 2022-01-18 08:34:56 UTC
(In reply to Guowen Shan from comment #11)
> (In reply to Zhenyu Zhang from comment #10)
> > BTW do you think the error in Comment 3 needs to be reported as a new bug to
> > track it down?
> >
> 
> Since we already use this bugzilla to start the discussion, it's fine to
> reuse
> the current bugzilla to track the issue, mentioned in comment#3
>

Zhenyu, please help to create another bug to track the issue reported in
comment#3 and assign it to me. I thought both issues shared the same root
cause, but that is unlikely to be true.

Thanks,
Gavin

Comment 14 Zhenyu Zhang 2022-01-20 03:47:06 UTC
(In reply to Guowen Shan from comment #13)
> Zhenyu, please help to create another bug to track the issue reported from
> comment#3 and assign to me. I was thinking both two issues share same root
> cause, but it's unlikely true.
> 
> Thanks,
> Gavin

Got it, created the following bug to track the issue.
Bug 2041823 - [aarch64][numa] When there are at least 6 Numa nodes serial log shows 'arch topology borken'

Comment 15 Gerd Hoffmann 2022-01-25 06:37:43 UTC
> I assume the permanent PEI RAM will use up to 128MB. However, @NewSize is
> the size of memory device whose base address is lowest. It means it's not
> the whole memory size that the machine has, especially when multiple memory
> devices are contiguous in their address ranges.

> First of all, the issue can be reproduced with the following command lines
> on upstream qemu, the booting is stuck.
> 
>   /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 \
>   -accel kvm -machine virt,gic-version=host               \
>   -cpu host -smp 8,sockets=2,cores=2,threads=2            \
>   -m 1024M,slots=16,maxmem=64G                            \
>   -object memory-backend-ram,id=mem0,size=64M             \
>   -object memory-backend-ram,id=mem1,size=960M            \
>   -numa node,nodeid=0,cpus=0-3,memdev=mem0                \
>   -numa node,nodeid=1,cpus=4-7,memdev=mem1                \

Why does this matter in the first place?
Creating numa nodes which are that small looks rather pointless to me.

I'd suggest simply documenting that 128M is the smallest supported numa
node size and being done with it (unless someone can come up with a good
argument why such a configuration makes sense).

Comment 16 Zhenyu Zhang 2022-01-25 10:53:31 UTC
(In reply to Gerd Hoffmann from comment #15)

> I'd suggest to simply document that 128M is the smallest numa node size
> supported and be done with it (unless someone can come up with a good
> argument why such a configuration makes sense).

Hello Gerd,

I admit it may not be that important, but we should have a reasonable explanation.
Both arm and x86_64 currently use a 4096-byte page size, yet x86 boots successfully with a 32M node size.
ARM doubles that size and it still doesn't work.
Our goal is for ARM to gradually support the same functionality as the x86 platform.

-machine q35 \
-device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
-device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0  \
-nodefaults \
-device VGA,bus=pcie.0,addr=0x2 \
-smp 6,maxcpus=6,cores=3,threads=1,dies=1,sockets=2  \
-m 1024M,slots=16,maxmem=64G                            \
-object memory-backend-ram,id=mem0,size=64M             \   -----------------------> boot successfully
-object memory-backend-ram,id=mem1,size=960M            \
-numa node,nodeid=0,cpus=0-3,memdev=mem0                \
-numa node,nodeid=1,cpus=4-7,memdev=mem1                \

Comment 17 Laszlo Ersek 2022-01-25 11:41:59 UTC
I agree with Gerd (comment 15); the ArmVirtQemu platform simply has this requirement that the "system RAM" starting at GPA 1GiB be at least 128MiB in size. This is a foundational design tenet of the ArmVirtQemu platform.

The PEI phase is in general not flexible enough to deal with any random physical RAM layout; there must be some RAM that is *always* available at a fixed address and in some previously-known minimum size. For ArmVirtQemu, that means 128MiB at 1GiB.

For the pc and q35 machine types, the same requirements exist, only the numbers differ -- we expect (IIRC) 128MiB in size just the same, and based at GPA 0 -- with the usual x86 RAM holes under 1MB and 4GB.

In ArmVirtQemu, "OvmfPkg/Fdt/HighMemDxe" handles *additional* memory areas. They are discovered from QEMU's device tree, and the RAM discovered in this way is made available to the firmware (and later to the OS) during the DXE phase. Note: *not before* the DXE phase; so any RAM discovered here is unavailable for use during PEI.

In "ArmVirtPkg/Library/QemuVirtMemInfoLib/QemuVirtMemInfoPeiLibConstructor.c", we have an explicit sanity check that the hard-coded memory base (1GiB) that we assume everywhere else in SEC and PEI, namely "PcdSystemMemoryBase", equals the lowest-address memory range exposed by QEMU in the DTB. If that assumption is shown incorrect, the platform must not continue booting.

(For example, PcdCPUCoresStackBase = 1GiB + 0x7c000, per "ArmVirtQemu.dsc", and that constant is used at the earliest stage of guest boot, namely in SEC, ArmPlatformPkg/PrePeiCore.)

... I guess QemuVirtMemInfoPeiLibConstructor() [ArmVirtPkg/Library/QemuVirtMemInfoLib/QemuVirtMemInfoPeiLibConstructor.c] could be improved to build the largest contiguous RAM range that starts at the lowest address, and then check if [1GiB, 1GiB+128MiB) is a subset of that interval. But what's the point?

Comment 18 Zhenyu Zhang 2022-01-25 23:50:58 UTC
(In reply to Laszlo Ersek from comment #17)
> I agree with Gerd (comment 15); the ArmVirtQemu platform simply has this
> requirement that the "system RAM" starting at GPA 1GiB be at least 128MiB in
> size. This is a foundational design tenet of the ArmVirtQemu platform.



Thanks for the updated information. I think this is a good explanation. 
If we recommend a minimum 128MiB node size for ARM, could we increase the prompt?
Instead of hanging, we can check and report errors when qemu boot guests.

Comment 19 Laszlo Ersek 2022-01-26 08:41:57 UTC
(In reply to Zhenyu Zhang from comment #18)

> If we recommend a minimum 128MiB node size for ARM, could we increase the
> prompt?
> Instead of hanging, we can check and report errors when qemu boot guests.

Not sure what you mean by "increase the prompt" -- if that's about raising the limit in QEMU, it seems OK to me.

The firmware is very limited in reporting issues, let alone in recovering from them. Especially at such an early stage as PEI. In such situations the firmware commonly logs an error and intentionally hangs.

Comment 20 Zhenyu Zhang 2022-01-26 09:01:38 UTC
(In reply to Laszlo Ersek from comment #19)
> Not sure what you mean by "increase the prompt" -- if that's about raising
> the limit in QEMU, it seems OK to me.

Sure, I also think it would be very user-friendly to add a hint in QEMU.
Of course, it's up to you to decide how to do it.
A message like the following would be helpful, so that instead of booting and hanging, QEMU reports the problem directly:

# /usr/libexec/qemu-kvm -cpu 'host' -smp 64
qemu-kvm: warning: Number of SMP cpus requested (64) exceeds the recommended cpus supported by KVM (32)
Number of SMP cpus requested (64) exceeds the maximum cpus supported by KVM (32)

Comment 21 Gerd Hoffmann 2022-01-26 10:03:36 UTC
> Sure I also think it would be very friendly to add a hint to QEMU.

Yes, having qemu check that and throw an error makes sense.
It's much easier for the user to discover what the problem is.

Comment 22 Guowen Shan 2022-01-28 07:06:46 UTC
I don't think it's a good idea to have the check in QEMU, because
that binds QEMU and EDK2 together. It'd be better to keep them
separate and detached.

Gerd, is there any way to print an error message in EDK2 for
this specific case?

Comment 23 Gerd Hoffmann 2022-01-28 11:44:41 UTC
(In reply to Guowen Shan from comment #22)
> Gerd, is there any way to print an error message in EDK2 for
> this specific case?

Laszlo answered that one already (comment 19):

> The firmware is very limited in reporting issues, let alone in recovering
> from them. Especially at such an early stage as PEI. In such situations
> the firmware commonly logs an error and intentionally hangs.

qemu is in a *much* better position to print a helpful error message,
specifically when it comes to giving the user hints on how the config
should be fixed.

Comment 24 Laszlo Ersek 2022-01-31 09:58:38 UTC
I'd also like to emphasize a general concept here (which I've come to learn way too late):

the idea that the platform (QEMU) and the firmware (SeaBIOS, edk2) should be *completely independent* is wrong. They should be kept as independent as is *easily* doable, but not more independent than that. Let's ask ourselves: if the same issue presented itself on the Hyper-V platform (meaning all of: hypervisor, management UI, and guest firmware), where would Microsoft implement the error? (NB, for another example: While keeping ACPI generation in QEMU is a good thing, let's not forget the original problem that required that solution: the impossibility of treating SeaBIOS and QEMU together as a product, upstream. This split between the developer communities led to incredible development pains and the need for an elaborate information channel. We should have as few of those as possible.)

Comment 25 Guowen Shan 2022-02-28 08:08:47 UTC
Ok. I've posted a QEMU patch for comments. Let's see what comments I
receive from the upstream community.

https://lists.nongnu.org/archive/html/qemu-arm/2022-02/msg00445.html

Comment 26 Guowen Shan 2022-03-07 07:20:18 UTC
Posted the v1 patch upstream for comments. We still think the best
place to have the check is EDK2. Laszlo and Gerd, feel free to
reassign this bug to yourselves if it's fine to provide the user-friendly
error message in EDK2.

https://lore.kernel.org/all/20220301114257.2bppjnjqj7dgxztc@sirius.home.kraxel.org/T/

Comment 27 Laszlo Ersek 2022-03-07 09:13:05 UTC
I've replied in the qemu-devel thread.

Comment 28 Guowen Shan 2022-03-15 08:02:20 UTC
I discussed this with QA (Zhenyu) and we agreed to close it as "won't fix".
From the edk2 ASSERT, we already know the memory in the first
NUMA node shouldn't be less than 128MB, so it's fine not to provide
an explicit error message from the edk2 side.

Comment 29 Luiz Capitulino 2022-03-21 14:19:45 UTC
Clearing NEEDINFO as the discussion occurred upstream and the decision has been made to close the BZ.

