Description of problem:
[ppc64le][numa][regression] The maximum NUMA node count conflicts between the guest and qemu

Version-Release number of selected component (if applicable):
qemu-kvm-5.2.0-3.module+el8.4.0+9499+42e58f08

How reproducible:
3/3

Steps to Reproduce:
1. Boot a guest with more than 128 NUMA nodes (129.sh, attached below)

Actual results:
[root@ibm-p9wr-02 home]# sh 129.sh
QEMU 5.2.0 monitor - type 'help' for more information
(qemu) qemu-kvm: -numa node,memdev=mem-mem128: Max number of NUMA nodes reached: 128

Meanwhile, with the current command line, the guest reports:
# cat /sys/devices/system/node/possible
0-255

The two numbers do not match: going by the guest output we should support 256 nodes, not the 128 that qemu enforces. With older builds the issue was not reproducible (both the guest and qemu showed 128 nodes):

qemu-kvm-5.1.0-14.module+el8.3.0+8438+644aff69.ppc64le
[root@localhost ~]# cat /sys/devices/system/node/possible
cat /sys/devices/system/node/possible
0-127

This looks like a regression; please help to fix it, thanks.

Expected results:
The maximum NUMA node number should be the same at the guest and qemu levels.

Additional info:
Guest/host kernels: kernel-4.18.0-276.el8.ppc64le / 4.18.0-274.el8.ppc64le
Created attachment 1749292 [details] 129
Created attachment 1749293 [details] 128
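The attached scripts are not inlined in this report. For illustration, here is a rough sketch of what a reproducer along those lines might look like; the machine type, memory sizes, and loop structure are assumptions rather than the actual attachment contents (only the mem-memN backend naming comes from the error message above):

#!/bin/bash
# Hypothetical reproducer sketch: request N NUMA nodes, each backed by its own
# memory backend. N=128 boots; N=129 trips the qemu "Max number of NUMA nodes
# reached: 128" error shown in the description.
N=129
CMD="qemu-kvm -machine pseries -nodefaults -nographic -smp ${N} -m $((N * 256))M"
for i in $(seq 0 $((N - 1))); do
    CMD+=" -object memory-backend-ram,id=mem-mem${i},size=256M"
    CMD+=" -numa node,memdev=mem-mem${i}"
done
# Add the usual boot options (kernel/initrd or disk image) before running.
$CMD

# Inside the guest:
# cat /sys/devices/system/node/possible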
I didn't reproduce it on the x86 platform with qemu-kvm-5.2.0-3.module+el8.4.0+9499+42e58f08.x86_64.
AFAICT, the only issue here is that qemu has a limit of 128 NUMA nodes, whereas the guest kernel has a limit of 256 nodes, which is arguably not a bug. Since we use a common kernel for all platforms, we might have legitimate reasons to have a large limit in the kernel, to support more nodes on bare metal or PowerVM LPAR deployments. On x86, in what way does it not reproduce? Does qemu allow for more nodes? Or does the kernel report a lower number of possible nodes?
(In reply to David Gibson from comment #5)
> AFAICT, the only issue here is that qemu has a limit of 128 NUMA nodes,
> whereas the guest kernel has a limit of 256 nodes, which is arguably not a
> bug. Since we use a common kernel for all platforms, we might have
> legitimate reasons to have a large limit in the kernel, to support more
> nodes on bare metal or PowerVM LPAR deployments.
>
> On x86, in what way does it not reproduce? Does qemu allow for more nodes?
> Or does the kernel report a lower number of possible nodes?

Hi David,

On x86 the guest reported a lower number of possible nodes:

[root@localhost ~]# cat /sys/devices/system/node/possible
cat /sys/devices/system/node/possible
0-127

Thanks
Min
(In reply to David Gibson from comment #5)
> AFAICT, the only issue here is that qemu has a limit of 128 NUMA nodes,
> whereas the guest kernel has a limit of 256 nodes, which is arguably not a
> bug. Since we use a common kernel for all platforms, we might have
> legitimate reasons to have a large limit in the kernel, to support more
> nodes on bare metal or PowerVM LPAR deployments.

I understand your points, but this did not happen on older builds.
Ok, what happens on older builds? Both with 128 nodes specified in qemu, and with a smaller number of nodes specified in qemu? This still seems more or less a cosmetic problem to me.
Min,

What is happening here is that the pseries machine is reporting double the number of NUMA nodes specified on the command line. This is easily reproducible: just create a guest with 2 nodes and /sys/devices/system/node/possible will report 4, 4 user nodes will become 8, and so on.

David, I think we have a lucky break on this one. This is another instance of the bug I already fixed upstream, the one that was reported by Cedric:

https://lists.gnu.org/archive/html/qemu-devel/2021-01/msg07408.html

As such, this bug also happens with upstream QEMU, and with the above fix this is the output when running the 128.sh script:

# cat /sys/devices/system/node/possible
0-127

The patches were reviewed and accepted, but they haven't landed upstream yet. I would do a backport, but I'm not sure whether I can backport patches from David's tree. Let me know if that's allowed and I'll backport them. Otherwise we can wait for them to land upstream.
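For reference, a minimal way to observe the doubling described above; the command line here is an illustrative sketch (machine options, backend names, and sizes are assumptions), not taken from the attached scripts:

qemu-kvm -machine pseries -nographic -smp 2 -m 2G \
    -object memory-backend-ram,id=mem0,size=1G -numa node,memdev=mem0 \
    -object memory-backend-ram,id=mem1,size=1G -numa node,memdev=mem1 \
    <plus the usual kernel/disk boot options>

# In the guest, before the fix (pseries reports double the requested nodes):
# cat /sys/devices/system/node/possible
# 0-3
#
# With the fix applied, the count matches what was requested:
# cat /sys/devices/system/node/possible
# 0-1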
Thanks for the analysis, Daniel. You'll need to wait until the upstream merge before backporting this. I'm not yet sure when my next pull request will be; probably 1-2 weeks away. Since this is now low priority and aimed at RHEL8.5, that shouldn't be a big problem.
> qemu-kvm-5.1.0-14.module+el8.3.0+8438+644aff69.ppc64le
> [root@localhost ~]# cat /sys/devices/system/node/possible
> cat /sys/devices/system/node/possible
> 0-127
>
> Expected results:
> The maximum NUMA node number should be the same at the guest and qemu levels.
>
> Additional info:

According to the bug's description, the number of NUMA nodes reported by qemu and by the guest matched on the old build.
The fix is now upstream, so we should get it via the rebase.
Hi mrezanin,

Can you take a look at this bz since it's in Devmissed status now? Per lvivier, the fix is included in the rebase code. Should we move it to MODIFIED directly, or do something else?

Thanks.

Best regards,
Min
Verified the bug on build qemu-kvm-6.0.0-16.module+el8.5.0+10848+2dccc46d.ppc64le.

For the steps, please refer to the description. The original issue has been fixed, thanks.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (virt:av bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:4684