Bug 1443877
Summary: | All the memory was assigned to the last node when guest booted up with 128 nodes | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Min Deng <mdeng> |
Component: | qemu-kvm-rhev | Assignee: | Laurent Vivier <lvivier> |
Status: | CLOSED ERRATA | QA Contact: | Min Deng <mdeng> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 7.4 | CC: | dgibson, knoel, lvivier, mdeng, michen, mrezanin, qzhang, thuth, virt-maint |
Target Milestone: | rc | ||
Target Release: | 7.5 | ||
Hardware: | ppc64le | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | qemu-kvm-rhev-2.10.0-1.el7 | Doc Type: | If docs needed, set a value |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2018-04-11 00:16:25 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Min Deng
2017-04-20 08:04:57 UTC
Additional information: when the guest has only two nodes, the memory is split evenly.

CLI:
/usr/libexec/qemu-kvm -name virt-tests-vm1 -sandbox off -machine pseries-rhel7.4.0 -nodefaults -vga std -device virtio-blk-pci,id=virtio_blk_pci0,disable-legacy=off,disable-modern=off,drive=drive_image1 -drive id=drive_image1,if=none,cache=none,aio=native,format=qcow2,file=rhel74-ppc64-virtio.qcow2 -qmp tcp:0:4444,server,nowait -vnc :1 -rtc base=utc,clock=host,driftfix=slew -boot order=cdn,once=c,menu=off,strict=off -monitor stdio -device nec-usb-xhci,id=usb1 -device usb-kbd,id=input0 -device usb-mouse,id=input1 -device usb-tablet,id=input2 -netdev tap,script=/etc/qemu-ifup,downscript=/etc/qemu-down,id=hostnet1,vhost=on -device virtio-net-pci,netdev=hostnet1,id=net1,mac=00:52:11:36:3f:01 -machine accel=kvm:tcg -chardev socket,id=serial_id_serial0,path=/tmp/min,server,nowait -device spapr-vty,reg=0x30000000,chardev=serial_id_serial0 -realtime mlock=on -m 24G -smp 12 -numa node -numa node

QEMU 2.8.92 monitor - type 'help' for more information
(qemu)
(qemu)
(qemu) info numa
2 nodes
node 0 cpus: 0 2 4 6 8 10
node 0 size: 12288 MB
node 1 cpus: 1 3 5 7 9 11
node 1 size: 12288 MB
(qemu)

The memory was split evenly between the two nodes.

I suspect I know roughly what the problem here is: the smallest amount which can be allocated to any single node is 256M. That's the smallest granularity that the firmware / hypervisor can represent in describing the NUMA configuration. 24G (total memory) / 128 nodes is 192M, which is < 256M. Now obviously we're not distributing as evenly as we could, even with that restriction, but I suspect we're just seeing <1 LMB, rounding down to 0 for each node, then all the leftovers end up in the last node.

To check this, can you try the following scenarios:

1. Change total memory to 32G: I expect that to give 128 nodes each with 256M of memory
2. Go back to 24G of memory, but reduce to 96 nodes: I expect this to give 96 nodes each with 256M of memory.
3. 
Try 24G of memory with 97 nodes: I expect this to give 96 nodes with 0 memory and 1 node with 24G of memory. This is to confirm the problem occurs as soon as there is less than (256M * #nodes) of memory. > To check this, can you try the following scenarious: > > 1. Change total memory to 32G: I expect that to give 128 nodes each with > 256M of memory Test result, /usr/libexec/qemu-kvm -name virt-tests-vm1 -sandbox off -machine pseries-rhel7.4.0 -nodefaults -vga std -device virtio-blk-pci,id=virtio_blk_pci0,disable-legacy=off,disable-modern=off,drive=drive_image1 -drive id=drive_image1,if=none,cache=none,aio=native,format=qcow2,file=rhel74-ppc64-virtio.qcow2 -qmp tcp:0:4444,server,nowait -vnc :1 -rtc base=utc,clock=host,driftfix=slew -boot order=cdn,once=c,menu=off,strict=off -monitor stdio -device nec-usb-xhci,id=usb1 -device usb-kbd,id=input0 -device usb-mouse,id=input1 -device usb-tablet,id=input2 -netdev tap,script=/etc/qemu-ifup,downscript=/etc/qemu-down,id=hostnet1,vhost=on -device virtio-net-pci,netdev=hostnet1,id=net1,mac=00:52:11:36:3f:01 -machine accel=kvm:tcg -chardev socket,id=serial_id_serial0,path=/tmp/min,server,nowait -device spapr-vty,reg=0x30000000,chardev=serial_id_serial0 -realtime mlock=on -m 32G -smp 12 -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa 
node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node QEMU 2.8.92 monitor - type 'help' for more information (qemu) (qemu) info numa 128 nodes node 0 cpus: 0 node 0 size: 256 MB node 1 cpus: 1 node 1 size: 256 MB node 2 cpus: 2 node 2 size: 256 MB node 3 cpus: 3 node 3 size: 256 MB node 4 cpus: 4 node 4 size: 256 MB node 5 cpus: 5 node 5 size: 256 MB node 6 cpus: 6 node 6 size: 256 MB node 7 cpus: 7 node 7 size: 256 MB node 8 cpus: 8 node 8 size: 256 MB node 9 cpus: 9 node 9 size: 256 MB node 10 cpus: 10 node 10 size: 256 MB node 11 cpus: 11 node 11 size: 256 MB node 12 cpus: node 12 size: 256 MB node 13 cpus: node 13 size: 256 MB node 14 cpus: node 14 size: 256 MB node 15 cpus: node 15 size: 256 MB node 16 cpus: node 16 size: 256 MB node 17 cpus: node 17 size: 256 MB node 18 cpus: node 18 size: 256 MB node 19 cpus: node 19 size: 256 MB node 20 cpus: node 20 size: 256 MB node 21 cpus: node 21 size: 256 MB node 22 cpus: node 22 size: 256 MB node 23 cpus: node 23 size: 256 MB node 24 cpus: node 24 size: 256 MB node 25 cpus: node 25 size: 256 MB node 26 cpus: node 26 size: 256 MB node 27 cpus: node 27 size: 256 MB node 28 cpus: node 28 size: 256 MB node 29 cpus: node 29 size: 256 MB node 30 cpus: node 30 size: 256 MB node 31 cpus: node 31 size: 256 MB node 32 cpus: node 32 size: 256 MB node 33 cpus: node 33 size: 256 MB node 34 cpus: node 34 size: 256 MB node 35 cpus: node 35 
size: 256 MB node 36 cpus: node 36 size: 256 MB node 37 cpus: node 37 size: 256 MB node 38 cpus: node 38 size: 256 MB node 39 cpus: node 39 size: 256 MB node 40 cpus: node 40 size: 256 MB node 41 cpus: node 41 size: 256 MB node 42 cpus: node 42 size: 256 MB node 43 cpus: node 43 size: 256 MB node 44 cpus: node 44 size: 256 MB node 45 cpus: node 45 size: 256 MB node 46 cpus: node 46 size: 256 MB node 47 cpus: node 47 size: 256 MB node 48 cpus: node 48 size: 256 MB node 49 cpus: node 49 size: 256 MB node 50 cpus: node 50 size: 256 MB node 51 cpus: node 51 size: 256 MB node 52 cpus: node 52 size: 256 MB node 53 cpus: node 53 size: 256 MB node 54 cpus: node 54 size: 256 MB node 55 cpus: node 55 size: 256 MB node 56 cpus: node 56 size: 256 MB node 57 cpus: node 57 size: 256 MB node 58 cpus: node 58 size: 256 MB node 59 cpus: node 59 size: 256 MB node 60 cpus: node 60 size: 256 MB node 61 cpus: node 61 size: 256 MB node 62 cpus: node 62 size: 256 MB node 63 cpus: node 63 size: 256 MB node 64 cpus: node 64 size: 256 MB node 65 cpus: node 65 size: 256 MB node 66 cpus: node 66 size: 256 MB node 67 cpus: node 67 size: 256 MB node 68 cpus: node 68 size: 256 MB node 69 cpus: node 69 size: 256 MB node 70 cpus: node 70 size: 256 MB node 71 cpus: node 71 size: 256 MB node 72 cpus: node 72 size: 256 MB node 73 cpus: node 73 size: 256 MB node 74 cpus: node 74 size: 256 MB node 75 cpus: node 75 size: 256 MB node 76 cpus: node 76 size: 256 MB node 77 cpus: node 77 size: 256 MB node 78 cpus: node 78 size: 256 MB node 79 cpus: node 79 size: 256 MB node 80 cpus: node 80 size: 256 MB node 81 cpus: node 81 size: 256 MB node 82 cpus: node 82 size: 256 MB node 83 cpus: node 83 size: 256 MB node 84 cpus: node 84 size: 256 MB node 85 cpus: node 85 size: 256 MB node 86 cpus: node 86 size: 256 MB node 87 cpus: node 87 size: 256 MB node 88 cpus: node 88 size: 256 MB node 89 cpus: node 89 size: 256 MB node 90 cpus: node 90 size: 256 MB node 91 cpus: node 91 size: 256 MB node 92 cpus: node 92 
size: 256 MB node 93 cpus: node 93 size: 256 MB node 94 cpus: node 94 size: 256 MB node 95 cpus: node 95 size: 256 MB node 96 cpus: node 96 size: 256 MB node 97 cpus: node 97 size: 256 MB node 98 cpus: node 98 size: 256 MB node 99 cpus: node 99 size: 256 MB node 100 cpus: node 100 size: 256 MB node 101 cpus: node 101 size: 256 MB node 102 cpus: node 102 size: 256 MB node 103 cpus: node 103 size: 256 MB node 104 cpus: node 104 size: 256 MB node 105 cpus: node 105 size: 256 MB node 106 cpus: node 106 size: 256 MB node 107 cpus: node 107 size: 256 MB node 108 cpus: node 108 size: 256 MB node 109 cpus: node 109 size: 256 MB node 110 cpus: node 110 size: 256 MB node 111 cpus: node 111 size: 256 MB node 112 cpus: node 112 size: 256 MB node 113 cpus: node 113 size: 256 MB node 114 cpus: node 114 size: 256 MB node 115 cpus: node 115 size: 256 MB node 116 cpus: node 116 size: 256 MB node 117 cpus: node 117 size: 256 MB node 118 cpus: node 118 size: 256 MB node 119 cpus: node 119 size: 256 MB node 120 cpus: node 120 size: 256 MB node 121 cpus: node 121 size: 256 MB node 122 cpus: node 122 size: 256 MB node 123 cpus: node 123 size: 256 MB node 124 cpus: node 124 size: 256 MB node 125 cpus: node 125 size: 256 MB node 126 cpus: node 126 size: 256 MB node 127 cpus: node 127 size: 256 MB > 2. Go back to 24G of memory, but reduce to 96 nodes: I expect this to > give 96 nodes each with 256M of memory. 
Test results, /usr/libexec/qemu-kvm -name virt-tests-vm1 -sandbox off -machine pseries-rhel7.4.0 -nodefaults -vga std -device virtio-blk-pci,id=virtio_blk_pci0,disable-legacy=off,disable-modern=off,drive=drive_image1 -drive id=drive_image1,if=none,cache=none,aio=native,format=qcow2,file=rhel74-ppc64-virtio.qcow2 -qmp tcp:0:4444,server,nowait -vnc :1 -rtc base=utc,clock=host,driftfix=slew -boot order=cdn,once=c,menu=off,strict=off -monitor stdio -device nec-usb-xhci,id=usb1 -device usb-kbd,id=input0 -device usb-mouse,id=input1 -device usb-tablet,id=input2 -netdev tap,script=/etc/qemu-ifup,downscript=/etc/qemu-down,id=hostnet1,vhost=on -device virtio-net-pci,netdev=hostnet1,id=net1,mac=00:52:11:36:3f:01 -machine accel=kvm:tcg -chardev socket,id=serial_id_serial0,path=/tmp/min,server,nowait -device spapr-vty,reg=0x30000000,chardev=serial_id_serial0 -realtime mlock=on -m 24G -smp 12 -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node QEMU 2.8.92 monitor - type 'help' for more 
information (qemu) info numa 96 nodes node 0 cpus: 0 node 0 size: 256 MB node 1 cpus: 1 node 1 size: 256 MB node 2 cpus: 2 node 2 size: 256 MB node 3 cpus: 3 node 3 size: 256 MB node 4 cpus: 4 node 4 size: 256 MB node 5 cpus: 5 node 5 size: 256 MB node 6 cpus: 6 node 6 size: 256 MB node 7 cpus: 7 node 7 size: 256 MB node 8 cpus: 8 node 8 size: 256 MB node 9 cpus: 9 node 9 size: 256 MB node 10 cpus: 10 node 10 size: 256 MB node 11 cpus: 11 node 11 size: 256 MB node 12 cpus: node 12 size: 256 MB node 13 cpus: node 13 size: 256 MB node 14 cpus: node 14 size: 256 MB node 15 cpus: node 15 size: 256 MB node 16 cpus: node 16 size: 256 MB node 17 cpus: node 17 size: 256 MB node 18 cpus: node 18 size: 256 MB node 19 cpus: node 19 size: 256 MB node 20 cpus: node 20 size: 256 MB node 21 cpus: node 21 size: 256 MB node 22 cpus: node 22 size: 256 MB node 23 cpus: node 23 size: 256 MB node 24 cpus: node 24 size: 256 MB node 25 cpus: node 25 size: 256 MB node 26 cpus: node 26 size: 256 MB node 27 cpus: node 27 size: 256 MB node 28 cpus: node 28 size: 256 MB node 29 cpus: node 29 size: 256 MB node 30 cpus: node 30 size: 256 MB node 31 cpus: node 31 size: 256 MB node 32 cpus: node 32 size: 256 MB node 33 cpus: node 33 size: 256 MB node 34 cpus: node 34 size: 256 MB node 35 cpus: node 35 size: 256 MB node 36 cpus: node 36 size: 256 MB node 37 cpus: node 37 size: 256 MB node 38 cpus: node 38 size: 256 MB node 39 cpus: node 39 size: 256 MB node 40 cpus: node 40 size: 256 MB node 41 cpus: node 41 size: 256 MB node 42 cpus: node 42 size: 256 MB node 43 cpus: node 43 size: 256 MB node 44 cpus: node 44 size: 256 MB node 45 cpus: node 45 size: 256 MB node 46 cpus: node 46 size: 256 MB node 47 cpus: node 47 size: 256 MB node 48 cpus: node 48 size: 256 MB node 49 cpus: node 49 size: 256 MB node 50 cpus: node 50 size: 256 MB node 51 cpus: node 51 size: 256 MB node 52 cpus: node 52 size: 256 MB node 53 cpus: node 53 size: 256 MB node 54 cpus: node 54 size: 256 MB node 55 cpus: node 55 size: 
256 MB node 56 cpus: node 56 size: 256 MB node 57 cpus: node 57 size: 256 MB node 58 cpus: node 58 size: 256 MB node 59 cpus: node 59 size: 256 MB node 60 cpus: node 60 size: 256 MB node 61 cpus: node 61 size: 256 MB node 62 cpus: node 62 size: 256 MB node 63 cpus: node 63 size: 256 MB node 64 cpus: node 64 size: 256 MB node 65 cpus: node 65 size: 256 MB node 66 cpus: node 66 size: 256 MB node 67 cpus: node 67 size: 256 MB node 68 cpus: node 68 size: 256 MB node 69 cpus: node 69 size: 256 MB node 70 cpus: node 70 size: 256 MB node 71 cpus: node 71 size: 256 MB node 72 cpus: node 72 size: 256 MB node 73 cpus: node 73 size: 256 MB node 74 cpus: node 74 size: 256 MB node 75 cpus: node 75 size: 256 MB node 76 cpus: node 76 size: 256 MB node 77 cpus: node 77 size: 256 MB node 78 cpus: node 78 size: 256 MB node 79 cpus: node 79 size: 256 MB node 80 cpus: node 80 size: 256 MB node 81 cpus: node 81 size: 256 MB node 82 cpus: node 82 size: 256 MB node 83 cpus: node 83 size: 256 MB node 84 cpus: node 84 size: 256 MB node 85 cpus: node 85 size: 256 MB node 86 cpus: node 86 size: 256 MB node 87 cpus: node 87 size: 256 MB node 88 cpus: node 88 size: 256 MB node 89 cpus: node 89 size: 256 MB node 90 cpus: node 90 size: 256 MB node 91 cpus: node 91 size: 256 MB node 92 cpus: node 92 size: 256 MB node 93 cpus: node 93 size: 256 MB node 94 cpus: node 94 size: 256 MB node 95 cpus: node 95 size: 256 MB > 3. Try 24G of memory with 97 nodes: I expect this to give 96 nodes with > 0 memory and 1 node with 24G of memory. This is to confirm the problem > occurs as soon as there is less than (256M * #nodes) of memory. 
Test result, /usr/libexec/qemu-kvm -name virt-tests-vm1 -sandbox off -machine pseries-rhel7.4.0 -nodefaults -vga std -device virtio-blk-pci,id=virtio_blk_pci0,disable-legacy=off,disable-modern=off,drive=drive_image1 -drive id=drive_image1,if=none,cache=none,aio=native,format=qcow2,file=rhel74-ppc64-virtio.qcow2 -qmp tcp:0:4444,server,nowait -vnc :1 -rtc base=utc,clock=host,driftfix=slew -boot order=cdn,once=c,menu=off,strict=off -monitor stdio -device nec-usb-xhci,id=usb1 -device usb-kbd,id=input0 -device usb-mouse,id=input1 -device usb-tablet,id=input2 -netdev tap,script=/etc/qemu-ifup,downscript=/etc/qemu-down,id=hostnet1,vhost=on -device virtio-net-pci,netdev=hostnet1,id=net1,mac=00:52:11:36:3f:01 -machine accel=kvm:tcg -chardev socket,id=serial_id_serial0,path=/tmp/min,server,nowait -device spapr-vty,reg=0x30000000,chardev=serial_id_serial0 -realtime mlock=on -m 24G -smp 12 -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node (qemu) info numa 97 nodes node 0 cpus: 0 
node 0 size: 0 MB node 1 cpus: 1 node 1 size: 0 MB node 2 cpus: 2 node 2 size: 0 MB node 3 cpus: 3 node 3 size: 0 MB node 4 cpus: 4 node 4 size: 0 MB node 5 cpus: 5 node 5 size: 0 MB node 6 cpus: 6 node 6 size: 0 MB node 7 cpus: 7 node 7 size: 0 MB node 8 cpus: 8 node 8 size: 0 MB node 9 cpus: 9 node 9 size: 0 MB node 10 cpus: 10 node 10 size: 0 MB node 11 cpus: 11 node 11 size: 0 MB node 12 cpus: node 12 size: 0 MB node 13 cpus: node 13 size: 0 MB node 14 cpus: node 14 size: 0 MB node 15 cpus: node 15 size: 0 MB node 16 cpus: node 16 size: 0 MB node 17 cpus: node 17 size: 0 MB node 18 cpus: node 18 size: 0 MB node 19 cpus: node 19 size: 0 MB node 20 cpus: node 20 size: 0 MB node 21 cpus: node 21 size: 0 MB node 22 cpus: node 22 size: 0 MB node 23 cpus: node 23 size: 0 MB node 24 cpus: node 24 size: 0 MB node 25 cpus: node 25 size: 0 MB node 26 cpus: node 26 size: 0 MB node 27 cpus: node 27 size: 0 MB node 28 cpus: node 28 size: 0 MB node 29 cpus: node 29 size: 0 MB node 30 cpus: node 30 size: 0 MB node 31 cpus: node 31 size: 0 MB node 32 cpus: node 32 size: 0 MB node 33 cpus: node 33 size: 0 MB node 34 cpus: node 34 size: 0 MB node 35 cpus: node 35 size: 0 MB node 36 cpus: node 36 size: 0 MB node 37 cpus: node 37 size: 0 MB node 38 cpus: node 38 size: 0 MB node 39 cpus: node 39 size: 0 MB node 40 cpus: node 40 size: 0 MB node 41 cpus: node 41 size: 0 MB node 42 cpus: node 42 size: 0 MB node 43 cpus: node 43 size: 0 MB node 44 cpus: node 44 size: 0 MB node 45 cpus: node 45 size: 0 MB node 46 cpus: node 46 size: 0 MB node 47 cpus: node 47 size: 0 MB node 48 cpus: node 48 size: 0 MB node 49 cpus: node 49 size: 0 MB node 50 cpus: node 50 size: 0 MB node 51 cpus: node 51 size: 0 MB node 52 cpus: node 52 size: 0 MB node 53 cpus: node 53 size: 0 MB node 54 cpus: node 54 size: 0 MB node 55 cpus: node 55 size: 0 MB node 56 cpus: node 56 size: 0 MB node 57 cpus: node 57 size: 0 MB node 58 cpus: node 58 size: 0 MB node 59 cpus: node 59 size: 0 MB node 60 cpus: node 60 size: 
0 MB node 61 cpus: node 61 size: 0 MB node 62 cpus: node 62 size: 0 MB node 63 cpus: node 63 size: 0 MB node 64 cpus: node 64 size: 0 MB node 65 cpus: node 65 size: 0 MB node 66 cpus: node 66 size: 0 MB node 67 cpus: node 67 size: 0 MB node 68 cpus: node 68 size: 0 MB node 69 cpus: node 69 size: 0 MB node 70 cpus: node 70 size: 0 MB node 71 cpus: node 71 size: 0 MB node 72 cpus: node 72 size: 0 MB node 73 cpus: node 73 size: 0 MB node 74 cpus: node 74 size: 0 MB node 75 cpus: node 75 size: 0 MB node 76 cpus: node 76 size: 0 MB node 77 cpus: node 77 size: 0 MB node 78 cpus: node 78 size: 0 MB node 79 cpus: node 79 size: 0 MB node 80 cpus: node 80 size: 0 MB node 81 cpus: node 81 size: 0 MB node 82 cpus: node 82 size: 0 MB node 83 cpus: node 83 size: 0 MB node 84 cpus: node 84 size: 0 MB node 85 cpus: node 85 size: 0 MB node 86 cpus: node 86 size: 0 MB node 87 cpus: node 87 size: 0 MB node 88 cpus: node 88 size: 0 MB node 89 cpus: node 89 size: 0 MB node 90 cpus: node 90 size: 0 MB node 91 cpus: node 91 size: 0 MB node 92 cpus: node 92 size: 0 MB node 93 cpus: node 93 size: 0 MB node 94 cpus: node 94 size: 0 MB node 95 cpus: node 95 size: 0 MB node 96 cpus: node 96 size: 24576 MB

Hi David,

The above test results are just what you expected. From QE's perspective, it is still weird that the guest claims to support 128 nodes or 97 nodes, but only the last node receives all of the guest's memory. What about the other nodes? Are they meaningful? If I am wrong, please correct me. Thanks a lot.

Thanks,
Min

Min, thanks for the checks. That pretty much confirms my theory.

Yes, the behaviour is quite odd. We could do better in theory - handling rounding behaviour more strictly so that we end up with some nodes with 256M of RAM and other nodes with 0 RAM. However, having so many NUMA nodes with so little (comparatively) memory is a pretty unlikely case in practice.

So unless there's a realistic customer impact that I haven't seen yet, I'm inclined to close this as WONTFIX.
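The rounding behaviour confirmed by the three checks can be sketched in a few lines of C. This is only an illustration of the theory discussed above, not QEMU code: the function name `split_round_down` and its parameters are hypothetical, and the alignment shift of 28 (2^28 bytes = 256M) is an assumption taken from the 256M granularity mentioned in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a QEMU function): every node except the last
 * gets ram_size / nb_nodes rounded DOWN to a multiple of 2^align_shift
 * bytes; the last node receives whatever is left over. */
static void split_round_down(uint64_t ram_size, unsigned nb_nodes,
                             unsigned align_shift, uint64_t *node_mem)
{
    assert(nb_nodes > 0);

    uint64_t mask = ~((UINT64_C(1) << align_shift) - 1);
    uint64_t usedmem = 0;
    unsigned i;

    for (i = 0; i < nb_nodes - 1; i++) {
        /* With less than one 256M block per node this rounds to 0. */
        node_mem[i] = (ram_size / nb_nodes) & mask;
        usedmem += node_mem[i];
    }
    /* All the leftovers end up in the last node. */
    node_mem[i] = ram_size - usedmem;
}
```

Under these assumptions, calling it with 24G and 97 nodes leaves the first 96 entries at 0 and puts the full 24G in the last one, while 24G with 96 nodes or 32G with 128 nodes gives every node exactly 256M, which is what the test results above show.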
(In reply to David Gibson from comment #5)
> Min, thanks for the checks. That pretty much confirms my theory.
>
> Yes, the behaviour is quite odd. We could do better in theory - handling
> rounding behaviour more strictly so that we end up with some nodes with 256M
> of RAM and other nodes with 0 RAM. However, having so many NUMA nodes with
> so little (comparatively) memory is a pretty unlikely case in practice.
>
> So unless there's a realistic customer impact that I haven't seen yet, I'm
> inclined to close this as WONTFIX.

Perhaps we can apply an error diffusion algorithm [1] to distribute the memory between the nodes?

diff --git a/numa.c b/numa.c
index 6fc2393..0a82bee 100644
--- a/numa.c
+++ b/numa.c
@@ -336,15 +336,19 @@ void parse_numa_opts(MachineClass *mc)
             }
         }
         if (i == nb_numa_nodes) {
-            uint64_t usedmem = 0;
+            uint64_t usedmem = 0, node_mem;
+            uint64_t granularity = ram_size / nb_numa_nodes;
+            uint64_t propagate = 0;
 
             /* Align each node according to the alignment
              * requirements of the machine class
              */
             for (i = 0; i < nb_numa_nodes - 1; i++) {
-                numa_info[i].node_mem = (ram_size / nb_numa_nodes) &
+                node_mem = (granularity + propagate) &
                                         ~((1 << mc->numa_mem_align_shift) - 1);
-                usedmem += numa_info[i].node_mem;
+                propagate = granularity + propagate - node_mem;
+                numa_info[i].node_mem = node_mem;
+                usedmem += node_mem;
             }
             numa_info[i].node_mem = ram_size - usedmem;
         }

$ qemu-system-ppc64 -S -nographic -nodefaults -monitor stdio -m 4G -smp 12 -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node
QEMU 2.9.50 monitor - type 'help' for more information
(qemu) info numa
24 nodes
node 0 cpus: 0
node 0 size: 0 MB
node 1 cpus: 1
node 1 size: 256 MB
node 2 cpus: 2
node 2 size: 0 MB
node 3 cpus: 3
node 3 size: 256 MB
node 4 cpus: 4
node 4 size: 256 MB
node 5 cpus: 5
node 5 size: 0 MB
node 6 cpus: 6
node 6 size: 256 MB
node 7 cpus: 7
node 7 size: 256 MB
node 8 cpus: 8
node 8 size: 0 MB
node 9 cpus: 9
node 9 size: 256 MB
node 10 cpus: 10
node 10 size: 256 MB
node 11 cpus: 11
node 11 size: 0 MB
node 12 cpus:
node 12 size: 256 MB
node 13 cpus:
node 13 size: 256 MB
node 14 cpus:
node 14 size: 0 MB
node 15 cpus:
node 15 size: 256 MB
node 16 cpus:
node 16 size: 256 MB
node 17 cpus:
node 17 size: 0 MB
node 18 cpus:
node 18 size: 256 MB
node 19 cpus:
node 19 size: 256 MB
node 20 cpus:
node 20 size: 0 MB
node 21 cpus:
node 21 size: 256 MB
node 22 cpus:
node 22 size: 256 MB
node 23 cpus:
node 23 size: 256 MB
(qemu)

[1] https://en.wikipedia.org/wiki/Error_diffusion
    https://en.wikipedia.org/wiki/Floyd%E2%80%93Steinberg_dithering

Yes, we could certainly do that. Looks like the change is simpler than I expected, so if you could post that upstream, that'd be great.

I'm just not convinced it's worth porting such a fix specially for RHEL 7.4, though; it doesn't seem a very likely situation in practice.

Laurent is working on a fix for this upstream. However I think the triggering case is sufficiently obscure that we don't really need to backport it - we can wait until we pick it up by rebase in RHEL 7.5.

Punting to RHEL 7.5.
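For readers who want to try the error-diffusion idea outside QEMU, here is a minimal standalone sketch that mirrors the loop in the proposed diff. The function name `split_error_diffusion` and its signature are hypothetical; `granularity` and `propagate` are the names used in the diff, and the 256M alignment (shift 28) is an assumption based on the discussion above. The patch that was eventually merged upstream may differ from this.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a QEMU function): like a plain round-down
 * split, but the amount lost to rounding on one node is carried over
 * ("propagated") to the next node instead of piling up in the last. */
static void split_error_diffusion(uint64_t ram_size, unsigned nb_nodes,
                                  unsigned align_shift, uint64_t *node_mem)
{
    assert(nb_nodes > 0);

    uint64_t mask = ~((UINT64_C(1) << align_shift) - 1);
    uint64_t granularity = ram_size / nb_nodes; /* ideal share per node */
    uint64_t propagate = 0;                     /* rounding error so far */
    uint64_t usedmem = 0;
    unsigned i;

    for (i = 0; i < nb_nodes - 1; i++) {
        uint64_t want = granularity + propagate;

        node_mem[i] = want & mask;        /* round down to the alignment */
        propagate = want - node_mem[i];   /* carry the remainder forward */
        usedmem += node_mem[i];
    }
    /* The last node still takes whatever is left. */
    node_mem[i] = ram_size - usedmem;
}
```

Called with 4G, 24 nodes and a shift of 28, this sketch yields the same 0 / 256 MB pattern as the `info numa` output quoted above (0, 256, 0, 256, 256, 0, ...), and the per-node sizes always add up to the total RAM because the last node absorbs the remainder.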
Laurent's patch has been merged upstream here: http://git.qemu.org/?p=qemu.git;a=commitdiff;h=3bfe57165b4bf86a4310990 QE re-tested the bug on the following builds kernel-3.10.0-781.el7.ppc64le qemu-kvm-rhev-2.10.0-5.el7.ppc64le SLOF-20170724-2.git89f519f.el7.noarch Cli, /usr/libexec/qemu-kvm -name virt-tests-vm1 -sandbox off -machine pseries-rhel7.4.0 -nodefaults -vga std -device virtio-blk-pci,id=virtio_blk_pci0,disable-legacy=off,disable-modern=off,drive=drive_image1 -drive id=drive_image1,if=none,cache=none,aio=native,format=qcow2,file=rhel75-ppc64-virtio.qcow2 -qmp tcp:0:4444,server,nowait -vnc :1 -rtc base=utc,clock=host,driftfix=slew -boot order=cdn,once=c,menu=off,strict=off -monitor stdio -device nec-usb-xhci,id=usb1 -device usb-kbd,id=input0 -device usb-mouse,id=input1 -device usb-tablet,id=input2 -netdev tap,script=/etc/qemu-ifup,downscript=/etc/qemu-down,id=hostnet1,vhost=on -device virtio-net-pci,netdev=hostnet1,id=net1,mac=00:52:11:36:3f:01 -machine accel=kvm:tcg -chardev socket,id=serial_id_serial0,path=/tmp/min,server,nowait -device spapr-vty,reg=0x30000000,chardev=serial_id_serial0 -realtime mlock=on -m 24G -smp 12 -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node 
-numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node Output, (qemu) info numa 128 nodes node 0 cpus: 0 node 0 size: 0 MB node 1 cpus: 1 node 1 size: 0 MB node 2 cpus: 2 node 2 size: 0 MB node 3 cpus: 3 node 3 size: 0 MB node 4 cpus: 4 node 4 size: 0 MB node 5 cpus: 5 node 5 size: 0 MB node 6 cpus: 6 node 6 size: 0 MB node 7 cpus: 7 node 7 size: 0 MB node 8 cpus: 8 node 8 size: 0 MB node 9 cpus: 9 node 9 size: 0 MB node 10 cpus: 10 node 10 size: 0 MB node 11 cpus: 11 node 11 size: 0 MB node 12 cpus: node 12 size: 0 MB node 13 cpus: node 13 size: 0 MB node 14 cpus: node 14 size: 0 MB node 15 cpus: node 15 size: 0 MB node 16 cpus: node 16 size: 0 MB node 17 cpus: node 17 size: 0 MB node 18 cpus: node 18 size: 0 MB node 19 cpus: node 19 size: 0 MB node 20 cpus: node 20 size: 0 MB node 21 cpus: node 21 size: 0 MB node 22 cpus: node 22 size: 0 MB node 23 cpus: node 23 size: 0 MB node 24 cpus: node 24 size: 0 MB node 25 cpus: node 25 size: 0 MB node 26 cpus: node 26 size: 0 MB node 27 cpus: node 27 size: 0 MB node 28 cpus: node 28 size: 0 MB node 29 cpus: node 29 size: 0 MB node 30 cpus: node 30 size: 0 MB node 31 cpus: node 31 size: 0 MB node 32 cpus: node 32 size: 0 MB node 33 cpus: node 33 size: 0 MB node 34 cpus: node 34 size: 0 MB node 35 cpus: node 35 size: 0 MB node 36 cpus: node 36 size: 0 MB node 37 cpus: node 37 size: 0 MB node 38 cpus: node 38 size: 0 MB node 39 cpus: node 39 size: 0 MB node 40 cpus: node 40 size: 0 MB node 41 cpus: node 41 size: 0 MB node 42 
cpus: node 42 size: 0 MB node 43 cpus: node 43 size: 0 MB node 44 cpus: node 44 size: 0 MB node 45 cpus: node 45 size: 0 MB node 46 cpus: node 46 size: 0 MB node 47 cpus: node 47 size: 0 MB node 48 cpus: node 48 size: 0 MB node 49 cpus: node 49 size: 0 MB node 50 cpus: node 50 size: 0 MB node 51 cpus: node 51 size: 0 MB node 52 cpus: node 52 size: 0 MB node 53 cpus: node 53 size: 0 MB node 54 cpus: node 54 size: 0 MB node 55 cpus: node 55 size: 0 MB node 56 cpus: node 56 size: 0 MB node 57 cpus: node 57 size: 0 MB node 58 cpus: node 58 size: 0 MB node 59 cpus: node 59 size: 0 MB node 60 cpus: node 60 size: 0 MB node 61 cpus: node 61 size: 0 MB node 62 cpus: node 62 size: 0 MB node 63 cpus: node 63 size: 0 MB node 64 cpus: node 64 size: 0 MB node 65 cpus: node 65 size: 0 MB node 66 cpus: node 66 size: 0 MB node 67 cpus: node 67 size: 0 MB node 68 cpus: node 68 size: 0 MB node 69 cpus: node 69 size: 0 MB node 70 cpus: node 70 size: 0 MB node 71 cpus: node 71 size: 0 MB node 72 cpus: node 72 size: 0 MB node 73 cpus: node 73 size: 0 MB node 74 cpus: node 74 size: 0 MB node 75 cpus: node 75 size: 0 MB node 76 cpus: node 76 size: 0 MB node 77 cpus: node 77 size: 0 MB node 78 cpus: node 78 size: 0 MB node 79 cpus: node 79 size: 0 MB node 80 cpus: node 80 size: 0 MB node 81 cpus: node 81 size: 0 MB node 82 cpus: node 82 size: 0 MB node 83 cpus: node 83 size: 0 MB node 84 cpus: node 84 size: 0 MB node 85 cpus: node 85 size: 0 MB node 86 cpus: node 86 size: 0 MB node 87 cpus: node 87 size: 0 MB node 88 cpus: node 88 size: 0 MB node 89 cpus: node 89 size: 0 MB node 90 cpus: node 90 size: 0 MB node 91 cpus: node 91 size: 0 MB node 92 cpus: node 92 size: 0 MB node 93 cpus: node 93 size: 0 MB node 94 cpus: node 94 size: 0 MB node 95 cpus: node 95 size: 0 MB node 96 cpus: node 96 size: 0 MB node 97 cpus: node 97 size: 0 MB node 98 cpus: node 98 size: 0 MB node 99 cpus: node 99 size: 0 MB node 100 cpus: node 100 size: 0 MB node 101 cpus: node 101 size: 0 MB node 102 cpus: node 102 
size: 0 MB node 103 cpus: node 103 size: 0 MB node 104 cpus: node 104 size: 0 MB node 105 cpus: node 105 size: 0 MB node 106 cpus: node 106 size: 0 MB node 107 cpus: node 107 size: 0 MB node 108 cpus: node 108 size: 0 MB node 109 cpus: node 109 size: 0 MB node 110 cpus: node 110 size: 0 MB node 111 cpus: node 111 size: 0 MB node 112 cpus: node 112 size: 0 MB node 113 cpus: node 113 size: 0 MB node 114 cpus: node 114 size: 0 MB node 115 cpus: node 115 size: 0 MB node 116 cpus: node 116 size: 0 MB node 117 cpus: node 117 size: 0 MB node 118 cpus: node 118 size: 0 MB node 119 cpus: node 119 size: 0 MB node 120 cpus: node 120 size: 0 MB node 121 cpus: node 121 size: 0 MB node 122 cpus: node 122 size: 0 MB node 123 cpus: node 123 size: 0 MB node 124 cpus: node 124 size: 0 MB node 125 cpus: node 125 size: 0 MB node 126 cpus: node 126 size: 0 MB node 127 cpus: node 127 size: 24576 MB

According to the above test steps, it seems the bug has not been fixed. Thanks a lot. If there are any issues, please let me know.

Min

(In reply to Min Deng from comment #11)
> QE re-tested the bug on the following builds
> kernel-3.10.0-781.el7.ppc64le
> qemu-kvm-rhev-2.10.0-5.el7.ppc64le
> SLOF-20170724-2.git89f519f.el7.noarch
> Cli,
> /usr/libexec/qemu-kvm -name virt-tests-vm1 -sandbox off -machine
> pseries-rhel7.4.0 -nodefaults -vga std -device
...

To avoid breaking migration, the new behavior is only enabled with pseries-rhel7.5.0 and later machine types, so please test with "-machine pseries-rhel7.5.0".

Thanks

(In reply to Laurent Vivier from comment #12)
> (In reply to Min Deng from comment #11)
> > QE re-tested the bug on the following builds
> > kernel-3.10.0-781.el7.ppc64le
> > qemu-kvm-rhev-2.10.0-5.el7.ppc64le
> > SLOF-20170724-2.git89f519f.el7.noarch
> > Cli,
> > /usr/libexec/qemu-kvm -name virt-tests-vm1 -sandbox off -machine
> > pseries-rhel7.4.0 -nodefaults -vga std -device
> ...
> > To not break migration, the new behavior is only enabled with > pseries-rhel7.5.0 and following, so please test with "-machine > pseries-rhel7.5.0". > > Thanks Per developer,QE re-tested bug on Cli, 1.boot up guest with /usr/libexec/qemu-kvm -name virt-tests-vm1 -sandbox off -machine pseries-rhel7.5.0 -nodefaults -vga std -device virtio-blk-pci,id=virtio_blk_pci0,disable-legacy=off,disable-modern=off,drive=drive_image1 -drive id=drive_image1,if=none,cache=none,aio=native,format=qcow2,file=rhel75-ppc64-virtio.qcow2 -qmp tcp:0:4444,server,nowait -vnc :1 -rtc base=utc,clock=host,driftfix=slew -boot order=cdn,once=c,menu=off,strict=off -monitor stdio -device nec-usb-xhci,id=usb1 -device usb-kbd,id=input0 -device usb-mouse,id=input1 -device usb-tablet,id=input2 -netdev tap,script=/etc/qemu-ifup,downscript=/etc/qemu-down,id=hostnet1,vhost=on -device virtio-net-pci,netdev=hostnet1,id=net1,mac=00:52:11:36:3f:01 -machine accel=kvm:tcg -chardev socket,id=serial_id_serial0,path=/tmp/min,server,nowait -device spapr-vty,reg=0x30000000,chardev=serial_id_serial0 -realtime mlock=on -m 24G -smp 12 -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa 
node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node -numa node 2.check numa info from both HMP and guest inside (qemu) info numa 128 nodes node 0 cpus: 0 node 0 size: 0 MB node 1 cpus: 1 node 1 size: 256 MB node 2 cpus: 2 node 2 size: 256 MB node 3 cpus: 3 node 3 size: 256 MB node 4 cpus: 4 node 4 size: 0 MB node 5 cpus: 5 node 5 size: 256 MB node 6 cpus: 6 node 6 size: 256 MB node 7 cpus: 7 node 7 size: 256 MB node 8 cpus: 8 node 8 size: 0 MB node 9 cpus: 9 node 9 size: 256 MB node 10 cpus: 10 node 10 size: 256 MB node 11 cpus: 11 node 11 size: 256 MB node 12 cpus: node 12 size: 0 MB node 13 cpus: node 13 size: 256 MB node 14 cpus: node 14 size: 256 MB node 15 cpus: node 15 size: 256 MB node 16 cpus: node 16 size: 0 MB node 17 cpus: node 17 size: 256 MB node 18 cpus: node 18 size: 256 MB node 19 cpus: node 19 size: 256 MB node 20 cpus: node 20 size: 0 MB node 21 cpus: node 21 size: 256 MB node 22 cpus: node 22 size: 256 MB node 23 cpus: node 23 size: 256 MB node 24 cpus: node 24 size: 0 MB node 25 cpus: node 25 size: 256 MB node 26 cpus: node 26 size: 256 MB node 27 cpus: node 27 size: 256 MB node 28 cpus: node 28 size: 0 MB node 29 cpus: node 29 size: 256 MB node 30 cpus: node 30 size: 256 MB node 31 cpus: node 31 size: 256 MB node 32 cpus: node 32 size: 0 MB node 33 cpus: node 33 size: 256 MB node 34 cpus: node 34 size: 256 MB node 35 cpus: node 35 size: 256 MB node 36 cpus: node 36 size: 0 MB node 37 cpus: node 37 size: 256 MB node 38 cpus: node 38 size: 256 MB node 39 cpus: node 39 size: 256 MB node 40 cpus: node 40 
size: 0 MB node 41 cpus: node 41 size: 256 MB node 42 cpus: node 42 size: 256 MB node 43 cpus: node 43 size: 256 MB node 44 cpus: node 44 size: 0 MB node 45 cpus: node 45 size: 256 MB node 46 cpus: node 46 size: 256 MB node 47 cpus: node 47 size: 256 MB node 48 cpus: node 48 size: 0 MB node 49 cpus: node 49 size: 256 MB node 50 cpus: node 50 size: 256 MB node 51 cpus: node 51 size: 256 MB node 52 cpus: node 52 size: 0 MB node 53 cpus: node 53 size: 256 MB node 54 cpus: node 54 size: 256 MB node 55 cpus: node 55 size: 256 MB node 56 cpus: node 56 size: 0 MB node 57 cpus: node 57 size: 256 MB node 58 cpus: node 58 size: 256 MB node 59 cpus: node 59 size: 256 MB node 60 cpus: node 60 size: 0 MB node 61 cpus: node 61 size: 256 MB node 62 cpus: node 62 size: 256 MB node 63 cpus: node 63 size: 256 MB node 64 cpus: node 64 size: 0 MB node 65 cpus: node 65 size: 256 MB node 66 cpus: node 66 size: 256 MB node 67 cpus: node 67 size: 256 MB node 68 cpus: node 68 size: 0 MB node 69 cpus: node 69 size: 256 MB node 70 cpus: node 70 size: 256 MB node 71 cpus: node 71 size: 256 MB node 72 cpus: node 72 size: 0 MB node 73 cpus: node 73 size: 256 MB node 74 cpus: node 74 size: 256 MB node 75 cpus: node 75 size: 256 MB node 76 cpus: node 76 size: 0 MB node 77 cpus: node 77 size: 256 MB node 78 cpus: node 78 size: 256 MB node 79 cpus: node 79 size: 256 MB node 80 cpus: node 80 size: 0 MB node 81 cpus: node 81 size: 256 MB node 82 cpus: node 82 size: 256 MB node 83 cpus: node 83 size: 256 MB node 84 cpus: node 84 size: 0 MB node 85 cpus: node 85 size: 256 MB node 86 cpus: node 86 size: 256 MB node 87 cpus: node 87 size: 256 MB node 88 cpus: node 88 size: 0 MB node 89 cpus: node 89 size: 256 MB node 90 cpus: node 90 size: 256 MB node 91 cpus: node 91 size: 256 MB node 92 cpus: node 92 size: 0 MB node 93 cpus: node 93 size: 256 MB node 94 cpus: node 94 size: 256 MB node 95 cpus: node 95 size: 256 MB node 96 cpus: node 96 size: 0 MB node 97 cpus: node 97 size: 256 MB node 98 cpus: node 98 
size: 256 MB node 99 cpus: node 99 size: 256 MB node 100 cpus: node 100 size: 0 MB node 101 cpus: node 101 size: 256 MB node 102 cpus: node 102 size: 256 MB node 103 cpus: node 103 size: 256 MB node 104 cpus: node 104 size: 0 MB node 105 cpus: node 105 size: 256 MB node 106 cpus: node 106 size: 256 MB node 107 cpus: node 107 size: 256 MB node 108 cpus: node 108 size: 0 MB node 109 cpus: node 109 size: 256 MB node 110 cpus: node 110 size: 256 MB node 111 cpus: node 111 size: 256 MB node 112 cpus: node 112 size: 0 MB node 113 cpus: node 113 size: 256 MB node 114 cpus: node 114 size: 256 MB node 115 cpus: node 115 size: 256 MB node 116 cpus: node 116 size: 0 MB node 117 cpus: node 117 size: 256 MB node 118 cpus: node 118 size: 256 MB node 119 cpus: node 119 size: 256 MB node 120 cpus: node 120 size: 0 MB node 121 cpus: node 121 size: 256 MB node 122 cpus: node 122 size: 256 MB node 123 cpus: node 123 size: 256 MB node 124 cpus: node 124 size: 0 MB node 125 cpus: node 125 size: 256 MB node 126 cpus: node 126 size: 256 MB node 127 cpus: node 127 size: 256 MB

In conclusion, the fix is reasonable and the bug is fixed when the guest is booted with "-machine pseries-rhel7.5.0". Thanks to Laurent for his effort.
Min

Min, so could you move the BZ to VERIFIED?

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2018:1104
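
The two "info numa" results in this report (all 24576 MB on node 127 with pseries-rhel7.4.0, and a repeating 0/256/256/256 MB pattern with pseries-rhel7.5.0) are consistent with the arithmetic sketched below. This is only an illustration in Python, not the actual QEMU patch. It assumes the 256 MB minimum per-node granularity mentioned earlier in this bug, that the old behavior rounds each node's share down independently, and that the new behavior carries the rounded-off remainder forward to the next node. The function names are hypothetical.

```python
# Assumed per-node granularity on pseries (256 MB), as discussed earlier in this bug.
ALIGN_MB = 256


def legacy_assign(total_mb, nb_nodes, align_mb=ALIGN_MB):
    """Hypothetical model of the old behavior: round each node's share
    down on its own; the last node receives whatever is left over."""
    share = (total_mb // nb_nodes) // align_mb * align_mb
    sizes = [share] * (nb_nodes - 1)
    sizes.append(total_mb - sum(sizes))
    return sizes


def propagating_assign(total_mb, nb_nodes, align_mb=ALIGN_MB):
    """Hypothetical model of the new behavior: round down, but carry the
    rounded-off remainder forward so later nodes can reach the alignment."""
    share = total_mb // nb_nodes
    sizes = []
    carry = 0
    for _ in range(nb_nodes - 1):
        node_mb = (share + carry) // align_mb * align_mb
        carry = share + carry - node_mb
        sizes.append(node_mb)
    sizes.append(total_mb - sum(sizes))  # last node takes the remainder
    return sizes


old = legacy_assign(24 * 1024, 128)
new = propagating_assign(24 * 1024, 128)
print(old[:4], old[-1])  # [0, 0, 0, 0] 24576
print(new[:8], new[-1])  # [0, 256, 256, 256, 0, 256, 256, 256] 256
```

Under the same assumptions, 32G with 128 nodes or 24G with 96 nodes divides evenly, so either scheme gives 256 MB per node, which matches the expectation stated earlier in this bug.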