Red Hat Bugzilla – Bug 832167
PRD35 - [RFE] NUMA information(memory and cpu) in guest - RHEV-M support
Last modified: 2016-02-10 15:17:58 EST
Karen - can you please explain how this relates to the other NUMA bug 824634?
Bug 824634 is for autonuma, a kernel feature that moves memory and processes around for best performance on a NUMA system. When NUMA topology is exposed to the guest, you can then run autonuma inside the guest. Autonuma will not be available until RHEL7.
For RHEL6, we have numad instead.
To get best performance in RHEL6.3 for a large single guest, the performance team is using "NUMA in the guest" and pinning vcpus to physical cpus on the host, and then running numad in the guest.
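For reference, a minimal sketch of that hand tuning, assuming a 2-vcpu guest named "rhel6guest" (the guest name and cpu numbers are placeholders, and the numad package must be installed in the guest):

# On the host: pin each vcpu to a physical cpu via libvirt
virsh vcpupin rhel6guest 0 2
virsh vcpupin rhel6guest 1 3

# In the guest: start numad so it places processes and memory
# on the guest's NUMA nodes
service numad start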
We are requesting this feature to be added to libvirt (and RHEV-M) so the user can automatically take advantage of NUMA in the guest without having to do sophisticated hand tuning.
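For illustration, guest NUMA topology in libvirt is expressed as a <numa> element inside <cpu> in the domain XML; a minimal sketch (the cell sizes here are made up, and memory is in KiB), added via "virsh edit":

<cpu>
  <numa>
    <cell cpus='0' memory='1048576'/>
    <cell cpus='1' memory='1048576'/>
  </numa>
</cpu>

Each <cell> becomes one NUMA node in the guest, with the listed vcpus and amount of memory.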
Many of the details of how autonuma, numad and "NUMA in guest" will work together in RHEL7 are not yet worked out.
Many details of how RHEV-M should expose these features to the admin are not yet worked out.
Karen - any more details available now?
NUMA functionality is now available in the engine and users can
make use of it through the REST API. The GUI is being tracked by Bug 1134880.
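For example, a virtual NUMA node can be attached to a VM with a POST to the VM's numanodes sub-collection; the sketch below is based on the 3.5-era REST API, and the URL, VM ID, and credentials are placeholders (memory is in MB):

curl -k -u admin@internal:password \
  -H "Content-Type: application/xml" \
  -X POST \
  -d '<vm_numa_node>
        <index>0</index>
        <memory>1024</memory>
        <cpu><cores><core index="0"/></cores></cpu>
      </vm_numa_node>' \
  https://rhevm.example.com/api/vms/{vm:id}/numanodes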
Verified on vt4
qemu command line: -numa node,nodeid=0,cpus=0,mem=1024 -numa node,nodeid=1,cpus=1,mem=1024
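For context, those -numa options are part of a larger qemu-kvm invocation; a minimal sketch, with the disk image path as a placeholder. Note that -m must equal the sum of the per-node mem values and -smp must cover all the listed cpus:

/usr/libexec/qemu-kvm -m 2048 -smp 2 \
  -numa node,nodeid=0,cpus=0,mem=1024 \
  -numa node,nodeid=1,cpus=1,mem=1024 \
  -drive file=/path/to/guest.img,if=virtio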
Last login: Mon Oct 6 15:21:51 2014
[root@localhost ~]# numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0
node 0 size: 1023 MB
node 0 free: 673 MB
node 1 cpus: 1
node 1 size: 1023 MB
node 1 free: 954 MB
node distances:
node   0   1
  0:  10  20
  1:  20  10
qemu command line: -numa node,nodeid=0,cpus=3,mem=768 -numa node,nodeid=1,cpus=0-1,mem=1024 -numa node,nodeid=2,cpus=2,mem=256
available: 3 nodes (0-2)
node 0 cpus: 0 1
node 0 size: 1024 MB
node 0 free: 840 MB
node 1 cpus: 2
node 1 size: 255 MB
node 1 free: 228 MB
node 2 cpus: 3
node 2 size: 767 MB
node 2 free: 572 MB
node distances:
node   0   1   2
  0:  10  20  20
  1:  20  10  20
  2:  20  20  10
(In reply to Artyom from comment #7)
> -numa node,nodeid=0,cpus=3,mem=768 -numa node,nodeid=1,cpus=0-1,mem=1024
> -numa node,nodeid=2,cpus=2,mem=256
> available: 3 nodes (0-2)
> node 0 cpus: 0 1
> node 0 size: 1024 MB
> node 0 free: 840 MB
> node 1 cpus: 2
> node 1 size: 255 MB
> node 1 free: 228 MB
> node 2 cpus: 3
> node 2 size: 767 MB
> node 2 free: 572 MB
> node distances:
> node   0   1   2
>   0:  10  20  20
>   1:  20  10  20
>   2:  20  20  10
One note for people who may be as confused as I was when seeing the above:
The node IDs shown by "numactl -H" in the guest are just IDs chosen by the Linux guest, and don't necessarily match the node IDs specified on "-numa node,nodeid=X" (e.g. CPU 3 is configured to be on node ID 0, not on node ID 2). To check whether the node IDs being exposed to the guest are the right ones, grep for "SRAT" in the dmesg output and look at the proximity domain IDs ("PXM") shown for each APIC ID. The APIC IDs, in turn, may differ from the CPU indexes used on the -numa option if the cores or sockets options are not powers of 2.
Summarizing: the output above is correct as long as you see the right APIC IDs with the right PXM IDs in dmesg.
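For example, an illustrative version of that check in the guest (these lines are not captured from this bug, and the exact message format varies between kernel versions):

[root@localhost ~]# dmesg | grep SRAT
SRAT: PXM 0 -> APIC 0x00 -> Node 0
SRAT: PXM 1 -> APIC 0x01 -> Node 1

Each line ties an APIC ID to the proximity domain (PXM) qemu put it in, so a wrong nodeid on the -numa option shows up here as a wrong PXM.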
If this bug requires doc text for errata release, please provide draft text in the doc text field. The documentation team will review, edit, and approve the text.
If this bug does not require doc text, please set the 'requires_doc_text' flag to -.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.