Description of problem:

When pinning virtual NUMA nodes to physical NUMA nodes with memory allocation
policy 'STRICT', we create the required virtual NUMA nodes and we pin the
virtual NUMA node CPUs to the correct physical NUMA nodes, but we do not
create the correct memory mapping.

We produce something like

>   <numatune>
>     <memory mode='strict' nodeset='0-1'/>
>   </numatune>

instead of

>   <numatune>
>     <memory mode='strict' nodeset='0-1'/>
>     <memnode cellid="0" mode="strict" nodeset="0"/>
>     <memnode cellid="1" mode="strict" nodeset="1"/>
>   </numatune>

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Create a VM with two NUMA nodes (0, 1)
2. Pin the NUMA nodes to two different host NUMA nodes (0, 1)
3. Start the VM
4. Run virsh -r dumpxml <vmname>

Actual results:

Virtual NUMA node CPUs are pinned, but memory is not pinned.

Expected results:

Both should be pinned.

Additional info:

Full example with two virtual NUMA nodes which are pinned with the strict
policy to two physical NUMA nodes:

> <domain type='kvm' id='3'>
>   [...]
>   <cputune>
>     <shares>1020</shares>
>     <vcpupin vcpu='1' cpuset='1'/>
>     <vcpupin vcpu='0' cpuset='0'/>
>   </cputune>
>   <numatune>
>     <memory mode='strict' nodeset='0-1'/>
>   </numatune>
>   <cpu mode='custom' match='exact'>
>     <model fallback='allow'>SandyBridge</model>
>     <topology sockets='16' cores='1' threads='1'/>
>     <numa>
>       <cell id='0' cpus='0' memory='3072' unit='KiB'/>
>       <cell id='1' cpus='1' memory='3072' unit='KiB'/>
>     </numa>
>   </cpu>
>   [...]
> </domain>
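A quick way to run the check in step 4 is to look for <memnode> children in
the generated <numatune> block. A minimal sketch, assuming the VM from step 1
is named vm01 (hypothetical name, not taken from this report):

  # Hypothetical domain name; replace with the actual VM name from step 1.
  vmname=vm01
  # Dump the live XML and show the <numatune> block; with the bug present
  # only the <memory> element appears, without any <memnode> entries.
  virsh -r dumpxml "$vmname" | grep -A 4 '<numatune>'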
(In reply to Roman Mohr from comment #0)
> Description of problem:
>
> When pinning virtual NUMA nodes to physical NUMA nodes with memory
> allocation policy 'STRICT', we create the required virtual NUMA nodes and
> we pin the virtual NUMA node CPUs to the correct physical NUMA nodes, but
> we do not create the correct memory mapping.
>
> We produce something like
>
> >   <numatune>
> >     <memory mode='strict' nodeset='0-1'/>
> >   </numatune>

Prior to RHEL 7 this was the configuration we could use, AFAIR.

Roman, what is the runtime effect of this configuration? Is it really
different?

> instead of
>
> >   <numatune>
> >     <memory mode='strict' nodeset='0-1'/>
> >     <memnode cellid="0" mode="strict" nodeset="0"/>
> >     <memnode cellid="1" mode="strict" nodeset="1"/>
> >   </numatune>
> [...]
(In reply to Roy Golan from comment #1)
> (In reply to Roman Mohr from comment #0)
> > [...]
>
> Prior to RHEL 7 this was the configuration we could use, AFAIR.
>
> Roman, what is the runtime effect of this configuration? Is it really
> different?

At least on PPC it seems to be ignored. I will add some more data for x86.
@Martin, could you share your PPC findings?
I have created a VM with 4 NUMA vcells pinned to pcell 0. The resulting XML
looked as follows:

  <cpu>
    <model>POWER8</model>
    <topology cores="2" sockets="2" threads="2"/>
    <numa>
      <cell cpus="0,1" memory="2621440"/>
      <cell cpus="2,3" memory="2621440"/>
      <cell cpus="4,5" memory="2621440"/>
      <cell cpus="6,7" memory="2621440"/>
    </numa>
  </cpu>
  <numatune>
    <memory mode="strict" nodeset="0"/>
  </numatune>

but checking the memory maps of the vCPU PIDs reveals this information:

  for vcpu_pid in /proc/1476{09..16}; do
    echo $vcpu_pid
    cat $vcpu_pid/numa_maps | cut -d ' ' -f2 | uniq
  done

  /proc/147609
  prefer:16
  /proc/147610
  prefer:16
  /proc/147611
  prefer:16
  /proc/147612
  prefer:16
  /proc/147613
  prefer:16
  /proc/147614
  prefer:16
  /proc/147615
  prefer:16
  /proc/147616
  prefer:16

My conclusion is that, at least on PPC, the memory is not pinned at all and
prefers a node that was not chosen at all.
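For reference, the same per-task check can be driven from the guest's QEMU
process instead of hard-coded vCPU PIDs. A minimal sketch, assuming a single
QEMU process for the guest; the domain name vm01 and the pgrep pattern are
assumptions, not taken from this report, and reading numa_maps typically
requires root:

  #!/bin/bash
  # Sketch: show the NUMA memory policy of every task of a guest's QEMU process.
  guest=vm01                                   # hypothetical domain name
  pid=$(pgrep -f "qemu.*$guest" | head -n1)    # assumes exactly one matching QEMU process
  for task in /proc/"$pid"/task/*; do
      echo "$task"
      # Field 2 of numa_maps is the memory policy (e.g. prefer:16, bind:0, default).
      cut -d ' ' -f2 "$task/numa_maps" | sort | uniq
  done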
Testing with Cluster-on-Die PC (Xeon 2650-v3) with COD enabled (2 NUMA nodes,
0 and 1):

  <cpu match="exact" mode="host-passthrough">
    <topology cores="2" sockets="16" threads="2"/>
    <numa>
      <cell cpus="0,1" memory="2621440"/>
      <cell cpus="2,3" memory="2621440"/>
      <cell cpus="4,5" memory="2621440"/>
      <cell cpus="6,7" memory="2621440"/>
    </numa>
  </cpu>
  <numatune>
    <memory mode="strict" nodeset="1"/>
  </numatune>

  ---

  /proc/28616
  prefer:0
  /proc/28617
  prefer:1
  /proc/28618
  prefer:0
  /proc/28619
  prefer:1
  /proc/28620
  prefer:0
  /proc/28621
  prefer:1
  /proc/28622
  prefer:0
  /proc/28623
  prefer:0
The fix for this issue should be included in oVirt 4.1.0 beta 1, released on
December 1st. If it is not included, please move the bug back to MODIFIED.
Verified on
ovirt-engine-setup-plugin-ovirt-engine-4.1.0-0.2.master.20161212172238.gitea103bd.el7.centos.noarch

dumpxml for two NUMA nodes:

  <numatune>
    <memory mode='strict' nodeset='0-1'/>
    <memnode cellid='0' mode='strict' nodeset='0'/>
    <memnode cellid='1' mode='strict' nodeset='1'/>
  </numatune>

VM process also looks fine:

  default
  bind:1
  default
  bind:0
  default
  /proc/16941/task/16943
  default
  bind:1
  default
  bind:0
  default
  /proc/16941/task/16947
  default
  bind:1
  default
  bind:0
  default
  /proc/16941/task/16948
  default
  bind:1
  default
  bind:0
  default
  /proc/16941/task/16950
  default
  bind:1
  default
  bind:0
  default