Bug 1306698 - NUMA memory mapping is not generated correctly
Status: CLOSED CURRENTRELEASE
Product: ovirt-engine
Classification: oVirt
Component: Backend.Core
Version: 3.6.2.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ovirt-4.1.0-alpha
Target Release: 4.1.0.2
Assigned To: Andrej Krejcir
QA Contact: Artyom
Whiteboard: Triaged
Depends On:
Blocks:
 
Reported: 2016-02-11 10:38 EST by Roman Mohr
Modified: 2017-07-13 23:50 EDT
CC List: 8 users

See Also:
Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-02-01 09:49:39 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: SLA
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
rule-engine: ovirt-4.1+
rule-engine: planning_ack+
dfediuck: devel_ack+
mavital: testing_ack+


Attachments


External Trackers
Tracker ID Priority Status Summary Last Updated
oVirt gerrit 65564 master MERGED core: Generate detailed NUMA node pinning info 2016-11-24 08:21 EST
oVirt gerrit 65565 master NEW Allow pinning of VM NUMA nodes to host NUMA nodes. 2016-10-27 04:27 EDT

Description Roman Mohr 2016-02-11 10:38:39 EST
Description of problem:

When pinning virtual NUMA nodes to physical NUMA nodes with the memory allocation policy 'STRICT', we create the required virtual NUMA nodes and pin the virtual NUMA node CPUs to the correct physical NUMA nodes, but we do not create the correct memory mapping.

We produce something like 

>  <numatune>
>    <memory mode='strict' nodeset='0-1'/>
>  </numatune>

instead of

>  <numatune>
>    <memory mode='strict' nodeset='0-1'/>
>    <memnode cellid="0" mode="strict" nodeset="0"/>
>    <memnode cellid="1" mode="strict" nodeset="1"/>
>  </numatune>
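
For illustration only, the expected output contains exactly one <memnode> element per virtual cell, derived from the virtual-to-host NUMA pinning. A minimal sketch of that mapping (the pinning table below is a hypothetical example, not the engine's actual data structure):

# Hypothetical vNUMA-cell -> host-node pinning (cell id => host nodeset)
declare -A pinning=( [0]=0 [1]=1 )
# Emit one <memnode> element per virtual cell, matching the XML above
for cell in "${!pinning[@]}"; do
    printf '    <memnode cellid="%s" mode="strict" nodeset="%s"/>\n' \
        "$cell" "${pinning[$cell]}"
done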

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Create a VM with two numa nodes (0, 1)
2. Pin the numa nodes to two different host numa nodes (0,1)
3. Start the VM
4. run virsh -r dumpxml <vmname>
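
To look at just the relevant part of the domain XML, a filter along these lines can be used (assuming xmllint from libxml2 is available; substitute the VM name):

# Show only the <numatune> section of the running VM's domain XML
virsh -r dumpxml <vmname> | xmllint --xpath '//numatune' -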

Actual results:

Virtual NUMA node cpus are pinned but memory is not pinned

Expected results:

Both should be pinned

Additional info:

Full example with two virtual NUMA nodes which are pinned with the strict policy to two physical NUMA nodes:

> <domain type='kvm' id='3'>
> [...]
>  <cputune>
>    <shares>1020</shares>
>    <vcpupin vcpu='1' cpuset='1'/>
>    <vcpupin vcpu='0' cpuset='0'/>
>  </cputune>
>  <numatune>
>    <memory mode='strict' nodeset='0-1'/>
>  </numatune>
>  <cpu mode='custom' match='exact'>
>    <model fallback='allow'>SandyBridge</model>
>    <topology sockets='16' cores='1' threads='1'/>
>    <numa>
>      <cell id='0' cpus='0' memory='3072' unit='KiB'/>
>      <cell id='1' cpus='1' memory='3072' unit='KiB'/>
>    </numa>
>  </cpu>
> [...]
> </domain>
Comment 1 Roy Golan 2016-02-17 06:55:05 EST
(In reply to Roman Mohr from comment #0)
> Description of problem:
> 
> When pinning virtual NUMA nodes to physical numa node with memory allocation
> policy 'STRICT', we create the required virtual numa nodes and we pin the
> virtual numa node CPUs to the correct physical NUMA nodes, but we do not
> create the correct memory mapping.
> 
> We produce something like 
> 
> >  <numatune>
> >    <memory mode='strict' nodeset='0-1'/>
> >  </numatune>

Prior to RHEL 7 this was the configuration we could use, AFAIR.

Roman, what is the runtime effect of this configuration? Is it really different?

Comment 2 Roman Mohr 2016-02-29 07:11:27 EST
(In reply to Roy Golan from comment #1)
> (In reply to Roman Mohr from comment #0)
> > Description of problem:
> > 
> > When pinning virtual NUMA nodes to physical numa node with memory allocation
> > policy 'STRICT', we create the required virtual numa nodes and we pin the
> > virtual numa node CPUs to the correct physical NUMA nodes, but we do not
> > create the correct memory mapping.
> > 
> > We produce something like 
> > 
> > >  <numatune>
> > >    <memory mode='strict' nodeset='0-1'/>
> > >  </numatune>
> 
> Prior to RHEL 7 this was the configuration we could use, AFAIR.
> 
> Roman, what is the runtime effect of this configuration? Is it really
> different?
> 

At least on PPC it is effectively ignored. I will add some more data for x86.

@Martin, could you share your PPC findings?

Comment 3 Martin Polednik 2016-03-07 10:10:30 EST
I have created a VM with 4 NUMA vcells pinned to pcell 0.

Resulting XML looked as follows:

<cpu>
        <model>POWER8</model>
        <topology cores="2" sockets="2" threads="2"/>
        <numa>
                <cell cpus="0,1" memory="2621440"/>
                <cell cpus="2,3" memory="2621440"/>
                <cell cpus="4,5" memory="2621440"/>
                <cell cpus="6,7" memory="2621440"/>
        </numa>
</cpu>
<numatune>
        <memory mode="strict" nodeset="0"/>
</numatune>

but checking the NUMA memory maps of the vCPU PIDs reveals the following:

# Print each vCPU thread's PID and the unique NUMA policies from its numa_maps
for vcpu_pid in /proc/1476{09..16}; do
    echo $vcpu_pid
    cat $vcpu_pid/numa_maps | cut -d \  -f2 | uniq
done

/proc/147609
prefer:16
/proc/147610
prefer:16
/proc/147611
prefer:16
/proc/147612
prefer:16
/proc/147613
prefer:16
/proc/147614
prefer:16
/proc/147615
prefer:16
/proc/147616
prefer:16

My conclusion is that, at least on PPC, the memory is not pinned at all and prefers a node that was not chosen at all.
Comment 4 Martin Polednik 2016-03-07 10:41:24 EST
Testing with a Cluster-on-Die machine (Xeon 2650-v3) with COD enabled (2 NUMA nodes, 0 and 1):

<cpu match="exact" mode="host-passthrough">
        <topology cores="2" sockets="16" threads="2"/>
        <numa>
                <cell cpus="0,1" memory="2621440"/>
                <cell cpus="2,3" memory="2621440"/>
                <cell cpus="4,5" memory="2621440"/>
                <cell cpus="6,7" memory="2621440"/>
        </numa>
</cpu>
<numatune>
        <memory mode="strict" nodeset="1"/>
</numatune>

---
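
The per-vCPU policies below were presumably gathered with the same numa_maps check as in comment 3; a sketch, with the PID range assumed from the output:

# Same check as in comment 3; PID range 28616-28623 assumed from the output below
for vcpu_pid in /proc/286{16..23}; do
    echo $vcpu_pid
    cat $vcpu_pid/numa_maps | cut -d \  -f2 | uniq
done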

/proc/28616
prefer:0
/proc/28617
prefer:1
/proc/28618
prefer:0
/proc/28619
prefer:1
/proc/28620
prefer:0
/proc/28621
prefer:1
/proc/28622
prefer:0
/proc/28623
prefer:0
Comment 7 Sandro Bonazzola 2016-12-12 08:56:00 EST
The fix for this issue should be included in oVirt 4.1.0 beta 1, released on December 1st. If it is not included, please move the bug back to MODIFIED.
Comment 8 Artyom 2016-12-13 09:37:55 EST
Verified on ovirt-engine-setup-plugin-ovirt-engine-4.1.0-0.2.master.20161212172238.gitea103bd.el7.centos.noarch

dumpxml output for two NUMA nodes:
<numatune>
    <memory mode='strict' nodeset='0-1'/>
    <memnode cellid='0' mode='strict' nodeset='0'/>
    <memnode cellid='1' mode='strict' nodeset='1'/>
  </numatune>


The VM process also looks fine:
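A per-thread check along these lines presumably produced the output below (the QEMU PID 16941 is taken from the task paths in the output):

# Inspect each QEMU thread's NUMA policies; PID 16941 taken from the output below
for task in /proc/16941/task/*; do
    echo $task
    cut -d ' ' -f2 $task/numa_maps | uniq
done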
default
bind:1
default
bind:0
default
/proc/16941/task/16943
default
bind:1
default
bind:0
default
/proc/16941/task/16947
default
bind:1
default
bind:0
default
/proc/16941/task/16948
default
bind:1
default
bind:0
default
/proc/16941/task/16950
default
bind:1
default
bind:0
default
