Bug 810157 - numad: Pre-set memory policy and convert nodeset from numad to CPUs list before affinity setting
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libvirt
Version: 6.3
Hardware: Unspecified Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assigned To: Osier Yang
QA Contact: Virtualization Bugs
Reported: 2012-04-05 05:54 EDT by Osier Yang
Modified: 2012-06-20 02:52 EDT
Fixed In Version: libvirt-0.9.10-19.el6
Doc Type: Bug Fix
Last Closed: 2012-06-20 02:52:01 EDT
Type: Bug


Description Osier Yang 2012-04-05 05:54:58 EDT
Description of problem:
numad's documentation was a bit confusing: libvirt expects a CPU list, not a
node list, from numad. There are two ways to fix the problem: 1) numad returns
a CPU list instead, or 2) libvirt converts the node list into a CPU list and
numad updates its documentation. We agreed to go forward with 2).

Actual results:
For a box that has n nodes and m CPUs (let's suppose m = 16 * n), the
domain process will be pinned to a subset of CPU0...CPUn, because the node
numbers returned by numad are used as CPU numbers. This will definitely cause
significantly lower performance.
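
The difference between reading numad's answer as CPU numbers and converting it to CPUs first can be sketched as follows. This is only an illustration, not libvirt's code: the helper names and the 4-node, 64-CPU contiguous topology are assumptions chosen to match the "m = 16 * n" example.

```python
def parse_range_list(spec):
    """Expand a list such as "2-5" or "3-4,6-7" into sorted unique integers."""
    values = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            values.update(range(int(lo), int(hi) + 1))
        else:
            values.add(int(part))
    return sorted(values)


def nodeset_to_cpus(nodeset, node_cpus):
    """Return the sorted CPUs that belong to the NUMA nodes named in `nodeset`.

    `node_cpus` maps node number -> list of CPU numbers. It is a hypothetical
    input here; on a real host it would come from the NUMA topology.
    """
    cpus = set()
    for node in parse_range_list(nodeset):
        cpus.update(node_cpus[node])
    return sorted(cpus)


# Assumed topology: 4 nodes with 16 contiguous CPUs each (m = 16 * n).
TOPOLOGY = {node: list(range(node * 16, (node + 1) * 16)) for node in range(4)}

advice = "1-2"                             # a node list, as numad returns it
wrong = parse_range_list(advice)           # nodes misread as CPU numbers
right = nodeset_to_cpus(advice, TOPOLOGY)  # nodes converted to their CPUs

print("misread as CPUs:", wrong)
print("converted: CPUs", right[0], "to", right[-1], "(%d CPUs)" % len(right))
```

With the misreading, a 24-vCPU guest would share two CPUs; after conversion it gets the 32 CPUs of the two advised nodes.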

Comment 6 Wayne Sun 2012-04-18 06:30:06 EDT
pkgs:
libvirt-0.9.10-12.el6.x86_64
numad-0.5-3.20120316git.el6.x86_64
kernel-2.6.32-250.el6.x86_64
qemu-kvm-0.12.1.2-2.275.el6.x86_64

steps
1. prepare a domain whose <vcpu> element is set to auto placement:
# virsh dumpxml rhel6u3-501|grep vcpu
  <vcpu placement='auto'>24</vcpu>

2. check domain vcpuinfo
# virsh start rhel6u3-501
# virsh vcpuinfo rhel6u3-501
VCPU:           0
CPU:            2
State:          running
CPU time:       1.5s
CPU Affinity:   yyyyyyyyyyyyyyyy----------------yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy

VCPU:           1
CPU:            39
State:          running
CPU time:       0.0s
CPU Affinity:   yyyyyyyyyyyyyyyy----------------yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy

VCPU:           2
CPU:            36
State:          running
CPU time:       0.0s
CPU Affinity:   yyyyyyyyyyyyyyyy----------------yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy

VCPU:           3
CPU:            12
State:          running
CPU time:       0.0s
CPU Affinity:   yyyyyyyyyyyyyyyy----------------yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy

VCPU:           4
CPU:            32
State:          running
CPU time:       0.0s
CPU Affinity:   yyyyyyyyyyyyyyyy----------------yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy

VCPU:           5
CPU:            0
State:          running
CPU time:       0.0s
CPU Affinity:   yyyyyyyyyyyyyyyy----------------yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy

VCPU:           6
CPU:            52
State:          running
CPU time:       0.0s
CPU Affinity:   yyyyyyyyyyyyyyyy----------------yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy

VCPU:           7
CPU:            33
State:          running
CPU time:       0.0s
CPU Affinity:   yyyyyyyyyyyyyyyy----------------yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy

VCPU:           8
CPU:            1
State:          running
CPU time:       0.0s
CPU Affinity:   yyyyyyyyyyyyyyyy----------------yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy

VCPU:           9
CPU:            4
State:          running
CPU time:       0.0s
CPU Affinity:   yyyyyyyyyyyyyyyy----------------yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy

VCPU:           10
CPU:            12
State:          running
CPU time:       0.0s
CPU Affinity:   yyyyyyyyyyyyyyyy----------------yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy

VCPU:           11
CPU:            48
State:          running
CPU time:       0.0s
CPU Affinity:   yyyyyyyyyyyyyyyy----------------yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy

VCPU:           12
CPU:            2
State:          running
CPU time:       0.0s
CPU Affinity:   yyyyyyyyyyyyyyyy----------------yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy

VCPU:           13
CPU:            37
State:          running
CPU time:       0.0s
CPU Affinity:   yyyyyyyyyyyyyyyy----------------yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy

VCPU:           14
CPU:            9
State:          running
CPU time:       0.0s
CPU Affinity:   yyyyyyyyyyyyyyyy----------------yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy

VCPU:           15
CPU:            41
State:          running
CPU time:       0.0s
CPU Affinity:   yyyyyyyyyyyyyyyy----------------yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy

VCPU:           16
CPU:            56
State:          running
CPU time:       0.0s
CPU Affinity:   yyyyyyyyyyyyyyyy----------------yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy

VCPU:           17
CPU:            8
State:          running
CPU time:       0.0s
CPU Affinity:   yyyyyyyyyyyyyyyy----------------yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy

VCPU:           18
CPU:            40
State:          running
CPU time:       0.0s
CPU Affinity:   yyyyyyyyyyyyyyyy----------------yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy

VCPU:           19
CPU:            13
State:          running
CPU time:       0.0s
CPU Affinity:   yyyyyyyyyyyyyyyy----------------yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy

VCPU:           20
CPU:            44
State:          running
CPU time:       0.0s
CPU Affinity:   yyyyyyyyyyyyyyyy----------------yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy

VCPU:           21
CPU:            43
State:          running
CPU time:       0.0s
CPU Affinity:   yyyyyyyyyyyyyyyy----------------yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy

VCPU:           22
CPU:            3
State:          running
CPU time:       0.0s
CPU Affinity:   yyyyyyyyyyyyyyyy----------------yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy

VCPU:           23
CPU:            6
State:          running
CPU time:       0.0s
CPU Affinity:   yyyyyyyyyyyyyyyy----------------yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy

After numad's default refresh interval has passed, recheck:

# virsh vcpuinfo rhel6u3-501
VCPU:           0
CPU:            35
State:          running
CPU time:       16.4s
CPU Affinity:   ---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y

VCPU:           1
CPU:            47
State:          running
CPU time:       1.0s
CPU Affinity:   ---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y

VCPU:           2
CPU:            55
State:          running
CPU time:       2.1s
CPU Affinity:   ---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y

VCPU:           3
CPU:            39
State:          running
CPU time:       1.5s
CPU Affinity:   ---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y

VCPU:           4
CPU:            39
State:          running
CPU time:       1.7s
CPU Affinity:   ---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y

VCPU:           5
CPU:            3
State:          running
CPU time:       0.9s
CPU Affinity:   ---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y

VCPU:           6
CPU:            23
State:          running
CPU time:       1.4s
CPU Affinity:   ---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y

VCPU:           7
CPU:            11
State:          running
CPU time:       1.2s
CPU Affinity:   ---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y

VCPU:           8
CPU:            43
State:          running
CPU time:       2.1s
CPU Affinity:   ---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y

VCPU:           9
CPU:            3
State:          running
CPU time:       1.0s
CPU Affinity:   ---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y

VCPU:           10
CPU:            47
State:          running
CPU time:       1.0s
CPU Affinity:   ---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y

VCPU:           11
CPU:            3
State:          running
CPU time:       1.4s
CPU Affinity:   ---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y

VCPU:           12
CPU:            15
State:          running
CPU time:       0.9s
CPU Affinity:   ---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y

VCPU:           13
CPU:            15
State:          running
CPU time:       0.7s
CPU Affinity:   ---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y

VCPU:           14
CPU:            3
State:          running
CPU time:       0.6s
CPU Affinity:   ---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y

VCPU:           15
CPU:            43
State:          running
CPU time:       0.6s
CPU Affinity:   ---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y

VCPU:           16
CPU:            39
State:          running
CPU time:       0.8s
CPU Affinity:   ---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y

VCPU:           17
CPU:            3
State:          running
CPU time:       0.6s
CPU Affinity:   ---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y

VCPU:           18
CPU:            11
State:          running
CPU time:       0.6s
CPU Affinity:   ---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y

VCPU:           19
CPU:            7
State:          running
CPU time:       0.5s
CPU Affinity:   ---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y

VCPU:           20
CPU:            3
State:          running
CPU time:       1.4s
CPU Affinity:   ---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y

VCPU:           21
CPU:            51
State:          running
CPU time:       0.5s
CPU Affinity:   ---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y

VCPU:           22
CPU:            7
State:          running
CPU time:       0.4s
CPU Affinity:   ---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y

VCPU:           23
CPU:            23
State:          running
CPU time:       0.4s
CPU Affinity:   ---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y---y

The VCPU-to-CPU mapping changed after the rescan, but the CPU Affinity always shows the wrong pinning. It follows the pattern shown by the first and second checks, and in the end it stays as shown in the second check. Something might be wrong here.
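
Reading 24 records by eye is error-prone, so here is a rough sketch of how the "virsh vcpuinfo" text could be checked mechanically. The function names are made up for illustration, and the sample holds only the first two records of the second listing above. Note that this only confirms each vCPU runs inside its own mask; whether the mask itself is the right one still has to be compared against the host's NUMA topology.

```python
import re


def parse_vcpuinfo(text):
    """Split `virsh vcpuinfo` output into one dict of fields per vCPU."""
    records = []
    for block in re.split(r"\n\s*\n", text.strip()):
        fields = {}
        for line in block.splitlines():
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
        records.append(fields)
    return records


def mask_to_cpus(mask):
    """Return the CPU numbers marked 'y' in a CPU Affinity string."""
    return [cpu for cpu, flag in enumerate(mask) if flag == "y"]


def misplaced(records):
    """Return the vCPU numbers whose current CPU is outside their mask."""
    bad = []
    for rec in records:
        if int(rec["CPU"]) not in mask_to_cpus(rec["CPU Affinity"]):
            bad.append(int(rec["VCPU"]))
    return bad


MASK = "---y" * 16  # the 64-character mask from the second listing
SAMPLE = f"""VCPU:           0
CPU:            35
State:          running
CPU time:       16.4s
CPU Affinity:   {MASK}

VCPU:           1
CPU:            47
State:          running
CPU time:       1.0s
CPU Affinity:   {MASK}
"""

records = parse_vcpuinfo(SAMPLE)
print("allowed CPUs start with:", mask_to_cpus(MASK)[:4])
print("vCPUs outside their mask:", misplaced(records))
```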
Comment 7 Osier Yang 2012-04-18 07:06:36 EDT
Hi, Bill, 

I suspect this is caused by numad rebalancing the affinity dynamically, so "virsh vcpuinfo" (which uses sched_getaffinity underneath) displays different results. I'd like to confirm with you: is that true?

Osier
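
For reference, the affinity query mentioned above can be tried from Python on Linux. This is a small sketch, not what virsh does internally: it asks the kernel for the calling process's affinity and prints it in the same y/- style. If another process (such as numad) changes the affinity between two calls, the two results differ, which would explain the changing output. The helper names are illustrative.

```python
import os


def affinity_mask(cpus, ncpus):
    """Render a set of CPU numbers as a y/- string of length `ncpus`."""
    return "".join("y" if cpu in cpus else "-" for cpu in range(ncpus))


def current_affinity(pid=0):
    """Return the CPU set the kernel reports for `pid` (0 = this process)."""
    return os.sched_getaffinity(pid)


# os.sched_getaffinity is only available on some platforms (Linux included).
if hasattr(os, "sched_getaffinity"):
    cpus = current_affinity()
    print(affinity_mask(cpus, max(cpus) + 1))
```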
Comment 8 Osier Yang 2012-05-09 03:36:56 EDT
(In reply to comment #0)
> Description of problem:
> numad's document was a bit confused, and libvirt expects the CPUs list, but not
> node list, so there are two ways to fix the problem 1) numad returns CPUs list
> instead, 2) libvirt converts the node list into CPUs list, and numad updates
> the doc. We had agreement to go forward with 2).

Fixes that go together with this BZ:

1) Pre-set memory policy of domain process with the advisory nodeset from numad, using libnuma's API.

2) Divide currentMemory's value by 1024 before passing it to numad's command line,
as libvirt stores the value in KB in memory, while numad expects MB.
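
The unit conversion in 2) can be sketched as below. This is an illustration only: as far as I know numad's advice query takes the form "-w NCPUS[:MB]", but check numad(8) before relying on it. The function names, the round-down behaviour, and the 8 GiB example value are assumptions.

```python
def kib_to_mib(kib):
    """Convert a KB (KiB) value to whole MB, rounding down."""
    return kib // 1024


def numad_advice_argv(vcpus, current_memory_kib, numad="numad"):
    """Build a command line asking numad where to place `vcpus` and memory.

    Passing the KB value unconverted would overstate the memory need by a
    factor of 1024.
    """
    return [numad, "-w", "%d:%d" % (vcpus, kib_to_mib(current_memory_kib))]


# Hypothetical guest: 24 vCPUs, currentMemory of 8388608 KB (8 GiB).
argv = numad_advice_argv(24, 8388608)
print(" ".join(argv))  # numad -w 24:8192
```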

Patches posted internally:
http://post-office.corp.redhat.com/archives/rhvirt-patches/2012-May/msg00201.html

See the documentation that comes with the patches for how to fully drive numad now.
Comment 9 Daniel Veillard 2012-05-09 04:41:22 EDT
Okay, the default placement based on both memory and CPU affinity should
be improved in build libvirt-0.9.10-18.el6. I suggest using
that version for further testing.

Daniel
Comment 13 Wayne Sun 2012-05-11 00:24:34 EDT
pkgs:
libvirt-0.9.10-18.el6NumadBuild.x86_64
numad-0.5-3.20120316git.el6.x86_64
kernel-2.6.32-269.el6.x86_64
qemu-kvm-0.12.1.2-2.290.el6.x86_64

steps
1. prepare a domain whose <vcpu> element is set to auto placement:
# virsh dumpxml rhel6u2
...
  <vcpu placement='auto'>24</vcpu>
...

2. start domain and check xml
# virsh start rhel6u2
Domain rhel6u2 started

# virsh dumpxml rhel6u2
...
  <vcpu placement='auto'>24</vcpu>
  <numatune>
    <memory mode='strict' placement='auto'/>
  </numatune>
...

3. check vcpuinfo
# virsh vcpuinfo rhel6u2
VCPU:           0
CPU:            1
State:          running
CPU time:       10.6s
CPU Affinity:   yyyyyyyyyyyyyyyyyyyy--------------------yyyyyyyyyyyyyyyyyyyy--------------------

VCPU:           1
CPU:            40
State:          running
CPU time:       1.1s
CPU Affinity:   yyyyyyyyyyyyyyyyyyyy--------------------yyyyyyyyyyyyyyyyyyyy--------------------

VCPU:           2
CPU:            14
State:          running
CPU time:       1.1s
CPU Affinity:   yyyyyyyyyyyyyyyyyyyy--------------------yyyyyyyyyyyyyyyyyyyy--------------------

VCPU:           3
CPU:            46
State:          running
CPU time:       1.1s
CPU Affinity:   yyyyyyyyyyyyyyyyyyyy--------------------yyyyyyyyyyyyyyyyyyyy--------------------

VCPU:           4
CPU:            6
State:          running
CPU time:       1.4s
CPU Affinity:   yyyyyyyyyyyyyyyyyyyy--------------------yyyyyyyyyyyyyyyyyyyy--------------------

VCPU:           5
CPU:            59
State:          running
CPU time:       1.6s
CPU Affinity:   yyyyyyyyyyyyyyyyyyyy--------------------yyyyyyyyyyyyyyyyyyyy--------------------

VCPU:           6
CPU:            19
State:          running
CPU time:       1.8s
CPU Affinity:   yyyyyyyyyyyyyyyyyyyy--------------------yyyyyyyyyyyyyyyyyyyy--------------------

VCPU:           7
CPU:            13
State:          running
CPU time:       1.1s
CPU Affinity:   yyyyyyyyyyyyyyyyyyyy--------------------yyyyyyyyyyyyyyyyyyyy--------------------

VCPU:           8
CPU:            53
State:          running
CPU time:       1.0s
CPU Affinity:   yyyyyyyyyyyyyyyyyyyy--------------------yyyyyyyyyyyyyyyyyyyy--------------------

VCPU:           9
CPU:            1
State:          running
CPU time:       1.2s
CPU Affinity:   yyyyyyyyyyyyyyyyyyyy--------------------yyyyyyyyyyyyyyyyyyyy--------------------

VCPU:           10
CPU:            11
State:          running
CPU time:       1.3s
CPU Affinity:   yyyyyyyyyyyyyyyyyyyy--------------------yyyyyyyyyyyyyyyyyyyy--------------------

VCPU:           11
CPU:            40
State:          running
CPU time:       1.4s
CPU Affinity:   yyyyyyyyyyyyyyyyyyyy--------------------yyyyyyyyyyyyyyyyyyyy--------------------

VCPU:           12
CPU:            4
State:          running
CPU time:       1.3s
CPU Affinity:   yyyyyyyyyyyyyyyyyyyy--------------------yyyyyyyyyyyyyyyyyyyy--------------------

VCPU:           13
CPU:            40
State:          running
CPU time:       2.3s
CPU Affinity:   yyyyyyyyyyyyyyyyyyyy--------------------yyyyyyyyyyyyyyyyyyyy--------------------

VCPU:           14
CPU:            42
State:          running
CPU time:       1.6s
CPU Affinity:   yyyyyyyyyyyyyyyyyyyy--------------------yyyyyyyyyyyyyyyyyyyy--------------------

VCPU:           15
CPU:            41
State:          running
CPU time:       1.5s
CPU Affinity:   yyyyyyyyyyyyyyyyyyyy--------------------yyyyyyyyyyyyyyyyyyyy--------------------

VCPU:           16
CPU:            42
State:          running
CPU time:       1.2s
CPU Affinity:   yyyyyyyyyyyyyyyyyyyy--------------------yyyyyyyyyyyyyyyyyyyy--------------------

VCPU:           17
CPU:            12
State:          running
CPU time:       1.1s
CPU Affinity:   yyyyyyyyyyyyyyyyyyyy--------------------yyyyyyyyyyyyyyyyyyyy--------------------

VCPU:           18
CPU:            40
State:          running
CPU time:       1.2s
CPU Affinity:   yyyyyyyyyyyyyyyyyyyy--------------------yyyyyyyyyyyyyyyyyyyy--------------------

VCPU:           19
CPU:            10
State:          running
CPU time:       1.3s
CPU Affinity:   yyyyyyyyyyyyyyyyyyyy--------------------yyyyyyyyyyyyyyyyyyyy--------------------

VCPU:           20
CPU:            50
State:          running
CPU time:       1.2s
CPU Affinity:   yyyyyyyyyyyyyyyyyyyy--------------------yyyyyyyyyyyyyyyyyyyy--------------------

VCPU:           21
CPU:            2
State:          running
CPU time:       1.2s
CPU Affinity:   yyyyyyyyyyyyyyyyyyyy--------------------yyyyyyyyyyyyyyyyyyyy--------------------

VCPU:           22
CPU:            48
State:          running
CPU time:       1.3s
CPU Affinity:   yyyyyyyyyyyyyyyyyyyy--------------------yyyyyyyyyyyyyyyyyyyy--------------------

VCPU:           23
CPU:            2
State:          running
CPU time:       1.5s
CPU Affinity:   yyyyyyyyyyyyyyyyyyyy--------------------yyyyyyyyyyyyyyyyyyyy--------------------


Each vCPU's current CPU is within its CPU Affinity range, so this is working correctly now.

4. destroy domain and edit domain xml as:
# virsh destroy rhel6u2

# virsh edit rhel6u2
...
  <vcpu placement='auto'>24</vcpu>
  <numatune>
    <memory mode='interleave'/>
  </numatune>
...

5. start domain and check xml
# virsh start rhel6u2
Domain rhel6u2 started

# virsh dumpxml rhel6u2
...
  <vcpu placement='auto'>24</vcpu>
  <numatune>
    <memory mode='interleave' placement='auto'/>
  </numatune>
...

6. check vcpuinfo
# virsh vcpuinfo rhel6u2
result is similar to step 3.

7. destroy and edit domain
# virsh destroy rhel6u2

# virsh edit rhel6u2
...
  <vcpu placement='auto'>24</vcpu>
  <numatune>
    <memory mode='strict' nodeset='0-1,3'/>
  </numatune>
...

8. start domain and check vcpuinfo
# virsh start rhel6u2
Domain rhel6u2 started

# virsh vcpuinfo rhel6u2
result is similar to step 3.

9. destroy and edit domain
# virsh destroy rhel6u2

# virsh edit rhel6u2
...
  <vcpu placement='static' cpuset='0-11,13-22,66-79'>24</vcpu>
  <numatune>
    <memory mode='interleave' placement='auto'/>
  </numatune>
...

10. start domain and check vcpuinfo
# virsh start rhel6u2
Domain rhel6u2 started

# virsh vcpuinfo rhel6u2
result is similar to step 3, and the affinity is right:

CPU Affinity:   yyyyyyyyyyyy-yyyyyyyyyy-------------------------------------------yyyyyyyyyyyyyy
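
The expected mask for a static cpuset can be derived rather than counted by hand. A minimal sketch, assuming an 80-CPU host as the mask length in this comment suggests (function names are illustrative):

```python
def expand_cpuset(spec):
    """Expand a cpuset string such as "0-11,13-22,66-79" into a set of CPUs."""
    cpus = set()
    for part in spec.split(","):
        lo, _, hi = part.partition("-")
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus


def cpuset_mask(spec, host_cpus):
    """Render the cpuset as a y/- string, one character per host CPU.

    CPUs named in the cpuset but absent from the host simply do not appear.
    """
    wanted = expand_cpuset(spec)
    return "".join("y" if cpu in wanted else "-" for cpu in range(host_cpus))


print(cpuset_mask("0-11,13-22,66-79", 80))
```

The result is 12 'y', one '-', 10 'y', 43 '-', then 14 'y', which is the pattern shown above.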
Comment 15 yanbing du 2012-05-17 08:35:36 EDT
Tested with libvirt-0.9.10-20.el6nodeinfo.x86_64. Results are as follows:
0. 
# numactl --hardware
available: 8 nodes (0-7)
node 0 cpus: 0 4 8 12 16 20
node 0 size: 65526 MB
node 0 free: 62852 MB
node 1 cpus: 24 28 32 36 40 44
node 1 size: 65536 MB
node 1 free: 60043 MB
node 2 cpus: 3 7 11 15 19 23
node 2 size: 65536 MB
node 2 free: 58456 MB
node 3 cpus: 27 31 35 39 43 47
node 3 size: 65536 MB
node 3 free: 63275 MB
node 4 cpus: 2 6 10 14 18 22
node 4 size: 65536 MB
node 4 free: 63405 MB
node 5 cpus: 26 30 34 38 42 46
node 5 size: 65536 MB
node 5 free: 63714 MB
node 6 cpus: 1 5 9 13 17 21
node 6 size: 65536 MB
node 6 free: 63833 MB
node 7 cpus: 25 29 33 37 41 45
node 7 size: 65536 MB
node 7 free: 63806 MB
node distances:
node   0   1   2   3   4   5   6   7 
  0:  10  16  16  22  16  22  16  22 
  1:  16  10  16  22  22  16  22  16 
  2:  16  16  10  16  16  16  16  22 
  3:  22  22  16  10  16  16  22  16 
  4:  16  22  16  16  10  16  16  16 
  5:  22  16  16  16  16  10  22  22 
  6:  16  22  16  22  16  22  10  16 
  7:  22  16  22  16  16  22  16  10 
1. Prepare a domain with no 'cpuset', no 'placement' for <vcpu>, and no <numatune>, then start the domain.
#virsh dumpxml rhel62 |grep vcpu
  <vcpu placement='static'>24</vcpu>
#virsh vcpuinfo rhel62 
<snip>
VCPU:           0
CPU:            29
State:          running
CPU time:       10.7s
CPU Affinity:   yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy
</snip>
The domain process is pinned to all available CPUs, as expected.
2. Edit domain with no 'cpuset', no 'placement' for <vcpu>, and 'placement' for <numatune> is 'auto'. Then start the domain.
# virsh dumpxml rhel62|grep vcpu -A 3
  <vcpu placement='auto'>24</vcpu>
  <numatune>
    <memory mode='strict' placement='auto'/>
  </numatune>
#cat /tmp/libvirtd.debug | grep Nodeset
2012-05-17 11:52:07.434+0000: 53030: debug : qemuProcessStart:3356 : Nodeset returned from numad: 2-5

# virsh vcpuinfo rhel62
<snip>
VCPU:           0
CPU:            2
State:          running
CPU time:       11.1s
CPU Affinity:   --yy--yy--yy--yy--yy--yy--yy--yy--yy--yy--yy--yy
</snip>
#tail -n50 /var/log/libvirt/qemu/rhel62.log
 <snip>
2012-05-17 11:52:07.521+0000: 53150: debug : qemuProcessInitCpuAffinity:1731 : Setting CPU affinity
2012-05-17 11:52:07.526+0000: 53150: debug : qemuProcessInitCpuAffinity:1749 : Set CPU affinity with advisory nodeset from numad
2012-05-17 11:52:07.526+0000: 53150: debug : qemuProcessInitNumaMemoryPolicy:1599 : Set NUMA memory policy with advisory nodeset from numad
 </snip>
# cat /proc/53150/status 
 <snip>
Cpus_allowed:	0000cccc,cccccccc
Cpus_allowed_list:	2-3,6-7,10-11,14-15,18-19,22-23,26-27,30-31,34-35,38-39,42-43,46-47
Mems_allowed:	00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,0000003c
Mems_allowed_list:	2-5
 </snip>
3. Edit domain with no 'cpuset', 'placement=auto' for <vcpu>, and no <numatune>. Then start the domain.
# cat /tmp/libvirtd.debug | grep Nodeset
2012-05-17 11:52:07.434+0000: 53030: debug : qemuProcessStart:3356 : Nodeset returned from numad: 2-5
2012-05-17 12:01:11.469+0000: 53030: debug : qemuProcessStart:3356 : Nodeset returned from numad: 3-4,6-7
# virsh vcpuinfo rhel62
<snip>
VCPU:           23
CPU:            47
State:          running
CPU time:       0.7s
CPU Affinity:   -yy--yy--yy--yy--yy--yy--y-y-y-y-y-y-y-y-y-y-y-y
</snip>
#cat /proc/53406/status
<snip>
Cpus_allowed:	0000aaaa,aa666666
Cpus_allowed_list:	1-2,5-6,9-10,13-14,17-18,21-22,25,27,29,31,33,35,37,39,41,43,45,47
Mems_allowed:	00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,000000d8
Mems_allowed_list:	3-4,6-7
</snip>
4. Edit domain with placement='static' cpuset='0-11,13-22,66-79' for <vcpu> and no <numatune>, then start the domain
# virsh dumpxml rhel62|grep vcpu -A 1
  <vcpu placement='static' cpuset='0-11,13-22,66-79'>24</vcpu>
  <os>

No new Nodeset recorded in the libvirtd log.
# virsh vcpuinfo rhel62
<snip>
VCPU:           0
CPU:            2
State:          running
CPU time:       10.9s
CPU Affinity:   yyyyyyyyyyyy-yyyyyyyyyy-------------------------
</snip>
#cat /proc/53769/status
<snip>
Cpus_allowed:	00000000,007fefff
Cpus_allowed_list:	0-11,13-22
Mems_allowed:	00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,000000ff
Mems_allowed_list:	0-7
</snip>
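
The Cpus_allowed and Mems_allowed hex masks in the /proc status output above can be decoded to cross-check the *_list lines. A small sketch (the function name is made up): the comma-separated words are 32 bits each, most significant first, and bit N set means CPU or node N is allowed.

```python
def ids_from_hex_mask(mask):
    """Decode a /proc-style hex bitmask into the list of set bit positions."""
    bits = int(mask.replace(",", ""), 16)
    ids = []
    position = 0
    while bits:
        if bits & 1:
            ids.append(position)
        bits >>= 1
        position += 1
    return ids


# Values taken from steps 2 to 4 above.
print(ids_from_hex_mask("0000cccc,cccccccc"))  # step 2 CPUs
print(ids_from_hex_mask("0000003c"))           # step 2 memory nodes (low word)
print(ids_from_hex_mask("00000000,007fefff"))  # step 4 CPUs
```

The decoded values agree with the lists the kernel prints: CPUs 2-3,6-7,...,46-47 and nodes 2-5 for step 2, and CPUs 0-11,13-22 for step 4.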
Comment 16 Osier Yang 2012-05-17 23:27:54 EDT
(In reply to comment #15)
> Test with libvirt-0.9.10-20.el6nodeinfo.x86_64. Results as following:
> 0. 
> # numactl --hardware
> available: 8 nodes (0-7)
> node 0 cpus: 0 4 8 12 16 20
> node 0 size: 65526 MB
> node 0 free: 62852 MB
> node 1 cpus: 24 28 32 36 40 44
> node 1 size: 65536 MB
> node 1 free: 60043 MB
> node 2 cpus: 3 7 11 15 19 23
> node 2 size: 65536 MB
> node 2 free: 58456 MB
> node 3 cpus: 27 31 35 39 43 47
> node 3 size: 65536 MB
> node 3 free: 63275 MB
> node 4 cpus: 2 6 10 14 18 22
> node 4 size: 65536 MB
> node 4 free: 63405 MB
> node 5 cpus: 26 30 34 38 42 46
> node 5 size: 65536 MB
> node 5 free: 63714 MB
> node 6 cpus: 1 5 9 13 17 21
> node 6 size: 65536 MB
> node 6 free: 63833 MB
> node 7 cpus: 25 29 33 37 41 45
> node 7 size: 65536 MB
> node 7 free: 63806 MB
> node distances:
> node   0   1   2   3   4   5   6   7 
>   0:  10  16  16  22  16  22  16  22 
>   1:  16  10  16  22  22  16  22  16 
>   2:  16  16  10  16  16  16  16  22 
>   3:  22  22  16  10  16  16  22  16 
>   4:  16  22  16  16  10  16  16  16 
>   5:  22  16  16  16  16  10  22  22 
>   6:  16  22  16  22  16  22  10  16 
>   7:  22  16  22  16  16  22  16  10 
> 1. Prepare a domain with no 'cpuset', no 'placement' for <vcpu>, and no
> <numatune>, then start the domain.
> #virsh dumpxml rhel62 |grep vcpu
>   <vcpu placement='static'>24</vcpu>
> #virsh vcpuinfo rhel62 
> <snip>
> VCPU:           0
> CPU:            29
> State:          running
> CPU time:       10.7s
> CPU Affinity:   yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy
> </snip>
> the domain process is pinned to all available CPUs as expected.

Works as expected.

> 2. Edit domain with no 'cpuset', no 'placement' for <vcpu>, and 'placement' for
> <numatune> is 'auto'. Then start the domain.
> # virsh dumpxml rhel62|grep vcpu -A 3
>   <vcpu placement='auto'>24</vcpu>
>   <numatune>
>     <memory mode='strict' placement='auto'/>
>   </numatune>
> #cat /tmp/libvirtd.debug | grep Nodeset
> 2012-05-17 11:52:07.434+0000: 53030: debug : qemuProcessStart:3356 : Nodeset
> returned from numad: 2-5
> 
> # virsh vcpuinfo rhel62
> <snip>
> VCPU:           0
> CPU:            2
> State:          running
> CPU time:       11.1s
> CPU Affinity:   --yy--yy--yy--yy--yy--yy--yy--yy--yy--yy--yy--yy
> </snip>
> #tail -n50 /var/log/libvirt/qemu/rhel62.log
>  <snip>
> 2012-05-17 11:52:07.521+0000: 53150: debug : qemuProcessInitCpuAffinity:1731 :
> Setting CPU affinity
> 2012-05-17 11:52:07.526+0000: 53150: debug : qemuProcessInitCpuAffinity:1749 :
> Set CPU affinity with advisory nodeset from numad
> 2012-05-17 11:52:07.526+0000: 53150: debug :
> qemuProcessInitNumaMemoryPolicy:1599 : Set NUMA memory policy with advisory
> nodeset from numad
>  </snip>
> # cat /proc/53150/status 
>  <snip>
> Cpus_allowed: 0000cccc,cccccccc
> Cpus_allowed_list:
> 2-3,6-7,10-11,14-15,18-19,22-23,26-27,30-31,34-35,38-39,42-43,46-47
> Mems_allowed:
> 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,0000003c
> Mems_allowed_list: 2-5
>  </snip>

Could you confirm that the Cpus_allowed_list and vcpuinfo show the right CPUs by comparing them with what you get from "numactl --hardware"? A simple bash script
can do it.

> 3. Edit domain with no 'cpuset', 'placement=auto' for <vcpu>, and no
> <numatune>. Then start the domain.
> # cat /tmp/libvirtd.debug | grep Nodeset
> 2012-05-17 11:52:07.434+0000: 53030: debug : qemuProcessStart:3356 : Nodeset
> returned from numad: 2-5
> 2012-05-17 12:01:11.469+0000: 53030: debug : qemuProcessStart:3356 : Nodeset
> returned from numad: 3-4,6-7
> # virsh vcpuinfo rhel62
> <snip>
> VCPU:           23
> CPU:            47
> State:          running
> CPU time:       0.7s
> CPU Affinity:   -yy--yy--yy--yy--yy--yy--y-y-y-y-y-y-y-y-y-y-y-y
> </snip>
> #cat /proc/53406/status
> <snip>
> Cpus_allowed: 0000aaaa,aa666666
> Cpus_allowed_list:
> 1-2,5-6,9-10,13-14,17-18,21-22,25,27,29,31,33,35,37,39,41,43,45,47
> Mems_allowed:
> 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,000000d8
> Mems_allowed_list: 3-4,6-7

Except for the Cpus_allowed_list and vcpuinfo output, which I'm not sure about, everything works as expected.

> </snip>
> 4. Edit domain with placement='static' cpuset='0-11,13-22,66-79' for <vcpu> and
> no <numatune>, then start the domain
> # virsh dumpxml rhel62|grep vcpu -A 1
>   <vcpu placement='static' cpuset='0-11,13-22,66-79'>24</vcpu>
>   <os>
> 
> No new Nodeset recorder in libvirtd log.
> # virsh vcpuinfo rhel62
> <snip>
> VCPU:           0
> CPU:            2
> State:          running
> CPU time:       10.9s
> CPU Affinity:   yyyyyyyyyyyy-yyyyyyyyyy-------------------------
> </snip>
> #cat /proc/53769/status
> <snip>
> Cpus_allowed: 00000000,007fefff
> Cpus_allowed_list: 0-11,13-22
> Mems_allowed:
> 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,000000ff
> Mems_allowed_list: 0-7

Works fine. "66-79" is ignored, since this host only has CPUs 0-47.

> </snip>
Comment 17 yanbing du 2012-05-18 01:14:44 EDT
Hi Osier,
I compared the Cpus_allowed_list and vcpuinfo from comment 15, step 2:
#cat compare-cpu.sh
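#!/bin/bash
# Print a y/- mask of the host CPUs that belong to NUMA nodes 2-5.
# Brace expansion ({2..5}) is a bash feature, hence the bash shebang.
# The CPU lists are kept in a variable, so reruns never see stale temp files.
cpus=$(for i in {2..5}; do numactl --hardware | grep "node $i cpus:"; done |
       awk -F':' '{print $2}')
for i in {0..47}; do
    # -w matches $i as a whole word, so 2 does not match 22.
    if grep -qw "$i" <<< "$cpus"; then
        echo -n "y"
    else
        echo -n "-"
    fi
done
echo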
#sh compare-cpu.sh 
--yy--yy--yy--yy--yy--yy--yy--yy--yy--yy--yy--yy

The result is the same as the vcpuinfo output.
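
The same comparison can be done without running anything on the host, by parsing saved "numactl --hardware" output. A sketch under stated assumptions: the function names are made up, and the embedded text is the "cpus" lines from comment 15 (size, free, and distance lines omitted).

```python
import re

NUMACTL_OUTPUT = """\
available: 8 nodes (0-7)
node 0 cpus: 0 4 8 12 16 20
node 1 cpus: 24 28 32 36 40 44
node 2 cpus: 3 7 11 15 19 23
node 3 cpus: 27 31 35 39 43 47
node 4 cpus: 2 6 10 14 18 22
node 5 cpus: 26 30 34 38 42 46
node 6 cpus: 1 5 9 13 17 21
node 7 cpus: 25 29 33 37 41 45
"""


def parse_node_cpus(text):
    """Return {node: [cpu, ...]} from the "node N cpus:" lines."""
    topology = {}
    for match in re.finditer(r"^node (\d+) cpus:(.*)$", text, re.MULTILINE):
        topology[int(match.group(1))] = [int(c) for c in match.group(2).split()]
    return topology


def node_mask(nodes, topology):
    """Render the CPUs of the given nodes as a y/- string over all host CPUs."""
    allowed = {cpu for node in nodes for cpu in topology[node]}
    ncpus = max(cpu for cpus in topology.values() for cpu in cpus) + 1
    return "".join("y" if cpu in allowed else "-" for cpu in range(ncpus))


topology = parse_node_cpus(NUMACTL_OUTPUT)
print(node_mask([2, 3, 4, 5], topology))  # nodeset 2-5 (comment 15, step 2)
print(node_mask([3, 4, 6, 7], topology))  # nodeset 3-4,6-7 (comment 15, step 3)
```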
Comment 18 yanbing du 2012-05-18 03:46:14 EDT
According to comment 16 and comment 17, moving this bug to VERIFIED.
Comment 19 Wayne Sun 2012-05-24 06:14:53 EDT
This is compatibility testing with the updated numad.

packages:
libvirt-0.9.10-21.el6.x86_64
numad-0.5-4.20120522git.el6.x86_64
kernel-2.6.32-269.el6.x86_64
qemu-kvm-0.12.1.2-2.294.el6.x86_64

# numactl --hardware
available: 4 nodes (0-3)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 40 41 42 43 44 45 46 47 48 49
node 0 size: 131050 MB
node 0 free: 127328 MB
node 1 cpus: 10 11 12 13 14 15 16 17 18 19 50 51 52 53 54 55 56 57 58 59
node 1 size: 131072 MB
node 1 free: 127338 MB
node 2 cpus: 20 21 22 23 24 25 26 27 28 29 60 61 62 63 64 65 66 67 68 69
node 2 size: 131072 MB
node 2 free: 127540 MB
node 3 cpus: 30 31 32 33 34 35 36 37 38 39 70 71 72 73 74 75 76 77 78 79
node 3 size: 131072 MB
node 3 free: 127442 MB
node distances:
node   0   1   2   3 
  0:  10  11  11  11 
  1:  11  10  11  11 
  2:  11  11  10  11 
  3:  11  11  11  10 

steps:

1. prepare a domain whose <vcpu> element is set to auto placement:
# virsh dumpxml rhel63
...
  <vcpu placement='auto'>24</vcpu>
...

# virsh start rhel63
# virsh dumpxml rhel63 
...
  <vcpu placement='auto'>24</vcpu>
  <numatune>
    <memory mode='strict' placement='auto'/>
  </numatune>
...

check log:
# cat /var/log/libvirtd.log| grep Nodeset
2012-05-24 09:33:04.300+0000: 20342: debug : qemuProcessStart:3356 : Nodeset returned from numad: 1-2

# cat /var/log/libvirt/qemu/rhel63.log 
...
2012-05-24 09:33:04.447+0000: 23518: debug : qemuProcessInitCpuAffinity:1731 : Setting CPU affinity
2012-05-24 09:33:04.456+0000: 23518: debug : qemuProcessInitCpuAffinity:1749 : Set CPU affinity with advisory nodeset from numad
2012-05-24 09:33:04.456+0000: 23518: debug : qemuProcessInitNumaMemoryPolicy:1599 : Set NUMA memory policy with advisory nodeset from numad
...

check vcpuinfo:
# virsh vcpuinfo rhel63
VCPU:           0
CPU:            55
State:          running
CPU time:       7.5s
CPU Affinity:   ----------yyyyyyyyyyyyyyyyyyyy--------------------yyyyyyyyyyyyyyyyyyyy----------
...

check Cpus_allowed_list:
# cat /proc/23518/status
...
Cpus_allowed:	003f,fffc0000,3ffffc00
Cpus_allowed_list:	10-29,50-69
Mems_allowed:	00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000006
Mems_allowed_list:	1-2
...

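check actual cpu affinity:
# cat compare-cpu.sh
#!/bin/bash
# Print a y/- mask of the host CPUs that belong to NUMA nodes 1-2.
# Brace expansion ({1..2}) is a bash feature, hence the bash shebang.
# The CPU lists are kept in a variable, so reruns never see stale temp files.
cpus=$(for i in {1..2}; do numactl --hardware | grep "node $i cpus:"; done |
       awk -F':' '{print $2}')
for i in {0..79}; do
    # -w matches $i as a whole word, so 1 does not match 10.
    if grep -qw "$i" <<< "$cpus"; then
        echo -n "y"
    else
        echo -n "-"
    fi
done
echo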

# sh compare-cpu.sh 
----------yyyyyyyyyyyyyyyyyyyy--------------------yyyyyyyyyyyyyyyyyyyy----------

The output is the same as vcpuinfo.

So this is working as expected.

2. destroy domain and edit domain xml as:

# virsh destroy rhel63

# virsh edit rhel63
...
  <vcpu placement='auto'>24</vcpu>
  <numatune>
    <memory mode='interleave'/>
  </numatune>
...

# virsh start rhel63
Domain rhel63 started

# virsh dumpxml rhel63
...
  <vcpu placement='auto'>24</vcpu>
  <numatune>
    <memory mode='interleave' placement='auto'/>
  </numatune>
...

check libvirtd.log:
# cat /var/log/libvirtd.log| grep Nodeset
2012-05-24 09:50:42.481+0000: 20343: debug : qemuProcessStart:3356 : Nodeset returned from numad: 1,3

# virsh vcpuinfo rhel63
VCPU:           0
CPU:            72
State:          running
CPU time:       7.4s
CPU Affinity:   ----------yyyyyyyyyy----------yyyyyyyyyy----------yyyyyyyyyy----------yyyyyyyyyy
...

# cat /proc/24429/status
...
Cpus_allowed:	ffc0,0ffc00ff,c00ffc00
Cpus_allowed_list:	10-19,30-39,50-59,70-79
Mems_allowed:	00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,0000000f
Mems_allowed_list:	0-3
...

modify compare-cpu.sh to use nodes 1 and 3, then run:
# sh compare-cpu.sh 
----------yyyyyyyyyy----------yyyyyyyyyy----------yyyyyyyyyy----------yyyyyyyyyy



I also ran the other steps from comment 15 and comment 16; both libvirt and numad are working as expected.
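
To compare against Cpus_allowed_list directly, the CPUs of the advised nodes can be collapsed into the same range notation. This is only a sketch: the function name is illustrative, and the topology dictionary is written out from the "numactl --hardware" output in this comment.

```python
def format_ranges(cpus):
    """Collapse CPU numbers into a compact list such as "10-29,50-69"."""
    parts = []
    run_start = prev = None
    for cpu in sorted(set(cpus)):
        if prev is not None and cpu == prev + 1:
            prev = cpu  # still inside the current run
            continue
        if prev is not None:
            parts.append(str(run_start) if run_start == prev else "%d-%d" % (run_start, prev))
        run_start = prev = cpu
    if prev is not None:
        parts.append(str(run_start) if run_start == prev else "%d-%d" % (run_start, prev))
    return ",".join(parts)


# 4-node host from this comment: node n owns CPUs 10n..10n+9 and 40+10n..49+10n.
NODE_CPUS = {
    n: list(range(10 * n, 10 * n + 10)) + list(range(40 + 10 * n, 50 + 10 * n))
    for n in range(4)
}

print(format_ranges(NODE_CPUS[1] + NODE_CPUS[2]))  # nodeset 1-2 -> 10-29,50-69
print(format_ranges(NODE_CPUS[1] + NODE_CPUS[3]))  # nodeset 1,3 -> 10-19,30-39,50-59,70-79
```

Both strings match the Cpus_allowed_list values reported in steps 1 and 2.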
Comment 21 errata-xmlrpc 2012-06-20 02:52:01 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2012-0748.html
