Bug 855218 - Problems on CPU tuning
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libvirt
Version: 6.4
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Osier Yang
QA Contact: Virtualization Bugs
Keywords: Regression
Reported: 2012-09-06 22:25 EDT by hongming
Modified: 2013-02-21 02:23 EST
Fixed In Version: libvirt-0.10.2-3.el6
Doc Type: Bug Fix
Last Closed: 2013-02-21 02:23:13 EST
Type: Bug


Description hongming 2012-09-06 22:25:23 EDT
Description of problem:
vcpuinfo doesn't show the correct CPU affinity after a vcpu is hotplugged.

Version-Release number of selected component (if applicable):
libvirt-0.10.1-1.el6.x86_64
qemu-kvm-0.12.1.2-2.305.el6.x86_64 

How reproducible:
100% 

Steps to Reproduce:
1.# virsh start rhel6.2
Domain rhel6.2 started

2.# virsh vcpupin rhel6.2
VCPU: CPU Affinity
----------------------------------
   0: 0-3

3.# virsh setvcpus rhel6.2 3

4.# virsh vcpupin rhel6.2 1 1

5.# virsh vcpupin rhel6.2
VCPU: CPU Affinity
----------------------------------
   0: 0-3
   1: 1
   2: 0-3

6.# virsh vcpuinfo rhel6.2
VCPU:           0
CPU:            1
State:          running
CPU time:       42.8s
CPU Affinity:   yyyy

VCPU:           1
CPU:            0
State:          running
CPU time:       30.8s
CPU Affinity:   yyyy

VCPU:           2
CPU:            3
State:          running
CPU Affinity:   yyyy

Actual results:
vcpuinfo doesn't show the correct CPU affinity after a vcpu is hotplugged.

Expected results:
vcpuinfo shows the correct CPU affinity after a vcpu is hotplugged.

Additional info:
Comment 3 hongming 2012-09-10 04:26:19 EDT
It can't be reproduced with libvirt-0.10.0-0rc0.el6.x86_64, so it is a regression.
Comment 5 huzhang@redhat.com 2012-09-10 05:50:59 EDT
Description of problem:
vcpuinfo doesn't show the correct CPU affinity after a vcpu is hotplugged.

Version-Release number of selected component (if applicable):
libvirt-0.10.1-1.el6.x86_64
qemu-kvm-0.12.1.2-2.307.el6.x86_64

How reproducible:
100% 

Steps to Reproduce:
1.#virsh start huzhang2
Domain huzhang2 started


2.#virsh dumpxml huzhang2
<vcpu placement='static' cpuset='3'>4</vcpu>

3.# virsh vcpuinfo huzhang2
VCPU: 0
CPU: 0
State: running
CPU time: 8.3s
CPU Affinity: yyyyyyyy

VCPU: 1
CPU: 4
State: running
CPU time: 2.4s
CPU Affinity: yyyyyyyy

VCPU: 2
CPU: 0
State: running
CPU time: 4.1s
CPU Affinity: yyyyyyyy

VCPU: 3
CPU: 1
State: running
CPU time: 2.8s
CPU Affinity: yyyyyyyy



Actual results:
vcpuinfo doesn't show the correct CPU affinity after a vcpu is hotplugged.

Expected results:
vcpuinfo shows the correct CPU affinity after a vcpu is hotplugged.
Comment 6 huzhang@redhat.com 2012-09-10 05:59:52 EDT
Description of problem:
vcpuinfo doesn't show the correct CPU affinity after a vcpu is hotplugged.

Version-Release number of selected component (if applicable):
libvirt-0.10.0-0rc0.el6
qemu-kvm-0.12.1.2-2.307.el6.x86_64

How reproducible:
100% 

Steps to Reproduce:
1.#virsh start rhel6
Domain rhel6 started


2.#virsh dumpxml rhel6
<vcpu cpuset="1-3,^2,0">4</vcpu>

3.# virsh vcpuinfo rhel6
VCPU:           0
CPU:            2
State:          running
CPU time:       9.3s
CPU Affinity:   -yyy----

VCPU:           1
CPU:            1
State:          running
CPU time:       3.7s
CPU Affinity:   -yyy----

VCPU:           2
CPU:            2
State:          running
CPU time:       3.3s
CPU Affinity:   -yyy----

VCPU:           3
CPU:            1
State:          running
CPU time:       2.5s
CPU Affinity:   -yyy----


Actual results:
vcpuinfo doesn't show the correct CPU affinity after a vcpu is hotplugged.

Expected results:
vcpuinfo shows the correct CPU affinity after a vcpu is hotplugged.
Comment 7 Dave Allan 2012-09-10 08:50:31 EDT
Peter, this BZ has been marked Regression.  Can you confirm?  Thanks.
Comment 9 huzhang@redhat.com 2012-09-11 01:09:52 EDT
(In reply to comment #6)

***************************************************************
Sorry, I made a mistake, this version does not have this error.
Comment 10 huzhang@redhat.com 2012-09-11 02:22:47 EDT
In libvirt-0.10.0-0rc0.el6:
With <vcpu placement='static' cpuset='3'>4</vcpu> set and the guest started:
# virsh vcpupin rhel6
VCPU: CPU Affinity
----------------------------------
   0: 0-7
   1: 0-7
   2: 0-7
   3: 0-7
The result of "virsh vcpupin rhel6" is wrong,
but the result of "virsh vcpuinfo rhel6" is right:
#virsh vcpuinfo rhel6
VCPU:           0
CPU:            3
State:          running
CPU time:       9.6s
CPU Affinity:   ---y----
VCPU:           1
CPU:            3
State:          running
CPU time:       4.2s
CPU Affinity:   ---y----
VCPU:           2
CPU:            3
State:          running
CPU time:       3.7s
CPU Affinity:   ---y----
VCPU:           3
CPU:            3
State:          running
CPU time:       3.7s
CPU Affinity:   ---y----

*****************************************************************************
In libvirt-0.10.1-1.el6.x86_64:
With <vcpu placement='static' cpuset='3'>4</vcpu> set and the guest started:
# virsh vcpupin rhel6
VCPU: CPU Affinity
----------------------------------
   0: 0-7
   1: 0-7
   2: 0-7
   3: 0-7
#virsh vcpuinfo rhel6
VCPU:           0
CPU:            1
State:          running
CPU time:       30.6s
CPU Affinity:   yyyyyyyy

VCPU:           1
CPU:            4
State:          running
CPU time:       26.6s
CPU Affinity:   yyyyyyyy

VCPU:           2
CPU:            0
State:          running
CPU time:       25.2s
CPU Affinity:   yyyyyyyy

VCPU:           3
CPU:            0
State:          running
CPU time:       29.3s
CPU Affinity:   yyyyyyyy
The results of both "virsh vcpupin rhel6" and "virsh vcpuinfo rhel6" are wrong.
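
For reference, the affinity string one would expect from a given cpuset can be worked out by hand. The following is a minimal Python sketch, not libvirt's own parser: the function names parse_cpuset and affinity_string are made up for illustration, and the grammar assumed here is the documented one (comma-separated CPU numbers, ranges such as 1-3, and ^N to exclude a single CPU from what came before).

```python
def parse_cpuset(spec):
    """Parse a cpuset string such as "1-3,^2,0" into a set of CPU numbers.

    Assumption: items are applied left to right, and "^N" removes a single
    CPU that an earlier item added. Excluding a whole range is rejected.
    """
    cpus = set()
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        if item.startswith("^"):
            if "-" in item:
                raise ValueError("cannot exclude a range: %r" % item)
            cpus.discard(int(item[1:]))
        elif "-" in item:
            first, last = (int(part) for part in item.split("-", 1))
            cpus.update(range(first, last + 1))
        else:
            cpus.add(int(item))
    return cpus


def affinity_string(cpus, host_cpus):
    """Render a set of CPUs the way the "CPU Affinity:" field is printed."""
    return "".join("y" if cpu in cpus else "-" for cpu in range(host_cpus))


if __name__ == "__main__":
    # cpuset='3' on a host with 8 CPUs
    print(affinity_string(parse_cpuset("3"), 8))
```

Under these assumptions, cpuset='3' on an 8-CPU host renders as ---y----, which is the affinity shown above for libvirt-0.10.0-0rc0.el6.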
Comment 11 Peter Krempa 2012-09-11 05:50:56 EDT
The virDomainGetVcpus() API has a bug: it does not return correct CPU pinning data when the new cgroup-based CPU pinning method is used. This is a regression, as the call worked with the previous method of pinning CPUs. The problem is not limited to CPU hotplug, so I'm changing the summary of this bug to reflect this.

Setting up the CPU pinning works fine; the problem is only with querying the state using the "virsh vcpuinfo" command. Correct pinning data can be requested using the "virsh vcpupin" command.
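
The mismatch between the two commands can be checked mechanically. The sketch below is only an illustration in Python: it parses the text output of "virsh vcpupin" and "virsh vcpuinfo" as pasted in the original description of this bug and lists the vcpus where the two disagree. The helper names are hypothetical and the parsing assumes exactly the output layout shown in this report.

```python
import re

VCPUPIN_OUTPUT = """\
VCPU: CPU Affinity
----------------------------------
   0: 0-3
   1: 1
   2: 0-3
"""

VCPUINFO_OUTPUT = """\
VCPU:           0
CPU:            1
CPU Affinity:   yyyy

VCPU:           1
CPU:            0
CPU Affinity:   yyyy

VCPU:           2
CPU:            3
CPU Affinity:   yyyy
"""


def expand_cpulist(spec):
    """Expand a CPU list such as "0-3" or "0,2-3" into a set of ints."""
    cpus = set()
    for item in spec.split(","):
        first, _, last = item.partition("-")
        cpus.update(range(int(first), int(last or first) + 1))
    return cpus


def parse_vcpupin(text):
    """Map vcpu number -> set of host CPUs from `virsh vcpupin` text."""
    pins = {}
    for line in text.splitlines():
        match = re.match(r"^\s*(\d+):\s*([\d,\-]+)\s*$", line)
        if match:
            pins[int(match.group(1))] = expand_cpulist(match.group(2))
    return pins


def parse_vcpuinfo(text):
    """Map vcpu number -> set of host CPUs from `virsh vcpuinfo` text."""
    info, current = {}, None
    for line in text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "VCPU":
            current = int(value)
        elif key == "CPU Affinity" and current is not None:
            info[current] = {i for i, c in enumerate(value) if c == "y"}
    return info


def find_mismatches(pins, info):
    """Return the vcpus whose pinning differs between the two outputs."""
    return sorted(v for v in pins if pins[v] != info.get(v))


if __name__ == "__main__":
    print(find_mismatches(parse_vcpupin(VCPUPIN_OUTPUT),
                          parse_vcpuinfo(VCPUINFO_OUTPUT)))
```

With the output from the original description, only vcpu 1 is reported, since vcpupin shows it pinned to CPU 1 while vcpuinfo still shows yyyy.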
Comment 17 Osier Yang 2012-10-14 23:54:01 EDT
The v2 patches are committed upstream.

https://www.redhat.com/archives/libvir-list/2012-October/msg00534.html

One thing that needs clarifying: the problem described in this bug is not the
whole story. The root problem is that <vcpu>, <vcpupin>, and <emulatorpin>
conflict with each other. The problems are:

Problem 1:

The doc shouldn't simply say "These settings are superseded
by CPU tuning." for element <vcpu>, because besides the tuning, <vcpu>
also specifies the current and maximum vcpu numbers. Apart from that,
<vcpu> also allows the placement to be specified as "auto", which binds
the domain process to the advisory nodeset from numad.

Problem 2:

The doc for <vcpu> says its "cpuset" specifies the physical CPUs
that the vcpus can be pinned to. But that's not true:
it actually only pins the domain process to the specified physical
CPUs. So it's either a documentation bug or a code bug.

Problem 3:

The doc for <vcpupin> says it supersedes "cpuset" of <vcpu>. That's
not quite correct, as each <vcpupin> specifies the pinning policy
for only one vcpu. What about the vcpus which don't have
<vcpupin> specified? The doc says such a vcpu will be pinned to all
available physical CPUs, but what's the meaning of attribute
"cpuset" of <vcpu> then?

Problem 4:

The doc for <emulatorpin> says it pins the emulator threads (called the
domain process in other contexts; perhaps a follow-up patch to
clean up the inconsistency is needed) to the physical CPUs
specified by its attribute "cpuset". This conflicts with
<vcpu>'s "cpuset". And in the underlying code, the affinity
of the domain process is actually set twice if both
"cpuset" of <vcpu> and <emulatorpin> are specified,
and <emulatorpin>'s pinning overrides <vcpu>'s.

Problem 5:

When "placement" of <vcpu> is "auto" (i.e. numad is used to
get the advisory nodeset to which the domain process is
pinned), it will also be overridden by <emulatorpin>.

It's hard to split these problems into separate bugs, as they
are tightly related to each other, especially for the documentation
patch. You can't explain the relationship well if they are
split up. And an overview doc patch keeps a good history for
anyone who wants to get the overall idea.

So I'd like to keep track of all these problems in this bug,
and change the bug title to a more general one:
  problems on CPU tuning


==========
The solutions:

1) Don't say <vcpu> is superseded by <cputune>.

2) Keep the semantics of "cpuset" of <vcpu> (i.e. still say it
   specifies the physical CPUs the virtual CPUs can be pinned to).
   But modify the doc to mention that it also sets the pinning policy
   for the domain process; that the CPU placement of the domain process
   specified by "cpuset" of <vcpu> will be ignored if <emulatorpin> is
   specified; and that, similarly, it will be ignored for a vcpu thread
   which has <vcpupin> specified. A vcpu which doesn't have
   <vcpupin> specified inherits "cpuset" of <vcpu>.

3) Don't say <vcpu> is superseded by <vcpupin>. If neither <vcpupin>
   nor "cpuset" of <vcpu> is specified, the vcpu will be pinned
   to all available pCPUs.

4) If neither <emulatorpin> nor "cpuset" of <vcpu> is specified,
   the domain process (emulator threads in the context) will be
   pinned to all available pCPUs.

5) If "placement" of <vcpu> is "auto", <emulatorpin> is not allowed.

6) Hotplugged vcpus will also inherit "cpuset" of <vcpu>.
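
To make the precedence in the list above concrete, here is a rough Python sketch. It is not taken from the patches; the function effective_pinning and its parameters are invented for illustration, CPU sets are plain Python sets, and numad-based "auto" placement is not modelled beyond rejecting <emulatorpin>.

```python
def effective_pinning(host_cpus, vcpu_count, vcpu_cpuset=None,
                      vcpupin=None, emulatorpin=None, placement="static"):
    """Resolve which host CPUs the emulator and each vcpu end up pinned to.

    host_cpus   -- number of physical CPUs on the host
    vcpu_count  -- number of online vcpus, including hotplugged ones
    vcpu_cpuset -- set from the "cpuset" attribute of <vcpu>, or None
    vcpupin     -- dict {vcpu id: set of CPUs} from <vcpupin>, or None
    emulatorpin -- set from <emulatorpin>, or None
    """
    all_cpus = set(range(host_cpus))
    vcpupin = vcpupin or {}

    # Rule 5: this sketch rejects the combination; an implementation
    # could just as well drop <emulatorpin> silently.
    if placement == "auto" and emulatorpin is not None:
        raise ValueError("<emulatorpin> is not allowed with placement='auto'")

    # Rules 2 and 4: <emulatorpin> wins over "cpuset" of <vcpu>,
    # which wins over "all available pCPUs".
    if emulatorpin is not None:
        emulator = set(emulatorpin)
    elif vcpu_cpuset is not None:
        emulator = set(vcpu_cpuset)
    else:
        emulator = all_cpus

    # Rules 2, 3 and 6: per-vcpu <vcpupin> wins; every other vcpu,
    # hotplugged or not, inherits "cpuset" of <vcpu>, else all pCPUs.
    vcpus = {}
    for vcpu in range(vcpu_count):
        if vcpu in vcpupin:
            vcpus[vcpu] = set(vcpupin[vcpu])
        elif vcpu_cpuset is not None:
            vcpus[vcpu] = set(vcpu_cpuset)
        else:
            vcpus[vcpu] = all_cpus
    return {"emulator": emulator, "vcpus": vcpus}


if __name__ == "__main__":
    # 4 host CPUs, cpuset='2' on <vcpu>, 3 online vcpus, vcpu 1 pinned to CPU 1
    print(effective_pinning(4, 3, vcpu_cpuset={2}, vcpupin={1: {1}}))
```

For example, with 4 host CPUs, cpuset='2' on <vcpu> and a <vcpupin> of CPU 1 for vcpu 1, the sketch yields CPU 2 for vcpus 0 and 2 and CPU 1 for vcpu 1.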
Comment 20 yuping zhang 2012-10-17 07:16:43 EDT
Verified this issue with:
libvirt-0.10.2-4.el6.x86_64
qemu-kvm-0.12.1.2-2.316.el6.x86_64

1.Patch [RHEL6.4 libvirt PATCH 1/7] doc: Sort out the relationship between <vcpu>, <vcpupin>, and <emulatorpin> ==============>PASS
Document was updated. http://libvirt.org/formatdomain.html
vcpu
    The content of this element defines the maximum number of virtual CPUs allocated for the guest OS, which must be between 1 and the maximum supported by the hypervisor. Since 0.4.4, this element can contain an optional cpuset attribute, which is a comma-separated list of physical CPU numbers that domain process and virtual CPUs can be pinned to by default. (NB: The pinning policy of domain process and virtual CPUs can be specified separately by cputune. If attribute emulatorpin of cputune is specified, cpuset specified by vcpu here will be ignored; Similarly, For virtual CPUs which has vcpupin specified, cpuset specified by cpuset here will be ignored; For virtual CPUs which doesn't have vcpupin specified, it will be pinned to the physical CPUs specified by cpuset here). 
....

If both cpuset and placement are not specified, or if placement is "static", but no cpuset is specified, the domain process will be pinned to all the available physical CPUs. 
...
vcpupin
    The optional vcpupin element specifies which of host's physical CPUs the domain VCPU will be pinned to. If this is omitted, and attribute cpuset of element vcpu is not specified, the vCPU is pinned to all the physical CPUs by default. It contains two required attributes, the attribute vcpu specifies vcpu id, and the attribute cpuset is same as attribute cpuset of element vcpu. (NB: Only qemu driver support) Since 0.9.0 
emulatorpin
    The optional emulatorpin element specifies which of host physical CPUs the "emulator", a subset of a domain not including vcpu, will be pinned to. If this is omitted, and attribute cpuset of element vcpu is not specified, "emulator" is pinned to all the physical CPUs by default. It contains one required attribute cpuset specifying which physical CPUs to pin to. NB, emulatorpin is not allowed if attribute placement of element vcpu is "auto". 

2.Patch [RHEL6.4 libvirt PATCH 2/7] conf: Ignore vcpupin for not onlined vcpus when parsing ============>Fail. The error that the patch is expected to report does not appear. Confirmed with Osier that it is an improvement. I will do more research on it and file a new bug if needed.
2.1 # virsh edit RHEL6.4
...
  <vcpu placement='static' current='1'>4</vcpu>
  <cputune>
   <vcpupin vcpu='3' cpuset='4'/>
  </cputune>
...
2.2 Save it. Then # virsh dumpxml RHEL6.4
...
  <vcpu placement='static' current='1'>4</vcpu>
  <cputune>
  </cputune>
...
2.3.Change the configuration like this:
#virsh edit RHEL6.4
...
  <vcpu placement='static' current='1'>4</vcpu>
  <cputune>
        <vcpupin vcpu='0' cpuset='4'/>
  </cputune>
...

2.4 Save successfully.
# virsh start RHEL6.4
error: Failed to start domain RHEL6.4
error: Unable to set cpuset.cpus: Invalid argument

On other machines it reports an out-of-range error instead. I will research this issue later.

3.Patch [RHEL6.4 libvirt PATCH 3/7] conf: Initialize the pinning policy for vcpus: ==================>PASS
# virsh dumpxml RHEL6.4
...
  <vcpu placement='static' cpuset='2' current='1'>4</vcpu>
...
# virsh start RHEL6.4
Domain RHEL6.4 started

# virsh vcpuinfo RHEL6.4
VCPU:           0
CPU:            2
State:          running
CPU time:       2.6s
CPU Affinity:   --y-

4.Patch qemu: Initialize cpuset for hotplugged vcpu as def->cpuset    =================>PASS
#virsh edit RHEL6.4
...
 <vcpu placement='static' cpuset='2' current='1'>4</vcpu>
..

# virsh setvcpus RHEL6.4 3

# virsh dumpxml RHEL6.4|grep current
  <currentMemory unit='KiB'>1048576</currentMemory>
  <vcpu placement='static' cpuset='2' current='3'>4</vcpu>

# virsh vcpupin RHEL6.4 1 1

# virsh vcpuinfo RHEL6.4
VCPU:           0
CPU:            2
State:          running
CPU time:       23.0s
CPU Affinity:   --y-

VCPU:           1
CPU:            1
State:          running
CPU time:       0.3s
CPU Affinity:   -y--

VCPU:           2
CPU:            2
State:          running
CPU time:       0.5s
CPU Affinity:   --y-


5.Patch [RHEL6.4 libvirt PATCH 6/7] conf: Ignore emulatorpin if vcpu placement is auto ====================>PASS
#virsh edit RHEL6.4
Change the placement to "auto",and add <emulatorpin cpuset="1-3"/> like this:

  <vcpu placement='auto' current='3'>4</vcpu>
  <cputune>
    <vcpupin vcpu="1" cpuset="0,1"/>
    <vcpupin vcpu="2" cpuset="2,3"/>
    <emulatorpin cpuset="1-3"/>
    <shares>2048</shares>
    <period>1000000</period>
    <quota>-1</quota>
    <emulator_period>1000000</emulator_period>
    <emulator_quota>-1</emulator_quota>
   </cputune>

Then save.
#virsh dumpxml RHEL6.4
...
  <vcpu placement='auto' current='3'>4</vcpu>
  <cputune>
    <shares>2048</shares>
    <period>1000000</period>
    <quota>-1</quota>
    <emulator_period>1000000</emulator_period>
    <emulator_quota>-1</emulator_quota>
    <vcpupin vcpu='1' cpuset='0-1'/>
    <vcpupin vcpu='2' cpuset='2-3'/>
  </cputune>
  <numatune>
    <memory mode='strict' placement='auto'/>
  </numatune>
...
emulatorpin is removed when placement is auto.
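
The two parsing behaviours observed in steps 2 and 5 (a <vcpupin> for a vcpu that is not online is dropped, and <emulatorpin> is dropped when placement is "auto") can be mimicked with a few lines of Python. This is only an illustration built on the standard xml.etree module, not libvirt's parser. The function name normalize_cputune is made up, the XML is wrapped in a bare <domain> element for the sake of the example, and "not online" is assumed to mean a vcpu id greater than or equal to the current attribute.

```python
import xml.etree.ElementTree as ET

STEP5_XML = """\
<domain>
  <vcpu placement='auto' current='3'>4</vcpu>
  <cputune>
    <vcpupin vcpu="1" cpuset="0,1"/>
    <vcpupin vcpu="2" cpuset="2,3"/>
    <emulatorpin cpuset="1-3"/>
    <shares>2048</shares>
  </cputune>
</domain>
"""


def normalize_cputune(domain_xml):
    """Return the parsed domain with ignored pinning elements removed."""
    root = ET.fromstring(domain_xml)
    vcpu = root.find("vcpu")
    cputune = root.find("cputune")
    if vcpu is None or cputune is None:
        return root

    maximum = int(vcpu.text)
    # Without a "current" attribute all vcpus are assumed to be online.
    current = int(vcpu.get("current", maximum))

    # Drop <vcpupin> entries that refer to vcpus which are not online.
    for pin in list(cputune.findall("vcpupin")):
        if int(pin.get("vcpu")) >= current:
            cputune.remove(pin)

    # Drop <emulatorpin> when the placement is decided by numad.
    if vcpu.get("placement") == "auto":
        for pin in list(cputune.findall("emulatorpin")):
            cputune.remove(pin)
    return root


if __name__ == "__main__":
    print(ET.tostring(normalize_cputune(STEP5_XML), encoding="unicode"))
```

Run against the XML from step 5, the sketch keeps both <vcpupin> elements (vcpus 1 and 2 are online with current='3') and removes <emulatorpin>, which matches the dumpxml output above.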

6.Patch [RHEL6.4 libvirt PATCH 7/7] qemu: Ignore def->cpumask if emulatorpin is specified
6.1 Only set cpuset in <vcpu> element 
#virsh edit RHEL6.4
...
 <vcpu placement='static' cpuset='1'>4</vcpu>
...
6.2 #virsh start RHEL6.4
6.3 # taskset -c -p 84588
pid 84588's current affinity list: 0-159
6.4 #cat /proc/84588/status
...
Cpus_allowed:	ffffffff,ffffffff,ffffffff,ffffffff,ffffffff
Cpus_allowed_list:	0-159
...
==============================================> Fail. According to this patch, it should be 1. Confirmed with Osier; I will file a new bug.
6.5 Set cpuset in <vcpu> and <emulatorpin>
#virsh edit RHEL6.4
...
<vcpu placement='static' current='3' cpuset='0'>4</vcpu>
  <cputune>
        <emulatorpin cpuset="1-3"/>
  </cputune>
...

#cat /proc/87927/status
...
Cpus_allowed_list:	1-3
...
# taskset -c -p 87927
pid 87927's current affinity list: 1-3

According to the above results, I will change the bug status to VERIFIED and file new bugs if needed.
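
The check in step 6 can also be scripted. The Python sketch below decodes the hexadecimal Cpus_allowed mask from the text of /proc/<pid>/status and prints it in the same compact form as Cpus_allowed_list. The function names are made up for illustration, and the sample text is the status output pasted in step 6.4.

```python
STATUS_TEXT = (
    "Cpus_allowed:\tffffffff,ffffffff,ffffffff,ffffffff,ffffffff\n"
    "Cpus_allowed_list:\t0-159\n"
)


def cpus_allowed(status_text):
    """Decode the Cpus_allowed hex mask into a set of CPU numbers."""
    for line in status_text.splitlines():
        if line.startswith("Cpus_allowed:"):
            # The mask is printed as comma-separated 32-bit hex words,
            # most significant word first.
            mask = int(line.split(":", 1)[1].strip().replace(",", ""), 16)
            return {cpu for cpu in range(mask.bit_length())
                    if (mask >> cpu) & 1}
    raise ValueError("no Cpus_allowed line found")


def format_cpulist(cpus):
    """Format a set of CPUs compactly, e.g. {1, 2, 3, 5} -> "1-3,5"."""
    runs, run = [], []
    for cpu in sorted(cpus):
        if run and cpu == run[-1] + 1:
            run.append(cpu)
        else:
            if run:
                runs.append(run)
            run = [cpu]
    if run:
        runs.append(run)
    return ",".join(str(r[0]) if len(r) == 1 else "%d-%d" % (r[0], r[-1])
                    for r in runs)


if __name__ == "__main__":
    print(format_cpulist(cpus_allowed(STATUS_TEXT)))
```

For the mask in step 6.4 (five words of ffffffff) this gives 0-159, i.e. the process is allowed on every CPU rather than only on CPU 1 as cpuset='1' would suggest.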
Comment 21 yuping zhang 2012-12-12 05:58:22 EST
Filed Bug 867699 to track the issue in step 2 of the comment above. It is not a bug, because the error message is displayed in libvirtd.log rather than in the command output.
Filed Bug 867372 to track the issue in step 6.4 mentioned above.
Comment 22 errata-xmlrpc 2013-02-21 02:23:13 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0276.html
