Bug 1249441

Summary: cpu-stats returns error messages with --start <number> (number >=32)
Product: Red Hat Enterprise Linux 7
Reporter: Dan Zheng <dzheng>
Component: libvirt
Assignee: Andrea Bolognani <abologna>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 7.2
CC: abologna, bugproxy, dyuan, dzheng, gsun, hannsj_uhl, jtomko, mzhan, rbalakri
Target Milestone: rc
Keywords: Patch
Target Release: 7.3
Hardware: ppc64le
OS: Linux
Whiteboard:
Fixed In Version: libvirt-1.3.3-1.el7
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-11-03 18:21:48 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1230910, 1288337, 1359843

Description Dan Zheng 2015-08-03 02:02:26 UTC
Description of problem:
cpu-stats returns error messages with --start <number> when <number> is equal to or larger than 32. The start_cpu value is calculated incorrectly.



Version-Release number of selected component (if applicable):
libvirt-1.2.17-2.el7.ppc64le
kernel-3.10.0-292.el7.ppc64le
qemu-kvm-rhev-2.3.0-9.el7.ppc64le

How reproducible:
100%

Steps to Reproduce:
1. Start a guest
# virsh list --all
 Id    Name                           State
----------------------------------------------------
 10    dzhengvm2                      running


2. Get CPU statistics using the --start option

# virsh cpu-stats --domain dzhengvm2 --start 32
CPU80:
	cpu_time             0.363755140 seconds
 	vcpu_time            0.275071178 seconds
CPU81:
	cpu_time             0.000000000 seconds
 	vcpu_time            0.000000000 seconds
CPU82:
	cpu_time             0.000000000 seconds
	vcpu_time            0.000000000 seconds
CPU83:
	cpu_time             0.000000000 seconds
	vcpu_time            0.000000000 seconds
CPU84:
	cpu_time             0.000000000 seconds
	vcpu_time            0.000000000 seconds
...
CPU158:
	cpu_time             0.000000000 seconds
	vcpu_time            0.000000000 seconds
CPU159:
	cpu_time             0.000000000 seconds
	vcpu_time            0.000000000 seconds
error: Failed to retrieve CPU statistics for domain 'dzhengvm2'
error: invalid argument: start_cpu 160 larger than maximum of 159

# virsh cpu-stats --domain dzhengvm2 --start 152
CPU152:
	cpu_time             0.073570788 seconds
	vcpu_time            0.033277032 seconds
...
CPU159:
	cpu_time             0.000000000 seconds
	vcpu_time            0.000000000 seconds
error: Failed to retrieve CPU statistics for domain 'dzhengvm2'
error: invalid argument: start_cpu 280 larger than maximum of 159


Actual results:
Error messages are printed, and they show an incorrectly calculated start_cpu.
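
The reported numbers can be reproduced with a small model. This is only a sketch, assuming that statistics are requested in batches of at most 128 CPUs and that the remaining count starts at the total number of CPUs regardless of --start. Both assumptions are inferred from the errors above (32 + 128 = 160 and 152 + 128 = 280), not taken from the virsh source. The function names are hypothetical.

```python
BATCH = 128  # assumed per-request limit, inferred from 32+128=160 and 152+128=280


def buggy_requests(start, max_id, batch=BATCH):
    """List the (start_cpu, ncpus) requests issued when the remaining
    count is initialised to max_id instead of max_id - start."""
    remaining = max_id  # not reduced by `start`: the suspected miscalculation
    cpu = start
    requests = []
    while remaining > 0:
        ncpus = min(remaining, batch)
        requests.append((cpu, ncpus))
        cpu += ncpus
        remaining -= ncpus
    return requests


def first_out_of_range(start, max_id):
    """Return the first start_cpu that exceeds the last valid CPU id, or None."""
    for cpu, _ in buggy_requests(start, max_id):
        if cpu >= max_id:
            return cpu
    return None


if __name__ == "__main__":
    for start in (0, 31, 32, 152):
        print(start, first_out_of_range(start, 160))
```

Under these assumptions, a host with 160 CPUs only produces an out-of-range request when --start is 32 or larger, because 160 - 128 = 32. That matches the threshold in the summary.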


Expected results:
No error messages should be printed when a valid --start value is provided.

Additional info:
# virsh cpu-stats dzhengvm2

CPU0:
	cpu_time             0.128380214 seconds
	vcpu_time            0.123251830 seconds
CPU1:
	cpu_time             0.000000000 seconds
	vcpu_time            0.000000000 seconds
CPU2:
	cpu_time             0.000000000 seconds
	vcpu_time            0.000000000 seconds
CPU3:
	cpu_time             0.000000000 seconds
	vcpu_time            0.000000000 seconds
CPU4:
	cpu_time             0.000000000 seconds
...
CPU158:
	cpu_time             0.000000000 seconds
	vcpu_time            0.000000000 seconds
CPU159:
	cpu_time             0.000000000 seconds
	vcpu_time            0.000000000 seconds
Total:
	cpu_time            34.505515446 seconds
	user_time            2.520000000 seconds
	system_time          3.210000000 seconds
# ppc64_cpu --info
Core   0:    0*    1     2     3     4     5     6     7  
Core   1:    8*    9    10    11    12    13    14    15  
Core   2:   16*   17    18    19    20    21    22    23  
Core   3:   24*   25    26    27    28    29    30    31  
Core   4:   32*   33    34    35    36    37    38    39  
Core   5:   40*   41    42    43    44    45    46    47  
Core   6:   48*   49    50    51    52    53    54    55  
Core   7:   56*   57    58    59    60    61    62    63  
Core   8:   64*   65    66    67    68    69    70    71  
Core   9:   72*   73    74    75    76    77    78    79  
Core  10:   80*   81    82    83    84    85    86    87  
Core  11:   88*   89    90    91    92    93    94    95  
Core  12:   96*   97    98    99   100   101   102   103  
Core  13:  104*  105   106   107   108   109   110   111  
Core  14:  112*  113   114   115   116   117   118   119  
Core  15:  120*  121   122   123   124   125   126   127  
Core  16:  128*  129   130   131   132   133   134   135  
Core  17:  136*  137   138   139   140   141   142   143  
Core  18:  144*  145   146   147   148   149   150   151  
Core  19:  152*  153   154   155   156   157   158   159  

# uname -r
3.10.0-292.el7.ppc64le

# cat /sys/devices/system/cpu/present
0-159

# ppc64_cpu --cores-on
Number of cores online = 20

#  ppc64_cpu --cores-present
Number of cores present = 20

Comment 3 IBM Bug Proxy 2016-03-29 03:50:34 UTC
------- Comment From niteshkonkar.com 2016-03-28 23:49 EDT-------
Hello All,

I have written a patch for it. Will confirm the approach once and then send it to community for review.

#virsh cpu-stats --domain 40 --start 157
CPU157:

#

Thanks.

Comment 4 Hanns-Joachim Uhl 2016-03-29 08:30:01 UTC
(In reply to IBM Bug Proxy from comment #3)
> ------- Comment From niteshkonkar.com 2016-03-28 23:49 EDT-------
> Hello All,
> 
> I have written a patch for it. Will confirm the approach once and then send
> it to community for review.
> 
> #virsh cpu-stats --domain 40 --start 157
> CPU157:
> 
> #
> 
> Thanks.
.
oops .. the above comment has to read:
"
--- Comment #6 from Nitesh Konkar <niteshkonkar.com> ---
Hello All,

I have written a patch for it. Will confirm the approach once and then send it
to community for review. 

#virsh cpu-stats --domain 40 --start 157
CPU157:
    cpu_time             0.000000000 seconds
    vcpu_time            0.000000000 seconds
CPU158:
    cpu_time             0.000000000 seconds
    vcpu_time            0.000000000 seconds
CPU159:
    cpu_time             0.000000000 seconds
    vcpu_time            0.000000000 seconds

# 



Thanks.
"
...

Comment 5 Ján Tomko 2016-04-01 09:31:20 UTC
Proposed upstream patch:
https://www.redhat.com/archives/libvir-list/2016-April/msg00005.html

Comment 6 Ján Tomko 2016-04-01 09:37:26 UTC
Pushed as:
commit d9a0a885e2b1cf3c9fc5260f9cdf4fc8a768f26c
Author:     Nitesh Konkar <niteshkonkar.libvirt>
AuthorDate: 2016-04-01 02:05:04 -0400
Commit:     Ján Tomko <jtomko>
CommitDate: 2016-04-01 11:36:04 +0200

    Pass the correct cpu count when calling virDomainGetCPUStats.
    
    When using the --start option, the show_count should not be set to
    max_id as the --start <cpu> means we dont need those many initial cpu
    stats. Hence, show_count should be adjusted accordingly.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1249441
    
    Signed-off-by: Nitesh Konkar <nitkon12.ibm.com>
    Signed-off-by: Ján Tomko <jtomko>

git describe: v1.3.3-rc2-3-gd9a0a88
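
The adjustment described in the commit message can be sketched as follows. This is an illustration only, not the code from the commit: the names show_count and max_id come from the commit message, while the function name and the batch size of 128 are assumptions.

```python
BATCH = 128  # assumed per-request limit; not stated in the commit message


def fixed_requests(start, max_id, batch=BATCH):
    """List the (start_cpu, ncpus) requests issued once show_count
    accounts for the CPUs skipped by --start."""
    show_count = max_id - start  # adjusted for --start, as the commit describes
    cpu = start
    requests = []
    while show_count > 0:
        ncpus = min(show_count, batch)
        requests.append((cpu, ncpus))
        cpu += ncpus
        show_count -= ncpus
    return requests


if __name__ == "__main__":
    for start in (0, 32, 152, 157, 160):
        print(start, fixed_requests(start, 160))
```

In this model every request stays within CPU ids 0 to 159. With start equal to max_id the loop issues no request at all, so nothing is printed and no error is raised.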

Comment 8 Dan Zheng 2016-04-14 09:45:25 UTC
test packages:
libvirt-1.3.3-1.el7.ppc64le
qemu-kvm-rhev-2.5.0-4.el7.ppc64le
kernel-3.10.0-327.8.1.el7.ppc64le

Cases:
# virsh cpu-stats --domain gsun-test1
CPU0:
	cpu_time             0.000000000 seconds
	vcpu_time            0.000000000 seconds
...

CPU157:
	cpu_time             0.000000000 seconds
	vcpu_time            0.000000000 seconds
CPU158:
	cpu_time             0.000000000 seconds
	vcpu_time            0.000000000 seconds
CPU159:
	cpu_time             0.000000000 seconds
	vcpu_time            0.000000000 seconds
Total:
	cpu_time            21.437286628 seconds
	user_time            1.540000000 seconds
	system_time          1.290000000 seconds

*********************
Case1:  PASS
# virsh cpu-stats --domain gsun-test1 --start 32
CPU32:
	cpu_time             0.000000000 seconds
	vcpu_time            0.000000000 seconds
...
CPU159:
	cpu_time             0.000000000 seconds
	vcpu_time            0.000000000 seconds

Case2:  Fail
# virsh cpu-stats --domain gsun-test1 --start 160
<no output>

'159' is the maximum cpu id. When 160 is specified for --start, there should be an error message to show something like 'invalid cpu ...'.

Comment 9 Dan Zheng 2016-04-15 02:02:34 UTC
Hi Andrea,
What do you think of this?

Comment 10 IBM Bug Proxy 2016-04-15 07:40:38 UTC
------- Comment From niteshkonkar.com 2016-04-15 03:32 EDT-------
I have sent a patch for review.

After the patch:-

# virsh cpu-stats --domain 40 --start 159

# virsh cpu-stats --domain 40 --start 160
Start cpu 160 larger than maximum of 159.

Nitesh Konkar.

Comment 11 Andrea Bolognani 2016-04-15 13:22:19 UTC
This has now been fixed upstream.

commit 0ed35e0939c8ee2c38dbb4d67233e864499287ee
Author: Nitesh Konkar <niteshkonkar.libvirt>
Date:   Fri Apr 15 03:28:53 2016 -0400

    Return error when --start <number> in cpu-stats is invalid.
    
    Signed-off-by: Nitesh Konkar <nitkon12.ibm.com>

v1.3.3-163-g0ed35e0

Comment 12 Dan Zheng 2016-05-16 09:57:08 UTC
Test package:
libvirt-1.3.4-1.el7.ppc64le
qemu-kvm-rhev-2.5.0-4.el7.ppc64le
kernel-3.10.0-327.13.1.el7.ppc64le

# virsh list --all
 Id    Name                           State
----------------------------------------------------
 2     avocado-vt-vm1                 running

#  virsh cpu-stats 2
CPU0:
	cpu_time             2.470659226 seconds
	vcpu_time            2.389113024 seconds
...

CPU79:
	cpu_time             0.000000000 seconds
	vcpu_time            0.000000000 seconds
Total:
	cpu_time            37.949896658 seconds
	user_time            2.100000000 seconds
	system_time          2.350000000 seconds

#  virsh cpu-stats 2 --start 160
error: Start CPU 160 is out of range (min: 0, max: 79)

#  virsh cpu-stats 2 --start 78 --count 3
CPU78:
	cpu_time             0.000000000 seconds
	vcpu_time            0.000000000 seconds
CPU79:
	cpu_time             0.000000000 seconds
	vcpu_time            0.000000000 seconds

#  virsh cpu-stats 2 --start 78 --count -1
error: Invalid value for number of CPUs to show

[root@ibm-p8-rhevm-17 test]#  virsh cpu-stats 2 --start -78 --count -1
error: Invalid value for start CPU

#  virsh cpu-stats 2 --start 60 --count 0
<no output>

All of the tests above passed, so I am setting this bug to VERIFIED.
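
The behaviour verified above can be summarised in a short model. This is a sketch under assumptions: the function name and the order of the checks are hypothetical and are chosen only so that the results agree with the commands in this comment. The error strings are copied from the output above.

```python
def select_cpus(start, count, max_id):
    """Return the CPU ids that would be shown.

    count=None models a command line without --count.
    max_id is the number of CPUs, so the last valid id is max_id - 1.
    """
    if start < 0:
        raise ValueError("Invalid value for start CPU")
    if count is not None and count < 0:
        raise ValueError("Invalid value for number of CPUs to show")
    if start >= max_id:
        raise ValueError(
            f"Start CPU {start} is out of range (min: 0, max: {max_id - 1})"
        )
    available = max_id - start
    shown = available if count is None else min(count, available)
    return list(range(start, start + shown))


if __name__ == "__main__":
    print(select_cpus(78, 3, 80))  # count is clamped to the CPUs that exist
    print(select_cpus(60, 0, 80))  # a count of 0 selects nothing
```

The start check comes first in this model because `--start -78 --count -1` reported the start CPU error rather than the count error.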

Comment 14 errata-xmlrpc 2016-11-03 18:21:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-2577.html