Bug 448943

Summary: HTS 5.2 -13 cpu-scaling is failing with very large delta
Product: [Retired] Red Hat Hardware Certification Program
Reporter: Jeff Svatek <jeffrey.svatek>
Component: Test Suite (tests)
Assignee: Greg Nichols <gnichols>
Status: CLOSED DUPLICATE
QA Contact: Lawrence Lim <llim>
Severity: medium
Priority: low
Version: 5
CC: akarlsso, bxu, dwa, gregg.shick, nagananda.chumbalkar, rlandry, ryan.armstrong, tao, tools-bugs, ykun
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2008-10-01 19:56:38 UTC
Attachments:
Description                                                   Flags
hts-log-package                                               none
DL320G5 results 1                                             none
DL360G5 Results For Two Dual Core                             none
DL360G5 Results For Single Dual Core                          none
quad core failure first run attempt                           none
quad core cpuscaling pass, same system that failed attempt 1  none
affected cpu output on baremetal                              none
affected cpu output on xen                                    none
cpuinfo baremetal                                             none
cpuinfo xen                                                   none
Bare metal test results on an Intel SDV                       none
Xen kernel test results on Intel SDV                          none
cpuinfo from intel sdv in baremetal kernel                    none
cpuinfo from intel sdv in xen kernel                          none

Description Jeff Svatek 2008-05-29 16:23:23 UTC
Description of problem:
The cpuscaling test is giving an extremely large delta for the On Demand 
governor test.

Version-Release number of selected component (if applicable):
dt-15.14-2.EL5.x86_64.rpm
hts-5.2-13.el5.noarch.rpm
lmbench-3.0a7-6.EL5.x86_64.rpm
stress-0.18.8-1.3.EL5.x86_64.rpm
RHEL 5.2 x64 GA

I'm working on an HP ProLiant DL360 G5 w/ 3.16GHz Intel Harpertown processors.

How reproducible:
I get around 57-58% each time.

Steps to Reproduce:
1. Run HTS cpuscaling



  
Actual results:
Running ./cpuscaling.py:
Note: decimal module from python 2.5 copied by HTS

System Capabilites:
-------------------------------------------------
System has 8 cpus

Supported CPU Frequencies: 
    3165 MHz
    1999 MHz


Supported Governors: 
    ondemand
    userspace
    performance

Current governors:
    cpu0: ondemand
    cpu1: ondemand
    cpu2: ondemand
    cpu3: ondemand
    cpu4: ondemand
    cpu5: ondemand
    cpu6: ondemand
    cpu7: ondemand

On Userspace Governor Test:
-------------------------------------------------
Setting governor to userspace
Setting cpu frequency to 1999 MHz
Running CPU load test...
try 1 - 21.17 sec
comparing 0.00
Minumum frequency load test time: 21.17
Setting cpu frequency to 3165 MHz
Running CPU load test...
try 1 - 13.37 sec
comparing 0.00
Maximum frequency load test time: 13.37

CPU Frequency Speed Up: 1.58
Measured Speed Up: 1.58
Percentage Difference 0.0%

On Demand Governor Test:
-------------------------------------------------
Setting governor to ondemand
Waiting 5 seconds... done.
Running CPU load test...
try 1 - 21.18 sec
comparing 0.00
On Demand load test time: 21.18
Percentage Difference vs. maximum frequency: 58.4%
Error: on demand performance is not within 5% margin

Performance Governor Test:
-------------------------------------------------
Setting governor to performance
Running CPU load test...
try 1 - 13.37 sec
comparing 0.00
Performace load test time: 13.37
Percentage Difference vs. maximum frequency: 0.0%

Summary:
-------------------------------------------------
Load Test Times:
    Minimum:     21.17
    Maximum:     13.37
    On Demand:   21.18
    Performance: 13.37
Margins:
   Speed Up:    0.0%
   On Demand:   58.4%

Restoring original governor to ondemand
...finished running ./cpuscaling.py, exit code=1
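
For reference, the percentages printed in this log are consistent with the arithmetic sketched below. This is a minimal sketch and not the code in cpuscaling.py: the function names, the exact formula, and the default 5% margin are assumptions inferred from the numbers and the error message shown above.

```python
def percent_slower(test_time, max_freq_time):
    """Percentage by which test_time exceeds the maximum-frequency time."""
    return (test_time - max_freq_time) / max_freq_time * 100.0


def within_margin(test_time, max_freq_time, margin_percent=5.0):
    """True if the run is at most margin_percent slower than the max-frequency run."""
    return percent_slower(test_time, max_freq_time) <= margin_percent


# Load test times taken from the log above, in seconds.
max_time = 13.37
ondemand_time = 21.18

print("%.1f%%" % percent_slower(ondemand_time, max_time))  # 58.4%
print(within_margin(ondemand_time, max_time))              # False
```

Under this reading, the on demand run took almost exactly as long as the run pinned to 1999 MHz, which is why the margin check fails.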




Expected results:
< 5% or 10%

Additional info:

Comment 1 Jeff Svatek 2008-05-29 16:23:23 UTC
Created attachment 307102 [details]
hts-log-package

Comment 2 Rob Landry 2008-05-29 17:11:51 UTC
David,

Can you work with 

Comment 3 Jeff Svatek 2008-06-04 21:38:02 UTC
Created attachment 308399 [details]
DL320G5 results 1

Comment 4 Jeff Svatek 2008-06-04 21:38:30 UTC
See attachment above for "DL320G5 results 1":

Here are the cpuscaling results from the DL320G5p system with one Yorkfield 
X3360 2.83GHz Quad Core, RHEL5x86U2 & HTS5.2-13.
The On Demand Governor Test fails because the on demand performance is not 
within the 5% margin; the actual difference was 41.4%.


Comment 5 David Aquilina 2008-06-04 21:44:31 UTC
Jeff, can you please check the BIOS on the system for the Power Control setting?
Was it set to OS Control, or one of the HP-specific options? 

Thanks, 
David

Comment 6 Jeff Svatek 2008-06-05 22:49:03 UTC
The server was set to OS Control in RBSU.  The test complains about the cpu and
will not run the test if this setting is not enabled.

Comment 7 David Aquilina 2008-06-11 22:00:44 UTC
I ran the test on a DL360 G5 (for Red Hat folks,
hp-dl360g5-01.rhts.bos.redhat.com) and saw the following results. As with the
tests run at HP, the test is /slower/ using the on-demand governor compared to
statically setting the CPU to its lowest speed: 

System Capabilites:
-------------------------------------------------
System has 8 cpus

Supported CPU Frequencies: 
    1865 MHz
    1599 MHz


Supported Governors: 
    ondemand
    userspace
    performance

Current governors:
    cpu0: userspace
    cpu1: userspace
    cpu2: userspace
    cpu3: userspace
    cpu4: userspace
    cpu5: userspace
    cpu6: userspace
    cpu7: userspace

On Userspace Governor Test:
-------------------------------------------------
Setting governor to userspace
Setting cpu frequency to 1599 MHz
Running CPU load test...
try 1 - 28.39 sec
comparing 0.00
Minumum frequency load test time: 28.39
Setting cpu frequency to 1865 MHz
Running CPU load test...
try 1 - 24.35 sec
comparing 0.00
Maximum frequency load test time: 24.35

CPU Frequency Speed Up: 1.17
Measured Speed Up: 1.17
Percentage Difference -0.0%

On Demand Governor Test:
-------------------------------------------------
Setting governor to ondemand
Waiting 5 seconds... done.
Running CPU load test...
try 1 - 28.29 sec
comparing 0.00
On Demand load test time: 28.29
Percentage Difference vs. maximum frequency: 16.2%
Error: on demand performance is not within 5% margin

Performance Governor Test:
-------------------------------------------------
Setting governor to performance
Running CPU load test...
try 1 - 24.37 sec
comparing 0.00
Performace load test time: 24.37
Percentage Difference vs. maximum frequency: 0.1%

Summary:
-------------------------------------------------
Load Test Times:
    Minimum:     28.39
    Maximum:     24.35
    On Demand:   28.29
    Performance: 24.37
Margins:
   Speed Up:    -0.0%
   On Demand:   16.2%
   Performance: 0.1%


Comment 8 David Aquilina 2008-06-11 22:02:10 UTC
Oops, ignore the previous comment about it being slower; I was looking at the
wrong line. Still, the difference is only 0.1 second. 

Comment 9 David Aquilina 2008-06-11 22:19:33 UTC
With the system set to the ondemand governor, I ran the core test while watching
/proc/cpuinfo and could see the CPUs scale up as the test runs. If I stop or
suspend the test, the CPUs scale back down.  

Rob, would you be willing to take a look at the system and offer your opinion? 
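
The manual check described above (watching /proc/cpuinfo while the test runs) could be scripted roughly as follows. This is only a sketch: the function names are hypothetical, and it assumes the x86 /proc/cpuinfo layout with "processor" and "cpu MHz" fields.

```python
def cpu_mhz(cpuinfo_text):
    """Return {processor number: reported MHz} parsed from /proc/cpuinfo text."""
    speeds = {}
    current = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator line between processors
        key = key.strip()
        value = value.strip()
        if key == "processor":
            current = int(value)
        elif key == "cpu MHz" and current is not None:
            speeds[current] = float(value)
    return speeds


def read_cpu_mhz(path="/proc/cpuinfo"):
    """Read the file at path and return the per-processor MHz mapping."""
    with open(path) as f:
        return cpu_mhz(f.read())
```

Calling read_cpu_mhz() once a second while the load test runs, and again after it is stopped, would show whether the reported frequency rises and falls as described.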

Comment 12 Rob Landry 2008-06-12 05:02:43 UTC
In manual testing it appears the box does not respond properly.  At the slowest
setting I can set a baseline performance which is clearly bested by the
performance setting (this is good).  Unfortunately the on-demand setting, which
does increase the reported speed on the impacted CPU to the maximum, does not
match or approach the performance setting but instead compares to the baseline,
as was reported by HTS.  

Discussing with dwa, we want to continue to investigate this a bit further to
verify if this is a specific CPU issue, a multi-core issue, something more
system specific or otherwise.

Comment 13 Ryan Armstrong 2008-06-17 17:23:46 UTC
Created attachment 309640 [details]
DL360G5 Results For Two Dual Core

Comment 14 Ryan Armstrong 2008-06-17 17:26:33 UTC
See attachment above for "DL360G5 Results For Two Dual Core":

Here are the cpuscaling results from the DL360G5 system with two Wolfdale 
X5260 3.33GHz Dual Core, RHEL5x64U2 & HTS5.2-13.
The On Demand Governor Test fails because the on demand performance is not 
within the 5% margin; the actual difference was 65.9%.



Comment 15 Ryan Armstrong 2008-06-17 17:29:21 UTC
Created attachment 309641 [details]
DL360G5 Results For Single Dual Core

Comment 16 Ryan Armstrong 2008-06-17 17:30:25 UTC
See attachment above for "DL360G5 Results For Single Dual Core":

Here are the cpuscaling results from the DL360G5 system with a single Wolfdale 
X5260 3.33GHz Dual Core, RHEL5x64U2 & HTS5.2-13.
The On Demand Governor Test fails because the on demand performance is not 
within the 5% margin; the actual difference was 65.7%.

Comment 17 David Aquilina 2008-07-09 19:36:22 UTC
We've tested this on an Intel SDV (DQ35JO motherboard with Yorkfield quad-core
45nm processor, 2GB RAM) and the cpuscaling test returns a successful result for
both the Xen and bare-metal kernels. 

For the guts of the cpuscaling test, you can take a look at
/usr/share/hts/tests/cpuscaling/cpuscaling.py. The actual work that's being
timed is simply calculating pi: 

    def pi(self):
        decimal.getcontext().prec = 500
        s = decimal.Decimal(1)
        h = decimal.Decimal(3).sqrt()/2
        n = 6
        for i in range(170):
            A = n*h*s/2  # A ... area of polygon
            s2 = ((1-h)**2+s**2/4)
            s = s2.sqrt()
            h = (1-s2/4).sqrt()
            n = 2*n
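
To try the same workload outside HTS, the excerpt above can be wrapped in a standalone function and timed. This is a sketch written for Python 3, not the HTS code: the function name, its parameters, the use of a local decimal context, and the timing wrapper are all assumptions added for illustration.

```python
import decimal
import time


def polygon_pi(doublings=170, precision=500):
    """Approximate pi as the area of a regular polygon inscribed in a unit circle.

    Starts from a hexagon and doubles the number of sides on every pass.
    Returns None if doublings is 0.
    """
    with decimal.localcontext() as ctx:
        ctx.prec = precision
        s = decimal.Decimal(1)             # side length of the hexagon
        h = decimal.Decimal(3).sqrt() / 2  # distance from the centre to a side
        n = 6
        area = None
        for _ in range(doublings):
            area = n * h * s / 2            # area of the current n-sided polygon
            s2 = (1 - h) ** 2 + s ** 2 / 4  # squared side length after doubling n
            s = s2.sqrt()
            h = (1 - s2 / 4).sqrt()
            n = 2 * n
        return area


start = time.perf_counter()
approx = polygon_pi()
elapsed = time.perf_counter() - start
print("%s... computed in %.3f sec" % (str(approx)[:12], elapsed))
```

The work is purely CPU-bound and single-threaded, so its runtime should track the clock speed of whichever core it runs on.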



Comment 19 Gregg Shick 2008-07-15 20:11:40 UTC
David

I guess my next question would be, how is an acceptable delta determined?  If I
can do a certain amount of work at speed x, then the proc gets stepped up to
speed x+500 and I do more work, how is it determined how much more work should
the system be capable of?



Comment 20 David Aquilina 2008-07-15 20:39:10 UTC
(In reply to comment #19)
> I guess my next question would be, how is an acceptable delta determined?  If I
> can do a certain amount of work at speed x, then the proc gets stepped up to
> speed x+500 and I do more work, how is it determined how much more work should
> the system be capable of?

It's based on the percentage speedup. If the frequency increases by 100%, we
expect the workload to run approximately 100% faster, that is, to complete in
about half the time (with a 10% margin allowed). 

The Intel SDV we tested supported 1.99 GHz and 2.6 GHz. Setting the CPU
frequency to its lowest speed resulted in a runtime of 21.17 seconds; setting
the CPU to its fastest speed resulted in a 15.87 second runtime. 

The On Demand governor test resulted in a runtime of 15.86 seconds. This is the
behavior we expect to see with the on demand governor. 
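
As a worked example of the reasoning above, using the SDV numbers: if runtime scales inversely with clock speed, the minimum-frequency runtime predicts the maximum-frequency runtime. The sketch below is an illustration only; the function names are hypothetical, and the inverse-scaling model is an assumption rather than the exact check performed by cpuscaling.py.

```python
def predicted_time(min_freq_time, min_freq_mhz, max_freq_mhz):
    """Runtime expected at max frequency if runtime scales inversely with clock speed."""
    return min_freq_time * min_freq_mhz / max_freq_mhz


def deviation_percent(measured, predicted):
    """Signed percentage by which the measured runtime differs from the prediction."""
    return (measured - predicted) / predicted * 100.0


# SDV figures quoted above: 1.99 GHz and 2.6 GHz, 21.17 s and 15.87 s.
expected = predicted_time(21.17, 1990, 2600)
print("predicted %.2f sec" % expected)
print("measured deviates by %.1f%%" % deviation_percent(15.87, expected))
```

With these figures the prediction is roughly 16.2 seconds, so the measured 15.87 seconds falls well inside a 10% margin.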


Comment 21 Gregg Shick 2008-07-22 16:54:48 UTC
David

A couple of questions.

1.  What proc did you use in the SDV unit?
2.  During the on demand governor test, which proc is being tested?

One other question I have for the dev: I have a 360 G5 with a single quad core
Clovertown processor with 2 p-states.  Most of the time when I run the test it
will fail, but I have seen a couple of passing runs.  I noticed on the passing
runs, the procs running the speed test are switched between iterations.  The on
demand governor test moved to 3 different CPUs before giving me passing
results.  On the failing test, it never moves off cpu3 and fails after just 1
attempt.  I will attach the logs.



Comment 22 Gregg Shick 2008-07-22 16:59:25 UTC
Created attachment 312366 [details]
quad core failure first run attempt

Comment 23 Gregg Shick 2008-07-22 17:00:04 UTC
Created attachment 312368 [details]
quad core cpuscaling pass, same system that failed attempt 1

Comment 24 Gregg Shick 2008-07-22 20:01:10 UTC
One other question I have, if the system you tested on was quadcore, which CPU
was running the stress during the ondemand governor test.  

Comment 25 David Aquilina 2008-07-23 01:45:29 UTC
(In reply to comment #24)
> One other question I have, if the system you tested on was quadcore, which CPU
> was running the stress during the ondemand governor test.  

I'm not sure off-hand, however (if I recall correctly) we don't pin the test to
a particular core - we let the OS assign it to whichever one the scheduler sees
fit to. 
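
For anyone who wants to rule out scheduler migration when reproducing this, a workload can be pinned to a single core from Python. This is a sketch of an experiment, not something HTS does (per the reply above, the test is not pinned). The helper name is hypothetical, and os.sched_setaffinity is Linux-only and requires Python 3.3 or later.

```python
import os


def run_pinned(cpu, work):
    """Run work() with this process restricted to one CPU, then restore affinity."""
    original = os.sched_getaffinity(0)  # 0 means the calling process
    os.sched_setaffinity(0, {cpu})
    try:
        return work()
    finally:
        os.sched_setaffinity(0, original)
```

Running the same timed workload once per core with run_pinned(cpu, work) would show whether only particular cores fail to speed up under the ondemand governor.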


Comment 26 Gregg Shick 2008-07-25 19:50:29 UTC
David

After further investigation it appears that our ROMs are handling the p-states
correctly.  We have found that the test only fails in the xen kernel.  We need
someone to look at the cpuinfo and affected cpu logs I am attaching.  It looks
like the xen kernel is not properly setting up the p-state processor associations.

Also on the SDV unit that passed, can we get the following information please:
- Dump of affected cpu for all procs (question, are the cores paired?)
- cpuinfo output from the machine in both xen and baremetal kernels



Comment 27 Gregg Shick 2008-07-25 19:51:43 UTC
Created attachment 312671 [details]
affected cpu output on baremetal

Comment 28 Gregg Shick 2008-07-25 19:52:02 UTC
Created attachment 312672 [details]
affected cpu output on xen

Comment 29 Gregg Shick 2008-07-25 19:52:16 UTC
Created attachment 312673 [details]
cpuinfo baremetal

Comment 30 Gregg Shick 2008-07-25 19:52:31 UTC
Created attachment 312674 [details]
cpuinfo xen

Comment 31 Gregg Shick 2008-07-25 20:46:20 UTC
Note for affected_cpu we are looking for the output from:

cat /sys/devices/system/cpu/cpu*/cpufreq/affected_cpus
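
The same information can be collected per CPU, so that each line of output is labelled with the CPU it came from. This is a sketch with a hypothetical function name; it assumes each affected_cpus file holds whitespace-separated CPU numbers, and the base parameter exists only so that the function can be pointed at a copy of the directory tree.

```python
import glob
import os


def read_affected_cpus(base="/sys/devices/system/cpu"):
    """Return {cpu directory name: list of CPU numbers in its affected_cpus file}."""
    result = {}
    pattern = os.path.join(base, "cpu*", "cpufreq", "affected_cpus")
    for path in sorted(glob.glob(pattern)):
        # path is <base>/cpuN/cpufreq/affected_cpus; recover "cpuN".
        cpu_name = os.path.basename(os.path.dirname(os.path.dirname(path)))
        with open(path) as f:
            result[cpu_name] = [int(tok) for tok in f.read().split()]
    return result


for name, cpus in sorted(read_affected_cpus().items()):
    print(name, cpus)
```

On a system without cpufreq support the glob matches nothing and the loop prints nothing.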

Comment 32 David Aquilina 2008-08-06 16:19:47 UTC
Created attachment 313593 [details]
Bare metal test results on an Intel SDV

Comment 33 David Aquilina 2008-08-06 16:20:40 UTC
Created attachment 313594 [details]
Xen kernel test results on Intel SDV

Comment 34 Gregg Shick 2008-08-07 15:28:08 UTC
David

Can I also get the output of:

cat /sys/devices/system/cpu/cpu*/cpufreq/affected_cpus

on both kernels as well please.

Thanks

Comment 35 Gregg Shick 2008-08-07 15:41:12 UTC
Created attachment 313710 [details]
cpuinfo from intel sdv in baremetal kernel

David

Is there a reason why "physical id" would be enumerated differently between the xen and baremetal kernels?

Comment 36 Gregg Shick 2008-08-07 15:43:48 UTC
Created attachment 313711 [details]
cpuinfo from intel sdv in xen kernel

Please see "physical id"

Comment 37 David Aquilina 2008-08-14 19:58:30 UTC
(In reply to comment #34)
> Can I also get the output of:
> 
> cat /sys/devices/system/cpu/cpu*/cpufreq/affected_cpus

Here you go: 

[root@nosferatu ~]# uname -a
Linux nosferatu.rdu.redhat.com 2.6.18-92.1.10.el5xen #1 SMP Wed Jul 23 04:11:52 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux
[root@nosferatu ~]# cat /sys/devices/system/cpu/cpu*/cpufreq/affected_cpus
0
1
2
3

[root@nosferatu ~]# uname -a
Linux nosferatu.rdu.redhat.com 2.6.18-100.el5 #1 SMP Thu Jul 24 18:37:45 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux
[root@nosferatu ~]# cat /sys/devices/system/cpu/cpu*/cpufreq/affected_cpus
0
1
2
3

Comment 38 Gregg Shick 2008-08-14 21:45:40 UTC
Update:  Rick Hester is seeing the same failing behavior on ia64 that we are seeing on the x86/x64 boxes.  Passes in baremetal, fails in Xen.

David, has anyone in engineering commented yet on why the xen kernel sets up the processors and affected_cpu differently than baremetal?  Currently with the way the xen kernel configures affected_cpu we will not be able to pass this test.

Comment 43 David Aquilina 2008-10-01 19:56:38 UTC

*** This bug has been marked as a duplicate of bug 458894 ***