Bug 448943
Summary: | HTS 5.2 -13 cpu-scaling is failing with very large delta | ||
---|---|---|---|
Product: | [Retired] Red Hat Hardware Certification Program | Reporter: | Jeff Svatek <jeffrey.svatek> |
Component: | Test Suite (tests) | Assignee: | Greg Nichols <gnichols> |
Status: | CLOSED DUPLICATE | QA Contact: | Lawrence Lim <llim> |
Severity: | medium | Docs Contact: | |
Priority: | low | ||
Version: | 5 | CC: | akarlsso, bxu, dwa, gregg.shick, nagananda.chumbalkar, rlandry, ryan.armstrong, tao, tools-bugs, ykun |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2008-10-01 19:56:38 UTC | Type: | --- |
Description
Jeff Svatek
2008-05-29 16:23:23 UTC
Created attachment 307102 [details]
hts-log-package
David, Can you work with Created attachment 308399 [details]
DL320G5 results 1
See "DL320G5 results 1" above: Here are the cpuscaling results from the DL320G5p system with one Yorkfield X3360 2.83GHz Quad Core, RHEL5x86U2 & HTS5.2-13. The On Demand Governor Test fails because the on demand performance is not within the 5% margin; the actual value was 41.4%.

Jeff, can you please check the BIOS on the system for the Power Control setting? Was it set to OS Control, or one of the HP-specific options?
Thanks,
David

The server was set to OS Control in RBSU. The test complains about the CPU and will not run if this setting is not enabled.

I ran the test on a DL360 G5 (for Red Hat folks, hp-dl360g5-01.rhts.bos.redhat.com) and saw the following results. As with the tests run at HP, the test is /slower/ using the on-demand governor compared to statically setting the CPU to its lowest speed:

System Capabilites:
-------------------------------------------------
System has 8 cpus
Supported CPU Frequencies:
    1865 MHz
    1599 MHz
Supported Governors:
    ondemand
    userspace
    performance
Current governors:
    cpu0: userspace
    cpu1: userspace
    cpu2: userspace
    cpu3: userspace
    cpu4: userspace
    cpu5: userspace
    cpu6: userspace
    cpu7: userspace

On Userspace Governor Test:
-------------------------------------------------
Setting governor to userspace
Setting cpu frequency to 1599 MHz
Running CPU load test...
try 1 - 28.39 sec comparing 0.00
Minumum frequency load test time: 28.39
Setting cpu frequency to 1865 MHz
Running CPU load test...
try 1 - 24.35 sec comparing 0.00
Maximum frequency load test time: 24.35
CPU Frequency Speed Up: 1.17
Measured Speed Up: 1.17
Percentage Difference -0.0%

On Demand Governor Test:
-------------------------------------------------
Setting governor to ondemand
Waiting 5 seconds... done.
Running CPU load test...
try 1 - 28.29 sec comparing 0.00
On Demand load test time: 28.29
Percentage Difference vs. maximum frequency: 16.2%
Error: on demand performance is not within 5% margin

Performance Governor Test:
-------------------------------------------------
Setting governor to performance
Running CPU load test...
try 1 - 24.37 sec comparing 0.00
Performace load test time: 24.37
Percentage Difference vs. maximum frequency: 0.1%

Summary:
-------------------------------------------------
Load Test Times:
    Minimum: 28.39
    Maximum: 24.35
    On Demand: 28.29
    Performance: 24.37
Margins:
    Speed Up: -0.0%
    On Demand: 16.2%
    Performance: 0.1%

Oops, ignore the previous comment about it being slower... I was looking at the wrong line. Still, the difference between the on-demand run and the minimum-frequency run is only 0.1 second.

With the system set to the ondemand governor, I ran the core test while watching /proc/cpuinfo and could see the CPUs scale up as the test runs. If I stop or suspend the test, the CPUs scale back down. Rob, would you be willing to take a look at the system and offer your opinion?

In manual testing it appears the box does not respond properly. At the slowest setting I can establish a baseline performance that is clearly bested by the performance setting (this is good). Unfortunately, the on-demand setting, which does increase the reported speed on the impacted CPU to the maximum, does not match or approach the performance setting; instead it compares to the baseline, as was reported by HTS. After discussing with dwa, we want to investigate this a bit further to verify whether this is a specific CPU issue, a multi-core issue, something more system-specific, or otherwise.

Created attachment 309640 [details]
DL360G5 Results For Two Dual Core
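The margins in the DL360 G5 log quoted earlier can be recomputed by hand. The sketch below is not taken from cpuscaling.py; the formulas are inferred from the numbers in that log (speed-up as a ratio of frequencies or run times, margins as percentage differences against the maximum-frequency run), and the names `percent_diff` and `scaling_margins` are hypothetical.

```python
def percent_diff(value, reference):
    """Percentage difference of value relative to reference."""
    return 100.0 * (value - reference) / reference


def scaling_margins(min_mhz, max_mhz, t_min, t_max, t_ondemand, t_performance):
    """Recompute the figures reported in the log (formulas are an assumption)."""
    freq_speedup = max_mhz / min_mhz      # expected speed-up from frequency alone
    measured_speedup = t_min / t_max      # observed speed-up from run times
    return {
        "frequency_speedup": freq_speedup,
        "measured_speedup": measured_speedup,
        "speedup_margin": percent_diff(measured_speedup, freq_speedup),
        "ondemand_margin": percent_diff(t_ondemand, t_max),
        "performance_margin": percent_diff(t_performance, t_max),
    }


if __name__ == "__main__":
    # Values copied from the DL360 G5 log in this report.
    margins = scaling_margins(1599, 1865, 28.39, 24.35, 28.29, 24.37)
    for name, value in margins.items():
        print(f"{name}: {value:.2f}")
```

With the logged values this gives an on-demand margin of about 16.2%, which is why the 5% check fails even though the reported CPU frequency does rise under load.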
See attachment above for "DL360G5 Results For Two Dual Core": Here are the cpuscaling results from the DL360G5 system with two Wolfdale X5260 3.33GHz Dual Core, RHEL5x64U2 & HTS5.2-13. The On Demand Governor Test fails because the on demand performance is not within the 5% margin; the actual value was 65.9%.

Created attachment 309641 [details]
DL360G5 Results For Single Dual Core
See attachment above for "DL360G5 Results For Single Dual Core": Here are the cpuscaling results from the DL360G5 system with a single Wolfdale X5260 3.33GHz Dual Core, RHEL5x64U2 & HTS5.2-13. The On Demand Governor Test fails because the on demand performance is not within the 5% margin; the actual value was 65.7%.

We've tested this on an Intel SDV (DQ35JO motherboard with Yorkfield quad-core 45nm processor, 2GB RAM) and the cpuscaling test returns a successful result for both the Xen and bare-metal kernels.

For the guts of the cpuscaling test, you can take a look at /usr/share/hts/tests/cpuscaling/cpuscaling.py. The actual work that's being timed is simply calculating pi:

    def pi(self):
        decimal.getcontext().prec = 500
        s = decimal.Decimal(1)
        h = decimal.Decimal(3).sqrt()/2
        n = 6
        for i in range(170):
            A = n*h*s/2 # A ... area of polygon
            s2 = ((1-h)**2+s**2/4)
            s = s2.sqrt()
            h = (1-s2/4).sqrt()
            n = 2*n

David

I guess my next question would be, how is an acceptable delta determined? If I can do a certain amount of work at speed x, then the proc gets stepped up to speed x+500 and I do more work, how is it determined how much more work the system should be capable of?

(In reply to comment #19)
> I guess my next question would be, how is an acceptable delta determined? If I
> can do a certain amount of work at speed x, then the proc gets stepped up to
> speed x+500 and I do more work, how is it determined how much more work should
> the system be capable of?

It's based on the percentage speedup. If the frequency increases by 100%, we expect the workload to run approximately 100% faster (with a 10% margin allowed).

The Intel SDV we tested supported 1.99GHz and 2.6GHz. Setting the CPU frequency to its lowest speed resulted in a runtime of 21.17 seconds; setting the CPU to its fastest speed resulted in a 15.87 second runtime. The On Demand governor test resulted in a runtime of 15.86 seconds. This is the behavior we expect to see with the on demand governor.

David

A couple of questions.
1. What proc did you use in the SDV unit?
2. During the on demand governor test, which proc is being tested?

One other question I have for the dev: I have a 360 G5 with a single quad core Clovertown processor with 2 p-states. Most of the time when I run the test it fails, but I have seen a couple of passing runs. I noticed that on the passing runs, the procs running the speed test are switched between iterations. The on demand governor test moved to 3 different CPUs before giving me passing results. On the failing test, it never moves off cpu3 and fails after just 1 attempt. I will attach the logs.

Created attachment 312366 [details]
quad core failure first run attempt
Created attachment 312368 [details]
quad core cpuscaling pass, same system that failed attempt 1
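To reproduce the timing outside HTS, the pi workload from cpuscaling.py quoted in this thread can be wrapped in a small standalone timer. This is a rough sketch rather than the HTS implementation: the return value, the use of a local decimal context, and the optional CPU pinning through `os.sched_setaffinity` (Linux only) are assumptions added here. The thread indicates that HTS itself does not pin the test to a core, so pinning is only a way to rule out migration between CPUs as a variable. The names `pi_workload` and `timed_run` are hypothetical.

```python
import decimal
import os
import time


def pi_workload(iterations=170, precision=500):
    """Approximate pi by doubling the sides of an inscribed polygon."""
    with decimal.localcontext() as ctx:
        ctx.prec = precision
        s = decimal.Decimal(1)              # side length of the hexagon
        h = decimal.Decimal(3).sqrt() / 2   # distance from centre to a side
        n = 6                               # number of sides
        area = decimal.Decimal(0)
        for _ in range(iterations):
            area = n * h * s / 2            # area of the current polygon
            s2 = (1 - h) ** 2 + s ** 2 / 4
            s = s2.sqrt()
            h = (1 - s2 / 4).sqrt()
            n = 2 * n
        return area


def timed_run(repeats=1, cpu=None):
    """Time the workload; optionally pin this process to one CPU (assumption)."""
    previous = None
    if cpu is not None and hasattr(os, "sched_setaffinity"):
        previous = os.sched_getaffinity(0)
        os.sched_setaffinity(0, {cpu})
    try:
        result = None
        start = time.perf_counter()
        for _ in range(repeats):
            result = pi_workload()
        return time.perf_counter() - start, result
    finally:
        # Restore the original affinity so later runs are not affected.
        if previous is not None:
            os.sched_setaffinity(0, previous)


if __name__ == "__main__":
    elapsed, value = timed_run(repeats=10)
    print(f"{elapsed:.3f} sec, pi ~ {str(value)[:20]}")
```

Running it once per governor setting, with and without `cpu=` set, gives a quick way to see whether the on-demand result depends on which core the scheduler picks.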
One other question I have: if the system you tested on was quad-core, which CPU was running the stress during the ondemand governor test?

(In reply to comment #24)
> One other question I have, if the system you tested on was quadcore, which CPU
> was running the stress during the ondemand governor test.

I'm not sure off-hand, however (if I recall correctly) we don't pin the test to a particular core - we let the OS assign it to whichever one the scheduler sees fit to.

David

After further investigation it appears that our ROMs are handling the p-states correctly. We have found that the test only fails in the xen kernel. We need someone to look at the cpuinfo and affected_cpus logs I am attaching. It looks like the xen kernel is not properly setting up the p-state processor associations.

Also, on the SDV unit that passed, can we get the following information please:
- Dump of affected cpu for all procs (question, are the cores paired?)
- cpuinfo output from the machine in both xen and baremetal kernels

Created attachment 312671 [details]
affected cpu output on baremetal
Created attachment 312672 [details]
affected cpu output on xen
Created attachment 312673 [details]
cpuinfo baremetal
Created attachment 312674 [details]
cpuinfo xen
Note: for affected_cpu we are looking for the output from:

cat /sys/devices/system/cpu/cpu*/cpufreq/affected_cpus

Created attachment 313593 [details]
Bare metal test results on an Intel SDV
Created attachment 313594 [details]
Xen kernel test results on Intel SDV
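For comparing the two kernels, the same affected_cpus information requested in this thread can be collected per CPU rather than as one concatenated `cat` output. The following is a minimal sketch that reads the sysfs files named in the thread; the function name `read_affected_cpus` and its `base` parameter are hypothetical, and it assumes each file holds whitespace-separated CPU indices.

```python
import glob
import os
import re


def read_affected_cpus(base="/sys/devices/system/cpu"):
    """Return {cpu_index: [cpu indices listed in its affected_cpus file]}."""
    result = {}
    pattern = os.path.join(base, "cpu[0-9]*", "cpufreq", "affected_cpus")
    for path in glob.glob(pattern):
        # The directory two levels above the file is named "cpuN".
        cpu_dir = os.path.basename(os.path.dirname(os.path.dirname(path)))
        match = re.fullmatch(r"cpu(\d+)", cpu_dir)
        if match is None:
            continue
        with open(path) as handle:
            result[int(match.group(1))] = [int(tok) for tok in handle.read().split()]
    # Sort numerically so cpu10 does not sort before cpu2.
    return dict(sorted(result.items()))


if __name__ == "__main__":
    for cpu, siblings in read_affected_cpus().items():
        print(f"cpu{cpu}: {siblings}")
```

Saving this output under both the xen and bare-metal kernels makes it easier to see, CPU by CPU, whether the two kernels group cores into the same frequency domains.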
David,

Can I also get the output of:

cat /sys/devices/system/cpu/cpu*/cpufreq/affected_cpus

on both kernels as well please. Thanks

Created attachment 313710 [details]
cpuinfo from intel sdv in baremetal kernel
David
Is there a reason why the xen kernel would be enumerating "physical id" differently between the xen and baremetal kernels?
Created attachment 313711 [details]
cpuinfo from intel sdv in xen kernel
Please see "physical id"
(In reply to comment #34)
> Can I also get the output of:
>
> cat /sys/devices/system/cpu/cpu*/cpufreq/affected_cpus

Here you go:

[root@nosferatu ~]# uname -a
Linux nosferatu.rdu.redhat.com 2.6.18-92.1.10.el5xen #1 SMP Wed Jul 23 04:11:52 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux
[root@nosferatu ~]# cat /sys/devices/system/cpu/cpu*/cpufreq/affected_cpus
0
1
2
3

[root@nosferatu ~]# uname -a
Linux nosferatu.rdu.redhat.com 2.6.18-100.el5 #1 SMP Thu Jul 24 18:37:45 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux
[root@nosferatu ~]# cat /sys/devices/system/cpu/cpu*/cpufreq/affected_cpus
0
1
2
3

Update: Rick Hester is seeing the same failing behavior on ia64 that we are seeing on the x86/x64 boxes. Passes in baremetal, fails in Xen.

David, has anyone in engineering commented yet on why the xen kernel sets up the processors and affected_cpu differently than baremetal? Currently, with the way the xen kernel configures affected_cpu, we will not be able to pass this test.

*** This bug has been marked as a duplicate of bug 458894 ***