Bug 613476

Summary: Linpack benchmark: performance regression up to 30% on AMD CPUs
Product: Red Hat Enterprise Linux 6
Reporter: Jiri Hladky <jhladky>
Component: kernel
Assignee: Johannes Weiner <jweiner>
Status: CLOSED DUPLICATE
QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: high
Priority: medium
Version: 6.0
CC: bmarson, bnagendr, dshaks, hladky.jiri, kkolakow, lwang, perfbz, pzijlstr, rmusil
Target Milestone: rc   
Target Release: ---   
Hardware: athlon   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2012-03-30 16:33:42 UTC
Attachments:
Description Flags
Results from worse run http://rhts.redhat.com/cgi-bin/rhts/test_log.cgi?id=15260375 none

Description Jiri Hladky 2010-07-11 22:43:23 UTC
Description of problem:
We compare linpack floating-point results for
-simultaneous runs without CPU affinity set (relying only on the kernel task scheduler to pick the best cores and stick with them)
-simultaneous runs with CPU affinity set

We see very poor results on AMD CPUs. The task scheduler keeps moving linpack between different CPUs.

Worst result:
                 |          S C H E D U L I N G    M O D E         |DEFAULT &  |
                 |                                                 |AFFINITY   |
                 |                                                 |COMPARISON |
NUMBER |FLOATING |         DEFAULT        |       CPU AFFINITY     |           |
  OF   |  POINT  |                        |                        | %   TEST  |
STREAMS|PRECISION| TOTAL   AVG STDEV SCALE| TOTAL   AVG STDEV SCALE|DIFF STATUS|
-------+---------+------------------------+------------------------+-----------+

   2      Double |   770   385  80.7  1.68|  1010   505  43.1  1.99|  31   FAIL|


Results are consistent. I did 2 runs, and each run used 5 loops to gather statistics.
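
As a rough check on the table above: the % DIFF column appears to be the gain of the affinity total over the default total, and AVG appears to be TOTAL divided by the number of streams. A minimal sketch, assuming those definitions (they are inferred from the numbers in the row, not taken from the test harness, and the function names are hypothetical):

```python
def per_stream_avg(total, streams):
    """Average result per stream, assuming AVG = TOTAL / NUMBER OF STREAMS."""
    return total / streams


def pct_diff(default_total, affinity_total):
    """Gain of the affinity run over the default run, in percent of the default total."""
    return (affinity_total - default_total) / default_total * 100.0


# Values from the 2-stream double-precision row above.
default_total, affinity_total = 770, 1010
print(per_stream_avg(default_total, 2))                 # 385.0
print(per_stream_avg(affinity_total, 2))                # 505.0
print(round(pct_diff(default_total, affinity_total)))   # 31
```

Under these assumptions the computed values match the AVG and % DIFF entries in the row.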

ibm-x3655-02.ovirt.rhts.eng.bos.redhat.com
Quad-Core AMD Opteron(tm) Processor 2356
See:
http://rhts.redhat.com/cgi-bin/rhts/test_log.cgi?id=15260375
http://rhts.redhat.com/cgi-bin/rhts/test_log.cgi?id=15260347

dell-pe6950-01.rhts.eng.bos.redhat.com
Dual-Core AMD Opteron(tm) Processor 8212
See:
http://rhts.redhat.com/cgi-bin/rhts/test_log.cgi?id=15260411
http://rhts.redhat.com/cgi-bin/rhts/test_log.cgi?id=15260357

Version-Release number of selected component (if applicable):
RHEL6.0-20100707.4
2.6.32-44.el6.x86_64

How reproducible:

Use one of these boxes:
ibm-x3655-02.ovirt.rhts.eng.bos.redhat.com
or
dell-pe6950-01.rhts.eng.bos.redhat.com

Alternatively, use any other box with the same CPU layout. Please make sure to pick AMD CPUs. See
https://beaker.engineering.redhat.com/view/dell-pe6950-01.rhts.eng.bos.redhat.com

https://beaker.engineering.redhat.com/view/ibm-x3655-02.ovirt.rhts.eng.bos.redhat.com

(Details tab) for inspiration.


Steps to Reproduce:
1. Get the linpack benchmark from this location:
http://cvs.devel.redhat.com/cgi-bin/cvsweb.cgi/tests/performance/linpack/linpack.tar
2. Untar and run "make". You will get the linpacks (single precision) and linpackd (double precision) executables.
3. Discover the CPU topology. I recommend using hwloc:
http://www.open-mpi.org/software/hwloc/v1.0/
wget http://www.open-mpi.org/software/hwloc/v1.0/downloads/hwloc-1.0.1.tar.gz
Untar, configure, make, make install
lstopo --physical -
will give you CPU topology.

4. Pick the best CPUs, keeping the CPU cache layout in mind.
 hwloc-distrib --single <number_of_concurrent_runs>
  
5. Start 2 runs
-without CPU affinity set
./linpackd & ./linpackd &
-with CPU affinity set
taskset -c <number> ./linpackd & taskset -c <number> ./linpackd & 

Compare KFlops reported.

6. Use mpstat to see how the task scheduler moves jobs between different cores.
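
The two launch modes from step 5 can be sketched in Python 3 as follows. This is only an illustration, not the RHTS test harness: the helper names are hypothetical, and the CPU numbers passed in are assumed to come from step 4.

```python
import subprocess


def stream_argv(binary, cpu=None):
    """Build the command line for one benchmark stream.

    cpu=None leaves placement to the scheduler (default mode);
    otherwise the stream is pinned with `taskset -c <cpu>`.
    """
    if cpu is None:
        return [binary]
    return ["taskset", "-c", str(cpu), binary]


def launch_streams(binary, cpus=None, streams=2):
    """Start `streams` copies of `binary` concurrently and return the processes.

    If `cpus` is given, stream i is pinned to cpus[i].
    """
    procs = []
    for i in range(streams):
        cpu = None if cpus is None else cpus[i]
        procs.append(subprocess.Popen(stream_argv(binary, cpu)))
    return procs


def wait_all(procs):
    """Wait for every stream to finish and return their exit codes."""
    return [p.wait() for p in procs]
```

For example, `wait_all(launch_streams("./linpackd"))` would correspond to the run without affinity, and `wait_all(launch_streams("./linpackd", cpus=[0, 2]))` to the pinned run, where 0 and 2 are placeholder CPU numbers.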

Actual results:
The run with CPU affinity set is up to 30% quicker (30% higher KFlops reported) than the run without CPU affinity.


Expected results:
Both runs give the same results.


Additional info:

Comment 1 Jiri Hladky 2010-07-11 22:54:14 UTC
Created attachment 431035
Results from worse run http://rhts.redhat.com/cgi-bin/rhts/test_log.cgi?id=15260375

Files to check:
Directory:
results_2010-Jul-11_11h11m32s/summarylogs

linpackd.2stream.histograms-stats.log
===> See how the task scheduler moves jobs between different cores. Compare this with the affinity runs.

linpackd.4stream.histograms-stats.log
===> same for 4 parallel linpackd runs
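
For readers without access to the attached logs, a per-CPU histogram of this kind can be approximated by polling /proc. The sketch below is an assumption about how such data could be gathered, not the format or method behind the histograms-stats logs. It relies on field 39 ("processor") of /proc/<pid>/stat, which records the CPU the task last ran on. The function names are hypothetical.

```python
import time
from collections import Counter


def last_cpu_from_stat(stat_line):
    """Return the CPU a task last ran on (field 39 of /proc/<pid>/stat)."""
    # The command name (field 2) is in parentheses and may contain spaces,
    # so split after the last ')' before counting fields.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 39 is rest[36].
    return int(rest[36])


def count_migrations(cpu_samples):
    """Number of times consecutive samples show a different CPU."""
    return sum(1 for a, b in zip(cpu_samples, cpu_samples[1:]) if a != b)


def sample_task(pid, samples=100, interval=0.1):
    """Poll /proc/<pid>/stat; return (per-CPU histogram, observed migrations)."""
    seen = []
    for _ in range(samples):
        with open(f"/proc/{pid}/stat") as f:
            seen.append(last_cpu_from_stat(f.read()))
        time.sleep(interval)
    return Counter(seen), count_migrations(seen)
```

Because this samples at intervals, the migration count is a lower bound: moves that happen between two samples and return to the same CPU are not seen. A pinned run should show a single CPU in the histogram and zero migrations.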

Comment 4 Jiri Hladky 2010-07-12 14:09:41 UTC
Hi Ben,

the CPUs are in fact pretty old:
Opteron 8212 belongs to Opteron 8200-series "Santa Rosa" (90 nm), Released in Aug 2006

Opteron 2356 belongs to Opteron 2300-series "Barcelona" (65 nm), Released in Sept 2007

I don't see this bug on the newest Opteron 6100-series "Magny-Cours" (45 nm) CPUs (tested on amd-dinar-02.lab.bos.redhat.com).

Please note that I have opened a similar bug for the Intel(R) Xeon(R) CPU E5530 as well:
https://bugzilla.redhat.com/show_bug.cgi?id=610297

It seems to be a more generic problem, but it affects only certain CPU models and topologies.

IMHO, this is not a blocker. The linpackd benchmark is very sensitive to hops between different processors. Other programs will not see such a huge performance drop when hopping between different processors.

Thanks
Jirka

Comment 5 RHEL Program Management 2010-07-15 15:04:35 UTC
This issue has been proposed when we are only considering blocker
issues in the current Red Hat Enterprise Linux release. It has
been denied for the current Red Hat Enterprise Linux release.

** If you would still like this issue considered for the current
release, ask your support representative to file as a blocker on
your behalf. Otherwise ask that it be considered for the next
Red Hat Enterprise Linux release. **

Comment 6 Johannes Weiner 2010-09-17 09:19:23 UTC
Jiri, as 610297 has been resolved, can we close this one as well?

Comment 7 Jiri Hladky 2010-09-21 08:31:02 UTC
Hello Johannes,

I have rerun the benchmark on 5.5 and the results are still bad. The job for RHEL6.0-RC-3 is in the queue. I will post the results here as soon as the RHTS job finishes.

Thanks
Jirka

Comment 8 Jiri Hladky 2010-09-23 11:17:10 UTC
Hello Johannes,

I have completed the linpack benchmark on RHEL 6.0 RC-3. The results are still bad.

Please check the summary of results in Beaker:
https://beaker.engineering.redhat.com/logs/2010/09/192/19265/35945/439372/1314750///test_log--performance-linpack-certification.log


Thanks
Jirka

Comment 11 RHEL Program Management 2011-01-07 04:43:50 UTC
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated
in the current release, Red Hat is unfortunately unable to
address this request at this time. Red Hat invites you to
ask your support representative to propose this request, if
appropriate and relevant, in the next release of Red Hat
Enterprise Linux. If you would like it considered as an
exception in the current release, please ask your support
representative.

Comment 12 Suzanne Logcher 2011-01-07 16:06:52 UTC
This request was erroneously denied for the current release of Red Hat
Enterprise Linux.  The error has been fixed and this request has been
re-proposed for the current release.

Comment 13 RHEL Program Management 2011-02-01 06:15:09 UTC
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated
in the current release, Red Hat is unfortunately unable to
address this request at this time. Red Hat invites you to
ask your support representative to propose this request, if
appropriate and relevant, in the next release of Red Hat
Enterprise Linux. If you would like it considered as an
exception in the current release, please ask your support
representative.

Comment 14 RHEL Program Management 2011-02-01 18:27:56 UTC
This request was erroneously denied for the current release of
Red Hat Enterprise Linux.  The error has been fixed and this
request has been re-proposed for the current release.

Comment 15 RHEL Program Management 2011-04-04 02:39:46 UTC
Since RHEL 6.1 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.

Comment 16 RHEL Program Management 2011-10-07 15:08:08 UTC
Since RHEL 6.2 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.

Comment 17 Johannes Weiner 2012-03-30 16:33:42 UTC

*** This bug has been marked as a duplicate of bug 610297 ***