Description of problem:
The following error is given on an ia64 box:

    ------------------------------------------------------------------
     #bytes #iterations    BW peak[MB/sec]    BW average[MB/sec]
    Conflicting CPU frequency values detected: 1466.000000 != 1667.000000
      65536        1000               0.00                  0.00
    ------------------------------------------------------------------

Note that it only happens when the ia64 box is on the client side. When it is used as the server, things work fine.

Version-Release number of selected component (if applicable):
RHEL5.2-Server-20080212.0 tree

How reproducible:
Every time.

Steps to Reproduce:
1. Run the ib_send_bw utility where the client is an ia64 node

Additional info:
This is a regression
OK, this isn't an openib issue, it's a kernel issue. Basically, the output of /proc/cpuinfo on the 16-way Itanium this bug comes from is not, in any way, constant. The nominal cpu MHz is 1466, but by running this command:

    while true; do grep "cpu MHz" /proc/cpuinfo | grep 1667; done

you do in fact get an occasional jump up from 1466. What's more, the bogomips in this file can be totally horked. These are the 16 bogomips values from a single cat of /proc/cpuinfo:

    BogoMIPS : 1658.88
    BogoMIPS : 1671.16
    BogoMIPS : 16.35
    BogoMIPS : 16.35
    BogoMIPS : 3301.37
    BogoMIPS : 3309.56
    BogoMIPS : 3309.56
    BogoMIPS : 3309.56
    BogoMIPS : 1671.16
    BogoMIPS : 1683.45
    BogoMIPS : 1662.97
    BogoMIPS : 1667.07
    BogoMIPS : 3309.56
    BogoMIPS : 3309.56
    BogoMIPS : 3309.56
    BogoMIPS : 3301.37

In any case, it really looks like the contents of /proc/cpuinfo on ia64 aren't reliable/trustable, and the perftest program is noticing that and refusing to compute bandwidth numbers from an inconsistent divisor (the numbers don't have much value if you don't have a reference between wall time and cpu time).
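For anyone who wants to watch for the inconsistency without running perftest, here is a minimal sketch of the kind of check described above, written in Python rather than the C that perftest uses. It is not the perftest source: the function names are hypothetical, and exact equality between values is an assumption. It collects every "cpu MHz" value from /proc/cpuinfo-style text and reports the first pair that disagree.

```python
def cpu_mhz_values(cpuinfo_text):
    """Return every 'cpu MHz' value found in /proc/cpuinfo-style text."""
    values = []
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "cpu mhz":
            values.append(float(value))
    return values


def find_conflict(values):
    """Return (reference, other) for the first value that differs from
    the first CPU's value, or None when all values agree.

    Exact equality is an assumption made for this sketch.
    """
    if not values:
        return None
    reference = values[0]
    for other in values[1:]:
        if other != reference:
            return (reference, other)
    return None


# Hypothetical two-CPU excerpt using the two speeds seen in this report.
SAMPLE = """\
processor  : 0
cpu MHz    : 1466.000000
processor  : 1
cpu MHz    : 1667.000000
"""

conflict = find_conflict(cpu_mhz_values(SAMPLE))
if conflict is not None:
    print("Conflicting CPU frequency values detected: "
          f"{conflict[0]:.6f} != {conflict[1]:.6f}")
```

On a live system the text would come from reading /proc/cpuinfo once, so that all values belong to the same snapshot.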
Changing severity to low. This Intel whitebox system appears to be the only system with this issue. In the past, Intel whiteboxes have shown other strange behavior. I suspect that an SCI is being issued during system init and that is causing problems with the bogomips calibration. P.
Gurhan -- is this system connected to a serial console? P.
(In reply to comment #6)
> Gurhan -- is this system connected to a serial console?
>
> P.

Prarit,
No, not yet :( Right now the box is being used by mjenner; you can grab me or him to show you where the box is in the lab if you are in the office.
OK, so trying this on another ia64 box, I can get the client program working; however, it also prints out this warning message:

    Warning: measured timestamp frequency 399.187 differs from nominal 1594 MHz

Prarit, I'll let you decide what to do about it, since I don't know what could be causing it or whether it's a bug. This was, by the way, on the hp-sapphire-02.rhts.boston.redhat.com box, borrowed from dchapman.
Why does the tool use the bogomips in /proc/cpuinfo as definitive information to decide whether or not to run? Instead of using boot-time data, the tool should calibrate the value by itself to reflect the most recent state.
The tool doesn't use the bogomips value. I only posted the bogomips values to demonstrate how screwed up the values in /proc/cpuinfo are on the machine in question. The tool uses the CPU MHz value only, and that value varies on this particular machine.
Using the CPU MHz value in your test program is also an iffy decision: with DBS and cpuspeed enabled, the CPU MHz value in /proc/cpuinfo is supposed to track the current CPU P-state, which is adjusted from time to time based on the current workload.
I don't think it is proper for the ib_send_bw test program to use the CPU MHz value from /proc/cpuinfo, given its changeability. The tool should calibrate the CPU MHz value by itself. If, after bootup, you can still calibrate bad bogomips values like those in comment #1, then that is a real problem and we need to worry about it.
Moving it back to an openib issue for now.
In response to comment #13, the program *is* generating its own CPU MHz rating. That's the whole point of the message in comment #8. Contrary to Prarit's comment #12, the program isn't comparing an itc clock to a cpu clock; it's comparing the CPU MHz as generated using this method:

    /* Use linear regression to calculate cycles per microsecond.
       http://en.wikipedia.org/wiki/Linear_regression#Parameter_estimation */

versus the value reported in /proc/cpuinfo, and is then reporting the discrepancy whenever it is > 1%.

Basically, the program performs two checks on CPU MHz. The first is that it reads all of the CPU MHz values from /proc/cpuinfo and makes sure that all CPUs report the same speed. If, in a single reading of /proc/cpuinfo, some CPUs have one speed and others have another, then it reports the message that originally caused this bug to be opened.

Once the reading of /proc/cpuinfo has passed the "all cpus are identical speed" test, the code generates its own value of CPU MHz based upon the linear regression technique and compares that to the value reported in the /proc/cpuinfo file. If the difference between the two is greater than 1%, you get the second message.

Now, I should point out that these same tests have never produced any problems anywhere other than ia64, so I'm pretty sure the linear regression method is working properly (at least on i686/x86_64 and probably ppc64 too). That would seem to indicate that, contrary to Luming's comment #11, the values in /proc/cpuinfo are *not* in fact being kept consistent with the current processor state. All that makes me think that this is still a kernel problem, not a problem with the ib test code.
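To make the second check concrete, here is a minimal sketch of the idea in Python. It is not the code from get_clock.c: the function names are hypothetical, and the samples are synthetic (elapsed microseconds, elapsed cycles) pairs rather than real timer and cycle-counter reads. The least-squares slope of cycles against microseconds is the measured frequency in MHz, which is then compared with the nominal value using the 1% threshold mentioned above.

```python
def fit_cycles_per_usec(samples):
    """Least-squares slope of cycles (y) against microseconds (x).

    samples is a list of (elapsed_usec, elapsed_cycles) pairs. The slope
    is cycles per microsecond, which is numerically the frequency in MHz.
    """
    n = len(samples)
    sx = sum(x for x, _ in samples)
    sy = sum(y for _, y in samples)
    sxx = sum(x * x for x, _ in samples)
    sxy = sum(x * y for x, y in samples)
    denom = n * sxx - sx * sx
    if denom == 0:
        raise ValueError("need at least two distinct time points")
    return (n * sxy - sx * sy) / denom


def differs_from_nominal(measured_mhz, nominal_mhz, tolerance=0.01):
    """True when the relative difference exceeds the tolerance (1%)."""
    return abs(measured_mhz - nominal_mhz) / nominal_mhz > tolerance


# Synthetic samples: a counter ticking at 1594 cycles per microsecond,
# with a constant offset standing in for fixed measurement overhead.
samples = [(t, 1000 + 1594 * t) for t in range(1, 11)]
measured = fit_cycles_per_usec(samples)
print(measured, differs_from_nominal(measured, 1594.0))
```

The constant offset drops out of the slope, which is one reason to fit a line instead of dividing a single cycle count by a single time interval.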
OK, could you share the test case so that I can try it on my ia64 box?
OK, it looks like Prarit's comment #12 was correct. My mistake on that. Looking through the header file get_clock.h from the source code, it appears that the code gets the cycle count on ia64 by accessing ar.itc, which I can only assume is the itc clock that Prarit referred to. I'm attaching get_clock.c and get_clock.h to this report. These contain the functions the perftest programs use to calibrate/check the cpu cycle times.

Now, if the itc clock isn't always the same as the cpu clock, is it true that they are always a set multiple of each other, and if so, what are the possible multiples? I could write the code so that on ia64 it checks alternative multiples before declaring the values bad.
Created attachment 301633 [details] get_clock.c
Created attachment 301634 [details] get_clock.h
Created attachment 301635 [details]
possibly fixed get_clock.c

This version of get_clock.c attempts to determine if a multiple is in use between the itc and cpu clocks, and if so adjusts things accordingly.
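The attachment itself is not reproduced in this report, so the following is only a sketch of the general idea, in Python rather than C, with a hypothetical function name. It assumes the cpu clock is a small whole-number multiple of the itc clock and that a search up to 8 is enough; neither assumption is confirmed anywhere in this thread. It looks for a multiple that brings the measured frequency within 1% of the nominal CPU MHz.

```python
def match_clock_multiple(measured_mhz, nominal_mhz,
                         max_multiple=8, tolerance=0.01):
    """Return the smallest whole-number multiple m such that
    measured_mhz * m is within `tolerance` of nominal_mhz, or None.

    max_multiple=8 is an arbitrary bound chosen for this sketch.
    """
    for multiple in range(1, max_multiple + 1):
        scaled = measured_mhz * multiple
        if abs(scaled - nominal_mhz) / nominal_mhz <= tolerance:
            return multiple
    return None


# Values from the warning reported earlier in this thread:
# measured timestamp frequency 399.187, nominal 1594 MHz.
print(match_clock_multiple(399.187, 1594.0))
```

With those two numbers, 4 x 399.187 = 1596.748, which is about 0.17% away from 1594, so a multiple of 4 would pass a 1% check.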
I tested it and got the following results. DBS does make the test results completely different. Probably the RHEL 5 kernel doesn't have /proc/cpuinfo linked to the CPUFREQ driver for retrieving the current CPU frequency. Another possible reason is that the current CPU frequency is indeed different at calibration time than at the time of reading /proc/cpuinfo. (That is quite possible, because that is what DBS does to adapt to different loads for power-saving purposes.)

    [root@tigerG tmp]# service cpuspeed stop
    Disabling ondemand cpu frequency scaling: [ OK ]
    [root@tigerG tmp]# ./a.out
    proc frequency values detected: 1667.000000 , 1667.000000
    proc frequency values detected: 1667.000000 , 1667.000000
    proc frequency values detected: 1667.000000 , 1667.000000
    proc frequency values detected: 1667.000000 , 1667.000000
    proc frequency values detected: 1667.000000 , 1667.000000
    proc frequency values detected: 1667.000000 , 1667.000000
    proc frequency values detected: 1667.000000 , 1667.000000
    proc frequency values detected: 1667.000000 , 1667.000000
    proc frequency values detected: 1667.000000 , 1667.000000
    proc frequency values detected: 1667.000000 , 1667.000000
    proc frequency values detected: 1667.000000 , 1667.000000
    proc frequency values detected: 1667.000000 , 1667.000000
    proc frequency values detected: 1667.000000 , 1667.000000
    proc frequency values detected: 1667.000000 , 1667.000000
    proc frequency values detected: 1667.000000 , 1667.000000
    [root@tigerG tmp]# service cpuspeed start
    Enabling ondemand cpu frequency scaling: [ OK ]
    [root@tigerG tmp]# ./a.out
    proc frequency values detected: 1466.000000 , 1466.000000
    proc frequency values detected: 1466.000000 , 1466.000000
    proc frequency values detected: 1466.000000 , 1466.000000
    proc frequency values detected: 1466.000000 , 1466.000000
    proc frequency values detected: 1466.000000 , 1466.000000
    proc frequency values detected: 1466.000000 , 1667.000000
    Conflicting CPU frequency values detected: 1466.000000 != 1667.000000
    [root@tigerG tmp]#
OK, I've built a new version of perftest with the modified routine to check for a clock multiple of the itc. It will still fail if it detects different cpu speeds, and I don't think there's much we should do about that. I would be more inclined to tell people to disable cpu speed scaling during performance runs.
So even on this Intel whitebox I can reproduce the behavior Luming pointed out:

    [root@intel-s6e4533-01-mm 2008:8175]# service cpuspeed stop
    Disabling ondemand cpu frequency scaling: [ OK ]
    [root@intel-s6e4533-01-mm 2008:8175]# ib_send_bw -m 2048 dell-pe1950-03.rhts.boston.redhat.com
    ------------------------------------------------------------------
                        Send BW Test
    Connection type : RC
    Inline data is used up to 400 bytes message
     local address:  LID 0x08, QPN 0x7c0406, PSN 0x7747da
     remote address: LID 0x01, QPN 0x0003, PSN 0xaab168
    Mtu : 2048
    ------------------------------------------------------------------
     #bytes #iterations    BW peak[MB/sec]    BW average[MB/sec]
      65536        1000            3687.79               3687.69
    ------------------------------------------------------------------
    [root@intel-s6e4533-01-mm 2008:8175]# service cpuspeed start
    Enabling ondemand cpu frequency scaling: [ OK ]
    [root@intel-s6e4533-01-mm 2008:8175]# ib_send_bw -m 2048 dell-pe1950-03.rhts.boston.redhat.com
    ------------------------------------------------------------------
                        Send BW Test
    Connection type : RC
    Inline data is used up to 400 bytes message
     local address:  LID 0x08, QPN 0x7d0406, PSN 0xef907e
     remote address: LID 0x01, QPN 0x0004, PSN 0xa5f393
    Mtu : 2048
    ------------------------------------------------------------------
     #bytes #iterations    BW peak[MB/sec]    BW average[MB/sec]
    Conflicting CPU frequency values detected: 1466.000000 != 1667.000000
      65536        1000               0.00                  0.00
    ------------------------------------------------------------------
    [root@intel-s6e4533-01-mm 2008:8175]#

Shall we release note this per comment #24?
Thanks, Gurhan. Added the following note to the RHEL 5.2 release notes updates:

<quote>
(ia64) Running perftest will fail if different CPU speeds are detected. As such, you should disable CPU speed scaling before running perftest.
</quote>

Please advise if any further revisions are required. Thanks!
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2008-0432.html
Tracking this bug for the Red Hat Enterprise Linux 5.3 Release Notes. This Release Note is currently located in the Known Issues section.
Release note added. If any revisions are required, please set the "requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.