Bug 923942 - Thinkpad T420s overheating and showing wrong [or overclocked?] freq in /proc/cpuinfo with kernel-3.9.0-0.rc3.git0.5.fc20.x86_64
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2013-03-20 18:40 UTC by Satish Balay
Modified: 2013-04-03 16:03 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-04-03 16:03:42 UTC
Type: Bug
Embargoed:


Attachments
dmesg from currently running 3.9.0-0.rc3.git0.5.fc20.x86_64 [the overheating issue hasn't occurred yet] (84.61 KB, text/x-log)
2013-03-21 17:14 UTC, Satish Balay

Description Satish Balay 2013-03-20 18:40:19 UTC
Description of problem:

The laptop [Thinkpad T420s] is overheating - and /proc/cpuinfo shows an out-of-range frequency.

Version-Release number of selected component (if applicable):

kernel-3.9.0-0.rc3.git0.5.fc20.x86_64
[from rawhide-kernel-nodebug repo with F18]

How reproducible:

Tried only once [till now]

Steps to Reproduce:
1. Upgrade from F16 to F18 [via F17]
2. Update to the latest testing packages
3. Install the kernel from the rawhide-kernel-nodebug repo
4. Reboot and perhaps suspend/resume
  
Actual results:

Machine overheating - with /proc/cpuinfo listing an out-of-range [overclocked?] value for frequency. The CPU has a maximum of 2500MHz, and 3200MHz with Turbo.
But /proc/cpuinfo reports 3968MHz - a value well over 3200MHz.

Expected results:

No overheating and sane frequency numbers.

Additional info:


[root@asterix ~]# uname -a
Linux asterix 3.9.0-0.rc3.git0.5.fc20.x86_64 #1 SMP Tue Mar 19 01:25:41 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
[root@asterix ~]# grep CPU /proc/cpuinfo 
model name	: Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz
model name	: Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz
model name	: Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz
model name	: Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz
[root@asterix ~]# grep MHz /proc/cpuinfo 
cpu MHz		: 3872.000
cpu MHz		: 3840.000
cpu MHz		: 3904.000
cpu MHz		: 3968.000
[root@asterix ~]# sensors
acpitz-virtual-0
Adapter: Virtual device
temp1:        +95.0°C  (crit = +97.0°C)

coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +97.0°C  (high = +86.0°C, crit = +100.0°C)
Core 0:         +94.0°C  (high = +86.0°C, crit = +100.0°C)
Core 1:         +93.0°C  (high = +86.0°C, crit = +100.0°C)

thinkpad-isa-0000
Adapter: ISA adapter
fan1:        4628 RPM

[root@asterix ~]# cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq
cat: /sys/devices/system/cpu/cpu0/cpufreq/: No such file or directory
[root@asterix ~]# cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq 
3200000
[root@asterix ~]# cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq 
800000
[root@asterix ~]# 

Additional Note: With previous kernels - I do see 'scaling_cur_freq' set.

[I think  scaling_max_freq is usually 2500000 - but I'll have to reboot back to old kernel to recheck]
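
The comparison made by eye above (cpu MHz from /proc/cpuinfo against scaling_max_freq from sysfs) can be sketched as a small shell helper. This is only an illustration: the function name over_limit is made up here, the limit is passed in kHz because that is the unit sysfs uses, and the cpuinfo text is read from stdin.

```shell
# over_limit LIMIT_KHZ
# Read /proc/cpuinfo-style text on stdin and print every "cpu MHz" value
# that exceeds LIMIT_KHZ. Hypothetical helper, not part of any tool.
over_limit() {
    awk -F: -v limit_khz="$1" '
        /^cpu MHz/ {
            mhz = $2 + 0                      # strip whitespace, force numeric
            if (mhz * 1000 > limit_khz) printf "%.3f\n", mhz
        }'
}

# Possible use on a live system:
#   over_limit "$(cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq)" < /proc/cpuinfo
```

With the readings above and a limit of 3200000, all four values (3872, 3840, 3904, 3968) would be printed.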

Related bugzilla #859597

Comment 1 Satish Balay 2013-03-21 15:26:52 UTC
That related bugzilla is https://bugzilla.redhat.com/show_bug.cgi?id=859597

I've tried a few more suspend/resume cycles - and the overheating goes away with that. [so that part of the issue must be the same as the above bugzilla]

But the funny frequency numbers persist. So this part is the new issue?

[normally the current frequency corresponds to one of the scaling_available_frequencies values - but now I see all kinds of odd numbers for the current frequency]

[the following is with 3.8.3-203.fc18.x86_64. I didn't check the values with kernel-3.9.0-0.rc3.git0.5.fc20.x86_64]

[root@asterix cpufreq]# cat scaling_available_frequencies
2501000 2500000 2200000 2000000 1800000 1600000 1400000 1200000 1000000 800000 

Also scaling_max_freq is 2501000 for 3.8.3-203.fc18.x86_64, as opposed to 3200000 on 3.9.0-0.rc3.git0.5

[root@asterix cpufreq]# cat  scaling_max_freq
2501000
[root@asterix cpufreq]#
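
The "odd numbers" observation can be checked mechanically by testing each reported value against the frequency table. A minimal sketch, assuming the table is passed as one space-separated string in kHz (the format scaling_available_frequencies uses above); the name off_table is invented for this example.

```shell
# off_table "FREQS_KHZ"
# Read /proc/cpuinfo-style text on stdin and print, in kHz, each "cpu MHz"
# value that does not appear in the space-separated table. Hypothetical helper.
off_table() {
    awk -F: -v table="$1" '
        BEGIN {
            n = split(table, f, " ")
            for (i = 1; i <= n; i++) known[f[i] + 0] = 1
        }
        /^cpu MHz/ {
            khz = int($2 * 1000 + 0.5)        # MHz text -> integer kHz
            if (!(khz in known)) print khz
        }'
}
```

Fed the 3.8.3 table, a reading of 800.000 passes silently, while 992.000 and 1024.000 are reported as 992000 and 1024000.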

Comment 2 Satish Balay 2013-03-21 15:39:24 UTC
With 3.9.0-0.rc3.git0.5.fc20.x86_64 - I don't see scaling_available_frequencies
[and scaling_setspeed says 'unsupported' - so that's the primary issue? Again, I don't know this value with 3.8.3]

[root@asterix ~]# cd /sys/devices/system/cpu/cpu0/cpufreq/
[root@asterix cpufreq]# ls
affected_cpus     cpuinfo_min_freq            scaling_available_governors  scaling_max_freq
cpuinfo_cur_freq  cpuinfo_transition_latency  scaling_driver               scaling_min_freq
cpuinfo_max_freq  related_cpus                scaling_governor             scaling_setspeed
[root@asterix cpufreq]# cat scaling_available_frequencies
cat: scaling_available_frequencies: No such file or directory
[root@asterix cpufreq]# cat scaling_setspeed
<unsupported>
[root@asterix cpufreq]# 

Wrt idling frequencies - I see the following with 3.8.3

[root@asterix cpufreq]# grep MHz /proc/cpuinfo 
cpu MHz		: 800.000
cpu MHz		: 800.000
cpu MHz		: 800.000
cpu MHz		: 800.000
[root@asterix cpufreq]# grep MHz /proc/cpuinfo 
cpu MHz		: 800.000
cpu MHz		: 800.000
cpu MHz		: 800.000
cpu MHz		: 800.000
[root@asterix cpufreq]# grep MHz /proc/cpuinfo 
cpu MHz		: 800.000
cpu MHz		: 800.000
cpu MHz		: 800.000
cpu MHz		: 800.000
[root@asterix cpufreq]# grep MHz /proc/cpuinfo 
cpu MHz		: 800.000
cpu MHz		: 800.000
cpu MHz		: 800.000
cpu MHz		: 800.000
[root@asterix cpufreq]# grep MHz /proc/cpuinfo 
cpu MHz		: 800.000
cpu MHz		: 800.000
cpu MHz		: 800.000
cpu MHz		: 800.000
[root@asterix cpufreq]# grep MHz /proc/cpuinfo 
cpu MHz		: 800.000
cpu MHz		: 800.000
cpu MHz		: 800.000
cpu MHz		: 800.000
[root@asterix cpufreq]# 

---------------------------------------------

but with 3.9.0-0.rc3.git0.5.fc20.x86_64 - I see odd numbers like the following:


[root@asterix ~]# grep MHz /proc/cpuinfo 
cpu MHz		: 992.000
cpu MHz		: 992.000
cpu MHz		: 1024.000
cpu MHz		: 992.000
[root@asterix ~]# grep MHz /proc/cpuinfo 
cpu MHz		: 992.000
cpu MHz		: 992.000
cpu MHz		: 1024.000
cpu MHz		: 992.000
[root@asterix ~]# grep MHz /proc/cpuinfo 
cpu MHz		: 992.000
cpu MHz		: 992.000
cpu MHz		: 992.000
cpu MHz		: 992.000
[root@asterix ~]# grep MHz /proc/cpuinfo 
cpu MHz		: 992.000
cpu MHz		: 992.000
cpu MHz		: 992.000
cpu MHz		: 1024.000
[root@asterix ~]# grep MHz /proc/cpuinfo 
cpu MHz		: 1984.000
cpu MHz		: 1472.000
cpu MHz		: 1152.000
cpu MHz		: 1632.000
[root@asterix ~]#

Comment 3 Dave Jones 2013-03-21 15:55:53 UTC
please attach the output of dmesg.

Comment 4 Satish Balay 2013-03-21 17:14:22 UTC
Created attachment 714003
dmesg from currently running 3.9.0-0.rc3.git0.5.fc20.x86_64 [the overheating issue hasn't occurred yet]

Comment 5 Dirk Brandewie 2013-03-22 15:04:34 UTC
(In reply to comment #2)
> With 3.9.0-0.rc3.git0.5.fc20.x86_64 - I don't see
> scaling_available_frequencies
> [and scaling_setspeed says 'unsupported' - so that's the primary issue?
> Again I don't know this value with 3.8.3]
> 

scaling_setspeed is only supported by the userspace governor, so this is normal.

> [root@asterix cpufreq]# cat scaling_available_frequencies
> cat: scaling_available_frequencies: No such file or directory

This is normal. intel_pstate does not provide a frequency table.  The frequency table provided by acpi-cpufreq comes from the BIOS and is a well-documented fiction.

> Wrt idling frequencies - I see the following with 3.8.3
> 
> [root@asterix cpufreq]# grep MHz /proc/cpuinfo 
> cpu MHz		: 800.000
> cpu MHz		: 800.000
> cpu MHz		: 800.000
> cpu MHz		: 800.000
> ---------------------------------------------
> 
> but with 3.9.0-0.rc3.git0.5.fc20.x86_64 - I see odd numbers like the
> following:
> 
> 
> [root@asterix ~]# grep MHz /proc/cpuinfo 
> cpu MHz		: 992.000
> cpu MHz		: 992.000
> cpu MHz		: 1024.000
> cpu MHz		: 992.000

acpi-cpufreq returns the frequency associated with the pstate that it requested for the CPU.  intel_pstate returns a measured value of the effective frequency that the CPU ran at during intel_pstate's most recent sample.
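
The scaling_setspeed remark above can be turned into a quick check: the file is only meaningful when the active governor is userspace. A minimal sketch, assuming the scaling_governor file sits in the cpufreq directory listed in comment 2; the function name setspeed_supported is invented here.

```shell
# setspeed_supported [CPUFREQ_DIR]
# Succeed if the active governor is "userspace", the only governor for
# which scaling_setspeed is supported. Hypothetical helper; the directory
# defaults to cpu0's cpufreq directory.
setspeed_supported() {
    [ "$(cat "${1:-/sys/devices/system/cpu/cpu0/cpufreq}/scaling_governor")" = "userspace" ]
}

# Possible use:
#   setspeed_supported && cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed
```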

Comment 6 Dirk Brandewie 2013-03-22 15:25:23 UTC
(In reply to comment #1)
> 
> [root@asterix cpufreq]# cat scaling_available_frequencies
> 2501000 2500000 2200000 2000000 1800000 1600000 1400000 1200000 1000000
> 800000 
> 
> Also scaling_max_freq is 2501000 for 3.8.3-203.fc18.x86_64, as opposed to
> 3200000 on 3.9.0-0.rc3.git0.5
> 
> [root@asterix cpufreq]# cat  scaling_max_freq
> 2501000

The 2501000 frequency really means "use turbo range"; on your part, that range is 2500000 - 3200000.  intel_pstate returns the max_turbo frequency supported by the part.
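
The "2501000 means turbo" reading can be recognised from the table itself. The sketch below assumes, based only on the listing quoted in this comment, that the marker is a top entry exactly 1000 kHz above the next one; the function name has_turbo_marker is made up for illustration.

```shell
# has_turbo_marker "FREQS_KHZ"
# Succeed if the first (highest) table entry is exactly 1000 kHz above the
# second, as in "2501000 2500000 ...". Hypothetical helper; the 1000 kHz
# offset is an assumption taken from the table in this report.
has_turbo_marker() {
    set -- $1                          # split the table into positional parameters
    [ "$#" -ge 2 ] && [ "$(( $1 - $2 ))" -eq 1000 ]
}
```

For the 3.8.3 table shown above the check succeeds; for a table starting "2500000 2200000" it fails.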

Comment 7 Satish Balay 2013-03-22 15:31:33 UTC
(In reply to comment #5)

> This is normal.  intel_pstate does not provide a frequency table.  The frequency
> table provided by acpi-cpufreq come from the BIOS and is a well documented
> fiction.

> acpi-cpufreq returns the frequency associated with the pstate that it
> requested for the CPU.  intel_pstate returns a measured value of the
> effective frequency that the CPU ran at during intel_pstate's most recent
> sample.

I don't completely understand.

Are you saying the default driver changed [or the driver is querying the BIOS instead of p-states? or something else?] between 3.8 and 3.9 - so the reported values are different?

Or that a different driver is loaded hence the values are different - and I should somehow load the correct driver to get back the 3.8 kernel behaviour?

Comment 8 Dirk Brandewie 2013-03-22 15:46:09 UTC
(In reply to comment #7)
> (In reply to comment #5)
> 
> > This is normal.  intel_pstate does not provide a frequency table.  The frequency
> > table provided by acpi-cpufreq come from the BIOS and is a well documented
> > fiction.
> 
> > acpi-cpufreq returns the frequency associated with the pstate that it
> > requested for the CPU.  intel_pstate returns a measured value of the
> > effective frequency that the CPU ran at during intel_pstate's most recent
> > sample.
> 
> I don't completely understand.
> 
> Are you saying the default driver changed [or the driver is querying bios
> instead of p-states? or something else?] between 3.8 and 3.9 - so the
> reported values are different?

The default scaling driver for SandyBridge processors did change to intel_pstate from acpi-cpufreq.  Both drivers are built-in.

> 
> Or that a different driver is loaded hence the values are different - and I
> should somehow load the correct driver to get back the 3.8 kernel behaviour?

The intel_pstate driver returns real/measured values and does not reflect the values coming from the BIOS/ACPI.

If you want to revert to the 3.8 behaviour, you can add intel_pstate=disable to your kernel command line; this will get you back to the previous defaults for your kernel.

This will cost you some amount of power efficiency, however.  The goal of intel_pstate is to improve the efficiency of SandyBridge-based platforms.
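
To see which driver is actually active, and whether intel_pstate=disable took effect after a reboot, something like the following could be used. It is a sketch under assumptions: the helper names are invented, scaling_driver is the file listed in comment 2, and the names it prints are expected (not verified here) to be intel_pstate or acpi-cpufreq.

```shell
# scaling_driver [CPUFREQ_DIR]
# Print the name of the cpufreq driver handling a CPU. Hypothetical helper;
# the directory defaults to cpu0's cpufreq directory.
scaling_driver() {
    cat "${1:-/sys/devices/system/cpu/cpu0/cpufreq}/scaling_driver"
}

# pstate_disabled [CMDLINE_FILE]
# Succeed if the kernel command line contains intel_pstate=disable as a
# separate word. Hypothetical helper; the file defaults to /proc/cmdline.
pstate_disabled() {
    tr ' ' '\n' < "${1:-/proc/cmdline}" | grep -qx 'intel_pstate=disable'
}

# Possible use:
#   scaling_driver
#   pstate_disabled && echo "intel_pstate is disabled on the command line"
```

Both helpers take an optional path so they can be pointed at copies of the files rather than the live system.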

Comment 9 Satish Balay 2013-03-22 16:01:44 UTC
(In reply to comment #8)

> The intel_pstate driver returns real/measured values and does not reflect
> the values coming from the BIOS/ACPI.
> 
> If you want to revert to the 3.8 behaviour you can add intel_pstate=disable
> to your command line, this will get you back to the previous defaults from
> your kernel.
> 
> This will cost you some amount of power efficiency however.  The goal of
> intel_pstate is to improve the efficiency of SandyBridge based platform.

Thanks for the clarification.

But this means that the high values reported are coming from the intel_pstate driver.

Are values like "cpu MHz: 3968.000" valid - even though the turbo max is 3200000 or is this a bug in intel_pstate driver?

[I've also noticed 4K+ numbers as well]

Comment 10 Dirk Brandewie 2013-03-22 16:11:14 UTC
(In reply to comment #9)
> (In reply to comment #8)
> 
> > The intel_pstate driver returns real/measured values and does not reflect
> > the values coming from the BIOS/ACPI.
> > 
> > If you want to revert to the 3.8 behaviour you can add intel_pstate=disable
> > to your command line, this will get you back to the previous defaults from
> > your kernel.
> > 
> > This will cost you some amount of power efficiency however.  The goal of
> > intel_pstate is to improve the efficiency of SandyBridge based platform.
> 
> Thanks for the clarification.
> 
> But this means that the high values reported are coming from  - intel_pstate
> driver.
> 

Correct

> Are values like "cpu MHz: 3968.000" valid - even though the turbo max is
> 3200000 or is this a bug in intel_pstate driver?

I will go back and make sure I am doing the math correctly.  The 3200000 number is the highest pstate that you can request times 100 MHz (expressed in kHz), not a hard limit on the frequency that the processor will run at. The actual frequency the processor operates at is controlled by the processor.
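
The relationship between a P-state and the sysfs number is plain arithmetic: sysfs reports kHz, so a ratio multiplied by 100 MHz becomes ratio * 100000. A small sketch, assuming ratios of 8, 25 and 32 for this CPU (inferred from the 800, 2500 and 3200 MHz figures in this report, not read from the hardware); pstate_to_khz is a made-up name.

```shell
# pstate_to_khz PSTATE
# Convert a P-state ratio to the kHz value sysfs would show, assuming one
# ratio step equals 100 MHz. Hypothetical helper.
pstate_to_khz() {
    echo "$(( $1 * 100000 ))"
}

# pstate_to_khz 32   gives 3200000 (the scaling_max_freq seen on 3.9)
# pstate_to_khz 25   gives 2500000
# pstate_to_khz 8    gives 800000  (the scaling_min_freq)
```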

> 
> [I've also noticed 4K+ numbers as well]

Comment 11 Satish Balay 2013-03-22 17:12:36 UTC
(In reply to comment #10)

> > Are values like "cpu MHz: 3968.000" valid - even though the turbo max is
> > 3200000 or is this a bug in intel_pstate driver?
> 
> I will go back and make sure I am doing the math correctly.  The 3200000
> number is the highest pstate that you can request times 100 MHz, not a hard
> limit on the frequency that the processor will run at. The actual frequency
> the processor operates at is controlled by the processor.

Assuming low freq 800.000 [from 3.8] is getting reported as 992.000 [in 3.9] - there is a factor of 1.24 discrepancy.

But looking at the high side - I've seen up to 4064.000 [under load] - so if that corresponds to the max 3200 - that gives a factor 1.27 there.

[root@asterix ~]# grep MHz /proc/cpuinfo 
cpu MHz		: 3968.000
cpu MHz		: 4064.000
cpu MHz		: 3840.000
cpu MHz		: 3840.000
[root@asterix ~]# grep MHz /proc/cpuinfo 
cpu MHz		: 3968.000
cpu MHz		: 4000.000
cpu MHz		: 3840.000
cpu MHz		: 3808.000
[root@asterix ~]#

Note: this is a dual-core CPU with hyperthreading enabled.
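
The two discrepancy factors estimated above can be reproduced with a one-line helper. Both land just under 3200/2500 = 1.28, the ratio of the turbo frequency to the nominal maximum, which suggests (but does not by itself prove) that the wrong one of those two constants is being used as the multiplier. The function name ratio is invented for this sketch.

```shell
# ratio A B
# Print A/B rounded to two decimals. Hypothetical helper.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# ratio 992 800     prints 1.24   (idle reading on 3.9 vs 3.8)
# ratio 4064 3200   prints 1.27   (peak reading vs turbo spec)
# ratio 3200 2500   prints 1.28   (turbo vs nominal maximum)
```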

Comment 12 Satish Balay 2013-03-22 17:19:25 UTC
Just saw 4096 [briefly] with cpuburn-in. :)

[root@asterix ~]# grep MHz /proc/cpuinfo
cpu MHz: 4096.000
cpu MHz: 4064.000
cpu MHz: 3808.000
cpu MHz: 3808.000

Comment 13 Dirk Brandewie 2013-03-22 17:43:48 UTC
Seems as though I fat fingered the math to calculate the effective frequency :-(

This patch will fix it; thanks for the report.  I will get this queued up.

commit c69d751e4307506ad76ca532ab4d6763d5c74f2b
Author: Dirk Brandewie <dirk.brandewie>
Date:   Fri Mar 22 10:34:26 2013 -0700

    cpufreq/intel_pstate: Fix calculation of current frequency
    
    Use the correct pstate value to calculate the effective frequency.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=923942
    Reported-by: Satish Balay <balay>
    
    Signed-off-by: Dirk Brandewie <dirk.brandewie>
---
 drivers/cpufreq/intel_pstate.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index e84af66..ad72922 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -454,7 +454,7 @@ static inline void intel_pstate_calc_busy(struct cpudata *cpu,
 					sample->idletime_us * 100,
 					sample->duration_us);
 	core_pct = div64_u64(sample->aperf * 100, sample->mperf);
-	sample->freq = cpu->pstate.turbo_pstate * core_pct * 1000;
+	sample->freq = cpu->pstate.max_pstate * core_pct * 1000;
 
 	sample->core_pct_busy = div_s64((sample->pstate_pct_busy * core_pct),
 					100);
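
The effect of the one-line change can be replayed with integer shell arithmetic. This is a sketch, not the driver: it assumes max_pstate = 25 and turbo_pstate = 32 for the i5-2520M (inferred from the 2500 and 3200 MHz figures in this report), and the APERF/MPERF counts in the examples are invented so that core_pct comes out at 31 or 127.

```shell
# effective_khz PSTATE APERF MPERF
# Mirror the two lines touched by the patch:
#   core_pct = aperf * 100 / mperf     (integer division, like div64_u64)
#   freq     = pstate * core_pct * 1000
# Hypothetical helper; runs in a subshell so core_pct stays local.
effective_khz() (
    core_pct=$(( $2 * 100 / $3 ))
    echo "$(( $1 * core_pct * 1000 ))"
)

# Before the fix (turbo_pstate = 32):
#   effective_khz 32 31 100     prints 992000
#   effective_khz 32 127 100    prints 4064000
# After the fix (max_pstate = 25):
#   effective_khz 25 31 100     prints 775000
#   effective_khz 25 127 100    prints 3175000
```

Under these assumptions the same core_pct values reproduce both the pre-fix readings reported in this bug (992 and 4064 MHz) and the post-fix readings (775 and 3175 MHz).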

Comment 14 Satish Balay 2013-03-23 01:36:55 UTC
With the patched kernel 3.9.0-0.rc3.git1.3.fc20.x86_64 [from koji], I see better numbers now. So far the lowest frequency I see is 775 [slightly less than 800], and the highest is 3175 under load [close to the 3200 turbo spec].


$ grep MHz /proc/cpuinfo 
cpu MHz		: 825.000
cpu MHz		: 775.000
cpu MHz		: 800.000
cpu MHz		: 775.000


$ grep MHz /proc/cpuinfo 
cpu MHz		: 3000.000
cpu MHz		: 2975.000
cpu MHz		: 3175.000
cpu MHz		: 3175.000

thanks!
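
One way to tell which multiplier a kernel is using is the step size of the reported values: every pre-fix reading in this report is a multiple of 32 MHz, and every post-fix reading is a multiple of 25 MHz. A minimal sketch of that check, with an invented function name and cpuinfo text read from stdin:

```shell
# all_multiples STEP_MHZ
# Read /proc/cpuinfo-style text on stdin; succeed only if at least one
# "cpu MHz" line is present and every value is a whole multiple of STEP_MHZ.
# Hypothetical helper.
all_multiples() {
    awk -F: -v step="$1" '
        /^cpu MHz/ { n++; if (($2 + 0) % step != 0) bad++ }
        END { exit (n == 0 || bad > 0) }'
}

# Possible use:
#   all_multiples 25 < /proc/cpuinfo && echo "consistent with max_pstate = 25"
```

A single sample can be ambiguous (800 is a multiple of both 25 and 32), so several readings are needed before drawing a conclusion.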

