Bug 233069

Summary:	ondemand frequency scaling doesn't work after multiple suspend/resume
Product:	[Fedora] Fedora	Reporter:	Nils Philippsen <nphilipp>
Component:	kernel	Assignee:	Kernel Maintainer List <kernel-maint>
Status:	CLOSED WONTFIX	QA Contact:	Fedora Extras Quality Assurance <extras-qa>
Severity:	medium	Docs Contact:
Priority:	medium
Version:	9	CC:	cebbert, chris.brown, davej, j1, jeppe.andersen, jonstanley, kernel-maint, magnus_vesterlund, opensource, pknirsch, richard
Hardware:	All
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2009-07-14 17:05:32 UTC	Type:	---
Bug Blocks:	427887

Description Nils Philippsen 2007-03-20 12:05:22 UTC

Description of problem:

After some suspend/resume cycles to disk ("hibernate"), cpufreq no longer
scales; it stays fixed at the lowest available frequency.

Version-Release number of selected component (if applicable):

kernel-2.6.20-1.2925.fc6

How reproducible:

Regularly.

Steps to Reproduce:
1. Do some suspend/resume cycles
2. Do something that needs much CPU
  
Actual results:
Freq stays fixed at lowest frequency setting.

Expected results:
Freq gets scaled according to demand.

Additional info:
This is a Dell Latitude D800 laptop with an Intel Centrino chipset:

00:00.0 Host bridge: Intel Corporation 82855PM Processor to I/O Controller (rev 03)
00:01.0 PCI bridge: Intel Corporation 82855PM Processor to AGP Controller (rev 03)
00:1d.0 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #1 (rev 01)
00:1d.1 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #2 (rev 01)
00:1d.2 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #3 (rev 01)
00:1d.7 USB Controller: Intel Corporation 82801DB/DBM (ICH4/ICH4-M) USB2 EHCI Controller (rev 01)
00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev 81)
00:1f.0 ISA bridge: Intel Corporation 82801DBM (ICH4-M) LPC Interface Bridge (rev 01)
00:1f.1 IDE interface: Intel Corporation 82801DBM (ICH4-M) IDE Controller (rev 01)
00:1f.5 Multimedia audio controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) AC'97 Audio Controller (rev 01)
00:1f.6 Modem: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) AC'97 Modem Controller (rev 01)
01:00.0 VGA compatible controller: nVidia Corporation NV28 [GeForce4 Ti 4200 Go AGP 8x] (rev a1)
02:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5705M Gigabit Ethernet (rev 01)
02:01.0 CardBus bridge: Texas Instruments PCI7510 PC card Cardbus Controller (rev 01)
02:01.1 CardBus bridge: Texas Instruments PCI7510,7610 PC card Cardbus Controller (rev 01)
02:01.2 FireWire (IEEE 1394): Texas Instruments PCI7410,7510,7610 OHCI-Lynx Controller
02:01.3 System peripheral: Texas Instruments PCI7410,7510,7610 PCI Firmware Loading Function

Comment 1 Chuck Ebbert 2007-03-20 14:09:07 UTC

Next time it happens, post the contents of all the files in the
/sys/devices/system/cpu/cpu0/cpufreq directory, and if there's an "ondemand"
subdirectory post all its files too.

Comment 2 Nils Philippsen 2007-03-22 16:06:10 UTC

nils@gibraltar:~> sudo fgrep -r '' /sys/devices/system/cpu/cpu0/cpufreq
/sys/devices/system/cpu/cpu0/cpufreq/ondemand/powersave_bias:0
/sys/devices/system/cpu/cpu0/cpufreq/ondemand/ignore_nice_load:0
/sys/devices/system/cpu/cpu0/cpufreq/ondemand/up_threshold:80
/sys/devices/system/cpu/cpu0/cpufreq/ondemand/sampling_rate:20000
/sys/devices/system/cpu/cpu0/cpufreq/ondemand/sampling_rate_min:10000
/sys/devices/system/cpu/cpu0/cpufreq/ondemand/sampling_rate_max:10000000
/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq:600000
/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq:600000
/sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies:1600000 1600000 1600000 1400000 1200000 1000000 800000 600000
/sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors:ondemand userspace performance
/sys/devices/system/cpu/cpu0/cpufreq/scaling_driver:centrino
/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor:ondemand
/sys/devices/system/cpu/cpu0/cpufreq/affected_cpus:0
/sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq:600000
/sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq:600000
/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq:1600000
/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq:600000
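
The telling detail in this dump is that scaling_max_freq (600000) sits below
cpuinfo_max_freq (1600000), so the governor has no room to scale up. A minimal
shell sketch of that check, assuming the standard cpufreq sysfs layout shown
above; the function name cpufreq_is_clamped is made up for illustration:

```shell
# Hypothetical helper: succeeds (exit 0) when the policy ceiling is below
# the hardware maximum, i.e. the governor cannot reach the top frequency.
# $1 is a cpufreq directory; defaults to cpu0's.
cpufreq_is_clamped() (
    dir=${1:-/sys/devices/system/cpu/cpu0/cpufreq}
    policy_max=$(cat "$dir/scaling_max_freq") || exit 2
    hw_max=$(cat "$dir/cpuinfo_max_freq") || exit 2
    echo "scaling_max_freq=$policy_max cpuinfo_max_freq=$hw_max"
    # True only when the ceiling is lower than what the hardware offers.
    [ "$policy_max" -lt "$hw_max" ]
)
```

A lowered ceiling is not always a bug (the firmware can lower it on purpose,
for example on battery), so a positive result is only a hint.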

Comment 3 Nils Philippsen 2007-03-22 16:09:45 UTC

Note that a) I had something running to generate load during the time I ran the
above command and b) "echo 1600000 > scaling_max_freq" doesn't help, it still
yields 600000 afterwards.
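
The write-then-read-back test described here can be sketched as a small shell
function. This is only an illustration under the assumption of the usual
cpufreq sysfs files; try_set_max_freq is a hypothetical name:

```shell
# Hypothetical helper: write a new ceiling and report whether it stuck.
# $1 = frequency in kHz, $2 = cpufreq directory (defaults to cpu0's).
# Needs root when pointed at the real sysfs tree.
try_set_max_freq() (
    freq=$1
    dir=${2:-/sys/devices/system/cpu/cpu0/cpufreq}
    # A failed write is not fatal here; the read-back below reports it.
    echo "$freq" > "$dir/scaling_max_freq" || true
    now=$(cat "$dir/scaling_max_freq")
    if [ "$now" = "$freq" ]; then
        echo "accepted: scaling_max_freq=$now"
    else
        echo "rejected: asked for $freq, still $now"
        exit 1
    fi
)
```

On the affected machine, "try_set_max_freq 1600000" would be expected to take
the "rejected" branch, matching the behaviour reported in this comment.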

Comment 4 Magnus Vesterlund 2007-08-30 08:22:59 UTC

I still have this problem now and then with kernel 2.6.22.4-65.fc7 for x86_64. I
have a Dell Latitude D820 with a Core 2 T7200 processor.

Comment 5 Jon Stanley 2008-01-08 01:51:49 UTC

(This is a mass-update to all current FC6 kernel bugs in NEW state)

Hello,

I'm reviewing this bug list as part of the kernel bug triage project, an attempt
to isolate current bugs in the Fedora kernel.

http://fedoraproject.org/wiki/KernelBugTriage

I am CC'ing myself to this bug, however this version of Fedora is no longer
maintained.

Please attempt to reproduce this bug with a current version of Fedora (presently
Fedora 8). If the bug no longer exists, please close the bug or I'll do so in a
few days if there is no further information lodged.

Thanks for using Fedora!

Comment 6 Nils Philippsen 2008-01-08 08:14:23 UTC

I don't have the hardware around here anymore, so it would be good if somebody
else on Cc could check this, otherwise we can only close the bug with
INSUFFICIENT_DATA.

Comment 7 Magnus Vesterlund 2008-01-09 15:37:00 UTC

This still regularly happens to me with Fedora 8.

Comment 8 Jon Stanley 2008-01-09 15:53:00 UTC

Changing version to 8 in that case.  Can you post the data requested in comment
#1 since it's a different CPU and chipset in the D820 than the D800?

Comment 9 Magnus Vesterlund 2008-01-09 19:04:04 UTC

/sys/devices/system/cpu/cpu0/cpufreq/ondemand/powersave_bias:0
/sys/devices/system/cpu/cpu0/cpufreq/ondemand/ignore_nice_load:0
/sys/devices/system/cpu/cpu0/cpufreq/ondemand/up_threshold:80
/sys/devices/system/cpu/cpu0/cpufreq/ondemand/sampling_rate:20000
/sys/devices/system/cpu/cpu0/cpufreq/ondemand/sampling_rate_min:10000
/sys/devices/system/cpu/cpu0/cpufreq/ondemand/sampling_rate_max:10000000
/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq:1000000
/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq:1000000
/sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies:2000000 1667000 1333000 1000000
/sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors:ondemand userspace performance
/sys/devices/system/cpu/cpu0/cpufreq/scaling_driver:acpi-cpufreq
/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor:ondemand
/sys/devices/system/cpu/cpu0/cpufreq/affected_cpus:0 1
/sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq:1000000
/sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq:1000000
/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq:2000000
/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq:1000000

Comment 10 Magnus Vesterlund 2008-01-11 13:17:45 UTC

I have found a weird workaround. Disconnect and reconnect the power cord.
scaling_max_freq changes from 1000000 to 2000000 and the frequency scaling
starts working again.

Still annoying when you don't have power available, though.

Comment 11 Christopher Brown 2008-01-16 23:04:56 UTC

Hello,

I'm reviewing this bug as part of the kernel bug triage project, an attempt to
isolate current bugs in the Fedora kernel.

http://fedoraproject.org/wiki/KernelBugTriage

I am CC'ing myself to this bug and will try and assist you in resolving it if I
can. I am re-assigning this to the relevant maintainer who may be able to shed
some light on it.

Comment 12 Chuck Ebbert 2008-01-29 22:35:45 UTC

There were some bugs fixed that were only present after multiple suspend and
resume cycles. Can you try kernel 2.6.23.14-123?

http://koji.fedoraproject.org/koji/buildinfo?buildID=32986

Comment 13 Magnus Vesterlund 2008-02-07 09:41:07 UTC

I still get this problem with 2.6.23.14-123.

Comment 14 Andrey Jivsov 2008-05-18 09:24:29 UTC

Happens for me with Fedora 9, installed from the DVD Fedora-9-i386-DVD.iso and
updated to 2.6.25.3-18.fc9.i686. IBM Thinkpad Z60m. The workaround in comment #10 works for me too.

Comment 15 Andrey Jivsov 2008-05-18 09:35:03 UTC

Actually, comment #10 rarely works for me, certainly not every time.

Comment 16 Jeppe R. Andersen 2008-10-10 16:55:09 UTC

The cause of the failure of the CPU governor is discussed here:
https://bugs.launchpad.net/ubuntu/+source/acpi-support/+bug/68191

Note that the bug does not appear if you _hibernate_ the computer, only _suspending_ causes the loss of the CPU governor for the second CPU. As far as I understand, this should be fixed in pm-utils, but still has not made it into Fedora.

I can confirm (Dell Latitude D630, Intel(R) Core(TM)2 Duo CPU T7700 @ 2.40GHz) that resuming after suspend _always_ has the second CPU running at full frequency, eating away the battery lifetime. Hibernation does not cause this problem.

Comment 17 Till Maas 2008-10-10 17:29:19 UTC

Looking through the comments, I cannot see any reason why this would be a bug in pm-utils. Does it not happen when you suspend using only "echo -n mem > /sys/power/state"?

You may need to apply some quirks manually to successfully resume, but afaik in most cases the machine should still be reachable via ssh, so you can still reboot it. I guess the best approach would be to use a live CD to avoid corrupting your filesystems. I can also help you with applying the quirks if you need help. I will try to reproduce this, too.


In case it also happens with echoing to /sys/power/state, it is clearly a kernel bug.
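
One way to run that comparison is to snapshot the cpufreq policy of every CPU
before and after a bare suspend and diff the two. The following is a sketch
only, assuming the standard sysfs layout; cpufreq_snapshot is a hypothetical
name:

```shell
# Hypothetical helper: print one line per CPU that has a cpufreq directory,
# in a stable format, so that two snapshots can be compared with diff.
# $1 = CPU sysfs root.
cpufreq_snapshot() (
    root=${1:-/sys/devices/system/cpu}
    for dir in "$root"/cpu[0-9]*/cpufreq; do
        [ -d "$dir" ] || continue
        cpu=${dir%/cpufreq}
        cpu=${cpu##*/}
        gov=$(cat "$dir/scaling_governor")
        min=$(cat "$dir/scaling_min_freq")
        max=$(cat "$dir/scaling_max_freq")
        echo "$cpu governor=$gov min=$min max=$max"
    done
)

# Usage sketch (as root, without pm-utils involved):
#   cpufreq_snapshot > /tmp/before
#   echo -n mem > /sys/power/state
#   cpufreq_snapshot > /tmp/after
#   diff /tmp/before /tmp/after
```

Any difference that shows up with this procedure cannot come from pm-utils,
since pm-utils never runs.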

Comment 18 Dave Jones 2008-10-10 18:08:27 UTC

when we hibernate, and the CPUs go offline, we throw away all the state related to that processor in the kernel. When it comes back online, we don't even have a way of knowing it was the same CPU, it appears as a 'new' CPU, with new state.

Due to this, pm-utils needs to store the state before beginning the suspend process, and restore it afterwards.
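
The save-and-restore idea can be sketched in shell as below. This is not
taken from pm-utils; the function names and the state-file format are
assumptions made for illustration, and per comment #3 the kernel may not
accept the restored value on an affected machine.

```shell
# Hypothetical helpers: save each CPU's governor and limits to a state
# file, then write them back. $1 = state file, $2 = CPU sysfs root.
cpufreq_save() (
    state=$1
    root=${2:-/sys/devices/system/cpu}
    : > "$state"
    for dir in "$root"/cpu[0-9]*/cpufreq; do
        [ -d "$dir" ] || continue
        cpu=${dir%/cpufreq}
        echo "${cpu##*/} $(cat "$dir/scaling_governor") $(cat "$dir/scaling_min_freq") $(cat "$dir/scaling_max_freq")" >> "$state"
    done
)

cpufreq_restore() (
    state=$1
    root=${2:-/sys/devices/system/cpu}
    while read -r cpu gov min max; do
        dir=$root/$cpu/cpufreq
        [ -d "$dir" ] || { echo "$cpu: no cpufreq directory, skipped"; continue; }
        echo "$gov" > "$dir/scaling_governor"
        # Max before min, which suits the case in this report where the
        # saved ceiling is higher than the stuck one.
        echo "$max" > "$dir/scaling_max_freq"
        echo "$min" > "$dir/scaling_min_freq"
    done < "$state"
)

# In a pm-utils sleep.d hook this might be wired up as follows (the state
# file path is an arbitrary choice):
#   case "$1" in
#       suspend|hibernate) cpufreq_save /var/run/cpufreq.state ;;
#       resume|thaw)       cpufreq_restore /var/run/cpufreq.state ;;
#   esac
```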

Comment 19 Till Maas 2008-10-10 18:26:54 UTC

(In reply to comment #18)
> when we hibernate, and the CPUs go offline, we throw away all the state related

Do you mean hibernate (suspend to disk) or suspend (to RAM) here? Comment #16 states that it only occurs with suspend, not with hibernate.

> to that processor in the kernel. When it comes back online, we don't even have
> a way of knowing it was the same CPU, it appears as a 'new' CPU, with new
> state.
> 
> Due to this, pm-utils needs to store the state before beginning the suspend
> process, and restore it afterwards.

What should pm-utils do? According to comment #10, it seems that the kernel could generate some event that makes the system work again without the need to store the pre-suspend state, which is something the kernel could do by itself.

Also, comment #3 makes it look like the well-known interface that pm-utils could use to restore the previous max_freq does not work, which again looks like a kernel bug.

Comment 20 Dave Jones 2008-10-10 18:49:27 UTC

re-reading the comments, I think you're right that the bug mentioned in comment 16 is unrelated.

Comment 21 Andrey Jivsov 2008-10-10 18:57:24 UTC

Re comment #15: updating to the recent 2.6.25.14-108.fc9.i686 kernel solved the problem for me. I've been running this setup for a few weeks and CPU scaling has worked consistently, including across sleep cycles. I don't have hibernation configured.

(I now have a different problem: system overheating, which never happened before, so I need to watch whether the scaling sticks at the maximum for too long.)

Comment 22 Jeppe R. Andersen 2008-10-10 21:49:43 UTC

Let me then elaborate on comment #16:
I am using kernel "2.6.26.5-45.fc9.i686 #1 SMP".

Normally, on a dual-core system, there is a directory
/sys/devices/system/cpu/cpu#/cpufreq/
for each CPU. But after waking from _suspend_, this directory is gone for cpu1 (though still there for cpu0), which leaves cpu1 running at the full 2.4GHz constantly and quickly defeats the point of suspending in the first place. The directory is still there after waking from _hibernation_.
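
The missing directory can be detected with a short loop over the CPU
entries. A sketch, assuming the usual sysfs layout in which cpuN/online
reads 1 for an online CPU; cpus_missing_cpufreq is a hypothetical name:

```shell
# Hypothetical helper: list CPUs that are online but have no cpufreq
# directory. $1 = CPU sysfs root.
cpus_missing_cpufreq() (
    root=${1:-/sys/devices/system/cpu}
    for cpu in "$root"/cpu[0-9]*; do
        [ -d "$cpu" ] || continue
        # cpu0 often has no "online" file; treat that as online.
        online=1
        [ -r "$cpu/online" ] && online=$(cat "$cpu/online")
        if [ "$online" = 1 ] && [ ! -d "$cpu/cpufreq" ]; then
            echo "${cpu##*/}"
        fi
    done
)
```

Run after a resume from suspend, this would be expected to print cpu1 on the
machine described here and nothing on an unaffected one.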

Please let me know if I can offer any more information which will assist in the debugging.

Thanks, Jeppe

Comment 23 Bug Zapper 2008-11-26 07:12:39 UTC

This message is a reminder that Fedora 8 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 8.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '8'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 8's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 8 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 24 Nils Philippsen 2008-11-26 09:45:30 UTC

I think I've not seen this with F10 (yet), so I'll change the product version to F9 (as per the last comments). Someone who's still on F9, please confirm if the bug is still present in the latest F9 kernel (kernel-2.6.27.5-41.fc9).

Comment 25 Jeppe R. Andersen 2008-11-26 15:40:24 UTC

Problem still present on a fully updated (Nov 26, 2008) F9 with 2.6.27.5-41.fc9.i686 #1 SMP.

Normally, on a dual-core system, there is a directory
/sys/devices/system/cpu/cpu#/cpufreq/
for each CPU. But after waking from _suspend_, this directory is gone for
cpu1 (though still there for cpu0), which leaves cpu1 running at the full
2.4GHz constantly and quickly defeats the point of suspending in the first
place. The directory is still there after waking from _hibernation_.

Comment 26 Bug Zapper 2009-06-09 22:30:17 UTC

This message is a reminder that Fedora 9 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 9.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '9'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 9's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 9 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 27 Bug Zapper 2009-07-14 17:05:32 UTC

Fedora 9 changed to end-of-life (EOL) status on 2009-07-10. Fedora 9 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.