Description of problem: There are really two bigger issues here. One is that with the current /etc/rc.d/init.d/cpuspeed with governor different from 'userspace' (and one gets 'ondemand' when a driver happens to be 'centrino' or 'powernow-k8', like it or not) there are obvious problems with 'status' and 'condrestart'. A status will be reported "dead but locked" which, in the context, is a nonsense. A visual feedback from 'start' and 'stop' will be also confusing. The other is that there is no good way to specify in a configuration file a desired governor even if there are good reasons to do so. An attached rewrite, which actually started with that init file from rawhide 1.2.1-1.46.fc7, is a proposition to cure those deficiencies. It also tries to make that file uniform in a style throughout with a consistent indentation all over and similar things. To make that useful and reasonable to use the following fragments should show up in a configuration file (in an addition to what is already there): # Do not set CPU below that (in kHz) # MIN_SPEED="" [ "$MIN_SPEED" ] && OPTS="$OPTS -m $MIN_SPEED" # Do not set CPU above that (in kHz) # MAX_SPEED="" [ "$MAX_SPEED" ] && OPTS="$OPTS -M $MAX_SPEED" # Check /lib/modules/*/kernel/drivers/cpufreq/ for possible governors # (plus "userspace" and "performance"). They are described in # cpu-freq/governors.txt file from kernel-doc package # GOVERNOR=ondemand With 'ondemand' and 'conservative' governors there are also other tune-up parameters available and those could be exposed through a config file too; but that maybe for another day. Version-Release number of selected component (if applicable): cpuspeed-1.2.1-1.42.fc6
Created attachment 143861 [details] a proposed replacement for /etc/rc.d/init.d/cpuspeed
Some of the problems with stopping and starting should be resolved by the latest rawhide build, which simply touches /var/lock/lock/subsys/cpuspeed for the centrino/powernow-k8 case. I'm still tinkering with what additional settings to expose to users -- personally, tweaking the thresholds for when throttling occurs is still more interesting than some of the other options (esp. for my home media center system...).
> Some of the problems with stopping and starting should be resolved > by the latest rawhide build, which simply touches > var/lock/lock/subsys/cpuspeed for the centrino/powernow-k8 case. That is precisely the problem. If you have not a 'userspace' governor, now implied _only_ and unconditionally (weird!) by centrino/powernow-k8, then cpuspeed daemon does not start, as it is written that way, and status reports "dead but locked". Of course restart is affected too. A rewrite from an attachment to comment #1 proposes one way to deal with that issue.
Ah, I see now, forgot to fix up the status function... Just checked in some changes to deal with that, as well as laying some of the groundwork for using different governors. Still need to fix condrestart though, and still giving thought to how to support all the possible governor options (I think at least one more case -- performance -- needs to be dealt with). Out of time today, more tomorrow...
Checked in a ton more changes today, seems to be doing the right thing in all cases I could think of. Only thing really left to do at the moment is enable the config file parsing bits so in the case the cpuspeed daemon is running, it picks up min/max speed settings (though I'm still a little leery about exposing this without also adding in a fair amount of error-checking to prevent people from doing dumb things). Once again, more tomorrow...
> though I'm still a little leery about exposing this > without also adding in a fair amount of error-checking > to prevent people from doing dumb things As a matter of fact I tried to do various "dumb things" and nothing bad happened. I was thinking about checking max and min against a list of available frequencies, which is exposed, but there is no need even if that would be fairly trivial. Whatever I was doing was "rounded" to one of allowed values or ignored if really out-of-whack. The other thing is that an old principle says that if you will prevent doing "dumb things" then you will invariably ban also "smart things". I have seen many times how this works in practice. Checking inputs has its place but let's not overdo it.
Ah, okay, good to know. In that case, I've checked in some more changes and submitted a new rawhide build complete with a config file and init script that support specifying governor, max/min frequencies, up/down scaling thresholds and whether or not to ignore nice processes, all of which should work for both the cpufreq module and cpuspeed cases. Definitely give it a look, if you would!
OK, I had a quick peek. IMO tests for 'centrino|powernow-k8' are clearly bogus, beyond possible change in a default governor, and actually harmful. A laptop I happen to have on hands in order to have working cpuspeed at all needs to use "p4-clockmod" driver. Once you have this then you can use different governors and configurations (and actually you need to as otherwise the whole thing tends to stick at 1/8th of its maximum speed - which is not that useful) and startup from cpuspeed-1.2.1-1.47.fc7 does not allow that. From my POV this is plain broken. Not much more functionality in that 'centrino|powernow-k8' test when stopping. Either cpuspeed daemon is running, and that means that a governor has to be "userspace" regardless of a driver as otherwise this daemon just exits, or it is not running for whatever reasons. Well??? Yes, I know that it was like that in the old startup script version. That is why I rewrote it like in an attachment to a comment #1.
Ugh, this is what I get for only poking at x86_64 machines -- didn't see the p4-clockmod module... Okay, cleaned up that mess, things are now more in line with your initscript, the centrino/powernow-k8 case item only sets the governor in start(). I've just pushed a new build with these changes (plus a fix for a stupid mistake I made -- bug 221829).
> Ugh, this is what I get for only poking at x86_64 machines ... My test x86_64 machine is simply not doing any cpu scaling at all. So there any version of that stuff is equally good. :-)
I had a chance to look at cpuspeed-1.2.1-1.43.fc6. This is definitely improvement but I still have a number of comments/beefs. I tested all that on laptop which actually does have cpu scaling (as oposed to some other machines I have around) In /etc/cpuspeed.conf: 1. About "DRIVER" it says: # default value: empty (auto-detect/use built-in) If I will not specify explicitely "p4-clockmod" then 'service cpuspeed start' simply silently fails. Nothing is built-in or auto-detected. 2. Comment says: # Valid scaling governors for your cpu(s) can be found in # /sys/devices/system/cpu/cpu*/cpufreq/scaling_available_governors This is plain incorrect. First if cpuspeed is not started than that file does not exist. Even when it is started then with 'userspace' governor loaded I see there userspace performance while actually I can equally well use ondemand, conservative or powersave but they will show in these "scaling_available_governors" ONLY when corresponding modules are loaded. 'cpufreq_stats' module also can be loaded but this is another story. OTOH if I will start with ondemand governor then a content of the said file changes to ondemand userspace performance so this is hard to consult for really availalble governors. I still think that a reference to cpu-freq/governors.txt file from kernel-doc package would be a good idea. 3. A comment says about governor # default value: empty (defaults to ondemand) This is surely not what a startup script is doing and if I will not set it to something else then for me it defaults to userspace. Not a big surprise here after reading that code. :-) 4. What sets a default "ondemand" governor in a case of centrino|powernow-k8 I have no idea; but I have a gnawing suspicion, which I cannot test, than nothing. The startup script surely differs here in actions from the old version. What was the reason do default to "ondemand" in this particular case I really have no clue. "userspace" does not work so well for centrino|powernow-k8? 5. I still think that at least a symlink /etc/sysconfig/cpuspeed.conf to /etc/cpuspeed.conf would be a good idea for consistency. 6. '/etc/init.d/cpuspeed status' seems to work reliably does not matter which governor is in use. OTOH for some reasons, _sometimes_, and only with "userspace" governor I am seeing something like that: # /etc/init.d/cpuspeed status cpuspeed (pid 4607) is running... # /etc/init.d/cpuspeed restart Stopping cpuspeed: [FAILED] Starting cpuspeed: [ OK ] This is even more difficult to cause if you are doing 'sh -x /etc/init.d/cpuspeed restart'. I attach a trace when "[FAILED]" did actually happen. Something is racing somewhere?
Created attachment 145303 [details] a trace from a restart with cpuspeed-1.2.1-1.43.fc6
(In reply to comment #11) > I had a chance to look at cpuspeed-1.2.1-1.43.fc6. This is definitely > improvement but I still have a number of comments/beefs. Me too... :) > I tested all that on laptop which actually does have cpu scaling > (as oposed to some other machines I have around) > > In /etc/cpuspeed.conf: > 1. About "DRIVER" it says: > # default value: empty (auto-detect/use built-in) > If I will not specify explicitely "p4-clockmod" then > 'service cpuspeed start' simply silently fails. Nothing is built-in > or auto-detected. Ah, I'll try to clarify that... However, in talking with some kernel folks today, its been brought to my attention that p4-clockmod is largely a useless piece of junk. There are a few threads of interest on the LKML that detail how p4-clockmod is actually worse (in most cases) than simply using no frequency scaling, because the idle C2 state at max freq uses the same wattage as idle C2 at min freq, and you have to waste time/cycles/power ramping back up. The big issue is that p4-clockmod doesn't adjust voltage at all. http://lkml.org/lkml/2004/5/13/76 http://lkml.org/lkml/2006/2/25/84 > 2. Comment says: > # Valid scaling governors for your cpu(s) can be found in > # /sys/devices/system/cpu/cpu*/cpufreq/scaling_available_governors > This is plain incorrect. First if cpuspeed is not started than > that file does not exist. [...] > I still think that a reference to cpu-freq/governors.txt file from > kernel-doc package would be a good idea. Yeah, I realized last night how far off that was. :\ Will fix soon. > 3. A comment says about governor > # default value: empty (defaults to ondemand) > This is surely not what a startup script is doing and if I will > not set it to something else then for me it defaults to > userspace. Not a big surprise here after reading that code. :-) Should maybe be "defaults to ondemand for centrino and powernow-k8 systems, userspace for everything else". However, the latest rawhide changes I pushed earlier today also use ondemand by default in the p4-clockmod case (assuming the user has turned it on, and I've been convinced we don't want to turn it on by default). > 4. What sets a default "ondemand" governor in a case of > centrino|powernow-k8 I have no idea; but I have a gnawing suspicion, > which I cannot test, than nothing. The governor validation check was actually screwed up, as was the governor assignment. Both are fixed in rawhide now. But the validation check was actually setting ondemand as a fallback, so it is/was indeed working, though by accident. > The startup script surely differs > here in actions from the old version. What was the reason do default > to "ondemand" in this particular case I really have no clue. > "userspace" does not work so well for centrino|powernow-k8? In theory, a kernel-space governor should be more efficient/responsive than a user-space one. Its also one (or more) less process(es) to run. > 5. I still think that at least a symlink /etc/sysconfig/cpuspeed.conf > to /etc/cpuspeed.conf would be a good idea for consistency. Both rawhide and rhel5 use /etc/sysconfig/cpuspeed now, I was planning to leave fc6 as-is. > 6. '/etc/init.d/cpuspeed status' seems to work reliably does not matter > which governor is in use. Hey, I actually got something right!?! ;) > OTOH for some reasons, _sometimes_, and > only with "userspace" governor I am seeing something like that: > > # /etc/init.d/cpuspeed status > cpuspeed (pid 4607) is running... > # /etc/init.d/cpuspeed restart > Stopping cpuspeed: [FAILED] > Starting cpuspeed: [ OK ] > > This is even more difficult to cause if you are doing > 'sh -x /etc/init.d/cpuspeed restart'. I attach a trace when > "[FAILED]" did actually happen. Something is racing somewhere? Huh. Never seen that one myself. Given the cpuspeed daemon stop code... ---8<--- if test "x`pidof cpuspeed`" != x; then echo -n $"Stopping $prog: " killproc cpuspeed -USR1 killproc cpuspeed -INT echo fi if test "x`pidof cpuspeed`" != x; then killproc cpuspeed fi RETVAL=$? ---8<--- ...it would appear that it might be possible to have the last test return a cpuspeed pid, but by the time we get to killproc (microseconds later?), its already finished exiting... Best theory I can come up with at the moment, anyhow. Time for sleep...
> ... its been brought to my attention that p4-clockmod > is largely a useless piece of junk. I will take your word for it. The only catch is that on those machines I have around this is the _only_ driver which works on at least one box. :-) Using entirely non-scientific test how warm that laptop gets in touch I would hestitate a bit to deem that "actually worse than simply using no frequency scaling" as a laptop seems cooler. Maybe this is one of those few cases where this driver does something useful? > In theory, a kernel-space governor should be more > efficient/responsive than a user-space one. That what I thought but could not fathom a reason why things were set the way they were in the original script. One would think that if what you are saying is really the case then, if governor is not configured explicitely, first 'ondemand' (or possibly 'conservative') should be tried and if this fails then 'userspace' should be attempted. This actually should be easy to do. > we get to killproc (microseconds later?), its already finished exiting Quite possibly. Could not figure that out although did not spent much time on that. The fact that running with 'sh -x ...' makes it harder to repeat seems to indicate a race. OTOH I do not remember anything of that sort with this my first version of a changed startup.
Next rawhide build (1.2.1-1.53.fc7) attempts to address most of these issues. I *think* I see where things may be going wonky w/the userspace stop case. I thought I hadn't altered the cpuspeed stop function, but I actually did, the above-quoted bit doesn't match what's in the latest builds (oops)... I've reverted that bit to match the earlier, quoted version... At the moment, if not specified, governor will be ondemand for the centrino|powernow-k8|p4-clockmod case, userspace otherwise. Similarly, if the user specifies an invalid governor (now that the governor validation is fixed ;), those are the respective fallback values. So far as I can tell, ondemand should always be a valid option in the centrino|powernow-k8|p4-clockmod case, but if not, its certainly possible to add a fallback to userspace as a last resort. As for p4-clockmod, it certainly does still seem possible it could save some power/heat in some cases, but since one of the guys that authored the kernel-space governor support says it generally doesn't, we'll leave it up to the user to turn it on and use it if they find it helps...
With cpuspeed-1.2.1-1.46.fc6 I still got twice "[FAILED]", in fifteen tries, on stopping when doing '/etc/init.d/cpuspeed restart'. This was only with 'userspace' governor. It never happened when I was trying that with 'ondemand'. This is a wart but I do not think that this really so important or that it really affects operations.
(In reply to comment #16) > With cpuspeed-1.2.1-1.46.fc6 I still got twice "[FAILED]", in fifteen > tries, on stopping when doing '/etc/init.d/cpuspeed restart'. This was > only with 'userspace' governor. It never happened when I was trying that > with 'ondemand'. > > This is a wart but I do not think that this really so important or > that it really affects operations. Yeah, seems purely cosmetic, but rather annoying... The cpuspeed stop function is now identical to what it was before I started touching things, so I'm without an explanation at the moment. :\
> The cpuspeed stop function is now identical to what it was > before I started touching things Maybe it was failing before as well? I never seriously tried. Issues were diverse and more serious so there was no point.
Yeah, quite possible things failed before as well... I'll keep it in mind though (probably should open a new bug for it). Gonna close out this bug now though, since I'm inclined to say its purpose has been fulfilled.