Bug 184469

Summary: irqbalance and cpuspeed daemons cannot start
Product: Red Hat Enterprise Linux 4 Reporter: Niksa Jurinovic <niksa>
Component: kernel-utils Assignee: Neil Horman <nhorman>
Status: CLOSED ERRATA QA Contact: Brian Brock <bbrock>
Severity: medium Docs Contact:
Priority: medium    
Version: 4.0 CC: jbarton, lizhang, rbiba, travnicj-priv
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Whiteboard:
Fixed In Version: RHBA-2007-0282 Doc Type: Bug Fix
Last Closed: 2007-05-07 23:34:01 UTC Type: ---
Bug Blocks: 227845    

Description Niksa Jurinovic 2006-03-08 23:26:33 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20060130 Red Hat/1.0.7-1.4.3 Firefox/1.0.7

Description of problem:
After applying Update 3 on top of RHEL4 AS Nahant Update 2 using 
up2date (about 130 packages: kernel-2.6.9-34.EL, kernel-utils-2.4-13.1.80, 
glibc-2.3.4-2.19 etc.), the irqbalance and cpuspeed daemons don't work (subsys 
locked but daemon is dead).

# /etc/rc.d/init.d/irqbalance start
Starting irqbalance:                  [  OK  ]

# /etc/rc.d/init.d/irqbalance restart
Stopping irqbalance:                  [FAILED]
Starting irqbalance:                  [  OK  ]

# /etc/rc.d/init.d/irqbalance status
irqbalance dead but subsys locked

# /etc/rc.d/init.d/cpuspeed start
(no message)

# /etc/rc.d/init.d/cpuspeed status
cpuspeed is stopped

Also, `ps ax` doesn't show irqbalance or cpuspeed processes running.

I've tried to rebuild the kernel-utils-2.4-13.1.80.src.rpm source package, but 
the build failed with:

---------------
gcc -DLOCALEDIR=\"\" -g -O2 -W -Wall -o longrun longrun.c
groff -Tascii -man longrun.1 | col -bx > README
make -C po
make[1]: Entering directory 
`/usr/src/redhat/BUILD/kernel-utils-2.4/longrun/po'
file=`echo ja | sed 's,.*/,,'`.gmo \
  && rm -f $file && msgfmt -o $file ja.po
usage: msgfmt [ -dv ] [ - ] [ name ... ]
ja.gmo: No such file or directory
make[1]: *** [ja.gmo] Error 2
make[1]: Leaving directory `/usr/src/redhat/BUILD/kernel-utils-2.4/longrun/po'
make: *** [stamp-po] Error 2
error: Bad exit status from /var/tmp/rpm-tmp.22221 (%install)
---------------

Note that everything was OK with kernel-2.6.9-22.0.2.EL and 
kernel-utils-2.4-13.1.69 from Update 2.


Version-Release number of selected component (if applicable):
kernel-utils-2.4-13.1.80, kernel-2.6.9-34.EL, glibc-2.3.4-2.19

How reproducible:
Always

Steps to Reproduce:
1. # /etc/rc.d/init.d/irqbalance start
   Starting irqbalance:                  [  OK  ]

   # /etc/rc.d/init.d/irqbalance restart
   Stopping irqbalance:                  [FAILED]
   Starting irqbalance:                  [  OK  ]

   # /etc/rc.d/init.d/irqbalance status
   irqbalance dead but subsys locked

2. # /etc/rc.d/init.d/cpuspeed start
   (no message)

   # /etc/rc.d/init.d/cpuspeed status
   cpuspeed is stopped

3. `ps ax | less` doesn't show irqbalance or cpuspeed processes running


Additional info:

The problem didn't appear with kernel-2.6.9-22.0.2.EL and 
kernel-utils-2.4-13.1.69 from Update 2.

Comment 2 Ian Laurie 2006-06-28 10:56:00 UTC
Same problem here with irqbalance and cpuspeed, though I didn't try to compile
anything.

Running cpuspeed from the command line:

server# cpuspeed
Error: Could not open file for writing:
/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
Error: No such file or directory
server#

There is no directory "cpufreq" inside cpu0, and I cannot create one:

server# mkdir cpufreq
mkdir: cannot create directory `cpufreq': Operation not permitted
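
A quick way to check whether the kernel exposes the cpufreq interface at all can be sketched as a small shell function. The name `has_cpufreq` and its optional sysfs-root argument are made up for illustration and are not part of cpuspeed or its init script:

```shell
#!/bin/sh
# has_cpufreq [SYSFS_ROOT]
# Hypothetical helper: succeeds if at least one CPU exposes a cpufreq
# scaling_governor file under SYSFS_ROOT (default: /sys).
has_cpufreq() {
    root="${1:-/sys}"
    for gov in "$root"/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
        # If the glob matched nothing, $gov is the literal pattern and -e fails.
        [ -e "$gov" ] && return 0
    done
    return 1
}

if has_cpufreq; then
    echo "cpufreq interface present"
else
    echo "no cpufreq interface; cpuspeed has nothing to write to"
fi
```

If this reports no interface, creating the directory by hand cannot help: sysfs entries are created by the kernel, which is why the mkdir above is refused.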

The fstab entry looks like this:

none    /sys      sysfs   defaults        0 0

I have the cpuspeed problem on 2 RHEL4 systems, and the irqbalance problem
on 1 RHEL4 system.  The system with both problems is a DELL Poweredge 600SC, and
the system with just the cpuspeed problem is a DELL Poweredge 750 rack-server.

"cat /proc/cpuinfo" shows both systems are running at their maximum speed.

For what it's worth, I have both problems on an FC5 box (Athlon XP1800+).

Using (on the RHEL4 boxes):
  kernel-2.6.9-34.0.1.EL
  kernel-utils-2.4-13.1.80


Comment 3 Jiri TRAVNICEK, alias JITR {temporarily not reading bugmail} 2006-07-08 23:43:49 UTC
Well, I don't know about `cpuspeed'; it didn't report any errors, so I haven't
checked it. But I think I know what's going on with `irqbalance'. Too bad the
CentOS Vault server is down at the moment, so I can't check the previous RPM. I
only installed the latest `kernel-utils' today and soon noticed the problem.

It seems the problem with `irqbalance' only appears with the uniprocessor
kernel (package `kernel'). Everything works just fine with the SMP one (package
`kernel-smp'). I don't use the `highmem' variant, so can't speak for that one.

Obviously `irqbalance' is not intended for the uniprocessor kernel, so it is
quite logical (though not really pretty) that it fails in this case. Apparently
the source of the problem is as follows:

From irqbalance.c, lines 68--69, function main():

        if (cpucount < 2)
                exit(EXIT_SUCCESS); /* UP balancing is useless */

This causes `/usr/sbin/irqbalance' to terminate without the calling script,
`/etc/init.d/irqbalance', even knowing that the daemon didn't actually start. The
script then happily creates the lock file and reports success.

Upon system shutdown, since the subsys is locked (i.e. the lock file exists), the
rc script attempts to stop the service, which reports a failure because the
`killproc irqbalance' command predictably fails: there is no process to kill.
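
To illustrate how a stale lock file leads to the "dead but subsys locked" message, here is a simplified model of an init-script status check. It is only a sketch: the function `service_state` and its arguments are invented for this example, and it is not the actual code from /etc/init.d/functions:

```shell
#!/bin/sh
# service_state NAME LOCKFILE PIDS
# Simplified model of an init-script status check (not the real
# /etc/init.d/functions code). PIDS is the list of running PIDs for the
# daemon, empty if none were found.
service_state() {
    name="$1"; lockfile="$2"; pids="$3"
    if [ -n "$pids" ]; then
        echo "$name (pid $pids) is running..."
    elif [ -f "$lockfile" ]; then
        echo "$name dead but subsys locked"
    else
        echo "$name is stopped"
    fi
}

# On a uniprocessor box the daemon has already exited, so PIDS is empty,
# but the init script has still created the lock file:
lock=$(mktemp)
service_state irqbalance "$lock" ""   # prints "irqbalance dead but subsys locked"
rm -f "$lock"
service_state irqbalance "$lock" ""   # prints "irqbalance is stopped"
```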

If I understand and recall correctly, there was a behaviour change in this
regard recently. I have a feeling that `irqbalance' used to fail at startup
before. Bug #107122, comment #2 suggests this (as do some fading
records in my memory...).

Again, with SMP kernel everything works just fine (at least for me).

The question, however, is whether the current behaviour is correct. If success is
reported even though the service doesn't start any daemon on a uniprocessor kernel,
then it should also be OK that no daemon is killed when the service is stopped.

I can also confirm that this problem didn't occur (or at least not in the form
described) with Update 2. I in fact have CentOS 4.2 installed, which is currently
being gradually updated to 4.3. These should correspond to RHEL 4's Updates 2
and 3 respectively.


Comment 5 RHEL Program Management 2006-08-18 16:32:40 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 7 Neil Horman 2006-09-06 15:03:22 UTC
I fixed this by adding some logic to the irqbalance init script.  I didn't want
to break any cases in which silent failure for the U.P. case was expected when
irqbalance is used in another script (several people seem to do this for custom
irq balancing purposes).  This maintains the current behavior for the
application binary, but fails the service and logs an error in /var/log/messages
in the event that "service irqbalance start" is run on a uniprocessor system.
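
The actual change is not shown in this report. As a rough sketch of the kind of check described, assuming the CPU count is taken from the "processor" lines in /proc/cpuinfo (as on i386) and that logger(1) is used for the syslog message, it might look like the following. The helper names `count_cpus` and `start_guard` are hypothetical:

```shell
#!/bin/sh
# count_cpus [CPUINFO]
# Counts "processor" entries in a cpuinfo-style file (default: /proc/cpuinfo).
count_cpus() {
    grep -c '^processor' "${1:-/proc/cpuinfo}" 2>/dev/null
}

# start_guard [CPUINFO]
# Hypothetical pre-start check: returns non-zero and logs a message when
# fewer than two CPUs are visible to the running kernel.
start_guard() {
    ncpu=$(count_cpus "$1")
    if [ "${ncpu:-0}" -lt 2 ]; then
        msg="irqbalance: only ${ncpu:-0} CPU(s) found, not starting"
        # logger(1) writes to syslog; fall back to stderr if it fails.
        logger -t irqbalance "$msg" 2>/dev/null || echo "$msg" >&2
        return 1
    fi
    return 0
}

if start_guard; then
    echo "would start irqbalance"
else
    echo "refusing to start irqbalance"
fi
```

Keeping the check in the init script rather than in the binary leaves `/usr/sbin/irqbalance' itself exiting silently with success on a uniprocessor system, so other scripts that call it directly are unaffected.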

Comment 15 Red Hat Bugzilla 2007-05-07 23:34:01 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0282.html