Bug 8688

Summary: klogd consumes all (97-100%) cpu
Product: [Retired] Red Hat Linux
Reporter: zuazaga
Component: sysklogd
Assignee: Bill Nottingham <notting>
Status: CLOSED DEFERRED
Severity: medium
Priority: medium
Version: 6.1
CC: humberto, notting, rvokal, scottk
Hardware: i386   
OS: Linux   
Last Closed: 2000-07-22 19:37:00 UTC

Description zuazaga 2000-01-21 02:20:58 UTC
klogd on 6.1 (stock and updated with all the current errata, including
sysklogd-1.3.31-14) will start to consume 100% cpu until killed.

This has supposedly been fixed in a patch to 2.3.12, backported to 2.2.11,
as described at:

http://www.uwsg.indiana.edu/hypermail/linux/kernel/9908.1/0976.html

but it still occurs on the Red Hat 2.2.12 kernel package shipped with
6.1 (kernel-2.2.12-20).

I have a pair of machines at work that will peg the klogd process within 5
minutes of booting, every time, and will test the proposed patch.

Please let me know if there is another solution.

Comment 1 zuazaga 2000-01-21 02:34:59 UTC
Sorry, I looked at Andrea's patch, and something similar _was_ applied to
2.2.12.

At least one other fellow on comp.os.linux.misc reported the same problem on
1/5/2000.

Comment 2 Bill Nottingham 2000-01-21 06:17:59 UTC
Do you have an exploit/example that causes this behavior?

Comment 3 humberto 2000-01-21 13:02:59 UTC
I have two brand new machines with redhat 6.1 factory installed. Within about 10
minutes of booting, klogd will climb to 100% cpu, every single time.

My home machine and three other Red Hat 6.1 boxes I've set up do not exhibit
this behavior.

Comment 4 Bill Nottingham 2000-01-21 21:54:59 UTC
What modules do you have loaded? Is there anything
being output to the logs at these times?

If you strace klogd, what is it doing?

Comment 5 Scott Kammerzell 2000-02-03 17:30:59 UTC
I have this same problem on a 6.0 box that has been upgraded all the way to the
sysklogd-1.3.31-15 RPM that I found on rawhide (alikins's suggestion). We have
been unable to get an strace from it as of yet.

Comment 6 Scott Kammerzell 2000-02-04 14:34:59 UTC
This should probably be its own bug, but it is related.
On a Red Hat 6.0 machine running sysklogd-1.3.31-6, once klogd started eating
CPU time, an strace was run (strace -Tp <pid>) and not a single system call
went through while watching it.
Since I couldn't get an strace, I tried to get it to generate a core file. I
sent it SIGABRT, SIGBUS, SIGFPE, SIGSEGV, etc., but it did not respond to
any of them. Thus, as a last resort, I used kill -9 to regain the CPU.
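
When strace shows no system calls and the process ignores signals, one further check is whether the CPU time is being charged to user mode or kernel mode. The following is a minimal Python sketch of that check, not something used in this report. It assumes the /proc/<pid>/stat layout documented in proc(5), where field 3 is the state and fields 14 and 15 are utime and stime; the function names are hypothetical, and the layout has not been verified against the 2.2 kernels discussed here.

```python
import time


def parse_stat(line):
    """Return (state, utime, stime) from one /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may contain spaces,
    # so cut at the last ')' instead of splitting the whole line.
    rest = line[line.rindex(")") + 2:].split()
    # rest[0] is field 3 (state); fields 14 and 15 are utime and stime.
    return rest[0], int(rest[11]), int(rest[12])


def classify(before, after):
    """Compare two samples and report where the CPU time went."""
    _, u0, s0 = before
    _, u1, s1 = after
    du, ds = u1 - u0, s1 - s0
    if du == 0 and ds == 0:
        return "idle"
    return "kernel" if ds > du else "user"


def sample(pid, interval=5.0):
    """Take two samples of a live process, `interval` seconds apart."""
    with open(f"/proc/{pid}/stat") as f:
        before = parse_stat(f.read())
    time.sleep(interval)
    with open(f"/proc/{pid}/stat") as f:
        after = parse_stat(f.read())
    return classify(before, after)
```

Calling sample(pid) with the klogd pid while it is pegged would return "kernel" if the ticks are accumulating in stime, which would point at the kernel rather than at sysklogd itself.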

Comment 7 Bill Nottingham 2000-02-04 14:41:59 UTC
That is probably the result of the kernel bug mentioned above.

Comment 8 zuazaga 2000-02-06 23:11:59 UTC
Sorry I've not been able to work on this for a while. I modified
/etc/rc.d/init.d/syslog to run strace on klogd and have a few files where klogd
ran up the cpu. They don't seem to show anything unusual.  Here's a snippet:

1179  write(1, "<5>Jan 26 09:29:37 kernel: Tryin"..., 64) = 64
1179  time([948893377])                 = 948893377
1179  write(1, "<4>Jan 26 09:29:37 kernel: Freei"..., 68) = 68
1179  time([948893377])                 = 948893377
1179  write(1, "<6>Jan 26 09:29:37 kernel: Addin"..., 74) = 74
1179  time([948893377])                 = 948893377
1179  write(1, "<6>Jan 26 09:29:37 kernel: es137"..., 76) = 76
1179  time([948893377])                 = 948893377
1179  write(1, "<6>Jan 26 09:29:37 kernel: es137"..., 70) = 70
1179  time([948893377])                 = 948893377
1179  write(1, "<6>Jan 26 09:29:37 kernel: es137"..., 59) = 59
1179  --- SIGSTOP (Stopped (signal)) ---
1179  --- SIGTERM (Terminated) ---

I killed the klogd after it used the CPU for a few minutes. There were no
unusual messages in the syslog output.

Checking the WCHAN with ps shows the klogd process in the "-" state.

One machine is still running the stock kernel, the other was upgraded to a
recent development version. I'll check Monday with the owner to see if it still
locks up klogd.

There are more strace log files available at:

http://www.hpcf.upr.edu/~humberto/klogd/

They're 96-100K each.
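
For logs of this size, a short script can tally the system calls and show the last call completed before the signals arrived. The sketch below is a hedged illustration in Python, not part of the original report. It assumes the log format matches the snippet above (pid, call, "= return value", and "--- SIGNAME" lines for signals), and summarize is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches lines such as: 1179  time([948893377])   = 948893377
CALL = re.compile(r"^(\d+)\s+(\w+)\((.*)\)\s+=\s+(-?\d+)")
# Matches lines such as: 1179  --- SIGTERM (Terminated) ---
SIGNAL = re.compile(r"^(\d+)\s+--- (SIG\w+)")


def summarize(lines):
    """Return (call counts, signals seen in order, last completed call)."""
    calls = Counter()
    signals = []
    last_call = None
    for line in lines:
        m = CALL.match(line)
        if m:
            calls[m.group(2)] += 1
            last_call = line.strip()
            continue
        m = SIGNAL.match(line)
        if m:
            signals.append(m.group(2))
    return calls, signals, last_call
```

Passing an open log file to summarize (for example, summarize(open(path)) with path pointing at one of the strace files) returns the per-call counts, the signals in order, and the text of the last completed call.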

Comment 9 humberto 2000-02-07 14:13:59 UTC
The second machine has been running 2.3.42 since Feb 3 without locking up klogd
once. Perhaps it's fixed in newer kernels?

Comment 10 Bill Nottingham 2000-03-07 20:57:59 UTC
*** Bug 9155 has been marked as a duplicate of this bug. ***

Comment 11 Anonymous 2000-04-21 03:29:59 UTC
I also have this problem, using the 2.2.14 kernel with Red Hat 6.1

Comment 12 Bill Nottingham 2000-04-21 19:10:59 UTC
Do you have a sequence of events that you can do to make this
100% reproducible?

Comment 13 Bill Nottingham 2000-08-07 04:41:42 UTC
Closed for lack of input.