Bug 4446 - klogd takes up 100% CPU time after unaligned trap
Summary: klogd takes up 100% CPU time after unaligned trap
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: sysklogd
Version: 6.0
Hardware: alpha
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: David Lawrence
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 1999-08-09 15:40 UTC by niles
Modified: 2008-05-01 15:37 UTC (History)
0 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 1999-08-09 18:59:17 UTC
Embargoed:



Description niles 1999-08-09 15:40:05 UTC
Running emacs on my DP264 causes an unaligned trap message,
which in turn seems to put klogd into a bad state
where it uses 100% CPU time if the scheduler will let it.
I imagine this happens after any unaligned trap, but I
haven't verified that.  This is serious since most people
will use this machine for its performance, and this could
seriously affect it.

Comment 1 niles 1999-08-09 15:50:59 UTC
This may be related or a duplicate of Bug#:4371

Comment 2 Bill Nottingham 1999-08-09 16:15:59 UTC
Can you do an strace of klogd to see what happens when it
gets the unaligned trap?

Comment 3 niles 1999-08-09 16:28:59 UTC
This may be related or a duplicate of Bug#:4371

Comment 4 niles 1999-08-09 16:52:59 UTC
It's emacs, not klogd, that causes the unaligned trap.

When I try to strace emacs it causes a Segmentation fault.

Here are the last few lines of the emacs strace on alpha before the crash:
personality(0 /* PER_??? */)            = 0
getxpid()                               = 17010
getrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=8192*1024}) = 0
setrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=8192*1024}) = 0
getpgid(0)                              = 17009
getxpid()                               = 17010
setpgid(0, 17010)                       = 0

Program received signal SIGSEGV, Segmentation fault.
0x2000027cb1c in sigismember (set=0x1, signo=1) at sigismem.c:31
sigismem.c:31: No such file or directory.

Compared with the strace from a working emacs on i386:
personality(PER_LINUX)                  = 0
getpid()                                = 27104
getrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=RLIM_INFINITY}) = 0
setrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=RLIM_INFINITY}) = 0
getpgid(0)                              = 27102
getpid()                                = 27104
setpgid(0, 27104)                       = 0
SYS_175(0, 0xbffff48c, 0xbffff400, 0x8, 0) = 0

Notice the personality() function call.  I'm not sure who's calling
this but my guess would be the ld loader itself.

Here's the last bit of the strace from klogd:
write(4, "<6>Aug  9 12:43:58 kernel: No mo"..., 53) = 53
read(3, "<4>emacs(17173): unaligned trap "..., 4095) = 75
gettimeofday({934217052, 7439}, NULL)   = 0
write(4, "<4>Aug  9 12:44:12 kernel: emacs"..., 100) = 100
read(3, "<4>emacs(17173): unaligned trap "..., 4095) = 75
gettimeofday({934217052, 8416}, NULL)   = 0
write(4, "<4>Aug  9 12:44:12 kernel: emacs"..., 100) = 100
read(3, "<4>emacs(17173): unaligned trap "..., 4095) = 141
gettimeofday({934217052, 313240}, NULL) = 0
write(4, "<4>Aug  9 12:44:12 kernel: emacs"..., 101) = 101

After this klogd is caught in an infinite loop somewhere
that it never returns from.  Here's what gdb klogd says:

Starting program: /sbin/klogd -n -d
Logging line:
	Line: klogd %s-%s, log source = %s started.
	Priority: 6
Searching for symbol map.
Trying /boot/System.map-2.2.10.
Logging line:
	Line: Inspecting %s
	Priority: 6
Version string = 131594, Major = 2, Minor = 2, Patch = 10.
Comparing kernel 2.2.10 with symbol table 2.2.10.
Found table with matching version number.
End of search list encountered.
Version string = 131594, Major = 2, Minor = 2, Patch = 10.
Comparing kernel 2.2.10 with symbol table 2.2.10.
Logging line:
	Line: Loaded %d symbols from %s.
	Priority: 6
Logging line:
	Line: Symbols match kernel version %s.
	Priority: 6
Loading kernel module symbols - Size of table: 789
Logging line:
	Line: No module symbols loaded.
	Priority: 6
Logging line:
	Line: <4>emacs(17182): unaligned trap at 0000020000ac4fe8: 00000001203db8a2 2c 0

	Priority: 6
Logging line:
	Line: <4>emacs(17182): unaligned trap at 0000020000ac4f54: 00000001203db8a2 2c 0

	Priority: 6
Logging line:
	Line: <4>emacs(17182): unaligned trap at 0000020000ac8d60: 00000001203db8a2 28 16

	Priority: 6

After that klogd is hung.
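
The "Version string = 131594" lines in the debug output above are the kernel version packed into a single integer. A small Python sketch, assuming the usual kernel version-code layout of (major << 16) + (minor << 8) + patch, reproduces the Major/Minor/Patch values that klogd printed. The helper name decode_kernel_version is hypothetical and is not taken from sysklogd.

```python
def decode_kernel_version(code):
    """Unpack a version code laid out as (major << 16) + (minor << 8) + patch.

    The layout is an assumption; it matches the values in the debug output.
    """
    major = (code >> 16) & 0xFF
    minor = (code >> 8) & 0xFF
    patch = code & 0xFF
    return major, minor, patch


# 131594 is the value from the klogd debug output.
print(decode_kernel_version(131594))  # (2, 2, 10)
```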

What else can I try?

	Thanks, Rick Niles.

Comment 5 niles 1999-08-09 17:00:59 UTC
By looking at the 'dmesg' output I can see what klogd is actually
choking on:

emacs(17182): unaligned trap at 0000020000ac4fe8: 00000001203db8a2 2c 0
emacs(17182): unaligned trap at 0000020000ac4f54: 00000001203db8a2 2c 0
emacs(17182): unaligned trap at 0000020000ac8d60: 00000001203db8a2 28 16
>emacs(17182): unaligned trap at 00000<4>emacs(17183): unaligned trap at 0000020000ac4fe8: 00000001203db8a2 2c 0

Notice how there was no newline before the next "<4>".  I
think this is a race condition either caused by the fact that
the DP264 machine is SMP or because it's just so fast. :)
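
The fused record in the dmesg output points at one thing a log reader has to tolerate: a priority tag such as "<4>" turning up mid-line because the previous message was cut off before its line terminator. Below is a minimal Python sketch of a splitter that handles that case and always terminates. It is not klogd's actual code and not the patch from Comment 6, whose contents are not shown in this report. The name split_records and the tag pattern (a single digit 0-7 in angle brackets) are assumptions made for illustration.

```python
import re

# Assumed shape of a kernel priority tag: "<0>" through "<7>".
PRIORITY_TAG = re.compile(r"<[0-7]>")


def split_records(buf):
    """Split raw kernel log text into records.

    A record ends at a newline, or just before a priority tag that appears
    in the middle of a line. Both loops walk finite sequences, so malformed
    input cannot make the function spin.
    """
    records = []
    for line in buf.split("\n"):
        start = 0
        for match in PRIORITY_TAG.finditer(line):
            # A tag past the start of the current piece begins a new record.
            if match.start() > start:
                records.append(line[start:match.start()])
                start = match.start()
        if start < len(line):
            records.append(line[start:])
    return records


# The fused line from the dmesg output, with its trailing newline.
fused = (">emacs(17182): unaligned trap at 00000"
         "<4>emacs(17183): unaligned trap at 0000020000ac4fe8: "
         "00000001203db8a2 2c 0\n")
for record in split_records(fused):
    print(repr(record))
```

The truncated fragment comes out as its own record instead of being glued to the following message, and well-formed input passes through as one record per line.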

Comment 6 Bill Nottingham 1999-08-09 18:11:59 UTC
Can you try the patch I'm attaching?
(Go to the 'e-mail' link on the bugzilla page.)

This may not solve the problem, but it does fix at least one bug. :)

Comment 7 niles 1999-08-09 18:51:59 UTC
Oh yes!  Problem is fixed with this patch.
Thank you.

Comment 8 Bill Nottingham 1999-08-09 18:59:59 UTC
OK. sysklogd with this patch will be in the next Raw Hide release.

