Bug 4446 - klogd takes up 100% CPU time after unaligned trap
Status: CLOSED NEXTRELEASE
Product: Red Hat Linux
Classification: Retired
Component: sysklogd
Version: 6.0
Hardware: alpha Linux
Priority: high
Severity: high
Assigned To: David Lawrence
Reported: 1999-08-09 11:40 EDT by niles
Modified: 2008-05-01 11:37 EDT (History)
Doc Type: Bug Fix
Last Closed: 1999-08-09 14:59:17 EDT
Attachments: None
Description niles 1999-08-09 11:40:05 EDT
Running emacs on my DP264 causes an unaligned trap message,
which in turn seems to put klogd into a bad state
where it uses 100% CPU time if the scheduler will let it.
I imagine this happens after any unaligned trap, but I
haven't verified that.  This is serious since most people
will use this machine for its performance, and this could
seriously affect it.
Comment 1 niles 1999-08-09 11:50:59 EDT
This may be related to, or a duplicate of, Bug #4371.
Comment 2 Bill Nottingham 1999-08-09 12:15:59 EDT
Can you do an strace of klogd to see what happens when it
gets the unaligned trap?
Comment 3 niles 1999-08-09 12:28:59 EDT
This may be related to, or a duplicate of, Bug #4371.
Comment 4 niles 1999-08-09 12:52:59 EDT
It's emacs, not klogd, that causes the unaligned trap.

When I try to strace emacs, it dies with a segmentation fault.

Here are the last few lines of the emacs trace on alpha before the crash:
personality(0 /* PER_??? */)            = 0
getxpid()                               = 17010
getrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=8192*1024}) = 0
setrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=8192*1024}) = 0
getpgid(0)                              = 17009
getxpid()                               = 17010
setpgid(0, 17010)                       = 0

Program received signal SIGSEGV, Segmentation fault.
0x2000027cb1c in sigismember (set=0x1, signo=1) at sigismem.c:31
sigismem.c:31: No such file or directory.

Compared with the strace from a working emacs on i386:
personality(PER_LINUX)                  = 0
getpid()                                = 27104
getrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=RLIM_INFINITY}) = 0
setrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=RLIM_INFINITY}) = 0
getpgid(0)                              = 27102
getpid()                                = 27104
setpgid(0, 27104)                       = 0
SYS_175(0, 0xbffff48c, 0xbffff400, 0x8, 0) = 0

Notice the personality() call.  I'm not sure what is calling
it, but my guess would be the ld loader itself.

Here's the last bit of the strace from klogd:
write(4, "<6>Aug  9 12:43:58 kernel: No mo"..., 53) = 53
read(3, "<4>emacs(17173): unaligned trap "..., 4095) = 75
gettimeofday({934217052, 7439}, NULL)   = 0
write(4, "<4>Aug  9 12:44:12 kernel: emacs"..., 100) = 100
read(3, "<4>emacs(17173): unaligned trap "..., 4095) = 75
gettimeofday({934217052, 8416}, NULL)   = 0
write(4, "<4>Aug  9 12:44:12 kernel: emacs"..., 100) = 100
read(3, "<4>emacs(17173): unaligned trap "..., 4095) = 141
gettimeofday({934217052, 313240}, NULL) = 0
write(4, "<4>Aug  9 12:44:12 kernel: emacs"..., 101) = 101

After this, klogd is stuck in an infinite loop that it never
returns from.  Here's what running klogd under gdb prints:

Starting program: /sbin/klogd -n -d
Logging line:
	Line: klogd %s-%s, log source = %s started.
	Priority: 6
Searching for symbol map.
Trying /boot/System.map-2.2.10.
Logging line:
	Line: Inspecting %s
	Priority: 6
Version string = 131594, Major = 2, Minor = 2, Patch = 10.
Comparing kernel 2.2.10 with symbol table 2.2.10.
Found table with matching version number.
End of search list encountered.
Version string = 131594, Major = 2, Minor = 2, Patch = 10.
Comparing kernel 2.2.10 with symbol table 2.2.10.
Logging line:
	Line: Loaded %d symbols from %s.
	Priority: 6
Logging line:
	Line: Symbols match kernel version %s.
	Priority: 6
Loading kernel module symbols - Size of table: 789
Logging line:
	Line: No module symbols loaded.
	Priority: 6
Logging line:
	Line: <4>emacs(17182): unaligned trap at 0000020000ac4fe8: 00000001203db8a2 2c 0

	Priority: 6
Logging line:
	Line: <4>emacs(17182): unaligned trap at 0000020000ac4f54: 00000001203db8a2 2c 0

	Priority: 6
Logging line:
	Line: <4>emacs(17182): unaligned trap at 0000020000ac8d60: 00000001203db8a2 28 16

	Priority: 6

After that klogd is hung.

What else can I try?

	Thanks, Rick Niles.
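
The klogd strace above shows read() returning 75, 75 and then 141 bytes, so one read does not necessarily carry exactly one message. A reader that wants whole lines has to carry any unfinished tail over to the next read. Below is a minimal Python sketch of that buffering idea. The names make_line_assembler and feed are hypothetical, the sample chunks are made up, and this is an illustration only: it is not klogd's code and not the patch discussed in this bug.

```python
def make_line_assembler():
    """Return a feed(chunk) function that hands back only complete lines.

    Hypothetical helper for illustration; not taken from klogd.
    """
    pending = ""  # unfinished tail left over from earlier chunks

    def feed(chunk):
        nonlocal pending
        pending += chunk
        # Everything before the last "\n" is complete; the rest waits
        # for the next chunk (it is "" when the chunk ended in "\n").
        *complete, pending = pending.split("\n")
        return complete

    return feed


feed = make_line_assembler()
# Made-up chunks: the second one ends in the middle of a message.
chunks = ["<4>first message\n", "<4>second message\n<4>third mes", "sage\n"]
for chunk in chunks:
    print(feed(chunk))
```

The point of the design is that a partial message is never processed on its own; it only comes out once its terminating newline has arrived.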
Comment 5 niles 1999-08-09 13:00:59 EDT
By looking at the 'dmesg' output I can see what klogd is actually
choking on:

emacs(17182): unaligned trap at 0000020000ac4fe8: 00000001203db8a2 2c 0
emacs(17182): unaligned trap at 0000020000ac4f54: 00000001203db8a2 2c 0
emacs(17182): unaligned trap at 0000020000ac8d60: 00000001203db8a2 28 16
>emacs(17182): unaligned trap at 00000<4>emacs(17183): unaligned trap at 0000020000ac4fe8: 00000001203db8a2 2c 0

Notice that there is no newline before the next "<4>".  I
think this is a race condition, caused either by the fact that
the DP264 machine is SMP or because it's just so fast. :)
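
For anyone who wants to spot this pattern in a saved dmesg dump, here is a minimal Python sketch. It assumes that records begin with a priority marker of the form "<N>" with N from 0 to 7, and it flags any line where such a marker shows up in the middle. The helper names split_on_markers and find_interleaved are made up for illustration; this is not the logic klogd uses, and it is not the patch.

```python
import re

# Assumption: a kernel log priority marker looks like "<0>" .. "<7>".
PRIORITY = re.compile(r"<[0-7]>")


def split_on_markers(line):
    """Split one line into pieces, starting a new piece at each marker."""
    cuts = sorted({0, *(m.start() for m in PRIORITY.finditer(line))})
    cuts.append(len(line))
    return [line[a:b] for a, b in zip(cuts, cuts[1:]) if a != b]


def find_interleaved(text):
    """Return (line_number, pieces) for each line holding more than one piece."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        pieces = split_on_markers(line)
        if len(pieces) > 1:
            hits.append((number, pieces))
    return hits


# Two lines taken from the dmesg output in this comment.
sample = (
    "emacs(17182): unaligned trap at 0000020000ac8d60: 00000001203db8a2 28 16\n"
    ">emacs(17182): unaligned trap at 00000"
    "<4>emacs(17183): unaligned trap at 0000020000ac4fe8: 00000001203db8a2 2c 0\n"
)
for number, pieces in find_interleaved(sample):
    print(number, pieces)
```

On the sample, only the second line is reported, split into the cut-off message from emacs(17182) and the message from emacs(17183) that starts in the middle of it.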
Comment 6 Bill Nottingham 1999-08-09 14:11:59 EDT
Can you try the patch I'm attaching
(go to the 'e-mail' link on the bugzilla page)

This may not solve the problem, but it does fix at least one bug. :)
Comment 7 niles 1999-08-09 14:51:59 EDT
Oh yes!  Problem is fixed with this patch.
Thank you.
Comment 8 Bill Nottingham 1999-08-09 14:59:59 EDT
OK. sysklogd with this patch will be in the next Raw Hide release.