Bug 462280 - sys_times access leads to "kernel BUG at kernel/exit.c:904!"
Status: CLOSED DUPLICATE of bug 455074
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.7
Hardware: i686 Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: ---
Assigned To: Red Hat Kernel Manager
QA Contact: Martin Jenner
Reported: 2008-09-15 00:42 EDT by Mikkilineni Suresh Babu
Modified: 2008-10-08 23:19 EDT
Doc Type: Bug Fix
Last Closed: 2008-10-08 23:19:23 EDT

Attachments:
/var/log/message (2.14 KB, text/plain), 2008-09-16 14:29 EDT, Hongguang Li

Description Mikkilineni Suresh Babu 2008-09-15 00:42:41 EDT
Description of problem: Servers are hitting a kernel panic.

Version-Release number of selected component (if applicable): Kernel 2.6.9-78.ELsmp

How reproducible: No known steps to reproduce.

Additional info: The kernel writes the following messages to /var/log/messages and then panics.

/var/log/messages
_____________________________________________________________________________________________
Sep 14 12:49:55 CNDAPACNDBOP02 kernel: kernel BUG at kernel/exit.c:904!
Sep 14 12:49:55 CNDAPACNDBOP02 kernel: invalid operand: 0000 [#1]
Sep 14 12:49:55 CNDAPACNDBOP02 kernel: SMP
Sep 14 12:49:55 CNDAPACNDBOP02 kernel: Modules linked in: md5 ipv6 i2c_dev i2c_core sunrpc dm_mirror dm_multipath dm_mod joydev button battery ac uhci_hcd ehci_hcd bnx2 e1000 ext3 jbd qla2400 ata_piix libata mptsas mptscsi mptbase qla2xxx scsi_transport_fc sd_mod scsi_mod
Sep 14 12:49:55 CNDAPACNDBOP02 kernel: CPU:    10
Sep 14 12:49:55 CNDAPACNDBOP02 kernel: EIP:    0060:[<c0124cc5>]    Not tainted VLI
Sep 14 12:49:55 CNDAPACNDBOP02 kernel: EFLAGS: 00010046   (2.6.9-78.ELsmp)
Sep 14 12:49:55 CNDAPACNDBOP02 kernel: EIP is at next_thread+0xc/0x3f
Sep 14 12:49:55 CNDAPACNDBOP02 kernel: eax: 00000000   ebx: f50374b0   ecx: 004a1ff4   edx: f5036430
Sep 14 12:49:55 CNDAPACNDBOP02 kernel: esi: 000054b4   edi: 00003427   ebp: 03b5a138   esp: e7133f8c
Sep 14 12:49:55 CNDAPACNDBOP02 kernel: ds: 007b   es: 007b   ss: 0068
Sep 14 12:49:55 CNDAPACNDBOP02 kernel: Process emagent (pid: 11801, threadinfo=e7133000 task=f50374b0)
Sep 14 12:49:55 CNDAPACNDBOP02 kernel: Stack: c012f536 00000246 00000000 03b5a130 00000000 e7133fa8 00000000 03b5a1bc
Sep 14 12:49:55 CNDAPACNDBOP02 kernel:        c01265f5 03b5a138 0034a800 03b5a1e8 e7133000 c02e09db 03b5a138 004a1ff4
Sep 14 12:49:55 CNDAPACNDBOP02 kernel:        0034a800 0034a800 03b5a1e8 03b5a170 0000002b c02e007b 0000007b 0000002b
Sep 14 12:49:55 CNDAPACNDBOP02 kernel: Call Trace:
Sep 14 12:49:55 CNDAPACNDBOP02 kernel:  [<c012f536>] sys_times+0x56/0x1c5
Sep 14 12:49:55 CNDAPACNDBOP02 kernel:  [<c01265f5>] sys_gettimeofday+0x53/0xac
Sep 14 12:49:55 CNDAPACNDBOP02 kernel:  [<c02e09db>] syscall_call+0x7/0xb
Sep 14 12:49:56 CNDAPACNDBOP02 kernel:  [<c02e007b>] __lock_text_end+0x880/0x107d
Sep 14 12:49:56 CNDAPACNDBOP02 kernel: Code: 85 c0 89 d3 74 05 e8 53 9c ff ff 53 e8 b9 fb ff ff 0f b6 44 24 04 c1 e0 08 50 e8 ab fb ff ff 89 c2 8b 80 f0 04 00 00 85 c0 75 08 <0f> 0b 88 03 ae 25 2f c0 0f b6 80 04 05 00 00 84 c0 7e 14 a1 80
Sep 14 12:49:56 CNDAPACNDBOP02 kernel:  <0>Fatal exception: panic in 5 seconds
_________________________________________________________________________________________________

We hit the same problem on 3 other servers running the same kernel version. We never saw this problem with the previous kernel.
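
When comparing this panic against the ones on the other servers, it can help to reduce each oops to its key fields. The following is a minimal sketch, not a Red Hat tool: the function name `summarize_oops` and the regular expressions are assumptions made for illustration, and the sample text is a subset of the log lines quoted above.

```python
import re

# A subset of the /var/log/messages lines quoted above (syslog prefix kept).
OOPS = """\
Sep 14 12:49:55 CNDAPACNDBOP02 kernel: kernel BUG at kernel/exit.c:904!
Sep 14 12:49:55 CNDAPACNDBOP02 kernel: EIP is at next_thread+0xc/0x3f
Sep 14 12:49:55 CNDAPACNDBOP02 kernel: Process emagent (pid: 11801, threadinfo=e7133000 task=f50374b0)
Sep 14 12:49:55 CNDAPACNDBOP02 kernel: Call Trace:
Sep 14 12:49:55 CNDAPACNDBOP02 kernel:  [<c012f536>] sys_times+0x56/0x1c5
Sep 14 12:49:55 CNDAPACNDBOP02 kernel:  [<c01265f5>] sys_gettimeofday+0x53/0xac
Sep 14 12:49:55 CNDAPACNDBOP02 kernel:  [<c02e09db>] syscall_call+0x7/0xb
"""

# Hypothetical patterns written for the 2.6.9 i686 oops layout shown in this report.
BUG_RE = re.compile(r"kernel BUG at (?P<file>\S+):(?P<line>\d+)!")
EIP_RE = re.compile(r"EIP is at (?P<symbol>\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")
PROC_RE = re.compile(r"Process (?P<name>\S+) \(pid: (?P<pid>\d+),")
FRAME_RE = re.compile(r"\[<[0-9a-f]{8}>\] (?P<symbol>\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def summarize_oops(text):
    """Return the BUG site, faulting symbol, process, and call-trace symbols."""
    summary = {"bug_at": None, "eip_symbol": None, "process": None, "trace": []}
    for line in text.splitlines():
        bug = BUG_RE.search(line)
        if bug:
            summary["bug_at"] = (bug["file"], int(bug["line"]))
            continue
        eip = EIP_RE.search(line)
        if eip:
            summary["eip_symbol"] = eip["symbol"]
            continue
        proc = PROC_RE.search(line)
        if proc:
            summary["process"] = (proc["name"], int(proc["pid"]))
            continue
        frame = FRAME_RE.search(line)
        if frame:
            # Call-trace entries are kept in the order the kernel printed them.
            summary["trace"].append(frame["symbol"])
    return summary


if __name__ == "__main__":
    for field, value in summarize_oops(OOPS).items():
        print(f"{field}: {value}")
```

Two hosts whose summaries agree on `bug_at`, `eip_symbol`, and the first call-trace entry (`sys_times` here) are probably hitting the same failure, though that is only a heuristic.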
Comment 1 Prarit Bhargava 2008-09-16 08:17:15 EDT
Mikkilineni, please test with the latest kernel from 

http://people.redhat.com/vgoyal/rhel4/RPMS.kernel/

and report the results here.  A sys_times patch was previously identified as causing this error.

P.
Comment 2 Hongguang Li 2008-09-16 14:29:45 EDT
Created attachment 316876: /var/log/message
Comment 3 Hongguang Li 2008-09-16 14:37:21 EDT
It is easy to reproduce in our environment: when the Cognos reporting service BIBusTKServerMain uses more memory, the kernel panics.
Comment 8 Hongguang Li 2008-09-17 10:04:34 EDT
kernel-smp-2.6.9-78.9.EL.i686.rpm fixed the problem. Is kernel-smp-2.6.9-78.9.EL.i686.rpm officially supported by Red Hat?
Comment 9 Hongguang Li 2008-09-17 10:07:27 EDT
(In reply to comment #1)
> Mikkilineni, please test with the latest kernel from 
> http://people.redhat.com/vgoyal/rhel4/RPMS.kernel/
> and report the results here.  A sys_times patch was previously identified as
> causing this error.
> P.

We can consistently reproduce the panic. After we installed kernel-smp-2.6.9-78.9.EL.i686.rpm, the problem went away. Is this kernel officially supported by Red Hat?
Comment 10 Prarit Bhargava 2008-09-17 10:11:17 EDT
Honguang, 2.6.9-78.9.EL is not officially supported by Red Hat.  Please contact your TAM or customer service for details.

P.
Comment 11 Hongguang Li 2008-09-17 10:30:10 EDT
(In reply to comment #10)
> Honguang, 2.6.9-78.9.EL is not officially supported by Red Hat.  Please contact
> your TAM or customer service for details.
> P.

Does 2.6.9-67.ELsmp (Red Hat Enterprise Linux AS release 4 (Nahant Update 6)) have the same problem?
Comment 12 Hongguang Li 2008-09-17 23:38:34 EDT
(In reply to comment #11)
> (In reply to comment #10)
> > Honguang, 2.6.9-78.9.EL is not officially supported by Red Hat.  Please contact
> > your TAM or customer service for details.
> > P.
> Does 2.6.9-67.ELsmp (Red Hat Enterprise Linux AS release 4 (Nahant Update 6))
> have same problem?

We just confirmed that 2.6.9-67.ELsmp has the same issue.
Comment 13 Mikkilineni Suresh Babu 2008-09-18 01:27:57 EDT
(In reply to comment #12)
> (In reply to comment #11)
> > (In reply to comment #10)
> > > Honguang, 2.6.9-78.9.EL is not officially supported by Red Hat.  Please contact
> > > your TAM or customer service for details.
> > > P.
> > Does 2.6.9-67.ELsmp (Red Hat Enterprise Linux AS release 4 (Nahant Update 6))
> > have same problem?
> 
> We just confirmed that 2.6.9-67.ELsmp has the same issue.

Did you check with the kernel-2.6.9-78.0.1 kernel?
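
The results reported so far in this bug can be collected into a small lookup. This is a rough sketch only: `release_key`, `reported_status`, and the `REPORTED` table are hypothetical names, the statuses are just what commenters wrote above, and the `.EL`/`.ELsmp` suffix on 2.6.9-78.0.1 is an assumption since the comment gives the version without it.

```python
import re

# Outcomes as reported in the comments above; nothing here is an official statement.
REPORTED = {
    "2.6.9-67.ELsmp": "panics",
    "2.6.9-78.ELsmp": "panics",
    "2.6.9-78.9.EL": "no panic (test kernel, not officially supported)",
}

RELEASE_RE = re.compile(r"(\d+(?:\.\d+)*)-(\d+(?:\.\d+)*)\.EL(?:smp)?")


def release_key(release):
    """Split e.g. '2.6.9-78.9.ELsmp' into ((2, 6, 9), (78, 9)), ignoring the EL/ELsmp flavor."""
    match = RELEASE_RE.fullmatch(release)
    if match is None:
        raise ValueError(f"unrecognized RHEL 4 kernel release: {release!r}")
    version = tuple(int(part) for part in match.group(1).split("."))
    build = tuple(int(part) for part in match.group(2).split("."))
    return version, build


def reported_status(release):
    """Return the outcome reported in this bug for a release, if there is one."""
    key = release_key(release)
    for known, status in REPORTED.items():
        if release_key(known) == key:
            return status
    return "not reported in this thread"


if __name__ == "__main__":
    # The suffix on 2.6.9-78.0.1 is assumed for illustration.
    for release in ("2.6.9-67.ELsmp", "2.6.9-78.ELsmp", "2.6.9-78.0.1.ELsmp", "2.6.9-78.9.ELsmp"):
        print(release, "->", reported_status(release))
```

On a live host, the string returned by `uname -r` (or Python's `platform.release()`) could be passed to `reported_status`. Because the key is a pair of integer tuples, ordinary tuple comparison also sorts builds in order, for example 78 before 78.0.1 before 78.9.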
Comment 14 Michael Kearey 2008-10-08 23:19:23 EDT

*** This bug has been marked as a duplicate of bug 455074 ***
