Bug 435674 - kernel crashes with 2.6.24.1-24.el5rt
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: realtime-kernel
Version: 1.0
Hardware: i386 Linux
Priority: low
Severity: high
Assigned To: Steven Rostedt
Reported: 2008-03-03 04:51 EST by Roland Westrelin
Modified: 2008-04-23 16:24 EDT (History)
Fixed In Version: 2.6.24.4-32.el5rt
Doc Type: Bug Fix
Last Closed: 2008-04-23 16:24:50 EDT
Description Roland Westrelin 2008-03-03 04:51:09 EST
Description of problem:

We observe crashes with 2.6.24.1-24.el5rt when running a Java test suite.
Console output for two different crashes follows:

Unable to handle kernel NULL pointer dereference at 0000000000000060 RIP: 
 [<ffffffff80231807>] pick_next_task_fair+0x2d/0x3f
PGD d5c1f067 PUD 6b548067 PMD 0 
Oops: 0000 [1] PREEMPT SMP 
CPU 1 
Modules linked in: ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 xt_state
ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge nfsd auth_rpcgss
exportfs nfs lockd nfs_acl autofs4 hidp rfcomm l2cap bluetooth sunrpc iscsi_tcp
ib_iser libiscsi scsi_transport_iscsi rdma_ucm ib_ucm rdma_cm iw_cm ib_addr
ib_srp scsi_transport_srp scsi_tgt ib_ipoib ib_cm ib_sa ipv6 ib_uverbs ib_umad
ib_mad ib_core dm_mirror dm_multipath dm_mod video output sbs sbshc dock battery
ac parport_pc lp parport ide_cd ata_generic sr_mod cdrom joydev pata_acpi sg
e1000 serio_raw rtc_cmos rtc_core button rtc_lib jedec_probe pata_amd libata
cfi_probe gen_probe i2c_nforce2 forcedeth mtd pcspkr i2c_core chipreg k8temp
hwmon shpchp usb_storage mptsas mptscsih mptbase scsi_transport_sas sd_mod
scsi_mod ext3 jbd ehci_hcd ohci_hcd ssb uhci_hcd
Pid: 8097, comm: java Not tainted 2.6.24.1-24.el5rt #1
RIP: 0010:[<ffffffff80231807>]  [<ffffffff80231807>] pick_next_task_fair+0x2d/0x3f
RSP: 0000:ffff8100df8edc08  EFLAGS: 00010046
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff8095d940
RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8100010217e0
RBP: ffff8100df8edc18 R08: 00000000d5431fc0 R09: ffff810001021780
R10: ffff8100df8edbc8 R11: 0000000000000000 R12: 0000000000402140
R13: 0000000000000004 R14: ffff810001021780 R15: 0000000000000296
FS:  00002b49feade480(0000) GS:ffff81011fc23bc0(0063) knlGS:00000000d52b5b90
CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
CR2: 0000000000000060 CR3: 00000000d5cd1000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process java (pid: 8097, threadinfo ffff8100df8ec000, task ffff8100c4ceb120)
Stack:  ffffffff80231372 ffff81011e7cd660 ffff8100df8edcf8 ffffffff804a4cbd
 ffff8100df8edc58 0000000000000092 ffff8100df8edca8 ffff8101200a2900
 ffff81010452b6a0 ffff8100c4ceb120 ffff8100df8edcd8 ffff8100c4ceb3b8
Call Trace:
 [<ffffffff80231372>] put_prev_task_rt+0xd/0x18
 [<ffffffff804a4cbd>] __schedule+0x43e/0x78d
 [<ffffffff8025e255>] __rt_mutex_adjust_prio+0x11/0x24
 [<ffffffff8025e9c5>] task_blocks_on_rt_mutex+0x103/0x1bf
 [<ffffffff804a5327>] schedule+0xdf/0xff
 [<ffffffff804a5e81>] rt_mutex_slowlock+0x1c3/0x29d
 [<ffffffff804a5b1e>] rt_mutex_lock+0x28/0x2a
 [<ffffffff8025ec4d>] __rt_down_read+0x47/0x4b
 [<ffffffff8025ec67>] rt_down_read+0xb/0xd
 [<ffffffff8025cfe9>] do_futex+0x36e/0xb1d
 [<ffffffff80231b66>] enqueue_entity+0x2b/0x5b
 [<ffffffff80257965>] getnstimeofday+0x31/0x8f
 [<ffffffff8025dccc>] compat_sys_futex+0xd8/0xf6
 [<ffffffff8020f66b>] syscall_trace_enter+0x95/0x99
 [<ffffffff80229b62>] ia32_sysret+0x0/0xa


Code: 48 8b 7b 60 48 85 ff 75 e0 48 8d 43 b8 41 58 5b c9 c3 55 48 
RIP  [<ffffffff80231807>] pick_next_task_fair+0x2d/0x3f
 RSP <ffff8100df8edc08>
CR2: 0000000000000060
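
A note on reading the oops above, offered as a sketch rather than a diagnosis. Assuming the Code: bytes start at the faulting RIP, the first four bytes (48 8b 7b 60) decode as a 64-bit load from offset 0x60 off RBX into RDI. With RBX = 0 in the register dump, that gives the faulting address 0x60 reported in CR2. The helper names below (parse_frames, decode_mov_disp8) are hypothetical and exist only for this illustration; the decoder handles just this one instruction form.

```python
import re

# Matches frames such as " [<ffffffff804a4cbd>] __schedule+0x43e/0x78d"
FRAME_RE = re.compile(r"\[<([0-9a-f]+)>\]\s+(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)")

# 64-bit register names indexed by their 3-bit encoding (no REX.R/REX.B extension)
REGS = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def parse_frames(text):
    """Return (address, symbol, offset, size) for every frame found in text."""
    return [
        (int(addr, 16), sym, int(off, 16), int(size, 16))
        for addr, sym, off, size in FRAME_RE.findall(text)
    ]


def decode_mov_disp8(code):
    """Decode 'REX.W 8B /r' with an 8-bit displacement and no SIB byte.

    Returns (destination register, base register, signed displacement).
    Only this single form is supported; anything else raises ValueError.
    """
    rex, opcode, modrm, disp = code[:4]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if rex != 0x48 or opcode != 0x8B or mod != 1 or rm == 4:
        raise ValueError("unsupported instruction form")
    if disp >= 0x80:  # disp8 is sign-extended
        disp -= 0x100
    return REGS[reg], REGS[rm], disp


# First three frames of the call trace from the first crash
trace = """\
 [<ffffffff80231372>] put_prev_task_rt+0xd/0x18
 [<ffffffff804a4cbd>] __schedule+0x43e/0x78d
 [<ffffffff8025e255>] __rt_mutex_adjust_prio+0x11/0x24
"""
frames = parse_frames(trace)

# First four bytes of the "Code:" line and RBX from the register dump
code = bytes.fromhex("488b7b60")
dest, base, disp = decode_mov_disp8(code)
rbx = 0x0
fault_address = rbx + disp

for addr, sym, off, size in frames:
    # addr - off is the start address of the function containing the frame
    print(f"{sym:<24} start=0x{addr - off:x} offset=0x{off:x}/0x{size:x}")
print(f"mov 0x{disp:x}(%{base}),%{dest} faults at 0x{fault_address:x}")
```

The same frame parser applies unchanged to the second trace, where the leading 0f 0b bytes are the ud2 instruction that BUG() emits.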



kernel BUG at kernel/sched.c:818!
invalid opcode: 0000 [1] PREEMPT SMP 
CPU 0 
Modules linked in: ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 xt_state
ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge nfsd auth_rpcgss
exportfs nfs lockd nfs_acl autofs4 hidp rfcomm l2cap bluetooth sunrpc iscsi_tcp
ib_iser libiscsi scsi_transport_iscsi rdma_ucm ib_ucm rdma_cm iw_cm ib_addr
ib_srp scsi_transport_srp scsi_tgt ib_ipoib ib_cm ib_sa ipv6 ib_uverbs ib_umad
ib_mad ib_core dm_mirror dm_multipath dm_mod video output sbs sbshc dock battery
ac parport_pc lp parport ide_cd ata_generic joydev sr_mod cdrom pata_acpi sg
jedec_probe e1000 serio_raw cfi_probe gen_probe rtc_cmos pata_amd button
forcedeth rtc_core k8temp i2c_nforce2 libata rtc_lib hwmon i2c_core mtd pcspkr
chipreg shpchp usb_storage mptsas mptscsih mptbase scsi_transport_sas sd_mod
scsi_mod ext3 jbd ehci_hcd ohci_hcd ssb uhci_hcd                     
Pid: 4258, comm: java Not tainted 2.6.24.1-24.el5rt #1
RIP: 0010:[<ffffffff80231690>]  [<ffffffff80231690>] resched_task+0x24/0x5e
RSP: 0018:ffff8100df9dbc18  EFLAGS: 00010002
RAX: 0000000000000001 RBX: ffff810204086b90 RCX: ffff810204104000
RDX: ffffffff8063a100 RSI: 00000000000000bf RDI: ffff810204086b90
RBP: ffff8100df9dbc18 R08: 0000000000000003 R09: 000000000000003d
R10: ffff81021fb90048 R11: ffff810204086b90 R12: ffff8101200ae780
R13: 0000000000000001 R14: 0000000000000035 R15: ffffffff804beb20
FS:  00002b6ece788150(0000) GS:ffffffff8063a100(0063) knlGS:00000000e5651b90
CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
CR2: 00000000f7f9f000 CR3: 000000010726c000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process java (pid: 4258, threadinfo ffff8100df9da000, task ffff8100df9a86c0)
Stack:  ffff8100df9dbc38 ffffffff8023308b ffff810204086b90 ffff8101200ae780
 ffff8100df9dbc88 ffffffff8023a772 0000003400000001 ffff8100df9a86c0
 0000000000000097 ffff810204086b90 ffff8102040872b0 ffff8100df9dbd38
Call Trace:
 [<ffffffff8023308b>] prio_changed_rt+0x41/0x46
 [<ffffffff8023a772>] task_setprio+0x178/0x1a0
 [<ffffffff8025e264>] __rt_mutex_adjust_prio+0x20/0x24
 [<ffffffff8025ea1d>] task_blocks_on_rt_mutex+0x15b/0x1bf
 [<ffffffff804a5e42>] rt_mutex_slowlock+0x184/0x29d
 [<ffffffff804a5b1e>] rt_mutex_lock+0x28/0x2a
 [<ffffffff8025ec4d>] __rt_down_read+0x47/0x4b
 [<ffffffff8025ec67>] rt_down_read+0xb/0xd
 [<ffffffff8025d3ce>] do_futex+0x753/0xb1d
 [<ffffffff8020c866>] retint_kernel+0x26/0x30
 [<ffffffff80257965>] getnstimeofday+0x31/0x8f
 [<ffffffff8025dccc>] compat_sys_futex+0xd8/0xf6
 [<ffffffff8020f66b>] syscall_trace_enter+0x95/0x99
 [<ffffffff80229a04>] cstar_do_call+0x1b/0x65


Code: 0f 0b eb fe 8b 41 10 a8 08 75 2d f0 0f ba 69 10 03 48 8b 47 
RIP  [<ffffffff80231690>] resched_task+0x24/0x5e
 RSP <ffff8100df9dbc18>




Version-Release number of selected component (if applicable):
2.6.24.1-24.el5rt kernel

How reproducible:


Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:
Comment 1 Clark Williams 2008-04-08 18:27:49 EDT
Have you seen this failure with our latest kernel (2.6.24.4-30.el5rt)?
Comment 2 Roland Westrelin 2008-04-09 03:38:30 EDT
I think you can close this one. The crashes we've seen recently are reported
in bugs 438478 and 437933, both against 2.6.24.3. Our Linux machines are busy
for the next few days, so I won't be able to report whether they still occur
with 2.6.24.4 until the end of next week.
Comment 3 Clark Williams 2008-04-23 16:24:50 EDT
Ok, we'll close this for now.

Clark
