Description of problem:
We observe crashes with 2.6.24.1-24.el5rt when running java testsuite. Console output for 2 different crashes follows:

Unable to handle kernel NULL pointer dereference at 0000000000000060 RIP:
 [<ffffffff80231807>] pick_next_task_fair+0x2d/0x3f
PGD d5c1f067 PUD 6b548067 PMD 0
Oops: 0000 [1] PREEMPT SMP
CPU 1
Modules linked in: ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 xt_state ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge nfsd auth_rpcgss exportfs nfs lockd nfs_acl autofs4 hidp rfcomm l2cap bluetooth sunrpc iscsi_tcp ib_iser libiscsi scsi_transport_iscsi rdma_ucm ib_ucm rdma_cm iw_cm ib_addr ib_srp scsi_transport_srp scsi_tgt ib_ipoib ib_cm ib_sa ipv6 ib_uverbs ib_umad ib_mad ib_core dm_mirror dm_multipath dm_mod video output sbs sbshc dock battery ac parport_pc lp parport ide_cd ata_generic sr_mod cdrom joydev pata_acpi sg e1000 serio_raw rtc_cmos rtc_core button rtc_lib jedec_probe pata_amd libata cfi_probe gen_probe i2c_nforce2 forcedeth mtd pcspkr i2c_core chipreg k8temp hwmon shpchp usb_storage mptsas mptscsih mptbase scsi_transport_sas sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd ssb uhci_hcd
Pid: 8097, comm: java Not tainted 2.6.24.1-24.el5rt #1
RIP: 0010:[<ffffffff80231807>] [<ffffffff80231807>] pick_next_task_fair+0x2d/0x3f
RSP: 0000:ffff8100df8edc08 EFLAGS: 00010046
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff8095d940
RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8100010217e0
RBP: ffff8100df8edc18 R08: 00000000d5431fc0 R09: ffff810001021780
R10: ffff8100df8edbc8 R11: 0000000000000000 R12: 0000000000402140
R13: 0000000000000004 R14: ffff810001021780 R15: 0000000000000296
FS: 00002b49feade480(0000) GS:ffff81011fc23bc0(0063) knlGS:00000000d52b5b90
CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
CR2: 0000000000000060 CR3: 00000000d5cd1000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process java (pid: 8097, threadinfo ffff8100df8ec000, task ffff8100c4ceb120)
Stack: ffffffff80231372 ffff81011e7cd660 ffff8100df8edcf8 ffffffff804a4cbd
 ffff8100df8edc58 0000000000000092 ffff8100df8edca8 ffff8101200a2900
 ffff81010452b6a0 ffff8100c4ceb120 ffff8100df8edcd8 ffff8100c4ceb3b8
Call Trace:
 [<ffffffff80231372>] put_prev_task_rt+0xd/0x18
 [<ffffffff804a4cbd>] __schedule+0x43e/0x78d
 [<ffffffff8025e255>] __rt_mutex_adjust_prio+0x11/0x24
 [<ffffffff8025e9c5>] task_blocks_on_rt_mutex+0x103/0x1bf
 [<ffffffff804a5327>] schedule+0xdf/0xff
 [<ffffffff804a5e81>] rt_mutex_slowlock+0x1c3/0x29d
 [<ffffffff804a5b1e>] rt_mutex_lock+0x28/0x2a
 [<ffffffff8025ec4d>] __rt_down_read+0x47/0x4b
 [<ffffffff8025ec67>] rt_down_read+0xb/0xd
 [<ffffffff8025cfe9>] do_futex+0x36e/0xb1d
 [<ffffffff80231b66>] enqueue_entity+0x2b/0x5b
 [<ffffffff80257965>] getnstimeofday+0x31/0x8f
 [<ffffffff8025dccc>] compat_sys_futex+0xd8/0xf6
 [<ffffffff8020f66b>] syscall_trace_enter+0x95/0x99
 [<ffffffff80229b62>] ia32_sysret+0x0/0xa
Code: 48 8b 7b 60 48 85 ff 75 e0 48 8d 43 b8 41 58 5b c9 c3 55 48
RIP [<ffffffff80231807>] pick_next_task_fair+0x2d/0x3f
 RSP <ffff8100df8edc08>
CR2: 0000000000000060

kernel BUG at kernel/sched.c:818!
invalid opcode: 0000 [1] PREEMPT SMP
CPU 0
Modules linked in: ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 xt_state ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge nfsd auth_rpcgss exportfs nfs lockd nfs_acl autofs4 hidp rfcomm l2cap bluetooth sunrpc iscsi_tcp ib_iser libiscsi scsi_transport_iscsi rdma_ucm ib_ucm rdma_cm iw_cm ib_addr ib_srp scsi_transport_srp scsi_tgt ib_ipoib ib_cm ib_sa ipv6 ib_uverbs ib_umad ib_mad ib_core dm_mirror dm_multipath dm_mod video output sbs sbshc dock battery ac parport_pc lp parport ide_cd ata_generic joydev sr_mod cdrom pata_acpi sg jedec_probe e1000 serio_raw cfi_probe gen_probe rtc_cmos pata_amd button forcedeth rtc_core k8temp i2c_nforce2 libata rtc_lib hwmon i2c_core mtd pcspkr chipreg shpchp usb_storage mptsas mptscsih mptbase scsi_transport_sas sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd ssb uhci_hcd
Pid: 4258, comm: java Not tainted 2.6.24.1-24.el5rt #1
RIP: 0010:[<ffffffff80231690>] [<ffffffff80231690>] resched_task+0x24/0x5e
RSP: 0018:ffff8100df9dbc18 EFLAGS: 00010002
RAX: 0000000000000001 RBX: ffff810204086b90 RCX: ffff810204104000
RDX: ffffffff8063a100 RSI: 00000000000000bf RDI: ffff810204086b90
RBP: ffff8100df9dbc18 R08: 0000000000000003 R09: 000000000000003d
R10: ffff81021fb90048 R11: ffff810204086b90 R12: ffff8101200ae780
R13: 0000000000000001 R14: 0000000000000035 R15: ffffffff804beb20
FS: 00002b6ece788150(0000) GS:ffffffff8063a100(0063) knlGS:00000000e5651b90
CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
CR2: 00000000f7f9f000 CR3: 000000010726c000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process java (pid: 4258, threadinfo ffff8100df9da000, task ffff8100df9a86c0)
Stack: ffff8100df9dbc38 ffffffff8023308b ffff810204086b90 ffff8101200ae780
 ffff8100df9dbc88 ffffffff8023a772 0000003400000001 ffff8100df9a86c0
 0000000000000097 ffff810204086b90 ffff8102040872b0 ffff8100df9dbd38
Call Trace:
 [<ffffffff8023308b>] prio_changed_rt+0x41/0x46
 [<ffffffff8023a772>] task_setprio+0x178/0x1a0
 [<ffffffff8025e264>] __rt_mutex_adjust_prio+0x20/0x24
 [<ffffffff8025ea1d>] task_blocks_on_rt_mutex+0x15b/0x1bf
 [<ffffffff804a5e42>] rt_mutex_slowlock+0x184/0x29d
 [<ffffffff804a5b1e>] rt_mutex_lock+0x28/0x2a
 [<ffffffff8025ec4d>] __rt_down_read+0x47/0x4b
 [<ffffffff8025ec67>] rt_down_read+0xb/0xd
 [<ffffffff8025d3ce>] do_futex+0x753/0xb1d
 [<ffffffff8020c866>] retint_kernel+0x26/0x30
 [<ffffffff80257965>] getnstimeofday+0x31/0x8f
 [<ffffffff8025dccc>] compat_sys_futex+0xd8/0xf6
 [<ffffffff8020f66b>] syscall_trace_enter+0x95/0x99
 [<ffffffff80229a04>] cstar_do_call+0x1b/0x65
Code: 0f 0b eb fe 8b 41 10 a8 08 75 2d f0 0f ba 69 10 03 48 8b 47
RIP [<ffffffff80231690>] resched_task+0x24/0x5e
 RSP <ffff8100df9dbc18>

Version-Release number of selected component (if applicable):
2.6.24.1-24.el5rt kernel

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
Have you seen this failure with our latest kernel (2.6.24.4-30.el5rt)?
I think you can close this one. The crashes we've seen recently are reported in 438478 and 437933; both were reported against 2.6.24.3. Our Linux machines are busy for the next few days, so I won't be able to report whether they still occur with 2.6.24.4 until the end of next week.
OK, we'll close this for now.

Clark