Description of problem:
Attempting to dump core for multithreaded applications can trigger a kernel panic. So far we've seen it with both Apache 2.x and an internal application. We're running AS3 Update 6 with kernel 2.4.21-37.ELsmp.

------------[ cut here ]------------
kernel BUG at exec.c:1298!
invalid operand: 0000
nfs lockd sunrpc tg3 sg keybdev mousedev hid input ehci-hcd usb-uhci usbcore ext3 jbd raid1 ata_piix scsi_dump_register libata mptscsih mptbase diskdumplib sd
CPU:    2
EIP:    0060:[<c0170d49>]    Not tainted
EFLAGS: 00010202
EIP is at coredump_wait [kernel] 0x39 (2.4.21-37.ELsmp/i686)
eax: 000001bf   ebx: de120080   ecx: e7216000   edx: f40dd780
esi: e7217e68   edi: c03aca80   ebp: e7216000   esp: e7217e64
ds: 0068   es: 0068   ss: 0068
Process atlas (pid: 28004, stackpage=e7217000)
Stack: ffffffff 00000000 00000001 e7217e70 e7217e70 00000475 de120080 c0170f3a
       de120080 0000000a e72168b4 00000000 c8c2dbf0 f3e95314 f3e95314 c0138e1e
       c8c2dbf0 f3e95314 00000020 0000000b 0000000b e7216000 e72168b4 c013605f
Call Trace:
[<c0170f3a>] do_coredump [kernel] 0x16a (0xe7217e80)
[<c0138e1e>] collect_signal [kernel] 0xae (0xe7217ea0)
[<c013605f>] __dequeue_signal [kernel] 0x6f (0xe7217ec0)
[<c01360c4>] dequeue_signal [kernel] 0x34 (0xe7217edc)
[<c013767c>] get_signal_to_deliver [kernel] 0x20c (0xe7217ef8)
[<c010bf84>] do_signal [kernel] 0x64 (0xe7217f20)
[<c013c908>] do_futex [kernel] 0xf8 (0xe7217f58)
[<c013c9c9>] sys_futex [kernel] 0xb9 (0xe7217f88)
[<c011ff60>] do_page_fault [kernel] 0x0 (0xe7217fbc)
Code: 0f 0b 12 05 85 03 2c c0 89 b3 18 01 00 00 40 89 83 14 01 00

Kernel panic: Fatal exception

Version-Release number of selected component (if applicable):

How reproducible:
Currently erratic. I'm guessing we're triggering a race condition. I would guess that it could be reproduced by killing any sufficiently threaded application with a signal that dumps core.

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
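A minimal sketch of the kind of reproducer guessed at above. It has not been tested against this kernel and is not confirmed to trigger the panic. It assumes Python 3 is available; the function names and the thread count of 64 are hypothetical, and SIGSEGV is simply one signal whose default action is to dump core. The threads block on an event so that they are idle in the kernel when the signal arrives, which loosely resembles the sys_futex frames in the trace.

```python
import os
import resource
import signal
import threading


def start_blocked_threads(count):
    """Start `count` daemon threads that block until the returned event is set."""
    release = threading.Event()
    threads = [
        threading.Thread(target=release.wait, daemon=True) for _ in range(count)
    ]
    for thread in threads:
        thread.start()
    return release, threads


def allow_core_dumps():
    """Raise the soft core-size limit to the hard limit; return the new soft limit."""
    _, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)[0]


def crash_with_core_dump(count=64):
    """Hypothetical reproducer: block many threads, then take a core-dumping signal.

    This call does not return. The thread count is an arbitrary assumption.
    """
    allow_core_dumps()
    start_blocked_threads(count)
    os.kill(os.getpid(), signal.SIGSEGV)
```

Calling crash_with_core_dump() in a loop from a wrapper script would be one way to exercise the suspected race. Run it only on a machine that can afford to panic.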
Hi, Todd. The fix that went into U7 for bug 168392 might avoid the problem you're seeing. Please try upgrading to U7 (released a couple of months ago) to see if that fix resolves this problem. Unfortunately, RHEL3 is now closed (and U8 is currently in beta).
Closing re: comment #1. P.