Could you please integrate this patch posted by Herbert Xu on linux-net into the current RHEL 3 kernel?

http://marc.theaimsgroup.com/?l=linux-net&m=106826484201781&w=2

We have been hit by this bug, and the patch fixes the problem (it's also fixed in the 2.6 kernel, but unfortunately, it's not in your backported linux-2.4.21-ipsec.patch, even in 2.4.21-9.0.1.EL).

For the sake of completeness, here is the kernel output for two of those panics:

------------[ cut here ]------------
kernel BUG at xfrm_policy.c:241!
invalid operand: 0000
esp4 sg esm aes sha256 af_key e100 e1000 ip_nat_ftp ip_conntrack_tftp ip_conntrack_ftp ipt_MARK iptable_mangle iptable_nat ipt_mark ipt_LOG ipt_state ip_connt
CPU: 0
EIP: 0060:[<c025c406>] Tainted: PF
EFLAGS: 00010282
EIP is at __xfrm_policy_destroy [kernel] 0x26 (2.4.21-9.0.1.EL/i686)
eax: cc389880 ebx: cc112400 ecx: 00000001 edx: c03c2688
esi: cc112410 edi: c038ff24 ebp: c03490a0 esp: c038ff0c
ds: 0068 es: 0068 ss: 0068
Process swapper (pid: 0, stackpage=c038f000)
Stack: c038ff24 c025c110 c012b67e cc112400 00000000 00000000 cc10a920 cc10a920
       000a2b06 00000001 c03be480 0000000a 00000046 c012b337 c03490a0 c01117b0
       00000000 c0126832 c0126746 00000000 00000001 c012658d c03be480 c03bd000
Call Trace:
[<c025c110>] xfrm_policy_timer [kernel] 0x0 (0xc038ff10)
[<c012b67e>] __run_timers [kernel] 0xae (0xc038ff14)
[<c012b337>] timer_bh [kernel] 0x47 (0xc038ff40)
[<c01117b0>] timer_interrupt [kernel] 0x80 (0xc038ff48)
[<c0126832>] bh_action [kernel] 0x22 (0xc038ff50)
[<c0126746>] tasklet_hi_action [kernel] 0x46 (0xc038ff54)
[<c012658d>] do_softirq [kernel] 0x8d (0xc038ff60)
[<c010d20b>] do_IRQ [kernel] 0xfb (0xc038ff78)
[<c010d110>] do_IRQ [kernel] 0x0 (0xc038ff98)
[<c0109070>] default_idle [kernel] 0x0 (0xc038ffa0)
[<c0109070>] default_idle [kernel] 0x0 (0xc038ffb4)
[<c0109093>] default_idle [kernel] 0x23 (0xc038ffc8)
[<c0109122>] cpu_idle [kernel] 0x42 (0xc038ffd4)
[<c0107000>] stext [kernel] 0x0 (0xc038ffe0)
Code: 0f 0b f1 00 7c 1a 2a c0 8d 43 10 89 04 24 e8 07 e1 ec ff 85
Kernel panic: Fatal exception
In interrupt handler - not syncing

***

Unable to handle kernel NULL pointer dereference at virtual address 00000032
printing eip:
c025c460
*pde = 00000000
Oops: 0000
esp4 sg esm aes sha256 af_key e100 e1000 ip_nat_ftp ip_conntrack_tftp ip_conntrack_ftp ipt_MARK iptable_mangle iptable_nat ipt_mark ipt_LOG ipt_state ip_connt
CPU: 0
EIP: 0060:[<c025c460>] Tainted: PF
EFLAGS: 00010202
EIP is at xfrm_policy_gc_kill [kernel] 0x20 (2.4.21-9.0.1.EL/i686)
eax: ce8c1c00 ebx: ce8c1c00 ecx: 00000000 edx: 00000032
esi: c192bf68 edi: c192a360 ebp: c192a000 esp: c192bf58
ds: 0068 es: 0068 ss: 0068
Process keventd (pid: 2, stackpage=c192b000)
Stack: cbc45210 c192bf68 c025c55b ce8c1c00 cbc45204 ce8c1c04 c192bf80 c192bf80
       c01268fa 00000000 c042dc60 c0429ba4 00000001 c028c334 c01313b3 c034a350
       c192bfa0 00000000 00000001 00000000 00000000 00010000 00000000 c192a000
Call Trace:
[<c025c55b>] xfrm_policy_gc_task [kernel] 0x8b (0xc192bf60)
[<c01268fa>] __run_task_queue [kernel] 0x5a (0xc192bf78)
[<c01313b3>] context_thread [kernel] 0x103 (0xc192bf90)
[<c01312b0>] context_thread [kernel] 0x0 (0xc192bfe0)
[<c010945d>] kernel_thread_helper [kernel] 0x5 (0xc192bff0)
Code: 8b 02 89 d1 89 83 d0 00 00 00 83 7a 14 01 7e 41 8b 93 d0 00
Kernel panic: Fatal exception
Created attachment 99092
Fix for XFRM crash, ported from 2.6.x

Yes, integrating this fix is perfectly fine.
The fix in the prior comment has just been committed to the RHEL3 U2 patch pool for tonight's respin (creating kernel version 2.4.21-13.EL).
An erratum has been issued that should address the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2004-188.html