Description of problem: I am encountering a kernel panic rougly 40-50% of the time when running MEMORY2. Debug: sleeping function called from invalid context at include/linux/rwsem.h:43 in_atomic():0[expected: 0], irqs_disabled():1 I've put a console trace in the Additional Info section. Here's our engineer's analysis of the issue: include/linux/rwsem.h:43 is a call to might_sleep() in static inline function down_read(). There are only two direct callers of down_read in ia64_do_page_fault(). The first is called on every page fault, so seems unlikely to be the culprit. The second is only called when there's a page fault in the init process when we're already under an out of memory condition. This is probably a fairly rare error path and may not be well tested. I don't see any reason yet to suspect this is a platform or processor issue. Version-Release number of selected component (if applicable): rhr2-2.0.3 MEMORY2 kernel - 2.6.9-34.EL How reproducible: Appears to occur roughly 2 out of 3 invocations of MEMORY2 on a 32GB 8-way system. Steps to Reproduce: 1. Repeatedly run the MEMORY2 test until panic 2. 3. Actual results: Mixed - sometimes test runs to completion, sometimes it panics Expected results: Additional info: Console Trace: MEMORY2... Debug: sleeping function called from invalid context at include/linux/rwsem.h:43 in_atomic():0[expected: 0], irqs_disabled():1 Call Trace: [<a000000100016b20>] show_stack+0x80/0xa0 sp=e0000007d06879a0 bsp=e0000007d06811c0 [<a000000100016b70>] dump_stack+0x30/0x60 sp=e0000007d0687b70 bsp=e0000007d06811a8 [<a000000100067fb0>] __might_sleep+0x190/0x260 sp=e0000007d0687b70 bsp=e0000007d0681180 [<a000000100061570>] ia64_do_page_fault+0x110/0xbc0 sp=e0000007d0687b80 bsp=e0000007d0681118 [<a00000010000f540>] ia64_leave_kernel+0x0/0x260 sp=e0000007d0687c30 bsp=e0000007d0681118 Unable to handle kernel NULL pointer dereference (address 0000000000000000) threaded_memtes[9956]: Oops 4294967296 [1] Modules linked in: iptable_nat ip_conntrack iptable_filter ip_tables md5 ipv6 parport_pc lp parport autofs4 sunrpc ds yenta_socket pcmcia_core vfat fat dm_multipath button ohci_hcd ehci_hcd tg3 e1000 dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod qla2300 qla2xxx scsi_transport_fc mptscsih mptsas mptspi mptfc mptscsi mptbase sd_mod scsi_mod Pid: 9956, CPU 4, comm: threaded_memtes psr : 0000101008122010 ifs : 8000000000000000 ip : [<0000000000000000>] Not tainted ip is at 0x0 unat: 0000000000000000 pfs : 0000000000000000 rsc : 0000000000000003 rnat: 0000000000000000 bsps: 0000000000000000 pr : 000000000555669b ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c8a70033f csd : 0000000000000000 ssd : 0000000000000000 b0 : 0000000000000002 b6 : a0000001000af440 b7 : a00000010000edd0 f6 : 1003e000000000000004b f7 : 1003e000000000000004b f8 : 1003e000000000000122a f9 : 1003e0000000000000330 f10 : 1003e00000000000007f8 f11 : 1003e6666666666666667 r1 : a0000001009add90 r2 : e0000007d0687df8 r3 : e0000007d0687de8 r8 : e0000040c20d0000 r9 : 0000000002f6d339 r10 : 00001213081a6018 r11 : 0000000002f6d139 r12 : e0000007d0687e00 r13 : e0000007d0680000 r14 : e0000007d0687da0 r15 : e0000007d0687da8 r16 : 0000000000000000 r17 : 0000000000000000 r18 : 0000000000000000 r19 : 0009804c8a70033f r20 : 00000007d0680000 r21 : a000000100068500 r22 : 0000000000000000 r23 : 0000000000000000 r24 : 0000000000000000 r25 : 0000000000000000 r26 : 0000000000000000 r27 : e0000007d0681118 r28 : 000000000555669b r29 : 0000000000000000 r30 : 0000000000000000 r31 : 0000000000000000 Call Trace: [<a000000100016b20>] show_stack+0x80/0xa0 sp=e0000007d0687990 bsp=e0000007d0681210 [<a000000100017430>] show_regs+0x890/0x8c0 sp=e0000007d0687b60 bsp=e0000007d06811c0 [<a00000010003dbf0>] die+0x150/0x240 sp=e0000007d0687b80 bsp=e0000007d0681180 [<a000000100061d20>] ia64_do_page_fault+0x8c0/0xbc0 sp=e0000007d0687b80 bsp=e0000007d0681118 [<a00000010000f540>] ia64_leave_kernel+0x0/0x260 sp=e0000007d0687c30 bsp=e0000007d0681118 Kernel panic - not syncing: Fatal exception
The cause of the crash is the branch through a null pointer, which is reported as a "kernel NULL pointer dereference." The fact that this happened while interrupts were disabled (PSR.i=0) caused the "sleeping function called from invalid context" message, which is not itself fatal. From the oops dump, interrupts are indeed disabled: psr : 0000101008122010 (PSR.i=0) Branch registers often contain return addresses that have hints about where we came from: b0 : 0000000000000002 b6 : a0000001000af440 b7 : a00000010000edd0 Since b0 = 0x2, probably a branch through b0 caused the crash. Branching through b0 is a typical "return from function". From the kernel with symbols in kernel-debuginfo-2.6.9-34.EL.ia64.rpm: b6=do_futex+128 b7=ia64_switch_to+176 (return point from calling load_switch_stack) I think we're in the middle of a context switch. We've completed load_switch_stack, which restores various registers: r28 = pr = 000000000555669b r19 = fpsr = 0009804c8a70033f r21 = a000000100068500 <context_switch+1152> (return point after calling ia64_switch_to) b0 should be the same as r21 (restored by load_switch_stack). Why isn't it? r12 is the stack pointer. r2, r3, r14, and r15 are cursors that move through the struct switch_stack as we restore registers. By code inspection, we can determine the final cursor values at the exit of load_switch_stack: r12 : e0000007d0687e00 r2 : e0000007d0687df8 r3 : e0000007d0687de8 r14 : e0000007d0687da0 r15 : e0000007d0687da8 member offsets into struct switch_stack: 0x1a0 r6 0x1a8 r7 0x1b0 b0 0x1b8 b1 0x1f0 ar_unat 0x1f8 ar_rnat 0x200 ar_bspstore 0x208 pr r2 = 0xe0000007d0687df8 = &switch_stack->pr r3 = 0xe0000007d0687de8 = &switch_stack->rnat r14 = 0xe0000007d0687da0 = &switch_stack->b0 r15 = 0xe0000007d0687da8 = &switch_stack->b1 Based on the final cursor values, we can compute the stack pointer (which is the address of the struct switch_stack) at the entry to load_switch_stack: switch_stack should be at 0xe0000007d0687da0 - 0x1b0 = 0xe0000007d0687bf0 or (using r2) 0xe0000007d0687df8 - 0x208 = 0xe0000007d0687bf0 or (using r3) 0xe0000007d0687de8 - 0x1f8 = 0xe0000007d0687bf0 So at entry to load_switch_stack, sp = &switch_stack (= 0xe0000007d0687bf0) load_switch_stack returns with "br.many b7" (br7 = 0xa00000010000edd0), so we should execute this code: 0xa00000010000edd0 <ia64_switch_to+176>: [MMB] adds r12=528,r12 0xa00000010000edd1 <ia64_switch_to+177>: sync.i 0xa00000010000edd2 <ia64_switch_to+178>: br.ret.sptk.many b0 We increment the stack pointer (r12) by 528. Using the sp value computed above, the result should be: 0xe0000007d0687bf0 + 528 = 0xe0000007d0687e00 That matches the r12 value in the oops dump, so we probably executed the code above. load_switch_stack should have restored b0 from r21. From the oops dump: b0 should have been 0xa000000100068500 (based on r21) So the "br.ret.sptk.many b0" above should have returned here: 0xa000000100068500 <context_switch+1152>: [MII] nop.m 0x0 0xa000000100068501 <context_switch+1153>: mov.i ar.pfs=r53 0xa000000100068502 <context_switch+1154>: mov b0=r52 0xa000000100068510 <context_switch+1168>: [MIB] nop.m 0x0 0xa000000100068511 <context_switch+1169>: adds r12=16,r12 0xa000000100068512 <context_switch+1170>: br.ret.sptk.many b0;; Above, we restored a new b0 value from r52. This is a stacked register, so the value could be restored from the register backing store in memory. If that memory were corrupted, we could see this crash. But we should have incremented r12 again by 16. That doesn't match the oops dump. But maybe I made an error in deducing its value. We can't reproduce the problem on RHEL4 U4. U4 contains the following change that is not in U3: * Tue Mar 21 2006 Jason Baron <jbaron> [2.6.9-34.6] -ia64: Fix corrupt ar.bspstore (Prarit Bhargava) [177297] ar.bspstore is the backing store pointer, which points to the register backing store. If this were corrupted, we could restore r52 from the wrong place, which would lead to a corrupted b0, which could cause this crash. The changelog references Red Hat bugzilla 177297, which is private, so I can't read it. It would be interesting to know whether that bug mentions a crash similar to this one.
This defect can be closed. Testing with the U3 kernel that Bjorn had patched with the patch referred to above yielded no panics. Testing with the U4 kernel yielded no panics. Using the same tests and configs on an unpatched U3 kernel yielded panics.