Bug 204398 - occasional panic when running MEMORY2 test on rhel4u3
Status: CLOSED WORKSFORME
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.3
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Assigned To: Will Woods
Depends On:
Blocks:
Reported: 2006-08-28 16:58 EDT by Rick Hester
Modified: 2008-10-14 15:50 EDT (History)
4 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2008-10-14 15:50:54 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments

  None
Description Rick Hester 2006-08-28 16:58:06 EDT
Description of problem:

I am encountering a kernel panic roughly 40-50% of the time when
running MEMORY2.

Debug: sleeping function called from invalid context at include/linux/rwsem.h:43
in_atomic():0[expected: 0], irqs_disabled():1

I've put a console trace in the Additional Info section.

Here's our engineer's analysis of the issue:
include/linux/rwsem.h:43 is a call to might_sleep() in the static inline
function down_read().  There are only two direct calls to down_read()
in ia64_do_page_fault().  The first happens on every page fault, so it
seems unlikely to be the culprit.  The second happens only when the init
process takes a page fault while we're already under an out-of-memory
condition.  This is probably a fairly rare error path and may not be
well tested.  I don't see any reason yet to suspect a platform or
processor issue.


Version-Release number of selected component (if applicable):
rhr2-2.0.3 MEMORY2 
kernel - 2.6.9-34.EL


How reproducible:
Appears to occur roughly 2 out of 3 invocations of MEMORY2 on a 32GB 8-way
system.

Steps to Reproduce:
1. Repeatedly run the MEMORY2 test until panic
Actual results:
Mixed - sometimes test runs to completion, sometimes it panics

Expected results:


Additional info:

Console Trace:
 MEMORY2... Debug: sleeping function called from invalid context at
include/linux/rwsem.h:43
in_atomic():0[expected: 0], irqs_disabled():1

Call Trace:
 [<a000000100016b20>] show_stack+0x80/0xa0
                                sp=e0000007d06879a0 bsp=e0000007d06811c0
 [<a000000100016b70>] dump_stack+0x30/0x60
                                sp=e0000007d0687b70 bsp=e0000007d06811a8
 [<a000000100067fb0>] __might_sleep+0x190/0x260
                                sp=e0000007d0687b70 bsp=e0000007d0681180
 [<a000000100061570>] ia64_do_page_fault+0x110/0xbc0
                                sp=e0000007d0687b80 bsp=e0000007d0681118
 [<a00000010000f540>] ia64_leave_kernel+0x0/0x260
                                sp=e0000007d0687c30 bsp=e0000007d0681118
Unable to handle kernel NULL pointer dereference (address 0000000000000000)
threaded_memtes[9956]: Oops 4294967296 [1]
Modules linked in: iptable_nat ip_conntrack iptable_filter ip_tables md5 ipv6
parport_pc lp parport autofs4 sunrpc ds yenta_socket pcmcia_core vfat fat
dm_multipath button ohci_hcd ehci_hcd tg3 e1000 dm_snapshot dm_zero dm_mirror
ext3 jbd dm_mod qla2300 qla2xxx scsi_transport_fc mptscsih mptsas mptspi mptfc
mptscsi mptbase sd_mod scsi_mod

Pid: 9956, CPU 4, comm:      threaded_memtes
psr : 0000101008122010 ifs : 8000000000000000 ip  : [<0000000000000000>]    Not tainted
ip is at 0x0
unat: 0000000000000000 pfs : 0000000000000000 rsc : 0000000000000003
rnat: 0000000000000000 bsps: 0000000000000000 pr  : 000000000555669b
ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c8a70033f
csd : 0000000000000000 ssd : 0000000000000000
b0  : 0000000000000002 b6  : a0000001000af440 b7  : a00000010000edd0
f6  : 1003e000000000000004b f7  : 1003e000000000000004b
f8  : 1003e000000000000122a f9  : 1003e0000000000000330
f10 : 1003e00000000000007f8 f11 : 1003e6666666666666667
r1  : a0000001009add90 r2  : e0000007d0687df8 r3  : e0000007d0687de8
r8  : e0000040c20d0000 r9  : 0000000002f6d339 r10 : 00001213081a6018
r11 : 0000000002f6d139 r12 : e0000007d0687e00 r13 : e0000007d0680000
r14 : e0000007d0687da0 r15 : e0000007d0687da8 r16 : 0000000000000000
r17 : 0000000000000000 r18 : 0000000000000000 r19 : 0009804c8a70033f
r20 : 00000007d0680000 r21 : a000000100068500 r22 : 0000000000000000
r23 : 0000000000000000 r24 : 0000000000000000 r25 : 0000000000000000
r26 : 0000000000000000 r27 : e0000007d0681118 r28 : 000000000555669b
r29 : 0000000000000000 r30 : 0000000000000000 r31 : 0000000000000000

Call Trace:
 [<a000000100016b20>] show_stack+0x80/0xa0
                                sp=e0000007d0687990 bsp=e0000007d0681210
 [<a000000100017430>] show_regs+0x890/0x8c0
                                sp=e0000007d0687b60 bsp=e0000007d06811c0
 [<a00000010003dbf0>] die+0x150/0x240
                                sp=e0000007d0687b80 bsp=e0000007d0681180
 [<a000000100061d20>] ia64_do_page_fault+0x8c0/0xbc0
                                sp=e0000007d0687b80 bsp=e0000007d0681118
 [<a00000010000f540>] ia64_leave_kernel+0x0/0x260
                                sp=e0000007d0687c30 bsp=e0000007d0681118
Kernel panic - not syncing: Fatal exception
Comment 1 Bjorn Helgaas 2006-09-01 12:36:53 EDT
The cause of the crash is the branch through a null pointer, which is
reported as a "kernel NULL pointer dereference."

The fact that this happened while interrupts were disabled (PSR.i=0) caused
the "sleeping function called from invalid context" message, which is not
itself fatal.

From the oops dump, interrupts are indeed disabled:

  psr : 0000101008122010 (PSR.i=0)

Branch registers often contain return addresses that have hints about
where we came from:

  b0  : 0000000000000002 b6  : a0000001000af440 b7  : a00000010000edd0

Since b0 = 0x2, probably a branch through b0 caused the crash.  Branching
through b0 is a typical "return from function".  From the kernel with
symbols in kernel-debuginfo-2.6.9-34.EL.ia64.rpm:

  b6=do_futex+128
  b7=ia64_switch_to+176 (return point from calling load_switch_stack)

I think we're in the middle of a context switch.  We've completed
load_switch_stack, which restores various registers:

  r28 = pr   = 000000000555669b
  r19 = fpsr = 0009804c8a70033f
  r21 =        a000000100068500 <context_switch+1152> (return point after
                                                       calling ia64_switch_to)

b0 should be the same as r21 (restored by load_switch_stack).  Why isn't
it?

r12 is the stack pointer.  r2, r3, r14, and r15 are cursors that move
through the struct switch_stack as we restore registers.  By code
inspection, we can determine the final cursor values at the exit of
load_switch_stack:

  r12 : e0000007d0687e00
  r2  : e0000007d0687df8 r3  : e0000007d0687de8
  r14 : e0000007d0687da0 r15 : e0000007d0687da8

  member offsets into struct switch_stack:
    0x1a0 r6
    0x1a8 r7
    0x1b0 b0
    0x1b8 b1
    0x1f0 ar_unat
    0x1f8 ar_rnat
    0x200 ar_bspstore
    0x208 pr

  r2  = 0xe0000007d0687df8 = &switch_stack->pr
  r3  = 0xe0000007d0687de8 = &switch_stack->rnat
  r14 = 0xe0000007d0687da0 = &switch_stack->b0
  r15 = 0xe0000007d0687da8 = &switch_stack->b1

Based on the final cursor values, we can compute the stack pointer (which
is the address of the struct switch_stack) at the entry to
load_switch_stack:

  switch_stack should be at 0xe0000007d0687da0 - 0x1b0 = 0xe0000007d0687bf0
      or (using r2)         0xe0000007d0687df8 - 0x208 = 0xe0000007d0687bf0
      or (using r3)         0xe0000007d0687de8 - 0x1f8 = 0xe0000007d0687bf0

So at entry to load_switch_stack, sp = &switch_stack (= 0xe0000007d0687bf0)
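The three back-computations above can be checked mechanically. A small
Python sketch of the arithmetic, using the cursor values and the
struct switch_stack member offsets quoted above:

```python
# Cursor values at the exit of load_switch_stack, from the oops dump.
r2  = 0xe0000007d0687df8   # -> &switch_stack->pr
r3  = 0xe0000007d0687de8   # -> &switch_stack->rnat
r14 = 0xe0000007d0687da0   # -> &switch_stack->b0

# Member offsets into struct switch_stack (from the table above).
OFF_PR, OFF_RNAT, OFF_B0 = 0x208, 0x1f8, 0x1b0

# Each cursor minus its member offset must give the same base
# address -- the sp at entry to load_switch_stack.
bases = {r2 - OFF_PR, r3 - OFF_RNAT, r14 - OFF_B0}
assert bases == {0xe0000007d0687bf0}
print(hex(min(bases)))  # 0xe0000007d0687bf0
```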

load_switch_stack returns with "br.many b7" (b7 = 0xa00000010000edd0), so
we should execute this code:

  0xa00000010000edd0 <ia64_switch_to+176>:      [MMB]       adds r12=528,r12
  0xa00000010000edd1 <ia64_switch_to+177>:                  sync.i
  0xa00000010000edd2 <ia64_switch_to+178>:                  br.ret.sptk.many b0

We increment the stack pointer (r12) by 528.  Using the sp value computed
above, the result should be:

  0xe0000007d0687bf0 + 528 = 0xe0000007d0687e00

That matches the r12 value in the oops dump, so we probably executed the
code above.  load_switch_stack should have restored b0 from r21.  From
the oops dump:

  b0 should have been 0xa000000100068500 (based on r21)

So the "br.ret.sptk.many b0" above should have returned here:

  0xa000000100068500 <context_switch+1152>:   [MII]       nop.m 0x0
  0xa000000100068501 <context_switch+1153>:               mov.i ar.pfs=r53
  0xa000000100068502 <context_switch+1154>:               mov b0=r52
  0xa000000100068510 <context_switch+1168>:   [MIB]       nop.m 0x0
  0xa000000100068511 <context_switch+1169>:               adds r12=16,r12
  0xa000000100068512 <context_switch+1170>:               br.ret.sptk.many b0;;

Above, we restored a new b0 value from r52.  This is a stacked register, so
the value could be restored from the register backing store in memory.  If
that memory were corrupted, we could see this crash.

But we should have incremented r12 again by 16.  That doesn't match the
oops dump.  But maybe I made an error in deducing its value.

We can't reproduce the problem on RHEL4 U4.  U4 contains the following
change that is not in U3:

    * Tue Mar 21 2006 Jason Baron <jbaron@redhat.com> [2.6.9-34.6]
    -ia64: Fix corrupt ar.bspstore (Prarit Bhargava) [177297]

ar.bspstore is the backing store pointer, which points to the register
backing store.  If this were corrupted, we could restore r52 from the
wrong place, which would lead to a corrupted b0, which could cause this
crash.

The changelog references Red Hat bugzilla 177297, which is private, so I
can't read it.  It would be interesting to know whether that bug mentions
a crash similar to this one.
Comment 2 Rick Hester 2006-09-05 14:36:36 EDT
This defect can be closed.

Testing with the U3 kernel that Bjorn had patched with the
fix referred to above yielded no panics.

Testing with the U4 kernel yielded no panics.

Using the same tests and configs on an unpatched U3 kernel
yielded panics.
