Bug 497327

Summary: [RHEL4.8][Kernel] Unable to handle kernel pointer dereference, (bitmap_search_next_usable_block+0x314/0x3f8 ext3)
Product: Red Hat Enterprise Linux 4
Reporter: Jeff Burke <jburke>
Component: kernel
Assignee: Josef Bacik <jbacik>
Status: CLOSED WONTFIX
QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: medium
Priority: low
Version: 4.8
CC: ddumas, esandeen, jbacik, jtluka, lwang, mgahagan, pbunyan, sforsber, vgoyal
Target Milestone: rc
Target Release: 4.9
Hardware: s390
OS: Linux
URL: http://rhts.redhat.com/testlogs/55303/185580/1552385/console.txt
Doc Type: Bug Fix
Last Closed: 2011-04-26 13:45:13 UTC
Attachments: Console log (flags: none)

Description Jeff Burke 2009-04-23 12:19:22 UTC
Created attachment 340932: Console log

Description of problem:
While running the LTP fsstress test on kernel 2.6.9-89.EL, with RHEL4-u8-re20090416.0 AS as the baseline, on an s390x system, the machine oopsed. Because we have panic_on_oops enabled, the machine then panicked.

Version-Release number of selected component (if applicable):
2.6.9-89.EL

How reproducible:
Not often

Steps to Reproduce:
1. Install RHEL4-u8-re20090416.0 AS on s/390x 
2. Upgrade the kernel to 2.6.9-89.EL
3. Download LTP and run the fsstress test
  
Actual results:
Unable to handle kernel pointer dereference at virtual kernel address 0000000020000000
Oops: 0010 #1
CPU:    0    Not tainted
Process nfsd (pid: 22123, task: 000000001daac678, ksp: 000000001a247568)
Krnl PSW : 0700200180000000 000000002090e278 (bitmap_search_next_usable_block+0x314/0x3f8 ext3)
Krnl GPRS: 000000001ffffc00 0000000000017470 fffffffffffffff3 000000001ffffe30
           00000000000001d0 0000000000000000 00000000032c5d28 0000000000001173
           000000000000116b 00000000ffffffc6 ffffffffffffffff 00000000160253f0
           000000002090c000 0000000020927330 000000002090e130 000000001a247230
Krnl Code: e3 a4 30 00 00 21 a7 74 00 0a a7 4b 00 08 a7 96 ff f9 b9 04
Call Trace:
(<000000002090e130> bitmap_search_next_usable_block+0x1cc/0x3f8 ext3)
 <000000002090e6c2> ext3_try_to_allocate+0x366/0x544 ext3
 <000000002090edd8> ext3_try_to_allocate_with_rsv+0x538/0x5f8 ext3
 <000000002090f3c0> ext3_new_block+0x3ec/0x7e0 ext3
 <000000002091259a> ext3_alloc_block+0x1a/0x2c ext3
 <000000002091508c> ext3_get_block_handle+0x4c0/0xb98 ext3
 <0000000020915802> ext3_get_block+0x9e/0xb0 ext3
 <00000000000984ce> __block_prepare_write+0x1ca/0x578
 <00000000000988b6> block_prepare_write+0x3a/0x70
 <0000000020912ae6> ext3_prepare_write+0x92/0x16c ext3
 <000000000006d284> generic_file_buffered_write+0x2c0/0x6a4
 <000000000006db54> __generic_file_aio_write_nolock+0x31c/0x34c
 <000000000006dc16> __generic_file_write_nolock+0x92/0xc0
 <000000000006dccc> generic_file_writev+0x88/0x10c
 <0000000000092c04> do_readv_writev+0x1f8/0x2c4
 <0000000000092dc2> vfs_writev+0x6e/0x84
 <00000000211f6904> nfsd_write+0x160/0x39c nfsd
 <00000000211fef7c> nfsd3_proc_write+0x108/0x128 nfsd
 <00000000211f2144> nfsd_dispatch+0x118/0x214 nfsd
 <000000002105eb48> svc_process+0x540/0x900 sunrpc
 <00000000211f1e1a> nfsd+0x29a/0x4ac nfsd
 <0000000000019b52> kernel_thread_starter+0x6/0xc
 <0000000000019b4c> kernel_thread_starter+0x0/0xc

 <0>Kernel panic - not syncing: Fatal exception: panic_on_oops
00: HCPGSP2629I The virtual machine is placed in CP mode due to a SIGP stop from CPU 01.
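
The PSW line and the first call trace entry both name bitmap_search_next_usable_block, at offsets 0x314 and 0x1cc. As a quick sanity check, the sketch below subtracts each offset from its address to confirm that both resolve to the same function start. The helper names (parse_entry, function_start) and the regular expression are illustrative assumptions, not part of any kernel tooling; only the two input lines come from the console log.

```python
import re

# Two lines copied from the oops above (wrapped console lines rejoined).
PSW_LINE = ("Krnl PSW : 0700200180000000 000000002090e278 "
            "(bitmap_search_next_usable_block+0x314/0x3f8 ext3)")
TRACE_LINE = ("(<000000002090e130> "
              "bitmap_search_next_usable_block+0x1cc/0x3f8 ext3)")

# Hypothetical pattern: a 16-digit hex address (optionally in <...>),
# followed by symbol+0xOFFSET/0xSIZE.
ENTRY = re.compile(
    r"<?(?P<addr>[0-9a-f]{16})>?\s+\(?"
    r"(?P<sym>\w+)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_entry(line):
    """Return (symbol, address, offset, size) for one oops line, or None."""
    m = ENTRY.search(line)
    if m is None:
        return None
    return (m.group("sym"),
            int(m.group("addr"), 16),
            int(m.group("off"), 16),
            int(m.group("size"), 16))


def function_start(entry):
    """Address of the first byte of the function: address minus offset."""
    _sym, addr, off, _size = entry
    return addr - off


psw = parse_entry(PSW_LINE)
trace = parse_entry(TRACE_LINE)
print(hex(function_start(psw)), hex(function_start(trace)))
# prints: 0x2090df64 0x2090df64
```

Both entries resolve to 0x2090df64, so the two offsets refer to the same loaded copy of the function, and the PSW offset 0x314 lies inside its 0x3f8-byte length.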

Expected results:
The fsstress test should complete without an oops.

Additional info:
********** System Information **********
Hostname                = z208.z900.redhat.com
Kernel Version          = 2.6.9-88.EL
Machine Hardware Name   = s390x
Processor Type          = s390x
uname -a output         = Linux z208.z900.redhat.com 2.6.9-88.EL #1 SMP Mon Apr 13 19:53:15 EDT 2009 s390x s390x s390x GNU/Linux
Swap Size               = 1023 MB
Mem Size                = 496 MB
Number of Processors    = 3
System Release          = Red Hat Enterprise Linux AS release 4 (Nahant Update 8 Beta)
Command Line            = root=/dev/VolGroup00/LogVol00 BOOT_IMAGE=0

Comment 3 Mike Gahagan 2009-04-24 19:27:36 UTC
 post_create:  setxattr failed, rc=28 (dev=loop7 ino=11820)
nfsd: last server has exited
nfsd: unexporting all filesystems
post_create:  setxattr failed, rc=28 (dev=loop7 ino=9896)
post_create:  setxattr failed, rc=28 (dev=loop7 ino=19554)
post_create:  setxattr failed, rc=28 (dev=loop7 ino=19555)
post_create:  setxattr failed, rc=28 (dev=loop7 ino=19556)
post_create:  setxattr failed, rc=28 (dev=loop7 ino=221)
post_create:  setxattr failed, rc=28 (dev=loop7 ino=19557)
post_create:  setxattr failed, rc=28 (dev=loop7 ino=19558)
nfsd: last server has exited
nfsd: unexporting all filesystems
post_create:  setxattr failed, rc=28 (dev=loop7 ino=7871)
post_create:  setxattr failed, rc=28 (dev=loop7 ino=120)
post_create:  setxattr failed, rc=28 (dev=loop7 ino=9806)
nfsd: last server has exited
nfsd: unexporting all filesystems
post_create:  setxattr failed, rc=28 (dev=loop7 ino=4086)
post_create:  setxattr failed, rc=28 (dev=loop7 ino=6036)
nfsd: last server has exited
nfsd: unexporting all filesystems
nfsd: last server has exited
nfsd: unexporting all filesystems
nfsd: last server has exited
nfsd: unexporting all filesystems
nfsd: last server has exited
nfsd: unexporting all filesystems


I tried to reproduce this bug by running ltp-fsstress and /kernel/stress/racer at the same time. I ran into the above messages once, and the fsstress test failed because it could never mount the NFS share it creates (the mount timed out), but so far there has been no oops.
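
For reading the "setxattr failed, rc=28" messages above, the sketch below looks up the value 28 in the errno table. It assumes the rc printed by post_create is a positive errno number, which the log does not state, and it assumes the snippet runs on a Linux host so that the numbering matches the kernel's. The helper name describe_rc is hypothetical.

```python
import errno
import os


def describe_rc(rc):
    """Map a positive errno value to (symbolic name, message text)."""
    name = errno.errorcode.get(rc, "UNKNOWN")
    return name, os.strerror(rc)


# On Linux this prints: ('ENOSPC', 'No space left on device')
print(describe_rc(28))
```

If that assumption holds, the messages would mean the filesystem on loop7 had no room left for the extended attribute, which is not obviously related to the oops itself.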

Comment 4 Mike Gahagan 2009-04-29 15:26:33 UTC
Since this looks like either an isolated incident or a hard-to-reproduce bug, and there are no customer reports of similar issues, I'm going to go ahead and propose this for 4.9.

Comment 6 Josef Bacik 2011-02-09 20:48:58 UTC
I'm no s390 expert, but objdump seems to point to the code after ext3_test_allocatable, so maybe jh->b_committed_data disappeared or something similar; it's not really clear.