Bug 83258 - panic during intense disk reading & writing
Summary: panic during intense disk reading & writing
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.3
Hardware: i386
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Arjan van de Ven
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2003-01-31 22:53 UTC by tom georgoulias
Modified: 2007-04-18 16:50 UTC
CC List: 2 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2004-09-30 15:40:28 UTC
Embargoed:


Description tom georgoulias 2003-01-31 22:53:01 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; SunOS sun4u; en-US; rv:1.1) Gecko/20020827

Description of problem:
The NFS file server has been crashing with kernel panics and oopses during
normal use.  Because the crashes seem to occur when users are doing moderate
read/write operations, I am using two shell scripts to stress disk reads and
writes and try to recreate the problem.

From dmesg:

Jan 31 15:15:42 mom kernel: Unable to handle kernel NULL pointer dereference at
virtual address 00000028
Jan 31 15:15:42 mom kernel:  printing eip:
Jan 31 15:15:42 mom kernel: f880e8b2
Jan 31 15:15:42 mom kernel: *pde = 00000000
Jan 31 15:15:42 mom kernel: Oops: 0000
Jan 31 15:15:42 mom kernel: nfs lockd sunrpc autofs 3c59x ide-cd cdrom loop
lvm-mod ext3 jbd  
Jan 31 15:15:42 mom kernel: CPU:    0
Jan 31 15:15:42 mom kernel: EIP:    0010:[<f880e8b2>]    Not tainted
Jan 31 15:15:42 mom kernel: EFLAGS: 00010207
Jan 31 15:15:42 mom kernel: 
Jan 31 15:15:42 mom kernel: EIP is at journal_try_to_free_buffers_R6069dd2f
[jbd] 0x52 (2.4.18-19.7.x)
Jan 31 15:15:42 mom kernel: eax: 00000001   ebx: 00000000   ecx: 000001d0   edx:
00000000
Jan 31 15:15:42 mom kernel: esi: e699df40   edi: 00000001   ebp: 00000000   esp:
c36bff2c
Jan 31 15:15:42 mom kernel: ds: 0018   es: 0018   ss: 0018
Jan 31 15:15:42 mom kernel: Process kswapd (pid:5,stackpage=c36bf000)
Jan 31 15:15:42 mom kernel: Stack: 00000000 c199de10 000001d0 e699df40 c199e2c4
f881e672 f6feae00 c199de10 
Jan 31 15:15:42 mom kernel:        000001d0 c013b49f c199de10 000001d0 c199de10 
000001d0 c01302f9 c199de10 
Jan 31 15:15:42 mom kernel:        000001d0 00001d80 00001d80 00000c38 0002f8ba
c02d4864 00000cb3 00001d80 
Jan 31 15:15:42 mom kernel: Call Trace: [<f881e672>] ext3_releasepage [ext3]
0x22 (0xc36bff40))
Jan 31 15:15:42 mom kernel: [<c013b49f>] try_to_release_page [kernel] 0x2f
(0xc36bff50))
Jan 31 15:15:42 mom kernel: [<c01302f9>] page_launder_zone [kernel]0x519 (0xc36
bff64))
Jan 31 15:15:42 mom kernel: [<c01306f8>] page_launder [kernel] 0x168 (0xc36bff90))
Jan 31 15:15:42 mom kernel: [<c0130fa2>] do_try_to_free_pages [kernel] 0x12
(0xc36bffb0))
Jan 31 15:15:42 mom kernel: [<c01312c1>] kswapd [kernel] 0x121 (0xc36bffd4))
Jan 31 15:15:42 mom kernel: [<c0105000>] stext [kernel] 0x0 (0xc36bffe8))
Jan 31 15:15:42 mom kernel: [<c0107146>] kernel_thread [kernel] 0x26 (0xc36bfff0))
Jan 31 15:15:42 mom kernel: [<c01311a0>] kswapd [kernel] 0x0 (0xc36bfff8))
Jan 31 15:15:42 mom kernel: 
Jan 31 15:15:42 mom kernel: 
Jan 31 15:15:42 mom kernel: Code: 8b 5b 28 f6 42 19 02 74 10 89 e0 50 52 e8 fc
fe ff ff 5a 85


Version-Release number of selected component (if applicable):
2.4.18-19.7.x

How reproducible:
Always

Steps to Reproduce:
1. Execute this shell script on a single disk partition:
# create a 2 GiB file of zeros, then read it back in an endless loop
dd if=/dev/zero of=testfile bs=16384 count=131072
while true
do
        time cat testfile >/dev/null
done

2. Execute this script on a different partition on the same disk:
# rewrite a 2 GiB file of zeros in an endless loop
while true
do
        dd if=/dev/zero of=largefile bs=16384 count=131072
done
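
For anyone who wants to run both loops from one place, here is a rough sketch
of a bounded variant.  It is not the script used for this report: the function
names, the pass count and the directory arguments are all assumptions added for
illustration, and only the dd/cat commands come from the steps above.

```shell
#!/bin/sh
# Sketch only: bounded version of the two reproduction loops.
# make_testfile, read_loop, write_loop and run_stress are hypothetical names.

# Create the file that the reader will stream repeatedly.
# $1 = directory, $2 = number of 16 KiB blocks
make_testfile() {
    dd if=/dev/zero of="$1/testfile" bs=16384 count="$2" 2>/dev/null
}

# Reader: cat the test file to /dev/null a fixed number of times.
# $1 = directory, $2 = passes
read_loop() {
    i=0
    while [ "$i" -lt "$2" ]; do
        cat "$1/testfile" >/dev/null
        i=$((i + 1))
    done
}

# Writer: rewrite a file of zeros a fixed number of times.
# $1 = directory, $2 = number of 16 KiB blocks, $3 = passes
write_loop() {
    j=0
    while [ "$j" -lt "$3" ]; do
        dd if=/dev/zero of="$1/largefile" bs=16384 count="$2" 2>/dev/null
        j=$((j + 1))
    done
}

# Run reader and writer at the same time and wait for both to finish.
# $1 = read directory, $2 = write directory, $3 = blocks, $4 = passes
run_stress() {
    make_testfile "$1" "$3" || return 1
    read_loop "$1" "$4" &
    reader=$!
    write_loop "$2" "$3" "$4" &
    writer=$!
    wait "$reader"; r=$?
    wait "$writer"; w=$?
    [ "$r" -eq 0 ] && [ "$w" -eq 0 ]
}
```

With the two partitions mounted at, say, /mnt/a and /mnt/b (placeholder paths),
"run_stress /mnt/a /mnt/b 131072 1000" would use the same 2 GiB file size as the
original steps; a small block count is enough to check that the script itself
works.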


Actual Results:  System crashed and produced kernel oops.

Expected Results:  Normal system operation, with intense disk activity.

Additional info:

The system has two 120 GB Maxtor IDE disks attached to a Promise Ultra133 TX2
IDE controller card, plus one 120 GB and one 20 GB Maxtor disk attached to the
motherboard, with an Athlon 2200 CPU and 1 GB of DDR RAM.

ksymoops output:
[root@mom log]# ksymoops /tmp/mom_oops.txt 
ksymoops 2.4.4 on i686 2.4.18-19.7.x.  Options used
     -V (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.4.18-19.7.x/ (default)
     -m /boot/System.map-2.4.18-19.7.x (default)

Warning: You did not tell me where to find symbol information.  I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc.  ksymoops -h explains the options.

Error (expand_objects): cannot stat(/lib/ext3.o) for ext3
ksymoops: No such file or directory
Error (expand_objects): cannot stat(/lib/jbd.o) for jbd
ksymoops: No such file or directory
Warning (map_ksym_to_module): cannot match loaded module ext3 to a unique module
object.  Trace may not be reliable.
Jan 31 15:15:42 mom kernel: Unable to handle kernel NULL pointer dereference at 
Jan 31 15:15:42 mom kernel: f880e8b2
Jan 31 15:15:42 mom kernel: *pde = 00000000
Jan 31 15:15:42 mom kernel: Oops: 0000
Jan 31 15:15:42 mom kernel: CPU:    0
Jan 31 15:15:42 mom kernel: EIP:    0010:[<f880e8b2>]    Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
Jan 31 15:15:42 mom kernel: EFLAGS: 00010207
Jan 31 15:15:42 mom kernel: eax: 00000001   ebx: 00000000   ecx: 000001d0   edx:
 00000000
Jan 31 15:15:42 mom kernel: esi: e699df40   edi: 00000001   ebp: 00000000   esp:
 c36bff2c
Jan 31 15:15:42 mom kernel: ds: 0018   es: 0018   ss: 0018
Jan 31 15:15:42 mom kernel: Process kswapd (pid: 5, stackpage=c36bf000)
Jan 31 15:15:42 mom kernel: Stack: 00000000 c199de10 000001d0 e699df40 c199e2c4 
f881e672 f6feae00 c199de10 
Jan 31 15:15:42 mom kernel:        000001d0 c013b49f c199de10 000001d0 c199de10 
000001d0 c01302f9 c199de10 
Jan 31 15:15:42 mom kernel:        000001d0 00001d80 00001d80 00000c38 0002f8ba 
c02d4864 00000cb3 00001d80 
Jan 31 15:15:42 mom kernel: Call Trace: [<f881e672>] ext3_releasepage [ext3] 0x2
Jan 31 15:15:42 mom kernel: [<c013b49f>] try_to_release_page [kernel] 0x2f (0xc3
6bff50))
Jan 31 15:15:42 mom kernel: [<c01302f9>] page_launder_zone [kernel] 0x519 (0xc36
bff64))
Jan 31 15:15:42 mom kernel: [<c01306f8>] page_launder [kernel] 0x168 (0xc36bff90
Jan 31 15:15:42 mom kernel: [<c0130fa2>] do_try_to_free_pages [kernel] 0x12 (0xc
36bffb0))
Jan 31 15:15:42 mom kernel: [<c01312c1>] kswapd [kernel] 0x121 (0xc36bffd4))
Jan 31 15:15:42 mom kernel: [<c0105000>] stext [kernel] 0x0 (0xc36bffe8))
Jan 31 15:15:42 mom kernel: [<c0107146>] kernel_thread [kernel] 0x26 (0xc36bfff0
Jan 31 15:15:42 mom kernel: [<c01311a0>] kswapd [kernel] 0x0 (0xc36bfff8))
Jan 31 15:15:42 mom kernel: Code: 8b 5b 28 f6 42 19 02 74 10 89 e0 50 52 e8 fc f
Error (Oops_code_values): invalid value 0xf in Code line, must be 2, 4, 8 or 16
digits, value ignored

>>EIP; f880e8b2 <[jbd]journal_try_to_free_buffers+52/90>   <=====
Trace; f881e672 <[ext3].text.start+4612/ab8f>
Trace; c013b49f <try_to_release_page+2f/50>
Code;  f880e8b2 <[jbd]journal_try_to_free_buffers+52/90>
00000000 <_EIP>:
Code;  f880e8b2 <[jbd]journal_try_to_free_buffers+52/90>   <=====
   0:   8b 5b 28                  mov    0x28(%ebx),%ebx   <=====
Code;  f880e8b5 <[jbd]journal_try_to_free_buffers+55/90>
   3:   f6 42 19 02               testb  $0x2,0x19(%edx)
Code;  f880e8b9 <[jbd]journal_try_to_free_buffers+59/90>
   7:   74 10                     je     19 <_EIP+0x19> f880e8cb
<[jbd]journal_try_to_free_buffers+6b/90>
Code;  f880e8bb <[jbd]journal_try_to_free_buffers+5b/90>
   9:   89 e0                     mov    %esp,%eax
Code;  f880e8bd <[jbd]journal_try_to_free_buffers+5d/90>
   b:   50                        push   %eax
Code;  f880e8be <[jbd]journal_try_to_free_buffers+5e/90>
   c:   52                        push   %edx
Code;  f880e8bf <[jbd]journal_try_to_free_buffers+5f/90>
   d:   e8 fc 00 00 00            call   10e <_EIP+0x10e> f880e9c0
<[jbd]journal_unmap_buffer+70/1d0>


2 warnings and 3 errors issued.  Results may not be reliable.
[root@mom log]#
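
A note on reading the output above: the faulting instruction is
mov 0x28(%ebx),%ebx and the register dump shows ebx: 00000000, so the load
address is 0 + 0x28, which matches the reported virtual address 00000028.  The
small sketch below just redoes that arithmetic; fault_addr is a hypothetical
helper name, not part of ksymoops.

```shell
# Sketch: base register value plus displacement, both given as hex strings.
# fault_addr is a hypothetical helper for illustration only.
fault_addr() {
    printf '%08x\n' $(( 0x$1 + 0x$2 ))
}

fault_addr 00000000 28   # ebx value and displacement from the oops; prints 00000028
```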

Comment 1 tom georgoulias 2003-02-06 21:30:08 UTC
I installed the latest errata kernel (2.4.18-24.7.x) on this system and ran the
same tests for 24 hours without experiencing any of the kernel panics & oops
mentioned in this bug report.  The server was returned to production and appears
to be functioning normally.

Comment 2 tom georgoulias 2003-02-07 17:24:43 UTC
Looks like I spoke too soon.  This oops occurred during an rsync of 30GB from a
remote server.

[root@mom tmp]# ksymoops 0207_oops.txt 
ksymoops 2.4.4 on i686 2.4.18-24.7.x.  Options used
     -V (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.4.18-24.7.x/ (default)
     -m /boot/System.map-2.4.18-24.7.x (default)

Warning: You did not tell me where to find symbol information.  I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc.  ksymoops -h explains the options.

Error (expand_objects): cannot stat(/lib/ext3.o) for ext3
ksymoops: No such file or directory
Error (expand_objects): cannot stat(/lib/jbd.o) for jbd
ksymoops: No such file or directory
Error (expand_objects): cannot stat(/lib/lvm-mod.o) for lvm-mod
ksymoops: No such file or directory
/usr/bin/find: /lib/modules/2.4.18-24.7.x/build: No such file or directory
Error (pclose_local): find_objects pclose failed 0x100
Warning (map_ksym_to_module): cannot match loaded module ext3 to a unique module
object.  Trace may not be reliable.
Feb  7 11:10:41 mom kernel: Unable to handle kernel NULL pointer dereference at
virtual address 00000028
Feb  7 11:10:41 mom kernel: f881f8b2
Feb  7 11:10:41 mom kernel: *pde = 00000000
Feb  7 11:10:41 mom kernel: Oops: 0000
Feb  7 11:10:41 mom kernel: CPU:    0
Feb  7 11:10:41 mom kernel: EIP:    0010:[<f881f8b2>]    Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
Feb  7 11:10:41 mom kernel: EFLAGS: 00010207
Feb  7 11:10:41 mom kernel: eax: 00000001   ebx: 00000000   ecx: 000001d0   edx:
00000000
Feb  7 11:10:41 mom kernel: esi: f597e740   edi: 00000001   ebp: 00000000   esp:
c34b3f2c
Feb  7 11:10:42 mom kernel: ds: 0018   es: 0018   ss: 0018
Feb  7 11:10:42 mom kernel: Process kswapd (pid: 5, stackpage=c34b3000)
Feb  7 11:10:42 mom kernel: Stack: 00000000 c12f4810 000001d0 f597e740 c19b202c
f882f672 f6d91c00 c12f4810 
Feb  7 11:10:42 mom kernel:        000001d0 c013b4df c12f4810 000001d0 c12f4810
000001d0 c0130329 c12f4810 
Feb  7 11:10:42 mom kernel:        000001d0 000016a2 000014a9 000005ad 00012b1a
c02d4a24 00001342 000016a2 
Feb  7 11:10:42 mom kernel: Call Trace: [<f882f672>] ext3_releasepage [ext3]
0x22 (0xc34b3f40))
Feb  7 11:10:42 mom kernel: [<c013b4df>] try_to_release_page [kernel] 0x2f
(0xc34b3f50))
Feb  7 11:10:42 mom kernel: [<c0130329>] page_launder_zone [kernel] 0x519
(0xc34b3f64))
Feb  7 11:10:42 mom kernel: [<c0130728>] page_launder [kernel] 0x168 (0xc34b3f90))
Feb  7 11:10:42 mom kernel: [<c0130fd2>] do_try_to_free_pages [kernel] 0x12
(0xc34b3fb0))
Feb  7 11:10:42 mom kernel: [<c01312f1>] kswapd [kernel] 0x121 (0xc34b3fd4))
Feb  7 11:10:42 mom kernel: [<c0105000>] stext [kernel] 0x0 (0xc34b3fe8))
Feb  7 11:10:42 mom kernel: [<c0107166>] kernel_thread [kernel] 0x26 (0xc34b3ff0))
Feb  7 11:10:42 mom kernel: [<c01311d0>] kswapd [kernel] 0x0 (0xc34b3ff8))
Feb  7 11:10:42 mom kernel: Code: 8b 5b 28 f6 42 19 02 74 10 89 e0 50 52 e8 fc
fe ff ff 5a 85 

>>EIP; f881f8b2 <[jbd]journal_try_to_free_buffers+52/90>   <=====
Trace; f882f672 <[ext3].text.start+4612/ab8f>
Trace; c013b4df <try_to_release_page+2f/50>
Trace; c0130329 <page_launder_zone+519/7b0>
Trace; c0130728 <page_launder+168/2f0>
Trace; c0130fd2 <do_try_to_free_pages+12/180>
Trace; c01312f1 <kswapd+121/330>
Trace; c0105000 <_stext+0/0>
Trace; c0107166 <kernel_thread+26/30>
Trace; c01311d0 <kswapd+0/330>
Code;  f881f8b2 <[jbd]journal_try_to_free_buffers+52/90>
00000000 <_EIP>:
Code;  f881f8b2 <[jbd]journal_try_to_free_buffers+52/90>   <=====
   0:   8b 5b 28                  mov    0x28(%ebx),%ebx   <=====
Code;  f881f8b5 <[jbd]journal_try_to_free_buffers+55/90>
   3:   f6 42 19 02               testb  $0x2,0x19(%edx)
Code;  f881f8b9 <[jbd]journal_try_to_free_buffers+59/90>
   7:   74 10                     je     19 <_EIP+0x19> f881f8cb
<[jbd]journal_try_to_free_buffers+6b/90>
Code;  f881f8bb <[jbd]journal_try_to_free_buffers+5b/90>
   9:   89 e0                     mov    %esp,%eax
Code;  f881f8bd <[jbd]journal_try_to_free_buffers+5d/90>
   b:   50                        push   %eax
Code;  f881f8be <[jbd]journal_try_to_free_buffers+5e/90>
   c:   52                        push   %edx
Code;  f881f8bf <[jbd]journal_try_to_free_buffers+5f/90>
   d:   e8 fc fe ff ff            call   ffffff0e <_EIP+0xffffff0e> f881f7c0
<[jbd]__journal_try_to_free_buffer+0/a0>
Code;  f881f8c4 <[jbd]journal_try_to_free_buffers+64/90>
  12:   5a                        pop    %edx
Code;  f881f8c5 <[jbd]journal_try_to_free_buffers+65/90>
  13:   85 00                     test   %eax,(%eax)


2 warnings and 4 errors issued.  Results may not be reliable.
[root@mom tmp]# 
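
One difference from the first ksymoops run: there the call at +5f was decoded
as going to journal_unmap_buffer, while here it goes to
__journal_try_to_free_buffer.  This appears to come from the truncated Code
line in the first log (ksymoops reported an invalid value 0xf and ignored it),
which left the call operand as fc 00 00 00 instead of fc fe ff ff.  The sketch
below recomputes both targets; call_target is a hypothetical helper, and it
assumes a shell with 64-bit arithmetic.

```shell
# Sketch: target of an x86 near call (opcode e8, 5 bytes long) is the address
# of the next instruction plus the 32-bit operand, wrapped to 32 bits.
# call_target is a hypothetical helper; arguments are hex strings without 0x.
# $1 = address of the call instruction, $2 = operand read as little-endian
call_target() {
    printf '%08x\n' $(( (0x$1 + 5 + 0x$2) & 0xffffffff ))
}

call_target f881f8bf fffffefc   # bytes fc fe ff ff (this run); prints f881f7c0
call_target f880e8bf 000000fc   # bytes fc 00 00 00 (first run); prints f880e9c0
```

Both results match the addresses ksymoops printed, so the two runs most likely
describe the same code and only the second call target should be trusted.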


Comment 3 Bugzilla owner 2004-09-30 15:40:28 UTC
Thanks for the bug report. However, Red Hat no longer maintains this version of
the product. Please upgrade to the latest version and open a new bug if the problem
persists.

The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, 
and if you believe this bug is interesting to them, please report the problem in
the bug tracker at: http://bugzilla.fedora.us/


