Bug 102741 - Kernel crash under high I/O load - reproducible.
Status: CLOSED NOTABUG
Product: Red Hat Enterprise Linux 2.1
Classification: Red Hat
Component: kernel
Version: 2.1
Hardware: i686 Linux
Priority: high
Severity: high
Assigned To: Larry Woodman
QA Contact: Brian Brock
Reported: 2003-08-20 10:59 EDT by Renato
Modified: 2007-11-30 17:06 EST (History)
Doc Type: Bug Fix
Last Closed: 2005-04-04 11:12:31 EDT
Description Renato 2003-08-20 10:59:59 EDT
Description of problem:

Kernel crashes under high I/O load.

Version-Release number of selected component (if applicable):

kernel-smp-2.4.9-e.25 / aic79xx

How reproducible:

See below

Steps to Reproduce:
Create a RAID-1 partition and mount it as:

/dev/md1   /home    ext3   rw,nodev,noatime,nodiratime,usrquota    1 2

Run 3 instances of iozone in parallel:

./iozone -a -M -O -+u -f /home/testfile1 -R -b ./iozonerun1.ext2.xls -y 2k -q 256k -n 64k -g 2008m &
./iozone -a -M -O -+u -f /home/testfile2 -R -b ./iozonerun2.ext2.xls -y 2k -q 256k -n 64k -g 2008m &
./iozone -a -M -O -+u -f /home/testfile3 -R -b ./iozonerun3.ext2.xls -y 2k -q 256k -n 64k -g 2008m &
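
The three invocations above differ only in the instance number, so they could be generated by a small loop. The following is a minimal sketch, not part of the original report: the variable names IOZONE, TARGET_DIR, INSTANCES and DRY_RUN and the function run_instance are illustrative assumptions, while the iozone flags are copied unchanged from the commands above. Because the real workload is reported to panic the kernel, the sketch only prints the command lines unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: IOZONE, TARGET_DIR, INSTANCES and DRY_RUN are illustrative
# variable names, not part of the original report.
IOZONE=${IOZONE:-./iozone}          # path to the iozone binary
TARGET_DIR=${TARGET_DIR:-/home}     # mount point of the RAID-1 ext3 filesystem
INSTANCES=${INSTANCES:-3}           # number of parallel runs
DRY_RUN=${DRY_RUN:-1}               # 1 = print commands only, 0 = really run

run_instance() {
    n=$1
    # Build the argument list with the same flags as in the report.
    set -- "$IOZONE" -a -M -O -+u -f "$TARGET_DIR/testfile$n" \
        -R -b "./iozonerun$n.ext2.xls" -y 2k -q 256k -n 64k -g 2008m
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"       # show what would be executed
    else
        "$@" &          # start this instance in the background
    fi
}

i=1
while [ "$i" -le "$INSTANCES" ]; do
    run_instance "$i"
    i=$((i + 1))
done
wait    # block until all background instances have exited
```

Running it with DRY_RUN=0 on an affected machine would start the same load as the three commands above, so it should only be done on a test system.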

It will eventually crash. See the kernel message under "Actual results".
    
Actual results:

Kernel message:

Aug 20 11:08:11 hm16 kernel: Kernel 2.4.9-e.25smp
Aug 20 11:08:11 hm16 kernel: CPU:    1
Aug 20 11:08:11 hm16 kernel: EIP:    0010:[<c0161013>]    Not tainted
Aug 20 11:08:11 hm16 kernel: EFLAGS: 00010246
Aug 20 11:08:11 hm16 kernel: EIP is at prune_dqcache [kernel] 0x73
Aug 20 11:08:11 hm16 kernel: eax: 00000000   ebx: e46f1e20   ecx: e46f1e28   edx: 00000000
Aug 20 11:08:11 hm16 kernel: esi: 00000000   edi: 00000001   ebp: 00000030   esp: ed85da24
Aug 20 11:08:11 hm16 kernel: ds: 0018   es: 0018   ss: 0018
Aug 20 11:08:11 hm16 kernel: Process iozone (pid: 3538, stackpage=ed85d000)
Aug 20 11:08:11 hm16 kernel: Stack: 00000001 00000000 00000030 c013cb65 00000007 00000030 c0161075 00000001
Aug 20 11:08:11 hm16 kernel:        c013c8d0 00000006 00000030 00000064 00000030 00000003 00000030 00000030
Aug 20 11:08:11 hm16 kernel:        00000001 ed85c000 00000000 00000000 c013ce05 00000030 00000001 ed85c000
Aug 20 11:08:11 hm16 kernel: Call Trace: [<c013cb65>] wakeup_kswapd [kernel] 0x85
Aug 20 11:08:11 hm16 kernel: [<c0161075>] shrink_dqcache_memory [kernel] 0x15
Aug 20 11:08:11 hm16 kernel: [<c013c8d0>] do_try_to_free_pages [kernel] 0x30
Aug 20 11:08:11 hm16 kernel: [<c013ce05>] try_to_free_pages [kernel] 0x35
Aug 20 11:08:11 hm16 kernel: [<c013dbb1>] _wrapped_alloc_pages [kernel] 0x1d1
Aug 20 11:08:11 hm16 kernel: [<c013dc9f>] __alloc_pages [kernel] 0xf
Aug 20 11:08:11 hm16 kernel: [<c014399d>] alloc_bounce_page [kernel] 0x3d
Aug 20 11:08:11 hm16 kernel: [<c0199629>] req_new_io [kernel] 0x49
Aug 20 11:08:11 hm16 kernel: [<c0143b4c>] create_bounce [kernel] 0x2c
Aug 20 11:08:11 hm16 kernel: [<c01998f0>] __make_request [kernel] 0xb0
Aug 20 11:08:11 hm16 kernel: [<c013dd40>] __get_free_pages [kernel] 0x10
Aug 20 11:08:11 hm16 kernel: [<c0120bab>] do_softirq [kernel] 0x7b
Aug 20 11:08:11 hm16 kernel: [<c013825b>] kmalloc [kernel] 0x7b
Aug 20 11:08:11 hm16 kernel: [<c019a1bc>] generic_make_request [kernel] 0x17c
Aug 20 11:08:11 hm16 kernel: [<c0246669>] call_apic_timer_interrupt [kernel] 0x5
Aug 20 11:08:11 hm16 kernel: [<f8865f30>] raid1_make_request [raid1] 0x360
Aug 20 11:08:11 hm16 kernel: [<c0125014>] run_local_timers [kernel] 0x94
Aug 20 11:08:11 hm16 kernel: [<c01ceb37>] md_make_request [kernel] 0x47
Aug 20 11:08:11 hm16 kernel: [<c0114228>] smp_apic_timer_interrupt [kernel] 0xb8
Aug 20 11:08:11 hm16 kernel: [<c019a1bc>] generic_make_request [kernel] 0x17c
Aug 20 11:08:11 hm16 kernel: [<c019a229>] submit_bh [kernel] 0x59
Aug 20 11:08:12 hm16 kernel: [<c019a737>] ll_rw_block [kernel] 0x267
Aug 20 11:08:12 hm16 kernel: [<c013da56>] _wrapped_alloc_pages [kernel] 0x76
Aug 20 11:08:12 hm16 kernel: [<c013dc9f>] __alloc_pages [kernel] 0xf
Aug 20 11:08:12 hm16 kernel: [<c013dd40>] __get_free_pages [kernel] 0x10
Aug 20 11:08:12 hm16 kernel: [<c0147e12>] fsync_inode_buffers [kernel] 0xb2
Aug 20 11:08:12 hm16 kernel: [<f8871e0e>] journal_alloc_journal_head [jbd] 0xe
Aug 20 11:08:12 hm16 kernel: [<c01482e7>] refile_buffer [kernel] 0x17
Aug 20 11:08:12 hm16 kernel: [<f886c445>] journal_dirty_data_Rsmp_7f1fc3f3 [jbd] 0x1d5
Aug 20 11:08:12 hm16 kernel: [<f887e634>] journal_dirty_sync_data [ext3] 0x64
Aug 20 11:08:12 hm16 kernel: [<c0148e29>] __block_prepare_write [kernel] 0x59
Aug 20 11:08:12 hm16 kernel: [<f8871cd7>] __jbd_kmalloc [jbd] 0x27
Aug 20 11:08:12 hm16 kernel: [<f886c9b1>] journal_stop_Rsmp_28baf751 [jbd] 0x1a1
Aug 20 11:08:12 hm16 kernel: [<f887e946>] ext3_commit_write [ext3] 0x236
Aug 20 11:08:12 hm16 kernel: [<c0130ec7>] __find_lock_page [kernel] 0x97
Aug 20 11:08:12 hm16 kernel: [<c0139207>] deactivate_page [kernel] 0x17
Aug 20 11:08:12 hm16 kernel: [<c0133dca>] generic_file_write [kernel] 0x53a
Aug 20 11:08:12 hm16 kernel: [<c0133df6>] generic_file_write [kernel] 0x566
Aug 20 11:08:12 hm16 kernel: [<c0159d7c>] dput [kernel] 0x1c
Aug 20 11:08:12 hm16 kernel: [<f887bc64>] ext3_release_file [ext3] 0x14
Aug 20 11:08:12 hm16 kernel: [<c0146b48>] __fput [kernel] 0x68
Aug 20 11:08:12 hm16 kernel: [<c0145d9e>] sys_write [kernel] 0x10e 
Aug 20 11:08:12 hm16 kernel: [<c014571e>] filp_close [kernel] 0x9e  
Aug 20 11:08:12 hm16 kernel: [<f887bd5e>] ext3_sync_file [ext3] 0x4e 
Aug 20 11:08:12 hm16 kernel: [<c014738d>] sys_fsync [kernel] 0x5d  
Aug 20 11:08:12 hm16 kernel: [<c01073c3>] system_call [kernel] 0x33 
Aug 20 11:08:12 hm16 kernel: 
Aug 20 11:08:13 hm16 kernel: 
Aug 20 11:08:13 hm16 kernel: Code: 89 50 04 89 02 c7 41 04 00 00 00 00 c7 43 08 00 00 00 00 8b
Aug 20 11:08:13 hm16 kernel:  <0>Kernel panic: not continuing

lsmod:

e1000                  60536   1 
ext3                   74176   2 
jbd                    55304   2  [ext3]
raid1                  16548   2 
aic79xx               257068   6 
sd_mod                 13888   6 
scsi_mod              126252   2  [aic79xx sd_mod]


Expected results:

It won't crash :))

Additional info:


I tried to use the latest aic79xx driver from
http://people.freebsd.org/~gibbs/linux/, but no build is available for
e.25smp. Since the other kernels have security issues, I'm not going to test
(or use) previous versions.

If I can get access to the new aic79xx 1.3.11 driver for e.25smp, I can run the
test again.

Thanks.
Renato.
