Bug 167173 - kernel crashed with do_IRQ: stack overflow: 452
Status: CLOSED INSUFFICIENT_DATA
Product: Fedora
Classification: Fedora
Component: kernel
Version: 5
Hardware: i686 Linux
Priority: medium
Severity: high
Assigned To: Dave Jones
QA Contact: Brian Brock
Whiteboard: NeedsRetesting
Reported: 2005-08-31 06:24 EDT by Stuart Midgley
Modified: 2015-01-04 17:21 EST
Doc Type: Bug Fix
Last Closed: 2006-11-20 19:26:09 EST

Description Stuart Midgley 2005-08-31 06:24:35 EDT
Description of problem:
 IO load (smb+rsync) causes all kernels since 2.6.10-1.770_FC3 to crash.

The stack trace is

do_IRQ: stack overflow: 452
[<c0105962>] do_IRQ+0x83/0x85 
[<c0103b0a>] common_interrupt+0x1a/0x20
[<c0291e32>] cfq_set_request+0x1b5/0x500
[<c013e9b9>] autoremove_wake_function+0x0/0x37
[<c0291c7d>] cfq_set_request+0x0/0x500 
[<c028452e>] elv_set_request+0x20/0x23
[<c02872bd>] get_request+0x219/0x582 
[<c029017f>] cfq_find_rq_rb+0x2e/0x96
[<c029030c>] cfq_merge+0x0/0xd1 
[<c02903a7>] cfq_merge+0x9b/0xd1
[<c02883d0>] __make_request+0x165/0x628
[<c028418a>] __elv_add_request+0x74/0x9d
[<c0288fd4>] generic_make_request+0x19b/0x276
[<c013e9b9>] autoremove_wake_function+0x0/0x37
[<c013e9b9>] autoremove_wake_function+0x0/0x37
[<e006414a>] handle_stripe+0xfa2/0x16c4 [raid5]
[<e0061f99>] raid5_build_block+0x20/0x75 [raid5]
[<e00613de>] get_active_stripe+0x96/0x566 [raid5]
[<e0061fe3>] raid5_build_block+0x6a/0x75 [raid5] 
[<e0064f01>] make_request+0x34a/0x53a [raid5] 
[<c013e9b9>] autoremove_wake_function+0x0/0x37
[<c013e9b9>] autoremove_wake_function+0x0/0x37
[<c013e9b9>] autoremove_wake_function+0x0/0x37
[<c0288fd4>] generic_make_request+0x19b/0x276 
[<c013e9b9>] autoremove_wake_function+0x0/0x37
[<c013e9b9>] autoremove_wake_function+0x0/0x37
[<c013e9b9>] autoremove_wake_function+0x0/0x37
[<c017da39>] bio_clone+0xad/0xb2 
[<e00432f7>] __map_bio+0x30/0xc8 [dm_mod]
[<e0043531>] __clone_and_map+0xcd/0x309 [dm_mod]
[<c02b0b61>] ide_dma_exec_cmd+0x1f/0x23 
[<c02b0b86>] ide_dma_start+0x21/0x2d 
[<e004380a>] __split_bio+0x9d/0x10b [dm_mod]
[<c02a0000>] ide_timing_merge+0xc2/0xc8 
[<e00438d7>] dm_request+0x5f/0x88 [dm_mod]
[<c0288fd4>] generic_make_request+0x19b/0x276
[<c0155507>] buffered_rmqueue+0x154/0x2e2 
[<c013e9b9>] autoremove_wake_function+0x0/0x37
[<c013e9b9>] autoremove_wake_function+0x0/0x37
[<c013e9b9>] autoremove_wake_function+0x0/0x37
[<c02890fa>] submit_bio+0x4b/0xc5 
[<c013e9b9>] autoremove_wake_function+0x0/0x37
[<c017d776>] bio_alloc_bioset+0x154/0x1c5 
[<c017cfa2>] submit_bh+0x133/0x17f 
[<c017d06f>] ll_rw_block+0x81/0x83
[<e014759d>] search_by_key+0x113/0xd8b [reiserfs]
[<c028405c>] elv_merged_request+0x15/0x1a 
[<c028867f>] __make_request+0x414/0x628 
[<c0103b0a>] common_interrupt+0x1a/0x20
[<c0288fd4>] generic_make_request+0x19b/0x276
[<e014825d>] search_for_position_by_key+0x48/0x358 [reiserfs]
[<c013e9b9>] autoremove_wake_function+0x0/0x37 
[<e0133022>] make_cpu_key+0x42/0x49 [reiserfs]
[<e01332b4>] _get_block_create_0+0xcd/0x680 [reiserfs]
[<e0064f08>] make_request+0x351/0x53a [raid5] 
[<e0134601>] reiserfs_get_block+0xb87/0x11b7 [reiserfs]
[<c013e9b9>] autoremove_wake_function+0x0/0x37 
[<c0288fd4>] generic_make_request+0x19b/0x276 
[<c013e9b9>] autoremove_wake_function+0x0/0x37
[<c013e9b9>] autoremove_wake_function+0x0/0x37
[<c013e9b9>] autoremove_wake_function+0x0/0x37
[<c017da39>] bio_clone+0xad/0xb2 
[<e00432f7>] __map_bio+0x30/0xc8 [dm_mod]
[<e0043531>] __clone_and_map+0xcd/0x309 [dm_mod]
[<e0043855>] __split_bio+0xe8/0x10b [dm_mod] 
[<c013e9b9>] autoremove_wake_function+0x0/0x37
[<e00438d7>] dm_request+0x5f/0x88 [dm_mod] 
[<c0288fd4>] generic_make_request+0x19b/0x276
[<c013e9b9>] autoremove_wake_function+0x0/0x37
[<c0153c3a>] mempool_alloc+0x72/0x250 
[<c013e9b9>] autoremove_wake_function+0x0/0x37
[<c013e9b9>] autoremove_wake_function+0x0/0x37
[<c01a7be0>] mpage_end_io_read+0x0/0x6f 
[<c01a7be0>] mpage_end_io_read+0x0/0x6f
[<c01a7f15>] do_mpage_readpage+0x151/0x417
[<c01a7be0>] mpage_end_io_read+0x0/0x6f 
[<c02890fa>] submit_bio+0x4b/0xc5 
[<c015bb5e>] __pagevec_lru_add+0x133/0x287
[<c0206b4d>] radix_tree_insert+0x74/0x10b 
[<c01a8270>] mpage_readpages+0x95/0x111 
[<e0133a7a>] reiserfs_get_block+0x0/0x11b7 [reiserfs]
[<e0134c31>] reiserfs_readpages+0x0/0x15 [reiserfs] 
[<c0157e2b>] read_pages+0xf5/0x105 
[<e0133a7a>] reiserfs_get_block+0x0/0x11b7 [reiserfs]
[<c015589d>] __alloc_pages+0x169/0x3cb 
[<c0157f28>] __do_page_cache_readahead+0xed/0xf9
[<c0158032>] blockable_page_cache_readahead+0x41/0xa2
[<c0158103>] make_ahead_window+0x70/0xa4 
[<c01581bf>] page_cache_readahead+0x88/0x161
[<c0150fd3>] do_generic_mapping_read+0x524/0x6ce
[<c01513ea>] __generic_file_aio_read+0x18a/0x1f0
[<c015117d>] file_read_actor+0x0/0xe3 
[<c015153b>] generic_file_read+0x9c/0xbe
[<c013e9b9>] autoremove_wake_function+0x0/0x37
[<c0177546>] vfs_read+0xad/0x108 
[<c01777e1>] sys_read+0x41/0x6a 
[<c010394d>] syscall_call+0x7/0xb
======================= 
======================= 
=======================
=======================
=======================
=======================
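
One way to read a trace this long is to tally frames per module and count how often the block layer is re-entered. The sketch below is an editorial aid, not part of the original report: the function name `summarize_trace` is hypothetical, the regular expression assumes the `[<addr>] symbol+0xoff/0xsize [module]` layout seen above, and the sample lines are a small excerpt copied from the trace. Entries such as `autoremove_wake_function+0x0` are probably stale stack words rather than live callers, so the counts should be read as upper bounds.

```python
import re
from collections import Counter

# Matches frames such as "[<e006414a>] handle_stripe+0xfa2/0x16c4 [raid5]".
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]{8})>\]\s+"
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def summarize_trace(text):
    """Return (frames per module, occurrences per symbol) for an oops trace."""
    modules = Counter()
    symbols = Counter()
    for match in FRAME_RE.finditer(text):
        # Frames without a "[module]" tag belong to the core kernel image.
        modules[match.group("module") or "vmlinux"] += 1
        symbols[match.group("symbol")] += 1
    return modules, symbols


# Excerpt copied from the trace in this report (not the full trace).
SAMPLE = """\
[<c0291e32>] cfq_set_request+0x1b5/0x500
[<c0288fd4>] generic_make_request+0x19b/0x276
[<e0064f01>] make_request+0x34a/0x53a [raid5]
[<c0288fd4>] generic_make_request+0x19b/0x276
[<e00438d7>] dm_request+0x5f/0x88 [dm_mod]
[<c0288fd4>] generic_make_request+0x19b/0x276
[<e014759d>] search_by_key+0x113/0xd8b [reiserfs]
"""

modules, symbols = summarize_trace(SAMPLE)
print(dict(modules))
print("generic_make_request frames:", symbols["generic_make_request"])
```

Run against the full trace, the same function gives a quick picture of which layers (reiserfs, dm_mod, raid5, the CFQ elevator) contribute frames to the overflowing stack.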



Version-Release number of selected component (if applicable):
2.6.12-1.1376_FC3

How reproducible:
generate a moderate amount of IO load (smb and an rsync is usually enough)


Additional info:

grub boot parameters

title Fedora Core (2.6.12-1.1376_FC3)
root (hd0,0)
kernel /vmlinuz-2.6.12-1.1376_FC3 ro root=/dev/md2 rhgb quiet console=ttyS0,38400 console=tty0 noapic
initrd /initrd-2.6.12-1.1376_FC3.img


gem:/tmp # cat /proc/partitions
major minor    #blocks name

   3     0  156290904 hda
   3     1     104391 hda1
   3     2     987997 hda2
   3     4          1 hda4
   3     5  155195901 hda5
   3    64  156290904 hdb
   3    65     104391 hdb1
   3    66     987997 hdb2
   3    68          1 hdb4
   3    69  155195901 hdb5
  22     0  156290904 hdc
  22     1     104391 hdc1
  22     2     987997 hdc2
  22     4          1 hdc4
  22     5  155195901 hdc5
  22    64  156290904 hdd
  22    65     104391 hdd1
  22    66     987997 hdd2
  22    68          1 hdd4
  22    69  155195901 hdd5
   9     0     104320 md0
   9     3  465587328 md3
   9     2     987904 md2
   9     1     987904 md1
 253     0    2097152 dm-0
 253     1   10485760 dm-1
 253     2   10485760 dm-2
 253     3  442499072 dm-3
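
The partition sizes hint at how the storage is layered, which matters because each layer adds frames to the I/O path seen in the trace. The following sketch is only a sanity check under two assumptions that the report does not state outright: that md3 is a RAID 5 array over the four `hd?5` partitions, and that the four dm volumes are carved out of md3. The helper name `raid5_capacity` is hypothetical; the sizes are the 1 KiB block counts from the listing above.

```python
# Sizes in 1 KiB blocks, copied from the /proc/partitions listing above.
BLOCKS = {
    "hda5": 155195901, "hdb5": 155195901,
    "hdc5": 155195901, "hdd5": 155195901,
    "md3": 465587328,
    "dm-0": 2097152, "dm-1": 10485760,
    "dm-2": 10485760, "dm-3": 442499072,
}


def raid5_capacity(member_sizes):
    """Usable RAID 5 capacity: one member's worth of space holds parity."""
    return (len(member_sizes) - 1) * min(member_sizes)


# Assumption: md3 is built from the four large partitions.
members = [BLOCKS[name] for name in ("hda5", "hdb5", "hdc5", "hdd5")]
expected_md3 = raid5_capacity(members)

# Assumption: every dm-* volume lives on md3.
dm_total = sum(size for name, size in BLOCKS.items() if name.startswith("dm-"))

print("RAID 5 estimate:", expected_md3, "KiB; md3 reports", BLOCKS["md3"], "KiB")
print("difference (metadata/rounding):", expected_md3 - BLOCKS["md3"], "KiB")
print("dm volumes total:", dm_total, "KiB; left over on md3:", BLOCKS["md3"] - dm_total, "KiB")
```

The three-member estimate lands within a few hundred KiB of the reported md3 size, and the dm volumes fit inside md3 with a small remainder. Both results are consistent with a filesystem-on-dm-on-raid5-on-IDE stack, though they do not prove it.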



gem:/tmp # lspci
00:00.0 Host bridge: ATI Technologies Inc Radeon 9100 IGP Host Bridge (rev 02)
00:01.0 PCI bridge: ATI Technologies Inc Radeon 9100 IGP AGP Bridge
00:13.0 USB Controller: ATI Technologies Inc OHCI USB Controller #1 (rev 01)
00:13.1 USB Controller: ATI Technologies Inc OHCI USB Controller #2 (rev 01)
00:13.2 USB Controller: ATI Technologies Inc EHCI USB Controller (rev 01)
00:14.0 SMBus: ATI Technologies Inc ATI SMBus (rev 18)
00:14.1 IDE interface: ATI Technologies Inc: Unknown device 4349
00:14.3 ISA bridge: ATI Technologies Inc: Unknown device 434c
00:14.4 PCI bridge: ATI Technologies Inc: Unknown device 4342
01:05.0 VGA compatible controller: ATI Technologies Inc Radeon 9100 IGP
02:06.0 RAID bus controller: Integrated Technology Express, Inc. IT/ITE8212 Dual channel ATA RAID controller (PCI version seems to be IT8212, embedded seems (rev 11)
02:09.0 Ethernet controller: National Semiconductor Corporation DP83815 (MacPhyter) Ethernet Controller
Comment 1 Dave Jones 2005-09-30 07:38:47 EDT
A possible fix has been merged into CVS for the next update.
Comment 2 Orion Poplawski 2005-10-05 10:53:25 EDT
I got a similar crash with kernel-smp-2.6.13-1.1526_FC4:

Oct  5 03:25:43 alexandria kernel: do_IRQ: stack overflow: 496
Oct  5 03:25:43 alexandria kernel:  [<c0105f44>] do_IRQ+0x84/0x86
Oct  5 03:25:43 alexandria kernel: Unable to handle kernel paging request at virtual address bec93e70
Oct  5 03:25:43 alexandria kernel:  printing eip:
Oct  5 03:25:43 alexandria kernel: c012180d
Oct  5 03:25:43 alexandria kernel: *pde = 00000000
Oct  5 03:25:43 alexandria kernel: Oops: 0000 [#1]
Oct  5 03:25:43 alexandria kernel: SMP
Oct  5 03:25:43 alexandria kernel: Modules linked in: nfs nfsd exportfs lockd nfs_acl ipv6 autofs4 w83627hf w83781d adm1021 i2c_sensor i2c_isa sunrpc jfs video button battery ac uhci_hcd hw_random i2c_i801 i2c_core shpchp eepro100 mii e1000 dm_snapshot dm_zero dm_mirror ext3 jbd raid5 xor raid1 dm_mod mv_sata(U) sd_mod scsi_mod
Oct  5 03:25:43 alexandria kernel: CPU:    -195358332
Oct  5 03:25:43 alexandria kernel: EIP:    0060:[<c012180d>]    Not tainted VLI
Oct  5 03:25:43 alexandria kernel: EFLAGS: 00010086   (2.6.13-1.1526_FC4smp)
Oct  5 03:25:43 alexandria kernel: EIP is at vprintk+0x1a7/0x2aa
Oct  5 03:25:43 alexandria kernel: eax: f45b109c   ebx: 00000000   ecx: 00020000   edx: c0466841
Oct  5 03:25:43 alexandria kernel: esi: 00000001   edi: 00000082   ebp: 00000010   esp: f45b1184
Oct  5 03:25:43 alexandria kernel: ds: 007b   es: 007b   ss: 0068
Oct  5 03:25:43 alexandria kernel: Process  (pid: -195358164, threadinfo=f45b1000 task=f45b1184)
Oct  5 03:25:43 alexandria kernel: Stack: f45b120c c0121a42 00000000 c0466840 0000001f 00000086 00000000 c012185a
Oct  5 03:25:43 alexandria kernel:        c0466841 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Oct  5 03:25:43 alexandria kernel:        c046685c 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Oct  5 03:25:43 alexandria kernel: Call Trace:
Oct  5 03:25:43 alexandria kernel:  [<c0121a42>] release_console_sem+0xad/0xb5
Oct  5 03:25:43 alexandria kernel:  [<c012185a>] vprintk+0x1f4/0x2aa
Oct  5 03:25:43 alexandria kernel:  [<c0105f44>] do_IRQ+0x84/0x86
Oct  5 03:25:43 alexandria kernel:  [<c0121662>] printk+0x1b/0x1f
Oct  5 03:25:43 alexandria kernel:  [<c01047b5>] show_trace+0x56/0x78
Oct  5 03:25:43 alexandria kernel:  [<c0105f44>] do_IRQ+0x84/0x86
Oct  5 03:25:43 alexandria kernel:  [<c01048b2>] dump_stack+0x13/0x17
Oct  5 03:25:43 alexandria kernel:  [<c0105f44>] do_IRQ+0x84/0x86
Oct  5 03:25:43 alexandria kernel:  [<c0104392>] common_interrupt+0x1a/0x20
Oct  5 03:25:43 alexandria kernel:  [<c024e3b0>] get_io_context+0xc/0xd
Oct  5 03:25:43 alexandria kernel:  [<c0255e24>] cfq_get_io_context+0x18/0xcd
Oct  5 03:25:43 alexandria kernel:  [<c02566ff>] cfq_set_request+0x69/0x225
Oct  5 03:25:43 alexandria kernel:  [<c0256696>] cfq_set_request+0x0/0x225
Oct  5 03:25:43 alexandria kernel:  [<c024a2cf>] elv_set_request+0x1e/0x33
Oct  5 03:25:43 alexandria kernel:  [<c024c86f>] get_request+0xfd/0x2af
Oct  5 03:25:43 alexandria kernel:  [<c024ca3a>] get_request_wait+0x19/0xfb
Oct  5 03:25:43 alexandria kernel:  [<f889f6f5>] commandsQueueAddTail+0x71/0x80 [mv_sata]
Oct  5 03:25:43 alexandria kernel:  [<c024d31f>] __make_request+0xa7/0x4c3
Oct  5 03:25:43 alexandria kernel:  [<f88a0000>] _doDevErrorRecovery+0x2e/0x46 [mv_sata]
Oct  5 03:25:43 alexandria kernel:  [<c024da21>] generic_make_request+0x9a/0x24b
Oct  5 03:25:43 alexandria kernel:  [<f881ae58>] compute_blocknr+0xe5/0x16e [raid5]
Oct  5 03:25:43 alexandria kernel:  [<c01347c2>] autoremove_wake_function+0x0/0x37
Oct  5 03:25:43 alexandria kernel:  [<f881c0d2>] handle_stripe+0x721/0x1079 [raid5]
Oct  5 03:25:43 alexandria kernel:  [<f881ab41>] raid5_build_block+0x66/0x70 [raid5]
Oct  5 03:25:43 alexandria kernel:  [<f881a3ff>] get_active_stripe+0x1a0/0x393 [raid5]
Oct  5 03:25:43 alexandria kernel:  [<f881ced4>] make_request+0x2cf/0x300 [raid5]
Oct  5 03:25:43 alexandria kernel:  [<c01347c2>] autoremove_wake_function+0x0/0x37
Oct  5 03:25:43 alexandria kernel:  [<c024da21>] generic_make_request+0x9a/0x24b
Oct  5 03:25:43 alexandria kernel:  [<c0169012>] bio_clone+0xa5/0xb6
Oct  5 03:25:43 alexandria kernel:  [<c01347c2>] autoremove_wake_function+0x0/0x37
Oct  5 03:25:43 alexandria kernel:  [<f886854d>] __clone_and_map+0xb3/0x328 [dm_mod]
Oct  5 03:25:43 alexandria kernel:  [<c0148ce1>] mempool_alloc+0x26/0xe7
Oct  5 03:25:43 alexandria kernel:  [<f8868894>] __split_bio+0xd2/0x114 [dm_mod]
Oct  5 03:25:43 alexandria kernel:  [<f8868954>] dm_request+0x7e/0x94 [dm_mod]
Oct  5 03:25:43 alexandria kernel:  [<c024da21>] generic_make_request+0x9a/0x24b
Oct  5 03:25:43 alexandria kernel:  [<c01347d7>] autoremove_wake_function+0x15/0x37
Oct  5 03:25:43 alexandria kernel:  [<c01347c2>] autoremove_wake_function+0x0/0x37
Oct  5 03:25:43 alexandria kernel:  [<c024dc17>] submit_bio+0x45/0xcb
Oct  5 03:25:43 alexandria kernel:  [<c0148ce1>] mempool_alloc+0x26/0xe7
Oct  5 03:25:43 alexandria kernel:  [<c0149f39>] buffered_rmqueue+0xc6/0x228
Oct  5 03:25:43 alexandria kernel:  [<c01691f6>] bio_add_page+0x26/0x2c
Oct  5 03:25:43 alexandria kernel:  [<f8b72a58>] metapage_readpage+0x186/0x1c5 [jfs]
Oct  5 03:25:43 alexandria kernel:  [<c0147291>] read_cache_page+0x88/0x137
Oct  5 03:25:43 alexandria kernel:  [<f8b728d2>] metapage_readpage+0x0/0x1c5 [jfs]
Oct  5 03:25:43 alexandria kernel:  [<f8b72c5e>] __get_metapage+0x112/0x425 [jfs]
Oct  5 03:25:43 alexandria kernel:  [<c024da21>] generic_make_request+0x9a/0x24b
Oct  5 03:25:43 alexandria kernel:  [<f8b5d459>] xtSearch+0x3ef/0x739 [jfs]
Oct  5 03:25:43 alexandria kernel:  [<c0169012>] bio_clone+0xa5/0xb6
Oct  5 03:25:43 alexandria kernel:  [<f8b5c644>] xtLookup+0xc4/0x236 [jfs]
Oct  5 03:25:43 alexandria kernel:  [<f8868954>] dm_request+0x7e/0x94 [dm_mod]
Oct  5 03:25:43 alexandria kernel:  [<c024da21>] generic_make_request+0x9a/0x24b
Oct  5 03:25:43 alexandria kernel:  [<f886854d>] __clone_and_map+0xb3/0x328 [dm_mod]
Oct  5 03:25:43 alexandria kernel:  [<f8b7252e>] metapage_get_blocks+0xa3/0xe4 [jfs]
Oct  5 03:25:43 alexandria kernel:  [<f8b72798>] metapage_writepage+0xa8/0x1e2 [jfs]
Oct  5 03:25:43 alexandria kernel:  [<c0186900>] mpage_writepages+0x227/0x3ee
Oct  5 03:25:44 alexandria kernel:  [<f8b726f0>] metapage_writepage+0x0/0x1e2 [jfs]
Oct  5 03:25:44 alexandria kernel:  [<c01456a8>] __filemap_fdatawrite_range+0x66/0x72
Oct  5 03:25:44 alexandria kernel:  [<c0145725>] filemap_flush+0x23/0x27
Oct  5 03:25:44 alexandria kernel:  [<f8b740c9>] lmLogSync+0x15d/0x1ed [jfs]
Oct  5 03:25:44 alexandria kernel:  [<f8b7358a>] lmLog+0x7a/0x194 [jfs]
Oct  5 03:25:44 alexandria kernel:  [<f8b77799>] diLog+0xf1/0x103 [jfs]
Oct  5 03:25:44 alexandria kernel:  [<f8b7764a>] txLog+0xb1/0x10f [jfs]
Oct  5 03:25:44 alexandria kernel:  [<f8b774ab>] txCommit+0x1fa/0x2e8 [jfs]
Oct  5 03:25:44 alexandria kernel:  [<f8b58d84>] jfs_commit_inode+0x109/0x11c [jfs]
Oct  5 03:25:44 alexandria kernel:  [<f8b71efe>] extAlloc+0x3ae/0x471 [jfs]
Oct  5 03:25:44 alexandria kernel:  [<f8b59198>] jfs_get_blocks+0x274/0x2cd [jfs]
Oct  5 03:25:44 alexandria kernel:  [<f8b59211>] jfs_get_block+0x20/0x25 [jfs]
Oct  5 03:25:44 alexandria kernel:  [<c0167cb8>] nobh_prepare_write+0x13d/0x3f4
Oct  5 03:25:44 alexandria kernel:  [<c014a22a>] __alloc_pages+0xfe/0x44e
Oct  5 03:25:44 alexandria kernel:  [<c0145b01>] add_to_page_cache+0x4e/0xaf
Oct  5 03:25:44 alexandria kernel:  [<f8b5924d>] jfs_prepare_write+0x0/0x15 [jfs]
Oct  5 03:25:44 alexandria kernel:  [<c0147ada>] generic_file_buffered_write+0x298/0x642
Oct  5 03:25:44 alexandria kernel:  [<f8b591f1>] jfs_get_block+0x0/0x25 [jfs]
Oct  5 03:25:44 alexandria kernel:  [<c0126089>] current_fs_time+0x5a/0x75
Oct  5 03:25:44 alexandria kernel:  [<c017d3a4>] inode_update_time+0x2d/0x9b
Oct  5 03:25:44 alexandria kernel:  [<c0148128>] __generic_file_aio_write_nolock+0x2a4/0x4d2
Oct  5 03:25:44 alexandria kernel:  [<c02b35f4>] sock_common_recvmsg+0x41/0x57
Oct  5 03:25:44 alexandria kernel:  [<c0148471>] __generic_file_write_nolock+0x89/0xa3
Oct  5 03:25:44 alexandria kernel:  [<f9302813>] svc_expkey_lookup+0x371/0x3ef [nfsd]
Oct  5 03:25:44 alexandria kernel:  [<c01347c2>] autoremove_wake_function+0x0/0x37
Oct  5 03:25:44 alexandria kernel:  [<c01487be>] generic_file_writev+0x49/0xb3
Oct  5 03:25:44 alexandria kernel:  [<c0148775>] generic_file_writev+0x0/0xb3
Oct  5 03:25:44 alexandria kernel:  [<c01647c0>] do_readv_writev+0x1f4/0x271
Oct  5 03:25:44 alexandria kernel:  [<c014860d>] generic_file_write+0x0/0xc5
Oct  5 03:25:44 alexandria kernel:  [<f8b58ae9>] jfs_open+0xd/0x87 [jfs]
Oct  5 03:25:44 alexandria kernel:  [<c01635dc>] dentry_open+0x16f/0x1e8
Oct  5 03:25:44 alexandria kernel:  [<c01648cb>] vfs_writev+0x3d/0x53
Oct  5 03:25:44 alexandria kernel:  [<f92ff9db>] nfsd_write+0x31a/0x72c [nfsd]
Oct  5 03:25:44 alexandria kernel:  [<c031657b>] schedule+0x53b/0xb8e
Oct  5 03:25:44 alexandria kernel:  [<c016cc09>] vfs_getattr+0x52/0xa2
Oct  5 03:25:44 alexandria kernel:  [<f9308b94>] nfs3svc_decode_writeargs+0x0/0x17d [nfsd]
Oct  5 03:25:44 alexandria kernel:  [<f930712c>] nfsd3_proc_write+0xf9/0x121 [nfsd]
Oct  5 03:25:44 alexandria kernel:  [<f9308b94>] nfs3svc_decode_writeargs+0x0/0x17d [nfsd]
Oct  5 03:25:44 alexandria kernel:  [<f92fb5e4>] nfsd_dispatch+0x76/0x1c2 [nfsd]
Oct  5 03:25:44 alexandria kernel:  [<f924d047>] svc_authenticate+0x97/0xae [sunrpc]
Oct  5 03:25:44 alexandria kernel:  [<f924a7c3>] svc_process+0x3b4/0x671 [sunrpc]
Oct  5 03:25:44 alexandria kernel:  [<f92fb3ab>] nfsd+0x184/0x347 [nfsd]
Oct  5 03:25:44 alexandria kernel:  [<f92fb227>] nfsd+0x0/0x347 [nfsd]
Oct  5 03:25:44 alexandria kernel:  [<c0101ca1>] kernel_thread_helper+0x5/0xb
Oct  5 03:25:44 alexandria kernel:  =======================
Oct  5 03:25:44 alexandria kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000001
Oct  5 03:25:44 alexandria kernel:  printing eip:
Oct  5 03:25:44 alexandria kernel: c010477d
Oct  5 03:25:44 alexandria kernel: *pde = 01b42001
Comment 3 Fedora Update System 2005-10-20 10:29:51 EDT
kernel-2.6.12-1.1380_FC3 has been pushed for FC3, which should resolve this issue.  If these problems are still present in this version, then please make note of it in this bug report.
Comment 4 Orion Poplawski 2005-10-20 11:04:24 EDT
Is there a fix for the FC4 kernel?
Comment 5 Stuart Midgley 2005-10-20 19:48:00 EDT
This kernel hangs (no crash) during boot trying to remount the root file system.
Comment 6 Kevin DeKorte 2005-10-21 15:43:55 EDT
Upgrading on FC3 from 1378 to 1380 and rebooting hangs the machine
during/immediately after the file system check on the MD device. Rolling back to
1378 allows the machine to boot properly.
Comment 7 Dave Jones 2005-10-27 20:51:16 EDT
1381 fixes the hang, but it does so by reverting the change that I was hoping
would fix this bug.
Comment 8 Yeechang Lee 2005-11-03 21:55:40 EST
Yes, I can confirm that 1381smp still has the bug. I don't have a serial
console, and so can't provide screen dumps, but I also get a stack overflow.

I have an SMP dual Xeon system built on a Supermicro X5DAL-G and two 3ware 7506-4
RAID cards in JBOD mode to support eight 400GB drives in JFS-on-LVM-on-software
RAID. The system normally runs fine as a media file server, but when I stress it
(e.g., copy a file to it over NFS, as opposed to pulling from it) there's a fair
chance it'll randomly hang. Right now I'm using 2.6.9-1.667smp; reading the
above gives me hope it'll do until the bug gets fixed.
Comment 9 Yeechang Lee 2005-11-06 21:58:20 EST
I can now confirm that kernel-smp-2.6.9-1.667 also has the issue. Identical
symptoms; high load (BitTorrent and a RAID 5 rebuild, with a large diff-over-NFS
acting as the feather that broke the camel's back) leads to stack overflow on an
otherwise-stable system.

Can someone confirm that switching to Fedora Core 4 or RHEL gets rid of this issue?
Comment 10 Dave Jones 2005-11-07 00:31:16 EST
Every release of every distro has this problem right now. It's still being
worked out upstream (a patch appeared this afternoon which may solve the issue).
I'll build an FC3 kernel with it soon for testing.
Comment 11 Kevin DeKorte 2005-11-09 19:19:20 EST
Using kernel 1381, I was working with a mail file over 1.3MB and got this error:
fseek: Invalid argument
panic: temporary file seek
Aborted

After rolling back to 1378, the problem went away.
Comment 12 Yeechang Lee 2005-11-09 20:17:55 EST
(In reply to comment #10)
> every release of every distro has this problem right now. It's still being
> worked out upstream, (A patch appeared this afternoon which may solve the issue).
> I'll build an FC3 kernel with it soon for testing.

Glad to hear it. Meanwhile, I've rolled 1381 from source with the 4K stack
turned off; hopefully 8K will be enough to keep the system stable. If not, I'll
try a 1381-with-16K-stack-patch variant that Linuxant has made available.
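
The numbers in the "do_IRQ: stack overflow: N" messages can be read as bytes of kernel stack still free when the interrupt arrived. The sketch below models why moving from 4K to 8K stacks would be expected to help. It rests on an assumption about the 2.6-era i386 debug check that is not stated anywhere in this bug: that the warning fires when fewer than THREAD_SIZE/8 bytes remain. The function names are hypothetical, and the arithmetic ignores the small thread_info structure at the bottom of the stack as well as the separate IRQ stacks used with 4K stacks, so the results are rough.

```python
# "do_IRQ: stack overflow: N" values reported in this bug.
REPORTED = [452, 496, 500]


def warn_threshold(thread_size):
    """Assumed warning threshold: one eighth of the kernel stack size."""
    return thread_size // 8


def headroom_after_resize(reported_free, old_size=4096, new_size=8192):
    """Free bytes if the same call chain ran on a larger stack (approximate)."""
    used = old_size - reported_free  # rough: ignores thread_info overhead
    return new_size - used


for free in REPORTED:
    new_free = headroom_after_resize(free)
    print(
        f"free={free:4d} on 4K (limit {warn_threshold(4096)}), "
        f"about {new_free} on 8K (limit {warn_threshold(8192)}), "
        f"still warns: {new_free < warn_threshold(8192)}"
    )
```

All three reported values fall just under 512, which fits the 4K-stack threshold under this assumption. The same call depth on an 8K stack would leave several kilobytes to spare.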
Comment 13 Stuart Midgley 2005-12-14 18:21:06 EST
Any ETA on a fix?  FC3 will be moved to legacy shortly and I would love to upgrade to FC4 or FC5, but am 
unwilling to move from my current working kernel.
Comment 14 Yeechang Lee 2005-12-18 13:06:13 EST
As mentioned above, I've been running 1381 with the 4K stack turned off for the
past six weeks (never had to try the Linuxant kernels, but I'm sure they'd do
just as well) and I am happy to report that it has worked out well; no more
stack overflows! Yay!
Comment 15 Dave Jones 2006-01-16 17:10:05 EST
This is a mass-update to all currently open Fedora Core 3 kernel bugs.

Fedora Core 3 support has transitioned to the Fedora Legacy project.
Due to the limited resources of this project, typically only
updates for new security issues are released.

As this bug isn't security related, it has been migrated to a
Fedora Core 4 bug.  Please upgrade to this newer release, and
test if this bug is still present there.

This bug has been placed in NEEDINFO_REPORTER state.
Due to the large volume of inactive bugs in bugzilla, if this bug is
still in this state in two weeks time, it will be closed.

Should this bug still be relevant after this period, the reporter
can reopen the bug at any time. Any other users on the Cc: list
of this bug can request that the bug be reopened by adding a
comment to the bug.

Thank you.
Comment 16 Dave Jones 2006-02-03 00:09:51 EST
This is a mass-update to all currently open kernel bugs.

A new kernel update has been released (Version: 2.6.15-1.1830_FC4)
based upon a new upstream kernel release.

Please retest against this new kernel, as a large number of patches
go into each upstream release, possibly including changes that
may address this problem.

This bug has been placed in NEEDINFO_REPORTER state.
Due to the large volume of inactive bugs in bugzilla, if this bug is
still in this state in two weeks time, it will be closed.

Should this bug still be relevant after this period, the reporter
can reopen the bug at any time. Any other users on the Cc: list
of this bug can request that the bug be reopened by adding a
comment to the bug.

If this bug is a problem preventing you from installing the
release this version is filed against, please see bug 169613.

Thank you.
Comment 17 Stuart Midgley 2006-02-03 00:21:04 EST
Received the mass email regarding the potential closing of this bug.

Unfortunately, due to current requirements of my server, I am unable to upgrade from FC3 and am not willing to change my kernel, 2.6.10-1.770_FC3.  I don't see any specific information which indicates that this bug has been fixed.

Hopefully I will be able to upgrade in about one month's time (when the server isn't required to be 100% available) and will be able to test out any new kernels.
Comment 18 Dave Jones 2006-02-03 11:58:55 EST
The important line from the changelog regarding this bug is this...

- Reduce block layer stack usage.

let me know how it works out when you get a chance.
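
The traces in this bug show generic_make_request being re-entered once per stacked layer (filesystem, dm, raid5, disk queue), with every layer adding frames to the same kernel stack. The toy model below is not the Fedora or upstream patch, and the changelog line above does not say how the reduction was achieved. It only illustrates, with hypothetical names, why call depth grows with the number of layers when each layer calls directly into the next, and stays flat when submissions are queued and drained from a single loop.

```python
import sys
from collections import deque


def frame_depth():
    """Count Python frames above the caller, as a stand-in for stack usage."""
    depth, frame = 0, sys._getframe(1)
    while frame is not None:
        depth += 1
        frame = frame.f_back
    return depth


class Layer:
    """A stacked block device that passes requests to the layer below it."""

    def __init__(self, name, lower=None):
        self.name = name
        self.lower = lower


def submit_recursive(layer, depths):
    """Each layer calls straight into the next, so depth grows per layer."""
    depths.append(frame_depth())
    if layer.lower is not None:
        submit_recursive(layer.lower, depths)


def submit_iterative(layer, depths):
    """Each layer queues work for the next; one loop drains the queue."""
    pending = deque([layer])
    while pending:
        current = pending.popleft()
        depths.append(frame_depth())
        if current.lower is not None:
            pending.append(current.lower)


# dm on raid5 on a disk, loosely mirroring the stack in the original report.
top = Layer("dm", Layer("raid5", Layer("disk")))

recursive_depths, iterative_depths = [], []
submit_recursive(top, recursive_depths)
submit_iterative(top, iterative_depths)

print("recursive growth:", recursive_depths[-1] - recursive_depths[0])
print("iterative growth:", iterative_depths[-1] - iterative_depths[0])
```

In the recursive variant the depth rises by one frame per layer, while the queued variant visits every layer at the same depth.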
Comment 19 Stuart Midgley 2006-04-25 10:27:46 EDT
OK, I finally upgraded from FC3 to FC4 and my testing of the 2.6.15-1.2054_FC5 kernel is very positive.  I 
have been doing a severe amount of IO to the box (I copied off all my data, added a disk to my software 
raid array, rebuilt the server and copied back on all my data ~400GB worth) and it hasn't crashed.

This would certainly have caused the FC3 kernels a lot of trouble.  The bug appears squashed.

Thanks.
Comment 20 Dave Jones 2006-09-16 21:34:14 EDT
[This comment added as part of a mass-update to all open FC4 kernel bugs]

FC4 has now transitioned to the Fedora legacy project, which will continue to
release security related updates for the kernel.  As this bug is not security
related, it is unlikely to be fixed in an update for FC4, and has been migrated
to FC5.

Please retest with Fedora Core 5.

Thank you.
Comment 21 Vladimir Gabrelyan 2006-10-14 13:39:23 EDT
I confirm this bug exists in FC5 kernel 2.6.15-1.2054_FC5smp.

It still occurs under high disk I/O load. I'm using a dual Xeon 2.8GHz with an
Areca ARC 1120 SATA RAID.
Comment 22 Dave Jones 2006-10-16 13:01:44 EDT
2.6.15 is pretty ancient now; try the 2.6.18 update that went out today.
Comment 23 Dave Jones 2006-10-16 13:26:37 EDT
A new kernel update has been released (Version: 2.6.18-1.2200.fc5)
based upon a new upstream kernel release.

Please retest against this new kernel, as a large number of patches
go into each upstream release, possibly including changes that
may address this problem.

This bug has been placed in NEEDINFO state.
Due to the large volume of inactive bugs in bugzilla, if this bug is
still in this state in two weeks time, it will be closed.

Should this bug still be relevant after this period, the reporter
can reopen the bug at any time. Any other users on the Cc: list
of this bug can request that the bug be reopened by adding a
comment to the bug.

In the last few updates, some users upgrading from FC4->FC5
have reported that installing a kernel update has left their
systems unbootable. If you have been affected by this problem
please check you only have one version of device-mapper & lvm2
installed.  See bug 207474 for further details.

If this bug is a problem preventing you from installing the
release this version is filed against, please see bug 169613.

If this bug has been fixed, but you are now experiencing a different
problem, please file a separate bug for the new problem.

Thank you.
Comment 24 Jeremy Hunt 2006-11-14 01:32:08 EST
Looks like this bug is alive and well in all kernels up to and including
2.6.18-1.2239.fc5smp. I've been experiencing crashes under heavy I/O on my
mythbackend box while recording 2 HD shows and one standard-def show at the same
time. This has been going on since ~2.6.16, but I was finally able to get
something to dump to the netconsole last night.

Here's what I got from the first crash:
Nov 12 20:26:37 chef BUG: sleeping function called from invalid context at kernel/sched.c:4509
Nov 12 20:26:37 chef in_atomic():1, irqs_disabled():0 
Nov 12 20:26:37 chef  [<c04050ef>] dump_trace+0x69/0x1af 
Nov 12 20:26:37 chef  [<c040524d>] show_trace_log_lvl+0x18/0x2c 
Nov 12 20:26:37 chef  [<c0405800>] show_trace+0xf/0x11 
Nov 12 20:26:37 chef  [<c04058fa>] dump_stack+0x15/0x17 
Nov 12 20:26:37 chef  [<c0420c06>] __cond_resched+0x12/0x3c 
Nov 12 20:26:37 chef  [<c060e4bc>] cond_resched+0x2a/0x31 
Nov 12 20:26:37 chef BUG: unable to handle kernel paging request at virtual address ffa3d283
Nov 12 20:26:37 chef  printing eip: 
Nov 12 20:45:57 chef do_IRQ: stack overflow: 500


and from the second crash:

Nov 13 21:56:46 chef do_IRQ: stack overflow: 500 
Nov 13 21:56:46 chef  [<c04050ef>] 
Nov 13 21:56:46 chef dump_trace+0x69/0x1af 


I don't have a free serial port so I can't do a serial console and this seems to
be all that I can get on the netconsole. Is there anything else I can do to get
a good dump?
Comment 25 Jeremy Hunt 2006-11-14 02:57:40 EST
Adding PCI and module info

[root@chef ~]# lsmod
Module                  Size  Used by
netconsole              7649  0 
nfsd                  221169  17 
exportfs               10177  1 nfsd
lockd                  66505  2 nfsd
nfs_acl                 8001  1 nfsd
autofs4                25669  1 
sunrpc                158589  12 nfsd,lockd,nfs_acl
ext3                  136137  1 
jbd                    63593  1 ext3
raid1                  27201  1 
video                  21317  0 
sbs                    20353  0 
i2c_ec                  9537  1 sbs
button                 11217  0 
battery                14533  0 
asus_acpi              20825  0 
ac                      9669  0 
ipv6                  267361  26 
lp                     17161  0 
parport_pc             31461  1 
parport                41097  2 lp,parport_pc
wm8775                 10317  0 
cx25840                28113  0 
lirc_atiusb            19360  1 
lirc_dev               17044  1 lirc_atiusb
cx88_blackbird         22853  0 
tuner                  63221  0 
cx88_dvb               19941  1 
nvidia               4537876  20 
cx8800                 38349  1 cx88_blackbird
cx8802                 17221  2 cx88_blackbird,cx88_dvb
snd_hda_intel          20760  0 
cx88xx                 65637  4 cx88_blackbird,cx88_dvb,cx8800,cx8802
snd_hda_codec         163328  1 snd_hda_intel
cx88_vp3054_i2c         8897  1 cx88_dvb
ivtv                  170128  0 
snd_seq_dummy           7428  0 
b2c2_flexcop_pci       13145  0 
b2c2_flexcop           32469  1 b2c2_flexcop_pci
ir_common              32325  1 cx88xx
snd_seq_oss            36736  0 
or51132                14277  1 cx88_dvb
video_buf_dvb          11077  1 cx88_dvb
mt352                  10949  2 cx88_dvb,b2c2_flexcop
compat_ioctl32          5697  1 cx8800
i2c_algo_bit           13001  3 cx88xx,cx88_vp3054_i2c,ivtv
snd_seq_midi_event     11136  1 snd_seq_oss
mt312                  12356  1 b2c2_flexcop
cx2341x                15429  2 cx88_blackbird,ivtv
snd_seq                54128  5 snd_seq_dummy,snd_seq_oss,snd_seq_midi_event
video_buf              29253  6 cx88_blackbird,cx88_dvb,cx8800,cx8802,cx88xx,video_buf_dvb
btcx_risc               9289  3 cx8800,cx8802,cx88xx
bcm3510                14021  1 b2c2_flexcop
snd_seq_device         11788  3 snd_seq_dummy,snd_seq_oss,snd_seq
dvb_pll                18885  2 cx88_dvb,b2c2_flexcop
tveeprom               18769  2 cx88xx,ivtv
snd_pcm_oss            44416  0 
stv0299                14921  1 b2c2_flexcop
isl6421                 6721  1 cx88_dvb
zl10353                 9797  1 cx88_dvb
dvb_core               83689  3 b2c2_flexcop,video_buf_dvb,stv0299
stv0297                12361  1 b2c2_flexcop
videodev               27201  4 cx88_blackbird,cx8800,cx88xx,ivtv
snd_mixer_oss          19840  1 snd_pcm_oss
cx24123                16329  1 cx88_dvb
snd_pcm                77956  3 snd_hda_intel,snd_hda_codec,snd_pcm_oss
nxt200x                17733  2 cx88_dvb,b2c2_flexcop
cx22702                10565  1 cx88_dvb
lgdt330x               12509  2 cx88_dvb,b2c2_flexcop
v4l1_compat            16581  3 cx8800,ivtv,videodev
snd_timer              23684  2 snd_seq,snd_pcm
v4l2_common            26433  7 cx25840,cx88_blackbird,tuner,cx8800,ivtv,cx2341x,videodev
snd                    53380  10 snd_hda_intel,snd_hda_codec,snd_seq_dummy,snd_seq_oss,snd_seq,snd_seq_device,snd_pcm_oss,snd_mixer_oss,snd_pcm,snd_timer
r8169                  34761  0 
ide_cd                 42593  2 
sg                     38621  0 
uhci_hcd               28109  0 
serio_raw              11461  0 
ehci_hcd               36173  0 
soundcore              14241  1 snd
i2c_i801               11981  0 
i2c_core               25793  25 i2c_ec,wm8775,cx25840,tuner,cx88_dvb,nvidia,cx88xx,ivtv,b2c2_flexcop,or51132,mt352,i2c_algo_bit,mt312,bcm3510,dvb_pll,tveeprom,stv0299,isl6421,zl10353,stv0297,cx24123,nxt200x,cx22702,lgdt330x,i2c_i801
snd_page_alloc         12168  2 snd_hda_intel,snd_pcm
cdrom                  38881  1 ide_cd
pcspkr                  7489  0 
dm_snapshot            21357  0 
dm_zero                 6337  0 
dm_mirror              32913  0 
dm_mod                 61273  16 dm_snapshot,dm_zero,dm_mirror
raid0                  12225  1 
xfs                   526853  2 
ata_piix               18121  4 
sata_sil               15945  0 
libata                103001  2 ata_piix,sata_sil
sd_mod                 24897  16 
scsi_mod              138601  3 sg,libata,sd_mod

[root@chef ~]# lspci
00:00.0 Host bridge: Intel Corporation 945G/P Memory Controller Hub (rev 81)
00:01.0 PCI bridge: Intel Corporation 945G/P PCI Express Graphics Port (rev 81)
00:1b.0 Audio device: Intel Corporation 82801G (ICH7 Family) High Definition Audio Controller (rev 01)
00:1d.0 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI #1 (rev 01)
00:1d.1 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI #2 (rev 01)
00:1d.2 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI #3 (rev 01)
00:1d.3 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI #4 (rev 01)
00:1d.7 USB Controller: Intel Corporation 82801G (ICH7 Family) USB2 EHCI Controller (rev 01)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev e1)
00:1f.0 ISA bridge: Intel Corporation 82801GB/GR (ICH7 Family) LPC Interface Bridge (rev 01)
00:1f.1 IDE interface: Intel Corporation 82801G (ICH7 Family) IDE Controller (rev 01)
00:1f.2 IDE interface: Intel Corporation 82801GB/GR/GH (ICH7 Family) Serial ATA Storage Controllers cc=IDE (rev 01)
00:1f.3 SMBus: Intel Corporation 82801G (ICH7 Family) SMBus Controller (rev 01)
01:00.0 VGA compatible controller: nVidia Corporation GeForce 6200 TurboCache(TM) (rev a1)
02:01.0 Multimedia video controller: Conexant CX23880/1/2/3 PCI Video and Audio Decoder (rev 05)
02:01.2 Multimedia controller: Conexant CX23880/1/2/3 PCI Video and Audio Decoder [MPEG Port] (rev 05)
02:02.0 Multimedia video controller: Internext Compression Inc iTVC16 (CX23416) MPEG-2 Encoder (rev 01)
02:03.0 Network controller: Techsan Electronics Co Ltd B2C2 FlexCopII DVB chip / Technisat SkyStar2 DVB card (rev 02)
02:05.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8169 Gigabit Ethernet (rev 10)
Comment 26 Dave Jones 2006-11-20 19:26:09 EST
You have a mixture of proprietary and out-of-tree modules loaded, which
complicates the situation.  There's nothing that can be fixed in the Fedora
kernel related to these, and we can't rule out that they're involved.

In the absence of follow-up from the original reporter, I'm closing this out.
If you can reproduce it on a current kernel using just Fedora kernel modules,
please open up a new bug.
Comment 27 Stuart Midgley 2006-11-20 21:12:32 EST
As the original reporter of the bug, I have not had a problem with recent kernels.  I have an ever greater number of disks attached and hit the server harder than when I reported the bug, and it appears stable.

That doesn't say, though, that there isn't a bug :)
