Bug 129835 - nfsd oopses right after mountd authenticates mount request
Summary: nfsd oopses right after mountd authenticates mount request
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: i686
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Dave Jones
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks: FC3Target FC4Target
Reported: 2004-08-13 04:07 UTC by Alexandre Oliva
Modified: 2015-01-04 22:08 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2005-03-11 17:57:23 UTC
Type: ---
Embargoed:


Description Alexandre Oliva 2004-08-13 04:07:53 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.2) Gecko/20040809

Description of problem:
Starting with kernel 2.6.7-1.515 (1.509 is not affected), nfsd crashes
with the oopses quoted below.

Usage scenario: NFS clients (running 1.515) mount, through the
autofs-mounted /net tree, the filesystem referenced in the up2date
sources file with a yum file: URL, and try to grab today's rawhide
updates from it. Right after the mountd message is logged, nfsd oopses
and the clients hang. The server remains up. Rebooting it prints lots
of messages about being unable to umount the filesystems that had been
exported and mounted by other hosts, but it eventually gives up after
a few umount retries.

The same problem occurs after upgrading the server to 1.517.
Installing FC2's 1.494.2.2 and rebooting enables clients to work again.

1.515 Oops:
Unable to handle kernel NULL pointer dereference at virtual address
00000000
 printing eip:
02134be0
*pde = 00000000
Oops: 0000 [#1]
Modules linked in: loop snd_pcm_oss snd_mixer_oss snd_via82xx
snd_ac97_codec snd_pcm snd_timer snd_page_alloc gameport
snd_mpu401_uart snd_rawmidi snd_seq_device snd soundcore tun nfsd
exportfs lockd usbserial parport_pc lp parport autofs4 sunrpc 8139too
mii ipt_REJECT ipt_LOG ipt_state iptable_filter iptable_nat
ip_conntrack iptable_mangle ip_tables floppy uhci_hcd button battery
asus_acpi ac md5 ipv6 ext3 jbd raid5 xor raid1 dm_mod usb_storage sbp2
ohci1394 ieee1394 sd_mod scsi_mod
CPU:    0
EIP:    0060:[<02134be0>]    Not tainted
EFLAGS: 00010246   (2.6.7-1.515) 
EIP is at page_address+0x6/0x5f
eax: 00000000   ebx: 00000000   ecx: 0000000a   edx: 39ebee00
esi: 377d908c   edi: 00000009   ebp: 377dd800   esp: 377dbf1c
ds: 007b   es: 007b   ss: 0068
Process nfsd (pid: 5324, threadinfo=377db000 task=370014c0)
Stack: ffff75a0 377d908c 00000009 377dd800 42c2b1df 00000001 02280001
00100100 
       00200200 01ddb34b 42c3000a 00000006 00000003 00000006 39ebee00
39ebee00 
       42c2b0bc 42c3fd58 42c3fb18 42c214dd 09619018 39ebee64 39ebee00
42c3fd58 
Call Trace:
 [<42c2b1df>] nfs3svc_decode_readargs+0x123/0x169 [nfsd]
 [<02280001>] fn_hash_dump+0x8/0x15d
 [<42c3000a>] nfsd4_decode_open_confirm+0x89/0x8e [nfsd]
 [<42c2b0bc>] nfs3svc_decode_readargs+0x0/0x169 [nfsd]
 [<42c214dd>] nfsd_dispatch+0x6a/0x15d [nfsd]
 [<42bb7b3a>] svc_process+0x32b/0x569 [sunrpc]
 [<42c2133a>] nfsd+0x18e/0x2c7 [nfsd]
 [<42c211ac>] nfsd+0x0/0x2c7 [nfsd]
 [<021031d9>] kernel_thread_helper+0x5/0xb
Code: 8b 00 f6 c4 01 75 19 2b 1d 30 dd 37 02 c1 fb 05 c1 e3 0c 8d 

1.517 Oops:
Unable to handle kernel NULL pointer dereference at virtual address
00000000
 printing eip:
02134be8
*pde = 00000000
Oops: 0000 [#1]
Modules linked in: tun nfsd exportfs lockd usbserial parport_pc lp
parport autofs4 sunrpc 8139too mii ipt_REJECT ipt_LOG ipt_state
iptable_filter iptable_nat ip_conntrack iptable_mangle ip_tables
floppy uhci_hcd button battery asus_acpi ac md5 ipv6 ext3 jbd raid5
xor raid1 dm_mod usb_storage sbp2 ohci1394 ieee1394 sd_mod scsi_mod
CPU:    0
EIP:    0060:[<02134be8>]    Not tainted
EFLAGS: 00010246   (2.6.7-1.517) 
EIP is at page_address+0x6/0x5f
eax: 00000000   ebx: 00000000   ecx: 0000000a   edx: 39eb3400
esi: 376d508c   edi: 00000009   ebp: 376d4800   esp: 37678f1c
ds: 007b   es: 007b   ss: 0068
Process nfsd (pid: 5021, threadinfo=37678000 task=3766c410)
Stack: ffff75a0 376d508c 00000009 376d4800 42acb1df 00000001 02280001
00100100 
       00200200 0034c95c 42ad000a 00000006 00000003 00000006 39eb3400
39eb3400 
       42acb0bc 42adfd58 42adfb18 42ac14dd 3e03f018 39eb3464 39eb3400
42adfd58 
Call Trace:
 [<42acb1df>] nfs3svc_decode_readargs+0x123/0x169 [nfsd]
 [<02280001>] fn_hash_delete+0x13f/0x250
 [<42ad000a>] nfsd4_decode_open_confirm+0x89/0x8e [nfsd]
 [<42acb0bc>] nfs3svc_decode_readargs+0x0/0x169 [nfsd]
 [<42ac14dd>] nfsd_dispatch+0x6a/0x15d [nfsd]
 [<42a57b3a>] svc_process+0x32b/0x569 [sunrpc]
 [<42ac133a>] nfsd+0x18e/0x2c7 [nfsd]
 [<42ac11ac>] nfsd+0x0/0x2c7 [nfsd]
 [<021031d9>] kernel_thread_helper+0x5/0xb
Code: 8b 00 f6 c4 01 75 19 2b 1d 30 dd 37 02 c1 fb 05 c1 e3 0c 8d 
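
The two server-side traces above can be compared mechanically. The following is a minimal sketch, not part of the original report: it assumes the `[<addr>] symbol+0xoff/0xsize [module]` line format seen in these oopses, and `parse_trace` and `common_frames` are hypothetical helper names introduced only for illustration.

```python
import re

# Matches call-trace lines such as:
#  [<42c2b1df>] nfs3svc_decode_readargs+0x123/0x169 [nfsd]
# The trailing "[module]" part is optional (absent for built-in symbols).
TRACE_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_trace(text):
    """Return a (symbol, offset, module) tuple for each call-trace line."""
    frames = []
    for line in text.splitlines():
        m = TRACE_RE.search(line)
        if m:
            frames.append((m["symbol"], int(m["offset"], 16), m["module"]))
    return frames


def common_frames(a, b):
    """Frames of trace a that also occur in trace b, in a's order."""
    other = set(b)
    return [frame for frame in a if frame in other]


# Call traces copied from the 1.515 and 1.517 oopses in this report.
OOPS_515 = """\
 [<42c2b1df>] nfs3svc_decode_readargs+0x123/0x169 [nfsd]
 [<02280001>] fn_hash_dump+0x8/0x15d
 [<42c3000a>] nfsd4_decode_open_confirm+0x89/0x8e [nfsd]
 [<42c2b0bc>] nfs3svc_decode_readargs+0x0/0x169 [nfsd]
 [<42c214dd>] nfsd_dispatch+0x6a/0x15d [nfsd]
 [<42bb7b3a>] svc_process+0x32b/0x569 [sunrpc]
 [<42c2133a>] nfsd+0x18e/0x2c7 [nfsd]
 [<42c211ac>] nfsd+0x0/0x2c7 [nfsd]
 [<021031d9>] kernel_thread_helper+0x5/0xb
"""

OOPS_517 = """\
 [<42acb1df>] nfs3svc_decode_readargs+0x123/0x169 [nfsd]
 [<02280001>] fn_hash_delete+0x13f/0x250
 [<42ad000a>] nfsd4_decode_open_confirm+0x89/0x8e [nfsd]
 [<42acb0bc>] nfs3svc_decode_readargs+0x0/0x169 [nfsd]
 [<42ac14dd>] nfsd_dispatch+0x6a/0x15d [nfsd]
 [<42a57b3a>] svc_process+0x32b/0x569 [sunrpc]
 [<42ac133a>] nfsd+0x18e/0x2c7 [nfsd]
 [<42ac11ac>] nfsd+0x0/0x2c7 [nfsd]
 [<021031d9>] kernel_thread_helper+0x5/0xb
"""

if __name__ == "__main__":
    shared = common_frames(parse_trace(OOPS_515), parse_trace(OOPS_517))
    for symbol, offset, module in shared:
        print(f"{symbol}+{offset:#x} [{module or 'kernel'}]")
```

Comparing by symbol and offset rather than by raw address ignores the different module load addresses in the two boots. The only frame that differs is the one at 02280001, which resolves to `fn_hash_dump` in 1.515 and `fn_hash_delete` in 1.517. The same value also appears in both stack dumps, so it may be a leftover stack word rather than a real caller.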


Version-Release number of selected component (if applicable):
kernel-2.6.7-1.515 and 1.517, but not 1.509 or earlier

How reproducible:
Always

Steps to Reproduce:
1. Try to up2date a rawhide box from a yum repo mounted over NFS from
another rawhide box.

Actual Results:  nfsd oopses on the server.

Expected Results:  normal operation, just like on 1.509 and before.

Additional info:

Comment 1 Alexandre Oliva 2004-08-15 18:11:38 UTC
2.6.8-1.520 doesn't have the problem AFAICT.

Comment 2 Alexandre Oliva 2004-08-16 00:37:42 UTC
OTOH, NFS clients running 2.6.8-1.520 hit this kind of oops:

Unable to handle kernel NULL pointer dereference at virtual address
00000014
 printing eip:
42af4b71
*pde = 00000000
Oops: 0002 [#1]
Modules linked in: nfs tun nfsd exportfs lockd md5 ipv6 parport_pc lp
parport autofs4 rfcomm l2cap bluetooth sunrpc iptable_filter 8139too
mii iptable_nat ip_conntrack iptable_mangle ip_tables sg scsi_mod
uhci_hcd button battery asus_acpi ac ext3 jbd raid1 dm_mod
CPU:    0
EIP:    0060:[<42af4b71>]    Not tainted
EFLAGS: 00010246   (2.6.8-1.520) 
EIP is at nfs3_request_init+0xb/0x13 [nfs]
eax: 00000000   ebx: 2f9ce7e0   ecx: 04b8a73c   edx: 22263560
esi: 24850c80   edi: 00000000   ebp: 034a1de0   esp: 24850c50
ds: 007b   es: 007b   ss: 0068
Process emacs (pid: 25859, threadinfo=24850000 task=3f0d78d0)
Stack: 24850c68 42aed61e 2f9ce7e0 27775600 04b8a73c 22263560 1d244b3c
00000000 
       0000000a 42b04283 00000000 00000000 42af3228 00000004 04b8a73c
24850ddc 
       034a1de0 000002c1 42aefd45 00000000 000002c1 00000000 0213da36
24850d30 
Call Trace:
 [<42aed61e>] nfs_create_request+0x106/0x113 [nfs]
 [<42af3228>] nfs_sync_inode+0x4c/0x57 [nfs]
 [<42aefd45>] readpage_async_filler+0x54/0xfe [nfs]
 [<0213da36>] add_to_page_cache+0x9f/0x12f
 [<42aefcf1>] readpage_async_filler+0x0/0xfe [nfs]
 [<02144df6>] read_cache_pages+0x6e/0xdf
 [<0211be05>] autoremove_wake_function+0x0/0x2d
 [<42a378a1>] rpc_call_sync+0x7a/0x87 [sunrpc]
 [<42aefe5e>] nfs_readpages+0x6f/0x91 [nfs]
 [<02144e9a>] read_pages+0x33/0xdd
 [<021420c4>] buffered_rmqueue+0x1e9/0x20c
 [<0214239b>] __alloc_pages+0x2b4/0x2be
 [<02145500>] do_page_cache_readahead+0x29f/0x2bf
 [<02145663>] page_cache_readahead+0x143/0x1b0
 [<0213e516>] do_generic_mapping_read+0x94/0x305
 [<0213e9e5>] __generic_file_aio_read+0x15d/0x177
 [<0213e787>] file_read_actor+0x0/0x101
 [<0213ea3f>] generic_file_aio_read+0x40/0x47
 [<42ae9349>] nfs_file_read+0xcc/0xd6 [nfs]
 [<022f5b6a>] __cond_resched+0x14/0x3b
 [<02160992>] do_sync_read+0x6a/0x99
 [<0215fb5e>] filp_open+0x36/0x3c
 [<02160a79>] vfs_read+0xb8/0xe4
 [<02160c5e>] sys_read+0x3c/0x62
Code: ff 40 14 89 43 18 5b c3 56 31 f6 53 89 c3 8b 40 0c 39 d0 75 


Comment 3 Alexandre Oliva 2004-08-17 13:08:57 UTC
Fixed in 2.6.8-1.521 in FC2 testing updates.  As soon as 2.6.8.1 makes
it to FCdevel, this can be closed/rawhide.

Comment 4 Ronny Buchmann 2004-10-19 20:38:41 UTC
We now have 2.6.9rc, so this should be closed.
