From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.2) Gecko/20040809

Description of problem:
Since 1.515 (but not 1.509), nfsd has crashed as follows. Usage scenario is: NFS clients (running 1.515) mount (autofs-mounted /net tree) the filesystem referenced in the up2date sources file with a yum file: URL, and try to grab today's rawhide updates from it. Right after the mountd message is logged, nfsd oopses, and clients hang. The server remains up, and rebooting it prints lots of messages about being unable to umount the filesystems that had been exported and mounted by other hosts, but it eventually gives up after a few umount retries. Same problem after upgrading the server to 1.517. Installing FC2's 1.494.2.2 and rebooting enables clients to work again.

1.515 Oops:

Unable to handle kernel NULL pointer dereference at virtual address 00000000
 printing eip:
02134be0
*pde = 00000000
Oops: 0000 [#1]
Modules linked in: loop snd_pcm_oss snd_mixer_oss snd_via82xx snd_ac97_codec snd_pcm snd_timer snd_page_alloc gameport snd_mpu401_uart snd_rawmidi snd_seq_device snd soundcore tun nfsd exportfs lockd usbserial parport_pc lp parport autofs4 sunrpc 8139too mii ipt_REJECT ipt_LOG ipt_state iptable_filter iptable_nat ip_conntrack iptable_mangle ip_tables floppy uhci_hcd button battery asus_acpi ac md5 ipv6 ext3 jbd raid5 xor raid1 dm_mod usb_storage sbp2 ohci1394 ieee1394 sd_mod scsi_mod
CPU: 0
EIP: 0060:[<02134be0>] Not tainted
EFLAGS: 00010246 (2.6.7-1.515)
EIP is at page_address+0x6/0x5f
eax: 00000000 ebx: 00000000 ecx: 0000000a edx: 39ebee00
esi: 377d908c edi: 00000009 ebp: 377dd800 esp: 377dbf1c
ds: 007b es: 007b ss: 0068
Process nfsd (pid: 5324, threadinfo=377db000 task=370014c0)
Stack: ffff75a0 377d908c 00000009 377dd800 42c2b1df 00000001 02280001 00100100
 00200200 01ddb34b 42c3000a 00000006 00000003 00000006 39ebee00 39ebee00
 42c2b0bc 42c3fd58 42c3fb18 42c214dd 09619018 39ebee64 39ebee00 42c3fd58
Call Trace:
 [<42c2b1df>] nfs3svc_decode_readargs+0x123/0x169 [nfsd]
 [<02280001>] fn_hash_dump+0x8/0x15d
 [<42c3000a>] nfsd4_decode_open_confirm+0x89/0x8e [nfsd]
 [<42c2b0bc>] nfs3svc_decode_readargs+0x0/0x169 [nfsd]
 [<42c214dd>] nfsd_dispatch+0x6a/0x15d [nfsd]
 [<42bb7b3a>] svc_process+0x32b/0x569 [sunrpc]
 [<42c2133a>] nfsd+0x18e/0x2c7 [nfsd]
 [<42c211ac>] nfsd+0x0/0x2c7 [nfsd]
 [<021031d9>] kernel_thread_helper+0x5/0xb
Code: 8b 00 f6 c4 01 75 19 2b 1d 30 dd 37 02 c1 fb 05 c1 e3 0c 8d

1.517 Oops:

Unable to handle kernel NULL pointer dereference at virtual address 00000000
 printing eip:
02134be8
*pde = 00000000
Oops: 0000 [#1]
Modules linked in: tun nfsd exportfs lockd usbserial parport_pc lp parport autofs4 sunrpc 8139too mii ipt_REJECT ipt_LOG ipt_state iptable_filter iptable_nat ip_conntrack iptable_mangle ip_tables floppy uhci_hcd button battery asus_acpi ac md5 ipv6 ext3 jbd raid5 xor raid1 dm_mod usb_storage sbp2 ohci1394 ieee1394 sd_mod scsi_mod
CPU: 0
EIP: 0060:[<02134be8>] Not tainted
EFLAGS: 00010246 (2.6.7-1.517)
EIP is at page_address+0x6/0x5f
eax: 00000000 ebx: 00000000 ecx: 0000000a edx: 39eb3400
esi: 376d508c edi: 00000009 ebp: 376d4800 esp: 37678f1c
ds: 007b es: 007b ss: 0068
Process nfsd (pid: 5021, threadinfo=37678000 task=3766c410)
Stack: ffff75a0 376d508c 00000009 376d4800 42acb1df 00000001 02280001 00100100
 00200200 0034c95c 42ad000a 00000006 00000003 00000006 39eb3400 39eb3400
 42acb0bc 42adfd58 42adfb18 42ac14dd 3e03f018 39eb3464 39eb3400 42adfd58
Call Trace:
 [<42acb1df>] nfs3svc_decode_readargs+0x123/0x169 [nfsd]
 [<02280001>] fn_hash_delete+0x13f/0x250
 [<42ad000a>] nfsd4_decode_open_confirm+0x89/0x8e [nfsd]
 [<42acb0bc>] nfs3svc_decode_readargs+0x0/0x169 [nfsd]
 [<42ac14dd>] nfsd_dispatch+0x6a/0x15d [nfsd]
 [<42a57b3a>] svc_process+0x32b/0x569 [sunrpc]
 [<42ac133a>] nfsd+0x18e/0x2c7 [nfsd]
 [<42ac11ac>] nfsd+0x0/0x2c7 [nfsd]
 [<021031d9>] kernel_thread_helper+0x5/0xb
Code: 8b 00 f6 c4 01 75 19 2b 1d 30 dd 37 02 c1 fb 05 c1 e3 0c 8d

Version-Release number of selected component (if applicable):
kernel-2.6.7-1.515 and 1.517, but not 1.509 or earlier

How reproducible:
Always

Steps to Reproduce:
1. Try to up2date a rawhide box from a yum repo mounted over NFS from another rawhide box.

Actual Results: nfsd oopses on the server.

Expected Results: normal operation, just like on 1.509 and before.

Additional info:
2.6.8-1.520 doesn't have the problem AFAICT.
OTOH, nfs clients experience this kind of error:

Unable to handle kernel NULL pointer dereference at virtual address 00000014
 printing eip:
42af4b71
*pde = 00000000
Oops: 0002 [#1]
Modules linked in: nfs tun nfsd exportfs lockd md5 ipv6 parport_pc lp parport autofs4 rfcomm l2cap bluetooth sunrpc iptable_filter 8139too mii iptable_nat ip_conntrack iptable_mangle ip_tables sg scsi_mod uhci_hcd button battery asus_acpi ac ext3 jbd raid1 dm_mod
CPU: 0
EIP: 0060:[<42af4b71>] Not tainted
EFLAGS: 00010246 (2.6.8-1.520)
EIP is at nfs3_request_init+0xb/0x13 [nfs]
eax: 00000000 ebx: 2f9ce7e0 ecx: 04b8a73c edx: 22263560
esi: 24850c80 edi: 00000000 ebp: 034a1de0 esp: 24850c50
ds: 007b es: 007b ss: 0068
Process emacs (pid: 25859, threadinfo=24850000 task=3f0d78d0)
Stack: 24850c68 42aed61e 2f9ce7e0 27775600 04b8a73c 22263560 1d244b3c 00000000
 0000000a 42b04283 00000000 00000000 42af3228 00000004 04b8a73c 24850ddc
 034a1de0 000002c1 42aefd45 00000000 000002c1 00000000 0213da36 24850d30
Call Trace:
 [<42aed61e>] nfs_create_request+0x106/0x113 [nfs]
 [<42af3228>] nfs_sync_inode+0x4c/0x57 [nfs]
 [<42aefd45>] readpage_async_filler+0x54/0xfe [nfs]
 [<0213da36>] add_to_page_cache+0x9f/0x12f
 [<42aefcf1>] readpage_async_filler+0x0/0xfe [nfs]
 [<02144df6>] read_cache_pages+0x6e/0xdf
 [<0211be05>] autoremove_wake_function+0x0/0x2d
 [<42a378a1>] rpc_call_sync+0x7a/0x87 [sunrpc]
 [<42aefe5e>] nfs_readpages+0x6f/0x91 [nfs]
 [<02144e9a>] read_pages+0x33/0xdd
 [<021420c4>] buffered_rmqueue+0x1e9/0x20c
 [<0214239b>] __alloc_pages+0x2b4/0x2be
 [<02145500>] do_page_cache_readahead+0x29f/0x2bf
 [<02145663>] page_cache_readahead+0x143/0x1b0
 [<0213e516>] do_generic_mapping_read+0x94/0x305
 [<0213e9e5>] __generic_file_aio_read+0x15d/0x177
 [<0213e787>] file_read_actor+0x0/0x101
 [<0213ea3f>] generic_file_aio_read+0x40/0x47
 [<42ae9349>] nfs_file_read+0xcc/0xd6 [nfs]
 [<022f5b6a>] __cond_resched+0x14/0x3b
 [<02160992>] do_sync_read+0x6a/0x99
 [<0215fb5e>] filp_open+0x36/0x3c
 [<02160a79>] vfs_read+0xb8/0xe4
 [<02160c5e>] sys_read+0x3c/0x62
Code: ff 40 14 89 43 18 5b c3 56 31 f6 53 89 c3 8b 40 0c 39 d0 75
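For anyone comparing the call traces in this report side by side, here is a rough Python sketch that pulls the individual frames out of pasted oops text. It is not part of any kernel tooling; the names `FRAME_RE` and `parse_frames` are made up for illustration, and the pattern assumes the 2.6-era i386 frame layout seen in this report (`[<address>] symbol+0xoffset/0xsize [module]`).

```python
import re

# One frame looks like: [<42c214dd>] nfsd_dispatch+0x6a/0x15d [nfsd]
# (hypothetical helper; the layout is assumed from the oopses in this report)
FRAME_RE = re.compile(
    r"\[<([0-9a-f]{8})>\]\s+"          # address inside the symbol
    r"([A-Za-z_]\w*)"                  # symbol name
    r"\+0x([0-9a-f]+)/0x([0-9a-f]+)"   # offset into symbol / symbol size
    r"(?:\s+\[(\w+)\])?"               # optional module name
)


def parse_frames(oops_text):
    """Return one dict per call-trace frame found in oops_text."""
    frames = []
    for addr, symbol, offset, size, module in FRAME_RE.findall(oops_text):
        frames.append({
            "address": int(addr, 16),
            "symbol": symbol,
            "offset": int(offset, 16),
            "size": int(size, 16),
            # findall yields "" when the optional module group is absent
            "module": module or None,
        })
    return frames


if __name__ == "__main__":
    sample = (
        "Call Trace: [<42c2b1df>] nfs3svc_decode_readargs+0x123/0x169 [nfsd] "
        "[<02280001>] fn_hash_dump+0x8/0x15d "
        "[<42c3000a>] nfsd4_decode_open_confirm+0x89/0x8e [nfsd]"
    )
    for frame in parse_frames(sample):
        print(frame["symbol"], hex(frame["offset"]), frame["module"])
```

Frames that come back with `module` set to `None` are symbols from the core kernel image rather than a loadable module. The pattern tolerates frames wrapped across lines, because `\s+` also matches newlines.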
Fixed in 2.6.8-1.521 in FC2 testing updates. As soon as 2.6.8.1 makes it to FCdevel, this can be closed/rawhide.
We now have 2.6.9rc, so this should be closed.