Description of Problem:
I have a vfat-formatted HD in my RH 7.2 NFS server which is being exported to the client. When I access the files on the HD (e.g. xmms playing MP3s) the server oopses after a short time and NFS locks up.

Version-Release number of selected component (if applicable):
RH 7.2, kernel kernel-2.4.9-13 (i586)

How Reproducible:
Statistically, within seconds usually

Steps to Reproduce:
1. mount vfat HD locally
2. export files from that HD via NFS, no special options
3. access the files remotely via NFS

Actual Results:
Nov 12 01:26:24 meep kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000008
Nov 12 01:26:24 meep kernel: printing eip:
Nov 12 01:26:24 meep kernel: c888733c
Nov 12 01:26:24 meep kernel: *pde = 00000000
Nov 12 01:26:24 meep kernel: Oops: 0000
Nov 12 01:26:24 meep kernel: CPU: 0
Nov 12 01:26:24 meep kernel: EIP: 0010:[ext3:__insmod_ext3_S.bss_L1024+447388/115588520] Not tainted
Nov 12 01:26:24 meep kernel: EIP: 0010:[<c888733c>] Not tainted
Nov 12 01:26:24 meep kernel: EFLAGS: 00010217
Nov 12 01:26:24 meep kernel: eax: 00000000 ebx: 00000000 ecx: 00000005 edx: 00000003
Nov 12 01:26:24 meep kernel: esi: c5a30800 edi: 00000000 ebp: c729f3c0 esp: c7875e10
Nov 12 01:26:24 meep kernel: ds: 0018 es: 0018 ss: 0018
Nov 12 01:26:24 meep kernel: Process nfsd (pid: 1203, stackpage=c7875000)
Nov 12 01:26:24 meep kernel: Stack: c5911200 00000005 c6fa6800 c6fa6c04 c619f800 c888773a c5a30800 c6fa6c14
Nov 12 01:26:24 meep kernel: 00000005 00000003 00000001 c1718580 c6fa6c14 11270000 c6fa6c04 00000000
Nov 12 01:26:24 meep kernel: 00000000 00000000 00000000 c6fa6c04 c7875ec8 06202000 c6fa6c04 c8888ac7
Nov 12 01:26:24 meep kernel: Call Trace: [ext3:__insmod_ext3_S.bss_L1024+448410/115587498] __insmod_nfsd_S.text_L52016 [nfsd] 0x26da
Nov 12 01:26:24 meep kernel: Call Trace: [<c888773a>] __insmod_nfsd_S.text_L52016 [nfsd] 0x26da
Nov 12 01:26:24 meep kernel: [ext3:__insmod_ext3_S.bss_L1024+453415/115582493] __insmod_nfsd_S.text_L52016 [nfsd] 0x3a67
Nov 12 01:26:24 meep kernel: [<c8888ac7>] __insmod_nfsd_S.text_L52016 [nfsd] 0x3a67
Nov 12 01:26:24 meep kernel: [ext3:__insmod_ext3_S.bss_L1024+454158/115581750] __insmod_nfsd_S.text_L52016 [nfsd] 0x3d4e
Nov 12 01:26:24 meep kernel: [<c8888dae>] __insmod_nfsd_S.text_L52016 [nfsd] 0x3d4e
Nov 12 01:26:24 meep kernel: [ext3:__insmod_ext3_S.bss_L1024+611776/115424132] __insmod_fat_S.data_L704 [fat] 0x80
Nov 12 01:26:24 meep kernel: [<c88af560>] __insmod_fat_S.data_L704 [fat] 0x80
Nov 12 01:26:24 meep kernel: [ext3:__insmod_ext3_S.bss_L1024+335000/115700908] svc_wake_up_Ra6e90865 [sunrpc] 0xf8
Nov 12 01:26:24 meep kernel: [<c886bc38>] svc_wake_up_Ra6e90865 [sunrpc] 0xf8
Nov 12 01:26:24 meep kernel: [schedule+622/960] schedule [kernel] 0x26e
Nov 12 01:26:24 meep kernel: [<c01130fe>] schedule [kernel] 0x26e
Nov 12 01:26:24 meep kernel: [ext3:__insmod_ext3_S.bss_L1024+475370/115560538] __insmod_nfsd_S.text_L52016 [nfsd] 0x902a
Nov 12 01:26:24 meep kernel: [<c888e08a>] __insmod_nfsd_S.text_L52016 [nfsd] 0x902a
Nov 12 01:26:24 meep kernel: [ext3:__insmod_ext3_S.bss_L1024+507168/115528740] __insmod_nfsd_S.data_L2208 [nfsd] 0x660
Nov 12 01:26:24 meep kernel: [<c8895cc0>] __insmod_nfsd_S.data_L2208 [nfsd] 0x660
Nov 12 01:26:24 meep kernel: [ext3:__insmod_ext3_S.bss_L1024+439809/115596099] __insmod_nfsd_S.text_L52016 [nfsd] 0x541
Nov 12 01:26:24 meep kernel: [<c88855a1>] __insmod_nfsd_S.text_L52016 [nfsd] 0x541
Nov 12 01:26:24 meep kernel: [ext3:__insmod_ext3_S.bss_L1024+507168/115528740] __insmod_nfsd_S.data_L2208 [nfsd] 0x660
Nov 12 01:26:24 meep kernel: [<c8895cc0>] __insmod_nfsd_S.data_L2208 [nfsd] 0x660
Nov 12 01:26:24 meep kernel: [ext3:__insmod_ext3_S.bss_L1024+333939/115701969] svc_process_R1a9ff20a [sunrpc] 0x3d3
Nov 12 01:26:24 meep kernel: [<c886b813>] svc_process_R1a9ff20a [sunrpc] 0x3d3
Nov 12 01:26:24 meep kernel: [ext3:__insmod_ext3_S.bss_L1024+505560/115530348] __insmod_nfsd_S.data_L2208 [nfsd] 0x18
Nov 12 01:26:24 meep kernel: [<c8895678>] __insmod_nfsd_S.data_L2208 [nfsd] 0x18
Nov 12 01:26:24 meep kernel: [ext3:__insmod_ext3_S.bss_L1024+505592/115530316] __insmod_nfsd_S.data_L2208 [nfsd] 0x38
Nov 12 01:26:24 meep kernel: [<c8895698>] __insmod_nfsd_S.data_L2208 [nfsd] 0x38
Nov 12 01:26:24 meep kernel: [ext3:__insmod_ext3_S.bss_L1024+439253/115596655] __insmod_nfsd_S.text_L52016 [nfsd] 0x315
Nov 12 01:26:24 meep kernel: [<c8885375>] __insmod_nfsd_S.text_L52016 [nfsd] 0x315
Nov 12 01:26:24 meep kernel: [kernel_thread+38/48] kernel_thread [kernel] 0x26
Nov 12 01:26:24 meep kernel: [<c0105716>] kernel_thread [kernel] 0x26
Nov 12 01:26:24 meep kernel: [ext3:__insmod_ext3_S.bss_L1024+438768/115597140] __insmod_nfsd_S.text_L52016 [nfsd] 0x130
Nov 12 01:26:24 meep kernel: [<c8885190>] __insmod_nfsd_S.text_L52016 [nfsd] 0x130
Nov 12 01:26:24 meep kernel:
Nov 12 01:26:24 meep kernel:
Nov 12 01:26:24 meep kernel: Code: 8b 47 08 bb 8c ff ff ff 85 c0 0f 84 6e 01 00 00 66 8b 40 2a

Expected Results:
no oops ;)

Additional Information:
Accessing files on locally mounted ext3 partitions via NFS works fine (everything else on the machine is ext3). I have no idea though why the oops talks about ext3; the problem has definitely been triggered by accessing files on a vfat partition.
The ext3 bits are just red herrings: klogd has looked for all the loadable modules it knew about *when it was loaded* to decode the addresses. ext3 was loaded when klogd first ran, but nfsd was not, so klogd has tried to interpret all the nfsd addresses relative to the start of the ext3 module (which was the nearest one in memory). The lines with ext3 in them are followed by the token [nfsd], which is the kernel oops printer indicating that the address really is inside the nfsd module. You can do a "klogd -i" to force klogd to reread its module map once nfsd has been reloaded to avoid this, but that's just cosmetic: the oops as it stands still points quite clearly to nfsd in this case.
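To make the mislabelling concrete, here is a minimal sketch of nearest-symbol address resolution in Python. It is not klogd's actual code: the function `resolve` and the two maps are hypothetical, and the two base addresses are simply back-computed from the offsets printed in the oops above (c888733c minus 447388 for the ext3 symbol, c888773a minus 0x26da for the nfsd text section).

```python
import bisect

def resolve(addr, symbol_map):
    """Return (name, offset) for the nearest known symbol at or below addr."""
    entries = sorted(symbol_map)              # (start_address, name) pairs
    starts = [start for start, _ in entries]
    i = bisect.bisect_right(starts, addr) - 1
    if i < 0:
        return None                           # address lies below every known symbol
    start, name = entries[i]
    return name, addr - start

# Values taken from the oops in the original report.
EIP = 0xc888733c
EXT3_BSS = 0xc888733c - 447388    # base implied by "+447388" in the EIP line
NFSD_TEXT = 0xc888773a - 0x26da   # base implied by "[nfsd] 0x26da" in the call trace

# Assumption: the stale map knows ext3 but not nfsd; the fresh one knows both.
stale_map = [(EXT3_BSS, "__insmod_ext3_S.bss_L1024")]
fresh_map = stale_map + [(NFSD_TEXT, "__insmod_nfsd_S.text_L52016")]

print(resolve(EIP, stale_map))  # ('__insmod_ext3_S.bss_L1024', 447388)
print(resolve(EIP, fresh_map))  # ('__insmod_nfsd_S.text_L52016', 8924)
```

With the stale map the faulting address is reported as a huge offset from an ext3 symbol; once the nfsd entry is present, the same address resolves to a small offset (0x22dc) inside nfsd's text section.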
I've encountered the same issue. The following is the output from ksymoops ('klogd -i' was run after nfs was started). The behavior has been 100% reproducible.

Linux thunder 2.4.9-31 #1 Tue Feb 26 06:23:51 EST 2002 i686 unknown

Error (expand_objects): cannot stat(/lib/ext3.o) for ext3
ksymoops: No such file or directory
Error (expand_objects): cannot stat(/lib/jbd.o) for jbd
ksymoops: No such file or directory
Warning (compare_maps): ksyms_base symbol GPLONLY_IO_APIC_get_PCI_irq_vector not found in System.map. Ignoring ksyms_base entry
Warning (compare_maps): mismatch on symbol partition_name , ksyms_base says c01b5fc0, System.map says c0156b50. Ignoring ksyms_base entry
Warning (compare_maps): mismatch on symbol nlmsvc_grace_period , lockd says e097cd34, /lib/modules/2.4.9-31/kernel/fs/lockd/lockd.o says e097c194. Ignoring /lib/modules/2.4.9-31/kernel/fs/lockd/lockd.o entry
Warning (compare_maps): mismatch on symbol nlmsvc_ops , lockd says e097cd30, /lib/modules/2.4.9-31/kernel/fs/lockd/lockd.o says e097c190. Ignoring /lib/modules/2.4.9-31/kernel/fs/lockd/lockd.o entry
Warning (compare_maps): mismatch on symbol nlmsvc_timeout , lockd says e097cd38, /lib/modules/2.4.9-31/kernel/fs/lockd/lockd.o says e097c198. Ignoring /lib/modules/2.4.9-31/kernel/fs/lockd/lockd.o entry
Warning (compare_maps): mismatch on symbol nfs_debug , sunrpc says e096edc0, /lib/modules/2.4.9-31/kernel/net/sunrpc/sunrpc.o says e096eaa0. Ignoring /lib/modules/2.4.9-31/kernel/net/sunrpc/sunrpc.o entry
Warning (compare_maps): mismatch on symbol nfsd_debug , sunrpc says e096edc4, /lib/modules/2.4.9-31/kernel/net/sunrpc/sunrpc.o says e096eaa4. Ignoring /lib/modules/2.4.9-31/kernel/net/sunrpc/sunrpc.o entry
Warning (compare_maps): mismatch on symbol nlm_debug , sunrpc says e096edc8, /lib/modules/2.4.9-31/kernel/net/sunrpc/sunrpc.o says e096eaa8. Ignoring /lib/modules/2.4.9-31/kernel/net/sunrpc/sunrpc.o entry
Warning (compare_maps): mismatch on symbol rpc_debug , sunrpc says e096edbc, /lib/modules/2.4.9-31/kernel/net/sunrpc/sunrpc.o says e096ea9c. Ignoring /lib/modules/2.4.9-31/kernel/net/sunrpc/sunrpc.o entry
Warning (compare_maps): mismatch on symbol rpc_garbage_args , sunrpc says e096ed9c, /lib/modules/2.4.9-31/kernel/net/sunrpc/sunrpc.o says e096ea7c. Ignoring /lib/modules/2.4.9-31/kernel/net/sunrpc/sunrpc.o entry
Warning (compare_maps): mismatch on symbol rpc_success , sunrpc says e096ed8c, /lib/modules/2.4.9-31/kernel/net/sunrpc/sunrpc.o says e096ea6c. Ignoring /lib/modules/2.4.9-31/kernel/net/sunrpc/sunrpc.o entry
Warning (compare_maps): mismatch on symbol rpc_system_err , sunrpc says e096eda0, /lib/modules/2.4.9-31/kernel/net/sunrpc/sunrpc.o says e096ea80. Ignoring /lib/modules/2.4.9-31/kernel/net/sunrpc/sunrpc.o entry
Warning (compare_maps): mismatch on symbol xdr_one , sunrpc says e096ed84, /lib/modules/2.4.9-31/kernel/net/sunrpc/sunrpc.o says e096ea64. Ignoring /lib/modules/2.4.9-31/kernel/net/sunrpc/sunrpc.o entry
Warning (compare_maps): mismatch on symbol xdr_two , sunrpc says e096ed88, /lib/modules/2.4.9-31/kernel/net/sunrpc/sunrpc.o says e096ea68. Ignoring /lib/modules/2.4.9-31/kernel/net/sunrpc/sunrpc.o entry
Warning (compare_maps): mismatch on symbol xdr_zero , sunrpc says e096ed80, /lib/modules/2.4.9-31/kernel/net/sunrpc/sunrpc.o says e096ea60. Ignoring /lib/modules/2.4.9-31/kernel/net/sunrpc/sunrpc.o entry
Warning (map_ksym_to_module): cannot match loaded module ext3 to a unique module object. Trace may not be reliable.

Mar 20 03:36:42 thunder kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000008
Mar 20 03:36:42 thunder kernel: e0981350
Mar 20 03:36:42 thunder kernel: *pde = 00000000
Mar 20 03:36:42 thunder kernel: Oops: 0000
Mar 20 03:36:42 thunder kernel: CPU: 0
Mar 20 03:36:42 thunder kernel: EIP: 0010:[nls_iso8859-1:__insmod_nls_iso8859-1_O/lib/modules/2.4.9-31/kernel/fs/nls+-81072/96] Not tainted
Mar 20 03:36:42 thunder kernel: EIP: 0010:[<e0981350>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
Mar 20 03:36:42 thunder kernel: EFLAGS: 00010217
Mar 20 03:36:42 thunder kernel: eax: 00000000 ebx: 00000000 ecx: 00000005 edx: 00000003
Mar 20 03:36:42 thunder kernel: esi: dcd77000 edi: 00000000 ebp: c1c805c0 esp: dbb97ddc
Mar 20 03:36:42 thunder kernel: ds: 0018 es: 0018 ss: 0018
Mar 20 03:36:42 thunder kernel: ds: 0018 es: 0018 ss: 0018
Mar 20 03:36:42 thunder kernel: Process nfsd (pid: 1678, stackpage=dbb97000)
Mar 20 03:36:42 thunder kernel: Stack: c03669e0 00000005 dbf81a00 dbf80c04 dcfea000 e098174a dcd77000 dbf80c14
Mar 20 03:36:42 thunder kernel: 00000005 00000003 00000001 00000000 dbf80c14 11270000 dbf80c04 dbb97e64
Mar 20 03:36:42 thunder kernel: 00000000 dd620000 00000000 dbf81a00 dbb97ecc 00008000 dbf80c04 e0982ad7
Mar 20 03:36:42 thunder kernel: Call Trace: [nls_iso8859-1:__insmod_nls_iso8859-1_O/lib/modules/2.4.9-31/kernel/fs/nls+-80054/96] __insmod_nfsd_S.text_L52992 [nfsd] 0x26ea
Mar 20 03:36:42 thunder kernel: Call Trace: [<e098174a>] __insmod_nfsd_S.text_L52992 [nfsd] 0x26ea
Mar 20 03:36:42 thunder kernel: [<e0982ad7>] __insmod_nfsd_S.text_L52992 [nfsd] 0x3a77
Mar 20 03:36:42 thunder kernel: [<c01d1e0b>] ip_rcv [kernel] 0x35b
Mar 20 03:36:42 thunder kernel: [<e098300d>] __insmod_nfsd_S.text_L52992 [nfsd] 0x3fad
Mar 20 03:36:42 thunder kernel: [<c01c2e5a>] net_rx_action [kernel] 0x1aa
Mar 20 03:36:42 thunder kernel: [<c010833a>] handle_IRQ_event [kernel] 0x3a
Mar 20 03:36:42 thunder kernel: [<c011ad7b>] do_softirq [kernel] 0x4b
Mar 20 03:36:42 thunder kernel: [<c01084ec>] do_IRQ [kernel] 0x9c
Mar 20 03:36:42 thunder kernel: [<c0212e4c>] call_do_IRQ [kernel] 0x5
Mar 20 03:36:42 thunder kernel: [<e0988260>] __insmod_nfsd_S.text_L52992 [nfsd] 0x9200
Mar 20 03:36:42 thunder kernel: [<e09902c0>] __insmod_nfsd_S.data_L2208 [nfsd] 0x680
Mar 20 03:36:42 thunder kernel: [<e09902c0>] __insmod_nfsd_S.data_L2208 [nfsd] 0x680
Mar 20 03:36:42 thunder kernel: [<e097f571>] __insmod_nfsd_S.text_L52992 [nfsd] 0x511
Mar 20 03:36:42 thunder kernel: [<e0965529>] svc_process_R10a85615 [sunrpc] 0x349
Mar 20 03:36:42 thunder kernel: [<e098fc58>] __insmod_nfsd_S.data_L2208 [nfsd] 0x18
Mar 20 03:36:42 thunder kernel: [<e098fc78>] __insmod_nfsd_S.data_L2208 [nfsd] 0x38
Mar 20 03:36:42 thunder kernel: [<e097f367>] __insmod_nfsd_S.text_L52992 [nfsd] 0x307
Mar 20 03:36:42 thunder kernel: [<c0105746>] kernel_thread [kernel] 0x26
Mar 20 03:36:42 thunder kernel: [<e097f190>] __insmod_nfsd_S.text_L52992 [nfsd] 0x130
Mar 20 03:36:42 thunder kernel: Code: 8b 47 08 bb 8c ff ff ff 85 c0 0f 84 6d 01 00 00 0f b7 40 2a

>>EIP; e0981350 <[nfsd]find_fh_dentry+160/320> <=====
Trace; e098174a <[nfsd]fh_verify+23a/450>
Trace; e0982ad7 <[nfsd]nfsd_open+27/1e0>
Trace; c01d1e0b <ip_rcv+35b/390>
Trace; e098300d <[nfsd]nfsd_write+4d/2d0>
Trace; c01c2e5a <net_rx_action+1aa/270>
Trace; c010833a <handle_IRQ_event+3a/70>
Trace; c011ad7b <do_softirq+4b/90>
Trace; c01084ec <do_IRQ+9c/b0>
Trace; c0212e4c <call_do_IRQ+5/d>
Trace; e0988260 <[nfsd]nfsd3_proc_write+f0/110>
Trace; e09902c0 <[nfsd]nfsd_procedures3+e0/2c0>
Trace; e09902c0 <[nfsd]nfsd_procedures3+e0/2c0>
Trace; e097f571 <[nfsd]nfsd_dispatch+c1/190>
Trace; e0965529 <[sunrpc]svc_process+349/510>
Trace; e098fc58 <[nfsd]nfsd_version3+0/10>
Trace; e098fc78 <[nfsd]nfsd_program+0/18>
Trace; e097f367 <[nfsd]nfsd+1d7/320>
Trace; c0105746 <kernel_thread+26/30>
Trace; e097f190 <[nfsd]nfsd+0/320>
Code; e0981350 <[nfsd]find_fh_dentry+160/320>
00000000 <_EIP>:
Code; e0981350 <[nfsd]find_fh_dentry+160/320> <=====
   0: 8b 47 08 mov 0x8(%edi),%eax <=====
Code; e0981353 <[nfsd]find_fh_dentry+163/320>
   3: bb 8c ff ff ff mov $0xffffff8c,%ebx
Code; e0981358 <[nfsd]find_fh_dentry+168/320>
   8: 85 c0 test %eax,%eax
Code; e098135a <[nfsd]find_fh_dentry+16a/320>
   a: 0f 84 6d 01 00 00 je 17d <_EIP+0x17d> e09814cd <[nfsd]find_fh_dentry+2dd/320>
Code; e0981360 <[nfsd]find_fh_dentry+170/320>
  10: 0f b7 40 2a movzwl 0x2a(%eax),%eax

17 warnings and 2 errors issued. Results may not be reliable.

*waves* Hi, batman! --gleep
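As a small cross-check of the ksymoops output, the faulting instruction is `mov 0x8(%edi),%eax` and the register dump shows edi: 00000000, so the load address should equal the address in the oops header. The Python sketch below only redoes that arithmetic; `effective_address` is a hypothetical helper, not part of any tool used here.

```python
def effective_address(base, displacement):
    """Address touched by a base+displacement operand on 32-bit x86."""
    return (base + displacement) & 0xFFFFFFFF

edi = 0x00000000       # from the register dump in the oops
displacement = 0x8     # from "mov 0x8(%edi),%eax"

fault = effective_address(edi, displacement)
print(f"{fault:08x}")  # 00000008
```

That matches "NULL pointer dereference at virtual address 00000008": the code in find_fh_dentry reads a field at offset 8 through a pointer that is NULL. Which pointer that is cannot be determined from the oops alone.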
Last I checked, NFS in 2.4.X does not support exporting vfat even remotely sanely. I'm fairly certain neilb turned it off intentionally because there was no REAL way of counting on locking working at all. If it's on in 2.4.9.X, it most likely won't be available in 2.4.18 in TNV. I would close this WONTFIX/CANTFIX. -sv