Bug 65256 - NFS Oops nfsd system crash
Status: CLOSED CANTFIX
Product: Red Hat Linux
Classification: Retired
Component: nfs-utils
Version: 7.1
Hardware: i586 Linux
Priority: medium
Severity: high
Assigned To: Pete Zaitcev
QA Contact: Ben Levenson
Reported: 2002-05-20 21:26 EDT by Ray Shantz
Modified: 2007-04-18 12:42 EDT (History)
Doc Type: Bug Fix
Last Closed: 2006-10-18 14:20:13 EDT
Attachments: None
Description Ray Shantz 2002-05-20 21:26:46 EDT
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows 95)

Description of problem:
When a large number of files is moved over NFS (with RedHat acting as the NFS
server), the server crashes with an Oops message. I upgraded to the latest
kernel (2.4.9-31) in an effort to correct this problem, but that didn't help.

Here is a brief excerpt from /var/log/messages:

May  2 12:35:10 effects kernel: Unable to handle kernel NULL pointer 
dereference at virtual address 00000004
May  2 12:35:10 effects kernel:  printing eip:
May  2 12:35:10 effects kernel: c014b27d
May  2 12:35:10 effects kernel: pgd entry c1c40000: 0000000000000000
May  2 12:35:10 effects kernel: pmd entry c1c40000: 0000000000000000
May  2 12:35:10 effects kernel: ... pmd not present!
May  2 12:35:10 effects kernel: Oops: 0002
May  2 12:35:10 effects kernel: CPU:    1
May  2 12:35:10 effects kernel: EIP:    0010:[dput+173/384]
May  2 12:35:10 effects kernel: EIP:    0010:[<c014b27d>]
May  2 12:35:10 effects kernel: EFLAGS: 00010246
May  2 12:35:10 effects kernel: eax: 00000000   ebx: d2e7cda0   ecx: d2e7cdc8   
edx: 00000000
May  2 12:35:10 effects kernel: esi: d2e7cda0   edi: c8b566e0   ebp: df107de8   
esp: df107ca4
May  2 12:35:10 effects kernel: ds: 0018   es: 0018   ss: 0018
May  2 12:35:10 effects kernel: Process nfsd (pid: 872, stackpage=df107000)
May  2 12:35:10 effects kernel: Stack: 00000020 d2e7ce48 d2e7cda0 d2e7ce20 
d2e7cda0 d2e7ce20 e08d7ee4 d2e7cda0 
May  2 12:35:10 effects kernel:        d2e7ce20 00000000 c8b566e0 d2e7ce20 
c8b566e0 e08d816f d2e7ce20 c8b566e0 
May  2 12:35:10 effects kernel:        df107de8 72656d41 6e616369 75616542 
302e7974 2e383130 00696773 c01d85e6 
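
The six stack words just before c01d85e6 in the dump above look more like text than pointers. The following is a minimal sketch, assuming they are stored in i386 little-endian byte order; the helper name words_to_text is made up for illustration and is not part of any kernel tool.

```python
import struct

# The six stack words that precede c01d85e6 in the Oops stack dump.
STACK_WORDS = ["72656d41", "6e616369", "75616542",
               "302e7974", "2e383130", "00696773"]


def words_to_text(words):
    """Pack each 32-bit word as little-endian (i386) and decode it as ASCII."""
    raw = b"".join(struct.pack("<I", int(word, 16)) for word in words)
    # Stop at the first NUL byte, the way a C string would end.
    return raw.split(b"\x00", 1)[0].decode("ascii", errors="replace")


print(words_to_text(STACK_WORDS))
```

Under that assumption the words decode to AmericanBeauty.0018.sgi, which may be the name of a file nfsd was working on when it crashed. That reading is an inference from the dump and is not confirmed anywhere in this report.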




Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Use tar to copy 26 GB from the NFS mount point (RedHat server) to a local 
mount point (SGI local RAID). After about 4 GB the server will crash.

2. The NFS client is an SGI running IRIX 6.5.

3. The command was: tar -cvf /redhat-mount /sgi-local
	

Actual Results:
May  2 12:35:10 effects kernel: Unable to handle kernel NULL 
pointer dereference at virtual address 00000004
May  2 12:35:10 effects kernel:  printing eip:
May  2 12:35:10 effects kernel: c014b27d
May  2 12:35:10 effects kernel: pgd entry c1c40000: 0000000000000000
May  2 12:35:10 effects kernel: pmd entry c1c40000: 0000000000000000
May  2 12:35:10 effects kernel: ... pmd not present!
May  2 12:35:10 effects kernel: Oops: 0002
May  2 12:35:10 effects kernel: CPU:    1
May  2 12:35:10 effects kernel: EIP:    0010:[dput+173/384]
May  2 12:35:10 effects kernel: EIP:    0010:[<c014b27d>]
May  2 12:35:10 effects kernel: EFLAGS: 00010246
May  2 12:35:10 effects kernel: eax: 00000000   ebx: d2e7cda0   ecx: d2e7cdc8   
edx: 00000000
May  2 12:35:10 effects kernel: esi: d2e7cda0   edi: c8b566e0   ebp: df107de8   
esp: df107ca4
May  2 12:35:10 effects kernel: ds: 0018   es: 0018   ss: 0018
May  2 12:35:10 effects kernel: Process nfsd (pid: 872, stackpage=df107000)
May  2 12:35:10 effects kernel: Stack: 00000020 d2e7ce48 d2e7cda0 d2e7ce20 
d2e7cda0 d2e7ce20 e08d7ee4 d2e7cda0 
May  2 12:35:10 effects kernel:        d2e7ce20 00000000 c8b566e0 d2e7ce20 
c8b566e0 e08d816f d2e7ce20 c8b566e0 
May  2 12:35:10 effects kernel:        df107de8 72656d41 6e616369 75616542 
302e7974 2e383130 00696773 c01d85e6 
May  2 12:35:10 effects kernel: Call Trace: 
[raid5:__insmod_raid5_S.data_L64+371460/71999836] 
[raid5:__insmod_raid5_S.data_L64+372111/71999185] [ip_output+230/336] 
[udp_getfrag+78/208] [run_local_timers+144/272] 
May  2 12:35:10 effects kernel: Call Trace: [<e08d7ee4>] [<e08d816f>] 
[<c01d85e6>] [<c01f118e>] [<c011fc10>] 
May  2 12:35:10 effects kernel:    [smp_apic_timer_interrupt+279/288] 
[call_apic_timer_interrupt+5/13] [kmem_cache_alloc+57/160] [iget4+82/256] 
[iput+73/400] [raid5:__insmod_raid5_S.data_L64+371191/72000105] 
May  2 12:35:10 effects kernel:    [<c0112d77>] [<c010b90c>] [<c012e4a9>] 
[<c014da52>] [<c014dbd9>] [<e08d7dd7>] 
May  2 12:35:10 effects kernel:    
[raid5:__insmod_raid5_S.data_L64+372634/71998662] 
[raid5:__insmod_raid5_S.data_L64+373731/71997565] 
[raid5:__insmod_raid5_S.data_L64+377831/71993465] 
[raid5:__insmod_raid5_S.data_L64+378574/71992722] 
[raid5:__insmod_raid5_S.data_L64+399419/71971877] 
[raid5:__insmod_raid5_S.data_L64+427456/71943840] 
May  2 12:35:10 effects kernel:    [<e08d837a>] [<e08d87c3>] [<e08d97c7>] 
[<e08d9aae>] [<e08dec1b>] [<e08e59a0>] 
May  2 12:35:10 effects kernel:    
[raid5:__insmod_raid5_S.data_L64+364977/72006319] 
[raid5:__insmod_raid5_S.data_L64+427456/71943840] 
[raid5:__insmod_raid5_S.data_L64+263587/72107709] 
[raid5:__insmod_raid5_S.data_L64+425848/71945448] 
[raid5:__insmod_raid5_S.data_L64+425880/71945416] 
[raid5:__insmod_raid5_S.data_L64+364473/72006823] 
May  2 12:35:10 effects kernel:    [<e08d6591>] [<e08e59a0>] [<e08bd983>] 
[<e08e5358>] [<e08e5378>] [<e08d6399>] 
May  2 12:35:10 effects kernel:    
[raid5:__insmod_raid5_S.data_L64+425824/71945472] [kernel_thread+38/48] 
[raid5:__insmod_raid5_S.data_L64+363952/72007344] 
May  2 12:35:10 effects kernel:    [<e08e5340>] [<c0105686>] [<e08d6190>] 
May  2 12:35:10 effects kernel: 
May  2 12:35:10 effects kernel: Code: 89 50 04 89 02 c7 43 28 00 00 00 00 c7 41 
04 00 00 00 00 8b 
May  2 12:52:54 effects syslogd 1.4-0: restart.
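
As a reading aid for the "Oops: 0002" and "Code:" lines above, here is a rough sketch of how the fault can be interpreted. It assumes the standard i386 page-fault error-code bits (bit 0: protection violation, bit 1: write, bit 2: user mode) and that the Code: bytes begin at the faulting EIP, as 2.4 kernels are understood to print them. The function names decode_oops and store_target are hypothetical.

```python
def decode_oops(code_hex):
    """Split an i386 page-fault error code into its three low bits."""
    code = int(code_hex, 16)
    return {
        "cause": "protection violation" if code & 1 else "page not present",
        "access": "write" if code & 2 else "read",
        "mode": "user" if code & 4 else "kernel",
    }


def store_target(eax_hex, disp):
    """Address written by 'mov %edx,disp(%eax)' for a given eax value."""
    return (int(eax_hex, 16) + disp) & 0xFFFFFFFF


# "Oops: 0002" from the log.
print(decode_oops("0002"))

# The first Code: bytes, 89 50 04, encode "mov %edx,0x4(%eax)".
# The register dump shows eax: 00000000.
print("%08x" % store_target("00000000", 0x04))
```

With these inputs the sketch reports a kernel-mode write to a page that is not present, at address 00000004, which matches the "NULL pointer dereference at virtual address 00000004" line. The next bytes, 89 02, encode "mov %eax,(%edx)", so the pair resembles a linked-list unlink through a NULL pointer inside dput. That interpretation is a guess from the bytes, not a diagnosis made in this bug.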


Expected Results:  All files copied without crash.

Additional info:
Comment 1 Ray Shantz 2002-05-22 17:33:37 EDT
I did some more testing and can also trigger the crash with the cp command. The 
crash only seems to happen when data is copied from the RedHat NFS mount point 
to a local disk on the SGI. If I send data from the local disk to the RedHat 
NFS mount point, the crash does not happen. In all of these tests the SGI is 
the one initiating the cp or tar command. I also turned off NFS on the RedHat 
machine and mounted an SGI NFS mount point (RedHat as the client, not the 
server). I didn't experience any crash with RedHat as the NFS client.

Commands used on the SGI:

cp -r /RedHat-NFS/* /Local-SGI  (crash after about 1 GB)
cp -r /Local-SGI/* /RedHat-NFS  (No crash)

Commands used on RedHat:

cp -a /Local-RedHat/* /SGI-NFS (No crash)
Comment 2 Pete Zaitcev 2004-08-26 19:15:57 EDT
Is this still an issue? What release?
Comment 3 Bill Nottingham 2006-10-18 14:20:13 EDT
Red Hat Linux is no longer supported by Red Hat, Inc. If you are still
running Red Hat Linux, you are strongly advised to upgrade to a
current Fedora Core release or Red Hat Enterprise Linux or comparable.
Some information on which option may be right for you is available at
http://www.redhat.com/rhel/migrate/redhatlinux/.

Red Hat apologizes that these issues have not been resolved yet. We do
want to make sure that no important bugs slip through the cracks.
If this issue is still present in a current Fedora Core release, please
open a new bug with the relevant information.

Closing as CANTFIX.
