Bug 4283

Summary:	Unable to handle kernel NULL pointer
Product:	[Retired] Red Hat Linux	Reporter:	stephen
Component:	knfsd	Assignee:	David Lawrence <dkl>
Status:	CLOSED NEXTRELEASE	QA Contact:
Severity:	high	Docs Contact:
Priority:	high
Version:	6.0
Target Milestone:	---
Target Release:	---
Hardware:	i386
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	1999-08-21 17:06:03 UTC	Type:	---

Description stephen 1999-07-31 07:31:16 UTC

I am running knfsd-1.4.4 and must run the new, fixed restart
every couple of days.  If I forget, the server stops
responding.

After re-booting I get an error message within about 1-10
seconds, and the server stops responding again.

The only way to bring the server back up is to re-boot
every computer on the network, then re-boot the NFS server.

My network contains 3 Linux NFS servers and about 30 i386
Linux and Ultra 60 SunOS 5.6 clients.

I have not yet applied the mountd growth fix, which I found
only a couple of minutes ago.

The error message just after re-boot of the nfs server:

Linux version 2.2.5-22 (root.redhat.com) (gcc version egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)) #1 Wed Jun 2 09:02:27 EDT 1999
Detected 451026999 Hz processor.
...
ncr53c876-0-<6,*>: FAST-20 WIDE SCSI 40.0 MB/s (50 ns, offset 15)
SCSI device sda: hdwr sector= 512 bytes. Sectors= 35970860 [17563 MB] [17.6 GB]
 sda: sda1 sda2 sda3 < sda5 sda6 >
...
Installing knfsd (copyright (C) 1996 okir.de).
nfsd_init: initialized fhcache, entries=256
eth0: Changing PNIC configuration to full-duplex, CSR6 812e0200.
Unable to handle kernel NULL pointer dereference at virtual address 00000008
current->tss.cr3 = 00101000, %cr3 = 00101000
*pde = 00000000
Oops: 0000
CPU:    0
EIP:    0010:[<c807d98e>]
EFLAGS: 00010282
eax: 00000000   ebx: c7dc8000   ecx: 00000000   edx: c676801c
esi: c7dc8000   edi: c6768014   ebp: c7dc8000   esp: c67dbf60
ds: 0018   es: 0018   ss: 0018
Process nfsd (pid: 450, process nr: 24, stackpage=c67db000)
Stack: c6768014 c8077462 c7dc8000 c676801c c6e0b360 c7dc80f4 c8080680 c6e0b39c
       c8065354 c7dc8000 c6768014 c67da000 c67da000 00000000 c7dc8000 c6e0b360
       c80809a0 00000000 00000002 000186a3 00000002 c6768014 c808052c 00000000
Call Trace: [<c8077462>] [<c8080680>] [<c8065354>] [<c80809a0>] [<c808052c>] [<c8077221>] [<c010813b>]
Code: 8b 58 08 85 db 75 07 31 d2 e9 fd 00 00 00 66 8b 43 22 66 c1
Unable to handle kernel NULL pointer dereference at virtual address 00000008
current->tss.cr3 = 00101000, %cr3 = 00101000
*pde = 00000000
Oops: 0000
CPU:    0
EIP:    0010:[<c807d98e>]
EFLAGS: 00010282
eax: 00000000   ebx: c67d8a00   ecx: 00000000   edx: c678801c
esi: c67d8a00   edi: c6788014   ebp: c67d8a00   esp: c6787f60
ds: 0018   es: 0018   ss: 0018
Process nfsd (pid: 453, process nr: 27, stackpage=c6787000)
Stack: c6788014 c8077462 c67d8a00 c678801c c6e0ba20 c67d8af4 c8080680 c6e0ba5c
       c8065354 c67d8a00 c6788014 c6786000 c6786000 00000000 c67d8a00 c6e0ba20
       c80809a0 00000000 00000002 000186a3 00000002 c6788014 c808052c 00000000
Call Trace: [<c8077462>] [<c8080680>] [<c8065354>] [<c80809a0>] [<c808052c>] [<c8077221>] [<c010813b>]
Code: 8b 58 08 85 db 75 07 31 d2 e9 fd 00 00 00 66 8b 43 22 66 c1
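
The register dump and the Code: bytes above are enough to check where the fault address 00000008 comes from. The sketch below is only an illustration, not part of the original report: it assumes the Code: bytes start at the faulting EIP, and it decodes just the one instruction form seen here (opcode 8b with an 8-bit displacement). The helper names parse_registers, decode_first_load and faulting_address are made up for this example.

```python
import re

# Text copied from the first oops in this report (wrapped lines rejoined).
OOPS = """\
Unable to handle kernel NULL pointer dereference at virtual address 00000008
eax: 00000000   ebx: c7dc8000   ecx: 00000000   edx: c676801c
esi: c7dc8000   edi: c6768014   ebp: c7dc8000   esp: c67dbf60
Code: 8b 58 08 85 db 75 07 31 d2 e9 fd 00 00 00 66 8b 43 22 66 c1
"""

# i386 32-bit register numbering used by the ModRM byte.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def parse_registers(text):
    """Return {'eax': 0, ...} from the 'name: hexvalue' pairs in the dump."""
    pairs = re.findall(r"\b(e[a-z]{2}): ([0-9a-f]{8})\b", text)
    return {name: int(value, 16) for name, value in pairs}


def decode_first_load(code):
    """Decode 'mov r32, [r32 + disp8]' (opcode 0x8b, ModRM mod=01).

    Handles only this single encoding; anything else raises ValueError.
    """
    opcode, modrm, disp = code[0], code[1], code[2]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if opcode != 0x8B or mod != 1 or rm == 4:  # rm == 4 would mean a SIB byte
        raise ValueError("not a simple mov r32, [r32 + disp8]")
    if disp >= 0x80:  # disp8 is sign-extended
        disp -= 0x100
    return REG32[reg], REG32[rm], disp


def faulting_address(text):
    """Return (dest, base, disp, address) for the first instruction in Code:."""
    regs = parse_registers(text)
    code = bytes.fromhex(re.search(r"Code: ([0-9a-f ]+)", text).group(1))
    dest, base, disp = decode_first_load(code)
    return dest, base, disp, (regs[base] + disp) & 0xFFFFFFFF


if __name__ == "__main__":
    dest, base, disp, addr = faulting_address(OOPS)
    print(f"mov {dest}, [{base}+{disp}] -> address {addr:08x}")
```

Under those assumptions the first instruction loads from eax+8, and eax is 00000000 in both dumps, which matches the reported fault address: nfsd appears to be reading a field at offset 8 of a NULL structure pointer.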

Comment 1 stephen 1999-08-07 21:23:59 UTC

I could no longer reproduce this problem once I upgraded to
glibc-2.1.2-2.i386.rpm.

The memory leak and NULL pointer both went away.

Comment 2 Jeff Johnson 1999-08-21 17:06:59 UTC

This problem appears to be resolved.

Comment 3 das_deniz 2000-09-21 19:38:26 UTC

Well, this doesn't fix the problem on a stock Red Hat 6.0 install,
so perhaps the kernel upgrade is also required.

# netstat -a
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State      
...
udp    65456      0 *:2049                  *:*                                 
...
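
The netstat line above shows 65456 bytes waiting in the receive queue of UDP port 2049 (the NFS port), which suggests requests are arriving but nothing is reading them. A rough way to spot this automatically is sketched below; the function name backed_up_sockets and its threshold parameter are hypothetical, and the sample text is just the two relevant lines from this comment.

```python
# Header and the one socket line quoted in this comment.
NETSTAT = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
udp    65456      0 *:2049                  *:*
"""


def backed_up_sockets(text, threshold=1):
    """Return (proto, local_address, recv_q) for rows with Recv-Q >= threshold."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header and any line that is not a socket row.
        if len(fields) < 5 or not fields[1].isdigit():
            continue
        proto, recv_q, local = fields[0], int(fields[1]), fields[3]
        if recv_q >= threshold:
            rows.append((proto, local, recv_q))
    return rows


if __name__ == "__main__":
    for proto, local, recv_q in backed_up_sockets(NETSTAT):
        print(f"{proto} {local}: {recv_q} bytes unread")
```

A persistently non-zero Recv-Q on *:2049 is consistent with the nfsd threads having died in the oops, though it does not prove it on its own.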

Linux version 2.2.5-15 (root.redhat.com) (gcc version egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)) #1 Mon Apr 19 23:00:46 EDT 1999
Detected 448978865 Hz processor.
Console: colour VGA+ 80x25
Calibrating delay loop... 447.28 BogoMIPS
Memory: 127972k/131008k available (996k kernel code, 412k reserved, 1568k data, 60k init)
VFS: Diskquotas version dquot_6.4.0 initialized 
CPU: Intel Celeron (Mendocino) stepping 00

 Unable to handle kernel NULL pointer dereference at virtual address 00000008 
current->tss.cr3 = 00101000, %cr3 = 00101000 
*pde = 00000000 
Oops: 0000 
CPU:    0 
EIP:    0010:[3c59x+127218/77496320] 
EFLAGS: 00010282 
eax: 00000000   ebx: c0a7be00   ecx: 00000000   edx: c130401c 
esi: c0a7be00   edi: c1304014   ebp: c0a7be00   esp: c15a9f60 
ds: 0018   es: 0018   ss: 0018 
Process nfsd (pid: 500, process nr: 25, stackpage=c15a9000) 
Stack: c1304014 c803e462 c0a7be00 c130401c c039db60 c0a7bef4 c8047680 c039db9c  
       c8023354 c0a7be00 c1304014 c15a8000 c15a8000 00000001 c0a7be00 c039db60  
       c80479a0 00000001 00000002 000186a3 00000002 c1304014 c804752c 00000000  
Call Trace: [3c59x+101318/77496320] [3c59x+138724/77496320] [lockd:nlmclnt_proc_R05b69af3+-36864/5332] [3c59x+139524/77496320] [3c59x+138384/77496320] [3c59x+100741/77496320] [kernel_thread+35/48]
Code: 8b 58 08 85 db 75 07 31 d2 e9 fd 00 00 00 66 8b 43 22 66 c1

Comment 4 das_deniz 2000-10-09 19:50:43 UTC

A kernel upgrade and replacing the broken knfsd with nfs-utils were both required.
Weird that this port lockup/kernel oops took so long to show up.