Bug 73526

Summary: kernel 2.4.18-10smp crashed; problem in smbfs or VFS
Product: [Retired] Red Hat Linux Reporter: Joel Votaw <joel>
Component: kernel Assignee: Arjan van de Ven <arjanv>
Status: CLOSED CURRENTRELEASE QA Contact: Brian Brock <bbrock>
Severity: high Docs Contact:
Priority: medium    
Version: 7.3 CC: joel, pascal
Target Milestone: ---   
Target Release: ---   
Hardware: i686   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2004-09-30 15:39:54 UTC Type: ---

Description Joel Votaw 2002-09-05 18:33:51 UTC
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)

Description of problem:
I was running a script on a heavily-loaded system which mounts and unmounts a 
lot of smbfs filesystems.  After a while, the kernel crashed and printed out an 
Oops message -- details below.  The system was still responding to pings, but 
otherwise was dead.

Version-Release number of selected component (if applicable):


How reproducible:
Sometimes

Steps to Reproduce:
1. Load system down with lots of IO and CPU usage: software RAID 5, copying a 
lot of data using rsync over SSH, etc.
2. Use shell script to mount and unmount about 12 smbfs filesystems.
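
The script used by the reporter is not included in this report. A minimal sketch of the kind of loop step 2 describes might look like the following. The server name, share names, and mount points (`//server/shareN`, `/mnt/smbN`) are hypothetical placeholders, and by default the sketch only prints the `mount -t smbfs` and `umount` commands rather than running them:

```python
import subprocess


def build_cycle(n_shares=12, server="//server", base="/mnt/smb"):
    """Return mount/umount command pairs for n_shares smbfs shares.

    The server, share names, and mount points are hypothetical placeholders.
    """
    commands = []
    for i in range(n_shares):
        share = f"{server}/share{i}"
        mountpoint = f"{base}{i}"
        commands.append(["mount", "-t", "smbfs", share, mountpoint])
        commands.append(["umount", mountpoint])
    return commands


def run_cycles(cycles=1, dry_run=True, **kwargs):
    """Run (or, by default, just print) the mount/umount cycle repeatedly."""
    for _ in range(cycles):
        for cmd in build_cycle(**kwargs):
            if dry_run:
                print(" ".join(cmd))
            else:
                # check=False: a failed mount should not stop the loop
                subprocess.run(cmd, check=False)


if __name__ == "__main__":
    run_cycles(cycles=1, dry_run=True)
```

Passing `dry_run=False` would execute the commands. That needs root and real shares, and on an affected kernel it may hang the machine.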

	

Actual Results:  System hung with an oops.

Additional info:

Below is the oops and what ksymoops had to say.  Let me know if you need 
additional information.  Before the oops, I saw the following messages:

raid5: multiple 0 requests for sector 256464264
smb_retry: no connection process
last message repeated 2 times
smb_delete_inode: could not close inode 2
VFS: Busy inodes after unmount. Self-destruct in 5 seconds.  Have a nice day...
Unable to handle kernel paging request at virtual address 740a6279
 printing eip:
c01588d3
*pde = 00000000



------------  Oops -----------




# ksymoops -V -k /proc/ksyms -l /proc/modules -o /lib/modules/2.4.18-10smp/ -m /boot/System.map-2.4.18-10smp  oops.log 
ksymoops 2.4.4 on i686 2.4.18-10smp.  Options used
     -V (specified)
     -k /proc/ksyms (specified)
     -l /proc/modules (specified)
     -o /lib/modules/2.4.18-10smp/ (specified)
     -m /boot/System.map-2.4.18-10smp (specified)

Error (expand_objects): cannot stat(/lib/ext3.o) for ext3
ksymoops: No such file or directory
Error (expand_objects): cannot stat(/lib/jbd.o) for jbd
ksymoops: No such file or directory
Error (expand_objects): cannot stat(/lib/raid5.o) for raid5
ksymoops: No such file or directory
Error (expand_objects): cannot stat(/lib/xor.o) for xor
ksymoops: No such file or directory
Error (expand_objects): cannot stat(/lib/aic7xxx.o) for aic7xxx
ksymoops: No such file or directory
Error (expand_objects): cannot stat(/lib/sd_mod.o) for sd_mod
ksymoops: No such file or directory
Error (expand_objects): cannot stat(/lib/scsi_mod.o) for scsi_mod
ksymoops: No such file or directory
/usr/bin/find: /lib/modules/2.4.18-10smp/build: No such file or directory
Error (pclose_local): find_objects pclose failed 0x100
Warning (map_ksym_to_module): cannot match loaded module ext3 to a unique 
module object.  Trace may not be reliable.
Oops: 0000
CPU:    0
EIP:    0010:[<c01588d3>]    Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010206
eax: 740a6269   ebx: c9111da0   ecx: f4bfcbd0   edx: c9111db0
esi: c26cc400   edi: 740a6269   ebp: 00000041   esp: f7ed3f5c
ds: 0018   es: 0018   ss: 0018
Process kswapd (pid: 7, stackpage=f7ed3000)
Stack: f4bfcbb8 f4bfcba0 c9111da0 c0155fa6 c9111da0 00000375 000000c3 ffffffff 
      c02f3c28 00000375 00000185 00000493 c01381e8 000001d0 00000556 00000000 
       00000185 c01563d0 0000230c c0138b3c 00000006 000001d0 000001d0 f7ed2000 
Call Trace: [<c0155fa6>] prune_dcache [kernel] 0xe6 
[<c01381e8>] page_launder [kernel] 0x168 
[<c01563d0>] shrink_dcache_memory [kernel] 0x20 
[<c0138b3c>] do_try_to_free_pages [kernel] 0x1c 
[<c0138e31>] kswapd [kernel] 0x101 
[<c0105000>] stext [kernel] 0x0 
[<c0107286>] kernel_thread [kernel] 0x26 
[<c0138d30>] kswapd [kernel] 0x0 
Code: 8b 47 10 85 c0 74 04 53 ff d0 58 68 bc 49 2f c0 8d 43 2c 50 

>>EIP; c01588d3 <iput+43/290>   <=====
Trace; c0155fa6 <prune_dcache+e6/1c0>
Trace; c01381e8 <page_launder+168/300>
Trace; c01563d0 <shrink_dcache_memory+20/30>
Trace; c0138b3c <do_try_to_free_pages+1c/180>
Trace; c0138e31 <kswapd+101/2d0>
Trace; c0105000 <_stext+0/0>
Trace; c0107286 <kernel_thread+26/30>
Trace; c0138d30 <kswapd+0/2d0>
Code;  c01588d3 <iput+43/290>
00000000 <_EIP>:
Code;  c01588d3 <iput+43/290>   <=====
   0:   8b 47 10                  mov    0x10(%edi),%eax   <=====
Code;  c01588d6 <iput+46/290>
   3:   85 c0                     test   %eax,%eax
Code;  c01588d8 <iput+48/290>
   5:   74 04                     je     b <_EIP+0xb> c01588de <iput+4e/290>
Code;  c01588da <iput+4a/290>
   7:   53                        push   %ebx
Code;  c01588db <iput+4b/290>
   8:   ff d0                     call   *%eax
Code;  c01588dd <iput+4d/290>
   a:   58                        pop    %eax
Code;  c01588de <iput+4e/290>
   b:   68 bc 49 2f c0            push   $0xc02f49bc
Code;  c01588e3 <iput+53/290>
  10:   8d 43 2c                  lea    0x2c(%ebx),%eax
Code;  c01588e6 <iput+56/290>
  13:   50                        push   %eax


1 warning and 8 errors issued.  Results may not be reliable.
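
The register dump and disassembly above are enough to check where the faulting address came from. The sketch below is an illustration and not part of the ksymoops output. It adds the `0x10` displacement from `mov 0x10(%edi),%eax` to the value of `edi`, and prints the bytes of that value in little-endian memory order. Those bytes are printable ASCII, which hints that the pointer may have been overwritten by text data. That would fit a stale pointer into freed memory, but nothing in this report confirms it.

```python
import struct

# Values copied from the oops above.
EDI = 0x740A6269            # edi at the time of the fault
DISPLACEMENT = 0x10         # from "mov 0x10(%edi),%eax"
FAULT_ADDRESS = 0x740A6279  # "Unable to handle kernel paging request at ..."


def effective_address(base, displacement):
    """Address the CPU dereferences for disp(%reg) on 32-bit x86."""
    return (base + displacement) & 0xFFFFFFFF


def as_memory_bytes(value):
    """Bytes of a 32-bit value in little-endian (i386) memory order."""
    return struct.pack("<I", value)


if __name__ == "__main__":
    print(hex(effective_address(EDI, DISPLACEMENT)))
    print(as_memory_bytes(EDI))
```

The computed address matches the one in the "Unable to handle kernel paging request" message, so the fault is the load through `edi` itself.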

Comment 1 Joel Votaw 2002-09-12 15:39:28 UTC
I just had another one of these crashes.  I have rewritten my script to only 
handle one filesystem at a time and am still running into the problem.  I 
haven't had problems with these scripts on single-processor systems, so I think 
this may be a race condition somehow related to SMP.

Here is the ksymoops info for this crash:




[root@files2 root]# ksymoops -V -k /proc/ksyms -l /proc/modules -o /lib/modules/2.4.18-10smp/ -m /boot/System.map-2.4.18-10smp  oops2.log 
ksymoops 2.4.4 on i686 2.4.18-10smp.  Options used
     -V (specified)
     -k /proc/ksyms (specified)
     -l /proc/modules (specified)
     -o /lib/modules/2.4.18-10smp/ (specified)
     -m /boot/System.map-2.4.18-10smp (specified)

Error (expand_objects): cannot stat(/lib/ext3.o) for ext3
ksymoops: No such file or directory
Error (expand_objects): cannot stat(/lib/jbd.o) for jbd
ksymoops: No such file or directory
Error (expand_objects): cannot stat(/lib/raid5.o) for raid5
ksymoops: No such file or directory
Error (expand_objects): cannot stat(/lib/xor.o) for xor
ksymoops: No such file or directory
Error (expand_objects): cannot stat(/lib/aic7xxx.o) for aic7xxx
ksymoops: No such file or directory
Error (expand_objects): cannot stat(/lib/sd_mod.o) for sd_mod
ksymoops: No such file or directory
Error (expand_objects): cannot stat(/lib/scsi_mod.o) for scsi_mod
ksymoops: No such file or directory
/usr/bin/find: /lib/modules/2.4.18-10smp/build: No such file or directory
Error (pclose_local): find_objects pclose failed 0x100
Warning (map_ksym_to_module): cannot match loaded module ext3 to a unique 
module object.  Trace may not be reliable.
kernel BUG at inode.c:1066!
invalid operand: 0000
CPU:    1
EIP:    0010:[<c01588bf>]    Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010286
eax: 0000001c   ebx: d94fe420   ecx: c02f25e0   edx: 00008f60
esi: f7d1c000   edi: 00000000   ebp: 00000051   esp: f7ed3f54
ds: 0018   es: 0018   ss: 0018
Process kswapd (pid: 7, stackpage=f7ed3000)
Stack: c024a65f 0000042a ddd49e58 ddd49e40 d94fe420 c0155fa6 d94fe420 0000038e 
       00000098 ffffffff c02f3c28 0000038e 0000016c 000003cd c01381e8 000001d0 
       00000465 00000000 0000016c c01563d0 000020a0 c0138b3c 00000006 000001d0 
Call Trace: [<c0155fa6>] prune_dcache [kernel] 0xe6 
[<c01381e8>] page_launder [kernel] 0x168 
[<c01563d0>] shrink_dcache_memory [kernel] 0x20 
[<c0138b3c>] do_try_to_free_pages [kernel] 0x1c 
[<c0138e31>] kswapd [kernel] 0x101 
[<c0105000>] stext [kernel] 0x0 
[<c0107286>] kernel_thread [kernel] 0x26 
[<c0138d30>] kswapd [kernel] 0x0 
Code: 0f 0b 5a 59 85 f6 74 08 8b 46 20 85 c0 0f 45 f8 85 ff 74 0b 

>>EIP; c01588bf <iput+2f/290>   <=====
Trace; c0155fa6 <prune_dcache+e6/1c0>
Trace; c01381e8 <page_launder+168/300>
Trace; c01563d0 <shrink_dcache_memory+20/30>
Trace; c0138b3c <do_try_to_free_pages+1c/180>
Trace; c0138e31 <kswapd+101/2d0>
Trace; c0105000 <_stext+0/0>
Trace; c0107286 <kernel_thread+26/30>
Trace; c0138d30 <kswapd+0/2d0>
Code;  c01588bf <iput+2f/290>
00000000 <_EIP>:
Code;  c01588bf <iput+2f/290>   <=====
   0:   0f 0b                     ud2a      <=====
Code;  c01588c1 <iput+31/290>
   2:   5a                        pop    %edx
Code;  c01588c2 <iput+32/290>
   3:   59                        pop    %ecx
Code;  c01588c3 <iput+33/290>
   4:   85 f6                     test   %esi,%esi
Code;  c01588c5 <iput+35/290>
   6:   74 08                     je     10 <_EIP+0x10> c01588cf <iput+3f/290>
Code;  c01588c7 <iput+37/290>
   8:   8b 46 20                  mov    0x20(%esi),%eax
Code;  c01588ca <iput+3a/290>
   b:   85 c0                     test   %eax,%eax
Code;  c01588cc <iput+3c/290>
   d:   0f 45 f8                  cmovne %eax,%edi
Code;  c01588cf <iput+3f/290>
  10:   85 ff                     test   %edi,%edi
Code;  c01588d1 <iput+41/290>
  12:   74 0b                     je     1f <_EIP+0x1f> c01588de <iput+4e/290>


1 warning and 8 errors issued.  Results may not be reliable.
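
Both crashes land in the same function. A quick check, using only the `>>EIP` lines ksymoops printed for the two oopses, shows that the two addresses share one `iput` start address and sit 0x14 bytes apart. The second word on the stack in this oops, `0000042a`, is 1066 in decimal, the line number in the `kernel BUG at inode.c:1066!` message. The helper name is made up for this illustration.

```python
def function_start(eip, offset):
    """Start address of a function, given EIP and the ksymoops '+offset'."""
    return eip - offset


# First oops:  >>EIP; c01588d3 <iput+43/290>
first_eip, first_off = 0xC01588D3, 0x43
# Second oops: >>EIP; c01588bf <iput+2f/290>
second_eip, second_off = 0xC01588BF, 0x2F

if __name__ == "__main__":
    print(hex(function_start(first_eip, first_off)))
    print(hex(function_start(second_eip, second_off)))
    print(hex(first_eip - second_eip))
    print(int("0000042a", 16))
```

The disassembly of the second oops ends at c01588d1 with a two-byte `je`, so the next instruction is at c01588d3. That is the address of the faulting `mov` in the first oops, so the two failures occur a few instructions apart on the same code path.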


Comment 2 Pascal Montaigne 2003-05-16 06:27:24 UTC
I have a similar problem: the kernel crashes about once a month. I am using rsync
with smbfs over an ADSL link, but this is on a single-processor system. Here is
what is in the message log:

May 15 23:11:21 s01 kernel: kernel BUG at inode.c:1066!
May 15 23:11:21 s01 kernel: invalid operand: 0000
May 15 23:11:21 s01 kernel: smbfs loop ide-cd cdrom soundcore autofs eepro100 st usb-uhci usbcore ext3 jbd
May 15 23:11:21 s01 kernel: CPU:    0
May 15 23:11:21 s01 kernel: EIP:    0010:[<c014c32f>]    Not tainted
May 15 23:11:21 s01 kernel: EFLAGS: 00010286
May 15 23:11:21 s01 kernel:
May 15 23:11:21 s01 kernel: EIP is at iput [kernel] 0x2f (2.4.18-10)
May 15 23:11:21 s01 kernel: eax: 0000001c   ebx: c19b89e0   ecx: 00000001   edx: 00005870
May 15 23:11:21 s01 kernel: esi: cfa6a800   edi: 00000000   ebp: 0000004b   esp: c13b3f50
May 15 23:11:21 s01 kernel: ds: 0018   es: 0018   ss: 0018
May 15 23:11:21 s01 kernel: Process kswapd (pid: 5, stackpage=c13b3000)
May 15 23:11:21 s01 kernel: Stack: c022968c 0000042a c825a878 c825a860 c19b89e0 c014a0ad c19b89e0 c13b2000
May 15 23:11:21 s01 kernel:        00000000 00000000 ffffffff c02c7ae8 00000000 00000000 0000018f c0130133
May 15 23:11:21 s01 kernel:        000001d0 0000018f 00000000 00000000 c014a410 00000062 c013090c 00000006
May 15 23:11:21 s01 kernel: Call Trace: [<c014a0ad>] prune_dcache [kernel] 0x10d
May 15 23:11:21 s01 kernel: [<c0130133>] page_launder [kernel] 0x2b3
May 15 23:11:21 s01 kernel: [<c014a410>] shrink_dcache_memory [kernel] 0x20
May 15 23:11:21 s01 kernel: [<c013090c>] do_try_to_free_pages [kernel] 0x1c
May 15 23:11:21 s01 kernel: [<c0130c01>] kswapd [kernel] 0x101
May 15 23:11:21 s01 kernel: [<c0105000>] stext [kernel] 0x0
May 15 23:11:21 s01 kernel: [<c0107136>] kernel_thread [kernel] 0x26
May 15 23:11:21 s01 kernel: [<c0130b00>] kswapd [kernel] 0x0
May 15 23:11:21 s01 kernel:
May 15 23:11:21 s01 kernel:
May 15 23:11:21 s01 kernel: Code: 0f 0b 5a 59 85 f6 74 08 8b 46 20 85 c0 0f 45 f8 85 ff 74 0b
May 15 23:11:22 s01 samba(pam_unix)[18311]: session closed for user Carlton
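
To compare this trace with the ones from the SMP kernel, it can help to pull the symbol and offset out of each syslog line. A minimal sketch, assuming the `[<address>] symbol [kernel] 0xoffset` layout seen in the log above (the function name and regular expression are illustrative and not part of any tool):

```python
import re

# Matches entries such as: [<c014a0ad>] prune_dcache [kernel] 0x10d
TRACE_RE = re.compile(
    r"\[<([0-9a-f]{8})>\]\s+(\S+)\s+\[kernel\]\s+0x([0-9a-f]+)"
)


def parse_trace(lines):
    """Return (address, symbol, offset) for each call-trace entry found."""
    entries = []
    for line in lines:
        for addr, symbol, offset in TRACE_RE.findall(line):
            entries.append((int(addr, 16), symbol, int(offset, 16)))
    return entries


# Three lines copied from the message log above.
SAMPLE = [
    "May 15 23:11:21 s01 kernel: Call Trace: [<c014a0ad>] prune_dcache [kernel] 0x10d",
    "May 15 23:11:21 s01 kernel: [<c0130133>] page_launder [kernel] 0x2b3",
    "May 15 23:11:21 s01 kernel: [<c014a410>] shrink_dcache_memory [kernel] 0x20",
]

if __name__ == "__main__":
    for addr, symbol, offset in parse_trace(SAMPLE):
        print(f"{symbol}+{offset:#x} at {addr:#010x}")
```

The sequence of symbols is the same as in the earlier oopses (`prune_dcache`, `page_launder`, `shrink_dcache_memory`, ...). Only the addresses and offsets differ, as expected between the uniprocessor and SMP kernel builds.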


Comment 3 Joel Votaw 2003-05-16 13:38:01 UTC
My work-around for this bug has been to use a custom-compiled 2.4.19 kernel.  
This has generally worked very well: I've been rsyncing about 50 machines in 
parallel over a fast local network; the machines are still mounted using smbfs.

However, even on this kernel I recently had problems:

VFS: Busy inodes after unmount. Self-destruct in 5 seconds.  Have a nice day...

kswapd crashed after this.  (I won't include the full ksymoops here since this 
was not with the RedHat kernel in question.)  There is some mention of problems 
like this on the Linux Kernel Mailing List, but nothing conclusive that I 
found: mixed reports of success with 2.4.18 and 2.4.19, and some people 
suggesting to move to 2.4.20.

Anyway, after this happened I upgraded to the latest RedHat kernel at the time
(2.4.18-27.7.xsmp).  For the past 3 days it has served me well with no crashes, 
under the same load as my custom kernel.  This hasn't been long enough to be 
conclusive, but with 2.4.18-10 I would've expected a crash by now.

I'm going to upgrade to the very recent RedHat 2.4.20-* kernel and see how that 
works, and report back in a few weeks.  However, if you are having a lot of 
crashes with 2.4.18-10 you might try 2.4.18-27.7.x, since it seems to be better.

-Joel


Comment 4 Bugzilla owner 2004-09-30 15:39:54 UTC
Thanks for the bug report. However, Red Hat no longer maintains this version of
the product. Please upgrade to the latest version and open a new bug if the problem
persists.

The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, 
and if you believe this bug is interesting to them, please report the problem in
the bug tracker at: http://bugzilla.fedora.us/