Bug 103227

Summary: multiple NFS mounts anomaly
Product: Red Hat Enterprise Linux 3 Reporter: Larry Troan <ltroan>
Component: kernel Assignee: Steve Dickson <steved>
Status: CLOSED WONTFIX QA Contact: Brian Brock <bbrock>
Severity: high Docs Contact:
Priority: medium    
Version: 3.0 CC: ichute, mansing.li, petrides, steved, tao
Target Milestone: ---   
Target Release: ---   
Hardware: ia64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2005-10-05 23:35:44 UTC Type: ---

Description Larry Troan 2003-08-27 21:54:36 UTC
On a zx2000 or zx6000, create a script that runs forever, unmounting and
mounting several (3-4 in our case) NFS mounts.  E.g.:  i=0; while [ $i -eq 0 ];
do umount /iotests/bin; umount /iotests/xpeak; umount /iotests/xpeak; umount
/iotests/linux; sleep 3; mount; mount /iotests/bin; mount /iotests/xpeak; mount
/iotests/xpeak; mount /iotests/linux; mount; sleep 3; done
Let the script run indefinitely.  Then, from another terminal, run a second
script that loops forever doing 'ls -lR /iotests'.  E.g.:  i=0; while
[ $i -eq 0 ]; do ls -lR /iotests; done
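
The unmount/mount loop described above can be sketched as a small script. This
is only an illustration under stated assumptions, not the exact script used in
the test: the names MOUNTS, DRY_RUN, run, cycle_mounts, and churn are
hypothetical, and the mount points are assumed to be listed in /etc/fstab so
that 'mount <dir>' works on its own.

```shell
#!/bin/sh
# Sketch of the reproduction loop. MOUNTS, DRY_RUN, run, cycle_mounts and
# churn are illustrative names, not taken from the original report.
MOUNTS="${MOUNTS:-/iotests/bin /iotests/xpeak /iotests/linux}"

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# One unmount/mount cycle over every mount point in MOUNTS.
# Assumes each mount point has an /etc/fstab entry.
cycle_mounts() {
    for m in $MOUNTS; do run umount "$m"; done
    run sleep 3
    run mount                      # list mounts after unmounting
    for m in $MOUNTS; do run mount "$m"; done
    run mount                      # list mounts again; duplicates show up here
    run sleep 3
}

# Repeat forever by default; pass a count to stop after that many cycles.
churn() {
    n="${1:-0}"
    i=0
    while [ "$n" -eq 0 ] || [ "$i" -lt "$n" ]; do
        cycle_mounts
        i=$((i + 1))
    done
}
```

To use it, call 'churn' as root in one terminal and run
'while :; do ls -lR /iotests; done' in another. Setting DRY_RUN=1 prints the
commands instead of executing them, which is handy for checking the sequence
before running it against real mounts.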

After a few runs (could be 20-30), the 'mount' command in the first script
shows multiple entries for one or more of the NFS mounts.  E.g.  /iotests/linux
will appear several times.  'cat /etc/mtab' shows the duplicate entries as well.
The number of entries grows over time.  Each duplicate /iotests/linux entry
then has to be unmounted individually.
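
One way to spot the duplicate entries described above is to count how often
each mount point occurs in /etc/mtab. The following is a minimal sketch; the
function name dup_mounts is hypothetical, and it assumes the usual mtab layout
in which the second whitespace-separated field is the mount point.

```shell
# Print "<count> <mount point>" for every mount point that appears more than
# once in an mtab-format file (default: /etc/mtab).
# dup_mounts is an illustrative name, not part of any standard tool.
dup_mounts() {
    awk '{ count[$2]++ }
         END { for (m in count) if (count[m] > 1) print count[m], m }' \
        "${1:-/etc/mtab}"
}
```

Calling 'dup_mounts' between cycles of the test loop would show when the first
duplicate appears and how quickly the count grows.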

This is reproducible.  However, it was an accidental discovery, made while
trying to reproduce another issue that we have not yet found a way to reproduce
reliably.  What we saw there was: the mount command could simply hang the
terminal (even kill -9 on the mount PID had no effect), and sometimes umount
behaved the same way.  We know for sure that the network is not the problem,
because other test machines on the same network access the same server.
----------
Action by: mansing.li
Issue Registered

ISSUE TRACKER 27001 OPENED AS SEV 2

Comment 1 Larry Troan 2003-09-05 16:46:07 UTC
FROM ISSUE TRACKER...
 Event posted 09-02-2003 04:54pm by mansing.li with duration of 0.00       
Installed RHEL 3.0 Beta 2 on a zx2000 - Wilson's Peak (cfg103) - on Friday.
Sometime over the weekend, one of our NFS servers had some trouble.  cfg103
detected the problem and attempted to do something, which resulted in a kernel
panic.  Here is what I saw on the serial console:

[root@localhost env]# nfs: server 10.10.20.14 not responding, still trying
updfstab[19537]: Oops 8813272891392

Pid: 19537, comm:             updfstab
EIP is at strnlen [kernel] 0x20 (2.4.21-1.1931.2.399.ent)
psr : 0000101008026038 ifs : 8000000000000007 ip  : [<e0000000047f96e0>]    Not tainted
unat: 0000000000000000 pfs : 0000000000000592 rsc : 0000000000000003
rnat: 0000101008026038 bsps: e0000000047f8990 pr  : 80000000f5aa9ad9
ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c8a70033f
b0  : e0000000047fb0c0 b6  : e0000000047fb520 b7  : e0000000047fb080
f6  : 0fffbccccccccc8c00000 f7  : 0ffd9a200000000000000
f8  : 0ffff8000000000000000 f9  : 10002a000000000000000
r1  : e000000004c9bd00 r2  : e00000002f46810c r3  : e00000012f46810c
r8  : 0000000000636439 r9  : 0000000000000020 r10 : 0000000000000025
r11 : 0000000000000073 r12 : e00000001e25fe00 r13 : e00000001e258000
r14 : 0000000000000000 r15 : 000000000000004e r16 : 0000000000000073
r17 : ffffffffffffffff r18 : 0000000000000000 r19 : e00000002f4680b1
r20 : 0000000000000298 r21 : e0000000049405f8 r22 : e000000004b60548
r23 : 0000000000000073 r24 : 0000000000000073 r25 : e0000000047fb080
r26 : ffffffffffeba790 r27 : e0000000049408f0 r28 : 0000000000000270
r29 : e000000004940680 r30 : e000000004b60550 r31 : 6e6d6c6b6a696867

Call Trace:
[<e0000000044155c0>] sp=0xe00000001e25fa10 bsp=0xe00000001e259608 show_stack [kernel] 0x80
[<e000000004430410>] sp=0xe00000001e25fbd0 bsp=0xe00000001e2595d8 die [kernel] 0x1b0
[<e000000004451d70>] sp=0xe00000001e25fbd0 bsp=0xe00000001e259580 ia64_do_page_fault [kernel] 0x310
[<e00000000440e680>] sp=0xe00000001e25fc60 bsp=0xe00000001e259580 ia64_leave_kernel [kernel] 0x0
[<e0000000047f96e0>] sp=0xe00000001e25fe00 bsp=0xe00000001e259548 strnlen [kernel] 0x20
[<e0000000047fb0c0>] sp=0xe00000001e25fe00 bsp=0xe00000001e2594f0 vsnprintf [kernel] 0x780
[<e0000000047fb590>] sp=0xe00000001e25fe10 bsp=0xe00000001e259498 sprintf [kernel] 0x70
[<a0000000001c41b0>] sp=0xe00000001e25fe40 bsp=0xe00000001e259420 proc_print_scsidevice_Rsmp_7ba7cbc7 [scsi_mod] 0x390
[<a0000000001bb3b0>] sp=0xe00000001e25fe40 bsp=0xe00000001e2593b0 scsi_proc_info [scsi_mod] 0x170
[<e00000000457d240>] sp=0xe00000001e25fe50 bsp=0xe00000001e259330 proc_file_read [kernel] 0x3e0
[<e000000004511400>] sp=0xe00000001e25fe60 bsp=0xe00000001e2592b0 sys_read [kernel] 0x1c0
[<e00000000440e660>] sp=0xe00000001e25fe60 bsp=0xe00000001e2592b0 ia64_ret_from_syscall [kernel] 0x0
sym53c1010-66-0: detaching ...
sym53c1010-66-0: resetting chip
scsi : 1 host left.


[root@localhost env]# uname -a
Linux cfg103 2.4.21-1.1931.2.399.ent #1 SMP Wed Aug 20 15:23:44 EDT 2003 ia64 ia64 ia64 GNU/Linux

[root@localhost env]# mount
/dev/hda4 on / type ext2 (rw,errors=remount-ro)
none on /proc type proc (rw)
usbdevfs on /proc/bus/usb type usbdevfs (rw)
/dev/hda1 on /boot/efi type vfat (rw)
none on /dev/pts type devpts (rw,gid=5,mode=620)
none on /dev/shm type tmpfs (rw)
/dev/hda5 on /home type ext2 (rw,errors=remount-ro)
10.10.20.20:/iotests/linux on /iotests/linux type nfs (rw,addr=10.10.20.20)
10.10.20.14:/iotests/hpux on /iotests/hpux type nfs (rw,nosuid,nodev,addr=10.10.20.14)
10.10.20.14:/iotests/xpeak on /iotests/xpeak type nfs (rw,nosuid,nodev,addr=10.10.20.14)
10.10.20.14:/iotests/bin on /iotests/bin type nfs (rw,nosuid,nodev,addr=10.10.20.14)

[root@localhost env]# cat /etc/mtab
/dev/hda4 / ext2 rw,errors=remount-ro 0 0
none /proc proc rw 0 0
usbdevfs /proc/bus/usb usbdevfs rw 0 0
/dev/hda1 /boot/efi vfat rw 0 0
none /dev/pts devpts rw,gid=5,mode=620 0 0
none /dev/shm tmpfs rw 0 0
/dev/hda5 /home ext2 rw,errors=remount-ro 0 0
10.10.20.20:/iotests/linux /iotests/linux nfs rw,addr=10.10.20.20 0 0
10.10.20.14:/iotests/hpux /iotests/hpux nfs rw,nosuid,nodev,addr=10.10.20.14 0 0
10.10.20.14:/iotests/xpeak /iotests/xpeak nfs rw,nosuid,nodev,addr=10.10.20.14 0 0
10.10.20.14:/iotests/bin /iotests/bin nfs rw,nosuid,nodev,addr=10.10.20.14 0 0
[root@localhost env]#

------------------------------
Event posted 09-02-2003 04:58pm by mansing.li with duration of 0.00       
Larry, Sue, I think the NFS code, or the 'mount portion' of the file system, has
some defects.  This is very reproducible.  It may not have been reported by
other OEMs, but it is there.  I doubt this is specific to IPF.  On IA32, my
guess is that the client could hang. Mansing

Comment 2 Larry Troan 2003-09-12 16:19:14 UTC
FROM ISSUE TRACKER
Event posted 09-09-2003 06:17pm by ierickson with duration of 0.00       
Also reproduced on IA32....

any file over 2GB gets ignored by mkisofs.

Comment 3 Larry Troan 2003-09-12 21:07:07 UTC
FROM ISSUE TRACKER
Event posted 09-12-2003 04:59pm by Daryl with duration of 0.00        
Knocking issue down to High, because we think this problem -- while real -- may
only exist in the particular testing environment or setup and not be universal.

Severity set to: High

Comment 4 Larry Troan 2003-10-09 12:42:54 UTC
Need to understand from HP-WS if this is a critical problem or characteristic of
a specific test environment only. 

Comment 5 Ernie Petrides 2005-10-05 23:35:44 UTC
Closing due to lack of response.