On a zx2000 or zx6000, create a script that loops forever, unmounting and remounting several NFS mounts (3-4 in our case). E.g.:

    while true; do umount /iotests/bin; umount /iotests/xpeak; umount /iotests/xpeak; umount /iotests/linux; sleep 3; mount; mount /iotests/bin; mount /iotests/xpeak; mount /iotests/xpeak; mount /iotests/linux; mount; sleep 3; done

Leave that script running. Then, from another terminal, run a second script that loops forever doing an 'ls -lR /iotests'. E.g.:

    while true; do ls -lR /iotests; done

After a number of iterations (could be 20-30), the 'mount' command in the first script shows multiple entries for one or more of the NFS mounts; e.g. /iotests/linux appears several times. 'cat /etc/mtab' shows the multiple entries as well. The number of entries grows over time, and each /iotests/linux entry has to be unmounted separately.

This is reproducible. It was, however, an accidental discovery made while trying to reproduce another issue that we have not yet found a reliable way to reproduce. What we saw there: the mount command could simply hang the terminal (even 'kill -9' on the mount PID did not work), and umount sometimes behaved the same way. We know for sure the network is not the problem, because other test machines on the network are accessing the same server.

----------
Action by: mansing.li
Issue Registered
ISSUE TRACKER 27001 OPENED AS SEV 2
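The duplicate entries described in the report above can be counted instead of read off the 'mount' output by eye. The following is a minimal sketch, not part of the original report: `dup_mounts` is a hypothetical helper name, and it assumes the /etc/mtab layout in which the mount point is the second whitespace-separated field.

```shell
#!/bin/sh
# dup_mounts [FILE]
# Hypothetical helper (not from the original report).
# Prints "COUNT MOUNTPOINT" for every mount point that appears more than
# once in an mtab-format FILE. FILE defaults to /etc/mtab.
dup_mounts() {
    # Field 2 of each mtab line is the mount point; count occurrences,
    # then report only the mount points seen more than once.
    awk '{ seen[$2]++ }
         END { for (m in seen) if (seen[m] > 1) print seen[m], m }' \
        "${1:-/etc/mtab}" | sort -k2
}
```

Calling `dup_mounts` with no argument between iterations of the umount/mount loop would print nothing while the table is clean and one line per duplicated mount point once the problem appears.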
FROM ISSUE TRACKER...
Event posted 09-02-2003 04:54pm by mansing.li with duration of 0.00

Installed RHEL 3.0 Beta 2 on a zx2000 - Wilson's Peak (cfg103) - on Fri. Sometime over the weekend, one of our NFS servers had some trouble. cfg103 detected the problem and attempted to do something, which resulted in a kernel panic. Here is what I saw on the serial console:

[root@localhost env]# nfs: server 10.10.20.14 not responding, still trying
updfstab[19537]: Oops 8813272891392

Pid: 19537, comm: updfstab
EIP is at strnlen [kernel] 0x20 (2.4.21-1.1931.2.399.ent)
psr : 0000101008026038 ifs : 8000000000000007 ip : [<e0000000047f96e0>] Not tainted
unat: 0000000000000000 pfs : 0000000000000592 rsc : 0000000000000003
rnat: 0000101008026038 bsps: e0000000047f8990 pr : 80000000f5aa9ad9
ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c8a70033f
b0 : e0000000047fb0c0 b6 : e0000000047fb520 b7 : e0000000047fb080
f6 : 0fffbccccccccc8c00000 f7 : 0ffd9a200000000000000
f8 : 0ffff8000000000000000 f9 : 10002a000000000000000
r1 : e000000004c9bd00 r2 : e00000002f46810c r3 : e00000012f46810c
r8 : 0000000000636439 r9 : 0000000000000020 r10 : 0000000000000025
r11 : 0000000000000073 r12 : e00000001e25fe00 r13 : e00000001e258000
r14 : 0000000000000000 r15 : 000000000000004e r16 : 0000000000000073
r17 : ffffffffffffffff r18 : 0000000000000000 r19 : e00000002f4680b1
r20 : 0000000000000298 r21 : e0000000049405f8 r22 : e000000004b60548
r23 : 0000000000000073 r24 : 0000000000000073 r25 : e0000000047fb080
r26 : ffffffffffeba790 r27 : e0000000049408f0 r28 : 0000000000000270
r29 : e000000004940680 r30 : e000000004b60550 r31 : 6e6d6c6b6a696867

Call Trace:
[<e0000000044155c0>] sp=0xe00000001e25fa10 bsp=0xe00000001e259608 show_stack [kernel] 0x80
[<e000000004430410>] sp=0xe00000001e25fbd0 bsp=0xe00000001e2595d8 die [kernel] 0x1b0
[<e000000004451d70>] sp=0xe00000001e25fbd0 bsp=0xe00000001e259580 ia64_do_page_fault [kernel] 0x310
[<e00000000440e680>] sp=0xe00000001e25fc60 bsp=0xe00000001e259580 ia64_leave_kernel [kernel] 0x0
[<e0000000047f96e0>] sp=0xe00000001e25fe00 bsp=0xe00000001e259548 strnlen [kernel] 0x20
[<e0000000047fb0c0>] sp=0xe00000001e25fe00 bsp=0xe00000001e2594f0 vsnprintf [kernel] 0x780
[<e0000000047fb590>] sp=0xe00000001e25fe10 bsp=0xe00000001e259498 sprintf [kernel] 0x70
[<a0000000001c41b0>] sp=0xe00000001e25fe40 bsp=0xe00000001e259420 proc_print_scsidevice_Rsmp_7ba7cbc7 [scsi_mod] 0x390
[<a0000000001bb3b0>] sp=0xe00000001e25fe40 bsp=0xe00000001e2593b0 scsi_proc_info [scsi_mod] 0x170
[<e00000000457d240>] sp=0xe00000001e25fe50 bsp=0xe00000001e259330 proc_file_read [kernel] 0x3e0
[<e000000004511400>] sp=0xe00000001e25fe60 bsp=0xe00000001e2592b0 sys_read [kernel] 0x1c0
[<e00000000440e660>] sp=0xe00000001e25fe60 bsp=0xe00000001e2592b0 ia64_ret_from_syscall [kernel] 0x0
sym53c1010-66-0: detaching ...
sym53c1010-66-0: resetting chip
scsi : 1 host left.

[root@localhost env]# uname -a
Linux cfg103 2.4.21-1.1931.2.399.ent #1 SMP Wed Aug 20 15:23:44 EDT 2003 ia64 ia64 ia64 GNU/Linux
[root@localhost env]# mount
/dev/hda4 on / type ext2 (rw,errors=remount-ro)
none on /proc type proc (rw)
usbdevfs on /proc/bus/usb type usbdevfs (rw)
/dev/hda1 on /boot/efi type vfat (rw)
none on /dev/pts type devpts (rw,gid=5,mode=620)
none on /dev/shm type tmpfs (rw)
/dev/hda5 on /home type ext2 (rw,errors=remount-ro)
10.10.20.20:/iotests/linux on /iotests/linux type nfs (rw,addr=10.10.20.20)
10.10.20.14:/iotests/hpux on /iotests/hpux type nfs (rw,nosuid,nodev,addr=10.10.20.14)
10.10.20.14:/iotests/xpeak on /iotests/xpeak type nfs (rw,nosuid,nodev,addr=10.10.20.14)
10.10.20.14:/iotests/bin on /iotests/bin type nfs (rw,nosuid,nodev,addr=10.10.20.14)
[root@localhost env]# cat /etc/mtab
/dev/hda4 / ext2 rw,errors=remount-ro 0 0
none /proc proc rw 0 0
usbdevfs /proc/bus/usb usbdevfs rw 0 0
/dev/hda1 /boot/efi vfat rw 0 0
none /dev/pts devpts rw,gid=5,mode=620 0 0
none /dev/shm tmpfs rw 0 0
/dev/hda5 /home ext2 rw,errors=remount-ro 0 0
10.10.20.20:/iotests/linux /iotests/linux nfs rw,addr=10.10.20.20 0 0
10.10.20.14:/iotests/hpux /iotests/hpux nfs rw,nosuid,nodev,addr=10.10.20.14 0 0
10.10.20.14:/iotests/xpeak /iotests/xpeak nfs rw,nosuid,nodev,addr=10.10.20.14 0 0
10.10.20.14:/iotests/bin /iotests/bin nfs rw,nosuid,nodev,addr=10.10.20.14 0 0
[root@localhost env]#

------------------------------
Event posted 09-02-2003 04:58pm by mansing.li with duration of 0.00

Larry, Sue,

I think the NFS code, or the 'mount portion' of the file system, has some defects. This is very reproducible. It may not have been reported by other OEMs, but it is there. I doubt this is specific to IPF. On IA32, my guess is that the client could hang.

Mansing
FROM ISSUE TRACKER
Event posted 09-09-2003 06:17pm by ierickson with duration of 0.00

Also reproduced on IA32.... any file over 2GB gets ignored by mkisofs.
FROM ISSUE TRACKER
Event posted 09-12-2003 04:59pm by Daryl with duration of 0.00

Knocking issue down to High, because we think this problem -- while real -- may only exist in the particular testing environment or setup and not be universal.

Severity set to: High
Need to understand from HP-WS whether this is a critical problem or only a characteristic of a specific test environment.
Closing due to lack of response.