Description of problem:
Same as bug 432313: backgrounded NFS mounts die soon after the "service netfs start" command is issued.
https://bugzilla.redhat.com/show_bug.cgi?id=432313
Bugfix update: http://rhn.redhat.com/errata/RHBA-2008-0751.html

Version-Release number of selected component (if applicable):
RHEL4U6, RHEL4U7

How reproducible:
Always

Steps to Reproduce:
Steps to reproduce the problem with two RHEL4 boxes.

Test setup
========
A configuration in which three or more directories are mounted via NFS as background mounts through /etc/fstab is needed.

1. Take two up-to-date RHEL4 boxes (say system1 and system2).
2. One system (say system1) should export three or more NFS filesystems (by making appropriate entries in /etc/fstab and starting the NFS server).
/dev/sda1 /work1 ext3 defaults 1 2
/dev/sdb1 /work2 ext3 defaults 1 2
/dev/sdc1 /work3 ext3 defaults 1 2
Make entries in /etc/exports as below:
/work1 <system1'sIP>/<NETMASK>(rw,no_root_squash,sync)
/work2 <system1'sIP>/<NETMASK>(rw,no_root_squash,sync)
/work3 <system1'sIP>/<NETMASK>(rw,no_root_squash,sync)
3. The second system (system2) should mount the NFS filesystems exported by system1 through appropriate entries in /etc/fstab. Here are the entries in the /etc/fstab file:
<system1'sIP>:/work1 /work1 nfs rw,bg,intr,hard,wsize=32768,rsize=32768 0 0
<system1'sIP>:/work2 /work2 nfs rw,bg,intr,hard,wsize=32768,rsize=32768 0 0
<system1'sIP>:/work3 /work3 nfs rw,bg,intr,hard,wsize=32768,rsize=32768 0 0
4. When system1 is up and the NFS service is running with three or more exported filesystems mounted, system2 should successfully mount all of the NFS filesystems imported from system1.

To reproduce the problem with the above setup (try these steps WITH and WITHOUT the Red Hat fix/patch):
1. Bring down both systems.
2. Bring up system2 first so that the NFS mounts get backgrounded until system1 comes up. Note the output of "ps -elf | grep mount" at this stage.
3. Now bring up system1.
4. Check the status of the NFS mounts on system2.
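The status check in the last step can be scripted rather than read off "df" by eye. The following is only a sketch and not part of the original report: the function name unmounted_nfs is made up for illustration, and it assumes the client lists its NFS filesystems in /etc/fstab with type nfs or nfs4 and that /proc/mounts reflects the current mount state.

```shell
#!/bin/sh
# Hypothetical helper (name and interface are assumptions, not from the report):
# print the mount point of every NFS entry in an fstab-style file that does
# not appear as an NFS mount in a mounts-style file.
# Usage: unmounted_nfs [FSTAB] [MOUNTS]
unmounted_nfs() {
    fstab=${1:-/etc/fstab}
    mounts=${2:-/proc/mounts}
    awk '
        # Lines from the mounts file: remember mount points mounted as nfs/nfs4.
        FILENAME == ARGV[1] { if ($3 ~ /^nfs4?$/) mounted[$2] = 1; next }
        # Lines from the fstab file: skip comments and short lines.
        /^[ \t]*#/ || NF < 3 { next }
        # Report NFS entries whose mount point is not currently mounted.
        $3 ~ /^nfs4?$/ && !($2 in mounted) { print $2 }
    ' "$mounts" "$fstab"
}
```

Running unmounted_nfs on system2 after system1 has come back up prints one line per NFS mount point that is still missing; empty output would mean that every fstab NFS entry is mounted.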
How to verify the correctness of the fix:
==========================
Without the Red Hat fix, system2 should see at least one (the last entry) or more NFS filesystems in the unmounted state. If the Red Hat fix resolves the problem, system2 should see all of the imported NFS filesystems in the mounted state.
==================================================================================

Actual results:
The mount processes on the NFS client do not stay backgrounded until the NFS server exporting those filesystems becomes available. All of those commands get killed for some reason, and "mount -a" has to be issued to explicitly mount the filesystems once the exporting system becomes available.

Expected results:
If the Red Hat fix resolves the problem, system2 should see all of the imported NFS filesystems in the mounted state.

Additional info:
=============================================================================
-- System1 ( exp10fs1 ) NFS server --
[root@exp10fs1 ~]# uname -a
Linux exp10fs1 2.6.9-67.EL #1 SMP Wed Nov 7 13:43:35 EST 2007 ia64 ia64 ia64 GNU/Linux
[root@exp10fs1 ~]#
[root@exp10fs1 ~]# cat /etc/fstab
# This file is edited by fstab-sync - see 'man fstab-sync' for details
LABEL=/1 / ext3 defaults 1 1
LABEL=/boot/efi /boot/efi vfat defaults 0 0
none /dev/pts devpts gid=5,mode=620 0 0
none /dev/shm tmpfs defaults 0 0
none /proc proc defaults 0 0
none /sys sysfs defaults 0 0
LABEL=/var1 /var ext3 defaults 1 2
LABEL=/work1 /work1 ext3 defaults 1 2
LABEL=/work2 /work2 ext3 defaults 1 2
LABEL=/work3 /work3 ext3 defaults 1 2
LABEL=/work4 /work4 ext3 defaults 1 2
LABEL=SW-cciss/c0d0p3 swap swap defaults 0 0
[root@exp10fs1 ~]# df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/cciss/c0d0p2 30237648 7036436 21665212 25% /
/dev/cciss/c0d0p1 307032 6160 300872 3% /boot/efi
none 2064256 0 2064256 0% /dev/shm
/dev/cciss/c0d0p4 33968352 275780 31967060 1% /var
/dev/sdb1 17496684 77800 16530092 1% /work1
/dev/sdc1 17496684 77800 16530092 1% /work2
/dev/sdd1
17496684 77800 16530092 1% /work3 /dev/sde1 17496684 77800 16530092 1% /work4 [root@exp10fs1 ~]# cat /etc/exports /work1 System1-IP/Subnet-Mask(rw,no_root_squash,sync) /work2 System1-IP/Subnet-Mask(rw,no_root_squash,sync) /work3 System1-IP/Subnet-Mask(rw,no_root_squash,sync) [root@exp10fs1 ~]# [root@exp10fs1 ~]# cat /etc/hosts.allow # # hosts.allow This file describes the names of the hosts which are # allowed to use the local INET services, as decided # by the '/usr/sbin/tcpd' server. # ALL:ALL [root@exp10fs1 ~]# cat /etc/hosts.deny # # hosts.deny This file describes the names of the hosts which are # *not* allowed to use the local INET services, as decided # by the '/usr/sbin/tcpd' server. # # The portmap line is redundant, but it is left to remind you that # the new secure portmap uses hosts.deny and hosts.allow. In particular # you should know that NFS uses portmap! ALL:ALL [root@exp10fs1 ~]# exportfs /work1 System1-IP/Subnet-Mask /work2 System1-IP/Subnet-Mask /work3 System1-IP/Subnet-Mask [root@exp10fs1 ~]# [root@exp10fs1 ~]# chkconfig --list |grep nfs nfslock 0:off 1:off 2:off 3:on 4:on 5:on 6:off nfs 0:off 1:off 2:on 3:on 4:on 5:on 6:off [root@exp10fs1 ~]# [root@exp10fs1 ~]# [root@exp10fs1 ~]# service nfs status rpc.mountd (pid 11034) is running... nfsd (pid 11028 11027 11026 11025 11024 11023 11022 11021) is running... rpc.rquotad (pid 11017) is running... [root@exp10fs1 ~]# [root@exp10fs1 ~]# showmount Hosts on exp10fs1 System2-IP [root@exp10fs1 ~]# [root@exp10fs1 ~]# cat /etc/hosts # Do not remove the following line, or various programs # that require network functionality will fail. 
127.0.0.1 exp10fs1 exp10fs1 localhost.localdomain localhost # System1-IP exp10fs1 exp10fs1 System2-IP exp10fs2 exp10fs2 [root@exp10fs1 ~]# ========================================================================= -- System2 ( exp10fs2) NFS client -- [root@exp10fs2 ~]# uname -a Linux exp10fs2 2.6.9-67.EL #1 SMP Wed Nov 7 13:43:35 EST 2007 ia64 ia64 ia64 GNU/Linux [root@exp10fs2 ~]# [root@exp10fs2 ~]# cat /etc/fstab # This file is edited by fstab-sync - see 'man fstab-sync' for details LABEL=/1 / ext3 defaults 1 1 LABEL=/boot/efi /boot/efi vfat defaults 0 0 none /dev/pts devpts gid=5,mode=620 0 0 none /dev/shm tmpfs defaults 0 0 none /proc proc defaults 0 0 none /sys sysfs defaults 0 0 LABEL=/var1 /var ext3 defaults 1 2 System1-IP:/work1 /work1 nfs rw,bg,intr,hard,wsize=32768,rsize=32769 0 0 System1-IP:/work2 /work2 nfs rw,bg,intr,hard,wsize=32768,rsize=32769 0 0 System1-IP:/work3 /work3 nfs rw,bg,intr,hard,wsize=32768,rsize=32769 0 0 LABEL=SW-cciss/c0d0p4 swap swap defaults 0 0 [root@exp10fs2 ~]# df Filesystem 1K-blocks Used Available Use% Mounted on /dev/cciss/c0d0p2 38436528 7040852 29443208 20% / /dev/cciss/c0d0p1 307016 5872 301144 2% /boot/efi none 2583696 0 2583696 0% /dev/shm /dev/cciss/c0d0p3 25623668 298572 24023452 2% /var [root@exp10fs2 ~]# [root@exp10fs2 ~]# [root@exp10fs2 ~]# mount -a [root@exp10fs2 ~]# df Filesystem 1K-blocks Used Available Use% Mounted on /dev/cciss/c0d0p2 38436528 7040852 29443208 20% / /dev/cciss/c0d0p1 307016 5872 301144 2% /boot/efi none 2583696 0 2583696 0% /dev/shm /dev/cciss/c0d0p3 25623668 298572 24023452 2% /var System1-IP:/work1 17496704 77792 16530112 1% /work1 System1-IP:/work2 17496704 77792 16530112 1% /work2 System1-IP:/work3 17496704 77792 16530112 1% /work3 [root@exp10fs2 ~]# [root@exp10fs2 ~]# [root@exp10fs2 ~]# chkconfig --list portmap portmap 0:off 1:off 2:off 3:on 4:on 5:on 6:off [root@exp10fs2 ~]# [root@exp10fs2 ~]# service portmap status portmap (pid 10386) is running... 
[root@exp10fs2 ~]# [root@exp10fs2 ~]# reboot Broadcast message from root (pts/2) (Fri Aug 29 15:31:06 2008): The system is going down for reboot NOW! [root@exp10fs2 ~]# -- Reboot only System2 ( exp10fs2 ) while System1( exp10fs1) keeps online -- -- System2 reboot Done -- login as: root root@exp10fs2's password: [root@exp10fs2 ~]# [root@exp10fs2 ~]# [root@exp10fs2 ~]# [root@exp10fs2 ~]# [root@exp10fs2 ~]# [root@exp10fs2 ~]# [root@exp10fs2 ~]# [root@exp10fs2 ~]# [root@exp10fs2 ~]# df Filesystem 1K-blocks Used Available Use% Mounted on /dev/cciss/c0d0p2 38436528 7040812 29443248 20% / /dev/cciss/c0d0p1 307016 5872 301144 2% /boot/efi none 2583696 0 2583696 0% /dev/shm /dev/cciss/c0d0p3 25623668 298608 24023416 2% /var System1-IP:/work1 17496704 77792 16530112 1% /work1 System1-IP:/work2 17496704 77792 16530112 1% /work2 System1-IP:/work3 17496704 77792 16530112 1% /work3 [root@exp10fs2 ~]# --------------------------------------------------- After shutdown System1 ( exp10fs1 ), reboot system2. ------------------------------------------------------- [root@exp10fs2 ~]# reboot Broadcast message from root (pts/1) (Fri Aug 29 15:45:38 2008): The system is going down for reboot NOW! [root@exp10fs2 ~]# login as: root root@exp10fs2's password: [root@exp10fs2 ~]# [root@exp10fs2 ~]# [root@exp10fs2 ~]# [root@exp10fs2 ~]# df Filesystem 1K-blocks Used Available Use% Mounted on /dev/cciss/c0d0p2 38436528 7040812 29443248 20% / /dev/cciss/c0d0p1 307016 5872 301144 2% /boot/efi none 2583696 0 2583696 0% /dev/shm /dev/cciss/c0d0p3 25623668 298668 24023356 2% /var [root@exp10fs2 ~]# [root@exp10fs2 ~]# [root@exp10fs2 ~]# ps aux USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND root 1 1.2 0.0 3440 1488 ? S 15:47 0:02 init [5] root 2 0.0 0.0 0 0 ? S 15:47 0:00 [migration/0] root 3 0.0 0.0 0 0 ? SN 15:47 0:00 [ksoftirqd/0] root 4 0.0 0.0 0 0 ? S< 15:47 0:00 [events/0] root 5 0.0 0.0 0 0 ? S< 15:47 0:00 [khelper] root 6 0.0 0.0 0 0 ? S< 15:47 0:00 [kacpid] root 29 0.0 0.0 0 0 ? 
S< 15:47 0:00 [kblockd/0] root 30 0.0 0.0 0 0 ? S 15:47 0:00 [khubd] root 47 0.0 0.0 0 0 ? S 15:47 0:00 [pdflush] root 48 0.0 0.0 0 0 ? S 15:47 0:00 [pdflush] root 49 0.0 0.0 0 0 ? S 15:47 0:00 [kswapd0] root 50 0.0 0.0 0 0 ? S< 15:47 0:00 [aio/0] root 193 0.0 0.0 0 0 ? S 15:47 0:00 [kseriod] root 438 0.0 0.0 0 0 ? S 15:47 0:00 [scsi_eh_0] root 440 0.0 0.0 0 0 ? S 15:47 0:00 [scsi_eh_1] root 458 0.0 0.0 0 0 ? S 15:47 0:00 [kjournald] root 1075 0.0 0.0 0 0 ? S< 15:47 0:00 [kauditd] root 9619 0.0 0.0 3152 1264 ? S<s 15:47 0:00 udevd root 9861 0.0 0.0 0 0 ? S< 15:47 0:00 [kmpathd/0] root 9888 0.0 0.0 0 0 ? S 15:47 0:00 [kjournald] root 10368 0.0 0.0 4128 1376 ? Ss 15:48 0:00 syslogd -m 0 root 10372 0.0 0.0 3936 1008 ? Ss 15:48 0:00 klogd -x rpc 10384 0.0 0.0 4352 1392 ? Ss 15:48 0:00 portmap rpcuser 10404 0.0 0.0 4512 1824 ? Ss 15:48 0:00 rpc.statd root 10434 0.0 0.0 9904 1712 ? Ss 15:48 0:00 rpc.idmapd root 10518 0.0 0.0 2976 1264 ? Ss 15:48 0:00 /usr/sbin/acpid root 10549 0.1 0.1 65152 7104 ? Ss 15:48 0:00 cupsd root 10584 0.0 0.0 10464 3008 ? Ss 15:48 0:00 /usr/sbin/sshd root 10599 0.0 0.0 5808 2048 ? Ss 15:48 0:00 xinetd -stayalive -pidfile /var/run/xinetd.pid root 10618 0.0 0.0 21488 4784 ? Ss 15:48 0:00 sendmail: accepting connections smmsp 10628 0.0 0.0 15696 3776 ? Ss 15:48 0:00 sendmail: Queue runner@01:00:00 for /var/spool/clientmqueue root 10639 0.0 0.0 3904 1168 ? Ss 15:48 0:00 gpm -m /dev/input/mice -t imps2 htt 10670 0.0 0.0 3056 576 ? Ss 15:48 0:00 /usr/sbin/htt -retryonerror 0 htt 10671 0.0 0.1 19136 8160 ? S 15:48 0:00 htt_server -nodaemon canna 10683 1.2 0.5 29648 27360 ? Ss 15:48 0:01 /usr/sbin/cannaserver -syslog -u canna root 10695 0.0 0.0 55824 2224 ? Ss 15:48 0:00 crond xfs 10730 0.0 0.1 9696 5648 ? Ss 15:48 0:00 xfs -droppriv -daemon root 10749 0.0 0.0 4832 1312 ? Ss 15:48 0:00 /usr/sbin/atd root 10763 1.6 0.0 47184 1984 ? Ssl 15:48 0:01 /usr/sbin/salinfod -n dbus 10786 0.0 0.0 15744 2768 ? 
Ssl 15:48 0:00 dbus-daemon-1 --system root 10800 0.0 0.0 6128 2848 ? Ss 15:48 0:00 cups-config-daemon root 10811 0.2 0.1 9088 5248 ? Ss 15:48 0:00 hald root 10968 0.0 0.0 3024 1184 tty1 Ss+ 15:48 0:00 /sbin/mingetty tty1 root 10971 0.0 0.0 3024 1184 tty2 Ss+ 15:48 0:00 /sbin/mingetty tty2 root 10974 0.0 0.0 3024 1184 tty3 Ss+ 15:48 0:00 /sbin/mingetty tty3 root 10977 0.0 0.0 3024 1184 tty4 Ss+ 15:48 0:00 /sbin/mingetty tty4 root 10980 0.0 0.0 3024 1184 tty5 Ss+ 15:48 0:00 /sbin/mingetty tty5 root 10983 0.0 0.0 3024 1184 tty6 Ss+ 15:48 0:00 /sbin/mingetty tty6 root 10984 0.0 0.1 73104 7136 ? Ss 15:48 0:00 /usr/bin/gdm-binary -nodaemon root 11725 0.0 0.1 74944 6464 ? S 15:48 0:00 /usr/bin/gdm-binary -nodaemon root 11750 2.2 0.4 25392 21008 ? S 15:48 0:02 /usr/X11R6/bin/X :0 -audit 0 -auth /var/gdm/:0.Xauth -nolisten tcp vt7 gdm 11853 1.3 0.4 91856 24128 ? Ss 15:48 0:01 /usr/bin/gdmgreeter root 11859 0.1 0.1 14528 6016 ? Ss 15:50 0:00 sshd: root@pts/1 root 11861 0.1 0.0 55376 3392 pts/1 Ss 15:50 0:00 -bash root 11894 0.0 0.0 5872 1936 pts/1 R+ 15:50 0:00 ps aux [root@exp10fs2 ~]# ps aux|grep portmap rpc 10384 0.0 0.0 4352 1392 ? Ss 15:48 0:00 portmap root 11896 0.0 0.0 53584 1760 pts/1 S+ 15:50 0:00 grep portmap [root@exp10fs2 ~]# [root@exp10fs2 ~]# [root@exp10fs2 ~]# [root@exp10fs2 ~]# ps aux|grep mount root 11898 0.0 0.0 53584 1760 pts/1 S+ 15:50 0:00 grep mount [root@exp10fs2 ~]# [root@exp10fs2 ~]# No Mount's commands backgrounded, even though exported file systems from Systems1 are not yet available. --------------------------------------------------- Boot System1 ( exp10fs1 ) --------------------------------------------------- -- After boot up System1 -- [root@exp10fs2 ~]# ping exp10fs1 PING exp10fs1 (System1-IP) 56(84) bytes of data. 
64 bytes from exp10fs1 (System1-IP): icmp_seq=0 ttl=64 time=0.608 ms 64 bytes from exp10fs1 (System1-IP): icmp_seq=1 ttl=64 time=0.151 ms 64 bytes from exp10fs1 (System1-IP): icmp_seq=2 ttl=64 time=0.150 ms --- exp10fs1 ping statistics --- 3 packets transmitted, 3 received, 0% packet loss, time 1999ms rtt min/avg/max/mdev = 0.150/0.303/0.608/0.215 ms, pipe 2 [root@exp10fs2 ~]# [root@exp10fs2 ~]# [root@exp10fs2 ~]# ssh exp10fs1 root@exp10fs1's password: [root@exp10fs1 ~]# [root@exp10fs1 ~]# [root@exp10fs1 ~]# [root@exp10fs1 ~]# df Filesystem 1K-blocks Used Available Use% Mounted on /dev/cciss/c0d0p2 30237648 7036436 21665212 25% / /dev/cciss/c0d0p1 307032 6160 300872 3% /boot/efi none 2064256 0 2064256 0% /dev/shm /dev/cciss/c0d0p4 33968352 275852 31966988 1% /var /dev/sdb1 17496684 77800 16530092 1% /work1 /dev/sdc1 17496684 77800 16530092 1% /work2 /dev/sdd1 17496684 77800 16530092 1% /work3 /dev/sde1 17496684 77800 16530092 1% /work4 [root@exp10fs1 ~]# [root@exp10fs1 ~]# exportfs /work1 System1-IP/Subnet-Mask /work2 System1-IP/Subnet-Mask /work3 System1-IP/Subnet-Mask [root@exp10fs1 ~]# [root@exp10fs1 ~]# [root@exp10fs1 ~]# service nfs status rpc.mountd (pid 11036) is running... nfsd (pid 11030 11029 11028 11027 11026 11025 11024 11023) is running... rpc.rquotad (pid 11019) is running... [root@exp10fs1 ~]#exit Connection to exp10fs1 closed. [root@exp10fs2 ~]# [root@exp10fs2 ~]# ps -efH UID PID PPID C STIME TTY TIME CMD root 1 0 0 15:47 ? 00:00:02 init [5] root 2 1 0 15:47 ? 00:00:00 [migration/0] root 3 1 0 15:47 ? 00:00:00 [ksoftirqd/0] root 4 1 0 15:47 ? 00:00:00 [events/0] root 5 4 0 15:47 ? 00:00:00 [khelper] root 6 4 0 15:47 ? 00:00:00 [kacpid] root 29 4 0 15:47 ? 00:00:00 [kblockd/0] root 47 4 0 15:47 ? 00:00:00 [pdflush] root 48 4 0 15:47 ? 00:00:00 [pdflush] root 50 4 0 15:47 ? 00:00:00 [aio/0] root 1075 4 0 15:47 ? 00:00:00 [kauditd] root 9861 4 0 15:47 ? 00:00:00 [kmpathd/0] root 30 1 0 15:47 ? 00:00:00 [khubd] root 49 1 0 15:47 ? 
00:00:00 [kswapd0] root 193 1 0 15:47 ? 00:00:00 [kseriod] root 438 1 0 15:47 ? 00:00:00 [scsi_eh_0] root 440 1 0 15:47 ? 00:00:00 [scsi_eh_1] root 458 1 0 15:47 ? 00:00:00 [kjournald] root 9619 1 0 15:47 ? 00:00:00 udevd root 9888 1 0 15:47 ? 00:00:00 [kjournald] root 10368 1 0 15:48 ? 00:00:00 syslogd -m 0 root 10372 1 0 15:48 ? 00:00:00 klogd -x rpc 10384 1 0 15:48 ? 00:00:00 portmap rpcuser 10404 1 0 15:48 ? 00:00:00 rpc.statd root 10434 1 0 15:48 ? 00:00:00 rpc.idmapd root 10518 1 0 15:48 ? 00:00:00 /usr/sbin/acpid root 10549 1 0 15:48 ? 00:00:00 cupsd root 10584 1 0 15:48 ? 00:00:00 /usr/sbin/sshd root 11859 10584 0 15:50 ? 00:00:00 sshd: root@pts/1 root 11861 11859 0 15:50 pts/1 00:00:00 -bash root 11909 11861 0 16:00 pts/1 00:00:00 ps -efH root 10599 1 0 15:48 ? 00:00:00 xinetd -stayalive -pidfile /var/run/xinetd.pid root 10618 1 0 15:48 ? 00:00:00 sendmail: accepting connections smmsp 10628 1 0 15:48 ? 00:00:00 sendmail: Queue runner@01:00:00 for /var/spool/clientmqueue root 10639 1 0 15:48 ? 00:00:00 gpm -m /dev/input/mice -t imps2 htt 10670 1 0 15:48 ? 00:00:00 /usr/sbin/htt -retryonerror 0 htt 10671 10670 0 15:48 ? 00:00:00 htt_server -nodaemon canna 10683 1 0 15:48 ? 00:00:01 /usr/sbin/cannaserver -syslog -u canna root 10695 1 0 15:48 ? 00:00:00 crond xfs 10730 1 0 15:48 ? 00:00:00 xfs -droppriv -daemon root 10749 1 0 15:48 ? 00:00:00 /usr/sbin/atd root 10763 1 0 15:48 ? 00:00:01 /usr/sbin/salinfod -n dbus 10786 1 0 15:48 ? 00:00:00 dbus-daemon-1 --system root 10800 1 0 15:48 ? 00:00:00 cups-config-daemon root 10811 1 0 15:48 ? 00:00:00 hald root 10968 1 0 15:48 tty1 00:00:00 /sbin/mingetty tty1 root 10971 1 0 15:48 tty2 00:00:00 /sbin/mingetty tty2 root 10974 1 0 15:48 tty3 00:00:00 /sbin/mingetty tty3 root 10977 1 0 15:48 tty4 00:00:00 /sbin/mingetty tty4 root 10980 1 0 15:48 tty5 00:00:00 /sbin/mingetty tty5 root 10983 1 0 15:48 tty6 00:00:00 /sbin/mingetty tty6 root 10984 1 0 15:48 ? 00:00:00 /usr/bin/gdm-binary -nodaemon root 11725 10984 0 15:48 ? 
00:00:00 /usr/bin/gdm-binary -nodaemon root 11750 11725 0 15:48 ? 00:00:02 /usr/X11R6/bin/X :0 -audit 0 -auth /var/gdm/:0.Xauth -nolisten tcp vt7 gdm 11853 11725 0 15:48 ? 00:00:01 /usr/bin/gdmgreeter [root@exp10fs2 ~]# [root@exp10fs2 ~]# date Fri Aug 29 16:010 JST 2008 [root@exp10fs2 ~]# [root@exp10fs2 ~]# ps -ef|grep mount root 11922 11861 0 16:01 pts/1 00:00:00 grep mount [root@exp10fs2 ~]# [root@exp10fs2 ~]# date Fri Aug 29 16:01:55 JST 2008 [root@exp10fs2 ~]# date;df Fri Aug 29 16:08:00 JST 2008 Filesystem 1K-blocks Used Available Use% Mounted on /dev/cciss/c0d0p2 38436528 7040812 29443248 20% / /dev/cciss/c0d0p1 307016 5872 301144 2% /boot/efi none 2583696 0 2583696 0% /dev/shm /dev/cciss/c0d0p3 25623668 298668 24023356 2% /var [root@exp10fs2 ~]# [root@exp10fs2 patch]# ls XCV3.X-IA64-1000763010.tar.gz [root@exp10fs2 patch]# [root@exp10fs2 patch]# [root@exp10fs2 patch]# tar zxvf XCV3.X-IA64-1000763010.tar.gz XCV3.X-IA64-1000763010/ XCV3.X-IA64-1000763010/manifest XCV3.X-IA64-1000763010/install_IA64 XCV3.X-IA64-1000763010/README XCV3.X-IA64-1000763010/RPMS/ XCV3.X-IA64-1000763010/RPMS/util-linux-2.12a-20.el4.ia64.rpm XCV3.X-IA64-1000763010/RPMS/util-linux-debuginfo-2.12a-20.el4.ia64.rpm XCV3.X-IA64-1000763010/PatchDetails XCV3.X-IA64-1000763010/src/ XCV3.X-IA64-1000763010/src/util-linux-2.12a-20.el4.src.rpm [root@exp10fs2 patch]# ls XCV3.X-IA64-1000763010 XCV3.X-IA64-1000763010.tar.gz [root@exp10fs2 patch]# [root@exp10fs2 patch]# [root@exp10fs2 patch]# cd XCV3.X-IA64-1000763010 [root@exp10fs2 XCV3.X-IA64-1000763010]# ls install_IA64 manifest PatchDetails README RPMS src [root@exp10fs2 XCV3.X-IA64-1000763010]# [root@exp10fs2 XCV3.X-IA64-1000763010]# cd RPMS [root@exp10fs2 RPMS]# ls util-linux-2.12a-20.el4.ia64.rpm util-linux-debuginfo-2.12a-20.el4.ia64.rpm [root@exp10fs2 RPMS]# [root@exp10fs2 RPMS]# rpm -Uvh util-linux-2.12a-20.el4.ia64.rpm warning: util-linux-2.12a-20.el4.ia64.rpm: V3 DSA signature: NOKEY, key ID d7265960 Preparing... 
########################################### [100%] 1:util-linux ########################################### [100%] [root@exp10fs2 RPMS]# [root@exp10fs2 RPMS]# [root@exp10fs2 RPMS]# [root@exp10fs2 RPMS]# [root@exp10fs2 RPMS]# rpm -aq|grep util-linux util-linux-2.12a-20.el4 [root@exp10fs2 RPMS]# --------------------------------------------------- After shutdown System1 ( exp10fs1 ), reboot system2 --------------------------------------------------- [root@exp10fs2 RPMS]# ping exp10fs1 PING exp10fs1 (System1-IP) 56(84) bytes of data. From exp10fs2 (System2-IP) icmp_seq=1 Destination Host Unreachable From exp10fs2 (System2-IP) icmp_seq=2 Destination Host Unreachable From exp10fs2 (System2-IP) icmp_seq=3 Destination Host Unreachable From exp10fs2 (System2-IP) icmp_seq=5 Destination Host Unreachable From exp10fs2 (System2-IP) icmp_seq=6 Destination Host Unreachable From exp10fs2 (System2-IP) icmp_seq=7 Destination Host Unreachable --- exp10fs1 ping statistics --- 9 packets transmitted, 0 received, +6 errors, 100% packet loss, time 8001ms , pipe 4 [root@exp10fs2 RPMS]# [root@exp10fs2 RPMS]# df Filesystem 1K-blocks Used Available Use% Mounted on /dev/cciss/c0d0p2 38436528 7040816 29443244 20% / /dev/cciss/c0d0p1 307016 5872 301144 2% /boot/efi none 2583696 0 2583696 0% /dev/shm /dev/cciss/c0d0p3 25623668 310300 24011724 2% /var [root@exp10fs2 RPMS]# [root@exp10fs2 RPMS]# reboot Broadcast message from root (pts/1) (Fri Aug 29 16:22:31 2008): The system is going down for reboot NOW! [root@exp10fs2 RPMS]# login as: root root@exp10fs2's password: [root@exp10fs2 ~]# [root@exp10fs2 ~]# --------------------------------------------------- Reboot System1 ( exp10fs1 ) --------------------------------------------------- [root@exp10fs2 ~]# ping exp10fs1 PING exp10fs1 (System1-IP) 56(84) bytes of data. 
64 bytes from exp10fs1 (System1-IP): icmp_seq=0 ttl=64 time=1.18 ms 64 bytes from exp10fs1 (System1-IP): icmp_seq=1 ttl=64 time=0.151 ms 64 bytes from exp10fs1 (System1-IP): icmp_seq=2 ttl=64 time=0.150 ms 64 bytes from exp10fs1 (System1-IP): icmp_seq=3 ttl=64 time=0.150 ms 64 bytes from exp10fs1 (System1-IP): icmp_seq=4 ttl=64 time=0.150 ms 64 bytes from exp10fs1 (System1-IP): icmp_seq=5 ttl=64 time=0.149 ms --- exp10fs1. ping statistics --- 6 packets transmitted, 6 received, 0% packet loss, time 5000ms rtt min/avg/max/mdev = 0.149/0.322/1.185/0.386 ms, pipe 2 [root@exp10fs2 ~]# [root@exp10fs2 ~]# [root@exp10fs2 ~]# [root@exp10fs2 ~]# mount /dev/cciss/c0d0p2 on / type ext3 (rw) none on /proc type proc (rw) none on /sys type sysfs (rw) none on /dev/pts type devpts (rw,gid=5,mode=620) usbfs on /proc/bus/usb type usbfs (rw) /dev/cciss/c0d0p1 on /boot/efi type vfat (rw) none on /dev/shm type tmpfs (rw) /dev/cciss/c0d0p3 on /var type ext3 (rw) none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw) sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw) [root@exp10fs2 ~]# df Filesystem 1K-blocks Used Available Use% Mounted on /dev/cciss/c0d0p2 38436528 7040816 29443244 20% / /dev/cciss/c0d0p1 307016 5872 301144 2% /boot/efi none 2583696 0 2583696 0% /dev/shm /dev/cciss/c0d0p3 25623668 308556 24013468 2% /var [root@exp10fs2 ~]# [root@exp10fs2 ~]# [root@exp10fs2 ~]# ps -ef|grep portmap rpc 10384 1 0 16:25 ? 
00:00:00 portmap root 11900 11847 0 16:42 pts/1 00:00:00 grep portmap [root@exp10fs2 ~]# ps -ef|grep mount root 11902 11847 0 16:42 pts/1 00:00:00 grep mount [root@exp10fs2 ~]# [root@exp10fs2 ~]# ssh exp10fs1 root@exp10fs1's password: [root@exp10fs1 ~]# [root@exp10fs1 ~]# [root@exp10fs1 ~]# df Filesystem 1K-blocks Used Available Use% Mounted on /dev/cciss/c0d0p2 30237648 7036436 21665212 25% / /dev/cciss/c0d0p1 307032 6160 300872 3% /boot/efi none 2064256 0 2064256 0% /dev/shm /dev/cciss/c0d0p4 33968352 275920 31966920 1% /var /dev/sdb1 17496684 77800 16530092 1% /work1 /dev/sdc1 17496684 77800 16530092 1% /work2 /dev/sdd1 17496684 77800 16530092 1% /work3 /dev/sde1 17496684 77800 16530092 1% /work4 [root@exp10fs1 ~]# exportfs /work1 System1-IP/Subnet-Mask /work2 System1-IP/Subnet-Mask /work3 System1-IP/Subnet-Mask [root@exp10fs1 ~]# [root@exp10fs1 ~]# showmount Hosts on exp10fs1: System2-IP [root@exp10fs1 ~]# [root@exp10fs1 ~]# exit logout Connection to exp10fs1 closed. [root@exp10fs2 ~]# [root@exp10fs2 ~]# mount -a [root@exp10fs2 ~]# df Filesystem 1K-blocks Used Available Use% Mounted on /dev/cciss/c0d0p2 38436528 7040816 29443244 20% / /dev/cciss/c0d0p1 307016 5872 301144 2% /boot/efi none 2583696 0 2583696 0% /dev/shm /dev/cciss/c0d0p3 25623668 308556 24013468 2% /var System1-IP:/work1 17496704 77792 16530112 1% /work1 System1-IP:/work2 17496704 77792 16530112 1% /work2 System1-IP:/work3 17496704 77792 16530112 1% /work3 [root@exp10fs2 ~]#
There's a lot of info in this bug, but I'm afraid I'm not clear on what the actual problem is. Are you saying that the patch from bug 432313 did not fix this problem for you?
Yes, the patch did not fix the reported problem.
Created attachment 316751
patch -- fix dup2 args in detach_terminal, close original fd

Thanks. I think I see the problem... Could you test the attached patch and let me know whether it fixes it? It should apply cleanly to the latest util-linux SRPM.
Hi, I looked into the initlog and mount code and added debugging statements, assuming the problem was something different. In the end I arrived at the same fix and tested it here. It is working fine.

INITLOG : PID = 22826 :: PPID : 22813 :::: initlog -q -c mount -a -t nfs,nfs4
INITLOG :forkCommand() quiet 1 : ourpid : 22826 : PID = 22827 :: PPID : 22826 pid : 0
MOUNT :: PID = 22827 :: PPID : 22826 ::::: mount -a -t nfs,nfs4
INITLOG :forkCommand() quiet 1 : ourpid : 22826 : PID = 22826 :: PPID : 22813 pid : 22827
MOUNT : mount Exiting : PID = 22827 :: PPID : 22826
INITLOG :runCommand() IF PART PID : 22826 :: PPID : 22813 : x 0 reexec 0 quiet 1 debug 0 pid : 22827
INITLOG : initlog :: Exiting : PID = 22826 :: PPID : 22813
INITLOG : PID = 22846 :: PPID : 22813 :::: initlog -q -n /etc/rc.d/init.d/netfs -s Mounting NFS filesystems: -e 1
INITLOG : initlog :: Exiting : PID = 22846 :: PPID : 22813
INITLOG : PID = 22848 :: PPID : 22813 :::: initlog -q -c mount -a -t nonfs,nfs4,smbfs,cifs,ncpfs,gfs
INITLOG :forkCommand() quiet 1 : ourpid : 22848 : PID = 22849 :: PPID : 22848 pid : 0
MOUNT :: PID = 22849 :: PPID : 22848 ::::: mount -a -t nonfs,nfs4,smbfs,cifs,ncpfs,gfs
INITLOG :forkCommand() quiet 1 : ourpid : 22848 : PID = 22848 :: PPID : 22813 pid : 22849
MOUNT : mount Exiting : PID = 22849 :: PPID : 22848
INITLOG :runCommand() IF PART PID : 22848 :: PPID : 22813 : x 0 reexec 0 quiet 1 debug 0 pid : 22849
INITLOG : initlog :: Exiting : PID = 22848 :: PPID : 22813
INITLOG : PID = 22856 :: PPID : 22813 :::: initlog -q -n /etc/rc.d/init.d/netfs -s Mounting other filesystems: -e 1
INITLOG : initlog :: Exiting : PID = 22856 :: PPID : 22813
MOUNT : mount Exiting : PID = 22840 :: PPID : 1
MOUNT : mount Exiting : PID = 22841 :: PPID : 1
MOUNT : mount Exiting : PID = 22839 :: PPID : 1
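In the debug output above, the last three mount processes report PPID 1, which is what one would expect from background mounts that have detached and been reparented to init. A rough way to look for such processes on a client, without "grep mount" matching its own command line, is sketched below. The function name backgrounded_mounts is made up for illustration, and the sketch assumes input in the "pid ppid args" layout produced by "ps -eo pid,ppid,args".

```shell
#!/bin/sh
# Hypothetical helper (not from the report): read "pid ppid args" lines on
# stdin and print only the mount processes whose parent is init (PID 1).
backgrounded_mounts() {
    # $2 is the parent PID; $3 is the first word of the command line,
    # which may be "mount" or a full path ending in "/mount".
    awk '$2 == 1 && $3 ~ /(^|\/)mount$/ { print }'
}

# Example invocation on a live system:
#   ps -eo pid,ppid,args | backgrounded_mounts
```

No output while the server is still down would match the symptom described in this report, where no mount command remains in the background.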
Sounds good. I'll try to make sure we get this into 4.8.
Hi Jeff, when will this fix be available, and in which version of the util-linux SRPM? Please update us on this. --Raghu
Currently, this is on the 4.8 proposed list and I expect that we'll be able to get a fixed package into RHEL4.8 with this patch. If you need something sooner, then you'll need to escalate a support case. If you do open a support case, then be sure to reference this BZ so that they're aware that this is a known bug.
Committed in util-linux-2.12a-21.el4
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2009-0981.html