Bug 1121451
Summary: | [start-statd] start-statd start fail, when "rpc.statd dead but pid file exists" | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 6 | Reporter: | JianHong Yin <jiyin> |
Component: | nfs-utils | Assignee: | Steve Dickson <steved> |
Status: | CLOSED WONTFIX | QA Contact: | Yongcheng Yang <yoyang> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 6.6 | CC: | eguan, fs-qe, jiyin, steved, swhiteho, yoyang |
Target Milestone: | rc | Keywords: | Reopened, TestBlocker |
Target Release: | --- | Flags: | jiyin:
needinfo-
|
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2015-04-21 20:54:27 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 1164899 |
Description
JianHong Yin
2014-07-21 03:09:32 UTC
Got the reason: Jul 22 03:06:21 ibm-z10-67 rpc.statd[42160]: Opening /var/run/rpc.statd.pid failed: Permission denied but do not known why Permission denied, and how to reproduce step by step,, :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: :: [ LOG ] :: /CoreOS/nfs-utils/Regression/bz712438-nfsstat-m-does-not-display-nfs4-mounts/Test :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: [03:06:21 root@ ~~]# service nfslock status rpc.statd dead but pid file exists -------------------------------------------------------------------------------- [03:06:21 root@ ~~]# rpcinfo -s program version(s) netid(s) service owner 100000 2,3,4 local,udp,tcp,udp6,tcp6 portmapper superuser 100011 2,1 tcp,udp rquotad superuser 100005 3,2,1 tcp6,udp6,tcp,udp mountd superuser 100003 4,3,2 udp6,tcp6,udp,tcp nfs superuser 100227 3,2 udp6,tcp6,udp,tcp nfs_acl superuser 100021 4,3,1 tcp6,udp6,tcp,udp nlockmgr superuser -------------------------------------------------------------------------------- [03:06:21 root@ ~~]# mount -t nfs -o vers=3 127.0.0.1:/tmp /bz712438 mount.nfs: rpc.statd is not running but is required for remote locking. mount.nfs: Either use '-o nolock' to keep locks local, or start statd. mount.nfs: an incorrect mount option was specified :: [ FAIL ] :: Mounting with nfsv3 (Expected 0, got 32) -------------------------------------------------------------------------------- [03:06:21 root@ ~~]# tail /var/log/messages Jul 22 03:06:01 ibm-z10-67 logger: /usr/bin/rhts-test-runner.sh report_finish Jul 22 03:06:17 ibm-z10-67 logger: /usr/bin/rhts-test-runner.sh rhts-test-update 127.0.0.1:7092 22968827 start nfs-utils-CoreOS-nfs-utils-Regression-bz712438-nfsstat-m-does-not-display-nfs4-mounts Jul 22 03:06:17 ibm-z10-67 python: 07/22/14 03:06:17 testID:22968827 start: Jul 22 03:06:18 ibm-z10-67 rpc.mountd[41997]: Version 1.2.3 starting Jul 22 03:06:18 ibm-z10-67 kernel: NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory Jul 22 03:06:18 ibm-z10-67 kernel: NFSD: starting 90-second grace period Jul 22 03:06:21 ibm-z10-67 rpc.statd[42160]: Version 1.2.3 starting Jul 22 03:06:21 ibm-z10-67 rpc.statd[42160]: Flags: TI-RPC Jul 22 03:06:21 ibm-z10-67 rpc.statd[42160]: Opening /var/run/rpc.statd.pid failed: Permission denied How do I get access to these tests? /CoreOS/nfs-utils/Regression/... (In reply to Steve Dickson from comment #5) > How do I get access to these tests? > > /CoreOS/nfs-utils/Regression/... I am investigating the real reproduce condition.. Now the reproducer /CoreOS/nfs-utils/Regression/bz1121451-start-statd-test can work fine. After follow step, will reproduce the issue [07:15:06 root@ ~~]# rpc.statd; sleep 0.5; rpc.mountd -------------------------------------------------------------------------------- [07:15:07 root@ ~~]# rpcinfos program version(s) netid(s) service owner 100000 2,3,4 local,udp,tcp,udp6,tcp6 portmapper superuser 100024 1 tcp6,udp6,tcp,udp status 29 100005 3,2,1 tcp6,udp6,tcp,udp mountd superuser :: [ PASS ] :: Running 'rpcinfos' (Expected 0, got 0) -------------------------------------------------------------------------------- [07:15:07 root@ ~~]# killall rpc.statd -------------------------------------------------------------------------------- [07:15:07 root@ ~~]# killall rpc.mountd mount.nfs will fail with follow msg: mount.nfs: rpc.statd is not running but is required for remote locking. mount.nfs: Either use '-o nolock' to keep locks local, or start statd. mount.nfs: an incorrect mount option was specified (In reply to Yin.JianHong from comment #0) > Description of problem: > [10:41:23 root@ ~~]# service nfslock status > rpc.statd dead but pid file exists > ----------------------------------------------------------------------------- > --- > [10:41:23 root@ ~~]# mount -t nfs -o vers=3 127.0.0.1:/tmp /bz712438 > mount.nfs: rpc.statd is not running but is required for remote locking. > mount.nfs: Either use '-o nolock' to keep locks local, or start statd. > mount.nfs: an incorrect mount option was specified > :: [ FAIL ] :: Mounting with nfsv3 (Expected 0, got 32) > > Version-Release number of selected component (if applicable): > RHEL-6.x > > How reproducible: > always > > Steps to Reproduce: > 1. kill rpc.statd, and the status is "rpc.statd dead but pid file exists" > 2. mount -t nfs -o vers=3 $serv:$expdir $nfsmp > 3. > > Actual results: > [10:41:23 root@ ~~]# mount -t nfs -o vers=3 127.0.0.1:/tmp /bz712438 > mount.nfs: rpc.statd is not running but is required for remote locking. > mount.nfs: Either use '-o nolock' to keep locks local, or start statd. > mount.nfs: an incorrect mount option was specified > :: [ FAIL ] :: Mounting with nfsv3 (Expected 0, got 32) This seems to work with the latest nfs-utils rhel66# service nfslock start Starting NFS statd: [ OK ] rhel66# ps ax | grep statd 3844 ? Ss 0:00 rpc.statd 3849 pts/1 S+ 0:00 grep statd rhel66# kill 3844 rhel66# service nfslock status rpc.statd dead but pid file exists rhel66# mount redhat:/home /mnt/home -o v3 rhel66# ps ax | grep statd 3868 ? Ss 0:00 rpc.statd --no-notify 3870 pts/1 S+ 0:00 grep statd Please check your script... This has not been fixed in the latest RHEL-6 (In reply to Steve Dickson from comment #14) > (In reply to Yin.JianHong from comment #0) > > Description of problem: > > [10:41:23 root@ ~~]# service nfslock status > > rpc.statd dead but pid file exists > > ----------------------------------------------------------------------------- > > --- > > [10:41:23 root@ ~~]# mount -t nfs -o vers=3 127.0.0.1:/tmp /bz712438 > > mount.nfs: rpc.statd is not running but is required for remote locking. > > mount.nfs: Either use '-o nolock' to keep locks local, or start statd. > > mount.nfs: an incorrect mount option was specified > > :: [ FAIL ] :: Mounting with nfsv3 (Expected 0, got 32) > > > > Version-Release number of selected component (if applicable): > > RHEL-6.x > > > > How reproducible: > > always > > > > Steps to Reproduce: > > 1. kill rpc.statd, and the status is "rpc.statd dead but pid file exists" > > 2. mount -t nfs -o vers=3 $serv:$expdir $nfsmp > > 3. > > > > Actual results: > > [10:41:23 root@ ~~]# mount -t nfs -o vers=3 127.0.0.1:/tmp /bz712438 > > mount.nfs: rpc.statd is not running but is required for remote locking. > > mount.nfs: Either use '-o nolock' to keep locks local, or start statd. > > mount.nfs: an incorrect mount option was specified > > :: [ FAIL ] :: Mounting with nfsv3 (Expected 0, got 32) > > This seems to work with the latest nfs-utils > > rhel66# service nfslock start > Starting NFS statd: [ OK ] > rhel66# ps ax | grep statd > 3844 ? Ss 0:00 rpc.statd > 3849 pts/1 S+ 0:00 grep statd > rhel66# kill 3844 > rhel66# service nfslock status > rpc.statd dead but pid file exists > rhel66# mount redhat:/home /mnt/home -o v3 > rhel66# ps ax | grep statd > 3868 ? Ss 0:00 rpc.statd --no-notify > 3870 pts/1 S+ 0:00 grep statd > > Please check your script... Need to kill both rpc.statd and rpc.mountd [root@hp-dl385pg8-10 ~]# service nfslock restart Stopping NFS locking: [ OK ] Stopping NFS statd: [ OK ] Starting NFS statd: [ OK ] [root@hp-dl385pg8-10 ~]# ps ax | grep statd 24738 ? Ss 0:00 rpc.statd 24742 pts/2 S+ 0:00 grep statd [root@hp-dl385pg8-10 ~]# kill 24738 [root@hp-dl385pg8-10 ~]# ps ax | grep mountd 24634 ? Ss 0:00 rpc.mountd 24744 pts/2 S+ 0:00 grep mountd [root@hp-dl385pg8-10 ~]# kill 24634 [root@hp-dl385pg8-10 ~]# service nfslock status rpc.statd dead but pid file exists [root@hp-dl385pg8-10 ~]# mount -t nfs -o vers=3 127.0.0.1:/export/ /mnt/mnt_test/ mount.nfs: requested NFS version or transport protocol is not supported [root@hp-dl385pg8-10 ~]# echo $? 32 [root@hp-dl385pg8-10 ~]# ps ax | grep statd 24768 ? Ss 0:00 rpc.statd --no-notify 24770 pts/2 S+ 0:00 grep statd [root@hp-dl385pg8-10 ~]# ps ax | grep mountd 24772 pts/2 S+ 0:00 grep mountd [root@hp-dl385pg8-10 ~]# mount -vvv -t nfs -o vers=3 127.0.0.1:/export/ /mnt/mnt_test/ mount: fstab path: "/etc/fstab" mount: mtab path: "/etc/mtab" mount: lock path: "/etc/mtab~" mount: temp path: "/etc/mtab.tmp" mount: UID: 0 mount: eUID: 0 mount: spec: "127.0.0.1:/export/" mount: node: "/mnt/mnt_test/" mount: types: "nfs" mount: opts: "vers=3" final mount options: 'vers=3' mount: external mount: argv[0] = "/sbin/mount.nfs" mount: external mount: argv[1] = "127.0.0.1:/export/" mount: external mount: argv[2] = "/mnt/mnt_test/" mount: external mount: argv[3] = "-v" mount: external mount: argv[4] = "-o" mount: external mount: argv[5] = "rw,vers=3" mount.nfs: timeout set for Mon Mar 23 17:21:59 2015 mount.nfs: trying text-based options 'vers=3,addr=127.0.0.1' mount.nfs: prog 100003, trying vers=3, prot=6 mount.nfs: trying 127.0.0.1 prog 100003 vers 3 prot TCP port 2049 mount.nfs: prog 100005, trying vers=3, prot=17 mount.nfs: portmap query retrying: RPC: Program not registered mount.nfs: prog 100005, trying vers=3, prot=6 mount.nfs: portmap query failed: RPC: Program not registered mount.nfs: requested NFS version or transport protocol is not supported [root@hp-dl385pg8-10 ~]# (In reply to Yongcheng Yang from comment #16)> > > Description of problem: > > Please check your script... > Need to kill both rpc.statd and rpc.mountd > [root@hp-dl385pg8-10 ~]# mount -t nfs -o vers=3 127.0.0.1:/export/ > /mnt/mnt_test/ > mount.nfs: requested NFS version or transport protocol is not supported Its is expected behaviour. Killing rpc.statd and rpc.mountd is going to cause a loopback mount to fail. I'm not sure what the expectations are but there is no problem here... (In reply to Steve Dickson from comment #17) > (In reply to Yongcheng Yang from comment #16)> > > Description of problem: > > > Please check your script... > > Need to kill both rpc.statd and rpc.mountd > > [root@hp-dl385pg8-10 ~]# mount -t nfs -o vers=3 127.0.0.1:/export/ > > /mnt/mnt_test/ > > mount.nfs: requested NFS version or transport protocol is not supported > Its is expected behaviour. Killing rpc.statd and rpc.mountd is > going to cause a loopback mount to fail. I'm not sure what the > expectations are but there is no problem here... Sorry I mistake the reproducer, need to restart the nfs service after killing both rpc.statd and rpc.mountd, then try mount, it should success but failed. I edit a simple reproducer, and it can 100% reproduced in rhe6, and can't reproduced at rhel7: [root@hp-dl385pg8-10 ~]# cat 11214512.sh service nfs stop > /dev/null service nfslock stop > /dev/null service rpcbind restart > /dev/null rpc.statd kill `ps ax | grep [r]pc.statd | awk '{print $1}'` 2>/dev/null kill `ps ax | grep [r]pc.mountd | awk '{print $1}'` 2>/dev/null service nfs restart > /dev/null service nfslock status mount -t nfs -o vers=3 127.0.0.1:/export/ /mnt/mnt_test/ ps ax | grep statd umount /mnt/mnt_test/ 2>/dev/null tail /var/log/messages [root@hp-dl385pg8-10 ~]# Following is the reproduce logs by hand: [root@hp-dl385pg8-10 ~]# service rpcbind restart Stopping rpcbind: [ OK ] Starting rpcbind: [ OK ] [root@hp-dl385pg8-10 ~]# rpc.statd [root@hp-dl385pg8-10 ~]# rpc.mountd [root@hp-dl385pg8-10 ~]# ps ax | grep rpc.statd 5720 ? Ss 0:00 rpc.statd 5725 pts/2 S+ 0:00 grep rpc.statd [root@hp-dl385pg8-10 ~]# kill 5720 [root@hp-dl385pg8-10 ~]# ps ax | grep rpc.mountd 5608 ? Ss 0:00 rpc.mountd 5723 ? Ss 0:00 rpc.mountd 5727 pts/2 S+ 0:00 grep rpc.mountd [root@hp-dl385pg8-10 ~]# kill 5608 5723 [root@hp-dl385pg8-10 ~]# service nfs restart Shutting down NFS daemon: [ OK ] Shutting down NFS mountd: [FAILED] Shutting down NFS quotas: [ OK ] Shutting down NFS services: [ OK ] Shutting down RPC idmapd: [ OK ] Starting NFS services: [ OK ] Starting NFS quotas: [ OK ] Starting NFS mountd: [ OK ] Starting NFS daemon: [ OK ] Starting RPC idmapd: [ OK ] [root@hp-dl385pg8-10 ~]# service nfslock status rpc.statd dead but pid file exists [root@hp-dl385pg8-10 ~]# mount -t nfs -o vers=3 127.0.0.1:/export/ /mnt/mnt_test/ mount.nfs: rpc.statd is not running but is required for remote locking. mount.nfs: Either use '-o nolock' to keep locks local, or start statd. mount.nfs: an incorrect mount option was specified [root@hp-dl385pg8-10 ~]# echo $? 32 [root@hp-dl385pg8-10 ~]# mount -vvv -t nfs -o vers=3 127.0.0.1:/export/ /mnt/mnt_test/ mount: fstab path: "/etc/fstab" mount: mtab path: "/etc/mtab" mount: lock path: "/etc/mtab~" mount: temp path: "/etc/mtab.tmp" mount: UID: 0 mount: eUID: 0 mount: spec: "127.0.0.1:/export/" mount: node: "/mnt/mnt_test/" mount: types: "nfs" mount: opts: "vers=3" final mount options: 'vers=3' mount: external mount: argv[0] = "/sbin/mount.nfs" mount: external mount: argv[1] = "127.0.0.1:/export/" mount: external mount: argv[2] = "/mnt/mnt_test/" mount: external mount: argv[3] = "-v" mount: external mount: argv[4] = "-o" mount: external mount: argv[5] = "rw,vers=3" mount.nfs: timeout set for Tue Mar 24 10:22:22 2015 mount.nfs: rpc.statd is not running but is required for remote locking. mount.nfs: Either use '-o nolock' to keep locks local, or start statd. mount.nfs: an incorrect mount option was specified [root@hp-dl385pg8-10 ~]# echo $? 32 [root@hp-dl385pg8-10 ~]# tail /var/log/messages Mar 24 10:34:38 hp-dl385pg8-10 rpcbind: rpcbind terminating on signal. Restart with "rpcbind -w" Mar 24 10:34:38 hp-dl385pg8-10 rpc.statd[10471]: Version 1.2.3 starting Mar 24 10:34:38 hp-dl385pg8-10 sm-notify[10472]: Version 1.2.3 starting Mar 24 10:34:38 hp-dl385pg8-10 rpc.statd[10471]: Caught signal 15, un-registering and exiting Mar 24 10:34:39 hp-dl385pg8-10 rpc.mountd[10588]: Version 1.2.3 starting Mar 24 10:34:39 hp-dl385pg8-10 kernel: NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory Mar 24 10:34:39 hp-dl385pg8-10 kernel: NFSD: starting 90-second grace period Mar 24 10:34:39 hp-dl385pg8-10 rpc.statd[10661]: Version 1.2.3 starting Mar 24 10:34:39 hp-dl385pg8-10 rpc.statd[10661]: Flags: TI-RPC Mar 24 10:34:39 hp-dl385pg8-10 rpc.statd[10661]: Opening /var/run/rpc.statd.pid failed: Permission denied |