1121451 – [start-statd] start-statd start fail, when "rpc.statd dead but pid file exists"

RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.

Bug 1121451 - [start-statd] start-statd start fail, when "rpc.statd dead but pid file exists"

Summary: [start-statd] start-statd start fail, when "rpc.statd dead but pid file exists"

Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	Red Hat Enterprise Linux 6
Classification:	Red Hat
Component:	nfs-utils
Sub Component:
Version:	6.6
Hardware:	All
OS:	Linux
Priority:	high
Severity:	high
Target Milestone:	rc
Target Release:	---
Assignee:	Steve Dickson
QA Contact:	Yongcheng Yang
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1164899
TreeView+	depends on / blocked

Reported:	2014-07-21 03:09 UTC by JianHong Yin
Modified:	2015-04-21 20:54 UTC (History)
CC List:	6 users (show)
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2015-04-21 20:54:27 UTC
Target Upstream Version:
Embargoed:
Flags:	jiyin: needinfo-

Attachments	(Terms of Use)

Description JianHong Yin 2014-07-21 03:09:32 UTC

Description of problem:
[10:41:23 root@ ~~]# service nfslock status
rpc.statd dead but pid file exists
--------------------------------------------------------------------------------
[10:41:23 root@ ~~]# mount -t nfs -o vers=3 127.0.0.1:/tmp /bz712438
mount.nfs: rpc.statd is not running but is required for remote locking.
mount.nfs: Either use '-o nolock' to keep locks local, or start statd.
mount.nfs: an incorrect mount option was specified
:: [   FAIL   ] :: Mounting with nfsv3 (Expected 0, got 32)

Version-Release number of selected component (if applicable):
RHEL-6.x

How reproducible:
always

Steps to Reproduce:
1. kill rpc.statd, and the status is "rpc.statd dead but pid file exists"
2. mount -t nfs -o vers=3  $serv:$expdir  $nfsmp
3.

Actual results:
[10:41:23 root@ ~~]# mount -t nfs -o vers=3 127.0.0.1:/tmp /bz712438
mount.nfs: rpc.statd is not running but is required for remote locking.
mount.nfs: Either use '-o nolock' to keep locks local, or start statd.
mount.nfs: an incorrect mount option was specified
:: [   FAIL   ] :: Mounting with nfsv3 (Expected 0, got 32)

Expected results:
mount success

Additional info:

Comment 4 JianHong Yin 2014-07-22 07:31:38 UTC

Got the reason: Jul 22 03:06:21 ibm-z10-67 rpc.statd[42160]: Opening /var/run/rpc.statd.pid failed: Permission denied

but do not known why Permission denied, and how to reproduce step by step,,

::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: [   LOG    ] :: /CoreOS/nfs-utils/Regression/bz712438-nfsstat-m-does-not-display-nfs4-mounts/Test
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

[03:06:21 root@ ~~]# service nfslock status
rpc.statd dead but pid file exists
--------------------------------------------------------------------------------
[03:06:21 root@ ~~]# rpcinfo -s
   program version(s) netid(s)                         service     owner
    100000  2,3,4     local,udp,tcp,udp6,tcp6          portmapper  superuser
    100011  2,1       tcp,udp                          rquotad     superuser
    100005  3,2,1     tcp6,udp6,tcp,udp                mountd      superuser
    100003  4,3,2     udp6,tcp6,udp,tcp                nfs         superuser
    100227  3,2       udp6,tcp6,udp,tcp                nfs_acl     superuser
    100021  4,3,1     tcp6,udp6,tcp,udp                nlockmgr    superuser
--------------------------------------------------------------------------------
[03:06:21 root@ ~~]# mount -t nfs -o vers=3 127.0.0.1:/tmp /bz712438
mount.nfs: rpc.statd is not running but is required for remote locking.
mount.nfs: Either use '-o nolock' to keep locks local, or start statd.
mount.nfs: an incorrect mount option was specified
:: [   FAIL   ] :: Mounting with nfsv3 (Expected 0, got 32)
--------------------------------------------------------------------------------
[03:06:21 root@ ~~]# tail /var/log/messages
Jul 22 03:06:01 ibm-z10-67 logger: /usr/bin/rhts-test-runner.sh report_finish 
Jul 22 03:06:17 ibm-z10-67 logger: /usr/bin/rhts-test-runner.sh rhts-test-update 127.0.0.1:7092 22968827 start  nfs-utils-CoreOS-nfs-utils-Regression-bz712438-nfsstat-m-does-not-display-nfs4-mounts
Jul 22 03:06:17 ibm-z10-67 python: 07/22/14 03:06:17  testID:22968827 start:
Jul 22 03:06:18 ibm-z10-67 rpc.mountd[41997]: Version 1.2.3 starting
Jul 22 03:06:18 ibm-z10-67 kernel: NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
Jul 22 03:06:18 ibm-z10-67 kernel: NFSD: starting 90-second grace period
Jul 22 03:06:21 ibm-z10-67 rpc.statd[42160]: Version 1.2.3 starting
Jul 22 03:06:21 ibm-z10-67 rpc.statd[42160]: Flags: TI-RPC 
Jul 22 03:06:21 ibm-z10-67 rpc.statd[42160]: Opening /var/run/rpc.statd.pid failed: Permission denied

Comment 5 Steve Dickson 2014-08-07 14:27:16 UTC

How do I get access to these tests? 

/CoreOS/nfs-utils/Regression/...

Comment 6 JianHong Yin 2014-08-08 01:52:55 UTC

(In reply to Steve Dickson from comment #5)
> How do I get access to these tests? 
> 
> /CoreOS/nfs-utils/Regression/...

I am investigating the real reproduce condition..

Comment 8 JianHong Yin 2014-08-10 11:21:55 UTC

Now the reproducer /CoreOS/nfs-utils/Regression/bz1121451-start-statd-test can work fine.


After follow step, will reproduce the issue

[07:15:06 root@ ~~]# rpc.statd; sleep 0.5; rpc.mountd
--------------------------------------------------------------------------------
[07:15:07 root@ ~~]# rpcinfos
   program version(s) netid(s)                         service     owner
    100000  2,3,4     local,udp,tcp,udp6,tcp6          portmapper  superuser
    100024  1         tcp6,udp6,tcp,udp                status      29
    100005  3,2,1     tcp6,udp6,tcp,udp                mountd      superuser
:: [   PASS   ] :: Running 'rpcinfos' (Expected 0, got 0)
--------------------------------------------------------------------------------
[07:15:07 root@ ~~]# killall rpc.statd
--------------------------------------------------------------------------------
[07:15:07 root@ ~~]# killall rpc.mountd


mount.nfs  will  fail  with  follow  msg:
mount.nfs: rpc.statd is not running but is required for remote locking.
mount.nfs: Either use '-o nolock' to keep locks local, or start statd.
mount.nfs: an incorrect mount option was specified

Comment 14 Steve Dickson 2015-01-02 21:57:44 UTC

(In reply to Yin.JianHong from comment #0)
> Description of problem:
> [10:41:23 root@ ~~]# service nfslock status
> rpc.statd dead but pid file exists
> -----------------------------------------------------------------------------
> ---
> [10:41:23 root@ ~~]# mount -t nfs -o vers=3 127.0.0.1:/tmp /bz712438
> mount.nfs: rpc.statd is not running but is required for remote locking.
> mount.nfs: Either use '-o nolock' to keep locks local, or start statd.
> mount.nfs: an incorrect mount option was specified
> :: [   FAIL   ] :: Mounting with nfsv3 (Expected 0, got 32)
> 
> Version-Release number of selected component (if applicable):
> RHEL-6.x
> 
> How reproducible:
> always
> 
> Steps to Reproduce:
> 1. kill rpc.statd, and the status is "rpc.statd dead but pid file exists"
> 2. mount -t nfs -o vers=3  $serv:$expdir  $nfsmp
> 3.
> 
> Actual results:
> [10:41:23 root@ ~~]# mount -t nfs -o vers=3 127.0.0.1:/tmp /bz712438
> mount.nfs: rpc.statd is not running but is required for remote locking.
> mount.nfs: Either use '-o nolock' to keep locks local, or start statd.
> mount.nfs: an incorrect mount option was specified
> :: [   FAIL   ] :: Mounting with nfsv3 (Expected 0, got 32)

This seems to work with the latest nfs-utils

 rhel66# service nfslock start
Starting NFS statd:                                        [  OK  ]
rhel66# ps ax | grep statd
 3844 ?        Ss     0:00 rpc.statd
 3849 pts/1    S+     0:00 grep statd
rhel66# kill 3844
rhel66# service nfslock status
rpc.statd dead but pid file exists
rhel66# mount redhat:/home /mnt/home -o v3
rhel66# ps ax | grep statd
 3868 ?        Ss     0:00 rpc.statd --no-notify
 3870 pts/1    S+     0:00 grep statd

Please check your script...

Comment 16 Yongcheng Yang 2015-03-23 09:23:18 UTC

This has not been fixed in the latest RHEL-6
(In reply to Steve Dickson from comment #14)
> (In reply to Yin.JianHong from comment #0)
> > Description of problem:
> > [10:41:23 root@ ~~]# service nfslock status
> > rpc.statd dead but pid file exists
> > -----------------------------------------------------------------------------
> > ---
> > [10:41:23 root@ ~~]# mount -t nfs -o vers=3 127.0.0.1:/tmp /bz712438
> > mount.nfs: rpc.statd is not running but is required for remote locking.
> > mount.nfs: Either use '-o nolock' to keep locks local, or start statd.
> > mount.nfs: an incorrect mount option was specified
> > :: [   FAIL   ] :: Mounting with nfsv3 (Expected 0, got 32)
> > 
> > Version-Release number of selected component (if applicable):
> > RHEL-6.x
> > 
> > How reproducible:
> > always
> > 
> > Steps to Reproduce:
> > 1. kill rpc.statd, and the status is "rpc.statd dead but pid file exists"
> > 2. mount -t nfs -o vers=3  $serv:$expdir  $nfsmp
> > 3.
> > 
> > Actual results:
> > [10:41:23 root@ ~~]# mount -t nfs -o vers=3 127.0.0.1:/tmp /bz712438
> > mount.nfs: rpc.statd is not running but is required for remote locking.
> > mount.nfs: Either use '-o nolock' to keep locks local, or start statd.
> > mount.nfs: an incorrect mount option was specified
> > :: [   FAIL   ] :: Mounting with nfsv3 (Expected 0, got 32)
> 
> This seems to work with the latest nfs-utils
> 
>  rhel66# service nfslock start
> Starting NFS statd:                                        [  OK  ]
> rhel66# ps ax | grep statd
>  3844 ?        Ss     0:00 rpc.statd
>  3849 pts/1    S+     0:00 grep statd
> rhel66# kill 3844
> rhel66# service nfslock status
> rpc.statd dead but pid file exists
> rhel66# mount redhat:/home /mnt/home -o v3
> rhel66# ps ax | grep statd
>  3868 ?        Ss     0:00 rpc.statd --no-notify
>  3870 pts/1    S+     0:00 grep statd
> 
> Please check your script...
Need to kill both rpc.statd and rpc.mountd

[root@hp-dl385pg8-10 ~]# service nfslock restart
Stopping NFS locking: [  OK  ]
Stopping NFS statd: [  OK  ]
Starting NFS statd: [  OK  ]
[root@hp-dl385pg8-10 ~]# ps ax | grep statd
24738 ?        Ss     0:00 rpc.statd
24742 pts/2    S+     0:00 grep statd
[root@hp-dl385pg8-10 ~]# kill 24738
[root@hp-dl385pg8-10 ~]# ps ax | grep mountd
24634 ?        Ss     0:00 rpc.mountd
24744 pts/2    S+     0:00 grep mountd
[root@hp-dl385pg8-10 ~]# kill 24634
[root@hp-dl385pg8-10 ~]# service nfslock status
rpc.statd dead but pid file exists
[root@hp-dl385pg8-10 ~]# mount -t nfs -o vers=3 127.0.0.1:/export/ /mnt/mnt_test/
mount.nfs: requested NFS version or transport protocol is not supported
[root@hp-dl385pg8-10 ~]# echo $?
32
[root@hp-dl385pg8-10 ~]# ps ax | grep statd
24768 ?        Ss     0:00 rpc.statd --no-notify
24770 pts/2    S+     0:00 grep statd
[root@hp-dl385pg8-10 ~]# ps ax | grep mountd
24772 pts/2    S+     0:00 grep mountd
[root@hp-dl385pg8-10 ~]# mount -vvv -t nfs -o vers=3 127.0.0.1:/export/ /mnt/mnt_test/
mount: fstab path: "/etc/fstab"
mount: mtab path:  "/etc/mtab"
mount: lock path:  "/etc/mtab~"
mount: temp path:  "/etc/mtab.tmp"
mount: UID:        0
mount: eUID:       0
mount: spec:  "127.0.0.1:/export/"
mount: node:  "/mnt/mnt_test/"
mount: types: "nfs"
mount: opts:  "vers=3"
final mount options: 'vers=3'
mount: external mount: argv[0] = "/sbin/mount.nfs"
mount: external mount: argv[1] = "127.0.0.1:/export/"
mount: external mount: argv[2] = "/mnt/mnt_test/"
mount: external mount: argv[3] = "-v"
mount: external mount: argv[4] = "-o"
mount: external mount: argv[5] = "rw,vers=3"
mount.nfs: timeout set for Mon Mar 23 17:21:59 2015
mount.nfs: trying text-based options 'vers=3,addr=127.0.0.1'
mount.nfs: prog 100003, trying vers=3, prot=6
mount.nfs: trying 127.0.0.1 prog 100003 vers 3 prot TCP port 2049
mount.nfs: prog 100005, trying vers=3, prot=17
mount.nfs: portmap query retrying: RPC: Program not registered
mount.nfs: prog 100005, trying vers=3, prot=6
mount.nfs: portmap query failed: RPC: Program not registered
mount.nfs: requested NFS version or transport protocol is not supported
[root@hp-dl385pg8-10 ~]#

Comment 17 Steve Dickson 2015-03-23 12:21:16 UTC

(In reply to Yongcheng Yang from comment #16)> > > Description of problem:
> > Please check your script...
> Need to kill both rpc.statd and rpc.mountd
> [root@hp-dl385pg8-10 ~]# mount -t nfs -o vers=3 127.0.0.1:/export/
> /mnt/mnt_test/
> mount.nfs: requested NFS version or transport protocol is not supported
Its is expected behaviour. Killing rpc.statd and rpc.mountd is
going to cause a loopback mount to fail. I'm not sure what the 
expectations are but there is no problem here...

Comment 18 Yongcheng Yang 2015-03-24 02:44:24 UTC

(In reply to Steve Dickson from comment #17)
> (In reply to Yongcheng Yang from comment #16)> > > Description of problem:
> > > Please check your script...
> > Need to kill both rpc.statd and rpc.mountd
> > [root@hp-dl385pg8-10 ~]# mount -t nfs -o vers=3 127.0.0.1:/export/
> > /mnt/mnt_test/
> > mount.nfs: requested NFS version or transport protocol is not supported
> Its is expected behaviour. Killing rpc.statd and rpc.mountd is
> going to cause a loopback mount to fail. I'm not sure what the 
> expectations are but there is no problem here...
Sorry I mistake the reproducer, need to restart the nfs service after killing both rpc.statd and rpc.mountd, then try mount, it should success but failed.

I edit a simple reproducer, and it can 100% reproduced in rhe6, and can't reproduced at rhel7:
[root@hp-dl385pg8-10 ~]# cat 11214512.sh 
service nfs stop > /dev/null
service nfslock stop > /dev/null
service rpcbind restart > /dev/null
rpc.statd
kill `ps ax | grep [r]pc.statd | awk '{print $1}'` 2>/dev/null
kill `ps ax | grep [r]pc.mountd | awk '{print $1}'` 2>/dev/null
service nfs restart > /dev/null
service nfslock status
mount -t nfs -o vers=3 127.0.0.1:/export/ /mnt/mnt_test/
ps ax | grep statd
umount /mnt/mnt_test/ 2>/dev/null
tail /var/log/messages
[root@hp-dl385pg8-10 ~]# 

Following is the reproduce logs by hand:
[root@hp-dl385pg8-10 ~]# service rpcbind restart
Stopping rpcbind: [  OK  ]
Starting rpcbind: [  OK  ]
[root@hp-dl385pg8-10 ~]# rpc.statd
[root@hp-dl385pg8-10 ~]# rpc.mountd 
[root@hp-dl385pg8-10 ~]# ps ax | grep rpc.statd
 5720 ?        Ss     0:00 rpc.statd
 5725 pts/2    S+     0:00 grep rpc.statd
[root@hp-dl385pg8-10 ~]# kill 5720
[root@hp-dl385pg8-10 ~]# ps ax | grep rpc.mountd
 5608 ?        Ss     0:00 rpc.mountd
 5723 ?        Ss     0:00 rpc.mountd
 5727 pts/2    S+     0:00 grep rpc.mountd
[root@hp-dl385pg8-10 ~]# kill 5608 5723
[root@hp-dl385pg8-10 ~]# service nfs restart
Shutting down NFS daemon: [  OK  ]
Shutting down NFS mountd: [FAILED]
Shutting down NFS quotas: [  OK  ]
Shutting down NFS services:  [  OK  ]
Shutting down RPC idmapd: [  OK  ]
Starting NFS services:  [  OK  ]
Starting NFS quotas: [  OK  ]
Starting NFS mountd: [  OK  ]
Starting NFS daemon: [  OK  ]
Starting RPC idmapd: [  OK  ]
[root@hp-dl385pg8-10 ~]# service nfslock status
rpc.statd dead but pid file exists
[root@hp-dl385pg8-10 ~]# mount -t nfs -o vers=3 127.0.0.1:/export/ /mnt/mnt_test/
mount.nfs: rpc.statd is not running but is required for remote locking.
mount.nfs: Either use '-o nolock' to keep locks local, or start statd.
mount.nfs: an incorrect mount option was specified
[root@hp-dl385pg8-10 ~]# echo $?
32
[root@hp-dl385pg8-10 ~]# mount -vvv -t nfs -o vers=3 127.0.0.1:/export/ /mnt/mnt_test/
mount: fstab path: "/etc/fstab"
mount: mtab path:  "/etc/mtab"
mount: lock path:  "/etc/mtab~"
mount: temp path:  "/etc/mtab.tmp"
mount: UID:        0
mount: eUID:       0
mount: spec:  "127.0.0.1:/export/"
mount: node:  "/mnt/mnt_test/"
mount: types: "nfs"
mount: opts:  "vers=3"
final mount options: 'vers=3'
mount: external mount: argv[0] = "/sbin/mount.nfs"
mount: external mount: argv[1] = "127.0.0.1:/export/"
mount: external mount: argv[2] = "/mnt/mnt_test/"
mount: external mount: argv[3] = "-v"
mount: external mount: argv[4] = "-o"
mount: external mount: argv[5] = "rw,vers=3"
mount.nfs: timeout set for Tue Mar 24 10:22:22 2015
mount.nfs: rpc.statd is not running but is required for remote locking.
mount.nfs: Either use '-o nolock' to keep locks local, or start statd.
mount.nfs: an incorrect mount option was specified
[root@hp-dl385pg8-10 ~]# echo $?
32
[root@hp-dl385pg8-10 ~]# tail /var/log/messages
Mar 24 10:34:38 hp-dl385pg8-10 rpcbind: rpcbind terminating on signal. Restart with "rpcbind -w"
Mar 24 10:34:38 hp-dl385pg8-10 rpc.statd[10471]: Version 1.2.3 starting
Mar 24 10:34:38 hp-dl385pg8-10 sm-notify[10472]: Version 1.2.3 starting
Mar 24 10:34:38 hp-dl385pg8-10 rpc.statd[10471]: Caught signal 15, un-registering and exiting
Mar 24 10:34:39 hp-dl385pg8-10 rpc.mountd[10588]: Version 1.2.3 starting
Mar 24 10:34:39 hp-dl385pg8-10 kernel: NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
Mar 24 10:34:39 hp-dl385pg8-10 kernel: NFSD: starting 90-second grace period
Mar 24 10:34:39 hp-dl385pg8-10 rpc.statd[10661]: Version 1.2.3 starting
Mar 24 10:34:39 hp-dl385pg8-10 rpc.statd[10661]: Flags: TI-RPC 
Mar 24 10:34:39 hp-dl385pg8-10 rpc.statd[10661]: Opening /var/run/rpc.statd.pid failed: Permission denied

Note You need to log in before you can comment on or make changes to this bug.