Bug 1129425

Summary:	/usr/sbin/start-statd is unreliable
Product:	[Fedora] Fedora	Reporter:	Andy Lutomirski <luto>
Component:	nfs-utils	Assignee:	Steve Dickson <steved>
Status:	CLOSED CURRENTRELEASE	QA Contact:	Fedora Extras Quality Assurance <extras-qa>
Severity:	unspecified	Docs Contact:
Priority:	unspecified
Version:	rawhide	CC:	bfields, jlayton, steved, tmraz, zbyszek
Target Milestone:	---
Target Release:	---
Hardware:	Unspecified
OS:	Unspecified
URL:	https://fedorahosted.org/fesco/ticket/1310
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Last Closed:	2014-12-13 19:09:18 UTC	Type:	Bug
Bug Blocks:	1099595

Description Andy Lutomirski 2014-08-12 17:56:28 UTC

Once rpcbind is no longer autostarted (that is, once FESCo ticket #1310 is fixed), mounting NFSv3 shares will apparently be tricky.  I think this is a bug in start-statd; a possible reproduction:

$ sudo /usr/sbin/start-statd; sudo rpcinfo
Job for nfs-lock.service failed. See 'systemctl status nfs-lock.service' and 'journalctl -xn' for details.
Statd service already running!
   program version netid     address                service    owner
    100000    4    tcp6      ::.0.111               portmapper superuser
    100000    3    tcp6      ::.0.111               portmapper superuser
    100000    4    udp6      ::.0.111               portmapper superuser
    100000    3    udp6      ::.0.111               portmapper superuser
    100000    4    tcp       0.0.0.0.0.111          portmapper superuser
    100000    3    tcp       0.0.0.0.0.111          portmapper superuser
    100000    2    tcp       0.0.0.0.0.111          portmapper superuser
    100000    4    udp       0.0.0.0.0.111          portmapper superuser
    100000    3    udp       0.0.0.0.0.111          portmapper superuser
    100000    2    udp       0.0.0.0.0.111          portmapper superuser
    100000    4    local     /var/run/rpcbind.sock  portmapper superuser
    100000    3    local     /var/run/rpcbind.sock  portmapper superuser
    100024    1    udp       0.0.0.0.237.22         status     29
    100024    1    tcp       0.0.0.0.228.97         status     29
    100024    1    udp6      ::.169.67              status     29
    100024    1    tcp6      ::.195.173             status     29

I think there are multiple problems here.  The script's fallback to manually starting rpc.statd is bad -- it starts it in the wrong context, preventing systemd from starting it correctly in the future.  But that fallback shouldn't happen at all, and I'm not entirely sure what the issue is.

If I manually kill rpc.statd and stop all the related systemd units, then this does seem to work:

sudo systemctl start nfs-lock.service; sudo rpcinfo

It shows 'status' in the output.  The lockd service doesn't show up, but I assume that's because nothing has asked the kernel to start it yet.
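
The "does 'status' show up" check above can be scripted. This is only a sketch: the helper name statd_registered is hypothetical, and it assumes the rpcinfo column layout shown earlier in this report, where the first column is the RPC program number and 100024 is the 'status' service.

```shell
#!/bin/sh
# Hypothetical helper, not part of nfs-utils.
# Reads rpcinfo output on stdin and succeeds (exit 0) if RPC program
# 100024 (the 'status' service provided by rpc.statd) is listed.
statd_registered() {
    awk '$1 == 100024 { found = 1 } END { exit !found }'
}

# Demo on two lines copied from the rpcinfo output in this report.
# On a live system the input would come from: sudo rpcinfo
printf '%s\n' \
    '    100000    4    tcp       0.0.0.0.0.111          portmapper superuser' \
    '    100024    1    udp       0.0.0.0.237.22         status     29' \
    | statd_registered && echo "statd is registered"
```

Matching on the program number rather than on the word "status" avoids false hits from other columns.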

Comment 1 Steve Dickson 2014-08-24 17:41:51 UTC

(In reply to Andy Lutomirski from comment #0)
> I think there are multiple problems here.  The script's fallback to manually
> starting rpc.statd is bad -- it starts it in the wrong context, preventing
> systemd from starting it correctly in the future.  But that fallback
> shouldn't happen at all, and I'm not entirely sure what the issue is.
The fallback is needed for when systemd is not installed.

> 
> If I manually kill rpc.statd and stop all the related systemd units, then
> this does seem to work:
> 
> sudo systemctl start nfs-lock.service; sudo rpcinfo
> 
> It shows 'status' in the output.  The lockd service doesn't show up, but I
> assume that's because nothing has asked the kernel to start it yet.
lockd is a kernel module that will get loaded when the NFS server is started.

Comment 2 Andy Lutomirski 2014-08-24 17:45:49 UTC

(In reply to Steve Dickson from comment #1)
> (In reply to Andy Lutomirski from comment #0)
> > I think there are multiple problems here.  The script's fallback to manually
> > starting rpc.statd is bad -- it starts it in the wrong context, preventing
> > systemd from starting it correctly in the future.  But that fallback
> > shouldn't happen at all, and I'm not entirely sure what the issue is.
> The fallback is needed for when systemd is not installed.

Is that even possible on Fedora?  In any case, wouldn't checking for the existence of systemctl be better than checking whether systemctl start succeeds?
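
A rough sketch of what I mean, not the shipped start-statd script. The function names are made up for illustration, and using /run/systemd/system as the "systemd is the running init" test is my assumption about what the check should be. The commands are only printed here, not run.

```shell
#!/bin/sh
# Hypothetical sketch: decide how to start statd by testing for systemd
# itself, instead of treating a failed "systemctl start" as "no systemd".

# Prints "systemd" if systemctl is installed and systemd is the running
# init (it creates /run/systemd/system at boot); prints "direct" otherwise.
statd_start_method() {
    if command -v systemctl >/dev/null 2>&1 && [ -d /run/systemd/system ]; then
        echo systemd
    else
        echo direct
    fi
}

# Prints the command that would be used for the detected method.
statd_start_command() {
    case "$(statd_start_method)" in
        systemd) echo "systemctl start nfs-lock.service" ;;
        direct)  echo "rpc.statd --no-notify" ;;
    esac
}

statd_start_command
```

With this shape, a failing "systemctl start nfs-lock.service" on a systemd machine would be reported as an error rather than silently triggering the manual rpc.statd start.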

> 
> > 
> > If I manually kill rpc.statd and stop all the related systemd units, then
> > this does seem to work:
> > 
> > sudo systemctl start nfs-lock.service; sudo rpcinfo
> > 
> > It shows 'status' in the output.  The lockd service doesn't show up, but I
> > assume that's because nothing has asked the kernel to start it yet.
> lockd is a kernel module that will get loaded when the NFS server is
> started.

I must be missing something here.  start-statd is for the NFS *client*, I think.