Bug 1120852

Summary: 'systemctl start nfs-lock' fails to detect active rpc.statd instances
Product: Red Hat Enterprise Linux 7 Reporter: David Vossel <dvossel>
Component: nfs-utils Assignee: Steve Dickson <steved>
Status: CLOSED DUPLICATE QA Contact: Filesystem QE <fs-qe>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 7.1 CC: fdinitto
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-10-21 17:24:34 UTC Type: Bug

Description David Vossel 2014-07-17 21:15:04 UTC
Description of problem:

If you have nfs-server and nfs-lock disabled at boot (required for a cluster environment) and you mount an NFS share as a client, rpc.statd magically gets started for us. This is great, because otherwise the NFSv3 client couldn't perform locking. The problem is that this conflicts with the nfs-lock systemd unit file, which has no way of knowing about an rpc.statd instance it did not start itself.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Make sure both nfs-server and nfs-lock are stopped

systemctl stop nfs-server
systemctl stop nfs-lock

2. Mount an NFS share as a client
mount -v -o "vers=3" rhel7-alt1:/root/testnfs /root/testmount

3. rpc.statd magically appears, hooray!

ps aux | grep [r]pc.statd
rpcuser   2075  0.0  0.1  44544  1952 ?        Ss   16:36   0:00 rpc.statd --no-notify

4. Now run 'systemctl status nfs-lock'

systemctl status nfs-lock
nfs-lock.service - NFS file locking service.
   Loaded: loaded (/usr/lib/systemd/system/nfs-lock.service; disabled)
   Active: inactive (dead)

Status says nfs-lock is down, but we know rpc.statd is actually up.
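
To make the mismatch explicit, here is a minimal sketch that compares systemd's view of the unit with the process table. classify_statd is a hypothetical helper, not something shipped in nfs-utils, and the state names it prints are made up for illustration. It assumes the unit state comes from 'systemctl is-active nfs-lock' and the PID list from 'pgrep -x rpc.statd'.

```shell
#!/bin/sh
# classify_statd is a hypothetical helper (not part of nfs-utils).
#   $1 = unit state, e.g. the output of: systemctl is-active nfs-lock
#   $2 = PIDs of rpc.statd, e.g. the output of: pgrep -x rpc.statd
#        (empty string if no process was found)
classify_statd() {
    unit_state=$1
    pids=$2
    if [ "$unit_state" = "active" ]; then
        echo "managed"      # systemd believes it is running the daemon
    elif [ -n "$pids" ]; then
        echo "unmanaged"    # daemon is up, but the unit is not active
    else
        echo "stopped"      # neither systemd nor the process table shows it
    fi
}

# Example invocation on a live system (commented out so that sourcing
# this sketch has no side effects):
# classify_statd "$(systemctl is-active nfs-lock)" "$(pgrep -x rpc.statd)"
```

In the situation reproduced above, the inputs would be "inactive" and "2075", which this sketch reports as unmanaged.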

Now try to start nfs-lock. It fails.

systemctl start nfs-lock
Job for nfs-lock.service failed. See 'systemctl status nfs-lock.service' and 'journalctl -xn' for details.

Looking at the status, we see that rpc.statd detects there's already a statd instance up, so it fails.

systemctl status nfs-lock
nfs-lock.service - NFS file locking service.
   Loaded: loaded (/usr/lib/systemd/system/nfs-lock.service; disabled)
   Active: failed (Result: exit-code) since Thu 2014-07-17 17:02:01 EDT; 5min ago
  Process: 2147 ExecStart=/sbin/rpc.statd $STATDARG (code=exited, status=1/FAILURE)
  Process: 2145 ExecStartPre=/usr/libexec/nfs-utils/scripts/nfs-lock.preconfig (code=exited, status=0/SUCCESS)

Jul 17 17:02:01 rhel7-alt2 rpc.statd[2147]: Statd service already running!
Jul 17 17:02:01 rhel7-alt2 systemd[1]: nfs-lock.service: control process exited, code=exited status=1
Jul 17 17:02:01 rhel7-alt2 systemd[1]: Failed to start NFS file locking service..
Jul 17 17:02:01 rhel7-alt2 systemd[1]: Unit nfs-lock.service entered failed state.
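
A caller could tell this particular failure apart from other start failures by looking for the rpc.statd message in the log output above. This is only a sketch: statd_already_running_in_log is a hypothetical helper, and matching on log text is an assumption about how one might detect the condition, not something nfs-utils or pacemaker is known to do.

```shell
#!/bin/sh
# statd_already_running_in_log is a hypothetical helper.
# Reads log text on stdin and succeeds (exit 0) if rpc.statd reported
# that another instance was already running.
statd_already_running_in_log() {
    grep -q 'rpc\.statd\[[0-9]*\]: Statd service already running!'
}

# Example on a live system (commented out so that sourcing this sketch
# has no side effects):
# journalctl -u nfs-lock.service --no-pager | statd_already_running_in_log
```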

Actual results:

The nfs-lock unit file cannot reliably manage the rpc.statd daemon, because the unit file is unable to detect that rpc.statd is already running (and gracefully handle this situation).

Expected results:

nfs-lock should be able to manage rpc.statd regardless of whether the daemon was started outside of the systemd unit file.
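
As a rough illustration of the "gracefully handle" behaviour from the caller's side, the sketch below treats an already running rpc.statd as success instead of attempting a start that would fail. It is not the fix that went into nfs-utils. ensure_statd, STATD_CHECK and STATD_START are hypothetical names introduced here, and the default commands are assumptions.

```shell
#!/bin/sh
# Hypothetical overrides; the defaults are assumptions for illustration.
: "${STATD_CHECK:=pgrep -x rpc.statd}"        # succeeds if statd is running
: "${STATD_START:=systemctl start nfs-lock}"  # starts statd via the unit

# ensure_statd is a hypothetical helper: start rpc.statd only when no
# instance is running, and report which path was taken.
ensure_statd() {
    # $STATD_CHECK is intentionally unquoted so it splits into
    # a command plus its arguments.
    if $STATD_CHECK >/dev/null 2>&1; then
        echo "already running"
        return 0
    fi
    $STATD_START && echo "started"
}
```

Note that this only avoids the failed start. systemd would still report nfs-lock as inactive for an instance it did not launch, so it does not by itself give the unit file control over that daemon.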

Additional info:

This is a big deal for us in managing HA NFS with pacemaker. The nfs-lock unit file needs to work reliably. As simple as this failure is, it results in an unrecoverable situation where the HA NFS server cannot start. The HA NFS server depends on this unit file to start the locking daemons.

Comment 2 Steve Dickson 2014-10-21 17:24:34 UTC
This now works due to bz 1144440

*** This bug has been marked as a duplicate of bug 1144440 ***