Bug 1570567 - [rpc.statd] fails to start if used to mask the service rpc-statd and failed to mount nfsv3
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: nfs-utils
Version: 7.5
Hardware: All
OS: Linux
Priority: low
Severity: low
Target Milestone: rc
Target Release: ---
Assignee: Steve Dickson
QA Contact: Yongcheng Yang
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-04-23 09:09 UTC by Yongcheng Yang
Modified: 2021-02-15 07:38 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-02-15 07:38:34 UTC
Target Upstream Version:


Attachments
reproducer1 start/stop the rpcbind/rpcbind.socket (873 bytes, application/x-shellscript)
2018-04-23 09:09 UTC, Yongcheng Yang
reproducer2 only stop rpc-statd (890 bytes, application/x-shellscript)
2018-04-23 09:10 UTC, Yongcheng Yang

Description Yongcheng Yang 2018-04-23 09:09:35 UTC
Created attachment 1425594 [details]
reproducer1 start/stop the rpcbind/rpcbind.socket

Description of problem:
While testing the NFSv4-only server configuration procedure (Bug 1387694), I found that rpc-statd.service sometimes fails to start. The key to reproducing it is to attempt an NFS version 3 (or v2) mount while rpc-statd is masked; the mount fails, as expected. After the service is unmasked, however, starting rpc-statd always fails.

A simplified reproducer follows.


Version-Release number of selected component (if applicable):
# latest rhel7 nfs-utils
nfs-utils-1.3.0-0.54.el7
# upstream also has the same issue
nfs-utils-2.2.1-4.rc2.fc27


How reproducible:
100% easy


Steps to Reproduce:
1. systemctl mask --now rpc-statd
2. mount localhost:/tmp /mnt -o vers=3 <<< should fail
3. systemctl unmask rpc-statd.service
4. systemctl start rpc-statd           <<< fails
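
The steps above can be wrapped in a small script. This is only a sketch and not one of the attached reproducers: the DRY_RUN switch and the run/reproduce helper names are hypothetical, added for illustration. By default the commands are only printed; set DRY_RUN=0 as root on a disposable test machine to actually execute them.

```shell
#!/bin/sh
# DRY_RUN is a hypothetical switch (not part of nfs-utils or systemd).
# It defaults to 1, so nothing is executed unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"

# Print a command, then execute it only when DRY_RUN=0.
run() {
    echo "# $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# The four reproduction steps from this report, in order.
reproduce() {
    run systemctl mask --now rpc-statd
    run mount localhost:/tmp /mnt -o vers=3   # expected to fail
    run systemctl unmask rpc-statd.service
    run systemctl start rpc-statd             # fails on the affected versions
}

reproduce
```

No `set -e` is used on purpose: the mount in step 2 is expected to fail, and the script has to continue past it.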


Workarounds for the failure:
a. If rpcbind and rpcbind.socket are also stopped and started during the procedure (reproducer1), starting rpc-statd a second time succeeds.
b. Without the extra rpcbind/rpcbind.socket operations (reproducer2), a daemon "rpc.statd --no-notify" appears to be running right after the service is unmasked. Killing that daemon allows rpc-statd to be started.
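
Both workarounds can be sketched as shell helpers. The function names and the DRY_RUN switch are hypothetical, added for illustration; the commands themselves (pkill rpc.statd, systemctl start rpc-statd) are the ones used in the reproducer output. By default the commands are only printed.

```shell
#!/bin/sh
# DRY_RUN is a hypothetical switch; it defaults to 1 (print only).
DRY_RUN="${DRY_RUN:-1}"

# Print a command, then execute it only when DRY_RUN=0.
# Returns the command's exit status, or 0 in dry-run mode.
run() {
    echo "# $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# Workaround a: retry the start once if the first attempt fails.
start_statd_twice() {
    run systemctl start rpc-statd || run systemctl start rpc-statd
}

# Workaround b: kill the leftover daemon, then start the unit.
kill_stray_and_start() {
    run pkill rpc.statd            # exit status 1 only means nothing matched
    run systemctl start rpc-statd
}
```

Calling `DRY_RUN=0 kill_stray_and_start` as root corresponds to the last steps of repro2.sh; whether workaround a or b applies depends on which of the two scenarios was hit.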


Actual results:
[root@hp-dl360g9-01 ~]# rpm -q nfs-utils
nfs-utils-1.3.0-0.54.el7.x86_64
[root@hp-dl360g9-01 ~]# ./repro1.sh 
# systemctl stop rpc-statd
# systemctl stop rpcbind
Warning: Stopping rpcbind.service, but it can still be activated by:
  rpcbind.socket
# systemctl stop rpcbind.socket
# systemctl mask --now rpc-statd
Created symlink from /etc/systemd/system/rpc-statd.service to /dev/null.
# mount localhost:/tmp /mnt -o vers=3  # should fail
Failed to start rpc-statd.service: Unit is masked.
mount.nfs: rpc.statd is not running but is required for remote locking.
mount.nfs: Either use '-o nolock' to keep locks local, or start statd.
mount.nfs: an incorrect mount option was specified
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
# systemctl start rpcbind
# systemctl unmask rpc-statd.service
Removed symlink /etc/systemd/system/rpc-statd.service.
# systemctl start rpc-statd  # should not fail
Job for rpc-statd.service failed because the control process exited with error code. See "systemctl status rpc-statd.service" and "journalctl -xe" for details.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ *reproduced*
# systemctl start rpc-statd  # workaround
rpcuser   2885  0.0  0.0  42420  1740 ?        Ss   04:47   0:00 /usr/sbin/rpc.statd
[root@hp-dl360g9-01 ~]

[root@hp-dl360g9-01 ~]# ./repro2.sh 
# systemctl stop rpc-statd
# systemctl mask --now rpc-statd
Created symlink from /etc/systemd/system/rpc-statd.service to /dev/null.
# mount localhost:/tmp /mnt -o vers=3  # should fail
Failed to start rpc-statd.service: Unit is masked.
mount.nfs: access denied by server while mounting localhost:/tmp
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
# systemctl unmask rpc-statd.service
Removed symlink /etc/systemd/system/rpc-statd.service.
rpcuser   2933  0.0  0.0  42420  1740 ?        Ss   04:49   0:00 rpc.statd --no-notify
# systemctl start rpc-statd  # should not fail
Job for rpc-statd.service failed because the control process exited with error code. See "systemctl status rpc-statd.service" and "journalctl -xe" for details.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ *reproduced*
rpcuser   2933  0.0  0.0  42420  1740 ?        Ss   04:49   0:00 rpc.statd --no-notify
# pkill rpc.statd  # workaround
# systemctl start rpc-statd  # workaround
rpcuser   2982  0.0  0.0  42420  1744 ?        Ss   04:49   0:00 /usr/sbin/rpc.statd
[root@hp-dl360g9-01 ~]# 
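
In the output above, the leftover daemon and the normally started one differ only in their command line ("rpc.statd --no-notify" versus "/usr/sbin/rpc.statd"). A minimal sketch of a filter that labels ps lines accordingly is shown below; the function name and the "stray"/"service" labels are hypothetical, and the classification assumes the two command-line forms seen in this report.

```shell
#!/bin/sh
# classify_statd: read ps-style lines on stdin and label rpc.statd entries.
# Assumption: a "--no-notify" command line marks the leftover daemon, and a
# "/usr/sbin/rpc.statd" command line marks the one started via systemctl,
# as in the transcript of this report. Other lines are ignored.
classify_statd() {
    while IFS= read -r line; do
        case "$line" in
            *"rpc.statd --no-notify"*) echo "stray: $line" ;;
            *"/usr/sbin/rpc.statd"*)   echo "service: $line" ;;
        esac
    done
}
```

Typical use would be `ps aux | classify_statd`; a "stray" line suggests that workaround b (kill the daemon, then start the service) is needed.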


Expected results:
The rpc-statd service starts successfully after it is unmasked.


Additional info:
Setting the priority to "low" for now, since workarounds exist for this problem.
However, as NFSv4-only server configurations become more popular, customers may run into this problem as well.

Comment 1 Yongcheng Yang 2018-04-23 09:10:48 UTC
Created attachment 1425595 [details]
reproducer2 only stop rpc-statd

Comment 4 RHEL Program Management 2021-02-15 07:38:34 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

