Bug 1821903

Summary: NFSv4.1 mount fails when VM name is exactly same
Product: Red Hat Enterprise Linux 7
Reporter: Anbu Govindasamy <anbugovi>
Component: nfs-utils
Assignee: Steve Dickson <steved>
Status: CLOSED WONTFIX
QA Contact: Yongcheng Yang <yoyang>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 7.5
CC: bcodding, dwysocha, kcollins, parisi, radeltch, xzhou
Target Milestone: rc
Flags: pm-rhel: mirror+
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-08-27 14:21:01 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Anbu Govindasamy 2020-04-07 19:31:25 UTC
Description of problem: We have NetApp NFSv4.1 storage that we are trying to mount on Red Hat VMs. We clone a VM for DR testing, which leaves us with two VMs that have exactly the same hostname. The NFS storage is cloned as well, and when we attempt to mount the cloned storage (NFSv4.1) on the cloned VM, we get "mount.nfs: Operation not permitted". The issue does not occur with NFSv3 or when the hostnames differ. In our tests, NFSv4.1 mounts fail whenever the hostnames are identical. We tried mounting the NFS volume with the clientaddr option, but it appears to be ignored.


Version-Release number of selected component (if applicable):

[root@rheldb2eus2 ~]# cat /etc/os-release
NAME="Red Hat Enterprise Linux Server"
VERSION="7.5 (Maipo)"
ID="rhel"
ID_LIKE="fedora"
VARIANT="Server"
VARIANT_ID="server"
VERSION_ID="7.5"
PRETTY_NAME="Red Hat Enterprise Linux Server 7.5 (Maipo)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:7.5:GA:server"
HOME_URL="https://www.redhat.com/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"

REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 7"
REDHAT_BUGZILLA_PRODUCT_VERSION=7.5
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="7.5"




How reproducible:

Output from VM1:

[root@rheldb2eus2 ~]# mount -t nfs -o rw,hard,rsize=65536,wsize=65536,sec=sys,vers=4.1,tcp,clientaddr=10.6.1.5 10.6.3.5:/vol1 /vol1
[root@rheldb2eus2 ~]# ping rheldb2eus2
PING rheldb2eus2 (10.6.1.5) 56(84) bytes of data.
64 bytes from rheldb2eus2 (10.6.1.5): icmp_seq=1 ttl=64 time=0.022 ms
^C
--- rheldb2eus2 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.022/0.022/0.022/0.000 ms
[root@rheldb2eus2 ~]#
[root@rheldb2eus2 ~]#
[root@rheldb2eus2 ~]# umount /vol1
[root@rheldb2eus2 ~]# echo "remount /vol1, after it has been mounted on other VM"
remount /vol1, after it has been mounted on other VM
[root@rheldb2eus2 ~]# mount -t nfs -o rw,hard,rsize=65536,wsize=65536,sec=sys,vers=4.1,tcp,clientaddr=10.6.1.5 10.6.3.5:/vol1 /vol1
mount.nfs: Operation not permitted


Output from VM2:
[root@rheldb2eus2 ~]# ping rheldb2eus2
PING rheldb2eus2 (10.6.2.4) 56(84) bytes of data.
64 bytes from rheldb2eus2 (10.6.2.4): icmp_seq=1 ttl=64 time=0.082 ms
64 bytes from rheldb2eus2 (10.6.2.4): icmp_seq=2 ttl=64 time=0.038 ms
^C
--- rheldb2eus2 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.038/0.060/0.082/0.022 ms
[root@rheldb2eus2 ~]#  mount -t nfs -o rw,hard,rsize=65536,wsize=65536,sec=sys,vers=4.1,tcp,clientaddr=10.6.2.4 10.6.3.5:/vol1 /vol1
mount.nfs: Operation not permitted
[root@rheldb2eus2 ~]#
[root@rheldb2eus2 ~]#
[root@rheldb2eus2 ~]# echo "Test 2, mount /vol1 after umounting from other VM"
Test 2, mount /vol1 after umounting from other VM
[root@rheldb2eus2 ~]# mount -t nfs -o rw,hard,rsize=65536,wsize=65536,sec=sys,vers=4.1,tcp,clientaddr=10.6.2.4 10.6.3.5:/vol1 /vol1
[root@rheldb2eus2 ~]# df -kh /vol1
Filesystem      Size  Used Avail Use% Mounted on
10.6.3.5:/vol1  100T  576K  100T   1% /vol1



Steps to Reproduce:
1. Create two NFSv4.1 volumes, v1 and v2.
2. Create two VMs with exactly the same hostname.
3. Mount volume v1 on VM1, then try mounting v2 on VM2. The second mount fails with "mount.nfs: Operation not permitted".

Actual results:


Expected results:
It should be able to mount both volumes


Additional info:

On SLES, mounting works if we keep retrying for 30 seconds. It seems the RPC times out after 30 seconds and the mount is then allowed. It also worked on CentOS Linux release 7.3.1611 (Core) with nfs-utils-1.3.0-0.33.el7_3.src.rpm.

The same thing does not work on RHEL 7.5 with nfs-utils-1.3.0-0.65.el7.x86_64.

Comment 2 Justin Parisi 2020-04-07 19:49:40 UTC
To add some color to this...

What seems to be happening is that when both clients attempt the NFSv4.1 mount, the 2nd client will get an NFS4ERR_CLID_INUSE error. 

We can see this in a packet trace:

222         24.458904            10.193.67.174     10.193.67.219     NFS        334         V4 Call (Reply In 223) EXCHANGE_ID
223         24.459090            10.193.67.219     10.193.67.174     NFS        114         V4 Reply (Call In 222) EXCHANGE_ID Status: NFS4ERR_CLID_INUSE

This is expected behavior as per RFC-5661 in section 2.10.8.3: https://tools.ietf.org/html/rfc5661#section-2.10.8.3

However, my understanding is that when the NFS server sends NFS4ERR_CLID_INUSE, the client is supposed to retry the mount with a different client ID, as per RFC-5661:

https://tools.ietf.org/html/rfc5661#section-15.1.13.2 
 
15.1.13.2.  NFS4ERR_CLID_INUSE (Error Code 10017)
While processing an EXCHANGE_ID operation, the server was presented with a co_ownerid field that matches an existing client with valid leased state, but the principal sending the EXCHANGE_ID operation differs from the principal that established the existing client. This indicates a collision (most likely due to chance) between clients.  The client should recover by changing the co_ownerid and re-sending EXCHANGE_ID (but not with the same slot ID and sequence ID; one or both MUST be different on the re-send).
 

We can see that happening in my SLES15 clients. I can also see it happening on my RHEL 7.x client when I specify the mount option "clientaddr=x.x.x.x" on each client that mounts the NFSv4.1 export.

We can see this in a packet trace.

The first attempt gives the error:

222         24.458904            10.193.67.174     10.193.67.219     NFS        334         V4 Call (Reply In 223) EXCHANGE_ID
223         24.459090            10.193.67.219     10.193.67.174     NFS        114         V4 Reply (Call In 222) EXCHANGE_ID Status: NFS4ERR_CLID_INUSE


The next attempt resolves the issue by incrementing the client ID from 0x00419319 to 0x0041931f:

293         30.251555            10.193.67.174     10.193.67.237     NFS        334         V4 Call (Reply In 294) EXCHANGE_ID
294         30.251753            10.193.67.237     10.193.67.174     NFS        238         V4 Reply (Call In 293) EXCHANGE_ID
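
To make the recovery path from RFC 5661 section 15.1.13.2 concrete, here is a minimal Python sketch of the collision and retry. It is a toy model, not the Linux NFS client or any server implementation: the ToyServer class, the mount_with_retry helper, the "vm1"/"vm2" principals, and the numeric suffix used to change the co_ownerid are all assumptions made for illustration.

```python
# Toy model of the EXCHANGE_ID collision described in RFC 5661, section 15.1.13.2.
# All names here are hypothetical; this is not the Linux NFS client code.

NFS4_OK = 0
NFS4ERR_CLID_INUSE = 10017  # error code listed in RFC 5661


class ToyServer:
    """Remembers which principal established each co_ownerid."""

    def __init__(self):
        self.owners = {}  # co_ownerid -> principal

    def exchange_id(self, co_ownerid, principal):
        holder = self.owners.get(co_ownerid)
        if holder is not None and holder != principal:
            # Same owner string, different principal: report a collision.
            return NFS4ERR_CLID_INUSE
        self.owners[co_ownerid] = principal
        return NFS4_OK


def mount_with_retry(server, hostname, principal, max_attempts=3):
    """Re-send EXCHANGE_ID with a changed co_ownerid after each collision."""
    for attempt in range(max_attempts):
        # Assumption: the owner string is derived from the host name, and a
        # numeric suffix is one possible way to change it on a retry.
        co_ownerid = hostname if attempt == 0 else f"{hostname}-{attempt}"
        if server.exchange_id(co_ownerid, principal) == NFS4_OK:
            return co_ownerid
    raise PermissionError("mount.nfs: Operation not permitted")


server = ToyServer()
first = mount_with_retry(server, "rheldb2eus2", principal="vm1")
second = mount_with_retry(server, "rheldb2eus2", principal="vm2")
print(first, second)
```

In this model the second client only succeeds because it changes its owner string after the collision. A client that gives up on the first NFS4ERR_CLID_INUSE would surface the failure to the user instead, which matches the symptom reported above.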

Comment 3 Yongcheng Yang 2020-04-08 01:24:06 UTC
(In reply to Anbu Govindasamy from comment #0)
...
> Additional info:
> ...
> It also worked on CentOS Linux release 7.3.1611 (Core) where
> nfs-utils-1.3.0-0.33.el7_3.src.rpm. 
>  
> Same thing is not working on RHEL 7.5 where nfs-utils-1.3.0-0.65.el7.x86_64.

JFYI, we have changed the default NFS mount protocol from NFSv4.0 to NFSv4.1 since RHEL-7.4 (via Bug 1375259).
And per the nfs(5) man page, the "clientaddr" option applies only to NFSv4.0 mounts, not 4.1.

Could you test again with a v4.0 mount?

(In reply to Justin Parisi from comment #2)
...
> What seems to be happening is that when both clients attempt the NFSv4.1
> mount, the 2nd client will get an NFS4ERR_CLID_INUSE error. 
> 
> We can see this in a packet trace:
> 
> 222         24.458904            10.193.67.174     10.193.67.219     NFS    334         V4 Call (Reply In 223) EXCHANGE_ID
> 223         24.459090            10.193.67.219     10.193.67.174     NFS    114         V4 Reply (Call In 222) EXCHANGE_ID Status: NFS4ERR_CLID_INUSE
> 
> This is expected behavior as per RFC-5661 in section 2.10.8.3:
> https://tools.ietf.org/html/rfc5661#section-2.10.8.3
> 
> However, my understanding is what is supposed to happen when
> NFS4ERR_CLID_INUSE is sent from the NFS server, the client is supposed to
> then retry the mount with a different client ID as per RFC-5661:
> 
> https://tools.ietf.org/html/rfc5661#section-15.1.13.2 
>  

Hello Benjamin, could you please help check how the NFS client should respond to NFS4ERR_CLID_INUSE?

Comment 4 Justin Parisi 2020-04-08 13:42:21 UTC
Thanks for the reply, Yang.

There is some confusion in the docs then. Under "man nfs" it says this:

> Options for NFS version 4 only

But then it says for v4 and newer:
> Use these options, along with the options in the first subsection above, for NFS version 4 and newer.
 
So, which is it?

If the option was not meant for v4.1, wouldn't I have gotten an error that says "incorrect mount option"? 

Either way, -clientaddr fixed the issue I was seeing in RHEL 7.x using v4.1.

Comment 5 Kevin Collins 2020-04-14 23:17:59 UTC
Justin - I am working with Anbu on this (as his customer) and I can confirm the man page for nfs (man 5 nfs) on RHEL7 shows the following under the clientaddr parameter description:

NFS protocol versions 4.1 and 4.2 use the client-established TCP connection for callback requests, so do not require the server to connect to the client.  This option therefore affects only NFS version 4.0 mounts.

The system I checked on is RHEL7.5 with nfs-utils-1.3.0-0.61.el7.x86_64.

Hopefully that adds some clarification.

Comment 7 Dave Wysochanski 2021-08-26 21:35:46 UTC
On Aug 6, 2020 Red Hat Enterprise Linux 7 entered Maintenance Support 2 Phase.

https://access.redhat.com/support/policy/updates/errata#Maintenance_Support_2_Phase

That means only "Critical and Important Security errata advisories (RHSAs) and Urgent Priority Bug Fix errata advisories (RHBAs) may be released". This BZ does not appear to meet the Maintenance Support 2 Phase criteria, so it is being closed WONTFIX. If this is critical for your environment, please open a case in the Red Hat Customer Portal, https://access.redhat.com, provide a thorough business justification, and ask that the BZ be re-opened for consideration in a future erratum.

Comment 9 Justin Parisi 2021-08-27 14:14:12 UTC
I did find an additional workaround for this issue.

By default, NFSv4.x uses the client’s host name for the client ID value when mounting to the NFS server. However, there are client-side NFS options you can leverage to change that default behavior and override the client ID used for NFSv4.x mounts.

To do this, set the NFS module option nfs4_unique_id to a static value on every client that shares a host name with another client; the value must differ between those clients. If you add this value to the /etc/modprobe.d/nfsclient.conf file, it is retained across reboots.

You can see the setting on the client as:
 # systool -v -m nfs | grep -i nfs4_unique
    nfs4_unique_id      = ""

To set it, run the following commands:
# echo options nfs nfs4_unique_id=[string] > /etc/modprobe.d/nfsclient.conf
# reboot

For example:
# echo options nfs nfs4_unique_id=uniquenfs4-1 > /etc/modprobe.d/nfsclient.conf

# systool -v -m nfs | grep -i nfs4_unique
    nfs4_unique_id      = "uniquenfs4-1"
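
For cloned VMs, the value has to differ on each clone even though the host name does not. The following Python sketch shows one possible way to derive such a value; the nfs4_unique_id_option helper and the "prod"/"dr" site tags are hypothetical, and the script only prints the line that would go into /etc/modprobe.d/nfsclient.conf rather than writing the file.

```python
import re


def nfs4_unique_id_option(hostname, site_tag):
    """Return a modprobe options line with a per-client nfs4_unique_id.

    site_tag is a hypothetical label (for example "prod" or "dr") that
    distinguishes clones sharing the same host name.
    """
    # Keep only characters that are safe in an unquoted modprobe value.
    value = re.sub(r"[^A-Za-z0-9_.-]", "-", f"{hostname}-{site_tag}")
    return f"options nfs nfs4_unique_id={value}"


# Two clones with the same host name get different identifiers.
lines = [nfs4_unique_id_option("rheldb2eus2", tag) for tag in ("prod", "dr")]
for line in lines:
    print(line)
```

Any scheme works as long as the resulting strings are stable across reboots and never repeat between clients that talk to the same NFS server.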