Bug 1821903
| Summary: | NFSv4.1 mount fails when VM name is exactly same | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Anbu Govindasamy <anbugovi> |
| Component: | nfs-utils | Assignee: | Steve Dickson <steved> |
| Status: | CLOSED WONTFIX | QA Contact: | Yongcheng Yang <yoyang> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 7.5 | CC: | bcodding, dwysocha, kcollins, parisi, radeltch, xzhou |
| Target Milestone: | rc | Flags: | pm-rhel: mirror+ |
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2021-08-27 14:21:01 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
Anbu Govindasamy
2020-04-07 19:31:25 UTC
To add some color to this...

What seems to be happening is that when both clients attempt the NFSv4.1 mount, the 2nd client will get an NFS4ERR_CLID_INUSE error.

We can see this in a packet trace:

222 24.458904 10.193.67.174 10.193.67.219 NFS 334 V4 Call (Reply In 223) EXCHANGE_ID
223 24.459090 10.193.67.219 10.193.67.174 NFS 114 V4 Reply (Call In 222) EXCHANGE_ID Status: NFS4ERR_CLID_INUSE

This is expected behavior as per RFC-5661 in section 2.10.8.3:
https://tools.ietf.org/html/rfc5661#section-2.10.8.3

However, my understanding is what is supposed to happen when NFS4ERR_CLID_INUSE is sent from the NFS server, the client is supposed to then retry the mount with a different client ID as per RFC-5661:

https://tools.ietf.org/html/rfc5661#section-15.1.13.2

15.1.13.2. NFS4ERR_CLID_INUSE (Error Code 10017)

While processing an EXCHANGE_ID operation, the server was presented with a co_ownerid field that matches an existing client with valid leased state, but the principal sending the EXCHANGE_ID operation differs from the principal that established the existing client. This indicates a collision (most likely due to chance) between clients. The client should recover by changing the co_ownerid and re-sending EXCHANGE_ID (but not with the same slot ID and sequence ID; one or both MUST be different on the re-send).

We can see that happening in my SLES15 clients. I can also see it happening on my RHEL 7.x client when I specify the mount option "clientaddr=x.x.x.x" on each client that mounts the NFSv4.1 export. We can see this in a packet trace.
The first attempt gives the error:

222 24.458904 10.193.67.174 10.193.67.219 NFS 334 V4 Call (Reply In 223) EXCHANGE_ID
223 24.459090 10.193.67.219 10.193.67.174 NFS 114 V4 Reply (Call In 222) EXCHANGE_ID Status: NFS4ERR_CLID_INUSE

The next attempt resolves the issue by incrementing the client ID from 0x00419319 to 0x0041931f:

293 30.251555 10.193.67.174 10.193.67.237 NFS 334 V4 Call (Reply In 294) EXCHANGE_ID
294 30.251753 10.193.67.237 10.193.67.174 NFS 238 V4 Reply (Call In 293) EXCHANGE_ID

(In reply to Anbu Govindasamy from comment #0)
...
> Additional info:
> ...
> It also worked on CentOS Linux release 7.3.1611 (Core) where
> nfs-utils-1.3.0-0.33.el7_3.src.rpm.
>
> Same thing is not working on RHEL 7.5 where nfs-utils-1.3.0-0.65.el7.x86_64.

JFYI, we have changed the default NFS mount protocol from NFSv4.0 to NFSv4.1 since RHEL-7.4 (via Bug 1375259). And per man page nfs(5), the option "clientaddr" is only for NFSv4.0 but not 4.1 mounts.

Is it possible to check with v4.0 mounting again?

(In reply to Justin Parisi from comment #2)
...
> What seems to be happening is that when both clients attempt the NFSv4.1
> mount, the 2nd client will get an NFS4ERR_CLID_INUSE error.
>
> We can see this in a packet trace:
>
> 222 24.458904 10.193.67.174 10.193.67.219 NFS 334 V4 Call (Reply In 223) EXCHANGE_ID
> 223 24.459090 10.193.67.219 10.193.67.174 NFS 114 V4 Reply (Call In 222) EXCHANGE_ID Status: NFS4ERR_CLID_INUSE
>
> This is expected behavior as per RFC-5661 in section 2.10.8.3:
> https://tools.ietf.org/html/rfc5661#section-2.10.8.3
>
> However, my understanding is what is supposed to happen when
> NFS4ERR_CLID_INUSE is sent from the NFS server, the client is supposed to
> then retry the mount with a different client ID as per RFC-5661:
>
> https://tools.ietf.org/html/rfc5661#section-15.1.13.2
>

Hello Benjamin, would you please help to check how the NFS client should respond to NFS4ERR_CLID_INUSE?

Thanks for the reply, Yang.

There is some confusion in the docs then.
Under "man nfs" it says this:

> Options for NFS version 4 only

But then it says for v4 and newer:

> Use these options, along with the options in the first subsection above, for NFS version 4 and newer.

So, which is it? If the option was not meant for v4.1, wouldn't I have gotten an error that says "incorrect mount option"? Either way, -clientaddr fixed the issue I was seeing in RHEL 7.x using v4.1.

Justin - I am working with Anbu on this (as his customer) and I can confirm the man page for nfs (man 5 nfs) on RHEL7 shows the following under the clientaddr parameter description:

NFS protocol versions 4.1 and 4.2 use the client-established TCP connection for callback requests, so do not require the server to connect to the client. This option is therefore only affect NFS version 4.0 mounts.

The system I checked on is RHEL7.5 with nfs-utils-1.3.0-0.61.el7.x86_64. Hopefully that adds some clarification.

On Aug 6, 2020 Red Hat Enterprise Linux 7 entered Maintenance Support 2 Phase.

https://access.redhat.com/support/policy/updates/errata#Maintenance_Support_2_Phase

That means only "Critical and Important Security errata advisories (RHSAs) and Urgent Priority Bug Fix errata advisories (RHBAs) may be released". This BZ does not appear to meet Maintenance Support 2 Phase criteria so is being closed WONTFIX.

If this is critical for your environment please open a case in the Red Hat Customer Portal, https://access.redhat.com, provide a thorough business justification and ask that the BZ be re-opened for consideration in a future erratum.

I did find an additional workaround for this issue.
By default, NFSv4.x uses the client’s host name for the client ID value when mounting to the NFS server. However, there are client-side NFS options you can leverage to change that default behavior and override the client ID used for NFSv4.x mounts.
To do this, set the NFS module option nfs4_unique_id to a static value on each client that shares a host name with another client, using a different value on each one. If you add this value to the /etc/modprobe.d/nfsclient.conf file, it will persist across reboots.
You can see the setting on the client as:
# systool -v -m nfs | grep -i nfs4_unique
nfs4_unique_id = ""
To set it, run the following command:
echo options nfs nfs4_unique_id=[string] > /etc/modprobe.d/nfsclient.conf
reboot
For example:
# echo options nfs nfs4_unique_id=uniquenfs4-1 > /etc/modprobe.d/nfsclient.conf
# systool -v -m nfs | grep -i nfs4_unique
nfs4_unique_id = "uniquenfs4-1"
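The steps above could be wrapped in a small helper, sketched here only as an illustration. The function name write_nfs4_unique_id and the scratch-file demonstration are hypothetical and not part of the original workaround. The helper writes the same "options nfs nfs4_unique_id=..." line as the echo command shown earlier and, like that command, overwrites the target file.

```shell
#!/bin/sh
# Hypothetical helper (not from the original report): write the
# nfs4_unique_id module option to a modprobe.d-style config file.
#   $1 = unique string for this client
#   $2 = config file to (over)write
write_nfs4_unique_id() {
    id="$1"
    conf="$2"
    # Reject empty values and values containing spaces, since either
    # would produce a malformed "options" line.
    case "$id" in
        ""|*" "*) echo "invalid id: '$id'" >&2; return 1 ;;
    esac
    printf 'options nfs nfs4_unique_id=%s\n' "$id" > "$conf"
}

# Demonstration against a scratch file, so no root access is needed.
demo_conf="$(mktemp)"
write_nfs4_unique_id "uniquenfs4-1" "$demo_conf"
cat "$demo_conf"
```

On a real client the second argument would be /etc/modprobe.d/nfsclient.conf, run as root and followed by a reboot as described above, with a different string chosen on each client.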