Bug 1352856
| Summary: | NFS fails to mount on boot if both client and server were rebooted at the same time | |||
|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Jason Woods <devel> | |
| Component: | nfs-utils | Assignee: | Steve Dickson <steved> | |
| Status: | CLOSED ERRATA | QA Contact: | Yongcheng Yang <yoyang> | |
| Severity: | medium | Docs Contact: | ||
| Priority: | high | |||
| Version: | 6.8 | CC: | eguan, steved, swhiteho | |
| Target Milestone: | rc | |||
| Target Release: | --- | |||
| Hardware: | x86_64 | |||
| OS: | Linux | |||
| Whiteboard: | ||||
| Fixed In Version: | nfs-utils-1.2.3-73.el6 | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1404121 (view as bug list) | Environment: | ||
| Last Closed: | 2017-03-21 11:24:18 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 1404121 | |||
I have reproduced this issue in RHEL-6.8. But, IMHO, this is only because the NFS server's nfs services have not been brought up yet (maybe the warning "not supported" is confusing).

After rebooting both server and client, on the client side:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[root@ibm-x3650m4-08 ~]# tail -1 /etc/fstab
ibm-x3250m4-07.rhts.eng.pek2.redhat.com:/export_test /mnt/mnt_test nfs _netdev,nfsvers=3,proto=tcp,retry=10000 0 0
[root@ibm-x3650m4-08 ~]# nfsstat -m
[root@ibm-x3650m4-08 ~]# cat /var/log/boot.log | grep mount.nfs
Mounting filesystems: mount.nfs: requested NFS version or transport protocol is not supported
[root@ibm-x3650m4-08 ~]#

There might be a simpler reproducer. Just stop the server's nfs service and try to mount it on the client side; the same warning is emitted.

On the nfs server:
^^^^^^^^^^^^^^^^^^
[root@ibm-x3250m4-07 ~]# service nfs stop
Shutting down NFS daemon: [ OK ]
Shutting down NFS mountd: [ OK ]
Shutting down NFS quotas: [ OK ]
Shutting down NFS services: [ OK ]
Shutting down RPC idmapd: [ OK ]
[root@ibm-x3250m4-07 ~]#

Then on the client side:
^^^^^^^^^^^^^^^^^^^^^^^^
[root@ibm-x3650m4-08 ~]# showmount -e ibm-x3250m4-07.rhts.eng.pek2.redhat.com
clnt_create: RPC: Program not registered
[root@ibm-x3650m4-08 ~]# mount -a
mount.nfs: requested NFS version or transport protocol is not supported
[root@ibm-x3650m4-08 ~]# mount -o vers=3 ibm-x3250m4-07.rhts.eng.pek2.redhat.com:/export_test /mnt/mnt_test/
mount.nfs: requested NFS version or transport protocol is not supported
[root@ibm-x3650m4-08 ~]#

After starting server's nfs service, on the client side:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[root@ibm-x3650m4-08 ~]# showmount -e ibm-x3250m4-07.rhts.eng.pek2.redhat.com
Export list for ibm-x3250m4-07.rhts.eng.pek2.redhat.com:
/export_test *
[root@ibm-x3650m4-08 ~]# mount -a
[root@ibm-x3650m4-08 ~]# nfsstat -m
/mnt/mnt_test from ibm-x3250m4-07.rhts.eng.pek2.redhat.com:/export_test
 Flags:
rw,relatime,vers=3,rsize=524288,wsize=524288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.73.4.153,mountvers=3,mountport=50268,mountproto=tcp,local_lock=none,addr=10.73.4.153
[root@ibm-x3650m4-08 ~]# umount /mnt/mnt_test/

My main issue is that, due to the "requested NFS version or transport protocol is not supported" error, the specified retry option is ignored. It is perfectly fine from my perspective to throw that warning up, but I would expect it to keep retrying.

It seems that when this error occurs, "requested NFS version or transport protocol is not supported", the retry of 10000 minutes is ignored.

I can see nfs-utils has this fixed for background mounts in 1.3.3, and the version 1.2.3 that Red Hat ships doesn't have this fix (though I have not checked the SRPM to see if there's a patch for it).

However, for foreground mounts it seems there is no such fix, and it might still be a problem in 1.3.3.

In nfsmount_fg, the retry loop is broken on nfs_is_permanent_error(). EOPNOTSUPP, which seems to correspond to "requested NFS version or transport protocol is not supported", is treated as permanent, meaning the loop is exited. However, in 1.3.3, for background mounts, in nfsmount_parent (first attempt) and also nfsmount_child (subsequent background attempts), there is an explicit check for EOPNOTSUPP which allows the retries to continue even if EOPNOTSUPP occurs.

For now I have changed to using a background mount to see if it fixes my problem, and will examine the SRPM eventually to see if Red Hat patched in the background mount fix (does anyone know if they have?).

It would be great to see the foreground mount fixed upstream too, and also in Red Hat 6 if at all possible.

Thanks everyone!

I think the answer here is to use background mounts. You really don't want foreground mounts to hang, since that would hang the entire boot process for an indefinite amount of time.
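The foreground/background difference described in comment 4 can be modeled roughly as below. This is a minimal Python sketch, not nfs-utils code: `mount_with_retry`, `booting_server`, and the two error sets are hypothetical names, the retry budget is counted in attempts rather than minutes, and the errno values other than EOPNOTSUPP are assumptions chosen only to make the sets non-empty.

```python
import errno

# Assumed set of errors that both mount paths treat as retryable (illustrative only).
COMMON_TEMPORARY = {errno.ETIMEDOUT, errno.ECONNREFUSED, errno.EHOSTUNREACH}

# Per comment 4: the foreground path treats EOPNOTSUPP as permanent,
# while the 1.3.3 background path explicitly lets retries continue on it.
FOREGROUND_TEMPORARY = set(COMMON_TEMPORARY)
BACKGROUND_TEMPORARY = COMMON_TEMPORARY | {errno.EOPNOTSUPP}


def mount_with_retry(try_mount, temporary_errors, max_attempts):
    """Call try_mount() until it returns 0, hits a permanent error,
    or max_attempts is exhausted. Returns (final_errno, attempts_made)."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    err = 0
    for attempt in range(1, max_attempts + 1):
        err = try_mount()
        if err == 0:
            return 0, attempt
        if err not in temporary_errors:
            # Permanent error: leave the loop even though budget remains.
            return err, attempt
    return err, max_attempts


def booting_server(not_ready_attempts):
    """Fake server whose nfsd is not yet registered for the first N attempts."""
    state = {"calls": 0}

    def try_mount():
        state["calls"] += 1
        return errno.EOPNOTSUPP if state["calls"] <= not_ready_attempts else 0

    return try_mount


fg_result = mount_with_retry(booting_server(2), FOREGROUND_TEMPORARY, max_attempts=10)
bg_result = mount_with_retry(booting_server(2), BACKGROUND_TEMPORARY, max_attempts=10)
print("foreground:", fg_result)
print("background:", bg_result)
```

In this model the foreground call gives up on its first attempt with EOPNOTSUPP, while the background call succeeds on the third attempt, which mirrors the behavior reported in this bug.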
(In reply to Jason Woods from comment #4)
> My main issue is that due to the "requested NFS version or transport
> protocol is not supported" the specified retry option is ignored. It is
> perfectly fine from my perspective to throw that warning up but I would
> expect it to keep retrying.

Sorry for not getting this point before. Thanks for the clarification.

Assuming from comment 6, we'll fix it using background mounts to keep trying to mount it.

>
> It seems when this error occurs, "requested NFS version or transport
> protocol is not supported", the retry of 10000 minutes is ignored

(In reply to Steve Dickson from comment #6)
> I think the answer here is to use background mounts.

Does this mean the Red Hat package does have the fix in for the background mount? As mentioned, looking at the 1.2.3 upstream source, the fix is not there. Does the Red Hat package include it as a patch?

(In reply to Steve Dickson from comment #6)
> You really don't
> want foreground mounts to hang, since it would hang the
> entire boot process for an indefinite amount of time.

Maybe, but it is still an option that a user can select and should therefore work.

The issue, I think, is that foreground and background mounts currently have differing opinions on what is actually a permanent failure. Background mount retries on EOPNOTSUPP. Foreground does not. I think this is a problem, and confusing to say the least.

Regarding the indefinite hang: my example of 10000 for retry is a really bad example, as it really should only be about 10-15 minutes. The reason I had it at 10000 is that I was battling with this issue and playing with the numbers; even at 10-15 it does not retry on EOPNOTSUPP.

I also have startup scripts in /etc/init.d/ starting services that depend on the network file system, so I really do need the foreground option to retry correctly for the specified time.
Background mount is not a viable permanent fix for me, and I'm only testing it to see whether it does indeed behave differently from foreground.

Thanks

It appears upstream agrees with you, given this recent commit:
commit df0b99980d74505299e9289c2ccddd03a48b664f
Author: NeilBrown <neilb>
Date: Sat Aug 20 10:39:52 2016 -0400

    mount: RPC_PROGNOTREGISTERED should not be a permanent error

    Commit: bf66c9facb8e ("mounts.nfs: v2 and v3 background mounts should
    retry when server is down.")

    changed the behaviour of "bg" mounts so that RPC_PROGNOTREGISTERED,
    which maps to EOPNOTSUPP, is not a permanent error.

    This is useful because when an NFS server starts up there is a small window
    between the moment that rpcbind (or portmap) starts responding to lookup
    requests, and the moment when nfsd registers with rpcbind. During that window
    rpcbind will reply with RPC_PROGNOTREGISTERED, but mount should not give
    up.

    This same reasoning applies to foreground mounts. They don't wait for
    as long, but could still hit the window and fail prematurely.

    So revert the above patch and instead add EOPNOTSUPP to the list of
    temporary errors known to nfs_is_permanent_error.
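The effect of the change described in this commit message can be sketched as a single shared classifier. The following is a Python model under stated assumptions, not the C code from nfs-utils: `is_permanent_error` is a stand-in for nfs_is_permanent_error, EOPNOTSUPP is the addition the commit describes, and the other members of the temporary set are assumed for illustration.

```python
import errno

# Errors treated as temporary (worth retrying). Everything else is permanent.
# EOPNOTSUPP is the entry the commit message says to add; the remaining
# members are assumptions for this sketch, not a verified copy of the C list.
TEMPORARY_ERRORS = frozenset({
    errno.ESTALE,
    errno.ETIMEDOUT,
    errno.ECONNREFUSED,
    errno.EHOSTUNREACH,
    errno.EAGAIN,
    errno.EOPNOTSUPP,  # what RPC_PROGNOTREGISTERED maps to, per the commit message
})


def is_permanent_error(err):
    """Return True if a mount attempt failing with err should not be retried."""
    return err not in TEMPORARY_ERRORS


# A server whose rpcbind answers before nfsd has registered yields EOPNOTSUPP,
# which is now retryable for any caller that consults this one classifier.
print(is_permanent_error(errno.EOPNOTSUPP))
print(is_permanent_error(errno.EACCES))
```

Keeping the decision in one function means foreground and background mounts cannot disagree about what counts as a permanent failure, which is the inconsistency raised earlier in this report.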
Moving to VERIFIED according to the test logs in comment #13. We will also keep running this automated case as a regression test in future.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2017-0741.html
Description of problem:
Rebooting both an NFS client and its server at the same time causes the following error to be reported during boot sequence when the NFS client is configured to mount automatically on boot inside /etc/fstab. No retries are performed and the mount point is left unmounted until manually mounted with "mount -a".

/etc/fstab:
123.123.123.123:/data/location /data/location nfs _netdev,nfsvers=3,proto=tcp,retry=10000 0 0

Error in boot.log:
NFS filesystems queued to be mounted
Setting up Logical Volume Management: [ OK ]
Checking network-attached filesystems [ OK ]
Mounting filesystems: mount.nfs: requested NFS version or transport protocol is not supported
[FAILED]

Version-Release number of selected component (if applicable):
nfs-utils-1.2.3-70.el6.x86_64
nfs-utils-lib-1.1.5-11.el6.x86_64
nfs4-acl-tools-0.3.3-8.el6.x86_64

How reproducible:
Almost always (sometimes it works OK - depending on how quickly server boots)

Steps to Reproduce:
1. Setup NFS mount point to mount on boot using _netdev
2. Reboot both NFS server and NFS client

Actual results:
Mounting filesystems: mount.nfs: requested NFS version or transport protocol is not supported
Mount is not mounted and never retried

Expected results:
Mount is mounted, either immediately, or within the retry period if server is still booting

Additional info:
N/A
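For readers unfamiliar with the options field in the /etc/fstab entry above, the sketch below shows how it maps onto a retry policy. It is an illustrative Python helper, not part of mount.nfs: `retry_policy` is a hypothetical name, and the default values (2 minutes for foreground, 10000 minutes for background) are taken from the nfs(5) man page as I understand it, so treat them as assumptions.

```python
# Defaults as documented in nfs(5) for the retry= option (assumed here).
FG_DEFAULT_RETRY_MIN = 2
BG_DEFAULT_RETRY_MIN = 10000


def retry_policy(options):
    """Parse a comma-separated NFS option string and return
    (mode, retry_minutes), where mode is 'foreground' or 'background'."""
    background = False
    retry = None
    for opt in options.split(","):
        if opt == "bg":
            background = True
        elif opt == "fg":
            background = False
        elif opt.startswith("retry="):
            retry = int(opt.split("=", 1)[1])
    if retry is None:
        retry = BG_DEFAULT_RETRY_MIN if background else FG_DEFAULT_RETRY_MIN
    return ("background" if background else "foreground", retry)


# The option string from the fstab entry in this report.
reported = retry_policy("_netdev,nfsvers=3,proto=tcp,retry=10000")
# The same entry with bg added, as tried as a workaround in the thread.
workaround = retry_policy("_netdev,nfsvers=3,proto=tcp,retry=10000,bg")
print(reported)
print(workaround)
```

The reported entry has no bg option, so it is a foreground mount with a 10000-minute retry budget, which is exactly the case where the retry budget was being cut short by EOPNOTSUPP.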