Bug 1352856

Summary: NFS fails to mount on boot if both client and server were rebooted at the same time
Product: Red Hat Enterprise Linux 6
Reporter: Jason Woods <devel>
Component: nfs-utils
Assignee: Steve Dickson <steved>
Status: CLOSED ERRATA
QA Contact: Yongcheng Yang <yoyang>
Severity: medium
Docs Contact:
Priority: high
Version: 6.8
CC: eguan, steved, swhiteho
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: nfs-utils-1.2.3-73.el6
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1404121 (view as bug list)
Environment:
Last Closed: 2017-03-21 11:24:18 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1404121

Description Jason Woods 2016-07-05 09:33:58 UTC
Description of problem:
Rebooting both an NFS client and its server at the same time causes the following error to be reported during the boot sequence when the NFS client is configured in /etc/fstab to mount automatically on boot. No retries are performed and the mount point is left unmounted until it is mounted manually with "mount -a".

/etc/fstab:
123.123.123.123:/data/location /data/location nfs _netdev,nfsvers=3,proto=tcp,retry=10000 0 0

Error in boot.log:
NFS filesystems queued to be mounted
Setting up Logical Volume Management:                      [  OK  ]
Checking network-attached filesystems
                                                           [  OK  ]
Mounting filesystems:  mount.nfs: requested NFS version or transport protocol is not supported
                                                           [FAILED]

Version-Release number of selected component (if applicable):
nfs-utils-1.2.3-70.el6.x86_64
nfs-utils-lib-1.1.5-11.el6.x86_64
nfs4-acl-tools-0.3.3-8.el6.x86_64

How reproducible:
Almost always (sometimes it works OK - depending on how quickly server boots)

Steps to Reproduce:
1. Set up an NFS mount point to mount on boot using _netdev
2. Reboot both NFS server and NFS client

Actual results:
Mounting filesystems:  mount.nfs: requested NFS version or transport protocol is not supported
The filesystem is not mounted and the mount is never retried

Expected results:
Mount is mounted, either immediately, or within the retry period if server is still booting

Additional info:
N/A

Comment 3 Yongcheng Yang 2016-07-06 03:43:09 UTC
Have reproduced this issue in RHEL-6.8.
But IMHO this is only because the NFS server's nfs services have not been brought up yet. (Maybe the "not supported" warning is confusing.)

After reboot both server and client, on the client side:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[root@ibm-x3650m4-08 ~]# tail -1 /etc/fstab 
ibm-x3250m4-07.rhts.eng.pek2.redhat.com:/export_test /mnt/mnt_test nfs _netdev,nfsvers=3,proto=tcp,retry=10000 0 0
[root@ibm-x3650m4-08 ~]# nfsstat -m
[root@ibm-x3650m4-08 ~]# cat /var/log/boot.log | grep mount.nfs
Mounting filesystems:  mount.nfs: requested NFS version or transport protocol is not supported
[root@ibm-x3650m4-08 ~]# 

There might be a simpler reproducer.
Just stop the server's nfs service and try to mount it on the client side; the same warning is emitted.

On the nfs server:
^^^^^^^^^^^^^^^^^^
[root@ibm-x3250m4-07 ~]# service nfs stop
Shutting down NFS daemon: [  OK  ]
Shutting down NFS mountd: [  OK  ]
Shutting down NFS quotas: [  OK  ]
Shutting down NFS services:  [  OK  ]
Shutting down RPC idmapd: [  OK  ]
[root@ibm-x3250m4-07 ~]# 

Then on the client side:
^^^^^^^^^^^^^^^^^^^^^^^
[root@ibm-x3650m4-08 ~]# showmount -e ibm-x3250m4-07.rhts.eng.pek2.redhat.com
clnt_create: RPC: Program not registered
[root@ibm-x3650m4-08 ~]# mount -a
mount.nfs: requested NFS version or transport protocol is not supported
[root@ibm-x3650m4-08 ~]# mount -o vers=3 ibm-x3250m4-07.rhts.eng.pek2.redhat.com:/export_test /mnt/mnt_test/
mount.nfs: requested NFS version or transport protocol is not supported
[root@ibm-x3650m4-08 ~]# 

After starting server's nfs service, on the client side:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[root@ibm-x3650m4-08 ~]# showmount -e ibm-x3250m4-07.rhts.eng.pek2.redhat.com
Export list for ibm-x3250m4-07.rhts.eng.pek2.redhat.com:
/export_test *
[root@ibm-x3650m4-08 ~]# mount -a
[root@ibm-x3650m4-08 ~]# nfsstat -m
/mnt/mnt_test from ibm-x3250m4-07.rhts.eng.pek2.redhat.com:/export_test
 Flags:	rw,relatime,vers=3,rsize=524288,wsize=524288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.73.4.153,mountvers=3,mountport=50268,mountproto=tcp,local_lock=none,addr=10.73.4.153

[root@ibm-x3650m4-08 ~]# umount /mnt/mnt_test/

Comment 4 Jason Woods 2016-07-06 10:46:36 UTC
My main issue is that due to the "requested NFS version or transport protocol is not supported" the specified retry option is ignored. It is perfectly fine from my perspective to throw that warning up but I would expect it to keep retrying.

It seems that when this error occurs, "requested NFS version or transport protocol is not supported", the retry of 10000 minutes is ignored.

Comment 5 Jason Woods 2016-07-06 11:04:01 UTC
I can see nfs-utils has this fixed for background mounts in 1.3.3, and the version 1.2.3 that Red Hat ships doesn't have this fix (though I have not checked the SRPM to see if there's a patch for it).

However, for foreground mounts it seems there is no such fix and still might be a problem in 1.3.3.

In nfsmount_fg, the retry loop exits as soon as nfs_is_permanent_error() reports a permanent error. EOPNOTSUPP, which seems to correspond to "requested NFS version or transport protocol is not supported", is treated as permanent, meaning the loop is exited.

However, in 1.3.3, for background mounts, in nfsmount_parent (first attempt) and also nfsmount_child (subsequent background attempts), there is an explicit check for EOPNOTSUPP which allows the retries to continue even if EOPNOTSUPP occurs.
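The difference described above can be illustrated with a minimal sketch. This is not the nfs-utils source: the names `attempt_fn`, `fake_server`, `fake_attempt`, `is_permanent_error_before_fix` and `retry_mount` are hypothetical, the exact set of errno values treated as temporary is an assumption, and the loop counts attempts instead of honouring the real time-based retry= deadline so that it stays deterministic. The flag `retry_eopnotsupp` stands in for the explicit EOPNOTSUPP check that the background path is said to have.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback: one mount attempt, returns 0 or an errno value. */
typedef int (*attempt_fn)(void *ctx);

/* Hypothetical server that answers EOPNOTSUPP until its services are up. */
struct fake_server {
    int attempts_until_ready;
};

static int fake_attempt(void *ctx)
{
    struct fake_server *srv = ctx;

    if (srv->attempts_until_ready > 0) {
        srv->attempts_until_ready--;
        return EOPNOTSUPP; /* "requested NFS version ... is not supported" */
    }
    return 0;
}

/* Assumed pre-fix classification: EOPNOTSUPP is not listed as temporary. */
static int is_permanent_error_before_fix(int err)
{
    switch (err) {
    case ETIMEDOUT:
    case ECONNREFUSED:
    case EHOSTUNREACH:
        return 0; /* temporary, keep retrying */
    default:
        return 1; /* permanent, give up */
    }
}

/*
 * Retry until success, a permanent error, or max_attempts is reached.
 * retry_eopnotsupp = 0 models the foreground behaviour described above,
 * retry_eopnotsupp = 1 models the background behaviour.
 */
static int retry_mount(attempt_fn attempt, void *ctx, int max_attempts,
                       int retry_eopnotsupp, int *attempts_used)
{
    int err = EINVAL;
    int n = 0;

    assert(attempt != NULL);
    while (n < max_attempts) {
        err = attempt(ctx);
        n++;
        if (err == 0)
            break;
        if (err == EOPNOTSUPP && retry_eopnotsupp)
            continue; /* explicit exception, as in the background path */
        if (is_permanent_error_before_fix(err))
            break;
    }
    if (attempts_used != NULL)
        *attempts_used = n;
    return err;
}
```

With a fake server that needs three attempts to come up, the foreground-style call gives up after the first EOPNOTSUPP, while the background-style call keeps going and succeeds on the fourth attempt.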

For now I have changed to using a background mount to see if it fixes my problem, and will eventually examine the SRPM to see if Red Hat patched in the background mount fix (does anyone know if they have?)

It would be great to see the foreground mount fixed upstream too, and also in Red Hat Enterprise Linux 6 if at all possible.

Thanks everyone!

Comment 6 Steve Dickson 2016-07-06 11:11:14 UTC
I think the answer here is to use background mounts. You really don't
want foreground mounts to hang, since that would hang the
entire boot process for an indefinite amount of time.

Comment 7 Yongcheng Yang 2016-07-06 11:43:24 UTC
(In reply to Jason Woods from comment #4)
> My main issue is that due to the "requested NFS version or transport
> protocol is not supported" the specified retry option is ignored. It is
> perfectly fine from my perspective to throw that warning up but I would
> expect it to keep retrying.
Sorry for not getting this point before. Thanks for clarifying.

Assuming from comment 6, we'll fix it using background mounts to keep trying to mount it.

> 
> It seems when this error occurs, "requested NFS version or transport
> protocol is not supported", the retry of 10000 minutes is ignored

Comment 8 Jason Woods 2016-07-06 12:16:12 UTC
(In reply to Steve Dickson from comment #6)
> I think the answer here is to use background mounts.

Does this mean the Red Hat package does have the fix for the background mount? As mentioned, looking at the 1.2.3 upstream source, the fix is not there. Does the Red Hat package include it as a patch?

(In reply to Steve Dickson from comment #6)
> You really don't
> want foreground mounts to hang, since that would hang the
> entire boot process for an indefinite amount of time.

Maybe, but it is still an option that a user can select and should therefore work.

The issue I think is that foreground and background mount currently have differing opinions on what is actually a permanent failure. Background mount retries on EOPNOTSUPP. Foreground does not. I think this is a problem and confusing to say the least.

Regarding the indefinite hang: my retry value of 10000 is a really bad example, as it should only be about 10-15 minutes. I had it at 10000 because I was battling with this issue and playing with the numbers; even at 10-15 it does not retry on EOPNOTSUPP. I also have startup scripts in /etc/init.d/ starting services that depend on the network file system, so I really do need the foreground option to retry correctly for the specified time.

Background mount is not a viable permanent fix for me, and I'm only testing it to see if it does indeed behave differently from foreground.

Thanks

Comment 10 Steve Dickson 2016-08-25 14:45:59 UTC
It appears upstream agrees with you due to this recent commit:

commit df0b99980d74505299e9289c2ccddd03a48b664f
Author: NeilBrown <neilb>
Date:   Sat Aug 20 10:39:52 2016 -0400

    mount: RPC_PROGNOTREGISTERED should not be a permanent error
    
    Commit: bf66c9facb8e ("mounts.nfs: v2 and v3 background mounts should
    retry when server is down.")
    
    changed the behaviour of "bg" mounts so that RPC_PROGNOTREGISTERED,
    which maps to EOPNOTSUPP, is not a permanent error.
    This useful because when an NFS server starts up there is a small window
    between the moment that rpcbind (or portmap) starts responding to lookup
    requests, and the moment when nfsd registers with rpcbind.  During that window
    rpcbind will reply with RPC_PROGNOTREGISTERED, but mount should not give
    up.
    
    This same reasoning applies to foreground mounts.  They don't wait for
    as long, but could still hit the window and fail prematurely.
    
    So revert the above patch and instead add EOPNOTSUPP to the list of
    temporary errors known to nfs_is_permanent_error.
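As a rough illustration of what the commit message describes, the sketch below maps an RPC "program not registered" status to EOPNOTSUPP and then classifies EOPNOTSUPP as temporary. It is not the actual patch: the `sketch_` names and the local enum are hypothetical stand-ins for the real RPC status codes, and the other errno values in the temporary list are assumptions.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for RPC status codes; not the system definitions. */
enum sketch_rpc_status {
    SKETCH_RPC_SUCCESS,
    SKETCH_RPC_PROGNOTREGISTERED,
    SKETCH_RPC_TIMEDOUT
};

/* Map an RPC status to an errno value, as the commit message describes
 * for RPC_PROGNOTREGISTERED -> EOPNOTSUPP. */
static int sketch_rpc_status_to_errno(enum sketch_rpc_status status)
{
    switch (status) {
    case SKETCH_RPC_SUCCESS:
        return 0;
    case SKETCH_RPC_PROGNOTREGISTERED:
        return EOPNOTSUPP;
    case SKETCH_RPC_TIMEDOUT:
        return ETIMEDOUT;
    }
    return EIO; /* unknown status: assumed generic failure */
}

/* Post-fix classification: EOPNOTSUPP joins the temporary errors, so
 * both foreground and background retry loops keep trying. */
static int sketch_is_permanent_error(int err)
{
    assert(err >= 0); /* errno values are non-negative */
    switch (err) {
    case ETIMEDOUT:
    case ECONNREFUSED:
    case EHOSTUNREACH:
    case EOPNOTSUPP: /* server up, nfsd not yet registered with rpcbind */
        return 0;
    default:
        return 1;
    }
}
```

Putting the exception in the shared classifier, instead of in each retry loop, is what lets foreground and background mounts agree on which failures are worth retrying.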

Comment 14 Yongcheng Yang 2016-11-18 11:01:34 UTC
Moving to VERIFIED according to test logs of Comment #13.

We will also keep running this automated case as a regression test in the future.

Comment 16 errata-xmlrpc 2017-03-21 11:24:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2017-0741.html