Bug 513496

Summary: Upgrading nfs-utils on NFSv4 server kills downstream NFSv4 clients
Product: [Fedora] Fedora Reporter: Jeff Garzik <jgarzik>
Component: nfs-utils Assignee: Steve Dickson <steved>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high Docs Contact:
Priority: high    
Version: 11 CC: amessina, bfields, braden, ckjohnson, jbastian, jussi.eloranta, peterm, roth, signo, sprabhu, steved, stijn
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: 1.2.0-4.fc11 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 512377 Environment:
Last Closed: 2009-07-31 14:00:28 EDT Type: ---

Description Jeff Garzik 2009-07-23 18:05:25 EDT
+++ This bug was initially created as a clone of Bug #512377 +++

Description of problem:
Last night's Fedora 11 nfs-utils upgrade broke NFSv4.

Problem machine:  Fedora 11/x86-64, NFSv4 server (kernel version below)

Attached was a lone NFSv4 client:  Fedora 10/x86-64, kernel 2.6.30 vanilla.

After yum upgraded nfs-utils, syscalls on the NFSv4 client started failing with "Protocol not supported".

On the NFSv4 server, where the nfs-utils package upgrade had just taken place, the log was filling with a steady stream of messages:
Jul 17 05:58:14 pretzel kernel: svc: 10.10.20.30, port=763: unknown version (4 for prog 100003, nfsd)
Jul 17 05:58:15 pretzel kernel: svc: 10.10.20.30, port=763: unknown version (4 for prog 100003, nfsd)
Jul 17 05:58:15 pretzel kernel: svc: 10.10.20.30, port=763: unknown version (4 for prog 100003, nfsd)
Jul 17 05:58:17 pretzel kernel: svc: 10.10.20.30, port=763: unknown version (4 for prog 100003, nfsd)
Jul 17 05:58:17 pretzel kernel: svc: 10.10.20.30, port=763: unknown version (4 for prog 100003, nfsd)

10.10.20.30 is the NFSv4 client mentioned above.
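
For anyone triaging a log full of these messages, here is a minimal sketch that pulls the client address and the rejected version out of each line. The regular expression is written against the sample lines quoted above only, and the function name is made up for illustration; 100003 is the RPC program number for NFS.

```python
import re

# Pattern for the kernel "svc: ... unknown version" message quoted above.
# It is derived from those sample lines only, not from kernel sources.
SVC_UNKNOWN_VERSION = re.compile(
    r"svc: (?P<client>[\d.]+), port=(?P<port>\d+): "
    r"unknown version \((?P<version>\d+) for prog (?P<prog>\d+), (?P<name>\w+)\)"
)


def parse_unknown_version(line):
    """Return the fields of an 'unknown version' line, or None if it does not match."""
    match = SVC_UNKNOWN_VERSION.search(line)
    if match is None:
        return None
    return {
        "client": match.group("client"),
        "port": int(match.group("port")),
        "version": int(match.group("version")),
        "prog": int(match.group("prog")),
        "name": match.group("name"),
    }


sample = (
    "Jul 17 05:58:14 pretzel kernel: svc: 10.10.20.30, port=763: "
    "unknown version (4 for prog 100003, nfsd)"
)
print(parse_unknown_version(sample))
```

A "version" of 4 for program 100003 means the server is rejecting NFSv4 calls outright, which matches the "Protocol not supported" errors seen on the client.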


Version-Release number of selected component (if applicable):
nfs-utils-1.2.0-3.fc11.x86_64
kernel-2.6.29.5-191.fc11.x86_64


How reproducible:
unknown

Steps to Reproduce:
1. set up nfsv4 server, nfsv4 client.
2. mount server, on client
3. upgrade nfs-utils on server
  
Actual results:
see above

Expected results:
working client

Additional info:
NFSv4 client /etc/fstab line:
pretzel:/		/g			nfs4	defaults,noatime 0 0

(pretzel == the problematic NFSv4 server)

--- Additional comment from jussi.eloranta@csun.edu on 2009-07-17 12:30:45 EDT ---

I ran into the same problem on a 32-bit system. This caused a major headache this morning :-(

--- Additional comment from bfields@fieldses.org on 2009-07-17 12:58:52 EDT ---

/proc/fs/nfsd/versions was extended to allow turning minor versions on or off by echoing "+4.1" or "-4.1" to /proc/fs/nfsd/versions.

Unfortunately, pre-2.6.30 kernels just stop parsing at the first non-digit, so "-4.1" is interpreted as "-4".  If a new nfs-utils (on an old kernel) writes "+2", "+3", "+4", then "-4.1", the result is therefore to turn off version 4 entirely, not just 4.1....

Turning off the minor version first should work.  So would not bothering at all, since the kernel leaves it off by default.

But the interface now seems more delicate than intended.  It might be better to violate the rule against changing upstream user<->kernel APIs (nobody wants 4.1 yet anyway) and add a new /proc/fs/nfsd/v4_minor_versions file instead....
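
To make the ordering problem in this comment concrete, here is a small model of the two parsing behaviours. It is a simplified sketch, not the kernel's code: the function name and the `minor_aware` flag are invented for illustration, and the model assumes every token starts with "+" or "-" and that nothing is enabled beforehand.

```python
def apply_versions(tokens, minor_aware):
    """Model how a sequence of "+N" / "-N" writes changes the enabled set.

    minor_aware=False mimics the pre-2.6.30 behaviour described above:
    parsing stops at the first non-digit, so "-4.1" is read as "-4".
    This is an illustrative model only, not kernel code.
    """
    enabled = set()
    for token in tokens:
        sign, number = token[0], token[1:]
        if not minor_aware:
            # Old behaviour: keep only the leading run of digits.
            digits = ""
            for ch in number:
                if not ch.isdigit():
                    break
                digits += ch
            number = digits
        if sign == "+":
            enabled.add(number)
        elif sign == "-":
            enabled.discard(number)
    return sorted(enabled)


# Order used by the new nfs-utils, on an old kernel: v4 ends up disabled.
print(apply_versions(["+2", "+3", "+4", "-4.1"], minor_aware=False))
# Turning the minor version off first: v4 survives.
print(apply_versions(["-4.1", "+2", "+3", "+4"], minor_aware=False))
# Same order as the first call, on a kernel that understands minor versions.
print(apply_versions(["+2", "+3", "+4", "-4.1"], minor_aware=True))
```

In this model the first call leaves only versions 2 and 3 enabled, while the other two keep version 4, which is the difference between a broken and a working NFSv4 server.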

--- Additional comment from amessina@messinet.com on 2009-07-17 19:27:46 EDT ---

What is the proposed workaround for F11 until the 2.6.30 kernel is released, besides reverting to NFSv3?  This has killed my entire network with NFSv4 mounted /home directories.

I'm confused as to why this nfs-utils would be pushed to stable without the necessary kernel to support the change for NFSv4.

--- Additional comment from jbastian@redhat.com on 2009-07-22 11:23:30 EDT ---

A colleague of mine, Sachin Prabhu, suggested removing the '-N 4.1' argument from the rpc.nfsd line in /etc/rc.d/init.d/nfs; see below.

This fixed it for me.  Of course, NOT disabling 4.1 may introduce new problems...


--- /etc/rc.d/init.d/nfs.ORIG   2009-06-11 13:13:23.000000000 -0500
+++ /etc/rc.d/init.d/nfs        2009-07-22 10:09:44.083889410 -0500
@@ -94,7 +94,7 @@

        echo -n $"Starting NFS daemon: "
        # For now, turn off the nfs41 support
-       daemon rpc.nfsd -N 4.1 $RPCNFSDARGS $RPCNFSDCOUNT
+       daemon rpc.nfsd $RPCNFSDARGS $RPCNFSDCOUNT
        RETVAL=$?
        echo
        [ $RETVAL -ne 0 ] && exit $RETVAL
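
After applying a workaround like this and restarting nfs, it may be worth confirming that version 4 is actually enabled. The sketch below assumes /proc/fs/nfsd/versions reads back as space-separated signed tokens such as "+2 +3 +4"; the helper names are made up for illustration, and the file only exists on a host where nfsd is loaded.

```python
def nfsv4_enabled(versions_text):
    """Return True if "+4" appears among the whitespace-separated tokens."""
    return "+4" in versions_text.split()


def read_versions(path="/proc/fs/nfsd/versions"):
    """Read the versions file; only meaningful on a host running nfsd."""
    with open(path) as handle:
        return handle.read()


# Checks against literal strings, so this runs anywhere.
print(nfsv4_enabled("+2 +3 +4"))
print(nfsv4_enabled("+2 +3 -4"))
```

On the server itself, `nfsv4_enabled(read_versions())` would perform the same check against the live file.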

--- Additional comment from jgarzik@redhat.com on 2009-07-23 17:44:43 EDT ---

This affects rawhide as well as Fedora 11.
Comment 1 Jeff Garzik 2009-07-23 18:06:38 EDT
Cloned this bug from rawhide Bug #512377, as it affects F11 too.

Most likely, there is a single solution for this, and both bugs can be closed at almost the same time, once fixed.
Comment 2 Steve Dickson 2009-07-27 12:29:05 EDT
I just tested the patch and it works (with a minor change),
and I will be updating the F-11 nfs-utils shortly.
Comment 3 Fedora Update System 2009-07-27 21:41:21 EDT
nfs-utils-1.2.0-4.fc11 has been submitted as an update for Fedora 11.
http://admin.fedoraproject.org/updates/nfs-utils-1.2.0-4.fc11
Comment 4 Fedora Update System 2009-07-28 14:25:35 EDT
nfs-utils-1.2.0-4.fc11 has been pushed to the Fedora 11 testing repository.  If problems still persist, please make note of it in this bug report.
 If you want to test the update, you can install it with 
 su -c 'yum --enablerepo=updates-testing update nfs-utils'.  You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F11/FEDORA-2009-8065
Comment 5 Fedora Update System 2009-07-31 14:00:22 EDT
nfs-utils-1.2.0-4.fc11 has been pushed to the Fedora 11 stable repository.  If problems still persist, please make note of it in this bug report.
Comment 6 Giorgio Signorini 2009-09-10 07:14:56 EDT
I recently updated my F11 on an NFS server to

  nfs-utils-1.2.0-4.fc11.x86_64
  rpcbind-0.2.0-2.fc11.x86_64
  kernel-devel-2.6.30.5-43.fc11.x86_64

and the server stopped working.

Namely, a client listed in /etc/hosts.allow BY
NAME is no longer able to mount an exported fs (hosts.deny is "ALL: ALL"),
although it seemed to work before the update.

The error that appears on the NFS client is

# mount: mount to NFS server 'xxx.yyy.it' failed: RPC Error: Authentication error.

Everything is fixed if rpcbind is restarted with -w after server reboot.

Otherwise the only way to get things working is to list the client BY IP in
hosts.allow OR in /etc/hosts, which is not very convenient ...