Bug 220649

Summary: NFS with Kerberos completely broken on 2.6.18 and 2.6.19
Product: Red Hat Enterprise Linux 5
Component: kernel
Version: 5.0
Hardware: All
OS: Linux
Status: CLOSED CURRENTRELEASE
Severity: urgent
Priority: medium
Keywords: Regression
Fixed In Version: RC
Doc Type: Bug Fix
Last Closed: 2007-02-08 02:01:01 UTC
Reporter: Paarvai Naai <opensource3141>
Assignee: Steve Dickson <steved>
QA Contact: Brian Brock <bbrock>
CC: coughlan, dzickus, francois.marabelle, jburke, jlaska, jturner, k.georgiou, rkenna, staubach, wtogami
Attachments (description: flags):
  Purposed Patch: none
  The complete patch: none
  Updated patch that combines two other upstream patches: none

Description Paarvai Naai 2006-12-22 18:58:15 UTC
+++ This bug was initially created as a clone of Bug #218330 +++

Description of problem:
---
NFS with Kerberos is completely broken since 2.6.18 due to a bug in the NFS
client code.  An NFS client can no longer mount more than one NFS+KRB5 volume
from a single server.  If you try to do so, you will get the error:

mount.nfs: File exists

from the mount program.


Version-Release number of selected component (if applicable):
---
Any kernel 2.6.18 and up.


How reproducible:
---
This is 100% reproducible.

Steps to Reproduce:
---
1. Export two shares using either NFSv4 or NFSv3 with the sec=krb5 option.
2. Mount one on the client.
3. Try to mount the other on the client.
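
The steps above can be sketched as a small helper that builds the two mount
commands.  This is only an illustration: the server name, export paths and
mount points are hypothetical, and the helper prints the commands rather than
running them (running them requires root and a working Kerberos setup).

```python
import shlex


def krb5_mount_command(server, export, mountpoint, nfs_version=3):
    """Build the mount(8) argument list for one sec=krb5 NFS mount."""
    # On kernels of this era NFSv4 is selected with the "nfs4" filesystem type.
    fstype = "nfs4" if nfs_version == 4 else "nfs"
    return ["mount", "-t", fstype, "-o", "sec=krb5",
            f"{server}:{export}", mountpoint]


def reproduction_commands(server, exports, nfs_version=3):
    """Return one mount command per (export, mountpoint) pair, in order."""
    return [krb5_mount_command(server, export, mountpoint, nfs_version)
            for export, mountpoint in exports]


if __name__ == "__main__":
    # Hypothetical server and shares; substitute real ones to reproduce.
    shares = [("/export/a", "/mnt/a"), ("/export/b", "/mnt/b")]
    for cmd in reproduction_commands("nfsserver.example.com", shares):
        print(shlex.join(cmd))
```

On an affected kernel, the second command is the one expected to fail with
"mount.nfs: File exists".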


Actual results:
---
You will now see the "File exists" error.
  
Expected results:
---
The mount should complete without incident.


Additional info:
---
This is an urgent problem that needs attention.  The problem has been isolated;
see the following mailing list post:

http://linux-nfs.org/pipermail/nfsv4/2006-November/005306.html

And a patch has been proposed:

http://linux-nfs.org/pipermail/nfsv4/2006-November/005315.html

The patch works on the vanilla 2.6.19 kernel, as verified by both the
maintainer and myself.  It does not apply cleanly on 2.6.18, and a simple
readjustment of the patch does not work there.

Here's what I suggest:  add the patch ASAP to the Fedora SRPM of kernel 2.6.19
and propagate this new kernel release to FC5 updates as well as FC6 updates.

-- Additional comment from k.georgiou.uk on 2006-12-05 12:50 EST --
This of course affects FC6. I suspect the beta of RHEL5 (although I can't check
at the moment) will have the same problem as well.

-- Additional comment from opensource3141 on 2006-12-05 14:09 EST --
Yes, that's why I suggested that the kernel SRPM include the patch and be released
in FC5 and FC6.  That's also a good point about RHEL 5.  Can someone move this
forward soon?  Thanks!

-- Additional comment from opensource3141 on 2006-12-07 14:33 EST --
I haven't heard anything back from anyone about this bug.  I've been fighting
this problem for nearly two months and it's getting very frustrating, especially
since 2.6.17 has now been removed from the fc5 updates repository!!!  This has
caused considerable inconvenience for our company since we use NFSv3+Kerberos
and need to have machines installed with a properly working kernel.

I am unable to increase the urgency field past "Urgent", so can someone help me out?

Thanks.

-- Additional comment from k.georgiou.uk on 2006-12-07 17:11 EST --
Cloning the bug to FC6 and possibly RHEL5b2 might help there. It usually takes
some time before a bug is assigned to anyone, but lack of activity in bugzilla
doesn't necessarily mean that nobody is looking at it right now.

Comment 1 RHEL Program Management 2007-01-18 01:01:06 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux major release.  Product Management has requested further
review of this request by Red Hat Engineering, for potential inclusion in a Red
Hat Enterprise Linux Major release.  This request is not yet committed for
inclusion.

Comment 2 RHEL Program Management 2007-01-18 12:27:07 UTC
This bugzilla has Keywords: Regression.  

Since no regressions are allowed between releases, 
it is also being proposed as a blocker for this release.  

Please resolve ASAP.

Comment 4 RHEL Program Management 2007-01-18 21:29:58 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 6 RHEL Program Management 2007-01-19 17:27:13 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 8 Steve Dickson 2007-01-19 19:29:32 UTC
Created attachment 146024 [details]
Purposed Patch

This patch is a backport of the following upstream commit:

commit 3e32a5d99a467b9d4d416323c8c292479b4915e5
Author: Trond Myklebust <Trond.Myklebust>
Date:	Thu Nov 16 11:37:27 2006 -0500

    SUNRPC: Give cloned RPC clients their own rpc_pipefs directory

    Signed-off-by: Trond Myklebust <Trond.Myklebust>

There are also some changes to the cloning process that
were needed to stop memory corruption (which has existed since
the beginning)....
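
As a rough userspace analogy (not kernel code), the sketch below shows how
creating a second directory under an already-used name produces "File exists",
and how giving each clone its own directory avoids it.  That the mount failure
comes from such a name collision is an assumption inferred from the commit
title and the error message; the directory names are made up.

```python
import os
import tempfile


def register_client(pipefs_root, dirname):
    """Create a per-client directory; raises FileExistsError if it exists."""
    path = os.path.join(pipefs_root, dirname)
    os.mkdir(path)
    return path


def mount_two_shares(pipefs_root, unique_per_clone):
    """Register two clients, either under one shared name or unique names."""
    created = []
    for clone_id in range(2):
        # Hypothetical naming scheme, for illustration only.
        name = f"clnt{clone_id}" if unique_per_clone else "clnt0"
        created.append(register_client(pipefs_root, name))
    return created


# Shared name: the second registration collides with the first.
with tempfile.TemporaryDirectory() as root:
    try:
        mount_two_shares(root, unique_per_clone=False)
        shared_result = "ok"
    except FileExistsError:
        shared_result = "File exists"

# One directory per clone: both registrations succeed.
with tempfile.TemporaryDirectory() as root:
    unique_result = len(mount_two_shares(root, unique_per_clone=True))
```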

Comment 9 Steve Dickson 2007-01-19 20:13:41 UTC
Created attachment 146027 [details]
The complete patch

Comment 10 James Laska 2007-01-19 21:27:39 UTC
Do we have any test results using this patch that show normal nfs and nfs+krb5
test results?

Comment 11 Steve Dickson 2007-01-22 14:53:53 UTC
Created attachment 146187 [details]
Updated patch that combines two other upstream patches

It turns out the previously posted patch breaks UDP mounts.
So I decided to backport the following upstream patch that adds ref
counts to cloned connections:

commit 6b6ca86b77b62b798cf9ca2599036420abce7796
Author: Trond Myklebust <Trond.Myklebust>
Date:	Tue Sep 5 12:55:57 2006 -0400

    SUNRPC: Add refcounting to the struct rpc_xprt

    In a subsequent patch, this will allow the portmapper to take a reference
    to the rpc_xprt for which it is updating the port number, fixing an Oops.

Then it turned out a third upstream patch was also needed to keep
a BUG_ON() from firing in a "valid" error path...
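
The idea of reference counting a transport that several cloned clients share
can be sketched in a few lines.  This is a toy model, not the SUNRPC code: the
Transport and Client classes and their methods are hypothetical names chosen
for illustration.

```python
class Transport:
    """Toy stand-in for a shared transport with a reference count."""

    def __init__(self):
        self.refcount = 1        # the creator holds the first reference
        self.destroyed = False

    def get(self):
        """Take an additional reference on a live transport."""
        if self.destroyed:
            raise RuntimeError("transport already destroyed")
        self.refcount += 1
        return self

    def put(self):
        """Drop one reference; tear down when the last one goes away."""
        if self.refcount <= 0:
            raise RuntimeError("reference count underflow")
        self.refcount -= 1
        if self.refcount == 0:
            self.destroyed = True


class Client:
    """Toy client that owns exactly one reference to its transport."""

    def __init__(self, transport):
        self.transport = transport

    def clone(self):
        # The clone takes its own reference instead of borrowing the parent's.
        return Client(self.transport.get())

    def shutdown(self):
        self.transport.put()
        self.transport = None


xprt = Transport()
parent = Client(xprt)            # parent takes over the creator's reference
child = parent.clone()
parent.shutdown()
alive_after_parent = not xprt.destroyed   # child still holds a reference
child.shutdown()
destroyed_after_last = xprt.destroyed     # last reference dropped
```

Without the count, shutting down the parent would tear down a transport the
clone is still using.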

Comment 12 Steve Dickson 2007-01-22 14:59:21 UTC
> Do we have any test results using this patch that show normal nfs and nfs+krb5
> test results?
Yes... Jeff Burke ran some RHTS tests over the weekend to ensure there 
were no regressions and will continue to run the tests... 



Comment 13 Jay Turner 2007-01-22 16:08:18 UTC
How much testing have we don't on the krb functionality?  Also, are we really
sure that the only way to remedy this regression is bringing in new functionality?

Comment 14 Steve Dickson 2007-01-22 18:04:02 UTC
> How much testing have we don't on the krb functionality?
I'll assume s/don't/done/.... And the answer to that question
is none that I'm aware of... except for the unit testing I
do for each release... That's how we got into this position... 
We also don't test simultaneous NFS mounts...

> Also, are we really sure that the only way to remedy this
> regression is bringing in new functionality?
We are not bringing in new functionality; secure mounts have been
around since RHEL4... and I know, from the increasing number of bugs,
that a growing number of people are using this feature...


Comment 15 Jay Turner 2007-01-22 18:18:38 UTC
Yep, was supposed to be "done" . . . as for the "new functionality" comment,
what I meant was the fact that the patch adds ref counting to cloned connections.

Comment 16 Jay Turner 2007-01-22 18:49:40 UTC
Looks like we've got no option here.  The change breaks kABI, so best to get it
into the mix before GA.

Comment 19 Paarvai Naai 2007-01-23 23:05:03 UTC
Thanks for all of the activity since my initial report of this bug.  I just
wanted to remind you folks not to forget about FC5/FC6 kernel updates for us
Fedora users out there.  I filed this bug under both of those distros as
well, but recently saw that FC6 released a version of 2.6.19 that doesn't seem
to have the patches to fix NFS+KRB5!

Comment 20 Don Zickus 2007-01-23 23:12:01 UTC
in 2.6.18-5.el5

Comment 21 RHEL Program Management 2007-02-08 02:01:01 UTC
A package has been built which should help the problem described in 
this bug report. This report is therefore being closed with a resolution 
of CURRENTRELEASE. You may reopen this bug report if the solution does 
not work for you.


Comment 22 Steve Dickson 2007-04-25 17:18:52 UTC
*** Bug 230536 has been marked as a duplicate of this bug. ***