Bug 619792 - secure nfs mount sec=krb5 fails in RHEL6
Summary: secure nfs mount sec=krb5 fails in RHEL6
Keywords:
Status: CLOSED DUPLICATE of bug 613682
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libtirpc
Version: 6.0
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Steve Dickson
QA Contact: qe-baseos-daemons
URL:
Whiteboard:
Depends On: 562807
Blocks:
Reported: 2010-07-30 14:44 UTC by Steve Dickson
Modified: 2010-07-30 19:04 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: 562807
Environment:
Last Closed: 2010-07-30 19:04:51 UTC
Target Upstream Version:
Embargoed:



Description Steve Dickson 2010-07-30 14:44:32 UTC
+++ This bug was initially created as a clone of Bug #562807 +++

If I try an nfs mount of a directory with sec=krb5 in Fedora 12 then it fails. I have tried this with the current nfs-utils version (1.2.1-4.f12) and the original release version (1.2.0-18.f12). If however I downgrade it to the most recent Fedora 11 version (1.2.0-6.f11) then it works. I have packet traces, and comparing a working version with one that isn't, the difference seems to be in the first packet after the keys have been negotiated. In the version which works, this is an NFS v3 NULL call with a GSS token attached. In the one that doesn't, there is a malformed NFS v3 NULL call packet, which is comparable as far as the GSS token length, but it ends there, where the GSS token should start.
Since the length is the same, I would guess that rpc.gssd is generating a token, but failing to pass it to the mount process.

--- Additional comment from m.a.young.uk on 2010-02-08 10:58:20 EST ---

Going back through the f12 versions (that are still available from koji) 1.2.0-1.f12 works, 1.2.0-5.f12 doesn't.

--- Additional comment from m.a.young.uk on 2010-02-24 11:06:59 EST ---

I have done a bit of experimenting and the problem seems to have been introduced when --enable-tirpc was made the default. If I add --disable-tirpc to the spec file of nfs-utils-1.2.1-5.fc12 and build and install an RPM then it works, however the nfs-utils-1.2.1-5.fc12 RPM from updates-testing doesn't.

--- Additional comment from jlayton on 2010-03-01 19:08:30 EST ---

I'll test this as soon as I'm able. One question -- are any messages logged to syslog during these mount attempts?

Even better might be to run gssd in the foreground in debug mode and see whether it prints out anything suspicious:

# service rpcgssd stop
# rpc.gssd -f -vvvvv

...attempt the mount in another shell, then kill gssd and copy the output to a file. That might help point out where the problem is.

--- Additional comment from jlayton on 2010-03-01 21:04:43 EST ---

So far, this works for me. Client and server are both f12, both using nfs-utils-1.2.1-4.fc12. You'll probably need debug output from gssd to understand what's happening here.

--- Additional comment from m.a.young.uk on 2010-03-02 05:16:01 EST ---

Created an attachment (id=397288)
rpc.gssd log from failed attempt

Here are the logs (slightly anonymized). I had already looked at this but I didn't think they were very informative.

--- Additional comment from jlayton on 2010-03-02 07:13:12 EST ---

This is failing:

        auth = authgss_create_default(rpc_clnt, clp->servicename, &sec);
        if (!auth) {
                /* Our caller should print appropriate message */
                printerr(2, "WARNING: Failed to create %s context for "
                            "user with uid %d for server %s\n",
                        (authtype == AUTHTYPE_KRB5 ? "krb5":"spkm3"),

...though it's not clear to me why it's failing for you and not me. I'll have to look and see what sort of logging we can get out of libtirpc to diagnose this.

--- Additional comment from jlayton on 2010-03-02 08:37:47 EST ---

Created an attachment (id=397327)
patch -- have gssd print rpc_createerr when auth_gss creation fails

Here's an initial patch that might help point us in the right direction. Tested for compilation only. You'll want to apply this patch to the nfs-utils sources and rebuild gssd (or maybe just build a new package with the patch).

Then, run gssd in foreground debug mode again and reattempt the mount. With luck, we'll get a bit more info when that error message prints. If that doesn't help then we may need to rebuild libtirpc with -DDEBUG and see whether that gives us more info.

--- Additional comment from m.a.young.uk on 2010-03-02 09:35:08 EST ---

It returns "RPC: Success", which still isn't very helpful. Actually that doesn't surprise me, as we know why the remote end rejects the call: it receives a malformed packet. The question is why libtirpc is malforming the packet by not attaching the GSS token.

--- Additional comment from m.a.young.uk on 2010-03-02 12:52:26 EST ---

Created an attachment (id=397384)
rpc.gssd log with libtirpc debugging turned on

This is the log with libtirpc debugging turned on. I have not had a chance to analyze it much yet.

--- Additional comment from jlayton on 2010-03-02 16:02:54 EST ---

rpcsec_gss: in authgss_marshal()
rpcsec_gss: xdr_rpc_gss_cred: encode success (v 1, proc 1, seq 0, svc 1, ctx (nil):0)
rpcsec_gss: xdr_rpc_gss_init_args: encode failure (token 0x1992e30:1221)

I have a hunch that I know what this is...

From your logs it looks like you're using AD as a KDC. This is fine, but one thing about AD is that it puts extra authorization info into krb5 tickets (the PAC -- Privilege Attribute Certificate). They can grow to be quite large (on the order of 64k).

xdr_rpc_gss_init_args does this:

        xdr_stat = xdr_bytes(xdrs, (char **)&p->value,
                              (u_int *)&p->length, MAX_NETOBJ_SZ);

...and...

#define MAX_NETOBJ_SZ 1024

I suspect that the tickets from your AD server are larger than 1k and that's causing this to fail. What might be interesting is to increase this value and then rebuild tirpc and see if that works around the problem. A real fix will probably mean inlining the bytes, but we'll need to go over this carefully to be sure.

Here's what I'd do:

Try a mount, let it fail
stat /tmp/krb5cc_machine_MDS.AD.DUR.AC.UK

...then increase MAX_NETOBJ_SZ to something bigger than the size of the credcache.

I haven't surveyed this code fully, so I don't know whether a really big MAX_NETOBJ_SZ is ok, but it's worth a shot.
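To make the suspected failure concrete, here is a rough, self-contained model of a bounded counted-opaque encoder. It is a sketch only, not the libtirpc source: `sketch_stream`, `sketch_put_bytes` and `sketch_demo` are hypothetical names, and it assumes the encoder writes the 4-byte length before it checks the bound. That assumption would be consistent with the packet trace in the description, where the packet carries the token length but ends where the token should start.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_NETOBJ_SZ 1024u /* value quoted in the comment above */

/* Hypothetical stand-in for an XDR encode stream; not a libtirpc type. */
struct sketch_stream {
    unsigned char buf[4096];
    size_t used;
};

/*
 * Simplified model of a counted-opaque encoder: emit a 4-byte big-endian
 * length, then refuse the payload if it exceeds maxsize.
 * ASSUMPTION: the length is written before the bound is checked.
 */
bool sketch_put_bytes(struct sketch_stream *s, const unsigned char *data,
                      unsigned int len, unsigned int maxsize)
{
    size_t padded = ((size_t)len + 3u) & ~(size_t)3u; /* pad to 4 bytes */

    assert(s != NULL && data != NULL);
    if (s->used + 4u > sizeof(s->buf))
        return false;
    s->buf[s->used + 0] = (unsigned char)(len >> 24);
    s->buf[s->used + 1] = (unsigned char)(len >> 16);
    s->buf[s->used + 2] = (unsigned char)(len >> 8);
    s->buf[s->used + 3] = (unsigned char)len;
    s->used += 4u;

    if (len > maxsize)
        return false; /* length already emitted, payload never follows */
    if (s->used + padded > sizeof(s->buf))
        return false;

    memcpy(s->buf + s->used, data, len);
    memset(s->buf + s->used + len, 0, padded - len);
    s->used += padded;
    return true;
}

/* Returns 0 if the model rejects the 1221-byte token from the log. */
int sketch_demo(void)
{
    static unsigned char token[1221]; /* zero-filled placeholder payload */
    struct sketch_stream s;

    memset(&s, 0, sizeof(s));
    if (sketch_put_bytes(&s, token, (unsigned int)sizeof(token), MAX_NETOBJ_SZ))
        return 1; /* should have been rejected */
    return s.used == 4u ? 0 : 2; /* only the length field was written */
}
```

With the 1221-byte token from the log and a 1024-byte bound, the model leaves only the length field in the buffer; with a larger bound the same call succeeds.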

--- Additional comment from chuck.lever on 2010-03-02 16:10:21 EST ---

(In reply to comment #10)
> Here's what I'd do:
> 
> Try a mount, let it fail
> stat /tmp/krb5cc_machine_MDS.AD.DUR.AC.UK
> 
> ...then increase MAX_NETOBJ_SZ to something bigger than the size of the
> credcache.
> 
> I haven't surveyed this code fully, so I don't know whether a really big
> MAX_NETOBJ_SZ is ok, but it's worth a shot.    

I haven't looked at this code, but do note that a netobj is a well-known XDR type which is never larger than 1024, so I don't think the value of that constant should be changed.  If the argument being marshalled can be larger than 1024, the use of MAX_NETOBJ_SZ for the maximum size of that particular argument is not appropriate.
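One way to picture that suggestion is to give each argument type its own length bound instead of raising the netobj one. The sketch below is illustrative only and is not the fix that was eventually written: `sketch_arg_kind`, `sketch_max_len`, `sketch_len_ok` and `SKETCH_MAX_GSS_TOKEN_SZ` are hypothetical names, and the 64 KiB figure is just taken from the "on the order of 64k" remark earlier in the thread.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NETOBJ_SZ 1024u /* netobj bound, as quoted in the thread */
/* HYPOTHETICAL limit for GSS tokens; 64 KiB is only taken from the
 * "on the order of 64k" remark earlier in the thread. */
#define SKETCH_MAX_GSS_TOKEN_SZ (64u * 1024u)

/* Hypothetical tag for the kind of argument being marshalled. */
enum sketch_arg_kind { SKETCH_ARG_NETOBJ, SKETCH_ARG_GSS_TOKEN };

/* Pick the bound that belongs to the argument, not a shared constant. */
unsigned int sketch_max_len(enum sketch_arg_kind kind)
{
    switch (kind) {
    case SKETCH_ARG_NETOBJ:
        return MAX_NETOBJ_SZ;
    case SKETCH_ARG_GSS_TOKEN:
        return SKETCH_MAX_GSS_TOKEN_SZ;
    }
    assert(!"unknown argument kind");
    return 0u;
}

/* True if a payload of len bytes fits the bound for its argument kind. */
bool sketch_len_ok(enum sketch_arg_kind kind, unsigned int len)
{
    return len <= sketch_max_len(kind);
}
```

Under these assumptions the 1221-byte token from the log is still too large to be treated as a netobj, but fits comfortably under a token-specific bound, and MAX_NETOBJ_SZ itself stays untouched.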

--- Additional comment from m.a.young.uk on 2010-03-03 05:11:29 EST ---

I tried increasing MAX_NETOBJ_SZ in two steps. First, we know from the logs how big the packet that fails was (1221 bytes), so I increased MAX_NETOBJ_SZ to 1280. That allowed me to mount the filesystem but not to access it, because the user tickets seem to be a bit bigger. I then increased it further to 1536 and was able to access the files. For reference, /tmp/krb5cc_machine_MDS.AD.DUR.AC.UK is 2325 bytes and the user krb5cc file is 2599 bytes, somewhat larger than the packets actually sent because they contain a krbtgt ticket as well as the ticket for the file server.

--- Additional comment from jlayton on 2010-03-03 07:01:32 EST ---

Ok, that's good news. Yep, I knew that we'd have more than one ticket there, but figured you wouldn't need larger than that.

Regarding Chuck's comment -- I'm not planning to propose that as a fix. It was simply a way to check to see whether the problem is what I think it is.

From what I can tell, librpcsecgss inlines the service ticket rather than copying in the bytes, but I need to look over this code more closely and see what the proper fix should be.

--- Additional comment from jlayton on 2010-03-03 14:00:09 EST ---

Changing this to a libtirpc bug since that's where the problem seems to be.

--- Additional comment from jlayton on 2010-03-03 15:53:52 EST ---

Created an attachment (id=397659)
patch -- allow larger ticket sizes with auth_gss

Here's an initial (untested) patch that I think will fix this issue the correct way. It also "backports" a number of other fixes that went into librpcsecgss. Please test this patch if you're able and let me know if it fixes the problem.

Chuck, any comments?

--- Additional comment from chuck.lever on 2010-03-03 16:15:51 EST ---

(In reply to comment #15)
> Chuck, any comments?    

I don't have any immediate objections, but you should have Kevin Coffman review this fix.

--- Additional comment from jlayton on 2010-03-03 16:28:36 EST ---

Good idea. If it tests out ok, I'll cc him when I send it out to the list.

--- Additional comment from m.a.young.uk on 2010-03-04 05:47:06 EST ---

(In reply to comment #15)
> Created an attachment (id=397659)
> patch -- allow larger ticket sizes with auth_gss
> 
> Here's an initial (untested) patch that I think will fix this issue the correct
> way. It also "backports" a number of other fixes that went into librpcsecgss.
> Please test this patch if you're able and let me know if it fixes the problem.

Yes, with the patch it builds and works for me. I can mount the filesystem and view and write to files and directories within it.

--- Additional comment from jlayton on 2010-03-08 14:28:02 EST ---

The patch has been pushed to mainline libtirpc. Reassigning to steved so he can work out how to release the fix.

Comment 1 Michael Young 2010-07-30 14:53:39 UTC
I see my bug for Fedora 12 got cloned to RHEL6. I do have a RHEL6 beta2 test virtual machine if you need any testing done.

Comment 3 Steve Dickson 2010-07-30 19:04:51 UTC

*** This bug has been marked as a duplicate of bug 613682 ***

