Bug 826507

Summary: Problem mounting DFS Shares with Kerberos
Product: [Fedora] Fedora Reporter: Marcus Moeller <marcus.moeller>
Component: cifs-utilsAssignee: Jeff Layton <jlayton>
Status: CLOSED NOTABUG QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 17CC: dpal, jlayton, ssorce, steved
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-06-13 11:10:16 EDT Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Attachments:
Description Flags
dmesg output while trying to mount a dfs share (cifsFYI set)
none
relevant part of daemon.debug log
none
patch -- more debug logging for krb5 upcalls
none
daemon debug with cifs-verbose login patch applied
none
dmesg output with cifs-verbose login patch applied and cifsFYI set
none
kern and daemon debug in one file
none
debug log with updated pam_krb5 none

Description Marcus Moeller 2012-05-30 07:37:38 EDT
Description of problem:
While trying to mount a DFS Cifs share using kerberos, I got an mount error(126): Required key not available.

Version-Release number of selected component (if applicable):
cifs-utils-5.4-1
Comment 1 Jeff Layton 2012-05-30 14:59:07 EDT
Well, this bug report is pretty much content-free...

Let's start with exactly what you were doing (command line you ran, IOW), and some details about the server and share you're trying to mount...
Comment 2 Dmitri Pal 2012-05-30 17:25:20 EDT
Also check that you KDC allows (allow_weak_crypto) the ciphers the client supports. Latest MIT KDC turn older ciphers off but older clients are usually not capable of supporting latest ciphers.
Comment 3 Marcus Moeller 2012-05-31 02:44:56 EDT
I am trying to mount a DFS Cifs share (as described in the initial comment) using:

mount.cifs  -o nobrl,mfsymlinks,noserverino,file_mode=0700,dir_mode=0700,user=,uid=$MYUID,cruid=$MYUID,gid=$MYGID,sec=krb5 //$my.dfs.root/dfs/users/all/ /mnt

Here is my request-key.conf:

create  user    debug:*         negate          /bin/keyctl negate %k 30 %S
create  user    debug:loop:*    *               |/bin/cat
create  user    debug:*         *               /usr/share/keyutils/request-key-debug.sh %k %d %c %S
negate  *       *               *               /bin/keyctl negate %k 30 %S
create  cifs.spnego  *          *               /usr/sbin/cifs.upcall -t -c %k
create  dns_resolver *          *               /usr/sbin/cifs.upcall %k

The cc_cache file has been created correctly and is listed by klist.

This procedure worked in Fedora 16.
Comment 4 Jeff Layton 2012-05-31 07:09:59 EDT
In F17, we started shipping /etc/request-key.d/cifs.spnego.conf. You'll probably need to alter that file and add the '-t' argument in order for it to take effect. The -c argument to cifs.upcall has been deprecated for a long time and can be dropped.

Can you let me know whether that helps? If it doesn't then you'll probably want to set up syslog to log at daemon.debug level, and collect the log messages from cifs.upcall during one of these tests.
Comment 5 Marcus Moeller 2012-06-01 04:26:54 EDT
Created attachment 588354 [details]
dmesg output while trying to mount a dfs share (cifsFYI set)

Sadly the suggested option changes did not solve the problem. I have attached a dmesg output with cifsFYI set.
Comment 6 Marcus Moeller 2012-06-01 04:34:51 EDT
Created attachment 588356 [details]
relevant part of daemon.debug log
Comment 7 Jeff Layton 2012-06-01 06:52:05 EDT
Well, that is odd...

The cFYI message indicates that the SPNEGO key request returned -ENOKEY, but the daemon.debug log looks like the request succeeded (after doing a DNS lookup to get the hostname). I don't see any obvious bug that would make it downcall with a negative key in that case.

What we may need to do is add some extra debugging to cifs.upcall and ensure that it's downcalling with the right key.
Comment 8 Marcus Moeller 2012-06-01 10:00:56 EDT
Ok, does this have to be done in the sourcecode?
Comment 9 Jeff Layton 2012-06-01 14:08:40 EDT
Created attachment 588554 [details]
patch -- more debug logging for krb5 upcalls

Yes...

This patch is against cifs.upcall. It adds more debug logging in the exit codepaths of cifs.upcall and may help us determine whether we're getting an error back on the upcall or not.

It should apply to the 5.5 sources (which is in updates-testing for f17).
Comment 10 Marcus Moeller 2012-06-04 04:17:08 EDT
After changing

cifs.upcall.c:1045:2: error: format '%d' expects argument of type 'int', but argument 3 has type 'long int' [-Werror=format]

to:

cifs.upcall.c:1045:2: error: format '%ld' expects argument of type 'int', but argument 3 has type 'long int' [-Werror=format]

the patch applies correctly.

Attached you will find the logs with debug set.
Comment 11 Marcus Moeller 2012-06-04 04:18:00 EDT
Created attachment 589065 [details]
daemon debug with cifs-verbose login patch applied
Comment 12 Marcus Moeller 2012-06-04 04:18:51 EDT
Created attachment 589066 [details]
dmesg output with cifs-verbose login patch applied and cifsFYI set
Comment 13 Marcus Moeller 2012-06-11 07:01:45 EDT
Any update on that?
Comment 14 Jeff Layton 2012-06-11 07:34:27 EDT
It looks like your kernel log has two mount attempts (and two request_key calls), but the daemon.debug log only shows that one upcall occurred. That upcall was successful.

This request_key call failed:

[  338.066972] fs/cifs/cifs_spnego.c: key description = ver=0x2;host=d.ethz.ch;ip4=172.31.65.60;sec=krb5;uid=0xaf05;creduid=0xaf05;user=mam4tst;pid=0x5a2
[  338.126423] fs/cifs/sess.c: ssetup freeing small buf ffff88012da78000
[  338.126427] CIFS VFS: Send error in SessSetup = -126

...but this one worked:

[  344.834585] fs/cifs/cifs_spnego.c: key description = ver=0x2;host=d.ethz.ch;ip4=172.31.52.67;sec=krb5;uid=0xaf05;creduid=0xaf05;user=mam4tst;pid=0x5b4
[  344.870882] fs/cifs/transport.c: For smb_command 115
[  344.870889] fs/cifs/transport.c: Sending smb:  total_len 1676
[  344.874121] fs/cifs/connect.c: RFC1002 header 0xf9
[  344.874163] fs/cifs/transport.c: cifs_sync_mid_result: cmd=115 mid=2 state=4

...note that the ip4= addresses are different. Here's the key description in the daemon.debug log:

Jun  4 10:13:40 slabstb250 cifs.upcall: key description: cifs.spnego;0;0;3f000000;ver=0x2;host=d.ethz.ch;ip4=172.31.65.62;sec=krb5;uid=0xaf05;creduid=0xaf05;user=mam4tst;pid=0x4c0

...that upcall attempt looks like it succeeded, but note that this one has an entirely different ip4= address from the ones in dmesg. Are you certain that these are from the same attempt?

You may want to set up syslog to log both kern.debug and daemon.debug to the same file. That might make it easier to correlate what's happening in userspace with the kernel.
Comment 15 Marcus Moeller 2012-06-11 09:10:59 EDT
Created attachment 590928 [details]
kern and daemon debug in one file
Comment 16 Jeff Layton 2012-06-12 06:47:03 EDT
That's a little more helpful. The first one failed with this:

Jun 11 15:06:25 localhost cifs.upcall: handle_krb5_mech: failed to obtain service ticket (-1765328245)

...that's KRB5_CC_BADNAME -- basically the error you get when the upcall program can't find a suitable credcache to use. It looks like there's one that should be owned by uid=44805, but isn't:

Jun 11 15:06:25 localhost cifs.upcall: find_krb5_cc: considering /tmp/krb5cc_44805_hS9u9K
Jun 11 15:06:25 localhost cifs.upcall: find_krb5_cc: /tmp/krb5cc_44805_hS9u9K is owned by 0, not 44805
Jun 11 15:06:25 localhost cifs.upcall: find_krb5_cc: considering /tmp/krb5cc_0
Jun 11 15:06:25 localhost cifs.upcall: find_krb5_cc: /tmp/krb5cc_0 is owned by 0, not 44805
Jun 11 15:06:25 localhost cifs.upcall: krb5_get_init_creds_keytab: -1765328174

...the second attempt (5s later) finds a credcache, so perhaps you have a race of some sort there where your credcache isn't being created before you attempt the mount:

Jun 11 15:06:30 localhost cifs.upcall: find_krb5_cc: considering /tmp/krb5cc_44805_KAannO
Jun 11 15:06:30 localhost cifs.upcall: find_krb5_cc: FILE:/tmp/krb5cc_44805_KAannO is valid ccache

...but that attempt fails with KRB5KDC_ERR_S_PRINCIPAL_UNKNOWN. The kdc doesn't seem to server principal for "cifs/d.ethz.ch". In neither case do I see the program retrying after reverse-resolving the hostname to get the canonical name, so I can only assume that you haven't added '-t' to the flags. Leaving that flag off is safer, but you do need to make sure you have server principals for all possible hostnames.
Comment 17 Marcus Moeller 2012-06-12 09:27:03 EDT
Sorry, the updated pam_krb5 version that 'should' fix the permission problem has not been applied on that test system (as described in BZ #822493).

Attached you will find an updated log file with the updated pam_krb5 version installed.

Concerning the server-side: nothing has changed here and everything still works in F16.
Comment 18 Marcus Moeller 2012-06-12 09:27:49 EDT
Created attachment 591188 [details]
debug log with updated pam_krb5
Comment 19 Jeff Layton 2012-06-12 10:04:46 EDT
Ok, looks like krb5 is working fine in this trace, the problem is in the referral chasing. You're getting back -EINVAL on the mount. The specific reason for that is a bit hard to tell since it's buried down in the DFS code that I don't know too well. I suspect however that the problem may ultimately be related to this:

Jun 12 15:06:54 localhost kernel: [  615.054656] fs/cifs/dns_resolve.c: dns_resolve_server_name_to_ip: unable to resolve: HASH
Jun 12 15:06:54 localhost kernel: [  615.054664] fs/cifs/cifs_dfs_ref.c: cifs_compose_mount_options: Failed to resolve server part of \\HASH\dfs to IP: -126

...however, I may be wrong there. What kernel are you using here? If it's a 3.4 kernel, does this work better with something 3.3-ish?

You may have better luck reporting this upstream at linux-cifs@vger.kernel.org.
Comment 20 Marcus Moeller 2012-06-13 08:03:10 EDT
Ok, I got it now.

I just copied both lines:

create  cifs.spnego    * * /usr/sbin/cifs.upcall -t %k
create  dns_resolver    * * /usr/sbin/cifs.upcall %k

in one file within /etc/request-key.d/ but the patch that enabled the .d feature only seems to support one line per file. This is imho a false behaviour (and the man page is not even clear about it.
Comment 21 Jeff Layton 2012-06-13 10:10:47 EDT
The request-key.conf(5) manpage seems pretty clear on this point to me.

The request-key.d feature just makes /sbin/request-key pick a different config file. The file in that directory does need to be named properly, or /sbin/request-key won't find it however (and will fall back to using /etc/request-key.conf.

At this point, do you have it working?
Comment 22 Marcus Moeller 2012-06-13 10:16:46 EDT
I have named the file .conf.

It just does not seem to work if it contains more than one line.

After migrating the two settings to different .conf files within /etc/request-key.d/ it works fine.
Comment 23 Marcus Moeller 2012-06-13 11:10:16 EDT
You are right, after re-reding it twice it gets clearer. Anyhow, thanks for taking care.