Bug 645127 - Input/output error with sec=krb5
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: cifs-utils
Version: 6.0
Hardware: All Linux
Priority: urgent
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Jeff Layton
QA Contact: yanfu,wang
Keywords: OtherQA, ZStream
Duplicates: 667219
Depends On: 622790 667644 667647
Blocks: 667675 668366
Reported: 2010-10-20 16:47 EDT by Jack Neely
Modified: 2014-06-18 03:40 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 622790
Clones: 667675
Environment:
Last Closed: 2011-05-19 09:06:51 EDT


Description Jack Neely 2010-10-20 16:47:00 EDT
I have been able to duplicate this bug on the RHEL 6 Beta 2 with krb5-libs-1.8.2-2.el6.x86_64.

+++ This bug was initially created as a clone of Bug #622790 +++

Description of problem:

While trying to mount a share (DFS or 'classic') with Kerberos, I got an error message like:

mount error(5): Input/output error

username/password works fine.

--- Additional comment from jlayton@redhat.com on 2010-08-10 08:45:02 EDT ---

Does anything pop up in dmesg when you mount that way?

--- Additional comment from mail@marcus-moeller.ch on 2010-08-10 08:53:11 EDT ---

Aug 10 14:52:39 host kernel: CIFS VFS: Send error in SessSetup = -5
Aug 10 14:52:39 host kernel: CIFS VFS: cifs_mount failed w/return code = -5

--- Additional comment from jlayton@redhat.com on 2010-08-10 09:08:33 EDT ---

EIO usually means either that you got some malformed packets back from the server, or we got some other CIFS error that the kernel doesn't know how to translate.

It might be good to have some cFYI info from one of these attempts. Could you follow the instructions here, capture the messages to a file and attach it?

http://wiki.samba.org/index.php/LinuxCIFS_troubleshooting#Enabling_Debugging
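
As a rough sketch of what those instructions boil down to: the cifs module exposes a debug switch at /proc/fs/cifs/cifsFYI, and writing 7 to it enables all classes of debug messages, which then appear in dmesg. The helper name and the CIFS_FYI override variable below are made up for illustration, and writing to the real file requires root.

```shell
# Hypothetical helper: set the cifs debug level (0 = off, 7 = everything).
# CIFS_FYI may be pointed at another file for a dry run; by default it is
# the control file exposed by the cifs kernel module.
cifs_debug_set() {
    level="${1:-7}"
    target="${CIFS_FYI:-/proc/fs/cifs/cifsFYI}"
    printf '%s\n' "$level" > "$target"
}
```

A typical session would be `cifs_debug_set 7`, then the failing mount, then `dmesg > /tmp/cifs-debug.txt` (an arbitrary file name), and finally `cifs_debug_set 0` to turn the extra logging off again.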

--- Additional comment from jlayton@redhat.com on 2010-08-10 09:53:29 EDT ---

I believe you may have attached the info to the wrong bug...

https://bugzilla.redhat.com/show_bug.cgi?id=622802#c2

Looks like the client is sending a session setup, and the server is sending back an error that gets translated to ERRgeneral which is sort of a catch-all error like EIO is. Would it be possible to get a binary capture of this? The instructions for how to do this are a little ways down on the same samba wiki page.

If not, you could do the capture yourself and tell me what wireshark says the error response to the Session Setup packet is.

--- Additional comment from mail@marcus-moeller.ch on 2010-08-10 10:08:48 EDT ---

Created attachment 437892 [details]
trace file

--- Additional comment from jlayton@redhat.com on 2010-08-10 10:23:16 EDT ---

That doesn't show it. The problem here is that this error is cropping up when chasing a DFS referral, so limiting the capture to the particular mount host means that you aren't getting anything once the referral is chased.

You may want to redo the capture and leave out the 'host cifs_server.example.com' part of the capture filter. If you capture everything on port 445 it should get it.
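
To make the difference between the two filters concrete, here is a small sketch that only prints the tcpdump command line instead of running it (a real capture needs root). The function name and the output path are placeholders, not part of the wiki instructions.

```shell
# Print (rather than run) a tcpdump command line for a CIFS capture.
# $1: optional server name; leave it empty so that traffic to servers
#     reached through a DFS referral is captured as well.
cifs_capture_cmd() {
    filter="port 445"
    if [ -n "${1:-}" ]; then
        filter="host $1 and $filter"
    fi
    printf 'tcpdump -s 0 -w /tmp/cifs-mount.pcap %s\n' "$filter"
}

cifs_capture_cmd                          # everything on port 445
cifs_capture_cmd cifs_server.example.com  # the narrower filter that misses the referral
```

The `-s 0` option asks tcpdump for full packets, which matters here because the session setup request carries a large security blob.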

--- Additional comment from mail@marcus-moeller.ch on 2010-08-10 10:36:19 EDT ---

Created attachment 437901 [details]
tcp dump only limited to port 445

--- Additional comment from jlayton@redhat.com on 2010-08-10 12:40:09 EDT ---

Wireshark says the error was this:

    NT Status: RPC_NT_INVALID_BINDING (0xc0020003)

...which is one I've never seen before. What kind of server is at 192.168.50.100?
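
For readers unfamiliar with NT status values, the following sketch splits the 32-bit code into its fields, assuming the usual NTSTATUS layout (2 severity bits, a customer bit, a reserved bit, a 12-bit facility and a 16-bit code). The function name is illustrative.

```shell
# Split a 32-bit NTSTATUS value into its severity, facility and code fields.
ntstatus_fields() {
    v=$(( $1 ))
    printf 'severity=%d facility=0x%x code=0x%x\n' \
        $(( (v >> 30) & 0x3 )) $(( (v >> 16) & 0xfff )) $(( v & 0xffff ))
}

ntstatus_fields 0xc0020003   # prints: severity=3 facility=0x2 code=0x3
```

Severity 3 marks an error, and facility 0x2 should correspond to the RPC runtime facility, which would be consistent with the RPC_NT_ prefix and helps explain why the code looks out of place in a CIFS session setup reply.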

--- Additional comment from mail@marcus-moeller.ch on 2010-08-10 12:55:49 EDT ---

This is an EMC Storage.

--- Additional comment from jlayton@redhat.com on 2010-08-10 13:11:04 EDT ---

Thanks. I can find no helpful mention of that error in any of the MS docs. The closest I found is some googling that turned up some bugs in RDP protocol handling. Nothing in CIFS protocol however.

I suspect this is a bug in the EMC SPNEGO/krb5 implementation. You may want to file a bug report with them. You should be able to reference this bug if they need captures and such. If they think it's a bug in the Linux cifs implementation, then please have them explain what they think we're doing wrong...

I'll leave this bug open for now with the needinfo flag set in case you or they have questions or we need to discuss this further.

--- Additional comment from mail@marcus-moeller.ch on 2010-08-10 13:17:03 EDT ---

One assumption could be that the enctype is not supported by the EMC, as the TGT is fetched from a 2008R2 Server which does not support DES anymore (by default).

--- Additional comment from jlayton@redhat.com on 2010-08-10 13:33:56 EDT ---

Possibly -- only EMC could tell us that. I'll note that the krb5 ticket in the session setup request has this:

Encryption type: rc4-hmac (23)

...which is a very common enctype for windows. It would be a stretch for them to claim cifs+krb5 compatibility without supporting it...

Also from the capture, the negTokenTarg reply is "reject", but we can't tell much beyond that.

The SMB error code is also very odd. It's one I've never seen before, and I've seen servers throw quite a few different errors in response to krb5 auth problems.

--- Additional comment from mail@marcus-moeller.ch on 2010-08-10 14:02:32 EDT ---

Setting:

        default_tkt_enctypes = arcfour-hmac-md5
        default_tgs_enctypes = arcfour-hmac-md5

solved the problem, so I guess we can close this one. Thanks for keeping an eye on it, Jeff.
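
For reference, these two settings normally live in the [libdefaults] section of krb5.conf (typically /etc/krb5.conf, which is assumed here). The sketch below only prints the fragment for review; the function name is made up.

```shell
# Print the krb5.conf fragment discussed above; merge it by hand into the
# [libdefaults] section of the existing configuration file.
enctype_fragment() {
    cat <<'EOF'
[libdefaults]
    default_tkt_enctypes = arcfour-hmac-md5
    default_tgs_enctypes = arcfour-hmac-md5
EOF
}

enctype_fragment
```

Note that this restricts the enctypes requested for initial and service tickets to RC4 for every Kerberos client that reads the file, so it is better treated as a diagnostic step than as a permanent setting.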

--- Additional comment from jlayton@redhat.com on 2010-08-10 14:09:10 EDT ---

Sorry...where was this set? Someplace on the server?

--- Additional comment from mail@marcus-moeller.ch on 2010-08-10 14:17:42 EDT ---

No, this has been set on the client.

From my observation, if the TGT is AES256, you won't be able to fetch service tickets from machines that do not support it. Setting the above value forced the TGT to arcfour-hmac-md5 which is compatible with every component (yes, even EMC) in our environment.

--- Additional comment from jlayton@redhat.com on 2010-08-10 14:31:46 EDT ---

Huh....ok. Here's what I don't quite get -- the enctype of the service ticket in the session setup request was rc4-hmac. Why would the server care that the TGT used to get this ticket from the KDC was AES?

It doesn't seem like that should matter at all...

--- Additional comment from mail@marcus-moeller.ch on 2010-08-10 15:04:35 EDT ---

Ah, sorry, it's not about the TGT, but about the Client/Server Session Key enc type negotiation which does not seem to work if not forced.

--- Additional comment from mail@marcus-moeller.ch on 2010-08-11 09:43:50 EDT ---

hmm, after some tries the problem occurs again so it may just have been luck that it worked once.

--- Additional comment from jlayton@redhat.com on 2010-08-16 14:38:35 EDT ---

Ok, I think this is probably a server issue. That error is pretty strange and not something I've ever seen Windows return. I suggest opening a bug with EMC -- I don't think we'll be able to solve this without their involvement.


For now, I'll set this to NEEDINFO. If you get some more info from the EMC support people, I'll be happy to look it over.

--- Additional comment from metze@samba.org on 2010-08-17 09:13:35 EDT ---

This seems to be a problem with the MIT krb5 libraries.

I've tested mount.cifs and smbclient from samba4 (commit b0b73ca041ba3d90b3924b380abed4975e5354d9) with the
"HACK bin/smbclient4... and don't use MS KRB5 oid." patch.

This worked:

/tmp/smbclient4-nomskrb5 '//nas-nethz-users.d.ethz.ch/share-m-$/metze' -k yes --option="gensec_gssapi:delegation=no" --option="gensec_gssapi:mutual=no"

while this failed:

mount.cifs -o user=metze,uid=21866,sec=krb5 '//nas-nethz-users.d.ethz.ch/share-m-$/metze' /home/metze


See metze-both-arcfour-06.pcap: frame 10 is the cifs mount and frame 26 is smbclient4-nomskrb5.

Then I've tested with the installed smbclient from samba3
(Version 3.5.4-62.fc13) and also got the RPC_NT_INVALID_BINDING (0xc0020003)
error.

smbclient '//nas-nethz-users.d.ethz.ch/share-m-$/metze' -k yes
cli_session_setup_blob: receive failed (NT code 0xc0020003)
session setup failed: NT code 0xc0020003

As cifs.upcall and smbclient both use the MIT krb5 libraries
(krb5-libs-1.7.1-10.fc13.i686), but smbclient4-nomskrb5 uses
heimdal, I assume the problem is within the krb5 libraries.

The only difference between frames 10 and 26 is the different length
of the authenticator.

--- Additional comment from metze@samba.org on 2010-08-17 09:14:48 EDT ---

Created attachment 439107 [details]
Patch to build a source3/bin/smbclient4 statically

--- Additional comment from metze@samba.org on 2010-08-17 09:16:06 EDT ---

Created attachment 439108 [details]
Capture of mount.cifs and smbclient4-nomskrb5 with arcfour

--- Additional comment from jlayton@redhat.com on 2010-08-17 09:40:04 EDT ---

Nice work, Metze. Ok, moving this bug to be against the krb5 libs.

--- Additional comment from jlayton@redhat.com on 2010-08-17 09:46:40 EDT ---

Pity that wireshark isn't better able to stitch together SMB packets...

There is a difference in the length of the session setup requests. Both session setup requests have a trailing frame that wireshark ids as an "NBSS Continuation Message". The first one (from the failed session setup request) is 222 bytes long. The second is 226 bytes long.

It seems likely that the problem is related to that difference.

--- Additional comment from jlayton@redhat.com on 2010-08-17 09:50:25 EDT ---

Then again, session setup requests have a couple of UCS2 strings at the end. Those strings can be variable length, so it's not a given that the two different programs will send the exact same length session setup request.

It would be helpful to have more info from the EMC side of things. Like, what specifically does it not like about the tickets that MIT krb5 is creating?

--- Additional comment from metze@samba.org on 2010-08-17 10:04:50 EDT ---

I've also tested

/tmp/smbclient4-nomskrb5 '//nas-nethz-users.d.ethz.ch/share-m-$/metze' -k yes --option="gensec_gssapi:delegation=no" --option="gensec_gssapi:mutual=no" "--option=gensec:fake_gssapi_krb5=yes" "--option=gensec:gssapi_krb5=no"

and it also works...

I first thought that the problem might be that smbclient (samba3)
and cifs.upcall use the kerberos library directly (with selfmade gss wrapping) instead of using the gssapi wrapper, but the fake_gssapi_krb5 module should simulate that behavior...

--- Additional comment from metze@samba.org on 2010-08-17 10:07:04 EDT ---

from IRC:

15:48 < jlayton> metze: the length difference between those two session setup packets could have more to do with 
                 the strings at the end of the session setup packet too
15:48 < jlayton> unless I'm missing a length field in there
15:53 < metze> jlayton: no, they're not part of the authenticator
15:57 < jlayton> I didn't see an authenticator field length in there
16:00 < metze> click in ticket and compare the offsets
16:01 < metze> they start at the same offset
16:01 < metze> and then click on the authenticator 
16:01 < metze> they also start at the same offset
16:01 < metze> but have different starting bytes
16:01 < metze> the length is asn1 encoded
16:02 < metze> but the smb level security blob length is also different
16:06 < metze> 1396 of mount.cifs and 1470 of smbclient4
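
The ASN.1 length encoding mentioned in the log can be sketched as follows, assuming DER rules: a first octet below 0x80 is the length itself; otherwise its low seven bits give the number of following octets that hold the length, most significant first. The function name and the sample octets are illustrative and are not taken from the capture.

```shell
# Decode a DER length from its octets, given as hex arguments.
der_length() {
    first=$(( 0x$1 ))
    shift
    if [ "$first" -lt 128 ]; then
        # Short form: the octet is the length.
        echo "$first"
        return 0
    fi
    # Long form: the low 7 bits say how many length octets follow.
    count=$(( first & 0x7f ))
    len=0
    while [ "$count" -gt 0 ]; do
        len=$(( (len << 8) | 0x$1 ))
        shift
        count=$(( count - 1 ))
    done
    echo "$len"
}

der_length 7f          # short form, prints 127
der_length 82 05 74    # long form with two length octets, prints 1396
```

Two structures that start at the same offset but differ in size will therefore show different length octets right after the tag, which is the kind of difference in starting bytes described above.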

--- Additional comment from metze@samba.org on 2010-08-17 10:41:14 EDT ---

If you disable tcp checksum validation wireshark reassembles the session setup requests fine...

--- Additional comment from metze@samba.org on 2010-08-17 11:01:58 EDT ---

I think I found the problem, but I need to verify my guess.

Inside the authenticator we have a checksum field.

For real gssapi this checksum has type 8003 and contains things like
the channel binding and delegated credentials.

For selfmade gssapi using krb5_mk_req_extended() the checksum should
be of CKSUMTYPE_RSA_MD5(7), this is what heimdal is using and the reason why
/tmp/smbclient4-nomskrb5 '//nas-nethz-users.d.ethz.ch/share-m-$/metze' -k yes --option="gensec_gssapi:delegation=no" --option="gensec_gssapi:mutual=no" "--option=gensec:fake_gssapi_krb5=yes" "--option=gensec:gssapi_krb5=no"
works.

MIT uses CKSUMTYPE_HMAC_MD5(-138) and that's the reason why
smbclient '//nas-nethz-users.d.ethz.ch/share-m-$/metze' -k yes
gets rejected.

I need to modify heimdal to also use CKSUMTYPE_HMAC_MD5(-138)
and see if it also gets rejected.
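
To keep the three checksum types apart, here is a small lookup that summarizes the hypothesis above. It only restates the values named in this comment (with 8003 read as hexadecimal 0x8003); the function name is made up.

```shell
# Map the authenticator checksum types named in this comment to a description.
cksum_name() {
    case "$1" in
        7)            echo "CKSUMTYPE_RSA_MD5 (heimdal, accepted)" ;;
        -138)         echo "CKSUMTYPE_HMAC_MD5 (MIT krb5, rejected)" ;;
        0x8003|32771) echo "GSSAPI checksum (real gssapi)" ;;
        *)            echo "not discussed here" ;;
    esac
}

cksum_name 7
cksum_name -138
cksum_name 0x8003
```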

--- Additional comment from metze@samba.org on 2010-08-26 10:38:16 EDT ---

Created attachment 441233 [details]
Patch for krb5-1.7.1 to use checksum rsa md5 (7) as heimdal and windows clients
Comment 23 Stefan Metzmacher 2010-12-23 12:57:23 EST
Ok, the best fix seems to be using the GSSAPI checksum (0x8003),
that's what windows and all samba versions >= 3.6.0 (and maybe 3.5.7)
will use.

See https://bugzilla.samba.org/show_bug.cgi?id=7883

I'll provide a patch for cifs-utils soon.
Comment 24 Stefan Metzmacher 2010-12-27 15:34:47 EST
The patches for cifs-utils are here:
https://bugzilla.samba.org/show_bug.cgi?id=7890
Comment 29 Jeff Layton 2011-01-04 17:02:30 EST
*** Bug 667219 has been marked as a duplicate of this bug. ***
Comment 30 Jack Neely 2011-01-04 17:15:19 EST
I received cifs-utils-4.7-2.el6.i686.rpm from my Red Hat support rep containing
the above patch.  Unfortunately, I have to report that this did not work.

I installed the package, rebooted for good measure, but still get the same
errors.

   fs/cifs/netmisc.c: Mapping smb error code 31 to POSIX err -5

The mount command I'm using is:

   mount -t cifs //homedirtest1.oit.ncsu.edu/home-test/jjneely /cifs/home/jjneely/ --verbose -o sec=krb5,user=jjneely,uid=18536,gid=108
Comment 31 Jeff Layton 2011-01-04 20:10:30 EST
Moving this back to ASSIGNED for now.

Could you provide a wire capture of the mount attempt (preferably filtered on port 445) ? Please see this page for details of how to do them:

http://wiki.samba.org/index.php/LinuxCIFS_troubleshooting#Wire_Captures

...with that I'll be able to tell a little more about the actual error returned by the server.
Comment 32 Jeff Layton 2011-01-04 20:19:44 EST
Looks sort of like the error is ERRgeneral, which is unhelpfully described as "General error". But, let's get a capture and make sure I'm interpreting the message correctly.
Comment 33 Stefan Metzmacher 2011-01-05 08:09:50 EST
I think for this kind of problem, capturing everything except port 22 is better,
as the problem isn't strictly related to SMB on port 445.
Comment 34 Stefan Walter 2011-01-05 08:39:42 EST
I have built a modified cifs-utils-4.4-5 RPM using the source RPM and the patches
from https://bugzilla.samba.org/show_bug.cgi?id=7883 and with that installed
I can successfully mount a share like this:

mount.cifs -o user=walteste,uid=walteste,sec=krb5 '//nas-nethz-users.d.ethz.ch/share-w-$' /mnt

All the tickets only use 'ArcFour with HMAC/md5' encryption types, so these
patches seem to work with some EMC Celerra systems at least.
Comment 35 Stefan Metzmacher 2011-01-05 09:04:05 EST
cifs-utils-4.7-2.el6.i686.rpm doesn't activate the patch, as it doesn't regenerate configure.

Jeff, time to add ./autogen.sh and call it in the spec file :-)

readelf -a cifs.upcall |grep krb5_auth_con_set_req_cksumtype
returns nothing.
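
The same check can be wrapped in a small helper that reports whether a binary mentions a given symbol. This is only a sketch: the function name is made up, and cifs.upcall usually lives in /usr/sbin, so it may not be on a non-root PATH.

```shell
# Return 0 if the ELF file $1 mentions symbol $2, non-zero otherwise.
binary_references() {
    readelf -a "$1" 2>/dev/null | grep -q -- "$2"
}

if binary_references "$(command -v cifs.upcall)" krb5_auth_con_set_req_cksumtype; then
    echo "patch is compiled in"
else
    echo "patch is NOT compiled in (or cifs.upcall was not found)"
fi
```

If the configure script was not regenerated, the feature test for krb5_auth_con_set_req_cksumtype never runs, the patched code path is compiled out, and the symbol is absent from the binary.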
Comment 36 Jeff Layton 2011-01-05 09:15:35 EST
Thanks Metze. Will fix and rebuild. Should have a new package later today or tomorrow.
Comment 38 Jeff Layton 2011-01-05 10:07:13 EST
New package built, setting needinfo flag pending testing.
Comment 39 Jack Neely 2011-01-05 13:54:44 EST
The new cifs-utils-4.7-3.el6.i686.rpm package does indeed work.
Comment 42 Jeff Layton 2011-01-09 21:21:11 EST
Thanks for testing it, it should make 6.1.
Comment 43 Jeff Layton 2011-01-09 21:23:47 EST
Committed in cifs-utils-4.7-3.el6.
Comment 46 yanfu,wang 2011-03-23 02:00:50 EDT
No reproducer is available (see https://bugzilla.redhat.com/show_bug.cgi?id=668366#c7); the customer verified the fix in comment #39.
Code review done; the source package cifs-utils-4.8.1-1.el6 builds and looks sane.
Comment 47 errata-xmlrpc 2011-05-19 09:06:51 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0569.html
