Bug 667647 - Input/output error with sec=krb5
Summary: Input/output error with sec=krb5
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: samba
Version: 14
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Guenther Deschner
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On: 622790
Blocks: 645127 667644 667675
 
Reported: 2011-01-06 10:52 UTC by Guenther Deschner
Modified: 2011-04-01 15:21 UTC (History)

Fixed In Version: samba-3.5.8-75.fc14
Doc Type: Bug Fix
Doc Text:
Clone Of: 622790
Environment:
Last Closed: 2011-04-01 15:21:01 UTC
Type: ---
Embargoed:



Description Guenther Deschner 2011-01-06 10:52:44 UTC
+++ This bug was initially created as a clone of Bug #622790 +++

Description of problem:

While trying to mount a share (DFS or 'classic') with Kerberos, I got an error message like:

mount error(5): Input/output error

username/password works fine.

--- Additional comment from jlayton on 2010-08-10 08:45:02 EDT ---

Does anything pop up in dmesg when you mount that way?

--- Additional comment from mail on 2010-08-10 08:53:11 EDT ---

Aug 10 14:52:39 host kernel: CIFS VFS: Send error in SessSetup = -5
Aug 10 14:52:39 host kernel: CIFS VFS: cifs_mount failed w/return code = -5

--- Additional comment from jlayton on 2010-08-10 09:08:33 EDT ---

EIO usually means either that you got some malformed packets back from the server, or we got some other CIFS error that the kernel doesn't know how to translate.
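
For reference, the 5 in "mount error(5)" and the -5 in the kernel log lines are the same errno value. A minimal Python sketch, assuming a Linux client (where EIO is 5), that maps a kernel-style negative return code back to its symbolic name. The helper name describe_kernel_rc is made up for illustration:

```python
import errno
import os


def describe_kernel_rc(rc: int) -> str:
    """Turn a kernel-style negative return code (e.g. -5) into 'NAME: message'."""
    code = abs(rc)
    # errno.errorcode maps numeric errno values to their symbolic names.
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


# The value reported by "cifs_mount failed w/return code = -5".
print(describe_kernel_rc(-5))
```

This only confirms that the mount helper and the kernel agree on the error. It says nothing about which server response was translated into EIO.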

It might be good to have some cFYI info from one of these attempts. Could you follow the instructions here, capture the messages to a file and attach it?

http://wiki.samba.org/index.php/LinuxCIFS_troubleshooting#Enabling_Debugging

--- Additional comment from jlayton on 2010-08-10 09:53:29 EDT ---

I believe you may have attached the info to the wrong bug...

https://bugzilla.redhat.com/show_bug.cgi?id=622802#c2

Looks like the client is sending a session setup, and the server is sending back an error that gets translated to ERRgeneral which is sort of a catch-all error like EIO is. Would it be possible to get a binary capture of this? The instructions for how to do this are a little ways down on the same samba wiki page.

If not, you could do the capture yourself and tell me what wireshark says the error response to the Session Setup packet is.

--- Additional comment from mail on 2010-08-10 10:08:48 EDT ---

Created attachment 437892 [details]
trace file

--- Additional comment from jlayton on 2010-08-10 10:23:16 EDT ---

That doesn't show it. The problem here is that this error is cropping up when chasing a DFS referral, so limiting the capture to the particular mount host means that you aren't getting anything once the referral is chased.

You may want to redo the capture and leave out the 'host cifs_server.example.com' part of the capture filter. If you capture everything on port 445 it should get it.
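
The difference between the two capture filters can be sketched as follows. This is an illustrative Python helper (build_capture_argv is a hypothetical name, and the output path is an assumption) that assembles a tcpdump command line with or without the host restriction. It only builds the argument list and does not run tcpdump:

```python
from typing import List, Optional


def build_capture_argv(outfile: str, port: int = 445,
                       host: Optional[str] = None) -> List[str]:
    """Build a tcpdump argv; leaving host unset captures all traffic on the port."""
    # -s 0 captures full packets, -w writes a binary capture file.
    argv = ["tcpdump", "-s", "0", "-w", outfile]
    capture_filter = f"port {port}"
    if host is not None:
        # Restricting to one host misses traffic to the DFS referral target.
        capture_filter = f"host {host} and {capture_filter}"
    argv.append(capture_filter)
    return argv


# Narrow filter (misses the server the referral points to):
print(build_capture_argv("/tmp/cifs.pcap", host="cifs_server.example.com"))
# Wide filter, as suggested above:
print(build_capture_argv("/tmp/cifs.pcap"))
```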

--- Additional comment from mail on 2010-08-10 10:36:19 EDT ---

Created attachment 437901 [details]
tcp dump only limited to port 445

--- Additional comment from jlayton on 2010-08-10 12:40:09 EDT ---

Wireshark says the error was this:

    NT Status: RPC_NT_INVALID_BINDING (0xc0020003)

...which is one I've never seen before. What kind of server is at 192.168.50.100?
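
As a side note on why this status looks out of place in an SMB session setup: an NTSTATUS value packs a severity, a customer bit, a facility and a code into 32 bits. A minimal Python sketch (split_ntstatus is a hypothetical helper name) that splits the value Wireshark reported:

```python
def split_ntstatus(status: int) -> dict:
    """Split a 32-bit NTSTATUS value into its bit fields."""
    return {
        "severity": (status >> 30) & 0x3,    # 3 means error
        "customer": (status >> 29) & 0x1,    # 0 means Microsoft-defined
        "facility": (status >> 16) & 0xFFF,  # subsystem that raised it
        "code": status & 0xFFFF,
    }


fields = split_ntstatus(0xC0020003)
print(fields)
```

The facility field comes out as 2, which is the RPC runtime facility in the Windows headers and matches the RPC_NT_ prefix of the name. That is consistent with the observation that this is not an error normally seen from the CIFS authentication path.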

--- Additional comment from mail on 2010-08-10 12:55:49 EDT ---

This is an EMC Storage.

--- Additional comment from jlayton on 2010-08-10 13:11:04 EDT ---

Thanks. I can find no helpful mention of that error in any of the MS docs. The closest I found is some googling that turned up some bugs in RDP protocol handling. Nothing in CIFS protocol however.

I suspect this is a bug in the EMC SPNEGO/krb5 implementation. You may want to file a bug report with them. You should be able to reference this bug if they need captures and such. If they think it's a bug in the Linux cifs implementation, then please have them explain what they think we're doing wrong...

I'll leave this bug open for now with the needinfo flag set in case you or they have questions or we need to discuss this further.

--- Additional comment from mail on 2010-08-10 13:17:03 EDT ---

One assumption could be that the enctype is not supported by the EMC, as the TGT is fetched from a 2008R2 server, which no longer supports DES (by default).

--- Additional comment from jlayton on 2010-08-10 13:33:56 EDT ---

Possibly -- only EMC could tell us that. I'll note that the krb5 ticket in the session setup request has this:

Encryption type: rc4-hmac (23)

...which is a very common enctype for windows. It would be a stretch for them to claim cifs+krb5 compatibility without supporting it...

Also from the capture, the negTokenTarg reply is "reject", but we can't tell much beyond that.

The SMB error code is also very odd. It's one I've never seen before, and I've seen servers throw quite a few different errors in response to krb5 auth problems.

--- Additional comment from mail on 2010-08-10 14:02:32 EDT ---

Setting:

        default_tkt_enctypes = arcfour-hmac-md5
        default_tgs_enctypes = arcfour-hmac-md5

solved the problem, so I guess we can close this one. Thanks for keeping an eye on it, Jeff.

--- Additional comment from jlayton on 2010-08-10 14:09:10 EDT ---

Sorry...where was this set? Someplace on the server?

--- Additional comment from mail on 2010-08-10 14:17:42 EDT ---

No, this has been set on the client.

From my observation, if the TGT is AES256, you won't be able to fetch service tickets from machines that do not support it. Setting the above value forced the TGT to arcfour-hmac-md5, which is compatible with every component (yes, even EMC) in our environment.
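
The thread does not say which file or section the settings went into. Assuming they were placed in the [libdefaults] section of the client's krb5.conf (an assumption, as is the realm-free fragment below), this Python sketch shows the fragment and a small hypothetical parser, parse_libdefaults, that reads the enctype lists back:

```python
FRAGMENT = """
[libdefaults]
        default_tkt_enctypes = arcfour-hmac-md5
        default_tgs_enctypes = arcfour-hmac-md5
"""


def parse_libdefaults(text: str) -> dict:
    """Collect 'key = value ...' relations from the [libdefaults] section."""
    section = None
    relations = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            continue
        if section == "libdefaults" and "=" in line:
            key, value = line.split("=", 1)
            # Enctype lists are whitespace-separated in this sketch.
            relations[key.strip()] = value.split()
    return relations


print(parse_libdefaults(FRAGMENT))
```

arcfour-hmac-md5 is an alias for the rc4-hmac enctype (23) that the service ticket in the capture already used.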

--- Additional comment from jlayton on 2010-08-10 14:31:46 EDT ---

Huh....ok. Here's what I don't quite get -- the enctype of the service ticket in the session setup request was rc4-hmac. Why would the server care that the TGT used to get this ticket from the KDC was AES?

It doesn't seem like that should matter at all...

--- Additional comment from mail on 2010-08-10 15:04:35 EDT ---

Ah, sorry, it's not about the TGT, but about the Client/Server Session Key enc type negotiation which does not seem to work if not forced.

--- Additional comment from mail on 2010-08-11 09:43:50 EDT ---

hmm, after some tries the problem occurs again so it may just have been luck that it worked once.

--- Additional comment from jlayton on 2010-08-16 14:38:35 EDT ---

Ok, I think this is probably a server issue. That error is pretty strange and not something I've ever seen Windows return. I suggest opening a bug with EMC -- I don't think we'll be able to solve this without their involvement.


For now, I'll set this to NEEDINFO. If you get some more info from the EMC support people, I'll be happy to look it over.

--- Additional comment from metze on 2010-08-17 09:13:35 EDT ---

This seems to be a problem with the MIT krb5 libraries.

I've tested mount.cifs and smbclient from samba4 (commit b0b73ca041ba3d90b3924b380abed4975e5354d9) with the
"HACK bin/smbclient4... and don't use MS KRB5 oid." patch.

This worked:

/tmp/smbclient4-nomskrb5 '//nas-nethz-users.d.ethz.ch/share-m-$/metze' -k yes --option="gensec_gssapi:delegation=no" --option="gensec_gssapi:mutual=no"

while this failed:

mount.cifs -o user=metze,uid=21866,sec=krb5 '//nas-nethz-users.d.ethz.ch/share-m-$/metze' /home/metze


See metze-both-arcfour-06.pcap: frame 10 is the cifs mount and frame 26 is smbclient4-nomskrb5.

Then I've tested with the installed smbclient from samba3
(Version 3.5.4-62.fc13) and also got the RPC_NT_INVALID_BINDING (0xc0020003)
error.

smbclient '//nas-nethz-users.d.ethz.ch/share-m-$/metze' -k yes
cli_session_setup_blob: receive failed (NT code 0xc0020003)
session setup failed: NT code 0xc0020003

As cifs.upcall and smbclient both use the MIT krb5 libraries
(krb5-libs-1.7.1-10.fc13.i686), but smbclient4-nomskrb5 uses
heimdal, I assume the problem is within the krb5 libraries.

The only difference between frames 10 and 26 is the different length
of the authenticator.

--- Additional comment from metze on 2010-08-17 09:14:48 EDT ---

Created attachment 439107 [details]
Patch to build a source3/bin/smbclient4 statically

--- Additional comment from metze on 2010-08-17 09:16:06 EDT ---

Created attachment 439108 [details]
Capture of mount.cifs and smbclient4-nomskrb5 with arcfour

--- Additional comment from jlayton on 2010-08-17 09:40:04 EDT ---

Nice work, Metze. Ok, moving this bug to be against the krb5 libs.

--- Additional comment from jlayton on 2010-08-17 09:46:40 EDT ---

Pity that wireshark isn't better able to stitch together SMB packets...

There is a difference in the length of the session setup requests. Both session setup requests have a trailing frame that wireshark ids as an "NBSS Continuation Message". The first one (from the failed session setup request) is 222 bytes long. The second is 226 bytes long.

It seems likely that the problem is related to that difference.

--- Additional comment from jlayton on 2010-08-17 09:50:25 EDT ---

Then again, session setup requests have a couple of UCS2 strings at the end. Those strings can be variable length, so it's not a given that the two different programs will send the exact same length session setup request.

It would be helpful to have more info from the EMC side of things. Like, what specifically does it not like about the tickets that MIT krb5 is creating?

--- Additional comment from metze on 2010-08-17 10:04:50 EDT ---

I've also tested

/tmp/smbclient4-nomskrb5 '//nas-nethz-users.d.ethz.ch/share-m-$/metze' -k yes --option="gensec_gssapi:delegation=no" --option="gensec_gssapi:mutual=no" "--option=gensec:fake_gssapi_krb5=yes" "--option=gensec:gssapi_krb5=no"

and it also works...

I first thought that the problem might be that smbclient (samba3)
and cifs.upcall use the kerberos library directly (with selfmade gss wrapping) instead of using the gssapi wrapper, but the fake_gssapi_krb5 module should simulate that behavior...

--- Additional comment from metze on 2010-08-17 10:07:04 EDT ---

from IRC:

15:48 < jlayton> metze: the length difference between those two session setup packets could have more to do with the strings at the end of the session setup packet too
15:48 < jlayton> unless I'm missing a length field in there
15:53 < metze> jlayton: no, they're not part of the authenticator
15:57 < jlayton> I didn't see an authenticator field length in there
16:00 < metze> click in ticket and compare the offsets
16:01 < metze> they start at the same offset
16:01 < metze> and then click on the authenticator 
16:01 < metze> they also start at the same offset
16:01 < metze> but have different starting bytes
16:01 < metze> the length is asn1 encoded
16:02 < metze> but the smb level security blob length is also different
16:06 < metze> 1396 of mount.cifs and 1470 of smbclient4
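
Since the authenticator length is ASN.1 encoded, two blobs that start at the same offset can differ in their first bytes simply because the length differs. A minimal Python sketch of DER definite-length encoding and decoding follows. The function names are made up, and the two numbers from the log above (which are SMB-level security blob lengths, not authenticator lengths) are used only as sample inputs:

```python
from typing import Tuple


def encode_der_length(length: int) -> bytes:
    """Encode a non-negative length in DER definite form."""
    if length < 0x80:
        return bytes([length])  # short form: a single byte
    body = length.to_bytes((length.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body  # long form: count byte, then value


def decode_der_length(buf: bytes) -> Tuple[int, int]:
    """Return (length, number of bytes consumed) from the start of buf."""
    first = buf[0]
    if first < 0x80:
        return first, 1
    count = first & 0x7F
    if count == 0 or len(buf) < 1 + count:
        raise ValueError("indefinite or truncated length")
    return int.from_bytes(buf[1:1 + count], "big"), 1 + count


for sample in (1396, 1470):
    encoded = encode_der_length(sample)
    print(sample, encoded.hex(), decode_der_length(encoded))
```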

--- Additional comment from metze on 2010-08-17 10:41:14 EDT ---

If you disable tcp checksum validation wireshark reassembles the session setup requests fine...

--- Additional comment from metze on 2010-08-17 11:01:58 EDT ---

I think I found the problem, but I need to verify my guess.

Inside the authenticator we have a checksum field.

For real gssapi this checksum has type 8003 and contains things like
the channel binding and delegated credentials.

For selfmade gssapi using krb5_mk_req_extended() the checksum should
be of CKSUMTYPE_RSA_MD5(7), this is what heimdal is using and the reason why
/tmp/smbclient4-nomskrb5 '//nas-nethz-users.d.ethz.ch/share-m-$/metze' -k yes --option="gensec_gssapi:delegation=no" --option="gensec_gssapi:mutual=no" "--option=gensec:fake_gssapi_krb5=yes" "--option=gensec:gssapi_krb5=no"
works.

MIT uses CKSUMTYPE_HMAC_MD5(-138) and that's the reason why
smbclient '//nas-nethz-users.d.ethz.ch/share-m-$/metze' -k yes
gets rejected.

I need to modify heimdal to also use CKSUMTYPE_HMAC_MD5(-138)
and see if it also gets rejected.
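
The observations in this comment can be summarised in a small table. The Python sketch below records only what the comment reports (the checksum type each client put in the authenticator and whether the EMC server accepted it). It is a restatement of the hypothesis at this point in the thread, not a verified rule, and the constant and function names are chosen here for illustration:

```python
# Checksum type numbers as quoted in the comment above.
CKSUMTYPE_RSA_MD5 = 7
CKSUMTYPE_HMAC_MD5 = -138
CKSUMTYPE_GSSAPI = 0x8003  # used by "real" gssapi; no outcome reported here

# cksumtype -> (client that sent it, outcome seen against the EMC server)
OBSERVED = {
    CKSUMTYPE_RSA_MD5: ("smbclient4-nomskrb5 (Heimdal)", "accepted"),
    CKSUMTYPE_HMAC_MD5: ("smbclient from samba3 (MIT krb5)", "rejected"),
}


def observed_outcome(cksumtype: int) -> str:
    """Return the outcome reported in the thread, or 'not reported'."""
    entry = OBSERVED.get(cksumtype)
    return entry[1] if entry else "not reported"


for cksumtype in (CKSUMTYPE_RSA_MD5, CKSUMTYPE_HMAC_MD5, CKSUMTYPE_GSSAPI):
    print(cksumtype, observed_outcome(cksumtype))
```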

--- Additional comment from metze on 2010-08-26 10:38:16 EDT ---

Created attachment 441233 [details]
Patch for krb5-1.7.1 to use checksum rsa md5 (7) as heimdal and windows clients

--- Additional comment from nalin on 2010-11-01 13:57:50 EDT ---

Do we know which server implementations are doing this?  It's a bit hard to tell here if we're talking about a bug in the client or bug-compatibility with a server (and if so, which one(s)?).

--- Additional comment from jjneely on 2010-11-01 15:36:32 EDT ---

I have an EMC Celerra running revision 5.6.47.11.  (My EMC folks tell me they are planning an upgrade to 6.0.)

I'm using RHEL 6 Beta 2 with kerberos: krb5-libs-1.8.2-2.el6.x86_64

This combination results in the above errors when using kerberos authentication.

--- Additional comment from jjneely on 2010-11-01 16:20:02 EDT ---

Created attachment 456988 [details]
patch for kerberos 1.8.2

Porting the patch to 1.8.2 took some tweaking, but I can confirm that kerberos 1.8.2 in RHEL 6 Beta 2 built with this patch works with our EMC Celerra as well as normal Windows workstations.  (Before, only Windows workstations acting as the CIFS server would work.)

--- Additional comment from metze on 2010-11-02 04:39:41 EDT ---

It's bug-compatibility with EMC Servers.

As the client doesn't use the GSSAPI-Checksum 0x8003,
there might be some other possible fixes using
krb5_auth_con_set_req_cksumtype().

See the discussion here:
http://mailman.mit.edu/pipermail/krbdev/2010-September/thread.html#9478

But as nobody tried a krb5_auth_con_set_req_cksumtype() based workaround,
it's hard to tell if it would really fix the problem.

--- Additional comment from metze on 2010-11-02 04:46:45 EDT ---

Looking at the source of smbclient, it seems that krb5_auth_con_set_req_cksumtype() is already used to trigger the GSSAPI-Checksum,
but the problem still exists without using the patched krb5 library.
(But I haven't analyzed this in detail).

--- Additional comment from metze on 2010-12-23 12:59:05 EST ---

I've fixed the problem with smbclient,
see https://bugzilla.samba.org/show_bug.cgi?id=7883

I'll provide a patch for cifs-utils soon.

--- Additional comment from metze on 2010-12-27 15:35:00 EST ---

The patches for cifs-utils are here:
https://bugzilla.samba.org/show_bug.cgi?id=7890

Comment 1 Guenther Deschner 2011-04-01 15:21:01 UTC
This got fixed by a change in Samba's client krb5 code in the meantime.

