Bug 1595927 - Kerberos-enabled NFSv4 doesn't work on armv7hl
Summary: Kerberos-enabled NFSv4 doesn't work on armv7hl
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: nfs-utils
Version: 28
Hardware: armv7hl
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Assignee: Steve Dickson
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: 1602836
Reported: 2018-06-27 19:49 UTC by James
Modified: 2018-07-20 17:44 UTC (History)

Fixed In Version: nfs-utils-2.3.2-1.rc3.fc28
Doc Text:
Clone Of:
: 1602836 (view as bug list)
Environment:
Last Closed: 2018-07-20 17:44:14 UTC
Type: Bug
Embargoed:


Attachments:
Proposed upstream patch (1.94 KB, patch)
2018-07-17 19:29 UTC, Steve Dickson

Description James 2018-06-27 19:49:09 UTC
Description of problem:
There's something going on in the ARM build of the Kerberos bits of NFSv4 that stops it working on that platform (at least for me). Whenever I try to access an NFSv4 share I get either 'stale file handle' or 'permission denied'.

I've tried it with F28 on an RPi 3, and seen similar results on other small ARM boxes. The only other thing they have in common is no hardware RTC.

The machines are configured as FreeIPA clients, enrolled using ipa-client-install and ipa-client-automount. All the other Kerberos bits are working fine: I get tickets, can use GSSAPI ssh to log into other machines, etc.

I'm using exactly the same steps and configuration as I do on my x86-64 boxes, and they all work fine. I also have Krb5+NFSv4 working on my realm for SPARC, PA-RISC and POWER boxes (albeit in another distro) so I really can't understand this one!

I currently suspect gssproxy because when I try to access a share it produces the following:


[2018/06/27 19:16:44]: Debug Enabled (level: 3)
[2018/06/27 19:16:44]: Service: nfs-server, Keytab: /etc/krb5.keytab, Enctype: 18
[2018/06/27 19:16:44]: Service: nfs-client, Keytab: /etc/krb5.keytab, Enctype: 18
[2018/06/27 19:16:44]: Client [2018/06/27 19:16:44]: (/usr/sbin/gssproxy) [2018/06/27 19:16:44]:  connected (fd = 12)[2018/06/27 19:16:44]:  (pid = 1810) (uid = 0) (gid = 0)[2018/06/27 19:16:44]:  (context = system_u:system_r:kernel_t:s0)[2018/06/27 19:16:44]: 
[2018/06/27 19:16:50]: Client [2018/06/27 19:16:50]: (/usr/sbin/rpc.gssd) [2018/06/27 19:16:50]:  connected (fd = 13)[2018/06/27 19:16:50]:  (pid = 617) (uid = 60673) (gid = 60673)[2018/06/27 19:16:50]:  (context = system_u:system_r:gssd_t:s0)[2018/06/27 19:16:50]: 
[CID 13][2018/06/27 19:16:50]: [status] Handling query input: 0xdf43a8 (116)
[CID 13][2018/06/27 19:16:50]: Connection matched service nfs-client
[CID 13][2018/06/27 19:16:50]: [status] Processing request [0xdf43a8 (116)]
[CID 13][2018/06/27 19:16:50]: [status] Executing request 6 (GSSX_ACQUIRE_CRED) from [0xdf43a8 (116)]
[CID 13][2018/06/27 19:16:50]: gp_rpc_execute: executing 6 (GSSX_ACQUIRE_CRED) for service "nfs-client", euid: 60673,socket: (null)
    GSSX_ARG_ACQUIRE_CRED( call_ctx: { "" [  ] } input_cred_handle: <Null> add_cred: 0 desired_name: <Null> time_req: 4294967295 desired_mechs: { { 1 2 840 113554 1 2 2 } } cred_usage: INITIATE initiator_time_req: 0 acceptor_time_req: 0 )
gssproxy[1810]: (OID: { 1 2 840 113554 1 2 2 }) Unspecified GSS failure.  Minor code may provide more information, No credentials cache found
    GSSX_RES_ACQUIRE_CRED( status: { 851968 { 1 2 840 113554 1 2 2 } 2529639107 "Unspecified GSS failure.  Minor code may provide more information" "No credentials cache found" [  ] } output_cred_handle: <Null> )
[CID 13][2018/06/27 19:16:50]: [status] Returned buffer 6 (GSSX_ACQUIRE_CRED) from [0xdf43a8 (116)]: [0xb3943870 (176)]
[CID 13][2018/06/27 19:16:50]: [status] Handling query output: 0xb3943870 (176)
[2018/06/27 19:16:50]: [status] Handling query reply: 0xb3943870 (176)
[2018/06/27 19:16:50]: [status] Sending data: 0xb3943870 (176)
[2018/06/27 19:16:50]: [status] Sending data [0xb3943870 (176)]: successful write of 176


I don't know why it's not finding the credential cache. The stuff in /etc/gssproxy is the same as on the working x86-64 boxes. The machine keytab seems to be in order. I've tried both kernel keyring and KCM, no difference.
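For reference, the numeric fields in the GSSX_ARG/GSSX_RES lines above can be decoded by hand. This is a minimal sketch, assuming the major-status layout from RFC 2744 (calling error in bits 24-31, routine error in bits 16-23, supplementary bits in 0-15). The helper names are made up for illustration and are not part of gssproxy.

```python
# Decode the numeric status fields from the GSSX_RES_ACQUIRE_CRED line.
# Assumption: major status follows the RFC 2744 bit layout.

GSS_S_FAILURE = 13  # routine error number for "Unspecified GSS failure"


def split_major(major):
    """Return (calling_error, routine_error, supplementary_bits)."""
    return (major >> 24) & 0xFF, (major >> 16) & 0xFF, major & 0xFFFF


def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit value."""
    return value - (1 << 32) if value & 0x80000000 else value


calling, routine, supplementary = split_major(851968)
minor = as_signed32(2529639107)

print(calling, routine, supplementary)  # 0 13 0
print(minor)                            # -1765328189
print(4294967295 == 0xFFFFFFFF)         # True
```

So the major status is a plain GSS_S_FAILURE, the minor status is a negative mechanism-specific code (the log attaches "No credentials cache found" to it), and time_req is simply the all-ones 32-bit value rather than a corrupted number.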


Version-Release number of selected component (if applicable):
gssproxy-0.8.0-4.fc28.armv7hl
kernel-4.17.2-200.fc28.armv7hl
nfs-utils-2.3.2-0.fc28.armv7hl
sssd-1.16.2-1.fc28.armv7hl
sssd-kcm-1.16.2-1.fc28.armv7hl

How reproducible:
Consistently.

Steps to Reproduce:
1. Install F28 on an RPi 3.
2. Enroll it to a FreeIPA domain with Kerberised NFSv4, add the relevant automounts.
3. Attempt to access the share.

Actual results:
No share.

Expected results:
Shared stuff.

Comment 1 Robbie Harwood 2018-06-28 21:23:32 UTC
Hi, unfortunately, I don't have hardware for this architecture to test this on.  Could you capture an ltrace of gssproxy during the failure?  Also, `klist -ekt` on the keytab, and `klist -e` as the user?

Comment 2 James 2018-06-28 21:59:26 UTC
$ klist -e
Ticket cache: KCM:1402400001:43805
Default principal: james

Valid starting     Expires            Service principal
28/06/18 22:54:06  29/06/18 22:54:06  krbtgt/CB.ETTLE
        Etype (skey, tkt): aes256-cts-hmac-sha1-96, aes256-cts-hmac-sha1-96


# klist -ekt
Keytab name: FILE:/etc/krb5.keytab
KVNO Timestamp         Principal
---- ----------------- --------------------------------------------------------
   1 27/06/18 00:25:39 host/buck.cb.ettle (aes256-cts-hmac-sha1-96)
   1 27/06/18 00:25:39 host/buck.cb.ettle (aes128-cts-hmac-sha1-96)


I see pretty much the same on one of my x86-64 boxes. (ltrace to follow shortly.)

Comment 3 James 2018-06-28 22:56:36 UTC
From ltrace gssproxy -i, when doing an ls /<nfsv4 path>:


calloc(1, 148)                                                                      = 0xf2ed10
verto_get_private(0xf2db70, 0, -1, 0xf2edac)                                        = 0xf2f850
verto_get_fd(0xf2db70, 0, -1, -1)                                                   = 6
accept(6, 0xf2ed18, 0xf2ed88, 1)                                                    = 11
fcntl(11, 3, 0, 0)                                                                  = 2
fcntl(11, 4, 2050, 0xc75fe700)                                                      = 0
fcntl(11, 1, 0, 0xc75fe700)                                                         = 0
fcntl(11, 2, 1, 0xc75fe700)                                                         = 0
getsockopt(11, 1, 17, 0xf2ed90)                                                     = 0
getpeercon(11, 0xbe8e571c, 17, 1)                                                   = 0
context_new(0xf2cc00, 4, 0xc75fe700, 3)                                             = 0xf2cc38
freecon(0xf2cc00, 0xf2cc1b, 0, 3)                                                   = 0xf29010
__snprintf_chk(0xbe8e578c, 20, 1, 21)                                               = 14
realpath(0xbe8e578c, 0, 0xbe8e57a8, 0)                                              = 0xf2fa28
calloc(1, 16)                                                                       = 0xf2f908
verto_add_io(0xf2dc40, 16, 0x445de0, 11)                                            = 0xf35448
verto_set_private(0xf35448, 0xf2f908, 0, 0x465c08)                                  = 0xf35448
verto_get_fd(0xf35448, 0xf35448, 0x465c08, 0xc75fe700)                              = 11
verto_get_private(0xf35448, 0xf35448, 0x465c08, 1)                                  = 0xf2f908
read(11, "\200", 4)                                                                 = 4
malloc(116)                                                                         = 0xf36148
__errno_location()                                                                  = 0xb6efc9ac
read(11, "\022l\202\221", 116)                                                      = 116
calloc(1, 20)                                                                       = 0xf2f920
pthread_mutex_lock(0xf2a558, 0xf2f920, 20, 0)                                       = 0
pthread_mutex_unlock(0xf2a558, 0, 0, 0xf2d398)                                      = 0
pthread_mutex_lock(0xf2ecbc, 1, 0, 0)                                               = 0
pthread_cond_signal(0xf2ecd8, 0, 2351, 2)                                           = 0
pthread_mutex_unlock(0xf2ecbc, 129, 1, 0)                                           = 0
free(0xf2f908)                                                                      = <void>
gssproxy[2351]: (OID: { 1 2 840 113554 1 2 2 }) Unspecified GSS failure.  Minor code may provide more information, No credentials cache found
verto_get_private(0xf2f3c0, 0xf2f3c0, 0x446824, 0xc75fe700)                         = 0xf2a558
read(8, "", 1)                                                                      = 1
pthread_mutex_lock(0xf2a558, 2, 0, 0xf2f920)                                        = 0
pthread_mutex_unlock(0xf2a558, 0, 2351, 0)                                          = 0
calloc(1, 16)                                                                       = 0xf35480
verto_add_io(0xf2dc40, 32, 0x445bb8, 11)                                            = 0xf35448
verto_set_private(0xf35448, 0xf35480, 0, 0x465c08)                                  = 0xf35448
free(0xf2f920)                                                                      = <void>
verto_get_fd(0xf35448, 0xf35448, 0x445bb8, 0xc75fe700)                              = 11
verto_get_private(0xf35448, 0xf35448, 0x445bb8, 1)                                  = 0xf35480
__errno_location()                                                                  = 0xb6efc9ac
writev(11, 0xbe8e579c, 2, 0xbe8e5798)                                               = 180
calloc(1, 16)                                                                       = 0xf35498
verto_add_io(0xf2dc40, 16, 0x445de0, 11)                                            = 0xf2d400
verto_set_private(0xf2d400, 0xf35498, 0, 0x465c08)                                  = 0xf2d400
free(0xb3908bf0)                                                                    = <void>
free(0xf35480)                                                                      = <void>

Comment 4 Robbie Harwood 2018-07-03 20:38:09 UTC
Thanks.  I don't immediately see anything amiss in the keytab or the ltrace.  Could I trouble you for ptrace output as well?

Comment 5 James 2018-07-03 20:43:11 UTC
Do you mean strace? For anything else I'll need some guidance...

Comment 6 James 2018-07-03 21:18:29 UTC
Executing speculatively... I attached strace to the gssproxy process. When I did my first ls /nas/scratch, I got 'stale file handle' and from strace:


epoll_wait(5,   <-- where it paused before ls

[{EPOLLIN, {u32=11, u64=77309411339}}], 64, 59743) = 1
clock_gettime(CLOCK_MONOTONIC, {tv_sec=425, tv_nsec=858583211}) = 0
gettimeofday({tv_sec=1530652550, tv_usec=465087}, NULL) = 0
read(11, "\200\0\0t", 4)                = 4
read(11, "3\253\366\275\0\0\0\0\0\0\0\2\0\6\32\360\0\0\0\1\0\0\0\6\0\0\0\0\0\0\0\0"..., 116) = 116
futex(0x10ebd00, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x10ebcbc, FUTEX_WAKE_PRIVATE, 1) = 1
clock_gettime(CLOCK_MONOTONIC, {tv_sec=425, tv_nsec=876667276}) = 0
epoll_wait(5, [{EPOLLIN, {u32=4, u64=4294967300}}], 64, 59743) = 1
clock_gettime(CLOCK_MONOTONIC, {tv_sec=425, tv_nsec=887067009}) = 0
read(4, "\0", 1)                        = 1
epoll_ctl(5, EPOLL_CTL_ADD, 11, {EPOLLOUT, {u32=11, u64=81604378635}}) = -1 EEXIST (File exists)
epoll_ctl(5, EPOLL_CTL_MOD, 11, {EPOLLOUT, {u32=11, u64=81604378635}}) = 0
clock_gettime(CLOCK_MONOTONIC, {tv_sec=425, tv_nsec=889832722}) = 0
epoll_wait(5, [{EPOLLOUT, {u32=11, u64=81604378635}}], 64, 59743) = 1
clock_gettime(CLOCK_MONOTONIC, {tv_sec=425, tv_nsec=890922924}) = 0
writev(11, [{iov_base="\200\0\0\260", iov_len=4}, {iov_base="3\253\366\275\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\r\0\0"..., iov_len=176}], 2) = 180
epoll_ctl(5, EPOLL_CTL_MOD, 11, {EPOLLIN, {u32=11, u64=85899345931}}) = 0
clock_gettime(CLOCK_MONOTONIC, {tv_sec=425, tv_nsec=892383541}) = 0
epoll_wait(5, [{EPOLLIN, {u32=11, u64=85899345931}}], 64, 59743) = 1
clock_gettime(CLOCK_MONOTONIC, {tv_sec=426, tv_nsec=412722422}) = 0
gettimeofday({tv_sec=1530652551, tv_usec=19199}, NULL) = 0
read(11, "\200\0\0t", 4)                = 4
read(11, "3\253\366\276\0\0\0\0\0\0\0\2\0\6\32\360\0\0\0\1\0\0\0\6\0\0\0\0\0\0\0\0"..., 116) = 116
futex(0x10ebd04, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x10ebcbc, FUTEX_WAKE_PRIVATE, 1) = 1
clock_gettime(CLOCK_MONOTONIC, {tv_sec=426, tv_nsec=416787347}) = 0
epoll_wait(5, [{EPOLLIN, {u32=4, u64=4294967300}}], 64, 59743) = 1
clock_gettime(CLOCK_MONOTONIC, {tv_sec=426, tv_nsec=420828261}) = 0
read(4, "\0", 1)                        = 1
epoll_ctl(5, EPOLL_CTL_ADD, 11, {EPOLLOUT, {u32=11, u64=90194313227}}) = -1 EEXIST (File exists)
epoll_ctl(5, EPOLL_CTL_MOD, 11, {EPOLLOUT, {u32=11, u64=90194313227}}) = 0
clock_gettime(CLOCK_MONOTONIC, {tv_sec=426, tv_nsec=421796954}) = 0
epoll_wait(5, [{EPOLLOUT, {u32=11, u64=90194313227}}], 64, 59743) = 1
clock_gettime(CLOCK_MONOTONIC, {tv_sec=426, tv_nsec=422194660}) = 0
writev(11, [{iov_base="\200\0\0\260", iov_len=4}, {iov_base="3\253\366\276\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\r\0\0"..., iov_len=176}], 2) = 180
epoll_ctl(5, EPOLL_CTL_MOD, 11, {EPOLLIN, {u32=11, u64=94489280523}}) = 0
clock_gettime(CLOCK_MONOTONIC, {tv_sec=426, tv_nsec=423228456}) = 0
epoll_wait(5, 


I tried the ls again and got 'permission denied' this time. From strace:


[{EPOLLIN, {u32=11, u64=94489280523}}], 64, 59743) = 1
clock_gettime(CLOCK_MONOTONIC, {tv_sec=457, tv_nsec=317881963}) = 0
gettimeofday({tv_sec=1530652581, tv_usec=924402}, NULL) = 0
read(11, "\200\0\0t", 4)                = 4
read(11, "3\253\366\277\0\0\0\0\0\0\0\2\0\6\32\360\0\0\0\1\0\0\0\6\0\0\0\0\0\0\0\0"..., 116) = 116
futex(0x10ebd00, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x10ebcbc, FUTEX_WAKE_PRIVATE, 1) = 1
clock_gettime(CLOCK_MONOTONIC, {tv_sec=457, tv_nsec=323206360}) = 0
epoll_wait(5, [{EPOLLIN, {u32=4, u64=4294967300}}], 64, 59743) = 1
clock_gettime(CLOCK_MONOTONIC, {tv_sec=457, tv_nsec=325111037}) = 0
read(4, "\0", 1)                        = 1
epoll_ctl(5, EPOLL_CTL_ADD, 11, {EPOLLOUT, {u32=11, u64=98784247819}}) = -1 EEXIST (File exists)
epoll_ctl(5, EPOLL_CTL_MOD, 11, {EPOLLOUT, {u32=11, u64=98784247819}}) = 0
clock_gettime(CLOCK_MONOTONIC, {tv_sec=457, tv_nsec=328941432}) = 0
epoll_wait(5, [{EPOLLOUT, {u32=11, u64=98784247819}}], 64, 59743) = 1
clock_gettime(CLOCK_MONOTONIC, {tv_sec=457, tv_nsec=329499919}) = 0
writev(11, [{iov_base="\200\0\0\260", iov_len=4}, {iov_base="3\253\366\277\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\r\0\0"..., iov_len=176}], 2) = 180
epoll_ctl(5, EPOLL_CTL_MOD, 11, {EPOLLIN, {u32=11, u64=103079215115}}) = 0
clock_gettime(CLOCK_MONOTONIC, {tv_sec=457, tv_nsec=331903708}) = 0
epoll_wait(5, [], 64, 59743)            = 0
clock_gettime(CLOCK_MONOTONIC, {tv_sec=517, tv_nsec=90213467}) = 0
gettimeofday({tv_sec=1530652641, tv_usec=697179}, NULL) = 0
clock_gettime(CLOCK_MONOTONIC, {tv_sec=517, tv_nsec=92249229}) = 0
epoll_wait(5,
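The read/writev pairs in the strace can be matched against the gssproxy debug log by decoding the bytes. This is a rough sketch, assuming the socket carries ONC RPC with record marking as in RFC 5531 (a 4-byte header whose top bit flags the last fragment, followed by xid, message type, RPC version, program, version, procedure). The function name is hypothetical.

```python
import struct


def parse_record_marker(header):
    """Split a 4-byte RPC record marker into (last_fragment, length)."""
    (word,) = struct.unpack(">I", header)
    return bool(word & 0x80000000), word & 0x7FFFFFFF


# strace shows "\200\0\0t" before each request and "\200\0\0\260" before each reply.
request_marker = parse_record_marker(b"\x80\x00\x00t")
reply_marker = parse_record_marker(b"\x80\x00\x00\xb0")
print(request_marker)  # (True, 116)
print(reply_marker)    # (True, 176)

# First 24 bytes of the first request body, transcribed from the strace octal escapes.
call = (b"3\xab\xf6\xbd" b"\x00\x00\x00\x00" b"\x00\x00\x00\x02"
        b"\x00\x06\x1a\xf0" b"\x00\x00\x00\x01" b"\x00\x00\x00\x06")
xid, msg_type, rpcvers, prog, vers, proc = struct.unpack(">6I", call)
print(hex(xid), msg_type, rpcvers, prog, vers, proc)  # 0x33abf6bd 0 2 400112 1 6
```

The lengths agree with the 116-byte read and the 4 + 176 = 180 byte writev, and procedure 6 agrees with "request 6 (GSSX_ACQUIRE_CRED)" in the debug log, so the trace shows a well-formed request and reply rather than a transport problem.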

Comment 7 Robbie Harwood 2018-07-05 17:50:31 UTC
Yes, thanks, I meant strace.

What I don't understand from the trace is why I don't see any GSSAPI calls at all.  Nothing seems to *fail*, either.

If you're running `ls` as root: can you show `klist -ekt` of the keytab?  If not: can you show `klist -e` before (and after) running the command?

Can you also show your gssproxy configuration?

Comment 8 James 2018-07-05 21:51:58 UTC
Before ls, klist -e gives:


Ticket cache: KCM:1402400001:43805
Default principal: james

Valid starting     Expires            Service principal
05/07/18 22:45:53  06/07/18 22:45:53  krbtgt/CB.ETTLE
	Etype (skey, tkt): aes256-cts-hmac-sha1-96, aes256-cts-hmac-sha1-96


and after (with 'stale file handle')


Ticket cache: KCM:1402400001:43805
Default principal: james

Valid starting     Expires            Service principal
05/07/18 22:45:53  06/07/18 22:45:53  krbtgt/CB.ETTLE
	Etype (skey, tkt): aes256-cts-hmac-sha1-96, aes256-cts-hmac-sha1-96


which is interesting as I'd expect an nfs/skipper.cb.ettle service ticket.

The gssproxy config is the one created by ipa-client-install (same as on my x86-64 boxes), plus sssd-kcm. (Same results on kernel keyring.) I've got:

gssproxy.conf:
[gssproxy]

99-nfs-client.conf:
[service/nfs-client]
  mechs = krb5
  cred_store = keytab:/etc/krb5.keytab
  cred_store = ccache:FILE:/var/lib/gssproxy/clients/krb5cc_%U
  cred_store = client_keytab:/var/lib/gssproxy/clients/%U.keytab
  cred_usage = initiate
  allow_any_uid = yes
  trusted = yes
  euid = 0

24-nfs-server.conf:
[service/nfs-server]
  mechs = krb5
  socket = /run/gssproxy.sock
  cred_store = keytab:/etc/krb5.keytab
  trusted = yes
  kernel_nfsd = yes
  euid = 0

(This machine is not configured as a server.)
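The cred_store entries in 99-nfs-client.conf contain %U. As far as I understand the gssproxy.conf documentation, %U is replaced with the caller's numeric euid; treat that as an assumption. The helper below is hypothetical and is not gssproxy code. It only shows which paths the template would produce for the euid in the debug log from the description (60673) and for the UID in the KCM cache name (1402400001).

```python
def expand_cred_store(template, euid):
    """Hypothetical helper: substitute %U with a numeric euid."""
    return template.replace("%U", str(euid))


CCACHE_TEMPLATE = "ccache:FILE:/var/lib/gssproxy/clients/krb5cc_%U"

# euid reported by gssproxy for the rpc.gssd connection
print(expand_cred_store(CCACHE_TEMPLATE, 60673))
# UID that appears in "Ticket cache: KCM:1402400001:43805"
print(expand_cred_store(CCACHE_TEMPLATE, 1402400001))
```

The two expansions name different cache files, so it is worth checking which euid gssproxy actually sees for the request.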

Comment 9 Robbie Harwood 2018-07-13 15:19:31 UTC
Okay, another idea: what libverto backend are you using?  (Check `rpm -qa | grep libverto`).  Could you try a different one - glib, libev, and libevent are the options in Fedora these days.

Comment 10 James 2018-07-13 18:59:01 UTC
Initially it was libverto-libev.

I then removed the -libev package and installed the -libevent one -- no difference after reboot.

I installed the -glib backend. It wouldn't let me remove the -libevent package, so I just moved the .so and symlink out of the way -- still stale file handles.

I assume I've switched the backends correctly -- is there a config option I've missed?

[I'm open to this not being a gssproxy bug, but it just seemed the likeliest place to start debugging this...]

Comment 11 Robbie Harwood 2018-07-13 20:35:03 UTC
Thanks for checking.  Config file switching is probably something I should look into adding to libverto, but there isn't a better way right now than what you did.

I agree that gssproxy makes sense to start with (or krb5, or libverto, but those are also me) - just trying to eliminate causes as much as I can.  Next guess is to stare at the epoll calls, I think.

Comment 12 James 2018-07-14 08:32:20 UTC
Hmm... curiously I can access the 'public' NFS4 shares when I log in as root. There are some weird credentials in the cache.


[root@buck ~]# klist 
klist: Credentials cache keyring 'persistent:0:0' not found 
[root@buck ~]# ls /nas/public 
<content correctly listed>
[root@buck ~]# klist 
Ticket cache: KEYRING:persistent:0:krb_ccache_2Jumg4T 
Default principal: host/buck.cb.ettle 
 
Valid starting     Expires            Service principal 
01/01/70 01:00:00  01/01/70 01:00:00  Encrypted/Credentials/v1@X-GSSPROXY: 
[root@buck ~]# exit 


Note the times - I really can't shake the suspicion that this is somehow related to the Pi having a zero clock at power-on until it gets a network time.

(Obviously I still get permission denied on home directories.)


THEN AGAIN (heh...) below I logged in remotely as me just after reboot. With the usual tickets I can see the public share as root, but not the actual user associated with the principal.


[james@buck ~]$ su
Password:
[root@buck james]# klist
Ticket cache: KEYRING:persistent:1402400001:krb_ccache_RdKlQey
Default principal: james

Valid starting     Expires            Service principal
14/07/18 09:19:49  15/07/18 09:19:49  krbtgt/CB.ETTLE
[root@buck james]# ls /nas/public
<content correctly listed>
[root@buck james]# klist
Ticket cache: KEYRING:persistent:1402400001:krb_ccache_RdKlQey
Default principal: james

Valid starting     Expires            Service principal
14/07/18 09:19:49  15/07/18 09:19:49  krbtgt/CB.ETTLE
[root@buck james]# exit
[james@buck ~]$ klist
Ticket cache: KEYRING:persistent:1402400001:krb_ccache_RdKlQey
Default principal: james

Valid starting     Expires            Service principal
14/07/18 09:19:49  15/07/18 09:19:49  krbtgt/CB.ETTLE
[james@buck ~]$ ls /nas/public
ls: cannot access '/nas/public': Permission denied

Comment 13 Simo Sorce 2018-07-16 13:27:48 UTC
(In reply to James Ettle from comment #12)
> Hmm... curiously I can access the 'public' NFS4 shares when I log in as
> root. There are some weird credentials in the cache.

This is because you are allowed to use the machine creds as root, I would guess.

> Valid starting     Expires            Service principal 
> 01/01/70 01:00:00  01/01/70 01:00:00  Encrypted/Credentials/v1@X-GSSPROXY: 
> [root@buck ~]# exit 
> 
> 
> Note the times - I really can't shake the suspicion that this is somehow
> related to the Pi having a zero clock at power-on until it gets a network
> time.

It is not; that's a special ticket we use in gssproxy, and its times are always set to the epoch 0 time.
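A quick check that the displayed time really is epoch 0 and not a clock artifact. This is a small sketch which assumes the klist output was rendered in a UTC+1 local zone; the zone is a guess and is not stated in the report.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the reporter's machine displayed times in UTC+1.
local_zone = timezone(timedelta(hours=1))

# Format epoch 0 the same way klist printed it (dd/mm/yy HH:MM:SS).
stamp = datetime.fromtimestamp(0, tz=local_zone).strftime("%d/%m/%y %H:%M:%S")
print(stamp)  # 01/01/70 01:00:00
```

Under that assumption the 01:00:00 is just the UTC offset applied to a timestamp of zero.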

> (Obviously I still get permission denied on home directories.)
> 
> 
> THEN AGAIN (heh...) below I logged in remotely as me just after reboot. With
> the usual tickets I can see the public share as root, but not the actual
> user associated with the principal.
> 
> 
> [james@buck ~]$ su
> Password:
> [root@buck james]# klist
> Ticket cache: KEYRING:persistent:1402400001:krb_ccache_RdKlQey
> Default principal: james
> 
> Valid starting     Expires            Service principal
> 14/07/18 09:19:49  15/07/18 09:19:49  krbtgt/CB.ETTLE
> [root@buck james]# ls /nas/public
> <content correctly listed>
> [root@buck james]# klist
> Ticket cache: KEYRING:persistent:1402400001:krb_ccache_RdKlQey
> Default principal: james

Use klist -A in this case, you'll see more ...

> Valid starting     Expires            Service principal
> 14/07/18 09:19:49  15/07/18 09:19:49  krbtgt/CB.ETTLE
> [root@buck james]# exit
> [james@buck ~]$ klist
> Ticket cache: KEYRING:persistent:1402400001:krb_ccache_RdKlQey
> Default principal: james
> 
> Valid starting     Expires            Service principal
> 14/07/18 09:19:49  15/07/18 09:19:49  krbtgt/CB.ETTLE
> [james@buck ~]$ ls /nas/public
> ls: cannot access '/nas/public': Permission denied

Something is failing to get you a ticket. rpc.gssd should fall back and try to use your ticket if gssproxy fails. Could you try setting GSSPROXY_BEHAVIOR=LOCAL_ONLY as an env var for rpc.gssd and see if that still fails?
Also try to make rpc.gssd stop using GSS-Proxy completely by removing the GSS_USE_PROXY env var.
These experiments will help us narrow down where the bug should be.

Comment 14 James 2018-07-16 21:32:14 UTC
I tried klist -A, couldn't see any additional info.

Also had a go at disabling gssproxy, then the rpc.gssd env var. Same as before in either case.

One difference I did notice is that *without* GSSPROXY_BEHAVIOR=LOCAL_ONLY I see 'Unspecified GSS failure'/'No credentials cache found' errors logged by gssproxy whenever I try to access a share.

I just tried using the FILE: credential cache instead. rpc.gssd -f -vvvv complains that the cache file 'is expired or corrupt' but the klist results look OK. The cache also works fine for ssh.

Comment 15 James 2018-07-16 22:14:37 UTC
Found a solution. See here:

https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/1779962

Seems rpc.gssd is using the wrong syscall and mangling large uids in the process. I applied the changes as described in the link above to nfs-utils-2.3.2-1.rc2.fc28 and it works.

Why this only shows up for me on ARM must be some quirk of the ABI...

Comment 16 Simo Sorce 2018-07-17 09:48:08 UTC
Wow, great catch James.

Steve, this is a bad bad bug, we want to fix this ASAP on all platforms.

Comment 17 James 2018-07-17 10:05:11 UTC
(In reply to Simo Sorce from comment #16)
> Wow, great catch James.
> 
> Steve, this is a bad bad bug, we want to fix this ASAP on all platforms.

With all due credit to 'sree314' on Launchpad. Hopefully this can get upstream and backported to the 1.3 branch (if still maintained) since I've found this is a cross-distro issue.

Just demonstrates the value of cross-platform testing...

Comment 18 Steve Dickson 2018-07-17 19:29:28 UTC
Created attachment 1459549 [details]
Proposed upstream patch

Here is the patch I'm about to propose upstream.

I've also added it to the original bug report:
https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/1779962

James, if possible, could you please test this 
patch before I post it.

Comment 19 James 2018-07-17 21:48:42 UTC
Fix confirmed with the patch in attachment 1459549 [details].

Comment 20 Steve Dickson 2018-07-18 14:19:30 UTC
(In reply to James Ettle from comment #19)
> Fix confirmed with the patch in attachment 1459549 [details].

Thank you!

Comment 21 Steve Dickson 2018-07-18 15:42:24 UTC
The upstream patch

commit 2a6b8307fa4243a7921270aedf8ce6506e31569a (HEAD -> master, origin/master, origin/HEAD)
Author: Steve Dickson <steved>
Date:   Tue Jul 17 15:09:37 2018 -0400

    rpc.gssd: truncates 32-bit UIDs/GIDs to 16 bits architectures.
    
    utils/gssd_proc.c uses SYS_setresuid and SYS_setresgid in
    change_identity when it should use SYS_setresuid32 and
    SYS_setresgid32 instead. This causes it to truncate
    UIDs/GIDs > 65536.
    
    Fixes: https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/1779962
    Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1595927
    
    Tested-by: James Ettle <theholyettlz>
    Tested-by: Sree <Sree>
    Signed-off-by: Steve Dickson <steved>
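As a rough illustration of the truncation described in the commit message, the effect can be modelled as keeping only the low 16 bits of the ID. This is a sketch of the arithmetic only, not of the kernel or rpc.gssd code, and the function name is made up. The UID is the one from the reporter's cache name (KCM:1402400001:43805).

```python
def truncate_to_16_bits(ident):
    """Model a 32-bit UID/GID being squeezed through a 16-bit syscall argument."""
    return ident & 0xFFFF


reporter_uid = 1402400001  # from "Ticket cache: KCM:1402400001:43805"

print(truncate_to_16_bits(reporter_uid))  # 60673
print(truncate_to_16_bits(65536))         # 0
print(truncate_to_16_bits(1000))          # 1000
```

The result for the reporter's UID, 60673, is the same value that the gssproxy debug log in the description shows for the rpc.gssd connection (uid = 60673, euid: 60673). In this model any ID of 65536 or above changes, while small local UIDs pass through unchanged, which would explain why only accounts with large directory-assigned UIDs were affected.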

Comment 22 Fedora Update System 2018-07-18 17:13:51 UTC
nfs-utils-2.3.2-1.rc3.fc28 has been submitted as an update to Fedora 28. https://bodhi.fedoraproject.org/updates/FEDORA-2018-7c7c2cd4c2

Comment 23 Fedora Update System 2018-07-19 20:20:55 UTC
nfs-utils-2.3.2-1.rc3.fc28 has been pushed to the Fedora 28 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-7c7c2cd4c2

Comment 24 Fedora Update System 2018-07-20 17:44:14 UTC
nfs-utils-2.3.2-1.rc3.fc28 has been pushed to the Fedora 28 stable repository. If problems still persist, please make note of it in this bug report.

