Bug 1477231 - ls hang on NFS share
ls hang on NFS share
Status: CLOSED CURRENTRELEASE
Product: Fedora
Classification: Fedora
Component: nfs-utils (Show other bugs)
26
Unspecified Linux
unspecified Severity unspecified
: ---
: ---
Assigned To: Steve Dickson
Fedora Extras Quality Assurance
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2017-08-01 10:06 EDT by Alex Scheel
Modified: 2017-10-05 14:09 EDT (History)
6 users (show)

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-10-05 14:09:22 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)
Wireshark captures (15.49 KB, application/x-xz)
2017-08-01 10:06 EDT, Alex Scheel
no flags Details

  None (edit)
Description Alex Scheel 2017-08-01 10:06:40 EDT
Created attachment 1307628 [details]
Wireshark captures

Description of problem:

`ls` hangs while listing share contents in rpc_wait_bit_killable on a kerberized NFS share (sec=krb5). Does not hang when run under `strace`, `ltrace`, or `gdb`. Fixed by restarting rpcbind on the server (i.e., initial boot only). Cannot reproduce by stopping nfs+gssproxy+rpcbind on the server and starting in any order. Happens with both GSS_USE_PROXY="yes" and GSS_USE_PROXY="no" on the client.


Version-Release number of selected component (if applicable):

==server==
nfs-utils-2.1.1-5.rc4.fc26.x86_64
gssproxy-0.7.0-9.fc26.x86_64
rpcbind-0.2.4-7.rc2.fc26.x86_64
krb5-libs-1.15.1-17.fc26.x86_64


==client==
nfs-utils-2.1.1-5.rc4.fc26.x86_64
gssproxy-0.7.0-9.fc26.x86_64
rpcbind-0.2.4-7.rc2.fc26.x86_64
krb5-libs-1.15.1-17.fc26.x86_64


How reproducible:

Happens every reboot of the VMs 


Steps to Reproduce:
1. Clone https://github.com/cipherboy/nfs-vagrant ; checkout commit 77b1e23a2d47379592cfb1c2dd08d54ced8de3ff
2. `vagrant up --no-provision ; vagrant provision`
3. `vagrant ssh nfs-client` and run `sudo reboot`
4. `vagrant ssh nfs-server` and run `sudo reboot`
5. in `vagrant ssh nfs-client`: 
5.1 `mount.nfs nfs-server.mivehind.net:/home /mnt/nfs/home -o sec=krb5`
5.2 `ls /mnt/nfs/home`  #  <--- HANG ; CTRL+C will kill
5.3 `ltrace ls /mnt/nfs/home`  # <-- Won't hang
5.4 `strace ls /mnt/nfs/home`  # <-- Won't hang
5.5 `ls /mnt/nfs/home`  #  <--- HANG ; CTRL+C will kill
5.6 `umount /mnt/nfs/home`
6. in `vagrant ssh nfs-server` # SERVER
6.1 `systemctl restart rpcbind`
7. in `vagrant ssh nfs-client`
7.1 `mount.nfs nfs-server.mivehind.net:/home /mnt/nfs/home -o sec=krb5`
7.2 `ls /mnt/nfs/home`  #  <--- Won't hang

Actual results:

ls hangs.


Expected results:

ls shouldn't hang and shouldn't need to restart rpcbind after reboot.

Additional info:

/proc/<ls pid>/stack during hang:
[<ffffffffc01319d4>] rpc_wait_bit_killable+0x24/0xa0 [sunrpc]
[<ffffffffc01334b4>] __rpc_execute+0xe4/0x430 [sunrpc]
[<ffffffffc013515d>] rpc_execute+0x5d/0xc0 [sunrpc]
[<ffffffffc0125e2e>] rpc_run_task+0x10e/0x160 [sunrpc]
[<ffffffffc032c633>] nfs4_call_sync_sequence+0x63/0xa0 [nfsv4]
[<ffffffffc032d171>] _nfs4_proc_lookup+0x131/0x190 [nfsv4]
[<ffffffffc033b840>] nfs4_proc_lookup_common+0x80/0x390 [nfsv4]
[<ffffffffc033bc30>] nfs4_proc_lookup+0x50/0xb0 [nfsv4]
[<ffffffffc02f5c3e>] nfs_lookup+0x10e/0x290 [nfs]
[<ffffffff9a279985>] lookup_slow+0xa5/0x170
[<ffffffff9a279f1d>] walk_component+0x1bd/0x340
[<ffffffff9a27a717>] path_lookupat+0x67/0xf0
[<ffffffff9a27da96>] filename_lookup+0xb6/0x190
[<ffffffff9a27dc46>] user_path_at_empty+0x36/0x40
[<ffffffff9a271e86>] vfs_statx+0x76/0xe0
[<ffffffff9a27244d>] SYSC_newlstat+0x3d/0x70
[<ffffffff9a2728ae>] SyS_newlstat+0xe/0x10
[<ffffffff9a875ff7>] entry_SYSCALL_64_fastpath+0x1a/0xa9
[<ffffffffffffffff>] 0xffffffffffffffff
 

(see attached wireshark captures)
Comment 1 Steve Dickson 2017-08-01 11:12:36 EDT
Looking at the hung wire trace 

4 Call (Reply In 2) GETATTR FH: 0x668d2a45
V4 Reply (Call In 1) GETATTR
V4 Call (Reply In 5) GETATTR FH: 0x668d2a45
V4 Reply (Call In 4) GETATTR
V4 Call (Reply In 8) LOOKUP DH: 0x668d2a45/vagrant
V4 Reply (Call In 6) LOOKUP | GETATTR Status: NFS4ERR_DELAY
V4 Call (Reply In 13) LOOKUP DH: 0x668d2a45/vagrant
V4 Reply (Call In 10) LOOKUP | GETATTR Status: NFS4ERR_DELAY
V4 Call (Reply In 17) LOOKUP DH: 0x668d2a45/vagrant
V4 Reply (Call In 15) LOOKUP | GETATTR Status: NFS4ERR_DELAY
V4 Call (Reply In 21) LOOKUP DH: 0x668d2a45/vagrant
V4 Reply (Call In 19) LOOKUP | GETATTR Status: NFS4ERR_DELAY
V4 Call (Reply In 25) LOOKUP DH: 0x668d2a45/vagrant
V4 Reply (Call In 23) LOOKUP | GETATTR Status: NFS4ERR_DELAY
V4 Call (Reply In 29) LOOKUP DH: 0x668d2a45/vagrant
V4 Reply (Call In 27) LOOKUP | GETATTR Status: NFS4ERR_DELAY
V4 Call LOOKUP DH: 0x668d2a45/vagrant

It looks like the problem is on the server side
Comment 3 Scott Mayhew 2017-08-01 19:55:16 EDT
Here's the debug logging showing GETATTR returning NFS4ERR_DELAY:

Aug 01 23:23:24 nfs-server.mivehind.net kernel: nfsv4 compound op #5/5: 9 (OP_GETATTR)
Aug 01 23:23:24 nfs-server.mivehind.net kernel: nfsd: fh_verify(36: 01070001 00200001 00000000 5bb7b08f 004567bc 9c941f85)
Aug 01 23:23:24 nfs-server.mivehind.net kernel: RPC:       Want update, refage=120, age=227
Aug 01 23:23:30 nfs-server.mivehind.net kernel: nfsv4 compound op ffff9d2039c55080 opcnt 5 #5: 9: status 10008
Aug 01 23:23:30 nfs-server.mivehind.net kernel: nfsv4 compound returned 10008

I noticed idmapd had failed and restarted it:

[root@nfs-server ~]# systemctl status nfs-idmapd
● nfs-idmapd.service - NFSv4 ID-name mapping service
   Loaded: loaded (/usr/lib/systemd/system/nfs-idmapd.service; static; vendor preset: disabled)
   Active: failed (Result: exit-code) since Tue 2017-08-01 23:19:21 UTC; 6min ago
  Process: 391 ExecStart=/usr/sbin/rpc.idmapd $RPCIDMAPDARGS (code=exited, status=0/SUCCESS)
 Main PID: 393 (code=exited, status=1/FAILURE)

Aug 01 23:14:20 nfs-server.mivehind.net systemd[1]: Starting NFSv4 ID-name mapping service...
Aug 01 23:14:20 nfs-server.mivehind.net rpc.idmapd[391]: rpc.idmapd: conf_reinit: open ("(null)", O_RDONLY) failed
Aug 01 23:14:20 nfs-server.mivehind.net systemd[1]: Started NFSv4 ID-name mapping service.
Aug 01 23:19:21 nfs-server.mivehind.net rpc.idmapd[393]: main: event_dispatch returns errno 9 (Bad file descriptor)
Aug 01 23:19:21 nfs-server.mivehind.net systemd[1]: nfs-idmapd.service: Main process exited, code=exited, status=1/FAI
Aug 01 23:19:21 nfs-server.mivehind.net systemd[1]: nfs-idmapd.service: Unit entered failed state.
Aug 01 23:19:21 nfs-server.mivehind.net systemd[1]: nfs-idmapd.service: Failed with result 'exit-code'.
[root@nfs-server ~]# systemctl start nfs-idmapd
[root@nfs-server ~]# systemctl status nfs-idmapd
● nfs-idmapd.service - NFSv4 ID-name mapping service
   Loaded: loaded (/usr/lib/systemd/system/nfs-idmapd.service; static; vendor preset: disabled)
   Active: active (running) since Tue 2017-08-01 23:26:03 UTC; 3s ago
  Process: 878 ExecStart=/usr/sbin/rpc.idmapd $RPCIDMAPDARGS (code=exited, status=0/SUCCESS)
 Main PID: 879 (rpc.idmapd)
    Tasks: 1 (limit: 4915)
   CGroup: /system.slice/nfs-idmapd.service
           └─879 /usr/sbin/rpc.idmapd

Aug 01 23:26:03 nfs-server.mivehind.net systemd[1]: Starting NFSv4 ID-name mapping service...
Aug 01 23:26:03 nfs-server.mivehind.net rpc.idmapd[878]: rpc.idmapd: conf_reinit: open ("(null)", O_RDONLY) failed
Aug 01 23:26:03 nfs-server.mivehind.net systemd[1]: Started NFSv4 ID-name mapping service.

and ls started working again:

[root@nfs-client ~]# ls -l /mnt/nfs/home
total 12
drwx------. 2 alex    alex    4096 Aug  1 22:13 alex
drwx------. 2 robbie  robbie  4096 Aug  1 22:13 robbie
drwx------. 4 vagrant vagrant 4096 Aug  1 23:14 vagrant


Here's a callgraph of nfsd4_encode_fattr while the hang was occurring.  It shows we're erroring out after the upcall to idmapd.  I think the timeout for the sunrpc cache upcalls is 5 seconds too, which you can see in the timing:

 0)               |  nfsd4_encode_fattr [nfsd]() {
 0)               |    fh_compose [nfsd]() {
 0)   0.953 us    |      _fh_update.isra.14.part.15 [nfsd]();
 0)   2.919 us    |    }
 0)   0.266 us    |    nfsd4_encode_bitmap [nfsd]();
 0)   0.204 us    |    fsid_source [nfsd]();
 0)               |    nfsd4_encode_user [nfsd]() {
 0)               |      encode_name_from_id [nfsd]() {
 0)               |        idmap_id_to_name [nfsd]() {
 0)               |          idmap_lookup [nfsd]() {
 0)               |            idtoname_lookup [nfsd]() {
 0)   0.256 us    |              idtoname_hash [nfsd]();
 0)   0.199 us    |              ent_alloc [nfsd]();
 0)   0.387 us    |              ent_init [nfsd]();
 0)   5.001 us    |            }
 0)               |            idtoname_lookup [nfsd]() {
 0)   0.411 us    |              idtoname_hash [nfsd]();
 0)   0.765 us    |              idtoname_match [nfsd]();
 0)   6.309 us    |            }
 0) $ 5246065 us  |          }
 0)   0.562 us    |          nfserrno [nfsd]();
 0) $ 5246069 us  |        }
 0) $ 5246070 us  |      }
 0) $ 5246072 us  |    }
 0)   2.190 us    |    fh_put [nfsd]();
 0) $ 5246232 us  |  }

Contrasting that with the callgraph of nfsd4_encode_fattr when everything's working:

 0)               |  nfsd4_encode_fattr [nfsd]() {
 0)               |    fh_compose [nfsd]() {
 0)   0.488 us    |      _fh_update.isra.14.part.15 [nfsd]();
 0)   2.596 us    |    }
 0)   0.229 us    |    nfsd4_encode_bitmap [nfsd]();
 0)   0.201 us    |    fsid_source [nfsd]();
 0)               |    nfsd4_encode_user [nfsd]() {
 0)               |      encode_name_from_id [nfsd]() {
 0)               |        idmap_id_to_name [nfsd]() {
 0)               |          idmap_lookup [nfsd]() {
 0)               |            idtoname_lookup [nfsd]() {
 0)   0.266 us    |              idtoname_hash [nfsd]();
 0)   0.265 us    |              idtoname_match [nfsd]();
 0)   3.307 us    |            }
 0)   4.700 us    |          }
 0)   6.558 us    |        }
 0)   7.741 us    |      }
 0)   9.173 us    |    }
 0)               |    nfsd4_encode_group [nfsd]() {
 0)               |      encode_name_from_id [nfsd]() {
 0)               |        idmap_id_to_name [nfsd]() {
 0)               |          idmap_lookup [nfsd]() {
 0)               |            idtoname_lookup [nfsd]() {
 0)   0.270 us    |              idtoname_hash [nfsd]();
 0)   0.250 us    |              idtoname_match [nfsd]();
 0)   3.154 us    |            }
 0)   4.562 us    |          }
 0)   6.559 us    |        }
 0)   8.541 us    |      }
 0) + 10.016 us   |    }
 0)   0.279 us    |    fh_put [nfsd]();
 0) + 32.935 us   |  }

So something's amiss with rpc.idmapd. 

It's supposed to read /etc/nfs.conf to get the pipefs mountpoint and then it reads /etc/idmapd.conf to get the settings for libnfsidmap, but it looks like it's not:

Aug 01 23:46:21 nfs-server.mivehind.net rpc.idmapd[845]: rpc.idmapd: conf_reinit: open ("(null)", O_RDONLY) failed
Comment 5 Alex Scheel 2017-08-02 11:09:06 EDT
Close: the issue is in libnfsidmap actually. 

Note that rpc.idmapd is from nfs-utils (with patches carried by Fedora [0]). Upstream [1] defines conf_init to be `void conf_init(void)` (in support/nfs/conffile.c:373). However, Fedora ships a patch (nfs-utils-2.1.2-rc4.patch?) which updates this to be `void conf_init(const char *conf_file)` (in supporf/nfs/conffile.c:529 after patches), replacing the global variable conf_path with an explicit passed-in argument. This is what rpc.idmapd is built against, and contains guards to replace conf_path with a default if conf_path is NULL, (and further, utils/idmapd/idmapd.c contains code to set it appropriately as well, so it should never be NULL).

==HOWEVER==

Note that rpc.idmapd links against libnfsidamp:

[root@nfs-server usr]# ldd /usr/sbin/rpc.idmapd
        linux-vdso.so.1 (0x00007ffffab5a000)
        libevent-2.0.so.5 => /lib64/libevent-2.0.so.5 (0x00007fc8a80ae000)
        libnfsidmap.so.0 => /lib64/libnfsidmap.so.0 (0x00007fc8a7ea6000)
        libc.so.6 => /lib64/libc.so.6 (0x00007fc8a7ad5000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fc8a78b6000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007fc8a76b2000)
        libresolv.so.2 => /lib64/libresolv.so.2 (0x00007fc8a7497000)
        /lib64/ld-linux-x86-64.so.2 (0x0000561818709000)


In libnfsidmap, if you pull the RPM [2], it defines, in "cfg.c", a function conf_init which closely mirrors upstream's conf_init (an older version?) in that it uses conf_path and much of the same structure. However, this code has not been updated to reflect the new conf_init that rpc.idmapd is using. 

So, conf_path (the global) is NULL, and conf_init from the libnfsidmap package is being used:

# LD_DEBUG="bindings" ldd -r /usr/sbin/rpc.idmapd 2>&1 | grep 'conf_init'
       959:     binding file /lib64/libnfsidmap.so.0 [0] to /lib64/libnfsidmap.so.0 [0]: normal symbol `conf_init'
       959:     binding file /usr/sbin/rpc.idmapd [0] to /lib64/libnfsidmap.so.0 [0]: normal symbol `conf_init'


Hence, we get the error above, and rpc.idmapd crashes after first mount.


[0]: https://dl.fedoraproject.org/pub/fedora/linux/releases/26/Everything/source/tree/Packages/n/nfs-utils-2.1.1-5.rc4.fc26.src.rpm
[1]: https://www.kernel.org/pub/linux/utils/nfs-utils/2.1.1/nfs-utils-2.1.1.tar.xz
[2]: https://dl.fedoraproject.org/pub/fedora/linux/releases/26/Everything/source/tree/Packages/l/libnfsidmap-0.27-1.fc26.src.rpm
Comment 6 Scott Mayhew 2017-08-03 12:00:05 EDT
I'm going to spin that off into a separate bug.  It's something that definitely needs to be fixed, but it doesn't look like that's whats causing rpc.idmapd to exit with failure.
Comment 7 Scott Mayhew 2017-08-03 12:17:36 EDT
(In reply to Scott Mayhew from comment #6)
> I'm going to spin that off into a separate bug.  It's something that
> definitely needs to be fixed, but it doesn't look like that's whats causing
> rpc.idmapd to exit with failure.

https://bugzilla.redhat.com/show_bug.cgi?id=1478129
Comment 8 Scott Mayhew 2017-08-03 12:22:57 EDT
This is the the fd that we're getting -EBADF on:

[root@fedora26 ~]# lsof -p `pidof rpc.idmapd`|grep eventpoll
rpc.idmap 650 root    4u  a_inode               0,12        0       8738 [eventpoll]

and here's where we're gettine -EBADF:

[root@fedora26 ~]# stap -v -d /usr/lib64/libevent-2.0.so.5.1.9 -d /usr/lib64/libnfsidmap.so.0.3.0 -d /usr/lib64/libnfsidmap/nsswitch.so -d /usr/lib64/libnss_sss.so.2 -d /usr/lib64/libc-2.25.so -d /usr/sbin/rpc.idmapd -e 'probe syscall.epoll_wait { if (execname() != "rpc.idmapd") next; printf("%s %s -> %s %s\n", tz_ctime(gettimeofday_s()), execname(), ppfunc(), $$parms); } probe syscall.epoll_wait.return { if (execname() != "rpc.idmapd") next; printf("%s %s <- %s ret: %d\n", tz_ctime(gettimeofday_s()), execname(), ppfunc(), $return); if ($return < 0) print_ubacktrace(); }'
Pass 1: parsed user script and 467 library scripts using 245656virt/45248res/7412shr/38076data kb, in 100usr/20sys/147real ms.
Pass 2: analyzed script: 2 probes, 15 functions, 103 embeds, 0 globals using 427948virt/229204res/8968shr/220368data kb, in 1730usr/280sys/2187real ms.
Pass 3: translated to C into "/tmp/stapdl8inx/stap_f960fe0cd9220f684ec4d826c84cee67_70060_src.c" using 427948virt/229404res/9168shr/220368data kb, in 270usr/60sys/339real ms.
Pass 4: compiled C into "stap_f960fe0cd9220f684ec4d826c84cee67_70060.ko" in 7160usr/410sys/7868real ms.
Pass 5: starting run.
Thu Aug  3 08:22:59 2017 EDT rpc.idmapd -> SyS_epoll_wait epfd=0x4 events=0x562c1f26e080 maxevents=0x20 timeout=0xffffffff
Thu Aug  3 08:23:06 2017 EDT rpc.idmapd <- SyS_epoll_wait ret: 1
Thu Aug  3 08:23:06 2017 EDT rpc.idmapd -> SyS_epoll_wait epfd=0x4 events=0x562c1f26e080 maxevents=0x20 timeout=0xffffffff
Thu Aug  3 08:23:06 2017 EDT rpc.idmapd <- SyS_epoll_wait ret: 1
Thu Aug  3 08:23:06 2017 EDT rpc.idmapd -> SyS_epoll_wait epfd=0x4 events=0x562c1f26e080 maxevents=0x20 timeout=0xffffffff
Thu Aug  3 08:23:06 2017 EDT rpc.idmapd <- SyS_epoll_wait ret: -9
 0x7f0d72e560d3 : __epoll_wait_nocancel+0xa/0x57 [/usr/lib64/libc-2.25.so]
 0x7f0d733430fb : epoll_dispatch+0xab/0x1f0 [/usr/lib64/libevent-2.0.so.5.1.9]
 0x7f0d7332ddbe : event_base_loop+0x16e/0x1910 [/usr/lib64/libevent-2.0.so.5.1.9]
 0x562c1d6293ba : main+0x60a/0x890 [/usr/sbin/rpc.idmapd]
 0x7f0d72d654da : __libc_start_main+0xea/0x1c0 [/usr/lib64/libc-2.25.so]
 0x562c1d62966a : _start+0x2a/0x30 [/usr/sbin/rpc.idmapd]

So what is closing fd 4?

[root@fedora26 ~]# stap -v -d /usr/lib64/libevent-2.0.so.5.1.9 -d /usr/lib64/libnfsidmap.so.0.3.0 -d /usr/lib64/libnfsidmap/nsswitch.so -d /usr/lib64/libnss_sss.so.2 -d /usr/lib64/libc-2.25.so -d /usr/sbin/rpc.idmapd -e 'probe syscall.close { if (execname() != "rpc.idmapd") next; if ($fd == 4) print_ubacktrace(); }'
Pass 1: parsed user script and 467 library scripts using 245652virt/45352res/7520shr/38072data kb, in 100usr/20sys/155real ms.
Pass 2: analyzed script: 1 probe, 4 functions, 100 embeds, 0 globals using 427696virt/228944res/8960shr/220116data kb, in 1300usr/280sys/1770real ms.
Pass 3: translated to C into "/tmp/staptCHr8d/stap_843f266503501f0a0526b28b1144124b_55561_src.c" using 427696virt/229144res/9160shr/220116data kb, in 290usr/50sys/350real ms.
Pass 4: compiled C into "stap_843f266503501f0a0526b28b1144124b_55561.ko" in 6960usr/400sys/7659real ms.
Pass 5: starting run.
 0x7ff2fe81e930 : __close_nocancel+0x7/0x57 [/usr/lib64/libc-2.25.so]
 0x7ff2fdeddd1b : sss_nss_mc_destroy_ctx+0x2b/0x70 [/usr/lib64/libnss_sss.so.2]
 0x7ff2fdede0e0 : sss_nss_mc_get_ctx+0x230/0x260 [/usr/lib64/libnss_sss.so.2]
 0x7ff2fdedec5f : sss_nss_mc_getgrgid+0x4f/0x170 [/usr/lib64/libnss_sss.so.2]
 0x7ff2fdedca5d : _nss_sss_getgrgid_r+0x7d/0x270 [/usr/lib64/libnss_sss.so.2]
 0x7ff2fe7ef4e9 : getgrgid_r@@GLIBC_2.2.5+0x1e9/0x4d0 [/usr/lib64/libc-2.25.so]
 0x7ff2fbfdd914 : nss_gid_to_name+0x84/0x150 [/usr/lib64/libnfsidmap/nsswitch.so]
 0x7ff2feaf2093 : nfs4_gid_to_name+0x93/0x130 [/usr/lib64/libnfsidmap.so.0.3.0]
 0x55d329db08d2 : idtonameres+0x92/0x130 [/usr/sbin/rpc.idmapd]
 0x55d329db1428 : imconv+0x128/0x1b0 [/usr/sbin/rpc.idmapd]
 0x55d329db1752 : nfsdcb+0x162/0x4a0 [/usr/sbin/rpc.idmapd]
 0x7ff2fed079d9 : event_base_loop+0xd89/0x1910 [/usr/lib64/libevent-2.0.so.5.1.9]
 0x55d329db03ba : main+0x60a/0x890 [/usr/sbin/rpc.idmapd]
 0x7ff2fe73e4da : __libc_start_main+0xea/0x1c0 [/usr/lib64/libc-2.25.so]
 0x55d329db066a : _start+0x2a/0x30 [/usr/sbin/rpc.idmapd]
Comment 9 Scott Mayhew 2017-08-04 18:13:47 EDT
You can fix the hang for now by changing the "passwd" and "group" lines in NFS server's /etc/nsswitch.conf file so that "files" is listed before "sss".

The order was changed to "sss files systemd" in bug 1427646 so that SSSD would handle local users and groups, but that's causing an issue for rpc.idmapd.  Specifically:

1. During startup, rpc.idmapd calls getpwnam("nobody") and getgrnam("nobody")

2. If SSSD is handling the local users & groups it opens the caches for both.  I'll just describe what's going on w/ the group cache since that's what seems to be messing up (no idea why the same isn't happening w/ the passwd cache.  Info about the group cache is stored in a struct sss_cli_mc_ctx called "gr_mc_ctx".  If the state of the gr_mc_ctx is UNINITIALIZED, sss_nss_mc_init_ctx() is called, which opens the cache file, in this case /var/lib/sss/mc/group.  It mmaps the file, stored the address returned by mmap() along with the fd in gr_mc_ctx.  So ctx->fd for the group cache winds up being 4.

3. Further along in the startup, rpc.idmapd daemonizes.  nfs-utils uses a custom function called daemon_init() which uses a pipe to allow the forked child to communicate its startup status back to the parent.  daemon_init() sets the child's stdin, stdout, and stderr fd's to /dev/null and fd 3 to the write end of the pipe.  It then calls closeall (another custom function) which closes all fd's >= 4... so /var/lib/sss/mc/group is now closed.

Here's 2 & 3 in action:

[root@fedora26 ~]# stap -v -d /usr/lib64/libnss_sss.so.2 -d /usr/sbin/rpc.idmapd -d /usr/lib64/libc-2.25.so -e 'global mcfd=0; probe syscall.open.return { if (execname() != "rpc.idmapd") next; if (kernel_string(@entry($filename), "") == "/var/lib/sss/mc/group") { printf("%s %s(%d) %s filename: %s fd: %d\n", tz_ctime(gettimeofday_s()), execname(), pid(), ppfunc(), kernel_string(@entry($filename)), $return); mcfd = $return; print_ubacktrace(); } } probe syscall.close { if (execname()!="rpc.idmapd") next; if ($fd==mcfd) { printf("%s %s(%d) %s filename: %s fd: %d\n", tz_ctime(gettimeofday_s()), execname(), pid(), ppfunc(), fullpath_struct_file(task_current(), task_fd_lookup(task_current(), $fd)), $fd); print_ubacktrace(); } }'
Pass 1: parsed user script and 467 library scripts using 245656virt/45272res/7440shr/38076data kb, in 100usr/10sys/156real ms.
Pass 2: analyzed script: 5 probes, 44 functions, 104 embeds, 9 globals using 428176virt/229236res/8668shr/220596data kb, in 1510usr/280sys/1959real ms.
Pass 3: translated to C into "/tmp/stapqiQwdb/stap_1b48689990f1a4f10189f9e0a1294128_96477_src.c" using 428176virt/229452res/8884shr/220596data kb, in 290usr/60sys/354real ms.
Pass 4: compiled C into "stap_1b48689990f1a4f10189f9e0a1294128_96477.ko" in 7950usr/420sys/8621real ms.
Pass 5: starting run.
Fri Aug  4 15:45:01 2017 EDT rpc.idmapd(2204) SyS_open filename: /var/lib/sss/mc/group fd: 4
 0x7f6c147c4f30 : __open_nocancel+0x7/0x57 [/usr/lib64/libc-2.25.so]
 0x7f6c13e85350 : sss_open_cloexec+0x30/0x60 [/usr/lib64/libnss_sss.so.2]
 0x7f6c13e85061 : sss_nss_mc_get_ctx+0x1b1/0x260 [/usr/lib64/libnss_sss.so.2]
 0x7f6c13e85ab4 : sss_nss_mc_getgrnam+0x54/0x1b0 [/usr/lib64/libnss_sss.so.2]
 0x7f6c13e83796 : _nss_sss_getgrnam_r+0x96/0x2e0 [/usr/lib64/libnss_sss.so.2]
 0x7f6c147969cb : getgrnam_r@@GLIBC_2.2.5+0x1fb/0x4e0 [/usr/lib64/libc-2.25.so]
 0x7f6c14795cb8 : getgrnam+0x98/0x1a0 [/usr/lib64/libc-2.25.so]
 0x561f41781181 : main+0x3d1/0x890 [/usr/sbin/rpc.idmapd]
 0x7f6c146e54da : __libc_start_main+0xea/0x1c0 [/usr/lib64/libc-2.25.so]
 0x561f4178166a : _start+0x2a/0x30 [/usr/sbin/rpc.idmapd]
Fri Aug  4 15:45:01 2017 EDT rpc.idmapd(2205) SyS_close filename: /var/lib/sss/mc/group fd: 4
 0x7f6c147c5930 : __close_nocancel+0x7/0x57 [/usr/lib64/libc-2.25.so]
 0x561f417835ee : closeall+0xae/0x120 [/usr/sbin/rpc.idmapd]
 0x561f417833bd : daemon_init+0x14d/0x230 [/usr/sbin/rpc.idmapd]
 0x561f417811c7 : main+0x417/0x890 [/usr/sbin/rpc.idmapd]
 0x7f6c146e54da : __libc_start_main+0xea/0x1c0 [/usr/lib64/libc-2.25.so]
 0x561f4178166a : _start+0x2a/0x30 [/usr/sbin/rpc.idmapd]

4. After returning from daemon_init() the child calls event_init(), which calls epoll_init() which calls epoll_create1() which returns 4 as the epoll fd.  It then sets up the callbacks and finally calls event_dispatch().

5. The mount request comes in from the client.  The PUTROOTFH|GETATTR compound causes nfsd to do an upcall to rpc.idmapd so it can get the name strings corresponding to the owner and group owner of the mountpoint.  This triggers the nfdscb callback.

6. rpc.idmapd calls getgrgid_r() which calls sss_nss_mc_getgrgid() which calls sss_nss_mc_get_ctx() which calls sss_nss_check_header().  The result of that check is that the cache needs to be reinitialized (I guess, I'm not all that familiar w/ SSSD) so sss_nss_mc_destroy_ctx() is called, part of which closes the ctx->fd, hence the 2nd backtrace in comment #8. But fd 4 no longer corresponds to /var/lib/sss/mc/group, it's the epoll fd.

7. event_dispatch() calls epoll_wait() with epfd=4.  That fd is gone, so epoll_wait() returns -EBADF, hence the 1st backtrace in comment #8.

8. rpc.idmapd then logs the error and exits with the nonzero status.

Note You need to log in before you can comment on or make changes to this bug.