Bug 1573895

Summary: adding nis to hosts in nsswitch.conf causes segfaults
Product: [Fedora] Fedora Reporter: Thomas Sailer <fedora>
Component: libnsl2 Assignee: Matej Mužila <mmuzila>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high Docs Contact:
Priority: unspecified    
Version: 28 CC: edgar.hoch, eig-kunimoto, fedora, leif, mmuzila
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: libnsl2-1.2.0-2.20180605git4a062cf.fc28
Last Closed: 2018-06-16 20:15:29 UTC Type: Bug

Description Thomas Sailer 2018-05-02 13:07:39 UTC
Description of problem:

Adding nis to hosts: in nsswitch.conf causes the following segfaults.

/etc/nsswitch.conf
hosts:      files mdns4_minimal [NOTFOUND=return] nis dns myhostname

Version-Release number of selected component (if applicable):
3.0-3.fc28

How reproducible:
always

Steps to Reproduce:
1. ping mit.edu

Actual results:
yp_bind_client_create_v3: RPC: Unknown host
yp_bind_client_create_v3: RPC: Unknown host
yp_bind_client_create_v3: RPC: Unknown host
yp_bind_client_create_v3: RPC: Unknown host

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff6c116eb in buffered_vfprintf () from /lib64/libc.so.6

Expected results:
PING mit.edu (2.17.228.35) 56(84) bytes of data.
64 bytes from 2.17.228.35 (2.17.228.35): icmp_seq=1 ttl=53 time=25.7 ms

Additional info:
The first 100 call frames of the backtrace look like this:

#0  0x00007ffff6c116eb in buffered_vfprintf () from /lib64/libc.so.6
#1  0x00007ffff6c0ec82 in vfprintf () from /lib64/libc.so.6
#2  0x00007ffff6cc84f6 in __fprintf_chk () from /lib64/libc.so.6
#3  0x00007fffeef0973c in yp_bind_client_create_v3 () from /lib64/libnsl.so.2
#4  0x00007fffeef09e16 in __yp_bind.part.1 () from /lib64/libnsl.so.2
#5  0x00007fffeef0a18b in do_ypcall () from /lib64/libnsl.so.2
#6  0x00007fffeef0a30d in do_ypcall_tr () from /lib64/libnsl.so.2
#7  0x00007fffeef0ac0c in yp_match () from /lib64/libnsl.so.2
#8  0x00007fffef120d24 in internal_gethostbyname2_r ()
   from /lib64/libnss_nis.so.2
#9  0x00007fffef12132c in _nss_nis_gethostbyname2_r ()
   from /lib64/libnss_nis.so.2
#10 0x00007ffff6ccb8c5 in gethostbyname2_r@@GLIBC_2.2.5 ()
   from /lib64/libc.so.6
#11 0x00007ffff6ca317b in gaih_inet.constprop () from /lib64/libc.so.6
#12 0x00007ffff6ca3da4 in getaddrinfo () from /lib64/libc.so.6
#13 0x00007fffeece574e in getclnthandle () from /lib64/libtirpc.so.3
#14 0x00007fffeece630f in __rpcb_findaddr_timed () from /lib64/libtirpc.so.3
#15 0x00007fffeecde954 in clnt_tp_create_timed () from /lib64/libtirpc.so.3
#16 0x00007fffeecdeb34 in clnt_create_timed () from /lib64/libtirpc.so.3
#17 0x00007fffeef096eb in yp_bind_client_create_v3 () from /lib64/libnsl.so.2
#18 0x00007fffeef098d5 in yp_bind_ypbindprog () from /lib64/libnsl.so.2
#19 0x00007fffeef09e51 in __yp_bind.part.1 () from /lib64/libnsl.so.2
#20 0x00007fffeef0a18b in do_ypcall () from /lib64/libnsl.so.2
#21 0x00007fffeef0a30d in do_ypcall_tr () from /lib64/libnsl.so.2
#22 0x00007fffeef0ac0c in yp_match () from /lib64/libnsl.so.2
#23 0x00007fffef120d24 in internal_gethostbyname2_r ()
   from /lib64/libnss_nis.so.2
#24 0x00007fffef12132c in _nss_nis_gethostbyname2_r ()
   from /lib64/libnss_nis.so.2
#25 0x00007ffff6ccb8c5 in gethostbyname2_r@@GLIBC_2.2.5 ()
   from /lib64/libc.so.6
#26 0x00007ffff6ca317b in gaih_inet.constprop () from /lib64/libc.so.6
#27 0x00007ffff6ca3da4 in getaddrinfo () from /lib64/libc.so.6
#28 0x00007fffeece574e in getclnthandle () from /lib64/libtirpc.so.3
#29 0x00007fffeece630f in __rpcb_findaddr_timed () from /lib64/libtirpc.so.3
#30 0x00007fffeecde954 in clnt_tp_create_timed () from /lib64/libtirpc.so.3
#31 0x00007fffeecdeb34 in clnt_create_timed () from /lib64/libtirpc.so.3
#32 0x00007fffeef096eb in yp_bind_client_create_v3 () from /lib64/libnsl.so.2
#33 0x00007fffeef098d5 in yp_bind_ypbindprog () from /lib64/libnsl.so.2
#34 0x00007fffeef09e51 in __yp_bind.part.1 () from /lib64/libnsl.so.2
#35 0x00007fffeef0a18b in do_ypcall () from /lib64/libnsl.so.2
#36 0x00007fffeef0a30d in do_ypcall_tr () from /lib64/libnsl.so.2
#37 0x00007fffeef0ac0c in yp_match () from /lib64/libnsl.so.2
#38 0x00007fffef120d24 in internal_gethostbyname2_r ()
   from /lib64/libnss_nis.so.2
#39 0x00007fffef12132c in _nss_nis_gethostbyname2_r ()
   from /lib64/libnss_nis.so.2
#40 0x00007ffff6ccb8c5 in gethostbyname2_r@@GLIBC_2.2.5 ()
   from /lib64/libc.so.6
#41 0x00007ffff6ca317b in gaih_inet.constprop () from /lib64/libc.so.6
#42 0x00007ffff6ca3da4 in getaddrinfo () from /lib64/libc.so.6
#43 0x00007fffeece574e in getclnthandle () from /lib64/libtirpc.so.3
#44 0x00007fffeece630f in __rpcb_findaddr_timed () from /lib64/libtirpc.so.3
#45 0x00007fffeecde954 in clnt_tp_create_timed () from /lib64/libtirpc.so.3
#46 0x00007fffeecdeb34 in clnt_create_timed () from /lib64/libtirpc.so.3
#47 0x00007fffeef096eb in yp_bind_client_create_v3 () from /lib64/libnsl.so.2
#48 0x00007fffeef098d5 in yp_bind_ypbindprog () from /lib64/libnsl.so.2
#49 0x00007fffeef09e51 in __yp_bind.part.1 () from /lib64/libnsl.so.2
#50 0x00007fffeef0a18b in do_ypcall () from /lib64/libnsl.so.2
#51 0x00007fffeef0a30d in do_ypcall_tr () from /lib64/libnsl.so.2
#52 0x00007fffeef0ac0c in yp_match () from /lib64/libnsl.so.2
#53 0x00007fffef120d24 in internal_gethostbyname2_r ()
   from /lib64/libnss_nis.so.2
#54 0x00007fffef12132c in _nss_nis_gethostbyname2_r ()
   from /lib64/libnss_nis.so.2
#55 0x00007ffff6ccb8c5 in gethostbyname2_r@@GLIBC_2.2.5 ()
   from /lib64/libc.so.6
#56 0x00007ffff6ca317b in gaih_inet.constprop () from /lib64/libc.so.6
#57 0x00007ffff6ca3da4 in getaddrinfo () from /lib64/libc.so.6
#58 0x00007fffeece574e in getclnthandle () from /lib64/libtirpc.so.3
#59 0x00007fffeece630f in __rpcb_findaddr_timed () from /lib64/libtirpc.so.3
#60 0x00007fffeecde954 in clnt_tp_create_timed () from /lib64/libtirpc.so.3
#61 0x00007fffeecdeb34 in clnt_create_timed () from /lib64/libtirpc.so.3
#62 0x00007fffeef096eb in yp_bind_client_create_v3 () from /lib64/libnsl.so.2
#63 0x00007fffeef098d5 in yp_bind_ypbindprog () from /lib64/libnsl.so.2
#64 0x00007fffeef09e51 in __yp_bind.part.1 () from /lib64/libnsl.so.2
#65 0x00007fffeef0a18b in do_ypcall () from /lib64/libnsl.so.2
#66 0x00007fffeef0a30d in do_ypcall_tr () from /lib64/libnsl.so.2
#67 0x00007fffeef0ac0c in yp_match () from /lib64/libnsl.so.2
#68 0x00007fffef120d24 in internal_gethostbyname2_r ()
   from /lib64/libnss_nis.so.2
#69 0x00007fffef12132c in _nss_nis_gethostbyname2_r ()
   from /lib64/libnss_nis.so.2
#70 0x00007ffff6ccb8c5 in gethostbyname2_r@@GLIBC_2.2.5 ()
   from /lib64/libc.so.6
#71 0x00007ffff6ca317b in gaih_inet.constprop () from /lib64/libc.so.6
#72 0x00007ffff6ca3da4 in getaddrinfo () from /lib64/libc.so.6
#73 0x00007fffeece574e in getclnthandle () from /lib64/libtirpc.so.3
#74 0x00007fffeece630f in __rpcb_findaddr_timed () from /lib64/libtirpc.so.3
#75 0x00007fffeecde954 in clnt_tp_create_timed () from /lib64/libtirpc.so.3
#76 0x00007fffeecdeb34 in clnt_create_timed () from /lib64/libtirpc.so.3
#77 0x00007fffeef096eb in yp_bind_client_create_v3 () from /lib64/libnsl.so.2
#78 0x00007fffeef098d5 in yp_bind_ypbindprog () from /lib64/libnsl.so.2
#79 0x00007fffeef09e51 in __yp_bind.part.1 () from /lib64/libnsl.so.2
#80 0x00007fffeef0a18b in do_ypcall () from /lib64/libnsl.so.2
#81 0x00007fffeef0a30d in do_ypcall_tr () from /lib64/libnsl.so.2
#82 0x00007fffeef0ac0c in yp_match () from /lib64/libnsl.so.2
#83 0x00007fffef120d24 in internal_gethostbyname2_r ()
   from /lib64/libnss_nis.so.2
#84 0x00007fffef12132c in _nss_nis_gethostbyname2_r ()
   from /lib64/libnss_nis.so.2
#85 0x00007ffff6ccb8c5 in gethostbyname2_r@@GLIBC_2.2.5 ()
   from /lib64/libc.so.6
#86 0x00007ffff6ca317b in gaih_inet.constprop () from /lib64/libc.so.6
#87 0x00007ffff6ca3da4 in getaddrinfo () from /lib64/libc.so.6
#88 0x00007fffeece574e in getclnthandle () from /lib64/libtirpc.so.3
#89 0x00007fffeece630f in __rpcb_findaddr_timed () from /lib64/libtirpc.so.3
#90 0x00007fffeecde954 in clnt_tp_create_timed () from /lib64/libtirpc.so.3
#91 0x00007fffeecdeb34 in clnt_create_timed () from /lib64/libtirpc.so.3
#92 0x00007fffeef096eb in yp_bind_client_create_v3 () from /lib64/libnsl.so.2
#93 0x00007fffeef098d5 in yp_bind_ypbindprog () from /lib64/libnsl.so.2
#94 0x00007fffeef09e51 in __yp_bind.part.1 () from /lib64/libnsl.so.2
#95 0x00007fffeef0a18b in do_ypcall () from /lib64/libnsl.so.2
#96 0x00007fffeef0a30d in do_ypcall_tr () from /lib64/libnsl.so.2
#97 0x00007fffeef0ac0c in yp_match () from /lib64/libnsl.so.2
#98 0x00007fffef120d24 in internal_gethostbyname2_r ()
   from /lib64/libnss_nis.so.2
#99 0x00007fffef12132c in _nss_nis_gethostbyname2_r ()
   from /lib64/libnss_nis.so.2
#100 0x00007ffff6ccb8c5 in gethostbyname2_r@@GLIBC_2.2.5 ()
   from /lib64/libc.so.6

In this state, the system does not boot properly, because many daemons that use the network crash in a similar way.

Comment 1 Leif Hedstrom 2018-05-15 03:12:04 UTC
I'm seeing something similar, and the net effect for me is that ssh logins to the host tend to be very slow. If I turn off PAM in sshd_config, logins are fast again.

I don't know if the crasher is the reason for this slowdown, but I do see it fairly often.

#0  0x00007fb5c356f9b6 in pthread_sigmask () from /lib64/libpthread.so.0
#1  0x00007fb5bef0bb7b in clnt_vc_create () from /lib64/libtirpc.so.3
#2  0x00007fb5bef0989a in clnt_tli_create () from /lib64/libtirpc.so.3
#3  0x00007fb5bef107c5 in getclnthandle () from /lib64/libtirpc.so.3
#4  0x00007fb5bef1130f in __rpcb_findaddr_timed () from /lib64/libtirpc.so.3
#5  0x00007fb5bef09954 in clnt_tp_create_timed () from /lib64/libtirpc.so.3
#6  0x00007fb5bef09b34 in clnt_create_timed () from /lib64/libtirpc.so.3
#7  0x00007fb5bf13480d in yp_bind_ypbindprog () from /lib64/libnsl.so.2
#8  0x00007fb5bf134e51 in __yp_bind.part.1 () from /lib64/libnsl.so.2
#9  0x00007fb5bf13518b in do_ypcall () from /lib64/libnsl.so.2
#10 0x00007fb5bf13530d in do_ypcall_tr () from /lib64/libnsl.so.2
#11 0x00007fb5bf135c0c in yp_match () from /lib64/libnsl.so.2
#12 0x00007fb5bf34fb9e in _nss_nis_getpwuid_r () from /lib64/libnss_nis.so.2
#13 0x00007fb5c3267f35 in getpwuid_r@@GLIBC_2.2.5 () from /lib64/libc.so.6
#14 0x00007fb5c32676b8 in getpwuid () from /lib64/libc.so.6
#15 0x0000555b48ab4a20 in ?? ()
#16 0x0000555b48ac2ba3 in ?? ()
#17 0x00007fb5c3cae6db in ?? () from /usr/lib/systemd/libsystemd-shared-238.so
#18 0x00007fb5c3cafcc6 in bus_process_object () from /usr/lib/systemd/libsystemd-shared-238.so
#19 0x00007fb5c3cbe3af in ?? () from /usr/lib/systemd/libsystemd-shared-238.so
#20 0x00007fb5c3cbfa5c in ?? () from /usr/lib/systemd/libsystemd-shared-238.so
#21 0x00007fb5c3ce4cb8 in ?? () from /usr/lib/systemd/libsystemd-shared-238.so
#22 0x00007fb5c3ce648c in sd_event_dispatch () from /usr/lib/systemd/libsystemd-shared-238.so
#23 0x00007fb5c3ce6630 in sd_event_run () from /usr/lib/systemd/libsystemd-shared-238.so
#24 0x0000555b48ab201d in ?? ()
#25 0x00007fb5c31c51bb in __libc_start_main () from /lib64/libc.so.6
#26 0x0000555b48ab36aa in ?? ()

Comment 2 Leif Hedstrom 2018-05-15 03:14:33 UTC
My issue (related to logins with accounts in NIS) started with the upgrade to F28. Things still work, it's just very slow to log in to the host. And I do get that crash in systemd-login.

Comment 3 Matej Mužila 2018-05-22 08:56:28 UTC
Hi,

it seems that this issue occurs when the NIS server is not available.

Could you please try to run "ypcat hosts.byname"?

Comment 4 Thomas Sailer 2018-05-22 09:13:29 UTC
It was and is available

Comment 5 Leif Hedstrom 2018-05-22 13:50:36 UTC
Yes, same here (first thing I checked :-).

Comment 6 Matej Mužila 2018-05-22 19:23:56 UTC
I was not able to reproduce this issue. I used:

  server (test 1, ypserv running on F28):
    ypserv.x86_64    4.0-6.20170331git5bfba76.fc28

  server (test 2, ypserv running on F26):
    ypserv.x86_64    2.32.1-8.fc26

  client (F28):
    yp-tools.x86_64    4.2.2-7.fc28
    ypbind.x86_64      3:2.4-8.fc28
    nss_nis.x86_64     3.0-3.fc28
    libnsl2.x86_64     1.2.0-1.fc28

Assuming "ypcat hosts" works on the client, everything works well for me.

The problem occurs when the IP address of the NIS server is not found by any service that precedes NIS in /etc/nsswitch.conf. In that case tirpc tries to find the IP address of the *working* NIS server using NIS itself.
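
The loop described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: the struct, the function names and MAX_DEPTH are hypothetical and do not correspond to real glibc, libtirpc or libnsl code. It models a hosts lookup order of "files nis", where the "nis" step must first resolve the server name from yp.conf.

```c
#include <string.h>

/* Toy model only: none of these names exist in glibc, libtirpc or libnsl. */
#define MAX_DEPTH 8   /* stand-in for "the stack eventually overflows" */

struct model {
    const char *nis_server;   /* server as written in yp.conf */
    int server_in_files;      /* 1 if /etc/hosts lists the NIS server */
    int server_is_numeric;    /* 1 if yp.conf holds an IP address */
};

/* Returns the nesting depth reached, or -1 once MAX_DEPTH is exceeded. */
static int lookup(const struct model *m, const char *name, int depth);

/* Models a NIS query: binding to the server needs its address first. */
static int nis_query(const struct model *m, int depth)
{
    if (m->server_is_numeric)
        return depth;                          /* no name lookup needed */
    return lookup(m, m->nis_server, depth + 1); /* resolve the server name */
}

/* Models "hosts: files nis": try files first, then fall through to NIS. */
static int lookup(const struct model *m, const char *name, int depth)
{
    if (depth > MAX_DEPTH)
        return -1;                             /* runaway recursion */
    if (m->server_in_files && strcmp(name, m->nis_server) == 0)
        return depth;                          /* answered by "files" */
    return nis_query(m, depth);
}
```

In this model, a server given by name and missing from /etc/hosts never terminates on its own, while listing the server in /etc/hosts or giving it as an IP address ends the lookup after at most one nested call.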


Could you please specify which versions of ypserv, ypbind, yp-tools, nss_nis and libnsl2 you use on the server and on the client?

It would also be helpful if you could provide the nsswitch.conf and yp.conf used on the client.

Comment 7 Leif Hedstrom 2018-05-22 20:28:26 UTC
On the YP client side, it's running the latest F28 packages. On the server side, I have

ypserv-2.32.1-7.fc24.armv7hl


On the F28 client, I have

domain ogre.com server 192.168.201.19


nsswitch is:

passwd:      files nis
shadow:      files nis
group:       files nis

#hosts:     db files nisplus nis dns
hosts:      files nis dns myhostname


However, I just did a "yum upgrade" on the F28 box, and I don't think I'm seeing this crasher any more. :-/ I'll keep poking around though, see if it reproduces again.

Comment 8 kunimo 2018-05-28 04:22:41 UTC
Hi.

I ran into the same problem and was able to work around it in my case.

On a NIS client, the core dump occurs if the NIS server is specified in yp.conf by its DNS name (FQDN) rather than by IP address. The reason is that when an application on the NIS client calls getaddrinfo() or gethostbyname(), the NIS lookup in turn calls getaddrinfo() on the NIS server name written in yp.conf. Resolving that name goes through NIS again, which calls getaddrinfo() again, and so on; the recursion repeats indefinitely until the process dumps core.

The problem did not occur in F27; it first appeared in F28.

Workaround 1. Write the server address in /etc/yp.conf in dotted-decimal notation instead of as a name.
   domain <domain> server <NISSERVER.MY.COMPANY>
-> domain <domain> server 10.20.30.40

Workaround 2. In /etc/nsswitch.conf, make sure the NIS server's address is resolved by another source, for example by placing dns before nis, so that it is not looked up through NIS.
   hosts: files nis dns
-> hosts: files dns nis

Workaround 3. Add an entry for the NIS server to /etc/hosts and put files first in the hosts line of nsswitch.conf.
-> 10.20.30.40 <NISSERVER.MY.COMPANY>

In my environment, each of the three workarounds avoided the problem.

Before the workaround:
# getent hosts NISHOSTNAME
Segmentation fault (core dumped)

after:
# getent hosts NISHOSTNAME
10.98.18.56     NISHOSTNAME
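
Workaround 1 can be checked mechanically. The following is a minimal sketch, assuming yp.conf lines of the form "domain <domain> server <host>" or "ypserver <host>"; the helper names is_numeric_address() and yp_conf_server() are made up for this example and are not part of ypbind or libnsl.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Returns 1 if s parses as an IPv4 or IPv6 literal, else 0. */
static int is_numeric_address(const char *s)
{
    unsigned char buf[sizeof(struct in6_addr)];

    return inet_pton(AF_INET, s, buf) == 1 ||
           inet_pton(AF_INET6, s, buf) == 1;
}

/*
 * Extracts the server field from one yp.conf line (hypothetical helper).
 * Returns 1 and fills host on success, 0 for comments and other lines.
 */
static int yp_conf_server(const char *line, char *host, size_t hostlen)
{
    char a[256], b[256], c[256], d[256];
    const char *found = NULL;
    int n = sscanf(line, "%255s %255s %255s %255s", a, b, c, d);

    if (n >= 1 && a[0] == '#')
        return 0;                               /* comment line */
    if (n >= 4 && strcmp(a, "domain") == 0 && strcmp(c, "server") == 0)
        found = d;                              /* domain X server HOST */
    else if (n >= 2 && strcmp(a, "ypserver") == 0)
        found = b;                              /* ypserver HOST */
    if (found == NULL || strlen(found) >= hostlen)
        return 0;
    strcpy(host, found);
    return 1;
}
```

A caller could read /etc/yp.conf line by line with fgets(), pass each line to yp_conf_server(), and warn whenever is_numeric_address() returns 0 for the extracted host.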

Comment 9 Matej Mužila 2018-05-28 10:00:02 UTC
(In reply to kunimo from comment #8)
> Hi.
> 
> I was on the same problem, I avoided it in my case.
> 
> In NIS Client, core dump if NIS Server's IP address is set to yp.conf with
> DNS / FQDN. The reason is that when NIS Client's application getaddrinfo ()
> or gethostbyname (), getaddrinfo () it using the name of NIS Server written
> in yp.conf. Then in order to find NIS Server getaddrinfo () ... core dump it
> repeatedly indefinitely.


Hi, yes, you are right.

This version of NIS supports the ypbind v3 protocol. The v3 binding (stored in /var/yp/<domain>.3) has the form:

    struct ypbind3_binding {
      struct netconfig *ypbind_nconf;
      struct netbuf *ypbind_svcaddr;
      char *ypbind_servername;
      /* that's the highest version number that the used
         ypserv supports, normally YPVERS */
      rpcvers_t ypbind_hi_vers;
      /* the lowest version number that the used
         ypserv supports, on Solaris 0 or YPVERS, too */
      rpcvers_t ypbind_lo_vers;
    };

This allows the address of the NIS server to be stored as its domain name, which is what causes the problem.

The Linux-NIS documentation[1] says: "Make sure that you don't use "nis" or "nis6" for the "hosts" entry!". I am working on a patch that would make NIS use a ypbind protocol version lower than 3 when resolving hosts; however, I have to discuss it with upstream first.


[1] http://www.linux-nis.org/nis-ipv6/



PS: Don't be confused by the documentation: Fedora does not ship a libnss_nis6.so plugin. It ships libnss_nis.so, so /etc/nsswitch.conf should contain "nis", not "nis6".
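
The documentation's advice about the "hosts" entry can be checked with a few lines of code. This is a sketch only; hosts_line_uses_nis() is a hypothetical helper written for this comment, not an existing glibc or NIS function. It looks at a single nsswitch.conf line and reports whether it is a "hosts:" entry that lists the "nis" source.

```c
#include <string.h>

/*
 * Returns 1 if the given nsswitch.conf line is a "hosts:" entry that
 * lists the "nis" source, 0 otherwise. Commented-out lines, other
 * databases, and tokens such as "nisplus" or "[NOTFOUND=return]"
 * do not match.
 */
static int hosts_line_uses_nis(const char *line)
{
    static const char ws[] = " \t\n";
    const char *p = line + strspn(line, ws);   /* skip leading blanks */
    size_t len = strcspn(p, ws);               /* length of first token */

    if (len != 6 || strncmp(p, "hosts:", 6) != 0)
        return 0;
    for (p += len; ; p += len) {
        p += strspn(p, ws);                    /* move to next token */
        if (*p == '\0' || *p == '#')
            return 0;                          /* end of line or comment */
        len = strcspn(p, ws);
        if (len == 3 && strncmp(p, "nis", 3) == 0)
            return 1;
    }
}
```

Applied to the lines quoted in this bug, the hosts entry from the description matches, while the commented-out "#hosts:" line from comment 7 does not.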

Comment 10 Matej Mužila 2018-05-28 11:43:21 UTC
I think this is a problem in the libnsl2 package (not nss_nis). I'm moving this bug to libnsl2.

Comment 11 Matej Mužila 2018-06-05 14:57:13 UTC
According to upstream, workaround 1 in comment #8 is not a workaround but the real fix: if hostnames are resolved by NIS, /etc/yp.conf must not contain hostnames.


Thorsten Kukuk, author and upstream maintainer of Linux-NIS, created a patch[1] that detects a recursive lock between yp_all() and do_ypcall(). It seems to work properly (the lookup ends with an error instead of an endless loop / segfault). I'm submitting it to the testing repo.



[1] https://github.com/thkukuk/libnsl/commit/549f7b986a448df802d63fc8ca2bbc85049c819b
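
For readers unfamiliar with the general technique, the sketch below shows a generic per-thread re-entrancy guard that turns a recursive call into an error return. It is not the upstream patch and is not taken from libnsl; the names guarded_lookup(), lookup_in_progress and the LOOKUP_* codes are assumptions made for illustration, and the real patch may detect the recursion differently.

```c
#include <stddef.h>

/* Hypothetical stand-ins; not the real libnsl symbols or the actual patch. */
enum { LOOKUP_OK = 0, LOOKUP_RECURSION = -1 };

/* One flag per thread: nonzero while a lookup runs on this thread. */
static _Thread_local int lookup_in_progress;

typedef int (*lookup_fn)(const char *name);

/* Runs fn(name) unless this thread is already inside guarded_lookup(). */
static int guarded_lookup(lookup_fn fn, const char *name)
{
    int rc;

    if (lookup_in_progress)
        return LOOKUP_RECURSION;   /* fail instead of recursing again */
    lookup_in_progress = 1;
    rc = fn(name);
    lookup_in_progress = 0;        /* allow later, unrelated lookups */
    return rc;
}

/* Callback that re-enters the guard, like NIS resolving its own server. */
static int resolve_via_self(const char *name)
{
    return guarded_lookup(resolve_via_self, name);
}

/* Callback that completes without any nested lookup. */
static int resolve_directly(const char *name)
{
    (void)name;
    return LOOKUP_OK;
}
```

With this guard, guarded_lookup(resolve_via_self, ...) returns LOOKUP_RECURSION after one nested attempt instead of exhausting the stack, which matches the behaviour reported above (an error rather than a segfault).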

Comment 12 Fedora Update System 2018-06-05 14:57:52 UTC
libnsl2-1.2.0-2.20180605git4a062cf.fc28 has been submitted as an update to Fedora 28. https://bodhi.fedoraproject.org/updates/FEDORA-2018-b1a37887b9

Comment 13 Fedora Update System 2018-06-06 15:01:49 UTC
libnsl2-1.2.0-2.20180605git4a062cf.fc28 has been pushed to the Fedora 28 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-b1a37887b9

Comment 14 Fedora Update System 2018-06-16 20:15:29 UTC
libnsl2-1.2.0-2.20180605git4a062cf.fc28 has been pushed to the Fedora 28 stable repository. If problems still persist, please make note of it in this bug report.