librt and libpthread both contain IFUNC resolvers which have relocation dependencies: librt for clock_gettime and the other clock_* functions, libpthread for vfork.

The vfork forwarder was removed in this upstream commit:

commit 41d6f74e6cb6a92ab428c11ee1e408b2a16aa1b0
Author: Florian Weimer <fweimer>
Date:   Tue Jul 2 15:12:20 2019 +0200

    nptl: Remove vfork IFUNC-based forwarder from libpthread [BZ #20188]

    With commit f0b2132b35248c1f4a80f62a2c38cddcc802aa8c ("ld.so: Support
    moving versioned symbols between sonames [BZ #24741]"), the dynamic
    linker will find the definition of vfork in libc and binds a vfork
    reference to that symbol, even if the soname in the version reference
    says that the symbol should be located in libpthread.

    As a result, the forwarder (whether it's IFUNC-based or a duplicate
    of the libc implementation) is no longer necessary.  On older
    architectures, a placeholder symbol is required, to make sure that
    the GLIBC_2.1.2 symbol version does not go away, or is turned in to
    a weak symbol definition by the link editor.  (The symbol version
    needs to preserved so that the symbol coverage check in
    elf/dl-version.c does not fail for old binaries.)

    mips32 is an outlier: It defined __vfork@@GLIBC_2.2, but the
    baseline is GLIBC_2.0.  Since there are other @@GLIBC_2.2 symbols,
    the placeholder symbol is not needed there.

Removal of the librt forwarders is still pending upstream review.
librt patch review thread:

https://sourceware.org/ml/libc-alpha/2019-08/msg00732.html
https://sourceware.org/ml/libc-alpha/2019-09/msg00039.html
The upstream patch was committed last week. It has been backported into Fedora 30; Fedora 29 and Fedora 31 updates are pending.
My working theory is this: nss_winbind fails to load and leaves an unrelocated mapped librt.so.1 behind because it is marked NODELETE. The subsequent loads of libmount.so.1 and libnss_systemd.so.2 see the unrelocated librt.so.1 and issue the “Relink `/lib64/libmount.so.1' with `/lib64/librt.so.1' for IFUNC symbol `clock_gettime'” error.

I think this fits all the facts, including that libmount.so.1 and libnss_systemd.so.2 are in fact linked against librt.so.1. But it is a conjecture at this point.

If this is true, the crashes will likely go away only after the Samba dependencies are changed so that samba-winbind-modules and samba-winbind get updated sooner in the transaction, or after we find a way to unload unrelocated NODELETE modules.
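The linkage claim above (libmount.so.1 and libnss_systemd.so.2 depending on librt.so.1) can be checked from the dynamic section. The following is a minimal sketch, assuming binutils readelf is installed; the helper names has_needed_entry and needs_soname are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Sketch only: helper names are made up for illustration.

# has_needed_entry SONAME
# Reads `readelf -d` output on stdin and succeeds if SONAME appears in a
# DT_NEEDED entry, i.e. a line such as:
#  0x0000000000000001 (NEEDED)   Shared library: [librt.so.1]
has_needed_entry() {
    grep -F '(NEEDED)' | grep -qF "[$1]"
}

# needs_soname FILE SONAME
# Succeeds if the shared object FILE lists SONAME as a direct dependency.
needs_soname() {
    readelf -d "$1" 2>/dev/null | has_needed_entry "$2"
}
```

For example, `needs_soname /lib64/libmount.so.1 librt.so.1 && echo yes` would report whether the dependency is present on a given system. This only inspects direct DT_NEEDED entries, not transitive dependencies.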
Reproducer without Samba:

(1) Build a non-loadable NSS module which links against librt.so.1:

gcc -shared -o linkmod.so -Wl,--soname=doesnotexist-bz1748197.so
gcc -shared -o /lib64/libnss_faulty.so.2 -Wl,--no-as-needed -lrt ./linkmod.so

(2) Edit /etc/nsswitch.conf to include it:

passwd: faulty sss files systemd

(3) Trigger the bug:

# getent passwd does-not-exist
getent: Relink `/lib64/libmount.so.1' with `/lib64/librt.so.1' for IFUNC symbol `clock_gettime'
getent: Relink `/lib64/libnss_systemd.so.2' with `/lib64/librt.so.1' for IFUNC symbol `clock_gettime'
Segmentation fault (core dumped)
#

This is obviously a synthetic test case, so it is not guaranteed to match the Samba update scenario.

If this is indeed the trigger, it may be possible to work around this in an in-place upgrade scenario by editing /etc/nsswitch.conf around the RPM update transaction.

Carlos suggested the possibility that if we remove the IFUNC resolver, librt.so.1 gets re-relocated once libmount.so.1 and libnss_systemd.so.2 are loaded and need it. In that case, the fix in this glibc bug would be sufficient. I will try to verify that next.
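The /etc/nsswitch.conf workaround mentioned above could look roughly like the sketch below. It is untested against the real upgrade scenario, and it is an assumption that temporarily dropping the failing service is enough. The function names drop_nss_service and restore_nss_config are hypothetical. The sketch handles plain service names only; an action such as [NOTFOUND=return] following the service would be left behind.

```shell
#!/bin/sh
# Sketch only: function names are made up for illustration.

# drop_nss_service FILE SERVICE
# Remove SERVICE from the passwd: and group: lines of FILE.
# The unmodified file is kept as FILE.orig.
drop_nss_service() {
    file=$1 service=$2
    cp -p "$file" "$file.orig" || return 1
    # Match SERVICE only as a whole blank-delimited word, and rewrite
    # FILE in place (same inode) from the saved copy.
    sed -E "/^(passwd|group):/ s/[[:blank:]]+${service}([[:blank:]]|\$)/\\1/g" \
        "$file.orig" > "$file"
}

# restore_nss_config FILE
# Put back the copy saved by drop_nss_service and remove the backup.
restore_nss_config() {
    cp -p "$1.orig" "$1" && rm -f "$1.orig"
}
```

The intended use would be to call `drop_nss_service /etc/nsswitch.conf winbind` before the RPM transaction and `restore_nss_config /etc/nsswitch.conf` afterwards; winbind is only an example service name here.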
I tested the synthetic reproducer on Fedora 30. The IFUNC error message is gone, as expected. However, we still crash in dlopen:

#0  0x0000000000002950 in ?? ()
#1  0x00007f166a0cfe8a in call_init (l=<optimized out>, argc=argc@entry=3, argv=argv@entry=0x7ffe034c74c8, env=env@entry=0x7ffe034c74e8) at dl-init.c:72
#2  0x00007f166a0cff91 in call_init (env=0x7ffe034c74e8, argv=0x7ffe034c74c8, argc=3, l=<optimized out>) at dl-init.c:30
#3  _dl_init (main_map=main_map@entry=0x55ba10d8ea10, argc=3, argv=0x7ffe034c74c8, env=0x7ffe034c74e8) at dl-init.c:119
#4  0x00007f166a0d3eee in dl_open_worker (a=a@entry=0x7ffe034c6f10) at dl-open.c:506
#5  0x00007f166a01f1f9 in __GI__dl_catch_exception (exception=exception@entry=0x7ffe034c6ef0, operate=operate@entry=0x7f166a0d3b00 <dl_open_worker>, args=args@entry=0x7ffe034c6f10) at dl-error-skeleton.c:196
#6  0x00007f166a0d376e in _dl_open (file=0x7ffe034c7190 "libnss_systemd.so.2", mode=-2147483646, caller_dlopen=0x7f166a005ab4 <nss_load_library+356>, nsid=-2, argc=3, argv=<optimized out>, env=0x7ffe034c74e8) at dl-open.c:588

Unfortunately, as can be seen below, librt.so.1 has not been re-relocated, and we crash once we try to execute its ELF constructors:

(gdb) up
#3  _dl_init (main_map=main_map@entry=0x55ba10d8ea10, argc=3, argv=0x7ffe034c74c8, env=0x7ffe034c74e8) at dl-init.c:119
119	      call_init (main_map->l_initfini[i], argc, argv, env);
(gdb) print main_map->l_initfini[0]->l_name
$5 = 0x55ba10d8d210 "/lib64/libnss_systemd.so.2"
(gdb) print main_map->l_initfini[1]->l_name
$6 = 0x55ba10d8d240 "/lib64/librt.so.1"
(gdb) print main_map->l_initfini[1]->l_relocated
$7 = 0

This is another instance of bug 1500128.

This means that fixing this bug is not sufficient to resolve the actual in-place upgrade failure.
nss_systemd is unconditionally added to /etc/nsswitch.conf by a systemd-libs scriptlet:

function mod_nss() {
    if [ -f "$1" ] ; then
        # sed-fu to add myhostanme to hosts line
        grep -E -q '^hosts:.* myhostname' "$1" ||
        sed -i.bak -e '
                /^hosts:/ !b
                /\<myhostname\>/ b
                s/[[:blank:]]*$/ myhostname/
                ' "$1" &>/dev/null || :

        # Add nss-systemd to passwd and group
        grep -E -q '^(passwd|group):.* systemd' "$1" ||
        sed -i.bak -r -e '
                s/^(passwd|group):(.*)/\1: \2 systemd/
                ' "$1" &>/dev/null || :
    fi
}

FILE="$(readlink /etc/nsswitch.conf || echo /etc/nsswitch.conf)"
mod_nss "$FILE"

This is new in Red Hat Enterprise Linux 8; systemd-libs-219-67.el7.x86_64 does not do this.
We should also backport this upstream commit:

commit b2b3b7598ae51c714b5fd0d0406d435e66f3624b
Author: Adhemerval Zanella <adhemerval.zanella>
Date:   Wed Sep 25 22:10:00 2019 +0000

    Set the expects flags to clock_nanosleep

    It moves the missing CFLAGS from rt/Makefile to time/Makefile missing
    from 7b5af2d8f2a2b (Finish move of clock_* functions to libc. [BZ #24959]).

    Checked on powerpc64le-linux-gnu.

            * rt/Makefile (CFLAGS-clock_nanosleep.c): Move to ...
            * time/Makefile (CFLAGS-clock_nanosleep.c): ... here.
This is the commit which removes the librt IFUNC redirectors:

commit 7b5af2d8f2a2b858319a792678b15a0db08764c7
Author: Zack Weinberg <zackw>
Date:   Wed Sep 4 08:18:57 2019 +0200

    Finish move of clock_* functions to libc. [BZ #24959]

    In glibc 2.17, the functions clock_getcpuclockid, clock_getres,
    clock_gettime, clock_nanosleep, and clock_settime were moved from
    librt.so to libc.so, leaving compatibility stubs behind.  Now that the
    dynamic linker no longer insists on finding versioned symbols in the
    same library that originally defined them, we do not need the stubs
    anymore, and this means we don't need GLIBC_PRIVATE __-prefix aliases
    for most of the functions anymore either.  (clock_gettime still needs
    one.)

    For ports added before 2.17, libc.so needs to provide two symbol
    versions for each, the default at GLIBC_2.17 plus a compat version
    matching what librt had.

    While I'm at it, move the clock_*.c files and their tests from rt/
    to time/.

This commit removes some of the GLIBC_PRIVATE internal aliases (which are not part of the external run-time ABI, so programs that use them are invalid).
Verified, libpthread and librt don't contain indirect functions.
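One way such a check could be done is sketched below. It is an assumption that the verification looked at the dynamic symbol table; the comment above does not say how it was performed. The sketch assumes binutils readelf, and the names list_ifunc_symbols and has_no_ifunc are hypothetical.

```shell
#!/bin/sh
# Sketch only: function names are made up for illustration.

# list_ifunc_symbols
# Reads `readelf --dyn-syms -W` output on stdin and prints the name of
# every symbol whose type column (4th field) is IFUNC.
list_ifunc_symbols() {
    awk '$4 == "IFUNC" { print $8 }'
}

# has_no_ifunc FILE
# Succeeds if the dynamic symbol table of FILE has no IFUNC symbols.
has_no_ifunc() {
    [ -z "$(readelf --dyn-syms -W "$1" | list_ifunc_symbols)" ]
}
```

On a fixed system, `has_no_ifunc /lib64/librt.so.1 && has_no_ifunc /lib64/libpthread.so.0` would be expected to succeed. The check covers only dynamic symbols; it does not look at IRELATIVE relocations or local symbols.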
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: glibc security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:4444