Description of problem:
I reproduced [0] in Fedora Rawhide. I am filing it for documentation purposes and to track it for Fedora, too.

[0] https://bugzilla.redhat.com/show_bug.cgi?id=2181478

Running the command `systemctl reload autofs.service` returns to the prompt without any output (no news is good news, right?). But checking the service status with `systemctl status autofs.service` shows that the main process exited. Instead of the config being reloaded, the status changed from active to failed. To recover from this issue, run `systemctl start autofs.service`.

Version-Release number of selected component (if applicable):
libsss_autofs-2.8.2-4.fc38.x86_64
autofs-5.1.8-9.fc39.x86_64
kernel 6.3.0-0.rc3.20230322gita1effab7a3a3.31.fc39.x86_64

How reproducible:

Steps to Reproduce:
1. dnf in autofs
2. systemctl enable --now autofs
3. systemctl reload autofs.service

Actual results:
---
# systemctl --no-pager status autofs.service
× autofs.service - Automounts filesystems on demand
     Loaded: loaded (/usr/lib/systemd/system/autofs.service; enabled; preset: disabled)
    Drop-In: /usr/lib/systemd/system/service.d
             └─10-timeout-abort.conf
     Active: failed (Result: core-dump) since Fri 2023-03-24 13:36:04 CET; 51min ago
   Duration: 15.555s
    Process: 985 ExecStart=/usr/sbin/automount $OPTIONS --systemd-service --dont-check-daemon (code=dumped, signal=SEGV)
    Process: 999 ExecReload=/usr/bin/kill -HUP $MAINPID (code=exited, status=0/SUCCESS)
   Main PID: 985 (code=dumped, signal=SEGV)
        CPU: 24ms

Mar 24 13:35:48 fedora-rawhide-2023-03-24 systemd[1]: Starting autofs.service - Automounts f…d...
Mar 24 13:35:48 fedora-rawhide-2023-03-24 systemd[1]: Started autofs.service - Automounts fi…and.
Mar 24 13:36:03 fedora-rawhide-2023-03-24 systemd[1]: Reloading autofs.service - Automounts …d...
Mar 24 13:36:03 fedora-rawhide-2023-03-24 systemd[1]: Reloaded autofs.service - Automounts f…and.
Mar 24 13:36:04 fedora-rawhide-2023-03-24 systemd[1]: autofs.service: Main process exited, c…SEGV
Mar 24 13:36:04 fedora-rawhide-2023-03-24 systemd[1]: autofs.service: Failed with result 'co…mp'.
Hint: Some lines were ellipsized, use -l to show in full.
---

Expected results:
Reload should work without taking the service down.

Additional info:
GDB back trace:
---
[root@fedora-rawhide-2023-03-24 coredump]# coredumpctl list /usr/sbin/automount
TIME                        PID UID GID SIG     COREFILE EXE                 SIZE
Fri 2023-03-24 13:36:03 CET 985   0   0 SIGSEGV present  /usr/sbin/automount 217.0K
[root@fedora-rawhide-2023-03-24 coredump]# coredumpctl debug
           PID: 985 (automount)
           UID: 0 (root)
           GID: 0 (root)
        Signal: 11 (SEGV)
     Timestamp: Fri 2023-03-24 13:36:03 CET (55min ago)
  Command Line: /usr/sbin/automount --systemd-service --dont-check-daemon
    Executable: /usr/sbin/automount
 Control Group: /system.slice/autofs.service
          Unit: autofs.service
         Slice: system.slice
       Boot ID: c1f22570133f4c7f9a150526bfc81332
    Machine ID: cb980186c0a447eeb3df7516ce01bb15
      Hostname: fedora-rawhide-2023-03-24
       Storage: /var/lib/systemd/coredump/core.automount.0.c1f22570133f4c7f9a150526bfc81332.985.1679661363000000.zst (present)
  Size on Disk: 217.0K
       Message: Process 985 (automount) of user 0 dumped core.
                Module libpcre2-8.so.0 from rpm pcre2-10.42-1.fc38.1.x86_64
                Module libselinux.so.1 from rpm libselinux-3.5-1.fc39.x86_64
                Module libcrypto.so.3 from rpm openssl-3.0.8-2.fc39.x86_64
                Module libkeyutils.so.1 from rpm keyutils-1.6.1-6.fc38.x86_64
                Module libkrb5support.so.0 from rpm krb5-1.20.1-9.fc39.x86_64
                Module libz.so.1 from rpm zlib-1.2.13-3.fc38.x86_64
                Module liblz4.so.1 from rpm lz4-1.9.4-2.fc38.x86_64
                Module libzstd.so.1 from rpm zstd-1.5.4-1.fc39.x86_64
                Module liblzma.so.5 from rpm xz-5.4.1-1.fc38.x86_64
                Module libcap.so.2 from rpm libcap-2.48-6.fc38.x86_64
                Module libcom_err.so.2 from rpm e2fsprogs-1.46.5-4.fc38.x86_64
                Module libk5crypto.so.3 from rpm krb5-1.20.1-9.fc39.x86_64
                Module libkrb5.so.3 from rpm krb5-1.20.1-9.fc39.x86_64
                Module libgssapi_krb5.so.2 from rpm krb5-1.20.1-9.fc39.x86_64
                Module libxml2.so.2 from rpm libxml2-2.10.3-3.fc38.x86_64
                Module libsystemd.so.0 from rpm systemd-253.1-4.fc39.x86_64
                Module libtirpc.so.3 from rpm libtirpc-1.3.3-1.fc38.x86_64

                Stack trace of thread 1001:
                #0  0x00007f1dd305a530 n/a (n/a + 0x0)
                #1  0x00007f1dd357807a start_thread (libc.so.6 + 0x8d07a)
                #2  0x00007f1dd35fe78c __clone3 (libc.so.6 + 0x11378c)

                Stack trace of thread 985:
                #0  0x00007f1dd35298a8 __sigtimedwait (libc.so.6 + 0x3e8a8)
                #1  0x00007f1dd3528f74 sigwait (libc.so.6 + 0x3df74)
                #2  0x00005614c13b7a83 main (automount + 0xba83)
                #3  0x00007f1dd3512b4a __libc_start_call_main (libc.so.6 + 0x27b4a)
                #4  0x00007f1dd3512c0b __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x27c0b)
                #5  0x00005614c13b82b5 _start (automount + 0xc2b5)

                Stack trace of thread 988:
                #0  0x00007f1dd3574ad9 __futex_abstimed_wait_common (libc.so.6 + 0x89ad9)
                #1  0x00007f1dd3577479 pthread_cond_wait@@GLIBC_2.3.2 (libc.so.6 + 0x8c479)
                #2  0x00005614c13ca643 st_queue_handler (automount + 0x1e643)
                #3  0x00007f1dd3578207 start_thread (libc.so.6 + 0x8d207)
                #4  0x00007f1dd35fe78c __clone3 (libc.so.6 + 0x11378c)

                Stack trace of thread 987:
                #0  0x00007f1dd3574ad9 __futex_abstimed_wait_common (libc.so.6 + 0x89ad9)
                #1  0x00007f1dd3577479 pthread_cond_wait@@GLIBC_2.3.2 (libc.so.6 + 0x8c479)
                #2  0x00007f1dd3702114 alarm_handler (libautofs.so + 0x13114)
                #3  0x00007f1dd3578207 start_thread (libc.so.6 + 0x8d207)
                #4  0x00007f1dd35fe78c __clone3 (libc.so.6 + 0x11378c)

                Stack trace of thread 991:
                #0  0x00007f1dd35f113d __poll (libc.so.6 + 0x10613d)
                #1  0x00005614c13c0950 handle_packet (automount + 0x14950)
                #2  0x00005614c13c1b35 handle_mounts (automount + 0x15b35)
                #3  0x00007f1dd3578207 start_thread (libc.so.6 + 0x8d207)
                #4  0x00007f1dd35fe78c __clone3 (libc.so.6 + 0x11378c)

                Stack trace of thread 1003:
                #0  0x00007f1dd356bac6 _IO_getc (libc.so.6 + 0x80ac6)
                #1  0x00007f1dd3088b48 read_one (lookup_file.so + 0x2b48)
                #2  0x00007f1dd3089d4e lookup_read_map (lookup_file.so + 0x3d4e)
                #3  0x00005614c13ccf97 do_read_map (automount + 0x20f97)
                #4  0x00005614c13cd153 read_file_source_instance (automount + 0x21153)
                #5  0x00005614c13cd738 lookup_nss_read_map (automount + 0x21738)
                #6  0x00005614c13cf5c9 do_readmap (automount + 0x235c9)
                #7  0x00007f1dd3578207 start_thread (libc.so.6 + 0x8d207)
                #8  0x00007f1dd35fe78c __clone3 (libc.so.6 + 0x11378c)

                Stack trace of thread 994:
                #0  0x00007f1dd35f113d __poll (libc.so.6 + 0x10613d)
                #1  0x00005614c13c0950 handle_packet (automount + 0x14950)
                #2  0x00005614c13c1b35 handle_mounts (automount + 0x15b35)
                #3  0x00007f1dd3578207 start_thread (libc.so.6 + 0x8d207)
                #4  0x00007f1dd35fe78c __clone3 (libc.so.6 + 0x11378c)
                ELF object binary architecture: AMD x86-64

GNU gdb (GDB) Fedora Linux 13.1-1.fc39
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/sbin/automount...

This GDB supports auto-downloading debuginfo from the following URLs:
  <https://debuginfod.fedoraproject.org/>
Enable debuginfod for this session? (y or [n]) y
Debuginfod has been enabled.
To make this setting permanent, add 'set debuginfod enabled on' to .gdbinit.
Reading symbols from /root/.cache/debuginfod_client/2c6e4764416e6d039de28a5b2f73b5471a59285e/debuginfo...
[New LWP 1001]
[New LWP 985]
[New LWP 988]
[New LWP 987]
[New LWP 991]
[New LWP 1003]
[New LWP 994]
--Type <RET> for more, q to quit, c to continue without paging--c
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/sbin/automount --systemd-service --dont-check-daemon'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f1dd305a530 in ?? ()
[Current thread is 1 (Thread 0x7f1dd0bfb6c0 (LWP 1001))]
(gdb) bt
#0  0x00007f1dd305a530 in ?? ()
#1  0x00007f1dd3575100 in __GI___nptl_deallocate_tsd () at nptl_deallocate_tsd.c:73
#2  __GI___nptl_deallocate_tsd () at nptl_deallocate_tsd.c:22
#3  0x00007f1dd357807a in start_thread (arg=<optimized out>) at pthread_create.c:455
#4  0x00007f1dd35fe78c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78
---
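
The reproduction steps in this report can also be checked from a script. The following is only a minimal sketch in Python: the helper names `systemctl` and `unit_survives_reload`, the injectable `run` parameter, and the two-second settle delay are assumptions made for illustration (the delay is guessed from the one-second gap between "Reloaded" and "Main process exited" in the journal above). Only the `systemctl reload` and `systemctl is-active` subcommands themselves are real.

```python
import subprocess
import time


def systemctl(*args):
    """Run systemctl with the given arguments and return its stripped stdout."""
    result = subprocess.run(
        ["systemctl", *args], capture_output=True, text=True, check=False
    )
    return result.stdout.strip()


def unit_survives_reload(unit, run=systemctl, settle_seconds=2.0):
    """Reload `unit`, wait briefly, then report whether it is still active.

    `run` is a hypothetical hook so the check can be exercised without
    touching a real system; by default it calls the real systemctl.
    The reload itself returns success even on an affected system, so the
    state has to be queried again after a short delay.
    """
    run("reload", unit)
    time.sleep(settle_seconds)
    # `systemctl is-active` prints the unit state, e.g. "active" or "failed".
    return run("is-active", unit) == "active"
```

Run as root, `unit_survives_reload("autofs.service")` would be expected to return False on a system showing the behaviour described above, and True once the reload no longer takes the daemon down.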
(In reply to Joerg from comment #0)

snip ...

> --Type <RET> for more, q to quit, c to continue without paging--c
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> Core was generated by `/usr/sbin/automount --systemd-service --dont-check-daemon'.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  0x00007f1dd305a530 in ?? ()
> [Current thread is 1 (Thread 0x7f1dd0bfb6c0 (LWP 1001))]
> (gdb) bt
> #0  0x00007f1dd305a530 in ?? ()
> #1  0x00007f1dd3575100 in __GI___nptl_deallocate_tsd () at nptl_deallocate_tsd.c:73
> #2  __GI___nptl_deallocate_tsd () at nptl_deallocate_tsd.c:22
> #3  0x00007f1dd357807a in start_thread (arg=<optimized out>) at pthread_create.c:455
> #4  0x00007f1dd35fe78c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78
> ---

Using the Python script that Florian supplied in bug 2162939, I get:

[0:1] 0x562023250470 <key_thread_stdenv_vars_destroy> (None)
[1:1] 0x7fa0b2d59a20 <__GI___libc_free> (/lib64/libc.so.6)
[2:1] 0x7fa0b2fa0ef0 <xmlFreeGlobalState> (/lib64/libxml2.so.2)
[3:1] 0x7fa0b2844530 (None)
[Inferior 1 (process 9674) detached]

I matched the crash to the TSD key 0x7fa0b2844530.

I suspected this was an sssd TSD key, so I changed "automount: files sss" to "automount: files" in /etc/nsswitch.conf, and the crash went away.

I then downloaded the Rawhide sssd source RPM and compared it to the RHEL RPM, where this has been fixed. I found that the patch that fixes the problem is not present in the Rawhide RPM.

Alexey, could you update sssd in Rawhide please?
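
The matching step described in this comment can be illustrated with a short Python sketch. It is not Florian's script and not necessarily how the match was actually made: it only parses dump lines in the shape shown above and flags destructor addresses that have no resolved symbol and share the low 12 bits with the faulting address from comment #0. The two dumps come from different processes, so only the page offset (which address randomization leaves intact) is comparable; the 4 KiB page size, the names `parse_tsd_dump` and `suspect_destructors`, and the field names are all assumptions. The meaning of the two bracketed integers is not stated in this report, so they are kept as opaque fields.

```python
import re

# One dump entry: "[a:b] address <symbol> (module)"; the <symbol> part is optional.
ENTRY_RE = re.compile(
    r"\[(\d+):(\d+)\]\s+(0x[0-9a-fA-F]+)(?:\s+<([^>]+)>)?\s+\(([^)]*)\)"
)

PAGE_MASK = 0xFFF  # low 12 bits, assuming 4 KiB pages

# Dump lines copied from this comment.
DUMP = """\
[0:1] 0x562023250470 <key_thread_stdenv_vars_destroy> (None)
[1:1] 0x7fa0b2d59a20 <__GI___libc_free> (/lib64/libc.so.6)
[2:1] 0x7fa0b2fa0ef0 <xmlFreeGlobalState> (/lib64/libxml2.so.2)
[3:1] 0x7fa0b2844530 (None)
"""

# Frame #0 of the crashing thread in comment #0.
FAULT_ADDRESS = 0x00007F1DD305A530


def parse_tsd_dump(text):
    """Return one dict per dump entry found in `text`."""
    entries = []
    for match in ENTRY_RE.finditer(text):
        first, second, address, symbol, module = match.groups()
        entries.append({
            "first": int(first),
            "second": int(second),
            "address": int(address, 16),
            "symbol": symbol,  # None when no symbol was resolved
            "module": None if module == "None" else module,
        })
    return entries


def suspect_destructors(entries, fault_address):
    """Entries with no resolved symbol whose page offset matches the fault."""
    return [
        entry for entry in entries
        if entry["symbol"] is None
        and (entry["address"] & PAGE_MASK) == (fault_address & PAGE_MASK)
    ]
```

Applied to the dump above, `suspect_destructors(parse_tsd_dump(DUMP), FAULT_ADDRESS)` leaves a single candidate, the entry at 0x7fa0b2844530. A matching page offset is only a hint, not proof that the two addresses refer to the same function.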
Hi,

I confirm that changing "automount: files sss" to "automount: files" in /etc/nsswitch.conf works around the problem.

All the best,
Joerg
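
For illustration, the workaround confirmed here could be applied to the text of an nsswitch.conf file roughly as follows. This is a sketch in Python under stated assumptions: the function name `drop_sss_from_automount` is hypothetical, the function works on a string rather than on /etc/nsswitch.conf itself, and it does not handle action items such as `[NOTFOUND=return]` that might follow the `sss` source.

```python
import re

# Matches an active (uncommented) "automount:" line and captures its sources.
AUTOMOUNT_RE = re.compile(r"^(\s*automount:\s*)(.*)$")


def drop_sss_from_automount(nsswitch_text):
    """Return `nsswitch_text` with the `sss` source removed from automount."""
    out = []
    for line in nsswitch_text.splitlines():
        match = AUTOMOUNT_RE.match(line)
        if match:
            prefix, rest = match.groups()
            # Keep every source except sss, preserving their order.
            sources = [word for word in rest.split() if word != "sss"]
            line = prefix + " ".join(sources)
        out.append(line)
    trailing_newline = "\n" if nsswitch_text.endswith("\n") else ""
    return "\n".join(out) + trailing_newline
```

Other databases are left untouched, so a line such as `passwd: files sss` keeps its `sss` source. Writing the result back to the real file, and keeping a backup of the original, is left to the caller.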
Hi,

(In reply to Ian Kent from comment #1)
>
> Alexey could you update sssd in rawhide please?

how urgent is this?

We plan to rebase SSSD on sssd-2.9 (where this bug is fixed) in a couple of weeks...
Could it wait until then or is it more pressing?
(In reply to Alexey Tikhonov from comment #3)
> Hi,
>
> (In reply to Ian Kent from comment #1)
> >
> > Alexey could you update sssd in rawhide please?
>
> how urgent is this?
>
> We plan to rebase SSSD on sssd-2.9 (where this bug is fixed) in a couple of
> weeks...
> Could it wait until then or is it more pressing?

I think that would be OK with Joerg, but I believe he needs autofs to be functional in F37. Which sssd NVR was this introduced in?
(In reply to Ian Kent from comment #4)
> (In reply to Alexey Tikhonov from comment #3)
> > Hi,
> >
> > (In reply to Ian Kent from comment #1)
> > >
> > > Alexey could you update sssd in rawhide please?
> >
> > how urgent is this?
> >
> > We plan to rebase SSSD on sssd-2.9 (where this bug is fixed) in a couple of
> > weeks...
> > Could it wait until then or is it more pressing?
>
> I think that would be ok with Joerg but I believe he needs autofs to be
> functional in F37 so which sss NVR was this introduced in?

F37 should also be affected.
We will address all supported versions at the same time.
F37 will be fixed in sssd-2.9.0-1.fc37