Bug 2181545 - Main process exited after running `systemctl reload autofs.service`
Summary: Main process exited after running `systemctl reload autofs.service`
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: sssd
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: sssd-maintainers
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-03-24 13:35 UTC by Joerg
Modified: 2023-05-17 18:18 UTC (History)

Fixed In Version: sssd-2.9.0-1.fc39
Clone Of:
Environment:
Last Closed: 2023-05-17 18:14:44 UTC
Type: Bug
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Bugzilla | 2181478 | 0 | unspecified | CLOSED | Main process exited after running `systemctl reload autofs.service` | 2023-09-23 11:38:19 UTC
Red Hat Issue Tracker | SSSD-6100 | 0 | None | None | None | 2023-05-17 18:18:07 UTC

Internal Links: 2181478

Description Joerg 2023-03-24 13:35:14 UTC
Description of problem:
I reproduced [0] on Fedora Rawhide. I am filing this bug for documentation purposes and to track the issue for Fedora as well.

[0] https://bugzilla.redhat.com/show_bug.cgi?id=2181478

Running `systemctl reload autofs.service` returns to the prompt without any output (no news is good news, right?). However, `systemctl status autofs.service` then shows that the main process has exited.

Instead of reloading the configuration, the service goes from active to failed. To recover, run `systemctl start autofs.service`.
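
Because the reload itself reports success, the failure is easy to miss. The following is a minimal sketch of a check that reloads the unit and then asks systemd whether it has failed. The helper name `reload_and_check` and the two-second default delay are assumptions for illustration and are not part of autofs or systemd; the journal in this report shows the crash about one second after the reload.

```shell
# Hypothetical helper: reload a unit, wait briefly, then report whether it survived.
# Usage: reload_and_check [unit] [delay-seconds]
reload_and_check() {
    unit="${1:-autofs.service}"
    delay="${2:-2}"

    # "systemctl reload" can return 0 even though the daemon crashes shortly after.
    systemctl reload "$unit" || return 2

    # Give the daemon time to process the reload signal before checking its state.
    sleep "$delay"

    # "systemctl is-failed" exits 0 when the unit is in the failed state.
    if systemctl is-failed --quiet "$unit"; then
        echo "FAILED: $unit"
        return 1
    fi
    echo "OK: $unit"
}
```

If the helper prints `FAILED`, the recovery step from this report (`systemctl start autofs.service`) still applies.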

Version-Release number of selected component (if applicable):
libsss_autofs-2.8.2-4.fc38.x86_64
autofs-5.1.8-9.fc39.x86_64
kernel 6.3.0-0.rc3.20230322gita1effab7a3a3.31.fc39.x86_64

How reproducible:


Steps to Reproduce:
1. dnf in autofs
2. systemctl enable --now autofs
3. systemctl reload autofs.service

Actual results:

---
# systemctl --no-pager status autofs.service
× autofs.service - Automounts filesystems on demand
     Loaded: loaded (/usr/lib/systemd/system/autofs.service; enabled; preset: disabled)
    Drop-In: /usr/lib/systemd/system/service.d
             └─10-timeout-abort.conf
     Active: failed (Result: core-dump) since Fri 2023-03-24 13:36:04 CET; 51min ago
   Duration: 15.555s
    Process: 985 ExecStart=/usr/sbin/automount $OPTIONS --systemd-service --dont-check-daemon (code=dumped, signal=SEGV)
    Process: 999 ExecReload=/usr/bin/kill -HUP $MAINPID (code=exited, status=0/SUCCESS)
   Main PID: 985 (code=dumped, signal=SEGV)
        CPU: 24ms

Mar 24 13:35:48 fedora-rawhide-2023-03-24 systemd[1]: Starting autofs.service - Automounts f…d...
Mar 24 13:35:48 fedora-rawhide-2023-03-24 systemd[1]: Started autofs.service - Automounts fi…and.
Mar 24 13:36:03 fedora-rawhide-2023-03-24 systemd[1]: Reloading autofs.service - Automounts …d...
Mar 24 13:36:03 fedora-rawhide-2023-03-24 systemd[1]: Reloaded autofs.service - Automounts f…and.
Mar 24 13:36:04 fedora-rawhide-2023-03-24 systemd[1]: autofs.service: Main process exited, c…SEGV
Mar 24 13:36:04 fedora-rawhide-2023-03-24 systemd[1]: autofs.service: Failed with result 'co…mp'.
Hint: Some lines were ellipsized, use -l to show in full.
---

Expected results:
Reload should work without taking the service down.

Additional info:
GDB backtrace:

---
[root@fedora-rawhide-2023-03-24 coredump]# coredumpctl list /usr/sbin/automount 
TIME                        PID UID GID SIG     COREFILE EXE                   SIZE
Fri 2023-03-24 13:36:03 CET 985   0   0 SIGSEGV present  /usr/sbin/automount 217.0K
[root@fedora-rawhide-2023-03-24 coredump]# coredumpctl debug
           PID: 985 (automount)
           UID: 0 (root)
           GID: 0 (root)
        Signal: 11 (SEGV)
     Timestamp: Fri 2023-03-24 13:36:03 CET (55min ago)
  Command Line: /usr/sbin/automount --systemd-service --dont-check-daemon
    Executable: /usr/sbin/automount
 Control Group: /system.slice/autofs.service
          Unit: autofs.service
         Slice: system.slice
       Boot ID: c1f22570133f4c7f9a150526bfc81332
    Machine ID: cb980186c0a447eeb3df7516ce01bb15
      Hostname: fedora-rawhide-2023-03-24
       Storage: /var/lib/systemd/coredump/core.automount.0.c1f22570133f4c7f9a150526bfc81332.985.1679661363000000.zst (present)
  Size on Disk: 217.0K
       Message: Process 985 (automount) of user 0 dumped core.
                
                Module libpcre2-8.so.0 from rpm pcre2-10.42-1.fc38.1.x86_64
                Module libselinux.so.1 from rpm libselinux-3.5-1.fc39.x86_64
                Module libcrypto.so.3 from rpm openssl-3.0.8-2.fc39.x86_64
                Module libkeyutils.so.1 from rpm keyutils-1.6.1-6.fc38.x86_64
                Module libkrb5support.so.0 from rpm krb5-1.20.1-9.fc39.x86_64
                Module libz.so.1 from rpm zlib-1.2.13-3.fc38.x86_64
                Module liblz4.so.1 from rpm lz4-1.9.4-2.fc38.x86_64
                Module libzstd.so.1 from rpm zstd-1.5.4-1.fc39.x86_64
                Module liblzma.so.5 from rpm xz-5.4.1-1.fc38.x86_64
                Module libcap.so.2 from rpm libcap-2.48-6.fc38.x86_64
                Module libcom_err.so.2 from rpm e2fsprogs-1.46.5-4.fc38.x86_64
                Module libk5crypto.so.3 from rpm krb5-1.20.1-9.fc39.x86_64
                Module libkrb5.so.3 from rpm krb5-1.20.1-9.fc39.x86_64
                Module libgssapi_krb5.so.2 from rpm krb5-1.20.1-9.fc39.x86_64
                Module libxml2.so.2 from rpm libxml2-2.10.3-3.fc38.x86_64
                Module libsystemd.so.0 from rpm systemd-253.1-4.fc39.x86_64
                Module libtirpc.so.3 from rpm libtirpc-1.3.3-1.fc38.x86_64
                Stack trace of thread 1001:
                #0  0x00007f1dd305a530 n/a (n/a + 0x0)
                #1  0x00007f1dd357807a start_thread (libc.so.6 + 0x8d07a)
                #2  0x00007f1dd35fe78c __clone3 (libc.so.6 + 0x11378c)
                
                Stack trace of thread 985:
                #0  0x00007f1dd35298a8 __sigtimedwait (libc.so.6 + 0x3e8a8)
                #1  0x00007f1dd3528f74 sigwait (libc.so.6 + 0x3df74)
                #2  0x00005614c13b7a83 main (automount + 0xba83)
                #3  0x00007f1dd3512b4a __libc_start_call_main (libc.so.6 + 0x27b4a)
                #4  0x00007f1dd3512c0b __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x27c0b)
                #5  0x00005614c13b82b5 _start (automount + 0xc2b5)
                
                Stack trace of thread 988:
                #0  0x00007f1dd3574ad9 __futex_abstimed_wait_common (libc.so.6 + 0x89ad9)
                #1  0x00007f1dd3577479 pthread_cond_wait@@GLIBC_2.3.2 (libc.so.6 + 0x8c479)
                #2  0x00005614c13ca643 st_queue_handler (automount + 0x1e643)
                #3  0x00007f1dd3578207 start_thread (libc.so.6 + 0x8d207)
                #4  0x00007f1dd35fe78c __clone3 (libc.so.6 + 0x11378c)
                
                Stack trace of thread 987:
                #0  0x00007f1dd3574ad9 __futex_abstimed_wait_common (libc.so.6 + 0x89ad9)
                #1  0x00007f1dd3577479 pthread_cond_wait@@GLIBC_2.3.2 (libc.so.6 + 0x8c479)
                #2  0x00007f1dd3702114 alarm_handler (libautofs.so + 0x13114)
                #3  0x00007f1dd3578207 start_thread (libc.so.6 + 0x8d207)
                #4  0x00007f1dd35fe78c __clone3 (libc.so.6 + 0x11378c)
                
                Stack trace of thread 991:
                #0  0x00007f1dd35f113d __poll (libc.so.6 + 0x10613d)
                #1  0x00005614c13c0950 handle_packet (automount + 0x14950)
                #2  0x00005614c13c1b35 handle_mounts (automount + 0x15b35)
                #3  0x00007f1dd3578207 start_thread (libc.so.6 + 0x8d207)
                #4  0x00007f1dd35fe78c __clone3 (libc.so.6 + 0x11378c)
                
                Stack trace of thread 1003:
                #0  0x00007f1dd356bac6 _IO_getc (libc.so.6 + 0x80ac6)
                #1  0x00007f1dd3088b48 read_one (lookup_file.so + 0x2b48)
                #2  0x00007f1dd3089d4e lookup_read_map (lookup_file.so + 0x3d4e)
                #3  0x00005614c13ccf97 do_read_map (automount + 0x20f97)
                #4  0x00005614c13cd153 read_file_source_instance (automount + 0x21153)
                #5  0x00005614c13cd738 lookup_nss_read_map (automount + 0x21738)
                #6  0x00005614c13cf5c9 do_readmap (automount + 0x235c9)
                #7  0x00007f1dd3578207 start_thread (libc.so.6 + 0x8d207)
                #8  0x00007f1dd35fe78c __clone3 (libc.so.6 + 0x11378c)
                
                Stack trace of thread 994:
                #0  0x00007f1dd35f113d __poll (libc.so.6 + 0x10613d)
                #1  0x00005614c13c0950 handle_packet (automount + 0x14950)
                #2  0x00005614c13c1b35 handle_mounts (automount + 0x15b35)
                #3  0x00007f1dd3578207 start_thread (libc.so.6 + 0x8d207)
                #4  0x00007f1dd35fe78c __clone3 (libc.so.6 + 0x11378c)
                ELF object binary architecture: AMD x86-64

GNU gdb (GDB) Fedora Linux 13.1-1.fc39
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/sbin/automount...

This GDB supports auto-downloading debuginfo from the following URLs:
  <https://debuginfod.fedoraproject.org/>
Enable debuginfod for this session? (y or [n]) y
Debuginfod has been enabled.
To make this setting permanent, add 'set debuginfod enabled on' to .gdbinit.
Reading symbols from /root/.cache/debuginfod_client/2c6e4764416e6d039de28a5b2f73b5471a59285e/debuginfo...
[New LWP 1001]                                                                                   
[New LWP 985]
[New LWP 988]
[New LWP 987]
[New LWP 991]
[New LWP 1003]
[New LWP 994]
--Type <RET> for more, q to quit, c to continue without paging--c
[Thread debugging using libthread_db enabled]                                                    
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/sbin/automount --systemd-service --dont-check-daemon'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f1dd305a530 in ?? ()
[Current thread is 1 (Thread 0x7f1dd0bfb6c0 (LWP 1001))]
(gdb) bt
#0  0x00007f1dd305a530 in ?? ()
#1  0x00007f1dd3575100 in __GI___nptl_deallocate_tsd () at nptl_deallocate_tsd.c:73
#2  __GI___nptl_deallocate_tsd () at nptl_deallocate_tsd.c:22
#3  0x00007f1dd357807a in start_thread (arg=<optimized out>) at pthread_create.c:455
#4  0x00007f1dd35fe78c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78
---

Comment 1 Ian Kent 2023-03-25 06:07:14 UTC
(In reply to Joerg from comment #0)
snip ...
>                                                                             
> --Type <RET> for more, q to quit, c to continue without paging--c
> [Thread debugging using libthread_db enabled]                               
> 
> Using host libthread_db library "/lib64/libthread_db.so.1".
> Core was generated by `/usr/sbin/automount --systemd-service
> --dont-check-daemon'.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  0x00007f1dd305a530 in ?? ()
> [Current thread is 1 (Thread 0x7f1dd0bfb6c0 (LWP 1001))]
> (gdb) bt
> #0  0x00007f1dd305a530 in ?? ()
> #1  0x00007f1dd3575100 in __GI___nptl_deallocate_tsd () at
> nptl_deallocate_tsd.c:73
> #2  __GI___nptl_deallocate_tsd () at nptl_deallocate_tsd.c:22
> #3  0x00007f1dd357807a in start_thread (arg=<optimized out>) at
> pthread_create.c:455
> #4  0x00007f1dd35fe78c in clone3 () at
> ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78
> ---

Using the Python script that Florian supplied in bug 2162939, I get:

[0:1] 0x562023250470 <key_thread_stdenv_vars_destroy> (None)
[1:1] 0x7fa0b2d59a20 <__GI___libc_free> (/lib64/libc.so.6)
[2:1] 0x7fa0b2fa0ef0 <xmlFreeGlobalState> (/lib64/libxml2.so.2)
[3:1] 0x7fa0b2844530 (None)
[Inferior 1 (process 9674) detached]

I matched the crash to the tsd key 0x7fa0b2844530.
I suspected this was an sssd tsd key, so I changed "automount: files sss" to
"automount: files" in /etc/nsswitch.conf and the crash went away.

I then downloaded the rawhide sssd source rpm and compared to the RHEL rpm where
this has been fixed. I found that the patch to fix the problem is not present in
the rawhide rpm.

Alexey could you update sssd in rawhide please?

Comment 2 Joerg 2023-03-27 06:43:17 UTC
Hi,  
I confirm that changing "automount: files sss" to "automount: files" in /etc/nsswitch.conf works around the problem.

All the best,
Joerg
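
The workaround from comment 1 can be previewed without touching the live file. This is a minimal sketch, assuming the `automount:` line lists `sss` as a plain source. `automount_without_sss` is a hypothetical helper name, and it only prints the modified content to stdout rather than editing anything in place.

```shell
# Hypothetical helper: print an nsswitch.conf with "sss" removed from the
# "automount:" line only. All other databases (passwd, group, ...) are untouched.
# Usage: automount_without_sss [path-to-nsswitch.conf]
automount_without_sss() {
    # On the line starting with "automount:", drop the standalone word "sss"
    # together with the whitespace in front of it.
    sed -E '/^automount:/ s/[[:space:]]+sss([[:space:]]|$)/\1/' "${1:-/etc/nsswitch.conf}"
}
```

For example, `automount_without_sss /etc/nsswitch.conf | diff /etc/nsswitch.conf -` shows the change before it is applied by hand. Removing `sss` means automount maps served by SSSD are no longer consulted, so this is only a stopgap until the fixed sssd build is installed. On Fedora the file may be managed by authselect, in which case a manual edit may be overwritten.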

Comment 3 Alexey Tikhonov 2023-03-27 15:13:28 UTC
Hi,

(In reply to Ian Kent from comment #1)
> 
> Alexey could you update sssd in rawhide please?

how urgent is this?

We plan to rebase SSSD on sssd-2.9 (where this bug is fixed) in a couple of weeks...
Could it wait until then or is it more pressing?

Comment 4 Ian Kent 2023-03-29 01:14:32 UTC
(In reply to Alexey Tikhonov from comment #3)
> Hi,
> 
> (In reply to Ian Kent from comment #1)
> > 
> > Alexey could you update sssd in rawhide please?
> 
> how urgent is this?
> 
> We plan to rebase SSSD on sssd-2.9 (where this bug is fixed) in a couple of
> weeks...
> Could it wait until then or is it more pressing?

I think that would be ok with Joerg but I believe he needs autofs to be
functional in F37 so which sss NVR was this introduced in?

Comment 5 Alexey Tikhonov 2023-03-29 08:09:16 UTC
(In reply to Ian Kent from comment #4)
> (In reply to Alexey Tikhonov from comment #3)
> > Hi,
> > 
> > (In reply to Ian Kent from comment #1)
> > > 
> > > Alexey could you update sssd in rawhide please?
> > 
> > how urgent is this?
> > 
> > We plan to rebase SSSD on sssd-2.9 (where this bug is fixed) in a couple of
> > weeks...
> > Could it wait until then or is it more pressing?
> 
> I think that would be ok with Joerg but I believe he needs autofs to be
> functional in F37 so which sss NVR was this introduced in?

F37 should also be affected.
We will address all supported versions at the same time.

Comment 6 Alexey Tikhonov 2023-05-17 18:14:44 UTC
F37 will be fixed in sssd-2.9.0-1.fc37

