This bug has been migrated to another issue tracking site. It has been closed here and may no longer be monitored.

If you would like to get updates for this issue, or to participate in it, you may do so at Red Hat Issue Tracker.
Bug 2181478 - Main process exited after running `systemctl reload autofs.service`
Summary: Main process exited after running `systemctl reload autofs.service`
Keywords:
Status: CLOSED MIGRATED
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: autofs
Version: 9.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Ian Kent
QA Contact: Kun Wang
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-03-24 09:40 UTC by Joerg
Modified: 2023-09-23 11:35 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-09-23 11:35:50 UTC
Type: Bug
Target Upstream Version:
Embargoed:
pm-rhel: mirror+


Attachments
Coredump of automounter process (240.18 KB, application/octet-stream)
2023-03-24 09:40 UTC, Joerg
Screenshot taken from console showing segfault (9.06 KB, image/png)
2023-03-24 09:42 UTC, Joerg


Links
| System | ID | Private | Priority | Status | Summary | Last Updated |
|---|---|---|---|---|---|---|
| Red Hat Bugzilla | 2181545 | 0 | unspecified | CLOSED | Main process exited after running `systemctl reload autofs.service` | 2023-05-17 18:18:50 UTC |
| Red Hat Issue Tracker | RHEL-7940 | 0 | None | Migrated | None | 2023-09-23 11:35:47 UTC |
| Red Hat Issue Tracker | RHELPLAN-153004 | 0 | None | None | None | 2023-03-24 09:40:57 UTC |

Internal Links: 2181545

Description Joerg 2023-03-24 09:40:31 UTC
Created attachment 1953345
Coredump of automounter process

Description of problem:
Running the command `systemctl reload autofs.service` returns to the prompt without any output (no news is good news, right?). But checking the service status by running `systemctl status autofs.service` shows that the main process exited.

Instead of reloading the config the status changed from active to failed. To recover from this issue run `systemctl start autofs.service`.


Version-Release number of selected component (if applicable):
autofs-5.1.7-32.el9_1.1.x86_64
libsss_autofs-2.7.3-4.el9.x86_64
kernel 5.14.0-162.6.1.el9_1.x86_64

How reproducible:
This issue is reproducible on any fresh install of RHEL 9.1.

Steps to Reproduce:
1. Take a fresh install of RHEL 9.1 (e.g. with minimal environment).
2. Run `# dnf in autofs`.
3. Run `# systemctl enable --now autofs`.
4. Check that `# systemctl status autofs` returns active and running.
5. Run `# systemctl reload autofs`.
6. Run `# systemctl status autofs` again.

Actual results:

~~~
[root@rhel91-2023-03-24 ~]# systemctl reload autofs
[root@rhel91-2023-03-24 ~]# systemctl --no-pager status autofs
× autofs.service - Automounts filesystems on demand
     Loaded: loaded (/usr/lib/systemd/system/autofs.service; enabled; vendor preset: disabled)
     Active: failed (Result: signal) since Fri 2023-03-24 10:11:15 CET; 2s ago
   Duration: 33.915s
    Process: 14310 ExecStart=/usr/sbin/automount $OPTIONS --systemd-service --dont-check-daemon (code=killed, signal=SEGV)
    Process: 14324 ExecReload=/usr/bin/kill -HUP $MAINPID (code=exited, status=0/SUCCESS)
   Main PID: 14310 (code=killed, signal=SEGV)
        CPU: 23ms

Mar 24 10:10:41 rhel91-2023-03-24 systemd[1]: Starting Automounts filesystems on demand...
Mar 24 10:10:41 rhel91-2023-03-24 systemd[1]: Started Automounts filesystems on demand.
Mar 24 10:11:14 rhel91-2023-03-24 systemd[1]: Reloading Automounts filesystems on demand...
Mar 24 10:11:14 rhel91-2023-03-24 systemd[1]: Reloaded Automounts filesystems on demand.
Mar 24 10:11:15 rhel91-2023-03-24 systemd[1]: autofs.service: Main process exited, code=killed, status=11/SEGV
Mar 24 10:11:15 rhel91-2023-03-24 systemd[1]: autofs.service: Failed with result 'signal'.
~~~
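
Because the reload command itself exits with status 0 even though the daemon dies shortly afterwards, a follow-up check is needed to notice the failure. The following is a minimal sketch of such a check, not part of the original report: the function name `reload_and_verify`, the default 2-second delay, and the return codes 1 and 2 are all assumptions chosen for illustration.

```shell
# Sketch: reload a unit, then confirm it is still active afterwards.
# The function name, the default delay, and the return codes are
# illustrative assumptions, not anything defined by systemd or autofs.

reload_and_verify() {
    local unit="${1:-autofs.service}"
    local delay="${2:-2}"

    # The reload only delivers SIGHUP, so success here proves very little.
    systemctl reload "$unit" || return 1

    # Wait briefly; in the log above the crash followed about 1s later.
    sleep "$delay"

    if systemctl is-active --quiet "$unit"; then
        echo "$unit is still active after reload"
        return 0
    fi

    echo "$unit is no longer active after reload" >&2
    return 2
}
```

Called as `reload_and_verify autofs.service`, the function returns 2 when the unit has left the active state, which is the situation shown in the status output above.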

Expected results:
Service should reload without segfault.

Additional info:
I have reproduced this issue on a new install using a test system. I attached the systemd-coredump, a screenshot from console and sos report to this bugzilla. They contain no secrets as it's a pure test system.

Comment 1 Joerg 2023-03-24 09:42:02 UTC
Created attachment 1953346
Screenshot taken from console showing segfault

Comment 3 Joerg 2023-03-24 10:12:24 UTC
I checked and found that RHEL 8.7 with the following component versions **is not** affected:
autofs-5.1.4-83.el8.x86_64
libsss_autofs-2.7.3-4.el8_7.3.x86_64
kernel 4.18.0-425.13.1.el8_7.x86_64

Comment 4 Joerg 2023-03-24 10:19:56 UTC
I checked the following package versions in RHEL 9 which are all affected:
autofs.x86_64         1:5.1.7-27.el9              rhel-9-for-x86_64-baseos-rpms 
autofs.x86_64         1:5.1.7-31.el9              rhel-9-for-x86_64-baseos-rpms 
autofs.x86_64         1:5.1.7-32.el9_1.1          rhel-9-for-x86_64-baseos-rpms
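
For anyone comparing their own system against the versions listed in this report, a small sketch like the following prints the same three components. The function name `report_versions` is an assumption for illustration; `rpm -q` prints a "not installed" message for any package that is absent.

```shell
# Sketch: print the versions of the components named in this report.
# report_versions is a hypothetical helper name.

report_versions() {
    # Query the two packages discussed in this bug.
    rpm -q autofs libsss_autofs
    # Print the running kernel in the same "kernel <release>" form.
    echo "kernel $(uname -r)"
}
```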

Comment 5 Ian Kent 2023-03-24 11:36:57 UTC
Can you get onto the system where the crash occurred and get a gdb
back trace please.

And post it here.

This would be a lot easier than me setting up a system to be the
same as the system of the sosreport when there are a couple of known
problems that have already been resolved.

Are you using sssd on the system?
There's a known problem with a particular version of sssd and the description
of how it happened was the same as what you describe here. It might not be the
same though.

I'm pretty sure that this will be resolved by the changes for bug 2179753 or
the above fix for sssd.

I'll get onto bug 2179753 soon as I can and see if I can identify the fixed
version of sssd.

Ian

Comment 6 Joerg 2023-03-24 12:38:25 UTC
Hi Ian,
As we had already discussed on Slack I'm not able to get you a useful gdb back trace. Here are some additional bits I could share.

The package sssd is not installed.

But I found the same error happening on Fedora release 39 (Rawhide) with the following component versions:
libsss_autofs-2.8.2-4.fc38.x86_64
autofs-5.1.8-9.fc39.x86_64
kernel 6.3.0-0.rc3.20230322gita1effab7a3a3.31.fc39.x86_64

Joerg

Comment 7 Joerg 2023-03-24 12:53:43 UTC
Hello again,
After tinkering with gdb for a while I got this:

---
[root@rhel91-2023-03-24 coredump]# coredumpctl list /usr/sbin/automount 
TIME                        PID UID GID SIG     COREFILE EXE                   SIZE
Fri 2023-03-24 13:49:49 CET 822   0   0 SIGSEGV present  /usr/sbin/automount 244.5K
[root@rhel91-2023-03-24 coredump]# coredumpctl debug
           PID: 822 (automount)
           UID: 0 (root)
           GID: 0 (root)
        Signal: 11 (SEGV)
     Timestamp: Fri 2023-03-24 13:49:49 CET (1min 54s ago)
  Command Line: /usr/sbin/automount --systemd-service --dont-check-daemon
    Executable: /usr/sbin/automount
 Control Group: /system.slice/autofs.service
          Unit: autofs.service
         Slice: system.slice
       Boot ID: 41a54fcbe5af408190332e4ecbf59691
    Machine ID: b28b5c50076340e996780f4a472cf3cd
      Hostname: rhel91-2023-03-24
       Storage: /var/lib/systemd/coredump/core.automount.0.41a54fcbe5af408190332e4ecbf59691.822.1679662189000000.zst (present)
     Disk Size: 244.5K
       Message: Process 822 (automount) of user 0 dumped core.
                
                Module linux-vdso.so.1 with build-id b2402caaf299e146b50e7caee420b3ac0677afa9
                Module lookup_hosts.so with build-id 6502c9ac809410c115658c39cf619a60cb18e4e2
                Module mount_bind.so with build-id 1db3141bf32b6ec1b2b7b1ea5948eadaf99d383a
                Module mount_nfs.so with build-id 6ac450cdece6836c5303d9422d7ff9a7fc939a3e
                Module parse_sun.so with build-id 4aedc2ea29fbcacf76ded03a868e9a915da1ccc4
                Module lookup_file.so with build-id fed02f38e452e345c18084568a384adc14e0200b
                Module libpcre2-8.so.0 with build-id dac773591ff85ee4d18b00795d8bca123f3d5d66
                Module libselinux.so.1 with build-id 321a1f9b5537883ee8ec04c65a9edbaefcc7b5aa
                Module libgpg-error.so.0 with build-id 9d27198f0ca61c66cd921675219dffc0bad16a1a
                Module libresolv.so.2 with build-id dd26798426928fb454335411ecfeb883030b1f6c
                Module libcrypto.so.3 with build-id 5a47668cb7ac23dbdfcce8a8a6923484fd67d8a5
                Module libkeyutils.so.1 with build-id 83c6539bd0d3140678ba836b8baa1b215efa2632
                Module libkrb5support.so.0 with build-id 22c23607e8875f3b081adbc7fe9fbd612b7a57a5
                Module libm.so.6 with build-id c0eb573a2171d96b1aa970edb07f3368573bf845
                Module libz.so.1 with build-id a39f7a92539115971debc39f2f9b66b74f8f7bb8
                Module ld-linux-x86-64.so.2 with build-id df9c6b298bf5e3c1d0eb6a0911f3f561908a704d
                Module libgcrypt.so.20 with build-id 7f21916b83ba6859ff1392a52958f355567ae339
                Module libcap.so.2 with build-id c7625c8524a3d7756043555a1e7b1c3cb56fabbe
                Module liblz4.so.1 with build-id 4d32cb5fa39c86b05cc10cc380f3a8a0d6d9d648
                Module libzstd.so.1 with build-id f0c68ad1b3f8941857af47c6887736d835317ccc
                Module liblzma.so.5 with build-id 330eb2fe0769e5466e2e0ac1b158e1e8452738c9
                Module libcom_err.so.2 with build-id ec70fb11e14fe7dadde8353e95592eb7b8bd4b3a
                Module libk5crypto.so.3 with build-id bd537be81f12497f2d5b8a590665ce28c303b85c
                Module libkrb5.so.3 with build-id 8c62715e7b422618177de85f20fbc3a89128f06c
                Module libgssapi_krb5.so.2 with build-id 4dae28e73361fa8c8b216353852acd992e669a06
                Module libc.so.6 with build-id 82f7ae28e16376aa97cc3bf50b40ab2d1043924a
                Module libgcc_s.so.1 with build-id 9526c65fed0e95fbb6b988476cc811ca19d5c9c9
                Module libautofs.so with build-id abcf609c82d95711cd1563aa7504f263b528660e
                Module libxml2.so.2 with build-id 3175d5777b54e42141250543b6acc4794da1b104
                Module libsystemd.so.0 with build-id 0cce699958c66324d0a1bb698c28da0911b749f4
                Module libtirpc.so.3 with build-id 6a25d54850681edbfacb53f44df71f4e1fa7b52b
                Module automount with build-id 6c2c4e55d530a205f518be13e851d994c76e00fb
                Stack trace of thread 1264:
                #0  0x00007f39bd5cc510 n/a (n/a + 0x0)
                #1  0x0000000000000000 n/a (n/a + 0x0)
                ELF object binary architecture: AMD x86-64

GNU gdb (GDB) Red Hat Enterprise Linux 10.2-10.el9
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/sbin/automount...
Reading symbols from /usr/lib/debug/usr/sbin/automount-5.1.7-32.el9_1.1.x86_64.debug...
[New LWP 1264]
[New LWP 822]
[New LWP 825]
[New LWP 826]
[New LWP 845]
[New LWP 1265]
[New LWP 1266]
[New LWP 838]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/sbin/automount --systemd-service --dont-check-daemon'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f39bd5cc510 in ?? ()
[Current thread is 1 (Thread 0x7f39bddd9640 (LWP 1264))]
(gdb) bt
#0  0x00007f39bd5cc510 in ?? ()
#1  0x00007f39c08b2931 in __GI___nptl_deallocate_tsd () at nptl_deallocate_tsd.c:74
#2  __GI___nptl_deallocate_tsd () at nptl_deallocate_tsd.c:23
#3  0x00007f39c08b56d6 in start_thread (arg=<optimized out>) at pthread_create.c:454
#4  0x00007f39c0855450 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
(gdb)
---

Hope that helps.

Regards,
Joerg
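
The interactive session above can also be run non-interactively, which makes it easier to paste a complete back trace into a bug. This is only a sketch under assumptions: the function name `dump_backtrace` and the default core path `/tmp/automount.core` are made up for illustration, and the matching debuginfo packages must be installed for gdb to resolve symbol names.

```shell
# Sketch: extract the most recent automount core and print back traces
# for all threads without an interactive gdb session.
# dump_backtrace and the default core path are illustrative assumptions.

dump_backtrace() {
    local exe="${1:-/usr/sbin/automount}"
    local core="${2:-/tmp/automount.core}"

    # "coredumpctl dump" extracts the last core matching the executable.
    coredumpctl --output="$core" dump "$exe" || return 1

    # -batch makes gdb exit once the -ex command has run.
    gdb -batch -ex 'thread apply all bt' "$exe" "$core"
}
```

Running `dump_backtrace > automount-bt.txt` would capture the output in a file that can be attached to the report.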

Comment 8 Ian Kent 2023-03-24 13:19:20 UTC
(In reply to Joerg from comment #7)
> Hello again,
> After tinkering with gdb for a while I got this:

Right,

snip ...

> Core was generated by `/usr/sbin/automount --systemd-service
> --dont-check-daemon'.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  0x00007f39bd5cc510 in ?? ()
> [Current thread is 1 (Thread 0x7f39bddd9640 (LWP 1264))]
> (gdb) bt
> #0  0x00007f39bd5cc510 in ?? ()
> #1  0x00007f39c08b2931 in __GI___nptl_deallocate_tsd () at
> nptl_deallocate_tsd.c:74
> #2  __GI___nptl_deallocate_tsd () at nptl_deallocate_tsd.c:23
> #3  0x00007f39c08b56d6 in start_thread (arg=<optimized out>) at
> pthread_create.c:454
> #4  0x00007f39c0855450 in clone3 () at
> ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
> (gdb)

This is what I expected, it's not conclusive because we don't know
who owns those Thread Specific Data (tsd) keys.

It's the same signature we saw for the sssd bug and another bug that
appeared to be a mistake exposed by a newer version of glibc. But
that latter one was fixed in revision 32.

There were some other changes that went into revision 34 but I'm
pretty sure this is a result of a large change I had to do recently
for a customer that had over 32000 exports from an NFS server and
the existing code just wasn't written to handle that many exports.

Point being it was a significant change and a couple of bugs got
through testing.

We'll see Monday.
Ian

Comment 9 Ian Kent 2023-03-24 14:14:56 UTC
Ok, I was able to reproduce this on my f37 machine here at home
after building and installing from my rawhide package source.

It looks like this is the sssd bug.

I don't have sssd configured on my machine here either but it still
crashes.

I'm not sure what version of sssd this was introduced in but I have
libsss_autofs-2.8.2-1.fc37.x86_64 on f37 which is broken.

It was fixed in sssd revision 2.8.2-2.

I expect you have:
automount: sss files
in /etc/nsswitch.conf.

Changing this to either:
automount: files
or
automount: files sss

worked around the problem for me.

Give it a try and see if you get the same results as me, ;)

Ian
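
To see which sources are configured before changing anything, the automount entry can be read out of the file. The sketch below is one way to do that; the function names `automount_sources` and `uses_sss` are assumptions for illustration, and the file path is a parameter so the check can be tried on a copy.

```shell
# Sketch: print the sources on the automount line of an
# nsswitch.conf-style file and test whether "sss" is one of them.
# Both function names are hypothetical helpers.

automount_sources() {
    local conf="${1:-/etc/nsswitch.conf}"
    # Take the first automount: entry, drop the key, print the sources.
    awk '$1 == "automount:" { $1 = ""; sub(/^ +/, ""); print; exit }' "$conf"
}

uses_sss() {
    local conf="${1:-/etc/nsswitch.conf}"
    # Pad with spaces so that only a whole-word "sss" matches.
    case " $(automount_sources "$conf") " in
        *" sss "*) return 0 ;;
        *)         return 1 ;;
    esac
}
```

For example, `uses_sss && echo "sss is consulted for automount maps"` reports whether the sss source is active on the current system.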

Comment 10 Joerg 2023-03-27 06:40:38 UTC
Hi Ian,  
My /etc/nsswitch.conf contains:

~~~
# grep automount /etc/nsswitch.conf
automount:  files sss
~~~

I have to change it to `automount: files` to work around the problem. In case sss stays on the line the service keeps crashing on reload.

Joerg

Comment 11 Ian Kent 2023-03-27 07:07:26 UTC
(In reply to Joerg from comment #10)
> Hi Ian,  
> My /etc/nsswitch.conf contains:
> 
> ~~~
> # grep automount /etc/nsswitch.conf
> automount:  files sss
> ~~~
> 
> I have to change it to `automount: files` to work around the problem. In
> case sss stays on the line the service keeps crashing on reload.

Either way it's probably sss in this case but there are a couple of other
things that need to be fixed.

I'm pretty sure that it's fixed in RHEL 2.8.2-2.el9, the package revision
numbers don't match in Fedora, it has 2.8.2-4, IIRC, and it doesn't have
the fix.

In any case 2.8.2-2 is needed on RHEL.

Applying the changes to RHEL is going to take a couple of days, there were
two other changes I have scheduled for RHEL-9.3.0 and I want to keep the
order the same between RHEL-8 and RHEL-9, and there's the CI testing that
needs to be done for each change.

Ian

Comment 12 Joerg 2023-03-27 08:36:43 UTC
(In reply to Ian Kent from comment #11)
> In any case 2.8.2-2 is needed on RHEL.
> 
> Applying the changes to RHEL is going to take a couple of days, there were
> two other changes I have scheduled for RHEL-9.3.0 and I want to keep the
> order the same between RHEL-8 and RHEL-9, and there's the CI testing that
> needs to be done for each change.
> 
> Ian

Thanks for letting me know. As I have two workarounds:

 1. Change 'automount:  files sss' to 'automount:  files' OR
 2. Run 'systemctl restart autofs' instead of 'systemctl reload autofs'

It's not urgent on my end. I'll monitor this Bugzilla and the linked one I filed against Rawhide to see when the fixed version is available and released.

Cheers,  
Joerg
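
Workaround 1 above can be scripted. The following is a hedged sketch only: the function name `drop_sss_from_automount` and the `.bak` backup suffix are assumptions for illustration, and it is worth running against a copy of the file first.

```shell
# Sketch: rewrite the automount entry so that only "files" is consulted,
# keeping a backup of the original next to it.
# drop_sss_from_automount and the .bak suffix are illustrative choices.

drop_sss_from_automount() {
    local conf="${1:-/etc/nsswitch.conf}"

    # Preserve the original before touching it.
    cp -p "$conf" "$conf.bak" || return 1

    # Replace the whole automount entry; other databases stay untouched.
    # Writing back through redirection keeps the existing file in place.
    sed -E 's/^automount:.*/automount:  files/' "$conf.bak" > "$conf"
}
```

Workaround 2 needs no script: it is simply `systemctl restart autofs` in place of `systemctl reload autofs`.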

Comment 13 Ian Kent 2023-03-27 09:51:46 UTC
(In reply to Joerg from comment #12)
> (In reply to Ian Kent from comment #11)
> > In any case 2.8.2-2 is needed on RHEL.
> > 
> > Applying the changes to RHEL is going to take a couple of days, there were
> > two other changes I have scheduled for RHEL-9.3.0 and I want to keep the
> > order the same between RHEL-8 and RHEL-9, and there's the CI testing that
> > needs to be done for each change.
> > 
> > Ian
> 
> Thanks for letting me know. As I have two workarounds:
> 
>  1. Change 'automount:  files sss' to 'automount:  files' OR
>  2. Run 'systemctl restart autofs' instead of 'systemctl reload autofs'
> 
> It's not urgent on my end. I'll monitor this Bugzilla and the linked one I
> filed against Rawhide to see when the fixed version is available and
> released.

What Fedora release do you need?

The change I applied should make its way onto the mirrors fairly quickly, it's
revision autofs-5.1.8-20.fc39. So Fedora has (or will have) the same changes as
RHEL and CentOS.

We can go back 2 Fedora releases but I'm not sure which release the sssd
regression was introduced.

Ian

Comment 14 Joerg 2023-03-27 12:05:23 UTC
(In reply to Ian Kent from comment #13)
> What Fedora release do you need?
> 
> The change I applied should make its way onto the mirrors fairly quickly,
> it's
> revision autofs-5.1.8-20.fc39. So Fedora has (or will have) the same changes
> as
> RHEL and CentOS.
> 
> We can go back 2 Fedora releases but I'm not sure which release the sssd
> regression was introduced.

I would need it in F37, but I could wait for F38 in case it causes too much trouble to bring it back to F37.
IMHO it's more important to get the fix for RHEL 9.

Joerg

Comment 15 Ian Kent 2023-03-27 13:05:13 UTC
(In reply to Joerg from comment #14)
> (In reply to Ian Kent from comment #13)
> > What Fedora release do you need?
> > 
> > The change I applied should make its way onto the mirrors fairly quickly,
> > it's
> > revision autofs-5.1.8-20.fc39. So Fedora has (or will have) the same changes
> > as
> > RHEL and CentOS.
> > 
> > We can go back 2 Fedora releases but I'm not sure which release the sssd
> > regression was introduced.
> 
> I would need it in F37, but I could wait for F38 in case it causes too much
> trouble to bring it back to F37.
> IMHO it's more important to get the fix for RHEL 9.

The big question is when was the sssd regression introduced.
I can update autofs back to f37 but we need the sssd folks to fix any broken
releases in Fedora. I'll see what I can find wrt. the sssd version as I go.

RHEL is a very different proposition, we'll discuss that in the RHEL-9 bug
once I get the update done for RHEL-9.3.0.

Ian

Comment 16 RHEL Program Management 2023-09-23 11:34:46 UTC
Issue migration from Bugzilla to Jira is in process at this time. This will be the last message in Jira copied from the Bugzilla bug.

Comment 17 RHEL Program Management 2023-09-23 11:35:50 UTC
This BZ has been automatically migrated to the issues.redhat.com Red Hat Issue Tracker. All future work related to this report will be managed there.

Due to differences in account names between systems, some fields were not replicated.  Be sure to add yourself to Jira issue's "Watchers" field to continue receiving updates and add others to the "Need Info From" field to continue requesting information.

To find the migrated issue, look in the "Links" section for a direct link to the new issue location. The issue key will have an icon of 2 footprints next to it, and begin with "RHEL-" followed by an integer.  You can also find this issue by visiting https://issues.redhat.com/issues/?jql= and searching the "Bugzilla Bug" field for this BZ's number, e.g. a search like:

"Bugzilla Bug" = 1234567

In the event you have trouble locating or viewing this issue, you can file an issue by sending mail to rh-issues. You can also visit https://access.redhat.com/articles/7032570 for general account information.

