Bug 1312297

Summary: nslcd.service does not restart on failure
Product: Red Hat Enterprise Linux 7
Component: nss-pam-ldapd
Version: 7.2
Hardware: x86_64
OS: Linux
Status: CLOSED ERRATA
Severity: low
Priority: unspecified
Reporter: Karsten Weiss <knweiss>
Assignee: Jakub Hrozek <jhrozek>
QA Contact: Martin Zelený <mzeleny>
CC: franz.brauneder, jhrozek, knoha, mkosek, mzeleny, pkis, sreber
Flags: pkis: needinfo+
Target Milestone: rc
Fixed In Version: nss-pam-ldapd-0.8.13-13.el7
Last Closed: 2018-04-10 17:24:59 UTC
Type: Bug
Bug Blocks: 1298243

Description Karsten Weiss 2016-02-26 10:51:44 UTC
Description of problem:

The nslcd.service does not restart on failure.

There's no "Restart=" line in /usr/lib/systemd/system/nslcd.service.

Is this intentional?

(I saw the following crash on one machine during startup and noticed that the service wasn't restarted automatically, even though a restart would have succeeded:

# systemctl status nslcd
● nslcd.service - Naming services LDAP client daemon.
   Loaded: loaded (/usr/lib/systemd/system/nslcd.service; enabled; vendor preset: disabled)
   Active: failed (Result: signal) since Thu 2016-02-25 16:59:14 CET; 17h ago
  Process: 2275 ExecStart=/usr/sbin/nslcd (code=exited, status=0/SUCCESS)
 Main PID: 2280 (code=killed, signal=SEGV)

Feb 25 16:59:13 nehalem201 systemd[1]: Starting Naming services LDAP client daemon....
Feb 25 16:59:13 nehalem201 nslcd[2280]: version 0.8.13 starting
Feb 25 16:59:13 nehalem201 nslcd[2280]: accepting connections
Feb 25 16:59:13 nehalem201 systemd[1]: Started Naming services LDAP client daemon..
Feb 25 16:59:14 nehalem201 systemd[1]: nslcd.service: main process exited, code=killed, status=11/SEGV
Feb 25 16:59:14 nehalem201 systemd[1]: Unit nslcd.service entered failed state.
Feb 25 16:59:14 nehalem201 systemd[1]: nslcd.service failed.

(gdb) bt
#0  0x00007f227d8f5bd0 in pthread_mutex_lock () from /lib64/libpthread.so.0
#1  0x00007f227b5e6cb9 in PR_Lock () from /lib64/libnspr4.so
#2  0x00007f227b5ecd45 in PR_GetCurrentThread () from /lib64/libnspr4.so
#3  0x00007f227b5ddc7f in PR_SetError () from /lib64/libnspr4.so
#4  0x00007f227be8eaaf in SECMOD_RestartModules () from /lib64/libnss3.so
#5  0x00007f227df62276 in tlsm_deferred_ctx_init () from /lib64/libldap_r-2.4.so.2
#6  0x00007f227b5deb15 in PR_CallOnceWithArg () from /lib64/libnspr4.so
#7  0x00007f227df60aa1 in tlsm_session_new () from /lib64/libldap_r-2.4.so.2
#8  0x00007f227df5d974 in alloc_handle () from /lib64/libldap_r-2.4.so.2
#9  0x00007f227df5dd2c in ldap_int_tls_connect.isra.2 () from /lib64/libldap_r-2.4.so.2
#10 0x00007f227df5e558 in ldap_int_tls_start () from /lib64/libldap_r-2.4.so.2
#11 0x00007f227df37491 in ldap_int_open_connection () from /lib64/libldap_r-2.4.so.2
#12 0x00007f227df4ab6d in ldap_new_connection () from /lib64/libldap_r-2.4.so.2
#13 0x00007f227df3686f in ldap_open_defconn () from /lib64/libldap_r-2.4.so.2
#14 0x00007f227df4c098 in ldap_send_initial_request () from /lib64/libldap_r-2.4.so.2
#15 0x00007f227df40b86 in ldap_sasl_bind () from /lib64/libldap_r-2.4.so.2
#16 0x00007f227df41109 in ldap_sasl_bind_s () from /lib64/libldap_r-2.4.so.2
#17 0x00007f227df419a5 in ldap_simple_bind_s () from /lib64/libldap_r-2.4.so.2
#18 0x00007f227e805286 in do_retry_search ()
#19 0x00007f227e805a90 in myldap_search ()
#20 0x00007f227e80dbec in nslcd_host_byaddr ()
#21 0x00007f227e80263f in worker ()
#22 0x00007f227d8f3dc5 in start_thread () from /lib64/libpthread.so.0
#23 0x00007f227d62128d in clone () from /lib64/libc.so.6
(gdb)

Feb 25 16:59:14 nehalem201 kernel: alg: No test for crc32 (crc32-table)
Feb 25 16:59:14 nehalem201 systemd[1]: Started IPMI Driver.
Feb 25 16:59:14 nehalem201 kernel: nslcd[2281]: segfault at 10 ip 00007f227d8f5bd0 sp 00007f227a4b12d8 error 4 in libpthread-2.17.so[7f227d8ec000+16000]
Feb 25 16:59:14 nehalem201 systemd[1]: nslcd.service: main process exited, code=killed, status=11/SEGV
Feb 25 16:59:14 nehalem201 systemd[1]: Unit nslcd.service entered failed state.
Feb 25 16:59:14 nehalem201 systemd[1]: nslcd.service failed.
)

Version-Release number of selected component (if applicable):

nss-pam-ldapd-0.8.13-8.el7.x86_64

How reproducible:

Start nslcd.service, kill the process. Notice that systemd won't restart
the service.

Steps to Reproduce:
1. Make sure nslcd.service is active: systemctl status nslcd.service
2. kill $(cat /var/run/nslcd/nslcd.pid)
3. Check the status again: systemctl status nslcd.service => failed
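
A minimal sketch of a reproduction that is closer to the crash shown above: it sends SIGSEGV to the main process instead of the default signal used by plain kill, waits, and prints the unit state. crash_and_report is a hypothetical helper name, and the 15-second default wait is an assumption chosen to be longer than the RestartSec=10s in the drop-in under "Additional info".

```shell
#!/bin/sh
# Sketch only: crash_and_report is a hypothetical helper, not part of systemd
# or nss-pam-ldapd. Run as root on a test machine.
crash_and_report() {
    unit=$1
    wait_secs=${2:-15}   # assumed default, longer than RestartSec=10s

    # SIGSEGV mimics the segfault from the report; --kill-who=main limits
    # the signal to the unit's main PID.
    systemctl kill --signal=SIGSEGV --kill-who=main "$unit" || return 1

    # Give systemd time to apply any restart policy.
    sleep "$wait_secs"

    # Prints the resulting unit state (for example "active" or "failed").
    systemctl is-active "$unit"
}

# Example call:
#   crash_and_report nslcd.service
```

Without a Restart= setting the unit is expected to stay in the failed state; with the drop-in in place it should come back after the RestartSec delay.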

Actual results:

Notice that the nslcd.service does not restart automatically.

Expected results:

I expect that systemd will restart the nslcd.service automatically after
a process crash.

Additional info:

I've tested the following systemd drop-in for nslcd.service and it seems
to work:

# cat /etc/systemd/system/nslcd.service.d/restart.conf 
# 2016-02-26, KAW
[Service]
RestartSec=10s
Restart=on-failure

You may want to add the two Restart lines to /usr/lib/systemd/system/nslcd.service in the official rpm.
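
Until the packaged unit file carries these lines, the drop-in can be installed by hand. A minimal sketch follows, assuming root privileges; write_restart_dropin is a hypothetical helper name, and the file name restart.conf simply follows the example above.

```shell
#!/bin/sh
# Sketch only: write_restart_dropin is a hypothetical helper. It writes the
# drop-in from this report into the directory given as its first argument.
write_restart_dropin() {
    dropin_dir=$1
    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/restart.conf" <<'EOF'
[Service]
RestartSec=10s
Restart=on-failure
EOF
}

# Real use (as root), followed by a reload so systemd reads the new file:
#   write_restart_dropin /etc/systemd/system/nslcd.service.d
#   systemctl daemon-reload
#   systemctl show nslcd.service -p Restart
```

The final systemctl show call is only a check that the Restart setting was picked up after the reload.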

Comment 1 Jakub Hrozek 2016-02-26 11:05:09 UTC
Thanks for the bug report. I think this would be a nice addition to the service file when we update nslcd in RHEL.

Comment 3 Karsten Weiss 2016-02-27 12:19:36 UTC
While we're at it: you may also want to add the missing Documentation= line to nslcd.service:

[Unit]
Documentation=man:nslcd(8) man:nslcd.conf(5)

Comment 20 errata-xmlrpc 2018-04-10 17:24:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0935