| Summary: | SSSD subprocesses are no longer monitored by the main sssd process, but self-monitor | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Amith <apeetham> |
| Component: | sssd | Assignee: | SSSD Maintainers <sssd-maint> |
| Status: | CLOSED NOTABUG | QA Contact: | Steeve Goveas <sgoveas> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 7.3 | CC: | apeetham, grajaiya, jhrozek, lslebodn, mkosek, mzidek, pbrezina |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-09-20 19:37:25 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
I'm sorry, but this is expected with 7.3. The services are no longer monitored by the sssd process, but instead self-monitor. This is a first step towards making it possible to socket-activate services and remove the monitor if possible. I suspect SIGSTOP is used as a test case to make SSSD restart. But since SIGSTOP cannot be caught or ignored, it really stops the process and, as a result, the self-monitoring as well. I think it would be better to come up with a different test case than one involving SIGSTOP. In the meantime, I'm removing the Regression keyword, but I will leave the bug open until we come up with some test case, then we can close this bug.

Hmm, I wonder if just sending SIGCONT after at least 30 seconds would make sssd_be restart (not tested, just an idea).

Please reopen if the processes do not restart cleanly or the watchdog doesn't work.
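The SIGCONT idea above could be tried with a small helper like the following. This is only a sketch: the function name `stop_then_cont` is hypothetical, the 35-second delay in the usage comment is an assumption chosen to satisfy "at least 30 seconds", and whether sssd_be actually restarts afterwards is exactly the untested question.

```shell
#!/bin/sh
# stop_then_cont PID DELAY
# Hypothetical helper: pause a process with SIGSTOP, wait DELAY seconds,
# then resume it with SIGCONT. It does not check whether the process
# restarts afterwards; that has to be observed separately.
stop_then_cont() {
    pid=$1
    delay=$2
    # SIGSTOP cannot be caught or ignored, so the target really stops.
    kill -s STOP "$pid" || return 1
    sleep "$delay"
    # Resume the process; any watchdog logic in it can run again from here.
    kill -s CONT "$pid"
}

# Example (as root, on a test machine; 35 is an assumed delay):
#   stop_then_cont "$(pidof sssd_be)" 35
```

Comparing the output of `pidof sssd_be` before and after the call would show whether the backend was replaced by a new process.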
Description of problem:
This issue was observed when SSSD did not react to a SIGSTOP signal sent to the sssd_be process. By default, the process should be restarted after approx. 91 seconds, and the relevant messages should appear in the service status. See the example from a RHEL-7.2 test machine:

# service sssd status
Redirecting to /bin/systemctl status sssd.service
● sssd.service - System Security Services Daemon
   Loaded: loaded (/usr/lib/systemd/system/sssd.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/sssd.service.d
           └─journal.conf
   Active: active (running) since Mon 2016-09-19 00:46:54 IST; 2min 34s ago
  Process: 27641 ExecStart=/usr/sbin/sssd -D -f (code=exited, status=0/SUCCESS)
 Main PID: 27642 (sssd)
   CGroup: /system.slice/sssd.service
           ├─27642 /usr/sbin/sssd -D -f
           ├─27644 /usr/libexec/sssd/sssd_nss --uid 0 --gid 0 --debug-to-files
           ├─27645 /usr/libexec/sssd/sssd_pam --uid 0 --gid 0 --debug-to-files
           └─27682 /usr/libexec/sssd/sssd_be --domain LDAP --uid 0 --gid 0 --debug-to-files

Sep 19 00:46:54 vm-idm-012.lab.eng.pnq.redhat.com systemd[1]: Starting System Security Services Daemon...
Sep 19 00:46:54 vm-idm-012.lab.eng.pnq.redhat.com sssd[27642]: Starting up
Sep 19 00:46:54 vm-idm-012.lab.eng.pnq.redhat.com sssd[be[LDAP]][27643]: Starting up
Sep 19 00:46:54 vm-idm-012.lab.eng.pnq.redhat.com sssd[nss][27644]: Starting up
Sep 19 00:46:54 vm-idm-012.lab.eng.pnq.redhat.com sssd[pam][27645]: Starting up
Sep 19 00:46:54 vm-idm-012.lab.eng.pnq.redhat.com systemd[1]: Started System Security Services Daemon.
Sep 19 00:48:14 vm-idm-012.lab.eng.pnq.redhat.com sssd[27642]: Killing service [LDAP], not responding to pings!
Sep 19 00:49:14 vm-idm-012.lab.eng.pnq.redhat.com sssd[27642]: [LDAP][27643] is not responding to SIGTERM. Sending SIGKILL.
Sep 19 00:49:14 vm-idm-012.lab.eng.pnq.redhat.com sssd[be[LDAP]][27682]: Starting up

In the case of RHEL-7.3, the sssd_be process never restarts and, strangely, no error message is logged in /var/log/sssd/sssd.log. The service has to be restarted manually for sssd to function properly.

Version-Release number of selected component (if applicable):
sssd-1.14.0-42.el7.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Set up sssd.conf with debug_level = 0x0270 in the sssd section, as given below:

[sssd]
config_file_version = 2
services = nss, pam
domains = LDAP
debug_level = 0x0270

[domain/LDAP]
debug_level = 0xFFF0
id_provider = ldap
auth_provider = ldap
ldap_uri = ldap://<LDAP_SERVER>
ldap_tls_cacert = /etc/openldap/certs/cacert.asc

2. Send a SIGSTOP signal to the sssd_be process.

# kill -s SIGSTOP `pidof sssd_be`

3. Wait for some time and monitor the log messages to see whether the process restarts itself. Also, monitor the status of the sssd service for the following messages:

sssd[27642]: Killing service [LDAP], not responding to pings!
sssd[27642]: [LDAP][27643] is not responding to SIGTERM. Sending SIGKILL.
sssd[be[LDAP]][27682]: Starting up

4. By default, the sssd_be process should restart.

Actual results:
SSSD is unresponsive and the sssd_be process never restarts. No error message is logged.

Expected results:
The signalled process should restart and SSSD should function properly, with the relevant messages logged.

Additional info:
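The waiting in step 3 can be sketched as a polling loop instead of watching the logs by hand. This is a minimal sketch, not part of SSSD: the function name `wait_for_exit` is hypothetical, and the 120-second timeout in the usage comment is an assumption picked to exceed the roughly 91 seconds mentioned in the description.

```shell
#!/bin/sh
# wait_for_exit PID TIMEOUT
# Hypothetical helper: poll once per second until PID no longer exists
# or TIMEOUT seconds have passed.
# Returns 0 if the process went away, 1 if it is still there at the timeout.
wait_for_exit() {
    pid=$1
    timeout=$2
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # kill -0 sends no signal; it fails when the PID does not exist
        # (or when permission is missing, so run this as root for sssd_be).
        if ! kill -0 "$pid" 2>/dev/null; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}

# Example (as root, on a test machine; 120 is an assumed timeout):
#   old_pid=$(pidof sssd_be)
#   kill -s STOP "$old_pid"
#   if wait_for_exit "$old_pid" 120; then
#       echo "old sssd_be is gone; new PID: $(pidof sssd_be)"
#   else
#       echo "sssd_be $old_pid was not restarted within 120 seconds"
#   fi
```

On RHEL-7.2 the old backend PID would be expected to disappear once the monitor sends SIGKILL, whereas on RHEL-7.3 the report above suggests the loop would run into the timeout.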