
Bug 1377127

Summary: SSSD subprocesses are no longer monitored by the main sssd process, but self-monitor
Product: Red Hat Enterprise Linux 7
Reporter: Amith <apeetham>
Component: sssd
Assignee: SSSD Maintainers <sssd-maint>
Status: CLOSED NOTABUG
QA Contact: Steeve Goveas <sgoveas>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 7.3
CC: apeetham, grajaiya, jhrozek, lslebodn, mkosek, mzidek, pbrezina
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-09-20 19:37:25 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Amith 2016-09-18 20:20:07 UTC
Description of problem:
This issue was observed when SSSD became unresponsive after a SIGSTOP signal was sent to the sssd_be process. By default, the stopped process should be restarted after approximately 91 seconds, and the service status should show the relevant messages.

See the example from RHEL-7.2 test machine:
 
# service sssd status
Redirecting to /bin/systemctl status  sssd.service
● sssd.service - System Security Services Daemon
   Loaded: loaded (/usr/lib/systemd/system/sssd.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/sssd.service.d
           └─journal.conf
   Active: active (running) since Mon 2016-09-19 00:46:54 IST; 2min 34s ago
  Process: 27641 ExecStart=/usr/sbin/sssd -D -f (code=exited, status=0/SUCCESS)
 Main PID: 27642 (sssd)
   CGroup: /system.slice/sssd.service
           ├─27642 /usr/sbin/sssd -D -f
           ├─27644 /usr/libexec/sssd/sssd_nss --uid 0 --gid 0 --debug-to-files
           ├─27645 /usr/libexec/sssd/sssd_pam --uid 0 --gid 0 --debug-to-files
           └─27682 /usr/libexec/sssd/sssd_be --domain LDAP --uid 0 --gid 0 --debug-to-files

Sep 19 00:46:54 vm-idm-012.lab.eng.pnq.redhat.com systemd[1]: Starting System Security Services Daemon...
Sep 19 00:46:54 vm-idm-012.lab.eng.pnq.redhat.com sssd[27642]: Starting up
Sep 19 00:46:54 vm-idm-012.lab.eng.pnq.redhat.com sssd[be[LDAP]][27643]: Starting up
Sep 19 00:46:54 vm-idm-012.lab.eng.pnq.redhat.com sssd[nss][27644]: Starting up
Sep 19 00:46:54 vm-idm-012.lab.eng.pnq.redhat.com sssd[pam][27645]: Starting up
Sep 19 00:46:54 vm-idm-012.lab.eng.pnq.redhat.com systemd[1]: Started System Security Services Daemon.
Sep 19 00:48:14 vm-idm-012.lab.eng.pnq.redhat.com sssd[27642]: Killing service [LDAP], not responding to pings!
Sep 19 00:49:14 vm-idm-012.lab.eng.pnq.redhat.com sssd[27642]: [LDAP][27643] is not responding to SIGTERM. Sending SIGKILL.
Sep 19 00:49:14 vm-idm-012.lab.eng.pnq.redhat.com sssd[be[LDAP]][27682]: Starting up


On RHEL-7.3, the process never restarts and, strangely, no error message is logged in /var/log/sssd/sssd.log. The service has to be restarted manually for sssd to function properly.

Version-Release number of selected component (if applicable):
sssd-1.14.0-42.el7.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Set up sssd.conf with debug_level = 0x0270 in the [sssd] section, as shown below:

[sssd]
config_file_version = 2
services = nss, pam
domains = LDAP
debug_level = 0x0270

[domain/LDAP]
debug_level = 0xFFF0
id_provider = ldap
auth_provider = ldap
ldap_uri = ldap://<LDAP_SERVER>
ldap_tls_cacert = /etc/openldap/certs/cacert.asc

2. Send a SIGSTOP signal to the sssd_be process.
# kill -s SIGSTOP `pidof sssd_be`

3. Wait for some time and monitor the log messages to see whether the process restarts itself. Also, monitor the status of the sssd service for the following messages:

sssd[27642]: Killing service [LDAP], not responding to pings!
sssd[27642]: [LDAP][27643] is not responding to SIGTERM. Sending SIGKILL.
sssd[be[LDAP]][27682]: Starting up

4. By default, the sssd_be process should restart.
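
The check in steps 2 to 4 can be sketched as a small script that waits for the stopped process to be replaced by one with a new PID. This is only an illustration: wait_for_pid_change is a hypothetical helper, not part of SSSD, and the 120-second timeout is an assumption chosen to be longer than the roughly 91 seconds mentioned in the description.

```shell
#!/bin/bash
# wait_for_pid_change OLD_PID TIMEOUT CMD [ARGS...]
# Hypothetical helper: runs CMD once per second and succeeds as soon as
# CMD prints a non-empty PID that differs from OLD_PID. Fails (returns 1)
# if that does not happen within TIMEOUT seconds.
wait_for_pid_change() {
    local old_pid=$1 timeout=$2
    shift 2
    local waited=0 new_pid
    while [ "$waited" -lt "$timeout" ]; do
        new_pid=$("$@" 2>/dev/null)
        if [ -n "$new_pid" ] && [ "$new_pid" != "$old_pid" ]; then
            echo "$new_pid"
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 1
}

# Usage on a test machine (assumes exactly one sssd_be process is running):
#   old=$(pidof sssd_be)
#   kill -s STOP "$old"
#   wait_for_pid_change "$old" 120 pidof sssd_be && echo "sssd_be restarted"
```

On the RHEL-7.3 build listed above, the helper would be expected to time out, which matches the actual results below.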

Actual results:
SSSD is unresponsive and the sssd_be process never restarts. No error message is logged.

Expected results:
The signalled process should restart and SSSD should function properly, with the relevant messages logged.

Additional info:

Comment 1 Jakub Hrozek 2016-09-18 20:32:27 UTC
I'm sorry, but this is expected with 7.3. The services are no longer monitored by the sssd process, but instead self-monitor. This is a first step towards making it possible to socket-activate services and remove the monitor if possible.

I suspect SIGSTOP is used as a test case for restarting SSSD. But since SIGSTOP cannot be caught or ignored, it really stops the process and, as a result, the self-monitoring as well.
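
To confirm that a process really is frozen after SIGSTOP, its state can be read from /proc on Linux. The following is a minimal sketch; proc_state is a hypothetical helper name, and a state of "T" means the process is stopped and is executing no code at all, including any watchdog logic of its own.

```shell
#!/bin/bash
# proc_state PID
# Hypothetical helper: prints the one-letter state of a process as reported
# in /proc/PID/status (for example S = sleeping, R = running, T = stopped).
proc_state() {
    awk '/^State:/ { print $2 }' "/proc/$1/status"
}

# Usage on a test machine (assumes exactly one sssd_be process is running):
#   pid=$(pidof sssd_be)
#   kill -s STOP "$pid"
#   proc_state "$pid"
```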

I think it would be better to come up with a different test case than one involving SIGSTOP. In the meantime, I'm removing the Regression keyword, but I will leave the bug open until we come up with a test case; then we can close this bug.

Comment 2 Jakub Hrozek 2016-09-18 20:34:26 UTC
Hmm, I wonder if just sending SIGCONT after at least 30 seconds would make sssd_be restart (not tested, just an idea...)
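
A minimal sketch of that idea, equally untested against SSSD: pause_and_resume is a hypothetical helper, and the 35-second delay in the usage comment is an assumption picked only to exceed the 30 seconds mentioned above.

```shell
#!/bin/bash
# pause_and_resume PID DELAY
# Hypothetical helper: stops a process with SIGSTOP, waits DELAY seconds,
# then resumes it with SIGCONT. Returns non-zero if either signal cannot
# be delivered (for example because the PID no longer exists).
pause_and_resume() {
    local pid=$1 delay=$2
    kill -s STOP "$pid" || return 1
    sleep "$delay"
    kill -s CONT "$pid"
}

# Usage on a test machine (assumes exactly one sssd_be process is running):
#   pause_and_resume "$(pidof sssd_be)" 35
#   systemctl status sssd
```

Whether sssd_be actually restarts after being resumed would still have to be checked in the service status and logs.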

Comment 10 Jakub Hrozek 2016-09-20 19:37:25 UTC
Please reopen if the processes do not restart cleanly or the watchdog doesn't work.