Bug 2072640

Summary: sssd_nss exiting (due to missing 'sssd' local user) causes the SSSD service to restart in a loop
Product: Red Hat Enterprise Linux 9
Reporter: Alexey Tikhonov <atikhono>
Component: sssd
Assignee: Alexey Tikhonov <atikhono>
Status: CLOSED ERRATA
QA Contact: shridhar <sgadekar>
Severity: high
Priority: high
Version: 9.0
CC: aboscatt, grajaiya, jhrozek, lslebodn, mzidek, pbrezina, pvlasin, sgadekar, tscherf
Target Milestone: rc
Keywords: Triaged, ZStream
Flags: pm-rhel: mirror+
Hardware: Unspecified
OS: Unspecified
Whiteboard: sync-to-jira
Fixed In Version: sssd-2.7.0-1.el9
Doc Type: If docs needed, set a value
Clones: 2075539
Last Closed: 2022-11-15 11:17:22 UTC
Type: Bug
Bug Blocks: 2075539

Description Alexey Tikhonov 2022-04-06 16:51:10 UTC
This bug was initially created as a copy of Bug #2072050 in order to track the fix for RHEL 9.

As part of OpenShift 4.11, RHCOS recently switched to using the RHEL 8.6 Beta content and we started to see evidence of `sssd` crashing and restarting in a loop.

```
Apr 04 22:10:22 test1-4nhln-master-0 systemd[1]: sssd.service: Service RestartSec=100ms expired, scheduling restart.
Apr 04 22:10:22 test1-4nhln-master-0 systemd[1]: sssd.service: Scheduled restart job, restart counter is at 26.
Apr 04 22:10:22 test1-4nhln-master-0 systemd[1]: Stopped System Security Services Daemon.
Apr 04 22:10:22 test1-4nhln-master-0 systemd[1]: sssd.service: Consumed 532ms CPU time
Apr 04 22:10:22 test1-4nhln-master-0 systemd[1]: Starting System Security Services Daemon...
Apr 04 22:10:22 test1-4nhln-master-0 sssd[2615]: Starting up
Apr 04 22:10:22 test1-4nhln-master-0 sssd_be[2616]: Starting up
Apr 04 22:10:22 test1-4nhln-master-0 sssd_nss[2617]: Starting up
Apr 04 22:10:22 test1-4nhln-master-0 sssd_nss[2624]: Starting up
Apr 04 22:10:25 test1-4nhln-master-0 sssd_nss[2638]: Starting up
Apr 04 22:10:29 test1-4nhln-master-0 sssd_nss[2681]: Starting up
Apr 04 22:10:29 test1-4nhln-master-0 sssd[2615]: Exiting the SSSD. Could not restart critical service [nss].
Apr 04 22:10:29 test1-4nhln-master-0 systemd[1]: sssd.service: Main process exited, code=exited, status=1/FAILURE
Apr 04 22:10:29 test1-4nhln-master-0 systemd[1]: sssd.service: Failed with result 'exit-code'.
Apr 04 22:10:29 test1-4nhln-master-0 systemd[1]: Failed to start System Security Services Daemon.
Apr 04 22:10:29 test1-4nhln-master-0 systemd[1]: sssd.service: Consumed 510ms CPU time
Apr 04 22:10:29 test1-4nhln-master-0 systemd[1]: sssd.service: Service RestartSec=100ms expired, scheduling restart.
Apr 04 22:10:29 test1-4nhln-master-0 systemd[1]: sssd.service: Scheduled restart job, restart counter is at 27.
Apr 04 22:10:29 test1-4nhln-master-0 systemd[1]: Stopped System Security Services Daemon.
Apr 04 22:10:29 test1-4nhln-master-0 systemd[1]: sssd.service: Consumed 510ms CPU time
Apr 04 22:10:29 test1-4nhln-master-0 systemd[1]: Starting System Security Services Daemon...
Apr 04 22:10:29 test1-4nhln-master-0 sssd[2683]: Starting up
Apr 04 22:10:29 test1-4nhln-master-0 sssd_be[2684]: Starting up
Apr 04 22:10:29 test1-4nhln-master-0 sssd_nss[2685]: Starting up
Apr 04 22:10:29 test1-4nhln-master-0 sssd_nss[2686]: Starting up
```
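A quick way to confirm the same pattern on another host is to scan the journal for the two messages that characterize it in the log above: systemd's "restart counter" line and the SSSD monitor's "Could not restart critical service" line. The following is a minimal sketch, not an official diagnostic tool; the function name `summarize_restarts` is hypothetical, and the regular expressions assume the message wording is exactly as shown in the log above.

```python
import re

# systemd's restart message, as worded in the log above.
RESTART_RE = re.compile(
    r"(\S+\.service): Scheduled restart job, restart counter is at (\d+)\."
)
# The SSSD monitor giving up on a responder, as worded in the log above.
GIVE_UP_RE = re.compile(r"Could not restart critical service \[(\w+)\]")


def summarize_restarts(lines):
    """Return (highest restart counter per unit, services SSSD gave up on).

    `lines` is any iterable of journal lines (hypothetical helper; the
    message formats are assumed to match the log excerpt in this report).
    """
    counters = {}
    failed = set()
    for line in lines:
        match = RESTART_RE.search(line)
        if match:
            unit, count = match.group(1), int(match.group(2))
            counters[unit] = max(counters.get(unit, 0), count)
        match = GIVE_UP_RE.search(line)
        if match:
            failed.add(match.group(1))
    return counters, failed
```

Passing it the lines of saved `journalctl -u sssd` output would show how far the restart counter has climbed and which responder the monitor failed to restart.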

The version of `sssd` used in RHCOS is `sssd-0-2.6.2-3.el8-x86_64`.


We think this may be related to:

https://bugzilla.redhat.com/show_bug.cgi?id=1796466#c10
https://github.com/SSSD/sssd/issues/5753


The upstream PR:

https://github.com/SSSD/sssd/pull/6075

...may resolve this issue for us.


This currently prevents OpenShift clusters from being installed or started successfully.

Comment 1 Alexey Tikhonov 2022-04-12 14:44:49 UTC
Upstream PR: https://github.com/SSSD/sssd/pull/6108

Comment 2 Alexey Tikhonov 2022-04-14 09:40:10 UTC
Pushed PR: https://github.com/SSSD/sssd/pull/6108

* `master`
    * 3c6218aa91026e066e793ee26333ea64fd6bc50e - Revert "man: sssd.conf and sssd-ifp clarify user option"
    * 37f90057792a0b4543f34684ed9a240fe8e869c1 - Revert "usertools: force local user for sssd process user"
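To check whether a host matches the condition named in the bug summary (a missing local `sssd` user), one option is to look for the entry directly in the passwd file rather than going through NSS. This is a rough sketch under stated assumptions: it treats "local" as meaning an entry in `/etc/passwd`, which this report does not spell out, and the helper names `has_local_user` and `check_host` are hypothetical. Python's `pwd.getpwnam` is deliberately not used, since it resolves through NSS and may find users from other sources.

```python
def has_local_user(passwd_text, name):
    """Return True if `name` has an entry in passwd(5)-formatted text."""
    for line in passwd_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # passwd(5) layout: name:password:UID:GID:GECOS:directory:shell
        if line.split(":", 1)[0] == name:
            return True
    return False


def check_host(path="/etc/passwd", name="sssd"):
    """Assumption: the local user database is the file at `path`."""
    with open(path, encoding="utf-8") as handle:
        return has_local_user(handle.read(), name)
```

On an affected host, `check_host()` would be expected to return `False` before the fix; a `True` result suggests the restart loop has a different cause.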

Comment 10 errata-xmlrpc 2022-11-15 11:17:22 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (sssd bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:8325