Description
Alexey Tikhonov
2022-04-06 16:51:10 UTC
This bug was initially created as a copy of Bug #2072050.
I am copying this bug to track the fix for RHEL 9.
As part of OpenShift 4.11, RHCOS recently switched to using the RHEL 8.6 Beta content and we started to see evidence of `sssd` crashing and restarting in a loop.
```
Apr 04 22:10:22 test1-4nhln-master-0 systemd[1]: sssd.service: Service RestartSec=100ms expired, scheduling restart.
Apr 04 22:10:22 test1-4nhln-master-0 systemd[1]: sssd.service: Scheduled restart job, restart counter is at 26.
Apr 04 22:10:22 test1-4nhln-master-0 systemd[1]: Stopped System Security Services Daemon.
Apr 04 22:10:22 test1-4nhln-master-0 systemd[1]: sssd.service: Consumed 532ms CPU time
Apr 04 22:10:22 test1-4nhln-master-0 systemd[1]: Starting System Security Services Daemon...
Apr 04 22:10:22 test1-4nhln-master-0 sssd[2615]: Starting up
Apr 04 22:10:22 test1-4nhln-master-0 sssd_be[2616]: Starting up
Apr 04 22:10:22 test1-4nhln-master-0 sssd_nss[2617]: Starting up
Apr 04 22:10:22 test1-4nhln-master-0 sssd_nss[2624]: Starting up
Apr 04 22:10:25 test1-4nhln-master-0 sssd_nss[2638]: Starting up
Apr 04 22:10:29 test1-4nhln-master-0 sssd_nss[2681]: Starting up
Apr 04 22:10:29 test1-4nhln-master-0 sssd[2615]: Exiting the SSSD. Could not restart critical service [nss].
Apr 04 22:10:29 test1-4nhln-master-0 systemd[1]: sssd.service: Main process exited, code=exited, status=1/FAILURE
Apr 04 22:10:29 test1-4nhln-master-0 systemd[1]: sssd.service: Failed with result 'exit-code'.
Apr 04 22:10:29 test1-4nhln-master-0 systemd[1]: Failed to start System Security Services Daemon.
Apr 04 22:10:29 test1-4nhln-master-0 systemd[1]: sssd.service: Consumed 510ms CPU time
Apr 04 22:10:29 test1-4nhln-master-0 systemd[1]: sssd.service: Service RestartSec=100ms expired, scheduling restart.
Apr 04 22:10:29 test1-4nhln-master-0 systemd[1]: sssd.service: Scheduled restart job, restart counter is at 27.
Apr 04 22:10:29 test1-4nhln-master-0 systemd[1]: Stopped System Security Services Daemon.
Apr 04 22:10:29 test1-4nhln-master-0 systemd[1]: sssd.service: Consumed 510ms CPU time
Apr 04 22:10:29 test1-4nhln-master-0 systemd[1]: Starting System Security Services Daemon...
Apr 04 22:10:29 test1-4nhln-master-0 sssd[2683]: Starting up
Apr 04 22:10:29 test1-4nhln-master-0 sssd_be[2684]: Starting up
Apr 04 22:10:29 test1-4nhln-master-0 sssd_nss[2685]: Starting up
Apr 04 22:10:29 test1-4nhln-master-0 sssd_nss[2686]: Starting up
```
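For anyone triaging similar journal output, here is a minimal sketch that pulls the highest restart counter for a unit out of log lines shaped like the excerpt above. The helper name `max_restart_counter` is hypothetical and not part of sssd or systemd tooling, and the regular expression assumes the exact message wording shown in this log.

```python
import re

# Assumption: systemd logs restarts with the wording seen in the excerpt above,
# e.g. "sssd.service: Scheduled restart job, restart counter is at 26."
RESTART_RE = re.compile(
    r"(?P<unit>\S+\.service): Scheduled restart job, "
    r"restart counter is at (?P<count>\d+)\."
)


def max_restart_counter(journal_lines, unit="sssd.service"):
    """Return the highest restart counter logged for `unit`, or 0 if none."""
    highest = 0
    for line in journal_lines:
        match = RESTART_RE.search(line)
        if match and match.group("unit") == unit:
            highest = max(highest, int(match.group("count")))
    return highest


# Sample lines copied from the log excerpt in this report.
sample = [
    "Apr 04 22:10:22 test1-4nhln-master-0 systemd[1]: sssd.service: Scheduled restart job, restart counter is at 26.",
    "Apr 04 22:10:29 test1-4nhln-master-0 sssd[2615]: Exiting the SSSD. Could not restart critical service [nss].",
    "Apr 04 22:10:29 test1-4nhln-master-0 systemd[1]: sssd.service: Scheduled restart job, restart counter is at 27.",
]

print(max_restart_counter(sample))
```

A counter that keeps climbing within a few seconds, as it does here, suggests the service is failing on every start rather than recovering.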
The version of `sssd` used in RHCOS is `sssd-0-2.6.2-3.el8-x86_64`.
We think this may be related to:
https://bugzilla.redhat.com/show_bug.cgi?id=1796466#c10
https://github.com/SSSD/sssd/issues/5753
The upstream PR:
https://github.com/SSSD/sssd/pull/6075
...may resolve this issue for us.
This is currently blocking the ability for OpenShift clusters to be installed/started successfully.
Pushed PR: https://github.com/SSSD/sssd/pull/6108
* `master`
* 3c6218aa91026e066e793ee26333ea64fd6bc50e - Revert "man: sssd.conf and sssd-ifp clarify user option"
* 37f90057792a0b4543f34684ed9a240fe8e869c1 - Revert "usertools: force local user for sssd process user"
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory (sssd bug fix and enhancement update), and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2022:8325