Description Aleksandr Sharov
2022-11-29 10:19:52 UTC
Created attachment 1928235
debug sssd logs
Description of problem:
After severe load on the system, including an OOM state, sssd's D-Bus loses the ability to talk with its data providers:
* ... skipping repetitive backtrace ...
(2022-11-26 12:18:08): [be[default]] [dp_client_handshake_timeout] (0x0040): Client [sssd.pam] timed out before identification [0x55cc9111b620]!
* ... skipping repetitive backtrace ...
(2022-11-26 12:18:08): [be[default]] [dp_client_handshake_timeout] (0x0040): Client [sssd.nss] timed out before identification [0x55cc91118770]!
This state remains until the sssd service is restarted.
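For triage, the timeout messages above can be tallied per client with a small script. This is only a sketch: it assumes the backend log lines follow exactly the format quoted above, and the function name `count_handshake_timeouts` is made up for illustration and is not part of sssd.

```python
import re
from collections import Counter

# Matches the dp_client_handshake_timeout message format quoted in this report.
# Assumption: real log lines use the same layout as the excerpt above.
TIMEOUT_RE = re.compile(
    r"\((?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\): "
    r"\[be\[(?P<domain>[^\]]+)\]\] "
    r"\[dp_client_handshake_timeout\] \(0x[0-9a-fA-F]+\): "
    r"Client \[(?P<client>[^\]]+)\] timed out before identification"
)


def count_handshake_timeouts(lines):
    """Count handshake timeouts per (domain, client) pair.

    `lines` is any iterable of log lines, e.g. an open file object.
    Lines that do not match the timeout message are ignored.
    """
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[(match.group("domain"), match.group("client"))] += 1
    return counts
```

Passing an open backend log file to `count_handshake_timeouts` shows which responders (sssd.pam, sssd.nss, ...) failed to identify and how often, which may help confirm whether all clients are affected after the load spike.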
Version-Release number of selected component (if applicable):
RHEL 8.6, sssd-2.6.2-4.el8_6.x86_64
How reproducible:
Couldn't reproduce the state in the lab. In the client's environment, it happens every 24 hours when a custom cron script runs.
Steps to Reproduce:
1.
2.
3.
Actual results:
SSSD does not recover automatically.
Expected results:
SSSD continues to work as expected after the system is out of heavy load and/or OOM.
Additional info:
Debug logs attached (nss log trimmed to the last 2000 lines; full logs are in the case). The issue happened around 5:39-5:41 AM; it is difficult to narrow down further because the system appears to have been hung at the time.
Additional data and sosreport are in the attached case.