Bug 2116487

Summary: cache_req_data_set_hybrid_lookup: cache_req_data should never be NULL
Product: Red Hat Enterprise Linux 8 Reporter: Alexey Tikhonov <atikhono>
Component: sssdAssignee: jstephen
Status: CLOSED ERRATA QA Contact: Anuj Borah <aborah>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 8.7CC: aboscatt, pbrezina
Target Milestone: rcKeywords: Triaged
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: sync-to-jira
Fixed In Version: sssd-2.7.3-2.el8 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-11-08 10:51:34 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Alexey Tikhonov 2022-08-08 15:37:53 UTC
This bug was initially created as a copy of Bug #2098654

I am copying this bug because: to track fix for RHEL8



Description of problem:

The cache_req_data_set_hybrid_lookup() is flooding the logs with "cache_req_data should never be NULL". It appears to be called from hybrid_domain_retry_data() as fall-thru where the CR type is neither CACHE_REQ_GROUP_BY_ID/CACHE_REQ_GROUP_BY_NAME tho its not clear what it actually is in this case.

[nss] [cache_req_data_set_hybrid_lookup] (0x0020): cache_req_data should never be NULL
********************** PREVIOUS MESSAGE WAS TRIGGERED BY THE FOLLOWING BACKTRACE:
   *  (2022-06-03 16:10:44): [nss] [cache_req_validate_domain_type] (0x2000): Request type POSIX-only for domain example.net type POSIX is valid
   *  (2022-06-03 16:10:44): [nss] [cache_req_set_domain] (0x0400): CR #503: Using domain [example.net]
   *  (2022-06-03 16:10:44): [nss] [cache_req_search_send] (0x0400): CR #503: Looking up UID:0
   *  (2022-06-03 16:10:44): [nss] [cache_req_search_ncache] (0x0400): CR #503: Checking negative cache for [UID:0]
   *  (2022-06-03 16:10:44): [nss] [sss_ncache_check_str] (0x2000): Checking negative cache for [NCE/UID/example.net/0]
   *  (2022-06-03 16:10:44): [nss] [sss_ncache_check_str] (0x2000): Checking negative cache for [NCE/UID/0]
   *  (2022-06-03 16:10:44): [nss] [cache_req_search_ncache] (0x0400): CR #503: [UID:0] does not exist (negative cache)

[... whole lot of other CRs skipped but they all the same UID:0 CRs just like the one above]

********************** BACKTRACE DUMP ENDS HERE *********************************

[nss] [cache_req_data_set_hybrid_lookup] (0x0020): cache_req_data should never be NULL
   *  ... skipping repetitive backtrace ...

How reproducible:

We do not have an in-house reproducer for this however it is constantly happening on the customer's IPA installation so if you need additional debug logging this can be requested as long as the expected logging overhead would still allow the system to function.

Expected results:

Even if the message is harmless and does not affect the system function it should not be flooding the logs at the default log level. It would make sense for hybrid_domain_retry_data() to catch the fall-thru and do 2 things: 1) log the CR type and anything else helpful in this case; 2) apply additional logic to stop sssd from repeating the same CR requests ad infinitum.

Comment 1 Alexey Tikhonov 2022-08-08 16:18:54 UTC
Pushed PR: https://github.com/SSSD/sssd/pull/6291

* `master`
    * 96a1dce8096d45e986ab01aaac11d8c77c36d1d7 - CACHE_REQ: Fix hybrid lookup log spamming
* `sssd-2-7`
    * 49eb871847a94311bbd2190a315230e4bae1ea2c - CACHE_REQ: Fix hybrid lookup log spamming

Comment 6 errata-xmlrpc 2022-11-08 10:51:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (sssd bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:7739