Bug 2217921
| Summary: | nscd aborts with failed assert in prune_cache | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | yanf |
| Component: | glibc | Assignee: | glibc team <glibc-bugzilla> |
| Status: | CLOSED MIGRATED | QA Contact: | qe-baseos-tools-bugs |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 8.8 | CC: | ashankar, casantos, codonell, cww, dj, fweimer, jwright, mijjapur, pfrankli, sipoyare |
| Target Milestone: | rc | Keywords: | Bugfix, MigratedToJIRA, Triaged |
| Target Release: | --- | Flags: | mijjapur: needinfo? (yanf) |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2023-08-11 14:43:46 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |
Obviously, if I clear the cache, the problem goes away. I have the problematic passwd cache file, but can't post it here for obvious reasons. I might be able to send it directly under our mutual NDA.

Created attachment 1972857 [details]
*actual* nscd backtrace
The previous bt was a related one from crond, but I was able to repro the issue under gdb, which gave this backtrace.

Created attachment 1972871 [details]
nscd backtrace with all symbols
After adding the missing debug symbols package
If you are a Red Hat customer with an active subscription, please visit the Red Hat Customer Portal [1] for assistance with your issue.
[1] http://access.redhat.com/

I'm providing the required link to the support ticket in the customer portal.

I looked at this for some time and I'm still not sure what might be causing this. We need some sort of reproducer, or at least the corrupted mapping that triggers this. This issue seems different from the known concurrency issues (which I think cannot happen on x86-64 due to its strong memory model). I wonder if it could be caused by inconsistent data coming back from LDAP, triggering expiration of cache entries that is not time-based and hence triggering an assert.

@fweimer I uploaded the corrupt nscd passwd file to RH case number 03548682, aes-256-cbc encrypted. You will need a password to decrypt it. Please reach out out of band.

@yanf Hello Yan! I have updated the support ticket and am repeating the same information here for your reference. Please update the support ticket with the requested information, and we will take this further. Thanks! - Murali

====================================================
>>Hello Yan!
>>Thank you for updating the support ticket.
>>I see that you want to share the decryption password for the file that you have shared here on the support ticket as well as on the BugZilla ticket.
>>I understand that you want to share the password out of band over email. However, this is not a recommended process.
>>It is best to keep all the communication and information on the support portal for security reasons, and tracking purposes.
>>I had a word with Florian, the engineer working on the bug ticket, to get a better understanding of the progress we have had so far on the issue.
>>Let's work on this together to share the password over an alternate secure medium, so the engineer can access it and work on the issue.
>>I tried calling you on the number that we have on file for your contact - "2124780000". But it looks like a dummy placeholder number, and I wasn't able to reach you.
>>Could you provide your contact number along with country code to reach you and discuss this further?
>>Awaiting your response.
>>Thank you!
>>Regards,
>>Murali Prudhvi.
====================================================

This BZ has been automatically migrated to the issues.redhat.com Red Hat Issue Tracker. All future work related to this report will be managed there.

To find the migrated issue, look in the "Links" section for a direct link to the new issue location. The issue key will have an icon of 2 footprints next to it, and begin with "RHEL-" followed by an integer. You can also find this issue by visiting https://issues.redhat.com/issues/?jql= and searching the "Bugzilla Bug" field for this BZ's number, e.g. a search like:

"Bugzilla Bug" = 1234567

In the event you have trouble locating or viewing this issue, you can file an issue by sending mail to rh-issues.
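The search described in the migration notice can be scripted. The following is a minimal sketch that builds the search URL from the base address and the "Bugzilla Bug" query given in the notice. The function name `migrated_issue_search_url` is hypothetical, and whether the tracker accepts the percent-encoded query in exactly this form is an assumption, not something confirmed in this report.

```python
from urllib.parse import quote

# Base search URL quoted in the migration notice.
JQL_BASE = "https://issues.redhat.com/issues/?jql="


def migrated_issue_search_url(bz_number: int) -> str:
    """Build a search URL for the issue migrated from a Bugzilla bug.

    Hypothetical helper: it only assembles the query shown in the notice,
    '"Bugzilla Bug" = <number>', and percent-encodes it for use in a URL.
    """
    jql = f'"Bugzilla Bug" = {bz_number}'
    return JQL_BASE + quote(jql, safe="")


# Example with this report's own bug number.
url = migrated_issue_search_url(2217921)
print(url)
```

Opening the printed URL in a browser should be equivalent to typing the query into the tracker's search box by hand, assuming the field name is unchanged.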
Created attachment 1972848 [details]
bt full of ABRT event

Description of problem:
NSCD exits with ABRT when reading `passwd` cache.

Version-Release number of selected component (if applicable):
glibc-2.28-189.1.el8.x86_64

How reproducible:
always

Steps to Reproduce:
1. start nscd
2. ABRT almost immediately

Actual results:
strace shows :
[pid 219045] write(2</dev/null>, "nscd: cache.c:426: prune_cache: Assertion `dh->usable' failed.\n", 63) = 63

Expected results:
runs without error

Additional info:
Debug output (actual IDs redacted for safety / confidentiality) :
Mon 26 Jun 2023 06:55:18 PM EDT - 452466: Reloading "<redacted>" in user database cache!
Mon 26 Jun 2023 06:55:18 PM EDT - 452466: Reloading "<redacted>" in user database cache!
Mon 26 Jun 2023 06:55:18 PM EDT - 452466: Reloading "<redacted>" in user database cache!
Mon 26 Jun 2023 06:55:18 PM EDT - 452466: Reloading "<redacted>" in user database cache!
nscd: cache.c:426: prune_cache: Assertion `dh->usable' failed.

Back trace :
#0 0x00007fd2d11336cc in __nscd_get_map_ref () from /lib64/libc.so.6
#1 0x00007fd2d112fa7a in nscd_getpw_r () from /lib64/libc.so.6
#2 0x00007fd2d112feac in __nscd_getpwuid_r () from /lib64/libc.so.6
#3 0x00007fd2d10c6dbf in getpwuid_r@@GLIBC_2.2.5 () from /lib64/libc.so.6
#4 0x00007fd2d17d0976 in pam_modutil_getpwuid () from /lib64/libpam.so.0
#5 0x00007fd2cdd8cb12 in pam_sm_authenticate () from /usr/lib64/security/pam_succeed_if.so
#6 0x00007fd2d17ca7b4 in _pam_dispatch () from /lib64/libpam.so.0
#7 0x00005630c43259a3 in cron_close_pam ()
#8 0x00005630c43251cf in do_command ()
#9 0x00005630c4324170 in job_runqueue ()
#10 0x00005630c432193c in main ()

bt full attached.
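When collecting these aborts from several hosts, it can help to pull the source file, line, function and failed expression out of the message automatically. The sketch below parses the message exactly as it appears in the debug output above. The helper name `parse_assert_line` and the regular expression are illustrative assumptions, not part of nscd or glibc, and the pattern is only fitted to the layout seen in this report.

```python
import re

# Layout seen in this report:
#   <program>: <file>:<line>: <function>: Assertion `<expr>' failed.
ASSERT_RE = re.compile(
    r"^(?P<program>[^:]+): (?P<file>[^:]+):(?P<line>\d+): "
    r"(?P<function>[^:]+): Assertion `(?P<expr>.*)' failed\.$"
)


def parse_assert_line(line):
    """Return the fields of an assertion-failure line, or None if no match.

    Hypothetical helper for log triage; expects the raw stderr line,
    not the strace write(...) wrapper around it.
    """
    match = ASSERT_RE.match(line.strip())
    if match is None:
        return None
    info = match.groupdict()
    info["line"] = int(info["line"])  # store the line number as an integer
    return info


# The message from the debug output in this report.
sample = "nscd: cache.c:426: prune_cache: Assertion `dh->usable' failed."
parsed = parse_assert_line(sample)
print(parsed)
```

Ordinary log lines, such as the "Reloading" entries above, do not match the pattern and yield `None`, so the helper can be run over a whole log file.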