Bug 1186548
| Summary: | ns-slapd crash in shutdown phase | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Scott Poore <spoore> |
| Component: | 389-ds-base | Assignee: | Noriko Hosoi <nhosoi> |
| Status: | CLOSED ERRATA | QA Contact: | Viktor Ashirov <vashirov> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 7.1 | CC: | amsharma, drieden, jgalipea, jkurik, nhosoi, nkinder, rmeggins, spoore |
| Target Milestone: | rc | Keywords: | ZStream |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | 389-ds-base-1.3.4.0-1.el7 | Doc Type: | Bug Fix |
| Doc Text: |
Cause: When the directory server was shut down while task threads were still running, there was a small window in which the main thread freed a task handle that a task thread still referred to.
Consequence: The server process crashed in the shutdown phase.
Fix: During shutdown, the main thread now waits for the task threads to finish completely. Task threads also check more frequently whether the server is shutting down.
Result: The server no longer crashes in the shutdown phase because of this race.
|
Story Points: | --- |
| Clone Of: | | | |
| : | 1195293 | Environment: | |
| Last Closed: | 2015-11-19 11:43:27 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1195293 | | |
| Attachments: |
|
||||||||||||||||
Created attachment 984921
abrt output email
Upstream ticket: https://fedorahosted.org/389/ticket/48005
Created attachment 987294
pthread_mutex_lock lock abrt report
Created attachment 987296
send_ldap_result_ext crash abrt report
Created attachment 987297
slapi_get_object_extension crash abrt report
Created attachment 987299
strlen crash abrt report
Created attachment 987301
strlen_sse2_pminub crash abrt report
Hi Noriko, your previous comment states that it's not possible to reproduce the crash with standalone 389-ds-base. Can you please share any pointers for verifying this bug with a 389-ds-base-only setup?

(In reply to Sankar Ramalingam from comment #23)
> Hi Noriko, your previous comment states that it's not possible to reproduce
> the crash with standalone 389-ds-base. Can you please share any pointers for
> verifying this bug with a 389-ds-base-only setup?

If you are interested, take a look at the CI test:
dirsrvtests/tickets/ticket48005_test.py
But I haven't seen the crash even on a server without the patch. Scott had to repeat his IPA upgrade test many times to reproduce it. And to verify the fix, he repeated it even more... (Thanks, Scott!!)

Based on comment https://bugzilla.redhat.com/show_bug.cgi?id=1186548#c24

============================================= test session starts ==============================================
platform linux2 -- Python 2.7.5 -- py-1.4.30 -- pytest-2.7.2 -- /bin/python
rootdir: /export/ds/dirsrvtests/tickets, inifile:
collected 7 items

ticket48005_test.py::test_ticket48005_setup PASSED
ticket48005_test.py::test_ticket48005_memberof PASSED
ticket48005_test.py::test_ticket48005_automember PASSED
ticket48005_test.py::test_ticket48005_syntaxvalidate PASSED
ticket48005_test.py::test_ticket48005_usn PASSED
ticket48005_test.py::test_ticket48005_schemareload PASSED
ticket48005_test.py::test_ticket48005_final PASSED

========================================== 7 passed in 432.63 seconds ==========================================

Hence marking as VERIFIED.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://rhn.redhat.com/errata/RHBA-2015-2351.html |
Description of problem:

abrt_version: 2.1.11
backtrace_rating: 4
cmdline: /usr/sbin/ns-slapd -D /etc/dirsrv/slapd-TESTRELM-TEST -i /var/run/dirsrv/slapd-TESTRELM-TEST.pid -w /var/run/dirsrv/slapd-TESTRELM-TEST.startpid
crash_function: __strlen_sse2_pminub
executable: /usr/sbin/ns-slapd
hostname: cloud-qe-15.testrelm.test
kernel: 3.10.0-123.el7.x86_64
last_occurrence: 1422390267
pid: 21902
pkg_arch: x86_64
pkg_epoch: 0
pkg_name: 389-ds-base
pkg_release: 13.el7
pkg_version: 1.3.3.1
pwd: /var/log/dirsrv/slapd-TESTRELM-TEST
runlevel: N 3
time: Tue 27 Jan 2015 03:24:27 PM EST
uid: 996
username: dirsrv

Noriko from the DS Dev team helped by troubleshooting a similar core file to find the issue:

It looks like the server crashed in the shutdown phase because memberof_fixup_task_thread was still running (about to finish, though). Obviously, there is a bug in the task destruction handling: it does not wait long enough before calling the task destructor.

(gdb) bt
#0 0x00007fa809846f02 in plugin_call_plugins (pb=pb@entry=0x7fa80aaa84c0, whichfunction=whichfunction@entry=503) at ldap/servers/slapd/plugin.c:365
#1 0x00007fa80983b87a in op_shared_search (pb=pb@entry=0x7fa80aaa84c0, send_result=send_result@entry=1) at ldap/servers/slapd/opshared.c:991
#2 0x00007fa80984ad0e in search_internal_callback_pb (pb=pb@entry=0x7fa80aaa84c0, callback_data=callback_data@entry=0x7fff1db11640, prc=prc@entry=0x7fa809849e60 <internal_plugin_result_callback>, psec=psec@entry=0x7fa809849eb0 <internal_plugin_search_entry_callback>, prec=prec@entry=0x7fa809849e70 <internal_plugin_search_referral_callback>) at ldap/servers/slapd/plugin_internal_op.c:812
#3 0x00007fa80984afa8 in search_internal_pb (pb=pb@entry=0x7fa80aaa84c0) at ldap/servers/slapd/plugin_internal_op.c:665
#4 0x00007fa80984b203 in slapi_search_internal_pb (pb=pb@entry=0x7fa80aaa84c0) at ldap/servers/slapd/plugin_internal_op.c:574
#5 0x00007fa7fcd8b87b in update_integrity (origSDN=0x7fa80a9bac80, newrDN=newrDN@entry=0x0, newsuperior=newsuperior@entry=0x0, logChanges=<optimized out>) at ldap/servers/plugins/referint/referint.c:1198
#6 0x00007fa7fcd8d142 in referint_postop_del (pb=0x7fa80aaa44f0) at ldap/servers/plugins/referint/referint.c:688
#7 0x00007fa809846db0 in plugin_call_func (list=0x7fa80a46d920, operation=operation@entry=563, pb=pb@entry=0x7fa80aaa44f0, call_one=call_one@entry=0) at ldap/servers/slapd/plugin.c:1952
#8 0x00007fa809847008 in plugin_call_list (pb=0x7fa80aaa44f0, operation=563, list=<optimized out>) at ldap/servers/slapd/plugin.c:1886
#9 plugin_call_plugins (pb=pb@entry=0x7fa80aaa44f0, whichfunction=whichfunction@entry=563) at ldap/servers/slapd/plugin.c:459
#10 0x00007fa809804a3d in dse_delete (pb=0x7fa80aaa44f0) at ldap/servers/slapd/dse.c:2547
#11 0x00007fa8097fb000 in op_shared_delete (pb=pb@entry=0x7fa80aaa44f0) at ldap/servers/slapd/delete.c:364
#12 0x00007fa8097fb1b2 in delete_internal_pb (pb=pb@entry=0x7fa80aaa44f0) at ldap/servers/slapd/delete.c:242
#13 0x00007fa8097fb463 in slapi_delete_internal_pb (pb=pb@entry=0x7fa80aaa44f0) at ldap/servers/slapd/delete.c:185
#14 0x00007fa80986e910 in destroy_task (when=when@entry=0, arg=<optimized out>) at ldap/servers/slapd/task.c:678
#15 0x00007fa809873987 in task_shutdown () at ldap/servers/slapd/task.c:2536 <== in the shutdown procedure
#16 0x00007fa809d18d99 in slapd_daemon (ports=ports@entry=0x7fff1db140b0) at ldap/servers/slapd/daemon.c:1334
#17 0x00007fa809d0b17c in main (argc=7, argv=0x7fff1db146d8) at ldap/servers/slapd/main.c:1279
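
For readers who want to see the shape of the fix described in the Doc Text (the main thread waits for the task threads to finish, and the task threads check the shutdown flag more often), here is a minimal sketch in Python. It only illustrates the pattern and is not the actual C patch to ldap/servers/slapd/task.c. `Task`, `TaskManager`, and all of their members are hypothetical names made up for this example; only `task_shutdown` borrows its name from frame #15 in the backtrace.

```python
import threading
import time


class Task:
    """Hypothetical stand-in for a task handle (not the real 389-ds task API)."""

    def __init__(self, name):
        self.name = name
        self.destroyed = False
        self.units_done = 0

    def destroy(self):
        # In the real server this is where the handle would be freed.
        self.destroyed = True


class TaskManager:
    """Hypothetical owner of the task threads and the shutdown flag."""

    def __init__(self):
        self.shutdown = threading.Event()
        self.errors = []       # use-after-destroy events seen by workers
        self._workers = []     # (task, thread) pairs

    def start(self, task, units=1000):
        thread = threading.Thread(target=self._run, args=(task, units))
        self._workers.append((task, thread))
        thread.start()

    def _run(self, task, units):
        for _ in range(units):
            # Check the shutdown flag on every unit of work so the thread
            # stops promptly once shutdown has begun.
            if self.shutdown.is_set():
                return
            if task.destroyed:
                # This is the situation the bug report describes.
                self.errors.append(task.name)
                return
            task.units_done += 1
            time.sleep(0.001)

    def task_shutdown(self):
        self.shutdown.set()
        # Wait for each task thread to finish BEFORE destroying its handle.
        for task, thread in self._workers:
            thread.join()
            task.destroy()
        self._workers.clear()
```

The ordering inside `task_shutdown` is the point of the sketch: destroying the handle before `join()` returns would reopen the window in which a still-running thread touches a freed handle.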