Bug 1465383
Summary: | Segmentation fault in valueset_array_to_sorted_quick | | |
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Viktor Ashirov <vashirov> |
Component: | 389-ds-base | Assignee: | mreynolds |
Status: | CLOSED ERRATA | QA Contact: | Viktor Ashirov <vashirov> |
Severity: | urgent | Docs Contact: | |
Priority: | urgent | | |
Version: | 7.4 | CC: | lkrispen, msauton, nkinder, rmeggins, tbordaz |
Target Milestone: | rc | Keywords: | ZStream |
Target Release: | --- | | |
Hardware: | All | | |
OS: | Linux | | |
Whiteboard: | | | |
Fixed In Version: | 389-ds-base-1.3.7.5-10.el7 | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | | |
: | 1479757 (view as bug list) | Environment: | |
Last Closed: | 2018-04-10 14:18:12 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Bug Depends On: | | | |
Bug Blocks: | 1479757 | | |
Description
Viktor Ashirov
2017-06-27 10:50:14 UTC
- The server crashes because valueset_value_syntax_cmp is unable to compare values.

  valueset_array_to_sorted_quick loops over a valueset, swapping values in order to sort it. Although the number of elements in the valueset is fixed, one of the loops does not end until valueset_value_syntax_cmp returns a value >= 0. The problem is that valueset_value_syntax_cmp also returns a negative value in case of failure, so there is no way to tell "v1 < v2" apart from "valueset_value_syntax_cmp cannot compare v1 and v2".

    (gdb) where
    #0  0x00007fda01637bca in valueset_array_to_sorted_quick (a=0x7fff176b2018, vs=vs@entry=0x7fff176b1fb8, low=low@entry=0, high=9) at ldap/servers/slapd/valueset.c:1057
    #1  0x00007fda01637cb2 in valueset_array_to_sorted (a=a@entry=0x7fff176b2018, vs=vs@entry=0x7fff176b1fb8) at ldap/servers/slapd/valueset.c:1034
    #2  0x00007fda0163822c in slapi_valueset_add_attr_valuearray_ext (a=a@entry=0x7fff176b2018, vs=vs@entry=0x7fff176b1fb8, addvals=addvals@entry=0x7fff176b1b90, naddvals=naddvals@entry=1, flags=flags@entry=1, dup_index=dup_index@entry=0x0) at ldap/servers/slapd/valueset.c:1181
    #3  0x00007fda016383a4 in slapi_valueset_add_attr_value_ext (a=a@entry=0x7fff176b2018, vs=vs@entry=0x7fff176b1fb8, addval=<optimized out>, flags=flags@entry=1) at ldap/servers/slapd/valueset.c:1231
    #4  0x00007fda015c1537 in str2entry_dupcheck (rawdn=rawdn@entry=0x0, s=<optimized out>, s@entry=0x55c2c4a08000 "# 00core.ldif - Required Schema", flags=flags@entry=192, read_stateinfo=read_stateinfo@entry=-1) at ldap/servers/slapd/entry.c:1119
    #5  0x00007fda015c1d85 in slapi_str2entry (s=s@entry=0x55c2c4a08000 "# 00core.ldif - Required Schema", flags=flags@entry=192) at ldap/servers/slapd/entry.c:1344
    #6  0x00007fda015b9645 in dse_read_one_file (pdse=pdse@entry=0x55c2c4928190, filename=0x55c2c4910a50 "/usr/share/dirsrv/schema/00core.ldif", pb=pb@entry=0x7fff176b8010, primary_file=primary_file@entry=0) at ldap/servers/slapd/dse.c:764
    #7  0x00007fda015b99cd in dse_read_file (pdse=0x55c2c4928190, pb=pb@entry=0x7fff176b8010) at ldap/servers/slapd/dse.c:852
    #8  0x00007fda0161dbc5 in init_schema_dse_ext (schemadir=<optimized out>, be=be@entry=0x0, local_pschemadse=local_pschemadse@entry=0x7fda01894750 <pschemadse>, schema_flags=schema_flags@entry=16) at ldap/servers/slapd/schema.c:5375
    #9  0x00007fda0161e00c in init_schema_dse (configdir=configdir@entry=0x55c2c491de00 "/etc/dirsrv/slapd-example") at ldap/servers/slapd/schema.c:5447
    #10 0x000055c2c1f6d0b1 in setup_internal_backends (configdir=0x55c2c491de00 "/etc/dirsrv/slapd-example") at ldap/servers/slapd/fedse.c:1789
    #11 0x000055c2c1f5ac67 in main (argc=5, argv=0x7fff176b9758) at ldap/servers/slapd/main.c:772

    # valueset has 10 values
    (gdb) print *vs
    $58 = {num = 10, max = 16, sorted = 0x55c2c49d4280, va = 0x55c2c49d4200}

    # we are in the first loop valueset_value_cmp<0 (j was not yet decreased)
    (gdb) print high
    $60 = 9
    (gdb) print j
    $56 = 10

    # looping while valueset_value_cmp(a, vs->va[vs->sorted[i]], vs->va[pivot]) < 0
    # 'i' was getting above the vs->num number of values
    (gdb) print i
    $57 = 16

- When the crash occurs we can see these logs:

    [27/May/2017:15:56:28.098829688 -0400] - ERR - valueset_value_syntax_cmp - slapi_attr_values2keys_sv failed for type attributetypes

- The reason why valueset_value_cmp fails is unknown, but it is likely related to the fact that DS fails to retrieve/open dse.ldif. Note there are attempts to restore it from dse.ldif.tmp and dse.ldif.bak:

    [27/May/2017:15:56:28.091842082 -0400] - ERR - dse_check_file - The configuration file /etc/dirsrv/slapd-example/dse.ldif was not restored from backup /etc/dirsrv/slapd-example/dse.ldif.tmp, error -1
    [27/May/2017:15:56:28.091987636 -0400] - ERR - dse_check_file - The configuration file /etc/dirsrv/slapd-example/dse.ldif was not restored from backup /etc/dirsrv/slapd-example/dse.ldif.bak, error -1
    [27/May/2017:15:56:28.091996309 -0400] - ERR - slapd_bootstrap_config - The given config file /etc/dirsrv/slapd-example/dse.ldif could not be accessed, Netscape Portable Runtime error -5950 (File not found.)

- How to fix this
  - I guess that if restoring dse.ldif succeeds, valueset_value_syntax_cmp will succeed and there is no crash.
  - We can do hardening in valueset_array_to_sorted_quick so that 'i' stays <= high and 'j' stays >= 0. But if the loops hit that hardening, we know that the valueset is not sorted, yet this failure is not reported to the caller.
  - valueset_value_syntax_cmp should be able to report a failure.

I think if the dse.ldif does not exist, the syntax plugins will not be initialized and syntax_cmp will not work. The core issue seems to be the missing dse.ldif; maybe this is a variant of #49131 or #49298. I wouldn't invest too much in the valueset sorting and would rather try to get the dse.ldif stuff right. And if there is no dse.ldif, we should just stop and not attempt to read schema or whatever.

Is this fixed now with the dse.ldif fsync fix?

(In reply to wibrown from comment #3)
> Is this fixed now with the dse.ldif fsync fix?

This issue was found in the downstream bits; the fsync fix has not landed there yet. Also, I don't have a standalone reproducer to retest it with a scratch build.

Okay. I think we'll just have to assume that the fsync fix will work, and they have to wait for 1.3.7. Does that sound acceptable? Do you think this is important enough for a backport?

I think the fsync fix is good to have, especially since RHEL7 and the latest Fedoras use XFS as the default FS, and we saw this issue most often on XFS (this bug included). Though I'd really like to fix valueset_value_syntax_cmp so it doesn't assume a lot of things when dse.ldif is missing. I don't know how much effort/time it will take to fix and whether it is worth it.

We sure can try to harden valueset_value_syntax_cmp(), but if dse.ldif is missing we shouldn't even get to a place where we call it. So we should look into the startup code and stop in that case instead of continuing and running into these issues.
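The two hardening ideas raised in the description (keep 'i' and 'j' inside the range being sorted, and let the comparison report failure separately from ordering) can be sketched in C as follows. This is only an illustration under assumptions: `cmp_fn`, `demo_cmp`, `sort_index_range` and `build_sorted_index` are hypothetical names, plain strings stand in for Slapi_Value, and a NULL string models a value that cannot be compared. It is not the 389-ds-base code and not the change that was shipped for this bug.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical comparator contract: return 0 on success and store the
 * ordering (<0, 0, >0) in *order; return -1 if the values cannot be
 * compared. Failure can therefore never be mistaken for "a < b". */
typedef int (*cmp_fn)(const char *a, const char *b, int *order);

/* Stand-in comparator: a NULL value models a failed syntax compare. */
static int demo_cmp(const char *a, const char *b, int *order)
{
    if (a == NULL || b == NULL) {
        return -1;
    }
    *order = strcmp(a, b);
    return 0;
}

/* Hoare-style quicksort of idx[low..high], ordered by vals[idx[k]].
 * Returns 0 on success, -1 if a comparison failed or a scan left the
 * range, in which case idx must be treated as unsorted. */
static int sort_index_range(const char **vals, size_t *idx,
                            long low, long high, cmp_fn cmp)
{
    if (low >= high) {
        return 0;
    }
    size_t pivot = idx[low]; /* position in vals, unaffected by swaps */
    long i = low - 1;
    long j = high + 1;
    int order = 0;

    for (;;) {
        do {
            i++;
            if (i > high) return -1; /* bound check before any access */
            if (cmp(vals[idx[i]], vals[pivot], &order) != 0) return -1;
        } while (order < 0);
        do {
            j--;
            if (j < low) return -1;  /* bound check before any access */
            if (cmp(vals[idx[j]], vals[pivot], &order) != 0) return -1;
        } while (order > 0);
        if (i >= j) {
            break;
        }
        size_t tmp = idx[i];
        idx[i] = idx[j];
        idx[j] = tmp;
    }
    if (sort_index_range(vals, idx, low, j, cmp) != 0) {
        return -1;
    }
    return sort_index_range(vals, idx, j + 1, high, cmp);
}

/* Fill idx with 0..n-1 and sort it; the status is passed to the caller. */
static int build_sorted_index(const char **vals, size_t *idx, size_t n,
                              cmp_fn cmp)
{
    assert(n == 0 || (vals != NULL && idx != NULL));
    for (size_t k = 0; k < n; k++) {
        idx[k] = k;
    }
    if (n < 2) {
        return 0;
    }
    return sort_index_range(vals, idx, 0, (long)n - 1, cmp);
}
```

Calling `build_sorted_index(vals, idx, n, demo_cmp)` returns 0 and leaves `idx` in ascending order of the values when every comparison succeeds. It returns -1 as soon as one comparison fails, so the caller learns that the index is not sorted instead of the scan running past the end of the array.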
Build tested: 389-ds-base-1.3.7.5-10.el7.x86_64

With the dse* files missing, the server no longer crashes, but instead logs an EMERGENCY message:

    Nov 21 10:07:34 rhel7.example.com ds_systemd_ask_password_acl[11773]: grep: /etc/dirsrv/slapd-rhel7/dse.ldif: No such file or directory
    Nov 21 10:07:34 rhel7.example.com ns-slapd[11778]: [21/Nov/2017:10:07:34.312060355 -0500] - INFO - dse_check_file - The config /etc/dirsrv/slapd-rhel7/dse.ldif can not be accessed. Attempting restore ... (reason: 0)
    Nov 21 10:07:34 rhel7.example.com ns-slapd[11778]: [21/Nov/2017:10:07:34.312199471 -0500] - INFO - dse_check_file - The backup /etc/dirsrv/slapd-rhel7/dse.ldif.bak can not be accessed. Check it exists and permissions.
    Nov 21 10:07:34 rhel7.example.com ns-slapd[11778]: [21/Nov/2017:10:07:34.312205141 -0500] - ERR - slapd_bootstrap_config - No valid configurations can be accessed! You must restore /etc/dirsrv/slapd-rhel7/dse.ldif from backup!
    Nov 21 10:07:34 rhel7.example.com ns-slapd[11778]: [21/Nov/2017:10:07:34.312209129 -0500] - EMERG - main - The configuration files in directory /etc/dirsrv/slapd-rhel7 could not be read or were not found. Please refer to the error log or output for more information.
    Nov 21 10:07:34 rhel7.example.com systemd[1]: dirsrv: main process exited, code=exited, status=1/FAILURE
    Nov 21 10:07:34 rhel7.example.com systemd[1]: Failed to start 389 Directory Server rhel7..
Automated tests also pass:

    ============================= test session starts ==============================
    platform linux -- Python 3.6.3, pytest-3.2.5, py-1.5.2, pluggy-0.4.0 -- /opt/rh/rh-python36/root/usr/bin/python3
    cachedir: .cache
    metadata: {'Python': '3.6.3', 'Platform': 'Linux-3.10.0-768.el7.x86_64-x86_64-with-redhat-7.5-Maipo', 'Packages': {'pytest': '3.2.5', 'py': '1.5.2', 'pluggy': '0.4.0'}, 'Plugins': {'metadata': '1.5.0', 'html': '1.16.0'}}
    389-ds-base: 1.3.7.5-10.el7
    nss: 3.34.0-0.1.beta1.el7
    nspr: 4.17.0-1.el7
    openldap: 2.4.44-9.el7
    svrcore: 4.1.3-2.el7
    rootdir: /export/tests, inifile:
    plugins: metadata-1.5.0, html-1.16.0
    collected 2 items

    suites/config/removed_config_49298_test.py::test_restore_config PASSED
    suites/config/removed_config_49298_test.py::test_removed_config PASSED

    ========================== 2 passed in 16.16 seconds ===========================

Marking as VERIFIED.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0811