Bug 1518320
Upstream ticket: https://pagure.io/389-ds-base/issue/49477 Hey, I'm handling the fts on case 01964873. Is this patch for the cu or german? I know the cu already has patched their system with the other debug patches. Is there a process to remove the previous ones or will it require a new install to be setup? thanks. (In reply to joel from comment #76) > Hey, > > I'm handling the fts on case 01964873. > > Is this patch for the cu or german? > > I know the cu already has patched their system with the other debug patches. > Is there a process to remove the previous ones or will it require a new > install to be setup? > It's for the customer, but it should applied on the default version of 1.3.6 that the customer was using before they applied any debug patches. So a fresh hotfix needs to be made with this patch, and only this patch. Created attachment 1477288 [details]
logs
Created attachment 1477295 [details]
thread apply all full bt
Created attachment 1477309 [details]
Amendment to debug patch
Here is a small change to the current debug patch. This should be applied on top of the last debug patch
can you search the entry (or its tombstone) with nscpentrywsi, that should also show what was done to this entry but in the coredump we have: [12/Sep/2018:13:45:16.509020958 -0300] conn=2412189 op=199100 SRCH base="cn=27263152560,ou=Nuevos,dc=abierto,dc=anses,dc=gov,dc=ar" scope=0 filter="(objectClass=*)" attrs="objectClass" [12/Sep/2018:13:45:16.509134066 -0300] conn=2412189 op=199100 RESULT err=0 tag=101 nentries=1 etime=0 and the DEL caused the crash, so where is it gone ? was it deleted successfully after the restart ? could you check the access logs after restart ? upstream ticket pushed (https://pagure.io/389-ds-base/issue/49967) => POST *** Bug 1633111 has been marked as a duplicate of this bug. *** Created attachment 1511788 [details]
New debug patch to be applied on top of 389-ds-base-1.3.6.1-29bz1518320
Created attachment 1513098 [details]
New debug patch to be applied on top of 389-ds-base-1.3.6.1-29
This patch contains the modrdn entry cache fix (ticket 49967), and debug logging. It will apply on top of 389-ds-base-1.3.6.1-29
Created attachment 1513099 [details]
New debug patch to be applied on top of 389-ds-base-1.3.6.1-29
Revised patch: contains fix for modrdn entry cache, and debug logging
*** Bug 1633111 has been marked as a duplicate of this bug. *** =============================================================================================== test session starts ===============================================================================================
platform linux -- Python 3.6.3, pytest-4.4.1, py-1.8.0, pluggy-0.9.0 -- /opt/rh/rh-python36/root/usr/bin/python3
cachedir: .pytest_cache
metadata: {'Python': '3.6.3', 'Platform': 'Linux-3.10.0-1034.el7.x86_64-x86_64-with-redhat-7.7-Maipo', 'Packages': {'pytest': '4.4.1', 'py': '1.8.0', 'pluggy': '0.9.0'}, 'Plugins': {'metadata': '1.8.0', 'html': '1.20.0'}}
389-ds-base: 1.3.9.1-4.el7
nss: 3.43.0-6.el7
nspr: 4.21.0-1.el7
openldap: 2.4.44-21.el7_6
cyrus-sasl: 2.1.26-23.el7
FIPS: disabled
rootdir: /mnt/tests/rhds/tests/upstream/ds/dirsrvtests/tests/suites/memberof_plugin
plugins: metadata-1.8.0, html-1.20.0
collected 6 items / 5 deselected / 1 selected
regression_test.py::test_entrycache_on_modrdn_failure PASSED [100%]
============================================================================== 1 passed, 5 deselected in 24.58 seconds ============================================================================================
Created attachment 1580675 [details]
the two diffs I have used to deliver the test RPM, backported to 7.4
Created attachment 1582849 [details]
Cumulative patch of all entry cache fixes for 389-ds-base-1.3.6
Created attachment 1583290 [details]
Cumulative patch of all entry cache fixes for 389-ds-base-1.3.6 (v2)
Revised patch. This patch can be built on top of 1.3.6.1-29 and delivered to customer
Any news from customer on the status of the latest patch? Have they seen this text in the errors log since deploying the new rpms: "avoided crash" The customer has not experienced any new crashes/downtime since the patch was installed. Will enquire as per the error logs you comment. Just to verify, are these supposed to be found in /var/log/dirsrv/$instance_name/errors ? (In reply to Facundo Jaspe from comment #204) > The customer has not experienced any new crashes/downtime since the patch > was installed. > Great! Just curious, how long ago was the patch actually installed? > Will enquire as per the error logs you comment. Just to verify, are these > supposed to be found in /var/log/dirsrv/$instance_name/errors ? Yes. Thanks, Mark Thank you Mark, you made it happen :-) The patch was installed 9 days ago. Considering it was crashing several times a day in recent times, it's a good sign. The customer management wants to wait up to 15 days to confirm everything is okay (originally, these crashed happened only every 2 or 3 months, but that'd be too much waiting). Customer confirms there hasn't been any new crashes. However, asking about the "avoided crash" messages: [root@ansescvss01 ~]# grep "avoided crash" /var/log/dirsrv/slapd-ansescvss01/errors [08/Jul/2019:17:51:03.246142528 -0300] - ALERT - ldbm_back_delete - DEBUG [retry: 1] - tombstone NOT in cache (nsuniqueid=c952911d-9c5a11e6-99f9afaa-6e3226cf,cn=27045860294,ou=activos,dc=abierto,dc=anses,dc=gov,dc=ar) orig dn (cn=27045860294,ou=Activos,dc=abierto,dc=anses,dc=gov,dc=ar) state (3), but it should be in the cache (avoided crash, but cache was corrupted) [09/Jul/2019:22:50:02.203984180 -0300] - ALERT - ldbm_back_delete - DEBUG [retry: 1] - tombstone NOT in cache (nsuniqueid=25beef02-c5d711e7-983b8eea-47ccf4be,cn=27181937705,ou=activos,dc=abierto,dc=anses,dc=gov,dc=ar) orig dn (cn=27181937705,ou=Activos,dc=abierto,dc=anses,dc=gov,dc=ar) state (3), but it should be in the cache (avoided crash, but cache was corrupted) [10/Jul/2019:07:42:46.614715017 -0300] - ALERT - ldbm_back_delete - DEBUG [retry: 1] - tombstone NOT in cache (nsuniqueid=85d1b188-fc6811e7-a36e8eea-47ccf4be,cn=23127210624,ou=activos,dc=abierto,dc=anses,dc=gov,dc=ar) orig dn (cn=23127210624,ou=Activos,dc=abierto,dc=anses,dc=gov,dc=ar) state (3), but it should be in the cache (avoided crash, but cache was corrupted) [10/Jul/2019:10:07:45.804045134 -0300] - ALERT - ldbm_back_delete - DEBUG [retry: 1] - tombstone NOT in cache (nsuniqueid=e43e2119-159811e9-aacc8eea-47ccf4be,cn=27491861415,ou=activos,dc=abierto,dc=anses,dc=gov,dc=ar) orig dn (cn=27491861415,ou=Activos,dc=abierto,dc=anses,dc=gov,dc=ar) state (3), but it should be in the cache (avoided crash, but cache was corrupted) [10/Jul/2019:10:13:58.503394544 -0300] - ALERT - ldbm_back_delete - DEBUG [retry: 1] - tombstone NOT in cache (nsuniqueid=182a918a-2e3811e9-9bd78eea-47ccf4be,cn=27378554123,ou=activos,dc=abierto,dc=anses,dc=gov,dc=ar) orig dn (cn=27378554123,ou=Activos,dc=abierto,dc=anses,dc=gov,dc=ar) state (3), but it should be in the cache (avoided crash, but cache was corrupted) [10/Jul/2019:11:11:26.110331522 -0300] - ALERT - ldbm_back_delete - DEBUG [retry: 1] - tombstone NOT in cache (nsuniqueid=129e0706-d60f11e8-89f88eea-47ccf4be,cn=27219167704,ou=activos,dc=abierto,dc=anses,dc=gov,dc=ar) orig dn (cn=27219167704,ou=Activos,dc=abierto,dc=anses,dc=gov,dc=ar) state (3), but it should be in the cache (avoided crash, but cache was corrupted) [10/Jul/2019:11:21:56.508787808 -0300] - ALERT - ldbm_back_delete - DEBUG [retry: 1] - tombstone NOT in cache (nsuniqueid=d85a3185-338411e9-9bd78eea-47ccf4be,cn=27426509402,ou=activos,dc=abierto,dc=anses,dc=gov,dc=ar) orig dn (cn=27426509402,ou=Activos,dc=abierto,dc=anses,dc=gov,dc=ar) state (3), but it should be in the cache (avoided crash, but cache was corrupted) [10/Jul/2019:13:20:57.168263463 -0300] - ALERT - ldbm_back_delete - DEBUG [retry: 1] - tombstone NOT in cache (nsuniqueid=5a8e0f85-74f111e8-b6308eea-47ccf4be,cn=27944217393,ou=activos,dc=abierto,dc=anses,dc=gov,dc=ar) orig dn (cn=27944217393,ou=Activos,dc=abierto,dc=anses,dc=gov,dc=ar) state (3), but it should be in the cache (avoided crash, but cache was corrupted) [10/Jul/2019:13:34:38.114030528 -0300] - ALERT - ldbm_back_delete - DEBUG [retry: 1] - tombstone NOT in cache (nsuniqueid=a85ea1a3-2bc011e9-9bd78eea-47ccf4be,cn=27403824203,ou=activos,dc=abierto,dc=anses,dc=gov,dc=ar) orig dn (cn=27403824203,ou=Activos,dc=abierto,dc=anses,dc=gov,dc=ar) state (3), but it should be in the cache (avoided crash, but cache was corrupted) [10/Jul/2019:16:50:51.683559337 -0300] - ALERT - ldbm_back_delete - DEBUG [retry: 1] - tombstone NOT in cache (nsuniqueid=537fd20a-e3cf11e6-99f9afaa-6e3226cf,cn=20323835897,ou=activos,dc=abierto,dc=anses,dc=gov,dc=ar) orig dn (cn=20323835897,ou=Activos,dc=abierto,dc=anses,dc=gov,dc=ar) state (3), but it should be in the cache (avoided crash, but cache was corrupted) [10/Jul/2019:23:00:23.540723331 -0300] - ALERT - ldbm_back_delete - DEBUG [retry: 1] - tombstone NOT in cache (nsuniqueid=fe95cf03-d55411e8-89f88eea-47ccf4be,cn=27367604269,ou=activos,dc=abierto,dc=anses,dc=gov,dc=ar) orig dn (cn=27367604269,ou=Activos,dc=abierto,dc=anses,dc=gov,dc=ar) state (3), but it should be in the cache (avoided crash, but cache was corrupted) [10/Jul/2019:23:33:53.959808850 -0300] - ALERT - ldbm_back_delete - DEBUG [retry: 1] - tombstone NOT in cache (nsuniqueid=f025e92c-54de11e9-84dd8eea-47ccf4be,cn=27144082503,ou=activos,dc=abierto,dc=anses,dc=gov,dc=ar) orig dn (cn=27144082503,ou=Activos,dc=abierto,dc=anses,dc=gov,dc=ar) state (3), but it should be in the cache (avoided crash, but cache was corrupted) [11/Jul/2019:08:01:37.641922112 -0300] - ALERT - ldbm_back_delete - DEBUG [retry: 1] - tombstone NOT in cache (nsuniqueid=97ee96a0-a21811e7-a349afaa-6e3226cf,cn=20180411357,ou=activos,dc=abierto,dc=anses,dc=gov,dc=ar) orig dn (cn=20180411357,ou=Activos,dc=abierto,dc=anses,dc=gov,dc=ar) state (3), but it should be in the cache (avoided crash, but cache was corrupted) [11/Jul/2019:11:08:39.610091660 -0300] - ALERT - ldbm_back_delete - DEBUG [retry: 1] - tombstone NOT in cache (nsuniqueid=ab97c7b8-599c11e6-9ad19ecf-c04433d2,cn=23086267659,ou=activos,dc=abierto,dc=anses,dc=gov,dc=ar) orig dn (cn=23086267659,ou=Activos,dc=abierto,dc=anses,dc=gov,dc=ar) state (3), but it should be in the cache (avoided crash, but cache was corrupted) [root@ansescvss01 ~]# Created attachment 1589592 [details]
Cumulative patch of all entry cache fixes for 389-ds-base-1.3.6 (v3)
This patch addresses an issue with the debug message that was incorrectly logged. The actual fix is the same as the last patch.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:2152 |
Description of problem: crash in ns-slapd when attempting to add a dn to hash. What is strange is that the value is attempted to be added during DEL operation (shouldn't the value be deleted from the hash ?) My findings in private comments. #0 __strcmp_sse42 () at ../sysdeps/x86_64/multiarch/strcmp-sse42.S:164 #1 0x00007fcb5cc9b1ec in entry_same_dn (e=<optimized out>, k=0x561251f093a0) at ldap/servers/slapd/back-ldbm/cache.c:142 #2 0x00007fcb5cc9b3ee in add_hash (ht=0x561235944000, key=key@entry=0x561251f093a0, keylen=<optimized out>, entry=entry@entry=0x56124e195440, alt=alt@entry=0x7fcb236c73d0) at ldap/servers/slapd/back-ldbm/cache.c:191 #3 0x00007fcb5cc9c8aa in entrycache_add_int (cache=cache@entry=0x5612358ee3c8, e=0x56124e195440, state=state@entry=2, alt=alt@entry=0x0) at ldap/servers/slapd/back-ldbm/cache.c:1297 #4 0x00007fcb5cc9cb3d in cache_add_tentative (cache=cache@entry=0x5612358ee3c8, e=<optimized out>, alt=alt@entry=0x0) at ldap/servers/slapd/back-ldbm/cache.c:1497 #5 0x00007fcb5ccd201d in ldbm_back_delete (pb=0x7fcb236c7a50) at ldap/servers/slapd/back-ldbm/ldbm_delete.c:717 #6 0x00007fcb6875efd0 in op_shared_delete (pb=pb@entry=0x7fcb236c7a50) at ldap/servers/slapd/delete.c:331 #7 0x00007fcb6875f352 in do_delete (pb=pb@entry=0x7fcb236c7a50) at ldap/servers/slapd/delete.c:97 #8 0x0000561233bd8642 in connection_dispatch_operation (pb=0x7fcb236c7a50, op=0x561237af44e0, conn=0x561237241e38) at ldap/servers/slapd/connection.c:623 #9 connection_threadmain () at ldap/servers/slapd/connection.c:1772 #10 0x00007fcb66b119bb in _pt_root (arg=0x5612371e1440) at ../../../nspr/pr/src/pthreads/ptthread.c:216 #11 0x00007fcb664b1e25 in start_thread (arg=0x7fcb236c8700) at pthread_create.c:308 #12 0x00007fcb65d9334d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113 Version-Release number of selected component (if applicable): How reproducible: Steps to Reproduce: 1. 2. 3. Actual results: Expected results: Additional info: