Hide Forgot
Description of problem: this is a clone of ticket DS 49008 this is Ludwig's investigation and description of the issue: if a plugin operation succeeded, but the operation itself fails and is aborted the RUV is in an incorrect state (rolled up to the succesful plugin op). Here is test case: enable memberof and add two members to a group, the second has no objectclasss allowing memberof, so the complete operation fails with err=65 ===================================== [13/Oct/2016:14:48:21 +0200] NSMMReplicationPlugin - ruv_add_csn_inprogress: successfully inserted csn 57ff838e000000640000 into pending list <<<< main MOD [13/Oct/2016:14:48:21 +0200] NSMMReplicationPlugin - ruv_add_csn_inprogress: successfully inserted csn 57ff838e000100640000 into pending list <<<< succesful memberof [13/Oct/2016:14:48:21 +0200] NSMMReplicationPlugin - ruv_update_ruv: successfully committed csn 57ff838e000100640000 [13/Oct/2016:14:48:21 +0200] NSMMReplicationPlugin - ruv_add_csn_inprogress: successfully inserted csn 57ff838e000200640000 into pending list <<< failing memberof [13/Oct/2016:14:48:21 +0200] NSMMReplicationPlugin - csn=57ff838e000200640000 process postop: canceling operation csn [13/Oct/2016:14:48:22 +0200] NSMMReplicationPlugin - conn=866 op=1 csn=57ff838e000000640000 process postop: canceling operation csn [13/Oct/2016:14:48:22 +0200] NSMMReplicationPlugin - agmt="cn=to200" (localhost:4945): {replica 100 ldap://localhost:30522} 57f4cb95000000640000 57ff838e000100640000 57ff8295 nsds50ruv: {replicageneration} 57bf05e40000012c0000 nsds50ruv: {replica 100 ldap://localhost:30522} 57f4cb95000000640000 57ff838e000100640000 ====================================== And we can see these messages in the errors log: This can be seen when having locker not able to resolve deadlocks. The messages that could show we are in presence of this bug are: ============================== [17/Oct/2016:10:32:07 +0000] NSMMReplicationPlugin - changelog program - _cl5WriteOperationTxn: retry (49) the transaction (csn=5804e0db000800030000) failed (rc=-30993 (BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock)) [17/Oct/2016:10:32:07 +0000] NSMMReplicationPlugin - changelog program - _cl5WriteOperationTxn: failed to write entry with csn (5804e0db000800030000); db error - -30993 BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock [17/Oct/2016:10:32:07 +0000] NSMMReplicationPlugin - write_changelog_and_ruv: can't add a change for <entry dn> (uniqid: 40d752a2-4fbb11e5-93a0d259-b00ed7d3, optype: 32) to changelog csn 5804e0db000800030000 [17/Oct/2016:10:32:07 +0000] - SLAPI_PLUGIN_BE_TXN_POST_DELETE_FN plugin returned error code but did not set SLAPI_RESULT_CODE [17/Oct/2016:10:32:07 +0000] agmt="<agmt dn>" (<host>:389) - Can't locate CSN 5804e0dc000200060000 in the changelog (DB rc=-30988). If replication stops, the consumer may need to be reinitialized. ========================================= Version-Release number of selected component (if applicable): will be reproduceable in RHEL7.3 version of 389-ds-base For the moment, in release 33 + replication hotfix can be seen. How reproducible: not easy. A possible workaround would be to set: nsslapd-db-deadlock-policy to 6 or, probably, to increase the nsslapd-db-locks since it seems to be a performance issue. I will attach the three cases where we have seen this.
*** Bug 1339693 has been marked as a duplicate of this bug. ***
========================================================== test session starts ========================================================== platform linux2 -- Python 2.7.5, pytest-3.0.7, py-1.4.33, pluggy-0.4.0 -- /usr/bin/python cachedir: .cache metadata: {'Python': '2.7.5', 'Platform': 'Linux-3.10.0-663.el7.x86_64-x86_64-with-redhat-7.4-Maipo', 'Packages': {'py': '1.4.33', 'pytest': '3.0.7', 'pluggy': '0.4.0'}, 'Plugins': {'beakerlib': '0.7.1', 'html': '1.14.2', 'cov': '2.5.1', 'metadata': '1.5.0'}} DS build: 1.3.6.1 389-ds-base: 1.3.6.1-14.el7 nss: 3.28.4-6.el7 nspr: 4.13.1-1.0.el7_3 openldap: 2.4.44-4.el7 svrcore: 4.1.3-2.el7 rootdir: /export/tests, inifile: plugins: metadata-1.5.0, html-1.14.2, cov-2.5.1, beakerlib-0.7.1 collected 1 items tickets/ticket49008_test.py::test_ticket49008 PASSED ======================================================= 1 passed in 74.82 seconds ======================================================= Marking as VERIFIED.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:2086