Bug 677483
Summary: | export task followed by import task causes cache assertion | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 6 | Reporter: | Rich Megginson <rmeggins> |
Component: | 389-ds-base | Assignee: | Rich Megginson <rmeggins> |
Status: | CLOSED ERRATA | QA Contact: | Chandrasekar Kannan <ckannan> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 6.1 | CC: | amsharma, benl, jgalipea, nhosoi, shaines |
Target Milestone: | rc | Keywords: | screened |
Target Release: | --- | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | 389-ds-base-1.2.8-0.3.a3.el6 | Doc Type: | Bug Fix |
Doc Text: | Story Points: | --- | |
Clone Of: | 676053 | Environment: | |
Last Closed: | 2011-05-19 12:41:51 UTC | Type: | --- |
Bug Depends On: | 676053, 678369 | ||
Bug Blocks: | 639035, 656390, 676871 |
Comment 1
Scott Haines
2011-02-21 21:35:54 UTC
Hi Noriko,

I am testing Bug 677483 - export task followed by import task causes cache assertion. This is a very interesting and big one, like an interesting novel :) I concluded there are in total 3 bugs under this one:

1. Description: The task version of export had a bug in handling the busy-instance error case. When returning because of the busy error, the function ldbm_back_ldbm2ldif reset the busy bit that had been set by other threads. This patch checks the special return value set in the busy error case and resets the busy bit only when it was set by the function itself.

Verify steps:
Set up 2-way MMR.
  1. In one window, as root:
     # cd /usr/lib64/dirsrv/slapd-amsharma
     # while true; do ./db2ldif.pl -D "cn=directory Manager" -w Secret123 -r -n userRootNT -a /tmp/export.ldif; ./ldif2db.pl -D "cn=Directory Manager" -w Secret123 -n userRootNT -i /tmp/export.ldif; done
     *** It gives me Operations error (1) ***
  2. In another window:
     $ while true; do ldapsearch -x -h localhost -p 1389 -D "cn=Directory Manager" -w Secret123 -b "dc=example,dc=com" "(cn=*)"; done
     *** It gives me Operations error (1) ***
Run the commands for an hour or so. If the entrycache_clear_int error message does not appear in the error log and the server keeps running, the fix is verified.

*** I am getting the output below ***
[amsharma@amsharma /]$ tail -f /var/log/dirsrv/slapd-amsharma/errors
[30/Mar/2011:19:08:00 +051800] - ldbm: 'example' is already in the middle of another task and cannot be disturbed.
[30/Mar/2011:19:08:00 +051800] - import example: Processing file "/tmp/export.ldif"
[30/Mar/2011:19:08:00 +051800] - import example: Finished scanning file "/tmp/export.ldif" (9 entries)
[30/Mar/2011:19:08:01 +051800] - import example: Workers finished; cleaning up...
[30/Mar/2011:19:08:01 +051800] - import example: Workers cleaned up.
[30/Mar/2011:19:08:01 +051800] - import example: Cleaning up producer thread...
[30/Mar/2011:19:08:01 +051800] - import example: Indexing complete. Post-processing...
[30/Mar/2011:19:08:01 +051800] - import example: Flushing caches...
[30/Mar/2011:19:08:01 +051800] - import example: Closing files...
[30/Mar/2011:19:08:02 +051800] - import example: Import complete. Processed 9 entries in 1 seconds. (9.00 entries/sec)

*** Noriko, is this as expected? I am not getting the error "entrycache_clear_int". ***

2. Description: When Simple Paged Results is requested and a page is returned, one entry is read ahead to check whether more entries exist. The read-ahead retrieves an entry (if any) and adds it to the entry cache. The Simple Paged Results code puts the read-ahead entry back, but the call to cache_return for that entry (which decrements refcnt) was missing. If ldif2db.pl is called with the cache in this state, it finds the entry that is still referenced. This patch calls cache_return when Simple Paged Results puts the read-ahead entry back. It also adds a debug function, dump_hash.

Verify steps:
Prepare a server with some entries (> 10 entries).
Run a Simple Paged Results search and stop before getting all entries.
     $ ldapsearch -x -h localhost -p 1389 -b "dc=example,dc=com" -E pr=2 "(cn=*)"
     ...
     # search result
     search: 8
     result: 0 Success
     control: 1.2.840.113556.1.4.319 false MAcCAgPbBAEy
     pagedresults: estimate=987 cookie=Mg==
     Press [size] Enter for the next {2|size} entries.
Run ./db2ldif.pl
     # ./db2ldif.pl -D 'cn=directory manager' -w Secret123 -n userRootNT -a /tmp/export.ldif
Run ./ldif2db.pl
     # ./ldif2db.pl -n userRootNT -D 'cn=directory manager' -w Secret123 -i /tmp/export.ldif
Check the error log:
     # grep entrycache_clear_int /var/log/dirsrv/slapd-ID/errors
     # echo $?
     1
If the keyword is not found in the error log, the bug is verified.

*** This one is all set, there are no errors. ***

3. Description: When a search request with a VLV and/or SORT control failed, it did not return the entry to the entry cache. The entry keeps a positive refcnt and is not cleared even by cache_clear. This patch adds a CACHE_RETURN call for the error cases.

*** I think this is covered by the 4th one? Or do I need to verify it with some other steps? ***

4. Description: There were 3 places where an entry was not released by CACHE_RETURN (refcnt was not decremented). If an entry has a positive refcnt in the entry cache, it is not released even if the entry is never accessed again.
  1. When a search request with a VLV and/or SORT control fails.
  2. When comparing entries in compare_entries_sv and the second entry is not found, the first entry is not released.
  3. vlv_trim_candidates_byvalue retrieves entries for performing a binary search over the candidate list and puts them into the cache. They are not released.

Verify steps:
  1. Set up a server with suffix "o=umc".
     ./AddSuffix "o=umc" userRootNT 1389 localhost
     Shut down the server.
     Put 99umcschema.ldif in /etc/dirsrv/slapd-ID/schema.
     Open dse.ldif and append the contents of index.ldif at the end of the file.
     Import db_Febr15_noRep.ldif:
     ldif2db -n userRootNT -i /tmp/db_Febr15_noRep.ldif
     Start the server.
  2. Run ldapsearch:
     $ /usr/lib[64]/mozldap/ldapsearch -p <port> -D 'cn=cli,ou=components,o=operators,o=UMC' -w cli -b "o=umc" -s sub -S '-createTimestamp' -x -G 0:2:3 "(objectClass=*)" "createTimestamp modifyTimestamp"

     cd /usr/lib64/mozldap
     /usr/lib64/mozldap/ldapsearch -p 1389 -D 'cn=cli,ou=components,o=operators,o=UMC' -w cli -b "o=umc" -s sub -S '-createTimestamp' -x -G 0:2:3 "(objectClass=*)" "createTimestamp modifyTimestamp"
     *** The above ldapsearch is not giving me any output. Is that as expected? ***
  3. Run the task import:
     cd /usr/lib64/dirsrv/slapd-amsharma
     # ./ldif2db.pl -D 'cn=directory manager' -w Secret123 -n userRootNT -i /tmp/db_Febr15_noRep.ldif
  4. Check the error log:
     # grep entrycache_clear_int /var/log/dirsrv/slapd-amsharma/errors
     If the keyword is not found, the bug is verified.

*** Note: I did not see any slapd crash or the "entrycache_clear_int" error while executing the above scenarios. ***

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2011-0533.html
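For readers unfamiliar with the refcnt problem described in items 2 to 4 above, here is a minimal toy model in Python. It is only a sketch of the general idea, not the 389-ds-base code: the class EntryCache and the functions search_with_leak and search_fixed are hypothetical names made up for illustration, and only cache_return and the clear step loosely mirror the cache_return/CACHE_RETURN and cache_clear names mentioned in the descriptions.

```python
class EntryCache:
    """Toy reference-counted entry cache (hypothetical, for illustration only)."""

    def __init__(self):
        # entry id -> number of references currently held by callers
        self._refcnt = {}

    def find(self, entry_id):
        # Looking an entry up pins it: the caller now holds one reference.
        self._refcnt[entry_id] = self._refcnt.get(entry_id, 0) + 1
        return entry_id

    def cache_return(self, entry_id):
        # Hand the reference back; plays the role CACHE_RETURN has in the report.
        if self._refcnt.get(entry_id, 0) <= 0:
            raise ValueError("entry %r is not referenced" % entry_id)
        self._refcnt[entry_id] -= 1

    def clear(self):
        # Drop every unreferenced entry and report the ones that are still pinned.
        stuck = sorted(e for e, n in self._refcnt.items() if n > 0)
        self._refcnt = {e: n for e, n in self._refcnt.items() if n > 0}
        return stuck


def search_with_leak(cache, entry_id):
    # Models the bug pattern: the error path returns without cache_return.
    cache.find(entry_id)
    return "error"


def search_fixed(cache, entry_id):
    # Models the fix pattern: the reference is returned on every path.
    cache.find(entry_id)
    try:
        return "error"
    finally:
        cache.cache_return(entry_id)


if __name__ == "__main__":
    leaky = EntryCache()
    search_with_leak(leaky, "uid=a,dc=example,dc=com")
    print("still referenced after leak:", leaky.clear())

    fixed = EntryCache()
    search_fixed(fixed, "uid=a,dc=example,dc=com")
    print("still referenced after fix:", fixed.clear())
```

In this model, an entry left pinned by a failed search survives clear(), which corresponds to the situation the import task ran into when it tried to clear the cache. Returning the reference on the error path leaves nothing behind.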