Bug 627738
| Field | Value |
|---|---|
| Summary | The cn=monitor statistics entries for the dnentry cache do not change or change very rarely |
| Product | [Retired] 389 |
| Component | Database - Indexes/Searches |
| Version | 1.2.6 |
| Hardware | x86_64 |
| OS | Linux |
| Status | CLOSED CURRENTRELEASE |
| Severity | low |
| Priority | medium |
| Reporter | Andrey Ivanov <andrey.ivanov> |
| Assignee | Noriko Hosoi <nhosoi> |
| QA Contact | Viktor Ashirov <vashirov> |
| CC | amsharma, jgalipea |
| Doc Type | Bug Fix |
| Last Closed | 2015-12-07 17:13:55 UTC |
| Bug Blocks | 639035 |
| Attachments | 441634 (git patch file, master), 442689 (fix for attachment 441634) |
Description: Andrey Ivanov, 2010-08-26 19:29:30 UTC
The dn cache works as a backup for the entry cache. For instance, when you run a search, if the search target is found in the entry cache, the entry is returned without consulting the dn cache. So, if you have a large entry cache that is filled with entries, there are few occasions on which the dn cache is consulted. The dn cache is implemented so that the server goes to the entryrdn index as little as possible.

I ran a test adding 10K entries with a very small entry cache:

maxentrycachesize: 1048576
maxdncachesize: 10485760   <== not allowed to change the size

Started with this set of values:

entrycachehits: 6
entrycachetries: 8
entrycachehitratio: 75
currententrycachesize: 2418
maxentrycachesize: 1048576
currententrycachecount: 1
maxentrycachecount: -1
dncachehits: 0
dncachetries: 1
dncachehitratio: 0
currentdncachesize: 67
maxdncachesize: 10485760
currentdncachecount: 1
maxdncachecount: -1

and ended with this set:

entrycachehits: 30001
entrycachetries: 40013
entrycachehitratio: 74
currententrycachesize: 1048129
maxentrycachesize: 1048576
currententrycachecount: 170
maxentrycachecount: -1
dncachehits: 10000
dncachetries: 10006
dncachehitratio: 99
currentdncachesize: 1292746
maxdncachesize: 10485760
currentdncachecount: 10006
maxdncachecount: -1

Since this was a repeated add, the entry cache was not very useful, while the dn cache hit ratio was quite good (most likely it is used for getting the parents of the entries being added).

Then I ran a repeated "(cn=*)" search, which overflows the small entry cache. Again, the entry cache hit ratio was quite poor, while the dn cache hit ratio was good. This is because the dn cache size is usually large enough in most cases.

entrycachehits: 30060
entrycachetries: 565454
entrycachehitratio: 5
currententrycachesize: 1044494
maxentrycachesize: 1048576
currententrycachecount: 166
maxentrycachecount: -1
dncachehits: 535267
dncachetries: 535283
dncachehitratio: 99
currentdncachesize: 1293986
maxdncachesize: 10485760
currentdncachecount: 10016
maxdncachecount: -1

Please evaluate your entry cache size. I imagine you allocated enough size for your server (which is ideal). Please bear in mind that the dn cache is a secondary cache that assists the primary entry cache.
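A quick way to watch how the two caches interact on a live server is to poll the ldbm monitor entry for both sets of counters. A minimal sketch, assuming a backend named userRoot on port 389 (host, port, and credentials are illustrative; adjust to your instance):

# ldapsearch -x -h localhost -p 389 -D 'cn=directory manager' -w <pw> -b "cn=monitor,cn=userRoot,cn=ldbm database,cn=plugins,cn=config" "(cn=*)" | egrep 'entrycache|dncache'

If the entry cache hit ratio stays high while the dn cache counters barely move, the entries are being served entirely from the entry cache, so the dn cache is simply never consulted.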
Created attachment 441634 [details]
git patch file (master)
These are not fixes for the subject of this bug, but for issues found while looking into the dn cache.
Description:
1) id2entry_add_ext replaces a dn instance in the dn cache even if
the DN value is identical. Replace it only when the DNs don't match.
File: ldap/servers/slapd/back-ldbm/id2entry.c
2) Modifying the dn cache related labels:
dnEntrycachehits
dnEntrycachetries
dnEntrycachehitratio
dnCurrententrycachesize
dnMaxentrycachesize
dnCurrententrycachecount
dnMaxentrycachecount
===>
dncachehits
dncachetries
dncachehitratio
currentdncachesize
maxdncachesize
currentdncachecount
maxdncachecount
File: ldap/servers/slapd/back-ldbm/monitor.c
Thank you, Noriko, for your explanations. Indeed, I have allocated enough space in memory to keep all the entries in the entry cache. And the fact that the dn cache is used only if the entry cache does not help can be deduced from the description of the dn cache in the design document (http://directory.fedoraproject.org/wiki/Subtree_Rename#dn_cache).

The strange thing for me was that these values were still changing sometimes, very rarely. After some research I have found the reason for this behaviour: the hits and tries of the dn cache change every time the server does the backup export (with db2ldif.pl). The delta of the change corresponds to the number of exported entries. And we export every 12 hours, so everything is clear now :) Thanks again!
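For reference, this effect can be observed by running an export task and comparing dncachetries before and after. A rough sketch, assuming an instance script directory of /usr/lib64/dirsrv/slapd-<instance> and a userRoot backend (paths, port, and credentials are illustrative):

# ldapsearch -x -h localhost -p 389 -D 'cn=directory manager' -w <pw> -b "cn=monitor,cn=userRoot,cn=ldbm database,cn=plugins,cn=config" "(cn=*)" | egrep dncachetries
# /usr/lib64/dirsrv/slapd-<instance>/db2ldif.pl -D 'cn=directory manager' -w <pw> -n userRoot -a /tmp/userRoot.ldif
# ldapsearch -x -h localhost -p 389 -D 'cn=directory manager' -w <pw> -b "cn=monitor,cn=userRoot,cn=ldbm database,cn=plugins,cn=config" "(cn=*)" | egrep dncachetries

Once the export task finishes, dncachetries should have grown by roughly the number of exported entries.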
Comment on attachment 441634 [details]
git patch file (master)
Reviewed by Rich (Thanks!!!)
Pushed to master.
$ git push
Counting objects: 67, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (41/41), done.
Writing objects: 100% (41/41), 14.98 KiB, done.
Total 41 (delta 33), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
a2bcd81..cc36301 master -> master
When I apply the id2entry.c part of this patch, namely this code part:

    struct backdn *bdn = backdn_init(sdn, e->ep_id, 0);
    options |= SLAPI_DUMP_RDN_ENTRY;
-   /* If the ID already exists in the DN cache, replace it. */
-   if (CACHE_ADD( &inst->inst_dncache, bdn, &oldbdn ) == 1) {
+   /* If the ID already exists in the DN cache && the DNs do not match,
+    * replace it. */
+   if ((CACHE_ADD( &inst->inst_dncache, bdn, &oldbdn ) == 1) &&
+       (slapi_sdn_compare(sdn, oldbdn->dn_sdn))) {
        cache_replace( &inst->inst_dncache, oldbdn, bdn );
        CACHE_RETURN(&inst->inst_dncache, &oldbdn); /* to free oldbdn */
    }

there is an error during import with the dn cache:

[02/Sep/2010:13:10:34 +0200] - import userRoot: Workers finished; cleaning up...
[02/Sep/2010:13:10:34 +0200] - import userRoot: Workers cleaned up.
[02/Sep/2010:13:10:34 +0200] - import userRoot: Cleaning up producer thread...
[02/Sep/2010:13:10:34 +0200] - import userRoot: Indexing complete. Post-processing...
[02/Sep/2010:13:10:34 +0200] - import userRoot: Flushing caches...
[02/Sep/2010:13:10:34 +0200] - import userRoot: Closing files...
----> [02/Sep/2010:13:10:34 +0200] - dncache_clear_int: there are still 53 dn's in the dn cache. :/ <----
[02/Sep/2010:13:10:35 +0200] - All database threads now stopped
[02/Sep/2010:13:10:35 +0200] - import userRoot: Import complete. Processed 9527 entries in 13 seconds. (732.85 entries/sec)

Created attachment 442689 [details]
Fix for the previous patch: id=441634

Description: The change made in commit cc36301a7cae6737d9f8a0e53bed653a52130a1d for the following issue introduced a missing-cache-return bug:
"id2entry_add_ext replaces a dn instance in the dn cache even if the DN value is identical. Replace it only when they don't match."
Thanks to andrey.ivanov for finding it out: https://bugzilla.redhat.com/show_bug.cgi?id=627738#c5

Can you please add steps to reproduce and verify what was actually fixed? Thanks!

Run some repeated operations on a server. While the operations are running, run a search on the monitor entry and grep the dncache-related attributes:

# ldapsearch -x -h localhost -p <port> -D 'cn=directory manager' -w <pw> -b "cn=monitor,cn=<backend>,cn=ldbm database,cn=plugins,cn=config" "(cn=*)" | egrep dncache
dncachehits: 5122
dncachetries: 5124
dncachehitratio: 99
currentdncachesize: 548818
maxdncachesize: 10485760
currentdncachecount: 5124
maxdncachecount: -1

Wait for a while and run it again.

# ldapsearch -x -h localhost -p <port> -D 'cn=directory manager' -w <pw> -b "cn=monitor,cn=<backend>,cn=ldbm database,cn=plugins,cn=config" "(cn=*)" | egrep dncache
dncachehits: 5143
dncachetries: 5145
dncachehitratio: 99
currentdncachesize: 550971
maxdncachesize: 10485760
currentdncachecount: 5145
maxdncachecount: -1

If the counters are updated, this bug can be marked verified.
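A sketch of one way to generate operations that actually move these counters, assuming a suffix of dc=example,dc=com with an existing ou=people subtree under the monitored backend (suffix, port, and credentials are illustrative). Repeated adds are a good choice because, as noted above, the dn cache is consulted for the parents of entries being added even when searches are served entirely from the entry cache:

i=0
while [ $i -lt 1000 ]; do
    ldapadd -x -h localhost -p 389 -D 'cn=directory manager' -w <pw> <<EOF
dn: uid=tuser$i,ou=people,dc=example,dc=com
objectClass: inetOrgPerson
uid: tuser$i
cn: Test User $i
sn: User $i
EOF
    i=$((i+1))
done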
Hi Noriko, in my case these entries did not get changed:

[root@rhel61-ds90-amita pwpolicy]# ldapsearch -x -h localhost -p 20100 -D 'cn=directory manager' -w Secret123 -b "cn=monitor,cn=userRoot,cn=ldbm database,cn=plugins,cn=config" "(cn=*)" | egrep dncache
dncachehits: 0
dncachetries: 6
dncachehitratio: 0
currentdncachesize: 696
maxdncachesize: 10485760
currentdncachecount: 6
maxdncachecount: -1
[root@rhel61-ds90-amita pwpolicy]# sleep 15
[root@rhel61-ds90-amita pwpolicy]# ldapsearch -x -h localhost -p 20100 -D 'cn=directory manager' -w Secret123 -b "cn=monitor,cn=userRoot,cn=ldbm database,cn=plugins,cn=config" "(cn=*)" | egrep dncache
dncachehits: 0
dncachetries: 6
dncachehitratio: 0
currentdncachesize: 696
maxdncachesize: 10485760
currentdncachecount: 6
maxdncachecount: -1
[root@rhel61-ds90-amita pwpolicy]# ldapsearch -x -h localhost -p 20100 -D 'cn=directory manager' -w Secret123 -b "cn=monitor,cn=userRoot,cn=ldbm database,cn=plugins,cn=config" "(cn=*)" | egrep dncache
dncachehits: 0
dncachetries: 6
dncachehitratio: 0
currentdncachesize: 696
maxdncachesize: 10485760
currentdncachecount: 6
maxdncachecount: -1

I am running this operation in parallel:

while [ 0=0 ] ; do ldapsearch -x -p 20100 -h localhost -D "cn=Directory Manager" -w Secret123 -b "ou=people,dc=replsuffix,dc=com"; done

Q: Do I need to execute some specific operation in parallel?

Does the suffix "dc=replsuffix,dc=com" belong to the backend userRoot? I guess when you set NEEDINFO, you'd better give me the IP address of your test system.

No, but this time I am using a suffix that belongs to the backend userRoot:

1. Executing in a parallel window:

while [ 0=0 ] ; do ldapsearch -x -p 1389 -h localhost -D "cn=Directory Manager" -w Secret123 -b "dc=pnq,dc=redhat,dc=com"; done

2. On another tty:

[root@rhel61 dirsrv]# ldapsearch -x -h localhost -p 1389 -D 'cn=directory manager' -w Secret123 -b "cn=monitor,cn=userRoot,cn=ldbm database,cn=plugins,cn=config" "(cn=*)" | egrep dncache
dncachehits: 0
dncachetries: 9
dncachehitratio: 0
currentdncachesize: 1057
maxdncachesize: 10485760
currentdncachecount: 9
maxdncachecount: -1
[1]+ Done 389-console
[root@rhel61 dirsrv]# sleep 15
[root@rhel61 dirsrv]# ldapsearch -x -h localhost -p 1389 -D 'cn=directory manager' -w Secret123 -b "cn=monitor,cn=userRoot,cn=ldbm database,cn=plugins,cn=config" "(cn=*)" | egrep dncache
dncachehits: 0
dncachetries: 9
dncachehitratio: 0
currentdncachesize: 1057
maxdncachesize: 10485760
currentdncachecount: 9
maxdncachecount: -1
[root@rhel61 dirsrv]# 389-console &
[1] 10436
[root@rhel61 dirsrv]# ldapsearch -x -h localhost -p 1389 -D 'cn=directory manager' -w Secret123 -b "cn=monitor,cn=userRoot,cn=ldbm database,cn=plugins,cn=config" "(cn=*)" | egrep dncache
dncachehits: 0
dncachetries: 9
dncachehitratio: 0
currentdncachesize: 1057
maxdncachesize: 10485760
currentdncachecount: 9
maxdncachecount: -1
[root@rhel61 dirsrv]#

Sending info of the test machine in mail.

This is because you had just 9 entries and all of them were in the entry cache. The entry cache is more powerful, and if entries are already in the entry cache, the dn cache won't be accessed.
I added one entry, uid=tuser0, and it changed the dn cache status.

Before>
# ldapsearch -x -h localhost -p 1389 -D 'cn=directory manager' -w Secret123 -b "cn=monitor,cn=userRoot,cn=ldbm database,cn=plugins,cn=config" "(cn=*)" | egrep dncache
dncachehits: 0
dncachetries: 9
dncachehitratio: 0
currentdncachesize: 1057
maxdncachesize: 10485760
currentdncachecount: 9
maxdncachecount: -1

After>
# ldapsearch -x -h localhost -p 1389 -D 'cn=directory manager' -w Secret123 -b "cn=monitor,cn=userRoot,cn=ldbm database,cn=plugins,cn=config" "(cn=*)" | egrep dncache
dncachehits: 1
dncachetries: 10
dncachehitratio: 10
currentdncachesize: 1158
maxdncachesize: 10485760
currentdncachecount: 10
maxdncachecount: -1

That makes sense, hence marking verified.