Bug 707462
Summary: Memory leak when using Export tool

| Field | Value | Field | Value |
|---|---|---|---|
| Product: | [Retired] 389 | Reporter: | Aaron Roots <aaron.roots> |
| Component: | Directory Server | Assignee: | Rich Megginson <rmeggins> |
| Status: | CLOSED DUPLICATE | QA Contact: | Chandrasekar Kannan <ckannan> |
| Severity: | high | Priority: | high |
| Version: | 1.2.8 | CC: | aaron.roots, benl, daniel.appleby, nhosoi |
| Hardware: | x86_64 | OS: | Linux |
| Doc Type: | Bug Fix | Last Closed: | 2011-06-29 16:42:43 UTC |
| Bug Blocks: | 434915, 708096 | Attachments: | Valgrind output (attachment 505722) |
Description
Aaron Roots
2011-05-25 07:23:35 UTC
What are your cache settings on that machine? 32-bit or 64-bit? How much RAM do you have?

grep nsslapd-cachememsize /etc/dirsrv/slapd-INST/dse.ldif
ls -al /var/lib/dirsrv/slapd-INST/db/*/id2entry.db4

64-bit with 2GB of RAM. Memory available for cache is 50 MB (52428800 bytes) for each database. LDBM plug-in settings:

egrep "^dn|nsslapd-cachememsize" /etc/dirsrv/slapd-INST/dse.ldif
dn: cn=accessgroupsData,cn=ldbm database,cn=plugins,cn=config
nsslapd-cachememsize: 52428800
dn: cn=automountData,cn=ldbm database,cn=plugins,cn=config
nsslapd-cachememsize: 52428800
dn: cn=groupData,cn=ldbm database,cn=plugins,cn=config
nsslapd-cachememsize: 52428800
dn: cn=netgroupData,cn=ldbm database,cn=plugins,cn=config
nsslapd-cachememsize: 52428800
dn: cn=NetscapeRoot,cn=ldbm database,cn=plugins,cn=config
nsslapd-cachememsize: 10485760
dn: cn=userRoot,cn=ldbm database,cn=plugins,cn=config
nsslapd-cachememsize: 52428800

ls -al /var/lib/dirsrv/slapd-INST/db/*/id2entry.db4
-rw------- 1 nobody nobody 1179648 May 27 13:51 /var/lib/dirsrv/slapd-INST/db/netgroupData/id2entry.db4
-rw------- 1 nobody nobody 12386304 Mar 18 10:34 /var/lib/dirsrv/slapd-INST/db/groupData/id2entry.db4
-rw------- 1 nobody nobody 60645376 May 27 15:49 /var/lib/dirsrv/slapd-INST/db/automountData/id2entry.db4
-rw------- 1 nobody nobody 139264 May 27 11:56 /var/lib/dirsrv/slapd-INST/db/NetscapeRoot/id2entry.db4
-rw------- 1 nobody nobody 860864512 May 28 07:18 /var/lib/dirsrv/slapd-INST/db/userRoot/id2entry.db4
-rw------- 1 nobody nobody 41025536 May 28 03:12 /var/lib/dirsrv/slapd-INST/db/accessgroupsData/id2entry.db4

The nsslapd-cachememsize should be at least twice the size of the id2entry.db4 file.
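As a rough illustration of that sizing rule (this helper is an editorial sketch, not a command from the thread; the temp file and size are placeholders), the suggested minimum can be computed directly from the file size:

```shell
# Hypothetical sizing helper: suggest an nsslapd-cachememsize of at least
# twice the id2entry.db4 size. A temp file stands in for the real database
# so the sketch is self-contained.
db=$(mktemp)
truncate -s 860864512 "$db"      # same size as the userRoot id2entry.db4 above
size=$(stat -c %s "$db")         # file size in bytes (GNU coreutils stat)
echo "suggested minimum nsslapd-cachememsize: $((size * 2))"
rm -f "$db"
```

Applied to the userRoot listing above, this yields 1721729024 bytes, roughly double the 860864512-byte database file.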
You can monitor the cache usage here - http://docs.redhat.com/docs/en-US/Red_Hat_Directory_Server/8.2/html-single/Administration_Guide/index.html#Monitoring_Server_and_Database_Activity-Monitoring_Database_Activity

Keep an eye on the sizes in bytes (ignore the "(in entries)" items). Once the cache is warmed up, your cache hit ratio should approach 100 (percent).

I suspect this is related to https://bugzilla.redhat.com/show_bug.cgi?id=697701

I've increased the cache settings; however, we are still experiencing the memory leak and crashes. There is now a longer delay of a few days, but as we had to add more memory to the box to be able to increase the cache, I am not sure whether this is related to these settings.

dn: cn=accessgroupsData,cn=ldbm database,cn=plugins,cn=config
nsslapd-cachememsize: 125829120
dn: cn=automountData,cn=ldbm database,cn=plugins,cn=config
nsslapd-cachememsize: 209715200
dn: cn=groupData,cn=ldbm database,cn=plugins,cn=config
nsslapd-cachememsize: 52428800
dn: cn=netgroupData,cn=ldbm database,cn=plugins,cn=config
nsslapd-cachememsize: 52428800
dn: cn=NetscapeRoot,cn=ldbm database,cn=plugins,cn=config
nsslapd-cachememsize: 10485760
dn: cn=userRoot,cn=ldbm database,cn=plugins,cn=config
nsslapd-cachememsize: 3221225472

-rw------- 1 nobody nobody 41025536 Jun 3 03:11 /var/lib/dirsrv/slapd-auth-master-dev/db/accessgroupsData/id2entry.db4
-rw------- 1 nobody nobody 60645376 Jun 3 09:49 /var/lib/dirsrv/slapd-auth-master-dev/db/automountData/id2entry.db4
-rw------- 1 nobody nobody 12386304 Jun 3 03:11 /var/lib/dirsrv/slapd-auth-master-dev/db/groupData/id2entry.db4
-rw------- 1 nobody nobody 1253376 Jun 3 09:41 /var/lib/dirsrv/slapd-auth-master-dev/db/netgroupData/id2entry.db4
-rw------- 1 nobody nobody 139264 Jun 2 11:37 /var/lib/dirsrv/slapd-auth-master-dev/db/NetscapeRoot/id2entry.db4
-rw------- 1 nobody nobody 860864512 Jun 3 09:52 /var/lib/dirsrv/slapd-auth-master-dev/db/userRoot/id2entry.db4

(In reply to comment #6)
> I've increased the cache settings - however are still experiencing the memory
> leak and crashes [...]
> dn: cn=userRoot,cn=ldbm database,cn=plugins,cn=config
> nsslapd-cachememsize: 3221225472

Try increasing this some more, by multiples of the size of the id2entry database. Start with 3221225472+860864512 and see if that makes the problem better, then 3221225472+2*860864512, etc.

I'm trying to reproduce the problem, but so far no luck.
I created 2 backends:

backend1: nsslapd-cachememsize: 150000000; id2entry.db4: 136855552 bytes (50K entries)
backend2: nsslapd-cachememsize: 1600000000; id2entry.db4: 1371119616 bytes (500K entries)

I repeatedly ran ./db2ldif.pl against the 2 backends and ran add/delete/search operations at the same time. The size of ns-slapd started at 162,322KB and gradually increased. I was monitoring the entry cache size. Once the cache reached the max cache size:

currententrycachesize: 1599997873
maxentrycachesize: 1600000000

the growth of the process size stopped. The size was 1,296,959KB. I ran this test with a standalone Directory Server.

I'm wondering if there could be some other configuration/operations/data that triggers the leak(s). Could it be possible to share your configuration file (dse.ldif) and log files (errors and access) with us? Also, could there be anything unique to your system? Which plug-ins have you enabled? Custom schema? Any special data, images, or certs in entries? Your help would be greatly appreciated.

Hi Noriko,

I am working with Aaron on this issue. We run a fairly stock system with a couple of custom schemas. We don't use any plugins. We mainly see the issue with the userRoot database due to the large number of objects it contains. Our scripts export and immediately start doing updates (if any are required). I have set up a test case which just uses the export tool (no writing afterwards) to try to narrow down the problem.

I am happy to send my custom schemas and dse.ldif but would like them to be kept private. Is there a way I can get these files to you privately?

Created attachment 505722 [details]
Valgrind output
Hi,

I have run dirsrv under valgrind and reproduced the issue. See the attached output. I triggered it by running the export utility on the same database over and over (every minute). Let me know if you need any more info.

Regards,
Daniel

Thank you for the valgrind output. It looks like the leak is already fixed in the master tree: https://bugzilla.redhat.com/show_bug.cgi?id=697027#c6

We are releasing 389-ds-base 1.2.9 alpha soon. When it's available, could you please run the test on that release?

389-ds-base 1.2.9 alpha 2 is ready. Could you go to this site and download a package that matches your platform? http://koji.fedoraproject.org/koji/packageinfo?packageID=8423

389-ds-base-1.2.9-0.2.a2.el5
389-ds-base-1.2.9-0.2.a2.fc14
389-ds-base-1.2.9-0.2.a2.fc15
389-ds-base-1.2.9-0.2.a2.fc16

We'd greatly appreciate your testing the new alpha release!

Hi Noriko,

I have installed 389-ds-base-1.2.9-0.2.a2.el5 and the memory usage appears to be holding. I'll leave it for a few more hours and let you know, but it's no longer growing at the rate it was before.

Thanks,
Daniel

Hi Noriko,

389-ds-base-1.2.9-0.2.a2.el5 has fixed the issue. The cache size no longer exceeds the max cache size. Do you know when 1.2.9 will move into testing?

Thank you for your assistance with this bug.

Regards,
Daniel

(In reply to comment #17)
> Do you know when the 1.2.9 will move into testing?

It's already in Testing. It should be in the updates-testing and epel-testing mirrors today or tomorrow. We don't have an estimate yet of when 1.2.9 will be Stable. We have some more bug fixes and testing yet to do.

Daniel, thank you so much for testing 389-ds-base-1.2.9-0.2.a2. We are glad that the memory leak is no longer observed on the new version. Let me mark this bug as a duplicate of 697027 for future reference.
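For reference, the reproduction loop Daniel describes (exporting the same backend over and over, once a minute) might look roughly like the sketch below. The instance path, backend name, output path, and the DRY_RUN guard are illustrative assumptions, not commands taken from the thread:

```shell
# Sketch of the export stress loop from the report: run the export tool
# against the same backend once a minute. Since no directory server instance
# is assumed here, DRY_RUN=echo just prints each command; drop it to run for
# real. Paths and the backend name are placeholders.
DRY_RUN=echo
for i in 1 2 3; do                 # the real test looped indefinitely
  $DRY_RUN /usr/lib64/dirsrv/slapd-INST/db2ldif -n userRoot -a /tmp/export-$i.ldif
  $DRY_RUN sleep 60
done
```

Watching the resident size of ns-slapd (for example with `ps` or `top`) while such a loop runs is how the unbounded growth was observed before the 1.2.9 fix.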
*** This bug has been marked as a duplicate of bug 697027 ***