Bug 1458536 - Performance issues with RHDS 10 - NDN cache investigation.
Summary: Performance issues with RHDS 10 - NDN cache investigation.
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: 389-ds-base   
Version: 7.3
Hardware: All Linux
Priority: urgent
Severity: urgent
Target Milestone: pre-dev-freeze
Target Release: 7.5
Assignee: mreynolds
QA Contact: Viktor Ashirov
Docs Contact: Marc Muehlfeld
URL:
Whiteboard:
Keywords: ZStream
Depends On:
Blocks: 1420851 1477926 1472344 1486128 1490412
 
Reported: 2017-06-04 03:26 UTC by Ash Westbrook
Modified: 2018-04-10 14:17 UTC (History)

Fixed In Version: 389-ds-base-1.3.7.5-4.el7
Doc Type: Enhancement
Doc Text:
Directory Server now uses separate normalized DN caches for each worker thread

Previously, multiple worker threads used a single normalized Distinguished Name (DN) cache. Consequently, if multiple clients performed operations on Directory Server, performance decreased. With this update, Directory Server creates a separate normalized DN cache for each worker thread. As a result, performance no longer decreases in the mentioned scenario.
Story Points: ---
Clone Of:
: 1486128 (view as bug list)
Environment:
Last Closed: 2018-04-10 14:16:50 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
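
The change summarized in the Doc Text field above can be illustrated with a small sketch. This is not the 389-ds-base implementation (which is written in C); it only contrasts one lock-protected cache shared by all workers with one cache per worker thread. The names SharedCache, PerThreadCache, normalize_dn, and run_workers are hypothetical, and the normalization is a deliberately simplified stand-in that ignores escaped commas.

```python
import threading


def normalize_dn(dn: str) -> str:
    """Simplified stand-in for DN normalization (assumption: trim and lowercase each RDN)."""
    return ",".join(part.strip().lower() for part in dn.split(","))


class SharedCache:
    """One cache for all threads: every lookup takes the same lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def get(self, dn: str) -> str:
        with self._lock:  # all worker threads contend on this lock
            if dn not in self._data:
                self._data[dn] = normalize_dn(dn)
            return self._data[dn]


class PerThreadCache:
    """One cache per worker thread: lookups need no lock."""

    def __init__(self):
        self._local = threading.local()

    def get(self, dn: str) -> str:
        data = getattr(self._local, "data", None)
        if data is None:
            data = self._local.data = {}  # created lazily, once per thread
        if dn not in data:
            data[dn] = normalize_dn(dn)
        return data[dn]


def run_workers(cache, dns, workers=4):
    """Have several threads normalize the same DNs and collect each thread's results."""
    results = []
    results_lock = threading.Lock()

    def work():
        out = [cache.get(dn) for dn in dns]
        with results_lock:
            results.append(out)

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Both variants return the same normalized values. The per-thread variant trades memory (the same DN may be cached once per thread) for the absence of lock contention between workers.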



External Trackers
Tracker                  ID              Priority  Status  Summary  Last Updated
Red Hat Product Errata   RHBA-2018:0811  None      None    None     2018-04-10 14:17 UTC

Description Ash Westbrook 2017-06-04 03:26:45 UTC
Description of problem:

As we scale up the number of JBoss application servers connecting to the RHDS 10 directories, performance begins to degrade on the uniquemember group lookups that determine which groups a user is a member of.  With only 1 or 2 JBoss servers connected to the directory, performance is good; when we add a 3rd or 4th server, performance quickly degrades on lookups with an LDAP filter of (&(uniquemember=userDN)(objectclass=companygroup)).
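
For reference, the filter quoted above could be assembled as follows. This is a minimal sketch: the helper names are hypothetical, the user DN in the usage note is an invented example, and the escaping covers only the characters that RFC 4515 requires to be escaped in a filter value.

```python
def escape_filter_value(value: str) -> str:
    """Escape characters that are special inside an LDAP filter value (RFC 4515)."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\x00": r"\00",
    }
    # Replace character by character so an inserted backslash is never re-escaped.
    return "".join(replacements.get(ch, ch) for ch in value)


def build_group_filter(user_dn: str, object_class: str = "companygroup") -> str:
    """Build the group-membership filter described in the report."""
    return "(&(uniquemember={})(objectclass={}))".format(
        escape_filter_value(user_dn), escape_filter_value(object_class)
    )
```

Calling build_group_filter("uid=jdoe,ou=People,dc=example,dc=com") yields a filter of the same shape as the one in the report, with the placeholder userDN replaced by the example DN.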

The directory is split into two databases: the userRoot database, which contains the root suffix for the directory, and the groupRoot database, which holds a subsuffix.  The group lookups with the filter listed above begin to slow down as the number of connections to the directory ramps up.  Etimes climb into the 1 to 2 minute range and the CPU load rises.  Response times on the userRoot database remain good.

The cache hit ratio on both databases is 95% or above.  The file descriptor limit has been increased to 32K (the hard limit is 64K), and the limit on procs has been increased to 32K.

Logconv has been run: no unindexed queries show up in the report, and thousands of connections are still listed as available.
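
As a rough illustration of what such a check looks for, the sketch below scans access-log RESULT lines for high etimes and for the notes=U / notes=A flags that mark unindexed searches. It is not a replacement for logconv. The function name and threshold are hypothetical, and the layout it expects (conn=, op=, RESULT, etime=, optional notes=) is an assumption about the usual access-log format.

```python
import re

# Assumed layout: "... conn=N op=N RESULT ... etime=X [notes=...]"
RESULT_RE = re.compile(
    r"conn=(?P<conn>\d+) op=(?P<op>\d+) RESULT .*?etime=(?P<etime>[\d.]+)(?P<rest>.*)"
)
NOTES_RE = re.compile(r"notes=(\S+)")


def slow_or_unindexed(lines, etime_threshold=1.0):
    """Return (conn, op, etime, unindexed) for RESULT lines that are slow or unindexed."""
    hits = []
    for line in lines:
        m = RESULT_RE.search(line)
        if m is None:
            continue  # not a RESULT line
        etime = float(m.group("etime"))
        notes = NOTES_RE.search(m.group("rest"))
        flags = set(notes.group(1).split(",")) if notes else set()
        unindexed = bool(flags & {"U", "A"})
        if unindexed or etime >= etime_threshold:
            hits.append((int(m.group("conn")), int(m.group("op")), etime, unindexed))
    return hits
```

A search that is indexed but still slow, as described in this report, would show up here with a large etime and unindexed set to False.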

The databases are on a separate partition on an SSD drive, and the file system is ZFS.  We have been able to isolate this problem to a query and connection concurrency problem with the groupRoot db, and we are looking for Red Hat support to provide additional recommendations for remedying it.

Comment 2 wibrown@redhat.com 2017-06-05 02:52:19 UTC
As far as what you can do to check, what's your nsslapd-threadnumber? Have you followed the performance tuning guide? Can you use HR etime to see what's going on there? What is mounted for /var/log? Can you disable COW on the userRoot/groupRoot dbs?

Comment 5 wibrown@redhat.com 2017-07-21 01:13:52 UTC
Upstream ticket:
https://pagure.io/389-ds-base/issue/49330

Comment 11 Viktor Ashirov 2018-02-19 15:15:03 UTC
Build tested:
389-ds-base-1.3.7.5-18.el7.x86_64

My test server, with 48 GB of RAM, was configured with the following settings:

(default settings)
nsslapd-idlistscanlimit: 4000
nsslapd-dbcachesize: 536870912
nsslapd-cachememsize: 4563402752

I increased ndn-cache-max-size:
nsslapd-ndn-cache-max-size: 2097152000
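
The byte values above are easier to compare in binary units. The short sketch below does the conversion; the helper name to_binary_units is hypothetical, and nsslapd-idlistscanlimit is left out because it is not a size in bytes.

```python
def to_binary_units(n_bytes: int) -> str:
    """Format a byte count using the largest binary unit that fits."""
    for unit, size in (("GiB", 1024**3), ("MiB", 1024**2), ("KiB", 1024)):
        if n_bytes >= size:
            return f"{n_bytes / size:g} {unit}"
    return f"{n_bytes} B"


settings = {
    "nsslapd-dbcachesize": 536870912,          # 512 MiB
    "nsslapd-cachememsize": 4563402752,        # 4.25 GiB
    "nsslapd-ndn-cache-max-size": 2097152000,  # 2000 MiB, a little under 2 GiB
}

for name, value in settings.items():
    print(f"{name}: {value} bytes = {to_binary_units(value)}")
```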

The directory contains 1 group with 10k members, and the search filter includes an unindexed component (description).

I see an 8-10x increase in search rate on average:
ldclt -D 'cn=Directory Manager' -w Secret123 -e esearch,random -r0 -R99999  -f "(&(description=*)(objectClass=groupOfUniqueNames)(uniqueMember=uid=uXXXXXX,ou=People,dc=example,dc=com))"

389-ds-base-1.3.6.1-19.el7_4.x86_64 (without the fix):
ldclt[40687]: Average rate:   20.40/thr  (  20.40/sec), total:    204

389-ds-base-1.3.7.5-18.el7.x86_64 (with the fix):
ldclt[39467]: Average rate:  192.90/thr  ( 192.90/sec), total:   1929
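
The ratio between the two runs can be checked directly from the ldclt summary lines quoted above. The sketch below parses only the "Average rate" field; the helper name average_rate is hypothetical.

```python
import re

RATE_RE = re.compile(r"Average rate:\s*([\d.]+)/thr")


def average_rate(line: str) -> float:
    """Extract the per-thread average rate from an ldclt summary line."""
    m = RATE_RE.search(line)
    if m is None:
        raise ValueError(f"no average rate found in: {line!r}")
    return float(m.group(1))


before = "ldclt[40687]: Average rate:   20.40/thr  (  20.40/sec), total:    204"
after = "ldclt[39467]: Average rate:  192.90/thr  ( 192.90/sec), total:   1929"

speedup = average_rate(after) / average_rate(before)
print(f"speedup: {speedup:.2f}x")  # prints "speedup: 9.46x"
```

For these two runs the ratio falls inside the reported 8-10x range.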

Marking as VERIFIED.

Comment 15 errata-xmlrpc 2018-04-10 14:16:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0811

