1458536 – Performance issues with RHDS 10 - NDN cache investigation.

Bug 1458536 - Performance issues with RHDS 10 - NDN cache investigation.

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 7
Classification:	Red Hat
Component:	389-ds-base
Sub Component:
Version:	7.3
Hardware:	All
OS:	Linux
Priority:	urgent
Severity:	urgent
Target Milestone:	pre-dev-freeze
Target Release:	7.5
Assignee:	mreynolds
QA Contact:	Viktor Ashirov
Docs Contact:	Marc Muehlfeld
URL:
Whiteboard:
Depends On:
Blocks:	1420851 1472344 1477926 1486128 1490412

Reported:	2017-06-04 03:26 UTC by Ash Westbrook
Modified:	2020-12-14 08:48 UTC
CC List:	6 users
Fixed In Version:	389-ds-base-1.3.7.5-4.el7
Doc Type:	Enhancement
Doc Text:	Directory Server now uses separate normalized DN caches for each worker thread. Previously, all worker threads shared a single normalized Distinguished Name (DN) cache. Consequently, if multiple clients performed operations on Directory Server at the same time, performance decreased. With this update, Directory Server creates a separate normalized DN cache for each worker thread. As a result, performance no longer decreases in the mentioned scenario.
Clone Of:
Clones:	1486128
Environment:
Last Closed:	2018-04-10 14:16:50 UTC
Target Upstream Version:
Embargoed:
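
The Doc Text above describes replacing one shared normalized DN cache with one cache per worker thread. The idea can be illustrated with a minimal Python sketch. It is not the 389-ds-base implementation (which is written in C): `normalize_dn` is a hypothetical toy normalizer that ignores escaped commas and matching rules, and `cached_normalize` is an illustrative name. The point is only that a thread-local cache needs no lock, so threads do not contend with each other on lookups.

```python
import threading

def normalize_dn(dn: str) -> str:
    """Toy normalizer (assumption): lowercase and trim each RDN component.

    Real DN normalization handles escaping and per-attribute matching rules.
    """
    rdns = [rdn.strip() for rdn in dn.split(",")]
    return ",".join(
        "=".join(part.strip().lower() for part in rdn.split("=", 1))
        for rdn in rdns
    )

# Each thread that touches _local gets its own independent attribute set.
_local = threading.local()

def cached_normalize(dn: str) -> str:
    """Look up the normalized DN in the calling thread's private cache."""
    cache = getattr(_local, "ndn_cache", None)
    if cache is None:
        cache = _local.ndn_cache = {}   # created lazily, once per thread
    try:
        return cache[dn]                # hit: no lock needed
    except KeyError:
        result = cache[dn] = normalize_dn(dn)
        return result

def cache_size() -> int:
    """Number of entries in the calling thread's cache."""
    return len(getattr(_local, "ndn_cache", {}))
```

The trade-off in this sketch is memory: the same DN may be cached once per thread, in exchange for lookups that never wait on another thread.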

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	389ds 389-ds-base issues 2389	0	None	None	None	2020-09-13 22:01:42 UTC
Red Hat Product Errata	RHBA-2018:0811	0	None	None	None	2018-04-10 14:17:45 UTC

Description Ash Westbrook 2017-06-04 03:26:45 UTC

Description of problem:

As we scale up the number of JBoss application servers connecting to the RHDS 10 directories, we see performance begin to degrade on the uniquemember group lookups that determine which groups a user is a member of. With only 1 or 2 JBoss servers connected to the directory, performance is good. When we add a 3rd or 4th, performance quickly degrades on the lookups with an LDAP filter of (&(uniquemember=userDN)(objectclass=companygroup)).
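
For readers reproducing the lookup, the membership filter quoted above can be assembled as in the following sketch. The function names are illustrative and not part of any library; the escaping follows the RFC 4515 rule that NUL, `(`, `)`, `*`, and `\` in an assertion value are written as a backslash plus two hex digits. The default object class `companygroup` is taken from the report.

```python
def escape_filter_value(value: str) -> str:
    """Escape a string for use as an LDAP filter assertion value (RFC 4515)."""
    out = []
    for ch in value:
        if ch in "\\*()\x00":
            out.append("\\%02x" % ord(ch))  # e.g. '*' becomes \2a
        else:
            out.append(ch)
    return "".join(out)

def group_membership_filter(user_dn: str,
                            group_objectclass: str = "companygroup") -> str:
    """Build the filter that finds groups listing user_dn as a uniquemember."""
    return "(&(uniquemember=%s)(objectclass=%s))" % (
        escape_filter_value(user_dn),
        escape_filter_value(group_objectclass),
    )

if __name__ == "__main__":
    # Example DN is a placeholder, not one from the reporter's directory.
    print(group_membership_filter("uid=jdoe,ou=People,dc=example,dc=com"))
```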

The directory is split into two databases: the userRoot database, which contains the root suffix for the directory, and the groupRoot database, which holds a sub-suffix. The group lookups with the filter listed above begin to slow down as the number of connections to the directory ramps up. Etimes climb into the 1 to 2 minute range and the CPU load rises. The response times on the userRoot database continue to be good.

The cache hit ratio on both databases is 95% or above. The file descriptor limit has been increased to 32K (the hard limit is 64K), and the limit on procs has also been increased to 32K.

Logconv has been run and there are no unindexed queries that show up in the report and there are thousands of connections that are left and listed as available.

The databases are on a separate partition that is mounted on an SSD drive, and the file system is ZFS. We have been able to isolate this problem down to a query and connection concurrency problem with the groupRoot db. We are looking for Red Hat support to provide additional recommendations for remedying this problem.

Comment 2 wibrown@redhat.com 2017-06-05 02:52:19 UTC

As far as what you can do to check, what's your nsslapd-threadnumber? Have you followed the performance tuning guide? Can you use HR etime to see what's going on there? What is mounted for /var/log? Can you disable COW on the userRoot/groupRoot dbs?

Comment 5 wibrown@redhat.com 2017-07-21 01:13:52 UTC

Upstream ticket:
https://pagure.io/389-ds-base/issue/49330

Comment 11 Viktor Ashirov 2018-02-19 15:15:03 UTC

Build tested:
389-ds-base-1.3.7.5-18.el7.x86_64

My testing server with 48 GB of RAM was configured with the following settings:

(default settings)
nsslapd-idlistscanlimit: 4000
nsslapd-dbcachesize: 536870912
nsslapd-cachememsize: 4563402752

I increased ndn-cache-max-size:
nsslapd-ndn-cache-max-size: 2097152000
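
The sizes above are given in bytes. As a reading aid, the following small sketch converts them to MiB; the attribute names and values are copied from this comment, and the helper name `to_mib` is illustrative.

```python
MIB = 1024 * 1024

# Values in bytes, as listed in the comment above.
settings = {
    "nsslapd-dbcachesize": 536870912,
    "nsslapd-cachememsize": 4563402752,
    "nsslapd-ndn-cache-max-size": 2097152000,
}

def to_mib(size_in_bytes: int) -> float:
    """Convert a byte count to mebibytes."""
    return size_in_bytes / MIB

if __name__ == "__main__":
    for name, value in settings.items():
        print(f"{name}: {to_mib(value):.0f} MiB")
```

With these values the NDN cache limit works out to 2000 MiB, against a 512 MiB database cache and a 4352 MiB entry cache.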

The directory contains 1 group with 10k members, and the search filter includes an unindexed component (description).

I see an 8-10x increase on average in search rate:
ldclt -D 'cn=Directory Manager' -w Secret123 -e esearch,random -r0 -R99999  -f "(&(description=*)(objectClass=groupOfUniqueNames)(uniqueMember=uid=uXXXXXX,ou=People,dc=example,dc=com))"

389-ds-base-1.3.6.1-19.el7_4.x86_64 (without the fix):
ldclt[40687]: Average rate:   20.40/thr  (  20.40/sec), total:    204

389-ds-base-1.3.7.5-18.el7.x86_64
ldclt[39467]: Average rate:  192.90/thr  ( 192.90/sec), total:   1929
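
The ratio between the two runs can be checked directly from the ldclt summary lines. The sketch below parses the two "Average rate" lines quoted above with a regular expression written for this specific line shape (an assumption, not a documented ldclt output contract) and divides the per-second rates.

```python
import re

# Matches e.g. "Average rate:   20.40/thr  (  20.40/sec), total:    204"
RATE_RE = re.compile(r"Average rate:\s*([\d.]+)/thr\s*\(\s*([\d.]+)/sec\)")

def average_rate_per_sec(line: str) -> float:
    """Extract the per-second average rate from an ldclt summary line."""
    match = RATE_RE.search(line)
    if match is None:
        raise ValueError("not an ldclt average-rate line: %r" % line)
    return float(match.group(2))

before = average_rate_per_sec(
    "ldclt[40687]: Average rate:   20.40/thr  (  20.40/sec), total:    204"
)
after = average_rate_per_sec(
    "ldclt[39467]: Average rate:  192.90/thr  ( 192.90/sec), total:   1929"
)
speedup = after / before

if __name__ == "__main__":
    print(f"speedup: {speedup:.1f}x")
```

For this pair of runs the ratio is roughly 9.5x, which sits inside the 8-10x range reported.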

Marking as VERIFIED.

Comment 15 errata-xmlrpc 2018-04-10 14:16:50 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0811
