979045 – sssd_be goes to 99% CPU and causes significant login delays when client is under load

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	Red Hat Enterprise Linux 7
Classification:	Red Hat
Component:	sssd
Sub Component:
Version:	7.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	unspecified
Target Milestone:	rc
Target Release:	---
Assignee:	Jakub Hrozek
QA Contact:	Kaushik Banerjee
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	979046 979047

Reported:	2013-06-27 13:34 UTC by Dmitri Pal
Modified:	2020-05-02 17:16 UTC
CC List:	5 users
Fixed In Version:	sssd-1.10.0-18.el7
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Clones:	979046
Environment:
Last Closed:	2014-06-13 10:28:38 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	SSSD sssd issues 2848	0	None	closed	sssd_be goes to 99% CPU and causes significant login delays when client is under load	2020-11-10 09:55:30 UTC

Description Dmitri Pal 2013-06-27 13:34:13 UTC

This bug is created as a clone of upstream ticket:
https://fedorahosted.org/sssd/ticket/1806

I have a system with a reproducible problem with sssd when under load.

The sssd.log shows a recurring series of messages stating: A service PING timed out on [domain.com]. Attempt [0]

Followed by: Killing service [expertcity.com], not responding to pings!

Following a restart of sssd, the sssd_be process spikes to 99% CPU, and a delay of 30-60 seconds can be experienced when connecting to the device over ssh. Subsequent logins seem fine until whichever cache is affected needs to be renewed again, which in turn reproduces the long delay.

The system is a VM with 2 cores assigned. A load anywhere from 4 to 12 is enough to reproduce the issue.
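
To gauge how often the monitor gives up on a service, the two messages quoted above can be counted per service name. The following is a minimal sketch, not part of SSSD: the regular expressions are built only from the two message texts in this report, the rest of each log line (timestamp, function name, debug level) is assumed to be free-form, and `summarize` is a hypothetical helper name.

```python
import re
from collections import Counter

# Patterns derived from the two messages quoted in this report. Anything
# else on the log line (timestamps, function names) is ignored.
PING_TIMEOUT = re.compile(
    r"A service PING timed out on \[([^\]]+)\]\. Attempt \[(\d+)\]"
)
KILLED = re.compile(r"Killing service \[([^\]]+)\], not responding to pings!")


def summarize(lines):
    """Count ping timeouts and kills per service name (hypothetical helper)."""
    timeouts, kills = Counter(), Counter()
    for line in lines:
        match = PING_TIMEOUT.search(line)
        if match:
            timeouts[match.group(1)] += 1
            continue
        match = KILLED.search(line)
        if match:
            kills[match.group(1)] += 1
    return timeouts, kills


def report(path):
    """Print one summary line per service found in the log file at `path`."""
    with open(path, errors="replace") as handle:
        timeouts, kills = summarize(handle)
    for name in sorted(set(timeouts) | set(kills)):
        print(f"{name}: {timeouts[name]} ping timeouts, {kills[name]} kills")
```

Calling `report()` with the path to sssd.log (commonly under /var/log/sssd/, though that location is an assumption here) prints one line per service. A kill count that rises with system load would match the behaviour described in this report.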

Comment 1 Namita Soman 2013-06-27 14:28:29 UTC

Please add steps to verify this

Comment 2 Jakub Hrozek 2013-06-27 14:51:09 UTC

(In reply to Namita Soman from comment #1)
> Please add steps to verify this

To reproduce, create a very large hostgroup. I used 2000 hosts in my testing, which was enough to make the bug visible. Then create an HBAC rule that references this hostgroup. Make sure to disable the allow_all rule for the test. Then log in to the client.

With the unpatched SSSD, the CPU will spike for a couple of seconds. With the patch, the login should be noticeably faster and there should be no CPU spike.

I would also advise making sure the user is not a member of large groups, so that the memberof plugin does not do much work during login. That way the only CPU-intensive task is saving the large hostgroup.

Alternatively (or in addition) you could check that the "member" attribute is not downloaded from the server at all.

Comment 3 Jakub Hrozek 2013-06-27 14:52:05 UTC

Fixed upstream.

Comment 4 Jakub Hrozek 2013-10-04 13:22:52 UTC

Temporarily moving bugs to MODIFIED to work around errata tool bug

Comment 6 Namita Soman 2014-02-24 18:45:32 UTC

Tested using ipa-server-3.3.3-18.el7.x86_64, sssd-1.11.2-40.el7.x86_64, ipa-client-3.3.3-18.el7.x86_64

Added a host group - hostgroup1
Added 2000 hosts
Added these hosts to the hostgroup
Installed ipaclient, and added that host to same hostgroup
Added hbac rule, allowing user (user one) to access hosts in the hostgroup (hostgroup1), and allowing access to a service (sshd).
Disabled hbac rule allow_all 
Ran kdestroy
ssh'd as user (one) from master server to the host where the rhel 7 client is installed.
Was able to log in quickly.

There were no CPU spikes or messages in sssd_testrelm.test.log
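
The server-side setup steps above could be scripted roughly as follows. This is a sketch under stated assumptions, not the procedure actually used in this test. It only builds the `ipa` command lines and does not execute them. The host naming scheme, the rule name `allow_hostgroup1`, and the domain `testrelm.test` (inferred from the log file name) are hypothetical. The `kdestroy` and ssh login steps, and adding the real client host to the hostgroup, are left to be done by hand.

```python
import shlex


def repro_commands(hostgroup="hostgroup1", rule="allow_hostgroup1",
                   user="one", domain="testrelm.test", n_hosts=2000):
    """Build the IPA CLI calls for the large-hostgroup test as argv lists.

    The rule name, host names, and domain are placeholders for illustration.
    """
    hosts = [f"host{i:04d}.{domain}" for i in range(1, n_hosts + 1)]
    cmds = [["ipa", "hostgroup-add", hostgroup]]
    # --force lets placeholder hosts be added without matching DNS records.
    cmds += [["ipa", "host-add", host, "--force"] for host in hosts]
    cmds += [
        ["ipa", "hostgroup-add-member", hostgroup, f"--hosts={host}"]
        for host in hosts
    ]
    cmds += [
        ["ipa", "hbacrule-add", rule],
        ["ipa", "hbacrule-add-user", rule, f"--users={user}"],
        ["ipa", "hbacrule-add-host", rule, f"--hostgroups={hostgroup}"],
        ["ipa", "hbacrule-add-service", rule, "--hbacsvcs=sshd"],
        # Without this, allow_all would grant access regardless of the new rule.
        ["ipa", "hbacrule-disable", "allow_all"],
    ]
    return cmds


# Preview with a small host count before generating the full 2000-host set.
for argv in repro_commands(n_hosts=2):
    print(shlex.join(argv))
```

Returning argv lists rather than running the commands keeps the sketch safe to inspect. On an IPA server with a valid admin ticket, the lists could be passed to `subprocess.run` one at a time.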

Comment 7 Ludek Smid 2014-06-13 10:28:38 UTC

This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.
