Bug 1505454
Summary: IPA API returns TimeLimitExceeded without a reason

| Field | Value |
| --- | --- |
| Product | Red Hat Enterprise Linux 7 |
| Component | ipa |
| Version | 7.4 |
| Status | CLOSED CURRENTRELEASE |
| Severity | medium |
| Priority | unspecified |
| Reporter | Lea Bradacova <lea.bradacova> |
| Assignee | IPA Maintainers <ipa-maint> |
| QA Contact | ipa-qe <ipa-qe> |
| CC | abokovoy, lea.bradacova, mreynolds, pasik, pvoborni, rcritten, tbordaz, tscherf |
| Target Milestone | rc |
| Target Release | --- |
| Hardware | Unspecified |
| OS | Linux |
| Doc Type | If docs needed, set a value |
| Type | Bug |
| Regression | --- |
| Last Closed | 2018-08-22 09:19:54 UTC |
Description
Lea Bradacova
2017-10-23 15:33:18 UTC
---

Rob Crittenden (comment #2):

What hardcoded timelimit? What exactly did you change?

I seem to recall a 389-ds bug related to user-specific limits, cc'ing the 389-ds dev.

---

The method `get_ipa_config` does:

```python
try:
    # use find_entries here lest we hit an infinite recursion when
    # ldap2.get_entries tries to determine default time/size limits
    (entries, truncated) = self.find_entries(
        None, attrs_list, base_dn=dn, scope=self.SCOPE_BASE,
        time_limit=2, size_limit=10)
    self.handle_truncated_result(truncated)
    config_entry = entries[0]
```

I set time_limit to 20 instead of 2.

---

(In reply to Rob Crittenden from comment #2)
> What hardcoded timelimit? What exactly did you change?
>
> I seem to recall a 389-ds bug related to user-specific limits, cc'ing 389-ds
> dev.

There was a DS ticket created where user-specific resource limits were not honoured, but I just tested the server and it seems to be working fine. I set the anonymous resource time limit to 2 seconds, as well as the global time limit to 2 seconds. Then I set a user time limit to 300 seconds, and it works as expected. So I do not think there is a DS bug around the resource limits.

---

OK, get_ipa_config is a bit of a special case as it bootstraps the timeout for subsequent requests. This is different and completely separate from binding as a user and using 389-ds timeout values. Is your server under serious strain? How long is this taking? (The logs will tell you.)

---

Created attachment 1342690 [details]
CPU utilization graph, last week

Created attachment 1342691 [details]
Memory utilization graph, last week
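To make the failure mode in the `get_ipa_config` excerpt concrete: the bootstrap search runs with a hardcoded `time_limit=2`, so if that one search takes longer (the access logs show `etime=7`), the server returns err=3 (timeLimitExceeded) before any user-specific 389-ds limits even come into play. A minimal, self-contained sketch of this and of the reporter's workaround of raising the limit to 20; all names here (`fetch_config`, `slow_search`, `TimeLimitExceeded`) are illustrative stand-ins, not FreeIPA's actual API:

```python
class TimeLimitExceeded(Exception):
    """Stand-in for the err=3 (timeLimitExceeded) LDAP result."""


def fetch_config(search_fn, time_limit=2):
    """Bootstrap search with the hardcoded 2-second limit (original code)."""
    return search_fn(time_limit=time_limit)


def fetch_config_patched(search_fn, time_limit=20):
    """Same call with the limit raised from 2 to 20 (reporter's workaround)."""
    return search_fn(time_limit=time_limit)


def slow_search(time_limit):
    """Fake server whose base search takes ~7 seconds, as in the logs."""
    etime = 7
    if etime >= time_limit:
        raise TimeLimitExceeded("err=3")
    return {"dn": "cn=ipaconfig,cn=etc,dc=internal,dc=services"}
```

With this model, `fetch_config(slow_search)` raises the exception while `fetch_config_patched(slow_search)` succeeds, matching the behavior described in the thread.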
Attached are graphs of memory and CPU for the last week from our monitoring. There is almost no load on the server. During this time the error occurred 2 times:

```
[root@ipa-02 ~]# grep 'err=3 ' /var/log/dirsrv/slapd-INTERNAL-SERVICES/access*
/var/log/dirsrv/slapd-INTERNAL-SERVICES/access:[24/Oct/2017:08:28:39.954984501 +0200] conn=716022 op=3 RESULT err=3 tag=101 nentries=0 etime=4
/var/log/dirsrv/slapd-INTERNAL-SERVICES/access.20171022-212616:[23/Oct/2017:08:16:44.491743118 +0200] conn=676848 op=3 RESULT err=3 tag=101 nentries=0 etime=7

[root@ipa-02 ~]# grep conn=676848 /var/log/dirsrv/slapd-INTERNAL-SERVICES/access.20171022-212616
[23/Oct/2017:08:16:36.322275330 +0200] conn=676848 fd=83 slot=83 connection from 10.14.145.22 to 10.14.145.22
[23/Oct/2017:08:16:36.323317507 +0200] conn=676848 op=0 BIND dn="" method=sasl version=3 mech=GSSAPI
[23/Oct/2017:08:16:36.344171869 +0200] conn=676848 op=0 RESULT err=14 tag=97 nentries=0 etime=0, SASL bind in progress
[23/Oct/2017:08:16:36.344571433 +0200] conn=676848 op=1 BIND dn="" method=sasl version=3 mech=GSSAPI
[23/Oct/2017:08:16:36.346333801 +0200] conn=676848 op=1 RESULT err=14 tag=97 nentries=0 etime=0, SASL bind in progress
[23/Oct/2017:08:16:36.346487604 +0200] conn=676848 op=2 BIND dn="" method=sasl version=3 mech=GSSAPI
[23/Oct/2017:08:16:36.347887151 +0200] conn=676848 op=2 RESULT err=0 tag=97 nentries=0 etime=0 dn="uid=czechidm,cn=users,cn=accounts,dc=internal,dc=services"
[23/Oct/2017:08:16:36.591178422 +0200] conn=676848 op=3 SRCH base="cn=ipaconfig,cn=etc,dc=internal,dc=services" scope=0 filter="(objectClass=*)" attrs=ALL
[23/Oct/2017:08:16:44.491743118 +0200] conn=676848 op=3 RESULT err=3 tag=101 nentries=0 etime=7
[23/Oct/2017:08:16:44.493905102 +0200] conn=676848 op=4 UNBIND
[23/Oct/2017:08:16:44.493932632 +0200] conn=676848 op=4 fd=83 closed - U1
```

---

Upstream ticket: https://pagure.io/freeipa/issue/7260

---

Regarding the timelimit, you can enable log level 4 (or 5 = 4+1) to get more information on where the limit is coming from: http://www.port389.org/docs/389ds/FAQ/faq.html#Troubleshooting

Level 4 will say whether the limit comes from the request or not:

```
do_search - SRCH base="(null)" scope=0 deref=0 sizelimit=0 timelimit=3 attrsonly=0 filter="(objectClass=*)" attrs=ALL
```

Level 5 (4+1) is much more verbose but can show how the limit was computed:

```
slapi_reslimit_get_integer_limit - <= returning SUCCESS, value=3
```

Regarding the performance, it is surprising to see a base search lasting so long:

```
[23/Oct/2017:08:16:36.591178422 +0200] conn=676848 op=3 SRCH base="cn=ipaconfig,cn=etc,dc=internal,dc=services" scope=0 filter="(objectClass=*)" attrs=ALL
```

A guess is that requesting all the attributes can be slowed down by ACI evaluation. You may retry the search as Directory Manager, or take some pstacks while the search happens.

---

Mark, do you think this is related to https://bugzilla.redhat.com/show_bug.cgi?id=1515190 ?

---

(In reply to Rob Crittenden from comment #11)
> Mark, do you think this is related to
> https://bugzilla.redhat.com/show_bug.cgi?id=1515190 ?

Yes, it's definitely possible. But it depends on what is set in the anonymous resource limits entry. It's timing out after 7 seconds, which is pretty short. What is set in the anonymous limits entry?

---

Lea, an underlying issue in 389-ds was fixed in RHEL 7.5.z with the release of a fix for bug 1554720. Since there were no responses on this bug for 6 months, I'm closing it.

---

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days.
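The access-log excerpts in this thread follow a regular `RESULT` line format, which makes the err=3 triage done above easy to script. A small sketch that pulls out the time-limit failures and their `etime`; the regex is a best-effort match for the excerpts shown here, not an official 389-ds log parser:

```python
import re

# Match 389-ds access-log RESULT lines and capture the fields used in
# this bug: connection id, operation, error code, and elapsed time.
RESULT_RE = re.compile(
    r'conn=(?P<conn>\d+) op=(?P<op>\d+) RESULT '
    r'err=(?P<err>\d+) tag=\d+ nentries=\d+ etime=(?P<etime>\d+)'
)

def timelimit_hits(lines):
    """Return (conn, etime) for every err=3 (timeLimitExceeded) RESULT."""
    hits = []
    for line in lines:
        m = RESULT_RE.search(line)
        if m and m.group('err') == '3':
            hits.append((m.group('conn'), int(m.group('etime'))))
    return hits

log = [
    '[24/Oct/2017:08:28:39.954984501 +0200] conn=716022 op=3 RESULT err=3 tag=101 nentries=0 etime=4',
    '[23/Oct/2017:08:16:36.347887151 +0200] conn=676848 op=2 RESULT err=0 tag=97 nentries=0 etime=0 dn="uid=czechidm,cn=users,cn=accounts,dc=internal,dc=services"',
    '[23/Oct/2017:08:16:44.491743118 +0200] conn=676848 op=3 RESULT err=3 tag=101 nentries=0 etime=7',
]
print(timelimit_hits(log))  # -> [('716022', 4), ('676848', 7)]
```

Run against the excerpts above, this reproduces the two failures the reporter found with `grep 'err=3 '`, including the 7-second `etime` that the 389-ds developer flagged as surprisingly long for a base search.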