
Bug 744132

Summary:	[RFE] return error code when server closes idle connection
Product:	Red Hat Enterprise Linux 6	Reporter:	Ondrej Valousek <ondrejv>
Component:	openldap	Assignee:	Jan Synacek <jsynacek>
Status:	CLOSED WONTFIX	QA Contact:	BaseOS QE Security Team <qe-baseos-security>
Severity:	low	Docs Contact:
Priority:	unspecified
Version:	6.1	CC:	dspurek, jhrozek, jplans, jsynacek, omoris, ovasik, sgallagh, syeghiay, tsmetana
Target Milestone:	rc	Keywords:	FutureFeature
Target Release:	---
Hardware:	i386
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	Enhancement
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2014-06-04 13:24:19 UTC	Type:	---
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description Ondrej Valousek 2011-10-07 08:08:42 UTC

Right now, the LDAP libraries return a generic error when an LDAP server closes a connection (for example, one that has been idle for too long). It would be nice if the LDAP library could return a specific error in this case, so that we know what has happened.

This is reproducible with Win 2008 based LDAP servers, for example.

Comment 2 Stephen Gallagher 2011-10-07 13:26:10 UTC

Specifically, it appears that we get back a response, but when we call ldap_result(), we get a return code of -1. There is no additional information in the diagnostic message to explain what has actually happened.

It would be better if we could get some information on what the failure actually was, so we could know whether it is safe to retry.

Comment 4 Jan Synacek 2012-03-20 10:01:42 UTC

Just a shot in the dark.. Struct ldap has a member 'ld_errno', which should contain more info about what happened. There are predefined values for that, check 'ldap.h'. There are LDAP_SERVER_DOWN or LDAP_TIMEOUT among others.

Could that be a possible solution?

Comment 5 Stephen Gallagher 2012-03-20 11:43:39 UTC

(In reply to comment #4)
> Just a shot in the dark.. Struct ldap has a member 'ld_errno', which should
> contain more info about what happened. There are predefined values for that,
> check 'ldap.h'. There are LDAP_SERVER_DOWN or LDAP_TIMEOUT among others.
> 
> Could that be a possible solution?

'struct ldap' is privately defined. The SSSD source can't look into its members without a helper routine.

Comment 6 Jan Synacek 2012-03-21 12:35:17 UTC

My point was that the functionality you propose is already there.

You only have to use ldap_get_option like so:
ldap_get_option(ldap, LDAP_OPT_RESULT_CODE, &res);

Then you can compare the value of res with values defined in ldap.h - section commented as 'API Error Codes'.

It's only an unlucky coincidence (or a bad design error) that the value of LDAP_SERVER_DOWN is -1, as is the value for unknown error returned in a different context.

Simple test results in:

(... server killed)
ldap bind failed: Can't contact LDAP server
rc: -1 (== LDAP_SERVER_DOWN)

(... server stopped to simulate a timeout)
ldap bind failed: Timed out
rc: -5 (== LDAP_TIMEOUT)

Comment 7 Jakub Hrozek 2012-04-17 06:48:17 UTC

(In reply to comment #6)
> It's only an unlucky coincidence (or a bad design error) that the value of
> LDAP_SERVER_DOWN is -1, as is the value for unknown error returned in a
> different context.
> 

Sorry, I still don't get it. When you call ldap_result() followed by ldap_get_option(ld, LDAP_OPT_RESULT_CODE, &err), how do you distinguish between the two meanings of -1? How does ldap_err2string() do that?

Comment 8 Jakub Hrozek 2012-04-25 09:41:41 UTC

I was playing with ldap_result and LDAP_OPT_RESULT_CODE and I'm still not sure this meets all our requirements.

To test, I set up an OpenLDAP server and set olcIdleTimeout to 5 seconds. When a subsequent request came in after more than 5 seconds, ldap_err2string only reported:
"Can't contact LDAP server".

The problem is that "Can't contact LDAP server" is not specific, so we can't decide whether to retry the same server or move to the next configured server.

Our result handling looks somewhat like this:

if (ldap_result() == -1) {
   ldap_get_option(ld,  LDAP_OPT_RESULT_CODE, &err);
   log_error("%s\n", ldap_err2string(ret));
}

I also added ldap_get_option(ld, LDAP_OPT_DIAGNOSTIC_MESSAGE, &msg) to get extra information, but that only returned NULL in this case.

Comment 9 Jan Synacek 2012-04-25 10:19:09 UTC

if (ldap_result() == -1) {
   ldap_get_option(ld,  LDAP_OPT_RESULT_CODE, &err);
   log_error("%s\n", ldap_err2string(err));
}

If err == -5, which is the value of LDAP_TIMEOUT, ldap_err2string(err) results in "Timed out".

Notice the usage of err in ldap_err2string.

Comment 10 Jan Synacek 2012-04-25 10:26:01 UTC

> Notice the usage of err in ldap_err2string.
Forgot to emphasize it's the same 'err' you get via ldap_get_option.

Comment 11 Jan Vcelak 2012-04-25 11:03:23 UTC

Jakub is right. When the server cuts off the connection, LDAP_OPT_RESULT_CODE returns LDAP_SERVER_DOWN (-1), which is not useful for telling a dropped idle connection apart from an unreachable server. LDAP_TIMEOUT (-5) is returned only if the client does not receive the response in time.

Comment 12 Jan Vcelak 2012-04-25 14:21:35 UTC

I was trying to come up with some solution. Jakub, is there some workaround you are using now?

Just an idea: SSSD establishes the connection itself, right? What about storing a flag after a successful bind to record that the server works, and then trying to reconnect when the connection is dropped if this flag is set?

Maybe the information about the dropped connection can be obtained somehow from the sockbuf associated with the handle. I haven't tried that yet.

Comment 14 Jakub Hrozek 2012-05-15 14:31:41 UTC

(In reply to comment #12)
> I was trying to come with some solution. Jakub, is there some workaround you
> are using now?
> 
> Just an idea: SSSD establishes the connection itself, right? What about storing
> some flag after successful binding to the server that the server works. And
> when the connection is dropped try to reconnect if this flag is set.
> 
> Maybe the information about dropped connection can be obtained somehow from
> sockbuf associated with the handle. I haven't tried yet.

Yes, we have a list of recoverable errors after which we retry the connection attempt. We retry each server in the failover list unless we receive a fatal error such as ENOMEM.
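
Roughly, the retry logic described above could be sketched like this. This is not the actual SSSD code: all ex_ names are hypothetical, the set of recoverable errors (ETIMEDOUT, ECONNREFUSED, ECONNRESET) is an assumption because the real list is not given in this bug, and ex_fake_connect is only a stub standing in for a real connection attempt.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical connect callback: returns 0 on success or an errno-style code. */
typedef int (*ex_connect_fn)(const char *server);

/* Assumed set of recoverable errors; the real list is not given in this bug. */
static bool ex_is_recoverable(int err)
{
    switch (err) {
    case ETIMEDOUT:
    case ECONNREFUSED:
    case ECONNRESET:
        return true;
    default:
        return false;  /* e.g. ENOMEM is treated as fatal */
    }
}

/*
 * Try each server in order. On success, return 0 and store the index of the
 * server that accepted the connection in *chosen. Otherwise return the fatal
 * error that stopped the loop, or the last recoverable error if every server
 * failed. *chosen is left untouched on failure.
 */
int ex_connect_failover(const char *const *servers, size_t count,
                        ex_connect_fn try_connect, size_t *chosen)
{
    int last_err = ENOENT;  /* reported when the list is empty */

    assert(servers != NULL && try_connect != NULL && chosen != NULL);
    for (size_t i = 0; i < count; i++) {
        int err = try_connect(servers[i]);
        if (err == 0) {
            *chosen = i;
            return 0;
        }
        last_err = err;
        if (!ex_is_recoverable(err)) {
            break;  /* fatal error: stop walking the list */
        }
    }
    return last_err;
}

/* Stub for illustration: "down" refuses, "oom" fails fatally, others connect. */
int ex_fake_connect(const char *server)
{
    if (strcmp(server, "down") == 0) return ECONNREFUSED;
    if (strcmp(server, "oom") == 0) return ENOMEM;
    return 0;
}
```

The limitation discussed in this bug still applies: since a connection closed by the server for idleness is reported as LDAP_SERVER_DOWN, a classifier like this cannot tell it apart from a server that is really unreachable.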

Comment 15 RHEL Program Management 2012-07-10 08:29:53 UTC

This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 16 RHEL Program Management 2012-07-11 01:44:48 UTC

This request was erroneously removed from consideration in Red Hat Enterprise Linux 6.4, which is currently under development.  This request will be evaluated for inclusion in Red Hat Enterprise Linux 6.4.

Comment 17 RHEL Program Management 2012-12-14 08:27:49 UTC

This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.