Bug 450528

Summary:

"binder-based resource limits - slapi_reslimit_get_integer_limit(): slapi_get_object_extension() returned NULL" error and server abnormal exit

Product:

[Retired] 389

Reporter:

Aleksander Adamowski <bugs-redhat>

Component:

Directory Server

Assignee:

Rich Megginson <rmeggins>

Status:

CLOSED NEXTRELEASE

QA Contact:

Orla Hegarty <ohegarty>

Severity:

high

Docs Contact:

Priority:

low

Version:

1.1.0

CC:

ckannan

Target Milestone:

---

Target Release:

---

Hardware:

x86_64

OS:

Linux

Whiteboard:

Fixed In Version:

Doc Type:

Bug Fix

Last Closed:

2009-03-25 20:04:00 UTC

Attachments:

Description	Flags
errors log	none

Description Aleksander Adamowski 2008-06-09 12:20:22 UTC

Description of problem:

I've had the server exit abnormally twice in a row with the following errors in
the logs:

[09/Jun/2008:13:30:53 +0200] NSMMReplicationPlugin - agmt="cn=with_server3" (server3:636): Beginning linger on the connection
[09/Jun/2008:13:30:53 +0200] NSMMReplicationPlugin - agmt="cn=with_server3" (server3:636): State: sending_updates -> wait_for_changes
[09/Jun/2008:13:30:53 +0200] - _cl5PositionCursorForReplay (agmt="cn=with_server1" (server1:636)): Consumer RUV:
[09/Jun/2008:13:30:53 +0200] NSMMReplicationPlugin - agmt="cn=with_server1" (server1:636): {replicageneration} 47ffda11000000010000
[09/Jun/2008:13:30:53 +0200] NSMMReplicationPlugin - agmt="cn=with_server1" (server1:636): {replica 3 ldap://server1.example.com:389} 48018448000000030000 484d30e9000000030000 00000000
[09/Jun/2008:13:30:53 +0200] NSMMReplicationPlugin - agmt="cn=with_server1" (server1:636): {replica 1 ldap://server3.example.com:389} 47ffe6c4000000010000 484d1afe000000010000 00000000
[09/Jun/2008:13:30:53 +0200] NSMMReplicationPlugin - agmt="cn=with_server1" (server1:636): {replica 2 ldap://server2.example.com:389} 4800030b000000020000 484d2f93000000020000 00000000
[09/Jun/2008:13:30:53 +0200] - _cl5PositionCursorForReplay (agmt="cn=with_server1" (server1:636)): Supplier RUV:
[09/Jun/2008:13:30:53 +0200] NSMMReplicationPlugin - agmt="cn=with_server1" (server1:636): {replicageneration} 47ffda11000000010000
[09/Jun/2008:13:30:53 +0200] NSMMReplicationPlugin - agmt="cn=with_server1" (server1:636): {replica 2 ldap://server2.example.com:389} 4800030b000000020000 484d2f93000000020000 00000000
[09/Jun/2008:13:30:53 +0200] NSMMReplicationPlugin - agmt="cn=with_server1" (server1:636): {replica 3 ldap://server1.example.com:389} 48018448000000030000 484d30e9000000030000 00000000
[09/Jun/2008:13:30:53 +0200] NSMMReplicationPlugin - agmt="cn=with_server1" (server1:636): {replica 1 ldap://server3.example.com:389} 47ffe6c4000000010000 484d1afe000000010000 00000000
[09/Jun/2008:13:30:53 +0200] NSMMReplicationPlugin - agmt="cn=with_server1" (server1:636): No changes to send
[09/Jun/2008:13:30:53 +0200] NSMMReplicationPlugin - agmt="cn=with_server1" (server1:636): Successfully released consumer
[09/Jun/2008:13:30:53 +0200] NSMMReplicationPlugin - agmt="cn=with_server1" (server1:636): Beginning linger on the connection
[09/Jun/2008:13:30:53 +0200] NSMMReplicationPlugin - agmt="cn=with_server1" (server1:636): State: sending_updates -> wait_for_changes
[09/Jun/2008:13:30:53 +0200] binder-based resource limits - slapi_reslimit_get_integer_limit(): slapi_get_object_extension() returned NULL
[09/Jun/2008:13:30:53 +0200] binder-based resource limits - slapi_reslimit_get_integer_limit(): slapi_get_object_extension() returned NULL



Version-Release number of selected component (if applicable):
fedora-ds-1.1.0-3.fc6 on RHEL5 on x86_64

How reproducible:
Hard to reproduce, exact cause unknown. Attaching error logs that include
replication debugging data.
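
Editorial note: for triage, a rough way to pull the reslimit messages out of an errors log is sketched below. It assumes each entry sits on a single line in the `[timestamp] subsystem - message` shape quoted in this report. The function name `find_reslimit_errors` and the regular expression are illustrative only and are not part of the server.

```python
import re
from datetime import datetime

# Assumed entry shape: "[09/Jun/2008:13:30:53 +0200] subsystem - message"
ENTRY_RE = re.compile(r"^\[(?P<ts>[^\]]+)\]\s+(?P<rest>.*)$")

# Message text taken from the log excerpt in this report.
NEEDLE = "slapi_get_object_extension() returned NULL"


def find_reslimit_errors(lines):
    """Return the timestamps of entries that contain the reslimit message."""
    hits = []
    for line in lines:
        match = ENTRY_RE.match(line.strip())
        if match and NEEDLE in match.group("rest"):
            # Timestamps in the excerpt look like 09/Jun/2008:13:30:53 +0200.
            hits.append(datetime.strptime(match.group("ts"), "%d/%b/%Y:%H:%M:%S %z"))
    return hits
```

Passing an open file object (for example `find_reslimit_errors(open(path))`, where `path` is wherever the errors log lives) would list when the message appeared, which can then be lined up against the access log.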

Comment 1 Aleksander Adamowski 2008-06-09 12:20:23 UTC

Created attachment 308681 [details]
errors log

Comment 2 Aleksander Adamowski 2008-06-09 12:23:23 UTC

I can also supply access and audit logs; however, these contain mildly sensitive
data, so I would need to send them to a private Red Hat address.

Comment 3 Rich Megginson 2008-06-09 18:02:26 UTC

Is there anything in the access log which looks suspicious?  If so, perhaps you
could attach a small excerpt with any sensitive data obscured.

Are there any messages in the syslog which show the process was killed or
crashed?  Any core dumps?  Note that it does not appear that the reslimit errors
would cause the server to exit - so this message is probably a symptom of the
actual failure.

Are you using bind dn based resource limits?

Comment 4 Aleksander Adamowski 2008-06-10 10:16:31 UTC

I'm not using resource limits anywhere.

Searching with this filter on my directory yields 0 entries as a result:

'(|(nsLookThroughLimit=*)(nsSizeLimit=*)(nsTimeLimit=*)(nsIdleTimeout=*))'

I'll send the access and audit logs to your private mail address.
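
Editorial note: the filter above is an OR of presence tests, one per attribute. A minimal sketch of building such a filter programmatically follows. The helper name `presence_or_filter` is hypothetical, and the attribute list is simply the four attributes from the reporter's filter, not necessarily every resource limit attribute the server supports.

```python
# Attributes copied from the filter quoted in this comment.
RESLIMIT_ATTRS = ["nsLookThroughLimit", "nsSizeLimit", "nsTimeLimit", "nsIdleTimeout"]


def presence_or_filter(attrs):
    """Build an LDAP filter matching entries that have any of the attributes."""
    if not attrs:
        raise ValueError("at least one attribute is required")
    # Each "(name=*)" is a presence test; "(|...)" ORs them together.
    return "(|" + "".join(f"({name}=*)" for name in attrs) + ")"


print(presence_or_filter(RESLIMIT_ATTRS))
```

The resulting string can be handed to whatever LDAP client is in use. Zero matching entries, as reported here, means no entry carries any of these attributes.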

Comment 5 Rich Megginson 2008-06-23 18:54:58 UTC

This problem only occurs on a VM guest, correct?  Can this problem be reproduced
on a bare metal system?

Comment 6 Aleksander Adamowski 2008-06-23 23:00:33 UTC

I'll try; however, the bare metal system is a test system and I doubt I can
simulate this specific load.

Reproducing problems on LDAP servers is usually quite tricky, since most of the
time the actual query contents matter, not the number of simultaneous
connections etc. Every detail might be important, like the scope, the attributes
requested, and the actual data returned.

On OpenLDAP I once discovered a socket leak that was caused by
binding multiple times with different DNs on the same connection over a long
period of time.

Comment 7 Rich Megginson 2008-06-23 23:20:33 UTC

Ok.  It will help a great deal if we can determine that this is a Xen-only bug.

Comment 8 Rich Megginson 2009-03-25 20:04:00 UTC

This may be fixed with the next release.