Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.

Bug 887801

Summary:

Chainingbackend longduration test(test03) failed with "Insufficient access" error=50

Product:

Red Hat Enterprise Linux 7

Reporter:

Sankar Ramalingam <sramling>

Component:

389-ds-base

Assignee:

Noriko Hosoi <nhosoi>

Status:

CLOSED WORKSFORME

QA Contact:

Viktor Ashirov <vashirov>

Severity:

high

Docs Contact:

Priority:

unspecified

Version:

7.0

CC:

jgalipea, mreynolds, nhosoi, nkinder

Target Milestone:

Keywords:

Reopened

Target Release:

---

Hardware:

All

OS:

Linux

Whiteboard:

Fixed In Version:

Doc Type:

Bug Fix

Doc Text:

Story Points:

---

Clone Of:

Environment:

Last Closed:

2016-11-10 01:25:30 UTC

Type:

Bug

Regression:

---

Mount Type:

---

Documentation:

---

CRM:

Verified Versions:

Category:

---

oVirt Team:

---

RHEL 7.3 requirements from Atomic Host:

Cloudforms Team:

---

Target Upstream Version:

Embargoed:

Attachments:

Description	Flags
svn diff chainingbackend_longduration.sh	none

Description Sankar Ramalingam 2012-12-17 11:44:58 UTC

Description of problem: 
         Chainingbackend longduration test(test03) failed with  "ldclt[32764]: T013: Cannot ldap_delete (cn=monsieur00000,o=farm1,o=cbrLD.com), error=50 (Insufficient access)" error. I also observed a comment for test01, which says, the ldclt fails at first attempt and the same passes at second.

Version-Release number of selected component (if applicable): 389-ds-base-1.2.11.15-6


How reproducible: Consistently


Steps to Reproduce:
1. Run Chainingbackend longduration tests with the latest 389-ds-base.
2. Test01 will have failures, but the same would be passing in the second attempt. Hence the test is marked PASS.
3. Test03 fails since only once its executed.

/bin/ldclt-bin -h hp-ml370g6-01.rhts.eng.bos.redhat.com -p 4980 -D uid=buser,o=cbrLD.com -w binduser -e delete,incr,noloop -r 0 -R 50000 -b o=farm1,o=cbrLD.com -f cn=monsieurXXXXX -I 68 -I 32 -q -n 20 -N 720 -W 1
ldclt[32764]: T013: Cannot ldap_delete (cn=monsieur00000,o=farm1,o=cbrLD.com), error=50 (Insufficient access)
ldclt[32764]: T000: Cannot ldap_delete (cn=monsieur00000,o=farm1,o=cbrLD.com), error=50 (Insufficient access)
ldclt[32764]: T004: Cannot ldap_delete (cn=monsieur00001,o=farm1,o=cbrLD.com), error=50 (Insufficient access) 

  
Actual results: Chainingbackend longduration tests reported failures. 

Expected results: All chainingbackend longduration tests should PASS.

Additional info:

Mail to Rich:
      I observed a test(test03) failure with chainingbackend longduration tests. Ldclt delete failed with "ldclt[32764]: T013: Cannot ldap_delete (cn=monsieur00000,o=farm1,o=cbrLD.com), error=50 (Insufficient access)" error.

When I run tests test01 and test03, both the tests PASSed.
http://hp-ml370g6-01.rhts.eng.bos.redhat.com/qa/archive/ds/rhel64/longduration/Linux/20121213-042854.html

When I run tests test01, test02 and test03, it failed for test03.
http://hp-ml370g6-01.rhts.eng.bos.redhat.com/qa/archive/ds/rhel64/longduration/Linux/20121213-081953.html

Ldclt delete failed with "ldclt[32764]: T013: Cannot ldap_delete (cn=monsieur00000,o=farm1,o=cbrLD.com), error=50 (Insufficient access)" error. 

Hence, I tried running an ldapsearch using the same user and I encountered "Administrative Limit" error at first time. Then, from the next attempt the same PASSed. LDCLT delete was also successful in the second run. Not sure why it was failing at first attempt.

Reply from Rich:
       Not sure.  This may be related to https://fedorahosted.org/389/ticket/192
That is, chaining gets back an error, tries to fail over, and the next operation after that fails?
At any rate, please file a bug.

Comment 2 Noriko Hosoi 2012-12-18 03:06:33 UTC

Created attachment 665249 [details]
svn diff chainingbackend_longduration.sh

I believe this is a test bug.

Farm1 and Farm2 has no to-be-chained entry, o=farm1,o=cbrLD.com and o=farm2,o=cbrLD.com, respectively.  This patch adds them in the startup, then test01 succeeds without repeating ldclt twice.  (I think ldclt automatically adds missing parent in the first run.)

Test Name                            PASS 	FAIL 	NORESULT
ChainingBackend longduration startup 100% (1/1) 	  	 
ChainingBackend longduration run     100% (1/1) 	  	 
(Note: I only ran test01.  Please apply the patch and run the rest.  Thanks!)

In addition, the ldif files (mux.ldif, farm1.ldif, and farm2.ldif) are missing from TET.  I guess the entries I mentioned above might have been in these files.  But the author of the test forgot to check in into subversion?
    $LDAPMODIFY -c -h $LDAPhost -p $CBR_MUX_PORT -D "$ROOTDN" -w $ROOTDNPW -f $TET_TMP_DIR/mux.ldif
    $LDAPMODIFY -c -h $LDAPhost -p $CBR_FARM1_PORT -D "$ROOTDN" -w $ROOTDNPW -f $TET_TMP_DIR/farm1.ldif
    $LDAPMODIFY -c -h $LDAPhost -p $CBR_FARM2_PORT -D "$ROOTDN" -w $ROOTDNPW -f $TET_TMP_DIR/farm2.ldif

Comment 3 Sankar Ramalingam 2012-12-19 11:41:47 UTC

    The above fix worked for test01. But, I am still getting the failure for test03. 
/usr/bin/ldclt-bin -h hp-ml370g6-01.rhts.eng.bos.redhat.com -p 4980 -D uid=buser,o=cbrLD.com -w binduser -e delete,incr,noloop -r 0 -R 50000 -b o=farm1,o=cbrLD.com -f cn=monsieurXXXXX -I 68 -I 32 -q -n 20 -N 720 -W 1

ldclt[23817]: T002: Cannot ldap_delete (cn=monsieur00000,o=farm1,o=cbrLD.com), error=50 (Insufficient access)
ldclt[23817]: T011: Cannot ldap_delete (cn=monsieur00000,o=farm1,o=cbrLD.com), error=50 (Insufficient access)
ldclt[23817]: T004: Cannot ldap_delete (cn=monsieur00001,o=farm1,o=cbrLD.com), error=50 (Insufficient access)

Comment 4 Noriko Hosoi 2012-12-19 17:12:49 UTC

Probably, you could do this to narrow down the problem:
Stop the test just before test03, and search "cn=monsieur00000,o=farm1,o=cbrLD.com" as directory manager as well as as "uid=buser,o=cbrLD.com".  Do you see the entry?

If yes, can you delete it as "uid=buser,o=cbrLD.com"?

Comment 5 Noriko Hosoi 2012-12-20 00:13:56 UTC

(In reply to comment #3)
>     The above fix worked for test01. But, I am still getting the failure for
> test03. 
> /usr/bin/ldclt-bin -h hp-ml370g6-01.rhts.eng.bos.redhat.com -p 4980 -D
> uid=buser,o=cbrLD.com -w binduser -e delete,incr,noloop -r 0 -R 50000 -b
> o=farm1,o=cbrLD.com -f cn=monsieurXXXXX -I 68 -I 32 -q -n 20 -N 720 -W 1
> 
> ldclt[23817]: T002: Cannot ldap_delete
> (cn=monsieur00000,o=farm1,o=cbrLD.com), error=50 (Insufficient access)
> ldclt[23817]: T011: Cannot ldap_delete
> (cn=monsieur00000,o=farm1,o=cbrLD.com), error=50 (Insufficient access)
> ldclt[23817]: T004: Cannot ldap_delete
> (cn=monsieur00001,o=farm1,o=cbrLD.com), error=50 (Insufficient access)

Odd.  I ran test01 through test03 and found the 3 tests passed 100%. 
TestCase [test03] result-> [PASS]
===== [Pass/Fail] break down =====
Test Name                               PASS 	FAIL 	NORESULT
ChainingBackend longduration startup    100% (1/1) 	  	 
ChainingBackend longduration run        100% (3/3)

Comment 6 RHEL Program Management 2012-12-23 06:47:09 UTC

This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 7 Noriko Hosoi 2013-01-07 16:39:52 UTC

Sankar, could you try the comment 4 item?
Thanks!

Comment 9 Sankar Ramalingam 2013-01-14 17:14:45 UTC

(In reply to comment #4)
> Probably, you could do this to narrow down the problem:
> Stop the test just before test03, and search
> "cn=monsieur00000,o=farm1,o=cbrLD.com" as directory manager as well as as
> "uid=buser,o=cbrLD.com".  Do you see the entry?
> 
> If yes, can you delete it as "uid=buser,o=cbrLD.com"?

[root@hp-ml370g6-01 ~]# /usr/bin/ldapsearch -h hp-ml370g6-01.rhts.eng.bos.redhat.com -p 4980 -D "cn=Directory Manager" -w Secret123 -b "cn=monsieur00000,o=farm1,o=cbrLD.com" objectClass=*
# search result
search: 2
result: 53 Server is unwilling to perform
text: database configuration error - please contact the system administrator
# numResponses: 1

[root@hp-ml370g6-01 ~]# /usr/bin/ldapsearch -h hp-ml370g6-01.rhts.eng.bos.redhat.com -p 4980 -D uid=buser,o=cbrLD.com -w binduser -b "cn=monsieur00000,o=farm1,o=cbrLD.com" objectClass=*
# monsieur00000, farm1, cbrLD.com
dn: cn=monsieur00000,o=farm1,o=cbrLD.com

[root@hp-ml370g6-01 ~]# /usr/bin/ldapsearch -h hp-ml370g6-01.rhts.eng.bos.redhat.com -p 4981 -D "cn=Directory Manager" -w Secret123 -b "cn=monsieur00000,o=farm1,o=cbrLD.com" objectClass=*
# monsieur00000, farm1, cbrLD.com
dn: cn=monsieur00000,o=farm1,o=cbrLD.com

[root@hp-ml370g6-01 ~]# /usr/bin/ldapsearch -h hp-ml370g6-01.rhts.eng.bos.redhat.com -p 4981 -D uid=buser,o=cbrLD.com -w binduser -b "cn=monsieur00000,o=farm1,o=cbrLD.com" objectClass=*

# monsieur00000, farm1, cbrLD.com
dn: cn=monsieur00000,o=farm1,o=cbrLD.com

Deleting the user "uid=buser,o=cbrLD.com" is successful.

[root@hp-ml370g6-01 ~]# /usr/bin/ldapdelete -h hp-ml370g6-01.rhts.eng.bos.redhat.com -p 4981 -D "cn=Directory Manager" -w Secret123 "uid=buser,o=cbrLD.com"

[root@hp-ml370g6-01 ~]# /usr/bin/ldapdelete -h hp-ml370g6-01.rhts.eng.bos.redhat.com -p 4980 -D "cn=Directory Manager" -w Secret123 "uid=buser,o=cbrLD.com"

[root@hp-ml370g6-01 ~]# /usr/bin/ldapsearch -h hp-ml370g6-01.rhts.eng.bos.redhat.com -p 4981 -D uid=buser,o=cbrLD.com -w binduser -b "cn=monsieur00000,o=farm1,o=cbrLD^Com" objectClass=*

[root@hp-ml370g6-01 ~]# /usr/bin/ldapsearch -h hp-ml370g6-01.rhts.eng.bos.redhat.com -p 4980 -D "cn=Directory Manager" -w Secret123 -b "uid=buser,o=cbrLD.com" objectClass=*
# extended LDIF
result: 32 No such object
matchedDN: o=cbrLD.com

[root@hp-ml370g6-01 ~]# /usr/bin/ldapsearch -h hp-ml370g6-01.rhts.eng.bos.redhat.com -p 4981 -D "cn=Directory Manager" -w Secret123 -b "uid=buser,o=cbrLD.com" objectClass=*
# extended LDIF
result: 32 No such object
matchedDN: o=cbrLD.com

Comment 10 Noriko Hosoi 2013-01-14 17:52:05 UTC

In the sentence,
> If yes, can you delete it as "uid=buser,o=cbrLD.com"?

I meant "it" was "cn=monsieur00000,o=farm1,o=cbrLD.com".  Sorry if it was not clear enough.

So, instead of these command lines:
> [root@hp-ml370g6-01 ~]# /usr/bin/ldapdelete -h
> hp-ml370g6-01.rhts.eng.bos.redhat.com -p 4981 -D "cn=Directory Manager" -w
> Secret123 "uid=buser,o=cbrLD.com"
> 
> [root@hp-ml370g6-01 ~]# /usr/bin/ldapdelete -h
> hp-ml370g6-01.rhts.eng.bos.redhat.com -p 4980 -D "cn=Directory Manager" -w
> Secret123 "uid=buser,o=cbrLD.com"

Please run just this command line:
/usr/bin/ldapdelete -h hp-ml370g6-01.rhts.eng.bos.redhat.com -p 4980 -D "uid=buser,o=cbrLD.com" -w binduser "cn=monsieur00000,o=farm1,o=cbrLD.com"

All I want to know was why "ldclt -h hp-ml370g6-01.rhts.eng.bos.redhat.com -p 4980 -D uid=buser,o=cbrLD.com -w binduser -e delete,..." failed with error 50.

(In reply to comment #9)
[...]
> 
> Deleting the user "uid=buser,o=cbrLD.com" is successful.
> 
> [root@hp-ml370g6-01 ~]# /usr/bin/ldapdelete -h
> hp-ml370g6-01.rhts.eng.bos.redhat.com -p 4981 -D "cn=Directory Manager" -w
> Secret123 "uid=buser,o=cbrLD.com"
> 
> [root@hp-ml370g6-01 ~]# /usr/bin/ldapdelete -h
> hp-ml370g6-01.rhts.eng.bos.redhat.com -p 4980 -D "cn=Directory Manager" -w
> Secret123 "uid=buser,o=cbrLD.com"
> 
> [root@hp-ml370g6-01 ~]# /usr/bin/ldapsearch -h
> hp-ml370g6-01.rhts.eng.bos.redhat.com -p 4981 -D uid=buser,o=cbrLD.com -w
> binduser -b "cn=monsieur00000,o=farm1,o=cbrLD^Com" objectClass=*
> 
> [root@hp-ml370g6-01 ~]# /usr/bin/ldapsearch -h
> hp-ml370g6-01.rhts.eng.bos.redhat.com -p 4980 -D "cn=Directory Manager" -w
> Secret123 -b "uid=buser,o=cbrLD.com" objectClass=*
> # extended LDIF
> result: 32 No such object
> matchedDN: o=cbrLD.com
> 
> [root@hp-ml370g6-01 ~]# /usr/bin/ldapsearch -h
> hp-ml370g6-01.rhts.eng.bos.redhat.com -p 4981 -D "cn=Directory Manager" -w
> Secret123 -b "uid=buser,o=cbrLD.com" objectClass=*
> # extended LDIF
> result: 32 No such object
> matchedDN: o=cbrLD.com

Comment 11 Noriko Hosoi 2013-01-15 00:52:21 UTC

Thank you for the test env, Sankar.
===========================================================================
ldapdelete works just fine as follows.  This is one of the same operations ldclt in test03 is issuing.  So, I don't think error=50 (Insufficient access) is occuring any more.

[sramling@hp-ml370g6-01 longduration]$ /usr/bin/ldapsearch -h hp-ml370g6-01.rhts.eng.bos.redhat.com -p 4980 -D uid=buser,o=cbrLD.com -w binduser -b "cn=monsieur00001,o=farm1,o=cbrLD.com" objectClass=*
# extended LDIF
#
# LDAPv3
# base <cn=monsieur00001,o=farm1,o=cbrLD.com> with scope subtree
# filter: objectClass=*
# requesting: ALL
#

# monsieur00001, farm1, cbrLD.com
dn: cn=monsieur00001,o=farm1,o=cbrLD.com
objectClass: person
objectClass: top
cn: monsieur00001
sn: toto sn

# search result
search: 2
result: 0 Success

# numResponses: 2
# numEntries: 1
[sramling@hp-ml370g6-01 longduration]$ /usr/bin/ldapdelete -h hp-ml370g6-01.rhts.eng.bos.redhat.com -p 4980 -D "uid=buser,o=cbrLD.com" -w binduser "cn=monsieur00001,o=farm1,o=cbrLD.com"
[sramling@hp-ml370g6-01 longduration]$ /usr/bin/ldapsearch -h hp-ml370g6-01.rhts.eng.bos.redhat.com -p 4980 -D uid=buser,o=cbrLD.com -w binduser -b "cn=monsieur00001,o=farm1,o=cbrLD.com" objectClass=*
# extended LDIF
#
# LDAPv3
# base <cn=monsieur00001,o=farm1,o=cbrLD.com> with scope subtree
# filter: objectClass=*
# requesting: ALL
#

# search result
search: 2
result: 32 No such object
text: database configuration error - please contact the system administrator

# numResponses: 1
===========================================================================
I also ran the small subset of ldclt and verified the utility works just fine.
ldclt -h hp-ml370g6-01.rhts.eng.bos.redhat.com -p 4980 -D uid=buser,o=cbrLD.com -w binduser -e delete,incr,noloop -r 100 -R 500 -b o=farm1,o=cbrLD.com -f cn=monsieurXXXXX -I 68 -I 32 -v -q -n 20 -N 720 -W 1/usr/bin/ldclt-bin -h hp-ml370g6-01.rhts.eng.bos.redhat.com -p 4980 -D uid=buser,o=cbrLD.com -w binduser -e delete,incr,noloop -r 100 -R 500 -b o=farm2,o=cbrLD.com -f cn=monsieurXXXXX -I 68 -I 32 -v -q -n 20 -N 720 -W 1

You can see the result hp-ml370g6-01.rhts.eng.bos.redhat.com:/tmp/test03_farm[12].out.0.
===========================================================================
I have no idea why it fails if ldclt is executed from TET.  Probably, you may want to run test03 from TET and back up error and access logs to tmp dir to examine what happened then?

Comment 12 Noriko Hosoi 2013-01-17 23:06:41 UTC

It looks chaining backend has an issue.  Some connection is established and used for operations without binding.  Search is allowed, but delete is rejected as expected with the acl settings.

This is the snippet of the access log.  conn=15 is established and used immediately for search without binding.  The searches are from ldclt.  Then, the same connection is used for ldclt delete, which fails since acl does not allow it to anonymous (no bind user).

    [16/Jan/2013:09:25:01 -0500] conn=15 fd=64 slot=64 connection from 10.16.66.124 to 10.16.66.124
    [16/Jan/2013:09:25:01 -0500] conn=15 op=0 SRCH base="o=farm1,o=cbrLD.com" scope=2 filter="(cn=monsieur14394)" attrs=ALL authzid="uid=buser,o=cbrld.com"
    [16/Jan/2013:09:25:01 -0500] conn=15 op=0 RESULT err=0 tag=101 nentries=0 etime=0
    [16/Jan/2013:09:25:01 -0500] conn=15 op=1 SRCH base="o=farm1,o=cbrLD.com" scope=2 filter="(cn=monsieur22398)" attrs=ALL authzid="uid=buser,o=cbrld.com"
    [16/Jan/2013:09:25:01 -0500] conn=15 op=1 RESULT err=0 tag=101 nentries=0 etime=0
    [16/Jan/2013:09:25:01 -0500] conn=15 op=2 SRCH base="o=farm1,o=cbrLD.com" scope=2 filter="(cn=monsieur12018)" attrs=ALL authzid="uid=buser,o=cbrld.com"
    ...
    [16/Jan/2013:10:07:33 -0500] conn=15 op=48176 DEL dn="cn=monsieur00000,o=farm1,o=cbrLD.com" authzid="uid=buser,o=cbrld.com"
    [16/Jan/2013:10:07:33 -0500] conn=15 op=48176 RESULT err=50 tag=107 nentries=0 etime=0
    ...
    [16/Jan/2013:10:07:47 -0500] conn=15 op=48188 DEL dn="cn=monsieur00005,o=farm1,o=cbrLD.com" authzid="uid=buser,o=cbrld.com"
    [16/Jan/2013:10:07:48 -0500] conn=15 op=48188 RESULT err=50 tag=107 nentries=0 etime=1

On farm1, all the err=50 are associated with conn=15.
#  egrep err=50 access | egrep -v "conn=15"
#

On farm2, all the err=50 are associated with conn=16.
# egrep err=50 access | egrep -v "conn=16"
#

This is the error log and access log of one delete op.  It shows the binddn is empty.

    # egrep "conn=15 op=48188" access errors
    access:[16/Jan/2013:10:07:47 -0500] conn=15 op=48188 DEL dn="cn=monsieur00005,o=farm1,o=cbrLD.com" authzid="uid=buser,o=cbrld.com"
    access:[16/Jan/2013:10:07:48 -0500] conn=15 op=48188 RESULT err=50 tag=107 nentries=0 etime=1
    errors:[16/Jan/2013:10:07:47 -0500] NSACLPlugin - #### conn=15 op=48188 binddn=""
    errors:[16/Jan/2013:10:07:47 -0500] NSACLPlugin - #### conn=15 op=48188 binddn=""
    errors:[16/Jan/2013:10:07:48 -0500] NSACLPlugin - conn=15 op=48188 (main): Deny proxy on entry(cn=monsieur00005,o=farm1,o=cbrld.com).attr(NULL) to anonymous: no aci matched the subject by aci(19): aciname= "troot", acidn="o=cbrld.com"
    errors:[16/Jan/2013:10:07:48 -0500] NSACLPlugin - conn=15 op=48188 (main): Deny delete on entry(cn=monsieur00005,o=farm1,o=cbrld.com).attr(NULL) to proxy (uid=buser,o=cbrld.com): error occurred

I grepped BIND in the access log on Farm1.  conn=13 and 15 are missing, which were running w/o bind.

    # egrep BIND access | egrep -v UNBIND
    [16/Jan/2013:07:32:05 -0500] conn=2 op=0 BIND dn="cn=directory manager" method=128 version=3
    [16/Jan/2013:07:32:05 -0500] conn=3 op=0 BIND dn="cn=directory manager" method=128 version=3
    [16/Jan/2013:07:32:08 -0500] conn=4 op=0 BIND dn="uid=troot,o=cbrld.com" method=128 version=3
    [16/Jan/2013:08:07:32 -0500] conn=5 op=0 BIND dn="uid=troot,o=cbrld.com" method=128 version=3
    [16/Jan/2013:08:07:32 -0500] conn=6 op=0 BIND dn="uid=troot,o=cbrld.com" method=128 version=3
    [16/Jan/2013:08:07:32 -0500] conn=7 op=0 BIND dn="uid=troot,o=cbrld.com" method=128 version=3
    [16/Jan/2013:08:07:32 -0500] conn=8 op=0 BIND dn="uid=troot,o=cbrld.com" method=128 version=3
    [16/Jan/2013:08:07:32 -0500] conn=9 op=0 BIND dn="uid=troot,o=cbrld.com" method=128 version=3
    [16/Jan/2013:08:07:32 -0500] conn=10 op=0 BIND dn="uid=troot,o=cbrld.com" method=128 version=3
    [16/Jan/2013:08:07:32 -0500] conn=11 op=0 BIND dn="uid=troot,o=cbrld.com" method=128 version=3
    [16/Jan/2013:08:07:53 -0500] conn=12 op=0 BIND dn="uid=troot,o=cbrld.com" method=128 version=3
    [16/Jan/2013:08:51:58 -0500] conn=14 op=0 BIND dn="uid=troot,o=cbrld.com" method=128 version=3
    [16/Jan/2013:10:07:31 -0500] conn=16 op=0 BIND dn="cn=directory manager" method=128 version=3

Since chaining backend manages connections in the connection pool, once it's used without bound, it continues being used.

Interestingly, the connection with no bind occurs when the concurrent connection count reaches maxconnections (20, by default) and woken by receiving the notification from other thread.  This case, conn=13 is unbound and closed, and conn=15 is established.

    [16/Jan/2013:09:25:01 -0500] conn=13 op=80931 SRCH base="o=farm1,o=cbrLD.com" sc
    ope=2 filter="(cn=monsieur47560)" attrs=ALL authzid="uid=buser,o=cbrld.com"
    [16/Jan/2013:09:25:01 -0500] conn=13 op=80931 RESULT err=0 tag=101 nentries=0 etime=0
    [16/Jan/2013:09:25:01 -0500] conn=13 op=80932 UNBIND
    [16/Jan/2013:09:25:01 -0500] conn=13 op=80932 fd=64 closed - U1
    [16/Jan/2013:09:25:01 -0500] conn=15 fd=64 slot=64 connection from 10.16.66.124 to 10.16.66.124
    [16/Jan/2013:09:25:01 -0500] conn=15 op=0 SRCH base="o=farm1,o=cbrLD.com" scope=2 filter="(cn=monsieur14394)" attrs=ALLauthzid="uid=buser,o=cbrld.com"
    [16/Jan/2013:09:25:01 -0500] conn=15 op=0 RESULT err=0 tag=101 nentries=0 etime=0
        ....

Only a suspicious code which could provide a connection without binding is the case that (pool->binit == 0).  But there is no place where 0 is setting to bindit.  It should be always 1.  But since we don't use the value we could get rid of the check (line 362) in the future.

    158 int
    159 cb_get_connection(cb_conn_pool * pool,
    160                   LDAP ** lld,
    161                   cb_outgoing_conn ** cc,
    162                   struct timeval * maxtime,
    163                   char **errmsg)
    164 {
        ...
    339                         ld = slapi_ldap_init(hostname, port, secure, isMulti    Thread);
        ...
    359                         /* no controls and simple bind only */
    360                         /* For now, bind even if no user to detect error */
    361                         /* earlier                                       */
    362                         if (pool->bindit) {
        ...
    397                                 /* Password-based client authentication */
    398                                 rc = slapi_ldap_bind(ld, binddn, plain, mech    , NULL, &serverctrls,
    399                                                      &bind_to, NULL);
    400
    401                                 if ( ret == 0 ) slapi_ch_free_string(&plain)    ; /* free plain only if it has been duplicated */
        ...

For the RHEL6.4, I suggest to increase the value of config parameter "nsOperationConnectionsLimit" to double (40), then we should be able to work around the problem, I think.

Comment 13 Sankar Ramalingam 2013-01-18 18:14:10 UTC

Changing the summary of the bug adn clearing other flags.

Comment 14 RHEL Program Management 2013-01-18 18:16:53 UTC

Quality Engineering Management has reviewed and declined this request.
You may appeal this decision by reopening this request.

Comment 16 Sankar Ramalingam 2013-01-20 10:51:15 UTC

(In reply to comment #12)
> It looks chaining backend has an issue.  Some connection is established and
> used for operations without binding.  Search is allowed, but delete is
> rejected as expected with the acl settings.
> 
> This is the snippet of the access log.  conn=15 is established and used
> immediately for search without binding.  The searches are from ldclt.  Then,
> the same connection is used for ldclt delete, which fails since acl does not
> allow it to anonymous (no bind user).
> 
>     [16/Jan/2013:09:25:01 -0500] conn=15 fd=64 slot=64 connection from
> 10.16.66.124 to 10.16.66.124
>     [16/Jan/2013:09:25:01 -0500] conn=15 op=0 SRCH
> base="o=farm1,o=cbrLD.com" scope=2 filter="(cn=monsieur14394)" attrs=ALL
> authzid="uid=buser,o=cbrld.com"
>     [16/Jan/2013:09:25:01 -0500] conn=15 op=0 RESULT err=0 tag=101
> nentries=0 etime=0
>     [16/Jan/2013:09:25:01 -0500] conn=15 op=1 SRCH
> base="o=farm1,o=cbrLD.com" scope=2 filter="(cn=monsieur22398)" attrs=ALL
> authzid="uid=buser,o=cbrld.com"
>     [16/Jan/2013:09:25:01 -0500] conn=15 op=1 RESULT err=0 tag=101
> nentries=0 etime=0
>     [16/Jan/2013:09:25:01 -0500] conn=15 op=2 SRCH
> base="o=farm1,o=cbrLD.com" scope=2 filter="(cn=monsieur12018)" attrs=ALL
> authzid="uid=buser,o=cbrld.com"
>     ...
>     [16/Jan/2013:10:07:33 -0500] conn=15 op=48176 DEL
> dn="cn=monsieur00000,o=farm1,o=cbrLD.com" authzid="uid=buser,o=cbrld.com"
>     [16/Jan/2013:10:07:33 -0500] conn=15 op=48176 RESULT err=50 tag=107
> nentries=0 etime=0
>     ...
>     [16/Jan/2013:10:07:47 -0500] conn=15 op=48188 DEL
> dn="cn=monsieur00005,o=farm1,o=cbrLD.com" authzid="uid=buser,o=cbrld.com"
>     [16/Jan/2013:10:07:48 -0500] conn=15 op=48188 RESULT err=50 tag=107
> nentries=0 etime=1
> 
> On farm1, all the err=50 are associated with conn=15.
> #  egrep err=50 access | egrep -v "conn=15"
> #
> 
> On farm2, all the err=50 are associated with conn=16.
> # egrep err=50 access | egrep -v "conn=16"
> #
> 
> This is the error log and access log of one delete op.  It shows the binddn
> is empty.
> 
>     # egrep "conn=15 op=48188" access errors
>     access:[16/Jan/2013:10:07:47 -0500] conn=15 op=48188 DEL
> dn="cn=monsieur00005,o=farm1,o=cbrLD.com" authzid="uid=buser,o=cbrld.com"
>     access:[16/Jan/2013:10:07:48 -0500] conn=15 op=48188 RESULT err=50
> tag=107 nentries=0 etime=1
>     errors:[16/Jan/2013:10:07:47 -0500] NSACLPlugin - #### conn=15 op=48188
> binddn=""
>     errors:[16/Jan/2013:10:07:47 -0500] NSACLPlugin - #### conn=15 op=48188
> binddn=""
>     errors:[16/Jan/2013:10:07:48 -0500] NSACLPlugin - conn=15 op=48188
> (main): Deny proxy on entry(cn=monsieur00005,o=farm1,o=cbrld.com).attr(NULL)
> to anonymous: no aci matched the subject by aci(19): aciname= "troot",
> acidn="o=cbrld.com"
>     errors:[16/Jan/2013:10:07:48 -0500] NSACLPlugin - conn=15 op=48188
> (main): Deny delete on
> entry(cn=monsieur00005,o=farm1,o=cbrld.com).attr(NULL) to proxy
> (uid=buser,o=cbrld.com): error occurred
> 
> I grepped BIND in the access log on Farm1.  conn=13 and 15 are missing,
> which were running w/o bind.
> 
>     # egrep BIND access | egrep -v UNBIND
>     [16/Jan/2013:07:32:05 -0500] conn=2 op=0 BIND dn="cn=directory manager"
> method=128 version=3
>     [16/Jan/2013:07:32:05 -0500] conn=3 op=0 BIND dn="cn=directory manager"
> method=128 version=3
>     [16/Jan/2013:07:32:08 -0500] conn=4 op=0 BIND dn="uid=troot,o=cbrld.com"
> method=128 version=3
>     [16/Jan/2013:08:07:32 -0500] conn=5 op=0 BIND dn="uid=troot,o=cbrld.com"
> method=128 version=3
>     [16/Jan/2013:08:07:32 -0500] conn=6 op=0 BIND dn="uid=troot,o=cbrld.com"
> method=128 version=3
>     [16/Jan/2013:08:07:32 -0500] conn=7 op=0 BIND dn="uid=troot,o=cbrld.com"
> method=128 version=3
>     [16/Jan/2013:08:07:32 -0500] conn=8 op=0 BIND dn="uid=troot,o=cbrld.com"
> method=128 version=3
>     [16/Jan/2013:08:07:32 -0500] conn=9 op=0 BIND dn="uid=troot,o=cbrld.com"
> method=128 version=3
>     [16/Jan/2013:08:07:32 -0500] conn=10 op=0 BIND
> dn="uid=troot,o=cbrld.com" method=128 version=3
>     [16/Jan/2013:08:07:32 -0500] conn=11 op=0 BIND
> dn="uid=troot,o=cbrld.com" method=128 version=3
>     [16/Jan/2013:08:07:53 -0500] conn=12 op=0 BIND
> dn="uid=troot,o=cbrld.com" method=128 version=3
>     [16/Jan/2013:08:51:58 -0500] conn=14 op=0 BIND
> dn="uid=troot,o=cbrld.com" method=128 version=3
>     [16/Jan/2013:10:07:31 -0500] conn=16 op=0 BIND dn="cn=directory manager"
> method=128 version=3
> 
> Since chaining backend manages connections in the connection pool, once it's
> used without bound, it continues being used.
> 
> Interestingly, the connection with no bind occurs when the concurrent
> connection count reaches maxconnections (20, by default) and woken by
> receiving the notification from other thread.  This case, conn=13 is unbound
> and closed, and conn=15 is established.
> 
>     [16/Jan/2013:09:25:01 -0500] conn=13 op=80931 SRCH
> base="o=farm1,o=cbrLD.com" sc
>     ope=2 filter="(cn=monsieur47560)" attrs=ALL
> authzid="uid=buser,o=cbrld.com"
>     [16/Jan/2013:09:25:01 -0500] conn=13 op=80931 RESULT err=0 tag=101
> nentries=0 etime=0
>     [16/Jan/2013:09:25:01 -0500] conn=13 op=80932 UNBIND
>     [16/Jan/2013:09:25:01 -0500] conn=13 op=80932 fd=64 closed - U1
>     [16/Jan/2013:09:25:01 -0500] conn=15 fd=64 slot=64 connection from
> 10.16.66.124 to 10.16.66.124
>     [16/Jan/2013:09:25:01 -0500] conn=15 op=0 SRCH
> base="o=farm1,o=cbrLD.com" scope=2 filter="(cn=monsieur14394)"
> attrs=ALLauthzid="uid=buser,o=cbrld.com"
>     [16/Jan/2013:09:25:01 -0500] conn=15 op=0 RESULT err=0 tag=101
> nentries=0 etime=0
>         ....
> 
> Only a suspicious code which could provide a connection without binding is
> the case that (pool->binit == 0).  But there is no place where 0 is setting
> to bindit.  It should be always 1.  But since we don't use the value we
> could get rid of the check (line 362) in the future.
> 
>     158 int
>     159 cb_get_connection(cb_conn_pool * pool,
>     160                   LDAP ** lld,
>     161                   cb_outgoing_conn ** cc,
>     162                   struct timeval * maxtime,
>     163                   char **errmsg)
>     164 {
>         ...
>     339                         ld = slapi_ldap_init(hostname, port, secure,
> isMulti    Thread);
>         ...
>     359                         /* no controls and simple bind only */
>     360                         /* For now, bind even if no user to detect
> error */
>     361                         /* earlier                                  
> */
>     362                         if (pool->bindit) {
>         ...
>     397                                 /* Password-based client
> authentication */
>     398                                 rc = slapi_ldap_bind(ld, binddn,
> plain, mech    , NULL, &serverctrls,
>     399                                                      &bind_to, NULL);
>     400
>     401                                 if ( ret == 0 )
> slapi_ch_free_string(&plain)    ; /* free plain only if it has been
> duplicated */
>         ...
> 
> For the RHEL6.4, I suggest to increase the value of config parameter
> "nsOperationConnectionsLimit" to double (40), then we should be able to work

This is not helping. I observed no change in the behaviour. 
Report : http://hp-ml370g6-01.rhts.eng.bos.redhat.com/qa/archive/ds/rhel64/longduration/Linux/20130118-134658.html

I still see "Cannot ldap_delete (cn=monsieur01940,o=farm2,o=cbrLD.com), error=50 (Insufficient access)" errors for both farm1 nad farm2.

> around the problem, I think.