Bug 1326077

Summary: Page result search should return empty cookie if there is no returned entry
Product: Red Hat Enterprise Linux 7 Reporter: Marc Sauton <msauton>
Component: 389-ds-baseAssignee: Noriko Hosoi <nhosoi>
Status: CLOSED ERRATA QA Contact: Viktor Ashirov <vashirov>
Severity: high Docs Contact:
Priority: unspecified    
Version: 7.2CC: nhosoi, nkinder, rmeggins, spichugi, tbordaz
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: 389-ds-base-1.3.5.10-6.el7 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-11-03 20:41:07 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Marc Sauton 2016-04-11 18:10:48 UTC
Description of problem:

opening this bz to track the need to provide a hotfix for a customer case with patch from

  result search should return empty cookie if there is no returned entry
  https://fedorahosted.org/389/ticket/48752
  https://fedorahosted.org/389/attachment/ticket/48752/0002-Ticket-48752-Page-result-search-should-return-empty-.patch

if proven a good hotfix by the customer after a week or so of testing, this should go into a RHEL 7.2.z


the initial issue came in IdM/IPA context as
"ID management issue - stops responding, impossible to log in (locally only)"

the problem is related to the SSSD smart refresh feature, we see patterns of burst of LDAP requests that mostly return 0 entries that seem to randomly stop, edited log sample:

[07/Apr/2016:05:46:23 -0400] conn=1684020 op=322 SRCH base="ou=SUDOers,dc=example,dc=com" scope=2 filter="(&(&(objectClass=sudoRole)(entryusn>=1635896)(!(entryusn=1635896)))(|(!(sudoHost=*))(sudoHost=ALL)(sudoHost=host1234.example.com)(sudoHost=host1234)(sudoHost=192.168.10.14)(sudoHost=192.168.10.16/27)(sudoHost=fe80::f66d:4ff:fbb3:f1234)(sudoHost=fe80::/64)(sudoHost=+*)(|(sudoHost=*\5C*)(sudoHost=*?*)(sudoHost=*\2A*)(sudoHost=*[*]*))))" attrs="objectClass cn sudoCommand sudoHost sudoUser sudoOption sudoRunAs sudoRunAsUser sudoRunAsGroup sudoNotBefore sudoNotAfter sudoOrder entryusn"
[07/Apr/2016:05:46:23 -0400] conn=1684020 op=322 RESULT err=0 tag=101 nentries=0 etime=0 notes=P pr_idx=276

we hope the patch will stop those "cascades" of LDAP traffic with SSSD.

Thierry B. / tbordaz has been helping troubleshooting this issue.


Version-Release number of selected component (if applicable):
IPA server:
redhat-release-server-7.2-9.el7.x86_64
389-ds-base-1.3.4.0-21.el7_2.x86_64
sssd-1.13.0-40.el7_2.1.x86_64

IPA client:
redhat-lsb-4.0-7.el6.centos.x86_64
sssd-1.12.4-47.el6_7.7.x86_64


How reproducible:
N/A

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 4 Simon Pichugin 2016-06-29 15:27:36 UTC
Hi Thierry,
here - https://fedorahosted.org/389/ticket/48752#comment:10
you said that reproducing steps are unknown for this ticket.

Would it be enough just to test simple paged result test suite automation for a clean execution? Or should I perform some additional steps to verify the bug?

Comment 5 thierry bordaz 2016-06-29 15:57:06 UTC
Hi Simon,

I think it would be good to adapt some page result test to the fix https://fedorahosted.org/389/changeset/f215fb63a4ba2b57c44b29d6a88d655fb98917d1/

Now the cookie will be logged in access logs like:
conn=# op=# RESULT err=0 tag=101 nentries=5 etime=0 notes=P pr_idx=2 pr_cookie=2

Something we should check is the relation between 'nentries' and 'pr_cookie'

pr_cookie=-1  => empty cookie

nentries=x and pr_cookie=y 

  x==0 and y==-1: valid (there is no more entries left)
  x>0 and y==-1: valid (it is sending the lasts entries)
  
  y>=0 and x==0: valid (there is no more entries on that backend) BUT it needs to be followed by an other PR Search until y==-1
  y>=0 and x>0: valid (there is more entries to send)

I think some tests should be done with PR searches that covers several suffixes and able to retrieve entries from several suffixes

thanks
thierry

Comment 6 thierry bordaz 2016-07-13 15:46:45 UTC
Hi Simon, Noriko

With the provided test (test_multi_suffix_search) the fix works well but could be improved to help diagnose.

A page result search returns a cookie to the ldapclient. The range of value of the cookie is <empty>, 0, 1,...max_PR_req.

When the server has no more entries to send, it sends an empty cookie to the ldapclient. Else it send a valid cookie [0..max_PR_req].

It also logs the cookie value with:
pr_cookie=0  (for empty cookie)
pr_cookie=<valid_cookie>

The fact that 0 can represent both <empty cookie> and <valid_cookie_0> is not really helpful.


Changing the fix with:
diff --git a/ldap/servers/slapd/pagedresults.c b/ldap/servers/slapd/pagedresults.c
index 07a7b69..6fec344 100644
--- a/ldap/servers/slapd/pagedresults.c
+++ b/ldap/servers/slapd/pagedresults.c
@@ -247,7 +247,7 @@ pagedresults_set_response_control( Slapi_PBlock *pb, int iscritical,
 
     /* begin sequence, payload, end sequence */
     if (current_search_count < 0) {
-        cookie = 0;
+        cookie = -1;
         cookie_str = slapi_ch_strdup("");
     } else {
         cookie = index;


Creates the following logs:
[13/Jul/2016:17:31:45.439475112 +0200] conn=2 op=43 SRCH base="o=test_create" scope=2 filter="(uid=test*)" attrs="distinguishedName sn" 
[13/Jul/2016:17:31:45.439679286 +0200] conn=2 op=43 ENTRY dn="uid=test00642,o=test_create"
[13/Jul/2016:17:31:45.439701744 +0200] conn=2 op=43 ENTRY dn="uid=test00704,o=test_create"
[13/Jul/2016:17:31:45.439720734 +0200] conn=2 op=43 ENTRY dn="uid=test00638,o=test_create"
[13/Jul/2016:17:31:45.439737108 +0200] conn=2 op=43 ENTRY dn="uid=test00209,o=test_create"
[13/Jul/2016:17:31:45.439764038 +0200] conn=2 op=43 RESULT err=0 tag=101 nentries=4 etime=0 notes=U,P pr_idx=0 pr_cookie=0
[13/Jul/2016:17:31:45.439920731 +0200] conn=2 op=44 SRCH base="o=test_create" scope=2 filter="(uid=test*)" attrs="distinguishedName sn" 
[13/Jul/2016:17:31:45.439974265 +0200] conn=2 op=44 ENTRY dn="uid=test00593,o=test_create"
[13/Jul/2016:17:31:45.439996020 +0200] conn=2 op=44 ENTRY dn="uid=test00672,o=test_create"
[13/Jul/2016:17:31:45.440013574 +0200] conn=2 op=44 ENTRY dn="uid=test00310,o=test_create"
[13/Jul/2016:17:31:45.440029526 +0200] conn=2 op=44 ENTRY dn="uid=test00816,o=test_create"
[13/Jul/2016:17:31:45.440046151 +0200] conn=2 op=44 RESULT err=0 tag=101 nentries=4 etime=0 notes=U,P pr_idx=0 pr_cookie=0
[13/Jul/2016:17:31:45.440226623 +0200] conn=2 op=45 SRCH base="o=test_create" scope=2 filter="(uid=test*)" attrs="distinguishedName sn" 
[13/Jul/2016:17:31:45.440278950 +0200] conn=2 op=45 ENTRY dn="uid=test00175,o=test_create"
[13/Jul/2016:17:31:45.440299859 +0200] conn=2 op=45 ENTRY dn="uid=test00198,o=test_create"
[13/Jul/2016:17:31:45.440313441 +0200] conn=2 op=45 RESULT err=0 tag=101 nentries=2 etime=0 notes=U,P pr_idx=0 pr_cookie=0
[13/Jul/2016:17:31:45.440451312 +0200] conn=2 op=46 SRCH base="o=test_create" scope=2 filter="(uid=test*)" attrs="distinguishedName sn" 
[13/Jul/2016:17:31:45.440585284 +0200] conn=2 op=46 ENTRY dn="uid=test00627,ou=new,o=test_create"
[13/Jul/2016:17:31:45.440610135 +0200] conn=2 op=46 ENTRY dn="uid=test00717,ou=new,o=test_create"
[13/Jul/2016:17:31:45.440630059 +0200] conn=2 op=46 ENTRY dn="uid=test00299,ou=new,o=test_create"
[13/Jul/2016:17:31:45.440648001 +0200] conn=2 op=46 ENTRY dn="uid=test00432,ou=new,o=test_create"
[13/Jul/2016:17:31:45.440666169 +0200] conn=2 op=46 RESULT err=0 tag=101 nentries=4 etime=0 notes=U,P pr_idx=0 pr_cookie=0
[13/Jul/2016:17:31:45.440824958 +0200] conn=2 op=47 SRCH base="o=test_create" scope=2 filter="(uid=test*)" attrs="distinguishedName sn" 
[13/Jul/2016:17:31:45.440888578 +0200] conn=2 op=47 ENTRY dn="uid=test00849,ou=new,o=test_create"
[13/Jul/2016:17:31:45.440913113 +0200] conn=2 op=47 ENTRY dn="uid=test00188,ou=new,o=test_create"
[13/Jul/2016:17:31:45.440932893 +0200] conn=2 op=47 ENTRY dn="uid=test00984,ou=new,o=test_create"
[13/Jul/2016:17:31:45.440951728 +0200] conn=2 op=47 ENTRY dn="uid=test00150,ou=new,o=test_create"
[13/Jul/2016:17:31:45.440969411 +0200] conn=2 op=47 RESULT err=0 tag=101 nentries=4 etime=0 notes=U,P pr_idx=0 pr_cookie=0
[13/Jul/2016:17:31:45.441131578 +0200] conn=2 op=48 SRCH base="o=test_create" scope=2 filter="(uid=test*)" attrs="distinguishedName sn" 
[13/Jul/2016:17:31:45.441201280 +0200] conn=2 op=48 ENTRY dn="uid=test00218,ou=new,o=test_create"
[13/Jul/2016:17:31:45.441225408 +0200] conn=2 op=48 ENTRY dn="uid=test00876,ou=new,o=test_create"
[13/Jul/2016:17:31:45.441239480 +0200] conn=2 op=48 RESULT err=0 tag=101 nentries=2 etime=0 notes=U,P pr_idx=0 pr_cookie=-1    


Noriko, is it ok to open a new ticket to fix this logging in the next release or  do you prefer to reopen that BZ ?

Comment 7 Noriko Hosoi 2016-07-13 17:07:52 UTC
Thierry, thanks for the improvement!

Please use this bug to include your fix in 7.3.

steps...
. i'm changing the status of this bug to 
. open an upstream ticket
. add the ticket # to this bug's devel whiteboard

Comment 8 Noriko Hosoi 2016-07-13 17:09:29 UTC
continue...

. once you push your fix to the upstream master branch, change this bug's status to POST.

Then, I could pick up your fix in the next respin.

Thanks!!

Comment 9 Simon Pichugin 2016-07-13 20:07:23 UTC
Thierry, as I understand, now pr_cookie=0 can represent both <empty cookie> and <valid_cookie_0>.

Can you, please, describe what data do you want of pr_cookie to contain in the result? So I can adopt my test case (test_multi_suffix_search) to your improvement, add it to the upstream and help you with a new fix testing.

Comment 10 Simon Pichugin 2016-07-15 10:47:34 UTC
Thierry, I will specify my concern more detailed.

You've said:
"  
  x==0 and y==-1: valid (there is no more entries left)
  x>0 and y==-1: valid (it is sending the lasts entries)
  
  y>=0 and x==0: valid (there is no more entries on that backend) BUT it needs to be followed by an other PR Search until y==-1
  y>=0 and x>0: valid (there is more entries to send)

A page result search returns a cookie to the ldapclient. The range of value of the cookie is <empty>, 0, 1,...max_PR_req.

When the server has no more entries to send, it sends an empty cookie to the ldapclient. Else it send a valid cookie [0..max_PR_req].
"

But now after the fix we have (from your logs):
Start search
dn="uid=test00642,o=test_create"
dn="uid=test00704,o=test_create"
dn="uid=test00638,o=test_create"
dn="uid=test00209,o=test_create"
pr_cookie=0
dn="uid=test00593,o=test_create"
dn="uid=test00672,o=test_create"
dn="uid=test00310,o=test_create"
dn="uid=test00816,o=test_create"
pr_cookie=0
dn="uid=test00175,o=test_create"
dn="uid=test00198,o=test_create"
pr_cookie=0
dn="uid=test00627,ou=new,o=test_create"
dn="uid=test00717,ou=new,o=test_create"
dn="uid=test00299,ou=new,o=test_create"
dn="uid=test00432,ou=new,o=test_create"
pr_cookie=0
dn="uid=test00849,ou=new,o=test_create"
dn="uid=test00188,ou=new,o=test_create"
dn="uid=test00984,ou=new,o=test_create"
dn="uid=test00150,ou=new,o=test_create"
pr_cookie=0
dn="uid=test00218,ou=new,o=test_create"
dn="uid=test00876,ou=new,o=test_create"
pr_cookie=-1 

So it is every-time pr_cookie=0 (even when there's something to send in a second time), except the last time, when pr_cookie=-1.

Will you work on it too? And then what will pr_cookie contain, counter from 0 to [some value]?

Or will you leave it like that for now and we'll proceed with only pr_cookie=-1 fix?

Comment 11 Noriko Hosoi 2016-07-15 17:18:32 UTC
Hi Simon, 

This pr_cookie value 0 is correct.
> pr_cookie=0

The cookie stores the index of the particular simple paged results request in the connection.  Internally, simple paged results objects are stored in the Connection in the array format.  In one connection, you could send multiple different simple paged result search requests simultaneously.  Each request is stored in each element in the array assigned, which index is the value of the cookie.

Please note that this fix proposed by Thierry is purely for the access log.  It does not affect the results which the client receives.

index 07a7b69..6fec344 100644
--- a/ldap/servers/slapd/pagedresults.c
+++ b/ldap/servers/slapd/pagedresults.c
@@ -247,7 +247,7 @@ pagedresults_set_response_control( Slapi_PBlock *pb, int iscritical,
 
     /* begin sequence, payload, end sequence */
     if (current_search_count < 0) {
-        cookie = 0;
+        cookie = -1;
         cookie_str = slapi_ch_strdup("");
     } else {
         cookie = index;

Comment 12 Simon Pichugin 2016-07-18 13:44:36 UTC
Test case was added to https://git.fedorahosted.org/git/389/ds.git

It is located at dirsrvtests/tests/suites/paged_results/paged_results_test.py::test_multi_suffix_search

Comment 13 thierry bordaz 2016-07-18 16:54:23 UTC
https://fedorahosted.org/389/ticket/48928 is fixed upstream (likely 1.3.5.12). moving the BZ to POST

Comment 14 Simon Pichugin 2016-08-22 08:08:48 UTC
Build tested:
389-ds-base-1.3.5.10-8.el7.x86_64

[root@stal ds]# py.test -v dirsrvtests/tests/suites/paged_results/paged_results_test.py::test_multi_suffix_search
========================== test session starts ==========================
platform linux2 -- Python 2.7.5, pytest-3.0.0, py-1.4.31, pluggy-0.3.1 -- /usr/bin/python
cachedir: .cache
DS build: 1.3.5.10 B2016.231.199
389-ds-base: 1.3.5.10-8.el7
nss: 3.21.0-17.el7
nspr: 4.11.0-1.el7_2
openldap: 2.4.40-12.el7
svrcore: 4.1.2-1.el7

rootdir: /mnt/tests/rhds/tests/upstream/ds, inifile:
plugins: beakerlib-0.5, html-1.10.0, cov-2.3.1
collected 32 items

dirsrvtests/tests/suites/paged_results/paged_results_test.py::test_multi_suffix_search PASSED

======================= 1 passed in 27.59 seconds =======================

Marking as verified.

Comment 16 errata-xmlrpc 2016-11-03 20:41:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-2594.html