Bug 1228402

Summary: Individual abandoned simple paged results request has no chance to be cleaned up
Product: Red Hat Enterprise Linux 6
Reporter: Noriko Hosoi <nhosoi>
Component: 389-ds-base
Assignee: Noriko Hosoi <nhosoi>
Status: CLOSED ERRATA
QA Contact: Viktor Ashirov <vashirov>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 6.0
CC: cww, gparente, jgalipea, nhosoi, nkinder, rmeggins, sramling, tlavigne
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: 389-ds-base-1.2.11.15-60.el6
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1240451
Environment:
Last Closed: 2015-07-22 06:37:45 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Bug Depends On:
Bug Blocks: 1075802, 1218341, 1240451

Description Noriko Hosoi 2015-06-04 20:37:00 UTC
This bug is created as a clone of upstream ticket:
https://fedorahosted.org/389/ticket/48192

If multiple asynchronous simple paged results requests are received on one connection and some of them are abandoned, the abandoned requests are not cleaned up until the connection is closed, or until all of the requests exceed the configured time limit and the connection is closed.

Comment 1 Noriko Hosoi 2015-06-05 06:21:41 UTC
Justification: Strategic customers are facing this bug.  Sending repeated simple paged results requests, each followed by an abandon, revealed poor memory management in the abandoned-request case.  This bug fix should solve the problem.

Comment 3 Viktor Ashirov 2015-06-08 13:08:32 UTC
Hi Noriko,

Please add steps to verify, or do you have a reproducer?
Thanks!

Comment 4 Noriko Hosoi 2015-06-09 20:01:49 UTC
Tests to verify (both tests might already have been done for the other bug, though)
1. No regression in TET filter and pagedresults tests.

2.1 Run repeated asynchronous simple paged results requests, each followed by its abandon request.
    I compiled this test program
    (you need to adjust some variables for your server):
    https://fedorahosted.org/389/attachment/ticket/47707/paged_def.c
    $ gcc -o paged_def paged_def.c -lldap
    Then I ran 16 instances of it, something like this:
    ========================
    CNT=0
    while [ $CNT -lt 16 ]
    do
      ./paged_def >& /dev/null &
      CNT=`expr $CNT + 1`
    done
    ========================
2.2 Run the test for at least 12 hours and check the access log.
    If the pr_idx value stays low like this, or goes up but then comes back down,
    the test passes for this issue.
    conn=8 op=157936 RESULT err=0 tag=101 nentries=11 etime=0 notes=P pr_idx=0
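
The check in step 2.2 can be sketched as a small script. This is only an illustration, assuming the access log RESULT lines look like the sample above; the function names and the threshold of 1000 are hypothetical and are not part of 389-ds-base or of the original test.

```python
import re
from collections import defaultdict

# Matches paged-results RESULT lines such as:
# conn=8 op=157936 RESULT err=0 tag=101 nentries=11 etime=0 notes=P pr_idx=0
RESULT_RE = re.compile(r"conn=(\d+) op=\d+ RESULT .*\bpr_idx=(\d+)")


def summarize_pr_idx(lines):
    """Return {conn: (max_pr_idx, last_pr_idx)} for RESULT lines carrying pr_idx."""
    seen = defaultdict(list)
    for line in lines:
        m = RESULT_RE.search(line)
        if m:
            seen[int(m.group(1))].append(int(m.group(2)))
    return {conn: (max(vals), vals[-1]) for conn, vals in seen.items()}


def connections_still_high(summary, threshold=1000):
    """List connections whose most recent pr_idx is above the threshold.

    The threshold is an arbitrary, hypothetical cut-off for "did not come
    back down"; pick a value that suits the test run.
    """
    return sorted(conn for conn, (_, last) in summary.items() if last > threshold)
```

To use it, pass the lines of the access log to `summarize_pr_idx` and inspect the connections returned by `connections_still_high`; an empty list is consistent with the pass condition described above.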

Comment 5 Sankar Ramalingam 2015-06-15 18:01:34 UTC
############## Result  for  backend test :   filter run
    filter run elapse time : 00:04:29
    filter run Tests PASS      : 100% (147/147)

############## Result  for  backend test :   SIMPLEPAGED run
    SIMPLEPAGED run elapse time : 00:05:37
    SIMPLEPAGED run Tests PASS      : 100% (17/17)

Acceptance tests for 389-ds-base-1.2.11.15-60 show no regression in the filter and simple paged results test suites. Hence, step 1 of the verification is done.

To continue with the verification, I am running the asynchronous simple paged results test for 12 hours.

Comment 6 Sankar Ramalingam 2015-06-16 06:03:35 UTC
Checked the access log after 12 hours. The pr_idx values go up and then come back down to 0. There were also ABANDON messages for simple paged results. So I am going ahead and marking this bug as Verified.

[16/Jun/2015:01:56:26 -0400] conn=14 op=887530 RESULT err=32 tag=101 nentries=0 etime=0 notes=P pr_idx=295844
[16/Jun/2015:01:56:26 -0400] conn=19 op=893708 ABANDON targetop=Simple Paged Results msgid=893708
[16/Jun/2015:01:56:26 -0400] conn=19 op=893709 SRCH base="o=redhat" scope=2 filter="(cn=user100*)" attrs=ALL
[16/Jun/2015:01:56:26 -0400] conn=18 op=893147 ABANDON targetop=Simple Paged Results msgid=893147
[16/Jun/2015:01:56:26 -0400] conn=18 op=893148 SRCH base="o=redhat" scope=2 filter="(cn=user100*)" attrs=ALL
[16/Jun/2015:01:56:26 -0400] conn=13 op=891671 ABANDON targetop=Simple Paged Results msgid=891671
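
As a rough aid for reading excerpts like the one above, the ABANDON entries for simple paged results can be tallied per connection. This is a minimal sketch that assumes the line format shown in this log; the function name is hypothetical.

```python
import re
from collections import Counter

# Matches lines such as:
# [16/Jun/2015:01:56:26 -0400] conn=19 op=893708 ABANDON targetop=Simple Paged Results msgid=893708
ABANDON_RE = re.compile(
    r"conn=(\d+) op=\d+ ABANDON targetop=Simple Paged Results msgid=\d+"
)


def count_paged_abandons(lines):
    """Count ABANDON entries that target simple paged results, keyed by connection."""
    counts = Counter()
    for line in lines:
        m = ABANDON_RE.search(line)
        if m:
            counts[int(m.group(1))] += 1
    return counts
```

Comparing these counts with the pr_idx values on the same connections gives a quick view of whether abandoned requests are being logged while pr_idx comes back down.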


Build tested:
[root@qe-blade-01 ~]# rpm -qa|grep -i 389-ds-base
389-ds-base-1.2.11.15-60.el6.x86_64
389-ds-base-debuginfo-1.2.11.15-60.el6.x86_64
389-ds-base-libs-1.2.11.15-60.el6.x86_64

However, I see the CPU utilization go up to about 750%.

  PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND
15076 dsuser    20   0 3714m 1.3g 5624 S 767.1 26.5   4954:08 ns-slapd
15076 dsuser    20   0 3714m 1.3g 5624 S 757.1 26.5   4956:28 ns-slapd

Comment 7 Sankar Ramalingam 2015-06-16 10:03:11 UTC
High CPU usage for simple paged results is being tracked in this RHEL 6.8 bug: https://bugzilla.redhat.com/show_bug.cgi?id=1210073. Hence, marking this bug as Verified.

Comment 8 errata-xmlrpc 2015-07-22 06:37:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-1326.html

Comment 9 Noriko Hosoi 2015-09-03 18:20:58 UTC
*** Bug 1210073 has been marked as a duplicate of this bug. ***