Bug 1368520 - Crash in import_wait_for_space_in_fifo().
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: 389-ds-base
Version: 7.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Noriko Hosoi
QA Contact: Viktor Ashirov
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-08-19 15:58 UTC by Noriko Hosoi
Modified: 2016-11-03 20:45 UTC (History)

Fixed In Version: 389-ds-base-1.3.5.10-9.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-11-03 20:45:00 UTC
Target Upstream Version:



Links
System: Red Hat Product Errata
ID: RHSA-2016:2594
Priority: normal
Status: SHIPPED_LIVE
Summary: Moderate: 389-ds-base security, bug fix, and enhancement update
Last Updated: 2016-11-03 12:11:08 UTC

Description Noriko Hosoi 2016-08-19 15:58:25 UTC
Description of problem:

An online reinitialization of a consumer from a supplier crashes the consumer
after about 15 hours.

Version-Release number of selected component (if applicable):

How reproducible:

Unknown. It may be hard to reproduce, because the import ran for a long time
(about 15 hours) before the crash.

Steps to Reproduce:

The customer was reinitializing a consumer from a supplier using the Console.
After some hours, the consumer crashed.

Additional info:
The crash seems to happen in the import code:

========================================
Core was generated by `/usr/sbin/ns-slapd -D /etc/dirsrv/slapd-389 -i /var/run/dirsrv/slapd-389.pid -w'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007f688fa8d072 in import_wait_for_space_in_fifo (job=0x7f6744032850, new_esize=7011) at ldap/servers/slapd/back-ldbm/import-threads.c:1857
1857                temp_ep = job->fifo.item[i].entry;
========================================

An excerpt of the code:
========================================
1844 static void
1845 import_wait_for_space_in_fifo(ImportJob *job, size_t new_esize)
1846 {
1847     struct backentry *temp_ep = NULL;
1848     size_t i;
1849     int slot_found;
1850     PRIntervalTime sleeptime;
1851
1852     sleeptime = PR_MillisecondsToInterval(import_sleep_time);
1853
1854     /* Now check if fifo has enough space for the new entry */
1855     while ((job->fifo.c_bsize + new_esize) > job->fifo.bsize) {
1856         for ( i = 0, slot_found = 0 ; i < job->fifo.size ; i++ ) {
1857             temp_ep = job->fifo.item[i].entry;
1858             if (temp_ep) {
1859                 if (temp_ep->ep_refcnt == 0 && temp_ep->ep_id <= job->ready_EID) {
1860                     job->fifo.item[i].entry = NULL;
1861                     if (job->fifo.c_bsize > job->fifo.item[i].esize)
1862                         job->fifo.c_bsize -= job->fifo.item[i].esize;
1863                     else
1864                         job->fifo.c_bsize = 0;
1865                     backentry_free(&temp_ep);
1866                     slot_found = 1;
1867                 }
1868             }
1869         }
1870         if ( slot_found == 0 )
1871             DS_Sleep(sleeptime);
1872     }
1873 }
========================================
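
To make the excerpt easier to follow, here is a minimal, self-contained sketch of one eviction pass over the fifo. It is only a model: SketchEntry, SketchItem, SketchFifo, sketch_evict_pass and sketch_has_space are hypothetical names invented for illustration, the real backentry and ImportJob types carry many more fields, and plain free() stands in for backentry_free(). The sketch neither reproduces nor fixes the crash, and this report does not state the root cause; the NULL check on the item array is an illustrative guard, not the shipped change.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the 389-ds-base types.
 * Only the fields touched by the eviction loop are modeled. */
typedef struct {
    unsigned id;   /* models ep_id */
    int refcnt;    /* models ep_refcnt */
} SketchEntry;

typedef struct {
    SketchEntry *entry; /* NULL when the slot is empty */
    size_t esize;       /* bytes accounted for this entry */
} SketchItem;

typedef struct {
    SketchItem *item; /* slot array (models job->fifo.item) */
    size_t size;      /* number of slots */
    size_t bsize;     /* byte budget of the fifo */
    size_t c_bsize;   /* bytes currently accounted */
} SketchFifo;

/* One pass over the slots: free every entry that is unreferenced and
 * already processed (id <= ready_id). Returns the number of freed slots. */
static size_t
sketch_evict_pass(SketchFifo *fifo, unsigned ready_id)
{
    size_t freed = 0;

    if (fifo == NULL || fifo->item == NULL) {
        return 0; /* illustrative guard only (assumption) */
    }
    for (size_t i = 0; i < fifo->size; i++) {
        SketchEntry *e = fifo->item[i].entry;
        if (e != NULL && e->refcnt == 0 && e->id <= ready_id) {
            fifo->item[i].entry = NULL;
            /* Same saturating accounting as lines 1861-1864 above. */
            if (fifo->c_bsize > fifo->item[i].esize) {
                fifo->c_bsize -= fifo->item[i].esize;
            } else {
                fifo->c_bsize = 0;
            }
            free(e);
            freed++;
        }
    }
    return freed;
}

/* Mirrors the while condition on line 1855, inverted. */
static int
sketch_has_space(const SketchFifo *fifo, size_t new_esize)
{
    return fifo->c_bsize + new_esize <= fifo->bsize;
}
```

In this model, a caller would repeat sketch_evict_pass() until sketch_has_space() returns nonzero, sleeping between passes that free nothing, which is the shape of the loop on lines 1855-1872.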

See also the original bug: https://bugzilla.redhat.com/show_bug.cgi?id=1368209

Comment 3 Sankar Ramalingam 2016-09-19 13:30:06 UTC
1. Created a replication setup with 2 masters and 2 consumers.
2. Synced entries across the masters and consumers.
3. Created a few entries on M1 and left the setup running for 2 days.
4. After 2 days, stopped M2 and created 15000 entries on M1.
5. Started M2 and initialized it from the replica.
6. The reinitialization did not cause any problems.
7. No crash was observed.


 [root@ratangad MMR_WINSYNC]# PORT=1189; SUFF="dc=passsync,dc=com"; for PORT in `echo "1189 1289 1389 1489 "`; do /usr/bin/ldapsearch -x -p $PORT -h localhost -D "cn=Directory Manager" -w Secret123 -b $SUFF |grep -i dn: | wc -l ; done
15585
1
15585
15585
[root@ratangad MMR_WINSYNC]# PORT=1189; SUFF="dc=passsync,dc=com"; for PORT in `echo "1189 1289 1389 1489 "`; do /usr/bin/ldapsearch -x -p $PORT -h localhost -D "cn=Directory Manager" -w Secret123 -b $SUFF |grep -i dn: | wc -l ; done
15585
15585
15585
15585
[root@ratangad MMR_WINSYNC]# ps -eaf |grep -i slapd
dsuser   16729     1  0 Sep17 ?        00:04:06 /usr/sbin/ns-slapd -D /etc/dirsrv/slapd-C1 -i /var/run/dirsrv/slapd-C1.pid
dsuser   16732     1  0 Sep17 ?        00:03:56 /usr/sbin/ns-slapd -D /etc/dirsrv/slapd-C2 -i /var/run/dirsrv/slapd-C2.pid
dsuser   16736     1  0 Sep17 ?        00:04:32 /usr/sbin/ns-slapd -D /etc/dirsrv/slapd-M1 -i /var/run/dirsrv/slapd-M1.pid
dsuser   16757     1  0 Sep17 ?        00:03:29 /usr/sbin/ns-slapd -D /etc/dirsrv/slapd-newinst2 -i /var/run/dirsrv/slapd-newinst2.pid
dsuser   22971     1  1 18:52 ?        00:00:07 /usr/sbin/ns-slapd -D /etc/dirsrv/slapd-M2 -i /var/run/dirsrv/slapd-M2.pid
root     23054 12186  0 18:53 pts/0    00:00:00 tail -f /var/log/dirsrv/slapd-M1/errors /var/log/dirsrv/slapd-M2/access
root     23199 22837  0 18:59 pts/2    00:00:00 grep --color=auto -i slapd
[root@ratangad MMR_WINSYNC]# rpm -qa |grep -i 389-ds-base
389-ds-base-snmp-1.3.5.10-11.el7.x86_64
389-ds-base-libs-1.3.5.10-11.el7.x86_64
389-ds-base-debuginfo-1.3.5.10-6.el7.x86_64
389-ds-base-1.3.5.10-11.el7.x86_64
389-ds-base-devel-1.3.5.10-11.el7.x86_64


Hence, marking the bug as Verified.

Comment 5 errata-xmlrpc 2016-11-03 20:45:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-2594.html

