Red Hat Bugzilla – Bug 1368520
Crash in import_wait_for_space_in_fifo().
Last modified: 2016-11-03 16:45:00 EDT
Description of problem:
An online reinitialization from a supplier to a consumer is causing the consumer to crash after about 15 hours.

Version-Release number of selected component (if applicable):

How reproducible:
Not sure how easy it is to reproduce, as the import runs for a long time (about 15 hours) before the crash.

Steps to Reproduce:
The customer was reinitializing a consumer from a supplier using the Console. After some hours, the consumer crashed.

Additional info:
The crash seems to happen in the import code:
========================================
Core was generated by `/usr/sbin/ns-slapd -D /etc/dirsrv/slapd-389 -i /var/run/dirsrv/slapd-389.pid -w'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007f688fa8d072 in import_wait_for_space_in_fifo (job=0x7f6744032850, new_esize=7011)
    at ldap/servers/slapd/back-ldbm/import-threads.c:1857
1857            temp_ep = job->fifo.item[i].entry;
========================================

An excerpt of the code:
========================================
1844 static void
1845 import_wait_for_space_in_fifo(ImportJob *job, size_t new_esize)
1846 {
1847     struct backentry *temp_ep = NULL;
1848     size_t i;
1849     int slot_found;
1850     PRIntervalTime sleeptime;
1851
1852     sleeptime = PR_MillisecondsToInterval(import_sleep_time);
1853
1854     /* Now check if fifo has enough space for the new entry */
1855     while ((job->fifo.c_bsize + new_esize) > job->fifo.bsize) {
1856         for (i = 0, slot_found = 0; i < job->fifo.size; i++) {
1857             temp_ep = job->fifo.item[i].entry;
1858             if (temp_ep) {
1859                 if (temp_ep->ep_refcnt == 0 && temp_ep->ep_id <= job->ready_EID) {
1860                     job->fifo.item[i].entry = NULL;
1861                     if (job->fifo.c_bsize > job->fifo.item[i].esize)
1862                         job->fifo.c_bsize -= job->fifo.item[i].esize;
1863                     else
1864                         job->fifo.c_bsize = 0;
1865                     backentry_free(&temp_ep);
1866                     slot_found = 1;
1867                 }
1868             }
1869         }
1870         if (slot_found == 0)
1871             DS_Sleep(sleeptime);
1872     }
1873 }
========================================

See also the original bug: https://bugzilla.redhat.com/show_bug.cgi?id=1368209
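The faulting frame dereferences job->fifo.item[i].entry, which is consistent with the fifo array having been freed or never allocated while a thread is still waiting for space in it. Below is a minimal, self-contained sketch of that failure mode and a defensive guard, modeled as a standalone program. The names fifo_model, entry_slot and wait_for_space are hypothetical stand-ins for the slapd structures, and the guard is only an illustration of the suspected problem, not the actual upstream fix.
========================================
#include <stddef.h>
#include <stdio.h>

/* Simplified stand-ins for the slapd structures; the field names mirror
 * the excerpt above, everything else is a hypothetical model. */
struct entry_slot {
    void  *entry;   /* corresponds to job->fifo.item[i].entry */
    size_t esize;
};

struct fifo_model {
    struct entry_slot *item;   /* NULL once the fifo has been torn down */
    size_t size;
    size_t bsize;              /* total byte budget of the fifo */
    size_t c_bsize;            /* bytes currently in use */
};

/* Returns 0 when enough space is available (or was freed), -1 when the
 * fifo is gone or no slot can be released. The early NULL check is the
 * illustrative guard: without it, the loop would dereference a freed or
 * NULL item array, which matches the segfault in the core dump. */
static int
wait_for_space(struct fifo_model *fifo, size_t new_esize)
{
    while ((fifo->c_bsize + new_esize) > fifo->bsize) {
        if (fifo->item == NULL) {
            return -1;         /* fifo torn down; give up instead of crashing */
        }
        int slot_found = 0;
        for (size_t i = 0; i < fifo->size; i++) {
            if (fifo->item[i].entry != NULL) {
                /* the real code only releases an entry once its refcount is
                 * 0 and it has been processed (ep_id <= ready_EID) */
                fifo->item[i].entry = NULL;
                fifo->c_bsize = (fifo->c_bsize > fifo->item[i].esize)
                                    ? fifo->c_bsize - fifo->item[i].esize
                                    : 0;
                slot_found = 1;
            }
        }
        if (!slot_found) {
            return -1;         /* the real code sleeps (DS_Sleep) and retries */
        }
    }
    return 0;
}

int
main(void)
{
    /* item == NULL with a full budget: the unguarded loop would crash here,
     * the guarded version just reports that the fifo is unavailable. */
    struct fifo_model fifo = { .item = NULL, .size = 0, .bsize = 100, .c_bsize = 100 };
    printf("wait_for_space -> %d\n", wait_for_space(&fifo, 10));
    return 0;
}
========================================
In the real server such a guard would have to coordinate with whatever code path tears down job->fifo (for example an aborted import), which is why this is only a sketch of the failure mode rather than a proposed patch.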
1. Created a 2-master, 2-consumer replication setup.
2. Synced entries across masters and consumers.
3. Created a few entries on M1 and left the setup for 2 days.
4. After 2 days, stopped M2 and created 15000 entries on M1.
5. Started M2 and initialized it from the replica.
6. The reinitialization didn't cause any problems.
7. No crash observed.

[root@ratangad MMR_WINSYNC]# PORT=1189; SUFF="dc=passsync,dc=com"; for PORT in `echo "1189 1289 1389 1489 "`; do /usr/bin/ldapsearch -x -p $PORT -h localhost -D "cn=Directory Manager" -w Secret123 -b $SUFF |grep -i dn: | wc -l ; done
15585
1
15585
15585

[root@ratangad MMR_WINSYNC]# PORT=1189; SUFF="dc=passsync,dc=com"; for PORT in `echo "1189 1289 1389 1489 "`; do /usr/bin/ldapsearch -x -p $PORT -h localhost -D "cn=Directory Manager" -w Secret123 -b $SUFF |grep -i dn: | wc -l ; done
15585
15585
15585
15585

[root@ratangad MMR_WINSYNC]# ps -eaf |grep -i slapd
dsuser   16729     1  0 Sep17 ?        00:04:06 /usr/sbin/ns-slapd -D /etc/dirsrv/slapd-C1 -i /var/run/dirsrv/slapd-C1.pid
dsuser   16732     1  0 Sep17 ?        00:03:56 /usr/sbin/ns-slapd -D /etc/dirsrv/slapd-C2 -i /var/run/dirsrv/slapd-C2.pid
dsuser   16736     1  0 Sep17 ?        00:04:32 /usr/sbin/ns-slapd -D /etc/dirsrv/slapd-M1 -i /var/run/dirsrv/slapd-M1.pid
dsuser   16757     1  0 Sep17 ?        00:03:29 /usr/sbin/ns-slapd -D /etc/dirsrv/slapd-newinst2 -i /var/run/dirsrv/slapd-newinst2.pid
dsuser   22971     1  1 18:52 ?        00:00:07 /usr/sbin/ns-slapd -D /etc/dirsrv/slapd-M2 -i /var/run/dirsrv/slapd-M2.pid
root     23054 12186  0 18:53 pts/0    00:00:00 tail -f /var/log/dirsrv/slapd-M1/errors /var/log/dirsrv/slapd-M2/access
root     23199 22837  0 18:59 pts/2    00:00:00 grep --color=auto -i slapd

[root@ratangad MMR_WINSYNC]# rpm -qa |grep -i 389-ds-base
389-ds-base-snmp-1.3.5.10-11.el7.x86_64
389-ds-base-libs-1.3.5.10-11.el7.x86_64
389-ds-base-debuginfo-1.3.5.10-6.el7.x86_64
389-ds-base-1.3.5.10-11.el7.x86_64
389-ds-base-devel-1.3.5.10-11.el7.x86_64

Hence, marking the bug as Verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-2594.html