Bug 1601241

Summary: ns-slapd - Crash when using bak2db.pl to restore a single database.
Product: Red Hat Enterprise Linux 7 Reporter: Têko Mihinto <tmihinto>
Component: 389-ds-base Assignee: mreynolds
Status: CLOSED ERRATA QA Contact: RHDS QE <ds-qe-bugs>
Severity: high Docs Contact:
Priority: high    
Version: 7.5 CC: aadhikar, cpelland, nkinder, pasik, rmeggins, tmihinto, vashirov
Target Milestone: rc   
Target Release: 7.7   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: 389-ds-base-1.3.9.1-4.el7 Doc Type: Bug Fix
Doc Text:
Cause: Restoring a backup for a specific backend only.
Consequence: The server crashes.
Fix: Prevent the crash by not dereferencing a NULL pointer.
Result: The restore process does not crash the server.
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-08-06 12:58:42 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Têko Mihinto 2018-07-15 12:50:55 UTC
Description of problem:

RHDS crashes when bak2db.pl is used to restore a single database.


Version-Release number of selected component (if applicable):

# rpm -qa | grep 389-ds-base
389-ds-base-libs-1.3.7.5-24.el7_5.x86_64
389-ds-base-debuginfo-1.3.7.5-24.el7_5.x86_64
389-ds-base-1.3.7.5-24.el7_5.x86_64
#

How reproducible:

Always.

Steps to Reproduce:

1. Start the RHDS instance
# ulimit -c unlimited ; systemctl start dirsrv@<INSTANCE_NAME>

2. Try to restore a single database using bak2db.pl:
# bak2db.pl -Z <INSTANCE_NAME> -D "cn=Directory Manager" -w <PASSWORD> -P LDAP -a  /var/lib/dirsrv/slapd-<INSTANCE_NAME>/bak/<INSTANCE_NAME>-2018_7_13_15_48_40/ -n userRoot
Successfully added task entry "cn=restore_2018_7_13_18_13_18, cn=restore, cn=tasks, cn=config"
#


RHDS errors log:
++++++++++++++++++++++++++++++++++++++++++++++++
[13/Jul/2018:18:13:18.414783645 +0200] - INFO - task_restore_thread - Beginning restore to 'ldbm database'
[13/Jul/2018:18:13:18.418211871 +0200] - INFO - ldbm_back_archive2ldbm - Bringing userRoot offline...
...
[13/Jul/2018:18:13:18.427815303 +0200] - INFO - dblayer_pre_close - Waiting for 4 database threads to stop
[13/Jul/2018:18:13:18.980252957 +0200] - INFO - dblayer_pre_close - All database threads now stopped
[13/Jul/2018:18:13:18.994927840 +0200] - NOTICE - dblayer_delete_database_ex - Skipping instance TekoData
...
++++++++++++++++++++++++++++++++++++++++++++++++


3. A subsequent search fails, indicating that the server is down:
# ldapsearch -o ldif-wrap=no -xLLL  -p <PORT> -h <HOST> -b "dc=TekoSoft,dc=com" -D"cn=Directory Manager" -w <PASSWORD> -sbase 1.1
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
#


Actual results:

ns-slapd crashed.

Stack trace is:

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/sbin/ns-slapd -D /etc/dirsrv/slapd-<INSTANCE> -i /var/run/dirsrv/slapd-inst'.
Program terminated with signal 11, Segmentation fault.
#0  __opendirat (dfd=dfd@entry=-100, name=name@entry=0x0) at ../sysdeps/posix/opendir.c:90
90      if (__builtin_expect (name[0], '\1') == '\0')
(gdb)
(gdb) where
#0  0x00007fcedff6ca10 in __opendirat (dfd=dfd@entry=-100, name=name@entry=0x0) at ../sysdeps/posix/opendir.c:90
#1  0x00007fcedff6ca6d in __opendir (name=name@entry=0x0) at ../sysdeps/posix/opendir.c:159
#2  0x00007fcee0f5c81c in PR_OpenDir (name=0x0) at ../../../nspr/pr/src/pthreads/ptio.c:3840
#3  0x00007fced6c54754 in dblayer_delete_database_ex (li=li@entry=0x55970f1d5900, instance=instance@entry=0x55970f595ee0 "userRoot", cldir=0x0) at ldap/servers/slapd/back-ldbm/dblayer.c:5228
#4  0x00007fced6c54e57 in dblayer_restore (li=0x55970f1d5900, src_dir=0x55970faff270 "/var/lib/dirsrv/slapd-<INSTANCE>/bak/<INSTANCE>-2018_7_13_15_48_40", task=0x55970fb31ea0, bename=0x55970f595ee0 "userRoot")
    at ldap/servers/slapd/back-ldbm/dblayer.c:6525
#5  0x00007fced6c468dd in ldbm_back_archive2ldbm (pb=0x55970f59ec60) at ldap/servers/slapd/back-ldbm/archive.c:165
#6  0x00007fcee31cea86 in task_restore_thread (arg=0x55970f59ec60) at ldap/servers/slapd/task.c:1622
#7  0x00007fcee0f5dbab in _pt_root (arg=0x55970fb27680) at ../../../nspr/pr/src/pthreads/ptthread.c:201
#8  0x00007fcee08fddd5 in start_thread (arg=0x7fceb439d700) at pthread_create.c:308
#9  0x00007fcedffaab3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
(gdb)
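
The trace above shows a NULL directory path (cldir=0x0) passed down to opendir(), which dereferences it. A minimal sketch of the kind of guard that avoids this class of crash follows. It is not the upstream patch: it uses POSIX opendir() rather than NSPR's PR_OpenDir() so that it is self-contained, and the function name count_dir_entries_safe is hypothetical.

```c
#include <dirent.h>
#include <stddef.h>

/*
 * Hypothetical helper, not taken from 389-ds-base.
 * Counts the entries in a directory and returns -1 instead of
 * crashing when the path is NULL or the directory cannot be opened.
 */
int count_dir_entries_safe(const char *dir_path)
{
    DIR *dir;
    struct dirent *entry;
    int count = 0;

    /* Guard: opendir(NULL) reads name[0] and segfaults, as in frame #0. */
    if (dir_path == NULL) {
        return -1;
    }

    dir = opendir(dir_path);
    if (dir == NULL) {
        return -1; /* missing directory, permissions, and so on */
    }

    while ((entry = readdir(dir)) != NULL) {
        count++;
    }

    closedir(dir);
    return count;
}
```

With a check like this, a caller that has no directory to clean up gets an error code back and can skip that step instead of taking the whole server down.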


Expected results:

ns-slapd should not crash.


Additional info:

The script bak2db.pl should not be used to restore a single database.

See also these bugs:
Bug 1601229 - Restoring a single database with bak2db.pl fails.
Bug 1601230 - Remove the option "-n" from the bak2db.pl documentation.

Comment 6 mreynolds 2018-11-30 16:44:44 UTC
Upstream ticket:
https://pagure.io/389-ds-base/issue/50063

Comment 8 Akshay Adhikari 2019-03-26 12:40:23 UTC
Build tested: 389-ds-base-1.3.9.1-3.el7.x86_64

Steps followed:
 
1) Created a backup with the db2bak.pl command.

[root@qe-blade-04 ~]# db2bak.pl -Z standalone1 -D "cn=Directory Manager" -P LDAP -w password
Back up directory: /var/lib/dirsrv/slapd-standalone1/bak/standalone1-2019_3_26_8_13_2
Successfully added task entry "cn=backup_2019_3_26_8_13_2, cn=backup, cn=tasks, cn=config"

2) After successfully creating the backup, tried restoring it.

[root@qe-blade-04 ~]# bak2db.pl -Z standalone1 -D "cn=Directory Manager" -w password -P LDAP -a  /var/lib/dirsrv/slapd-standalone1/bak/standalone1-2019_3_26_8_13_2/ -n userRoot
Successfully added task entry "cn=restore_2019_3_26_8_20_30, cn=restore, cn=tasks, cn=config"

Error log:

[26/Mar/2019:08:13:02.787273038 -0400] - INFO - task_backup_thread - Backup finished.
[26/Mar/2019:08:20:30.507922716 -0400] - INFO - task_restore_thread - Beginning restore to 'ldbm database'
[26/Mar/2019:08:20:30.523606686 -0400] - INFO - ldbm_back_archive2ldbm - Bringing userRoot offline...
[26/Mar/2019:08:20:30.542467412 -0400] - INFO - dblayer_pre_close - Waiting for 4 database threads to stop
[26/Mar/2019:08:20:32.528716391 -0400] - INFO - dblayer_pre_close - All database threads now stopped

3) The server is down and the search operation failed.
[root@qe-blade-04 ~]# ldapsearch -D "cn=Directory Manager" -p 38901 -h `hostname` -b "dc=example,dc=com" -w password
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)

Marking this as failedQA.

Comment 9 mreynolds 2019-03-26 13:45:04 UTC
This fix was never backported to 389-ds-base-1.3.9...

Comment 10 Akshay Adhikari 2019-04-03 05:06:06 UTC
Build tested: 389-ds-base-1.3.9.1-4.el7.x86_64


1) Created a backup with the db2bak.pl command.

[root@host-8-244-176 password]# db2bak.pl -Z standalone1 -D "cn=Directory Manager" -P LDAP -w password
Back up directory: /var/lib/dirsrv/slapd-standalone1/bak/standalone1-2019_4_1_9_54_19
Successfully added task entry "cn=backup_2019_4_1_9_54_19, cn=backup, cn=tasks, cn=config"

2) After successfully creating the backup, ran a restore.

[root@host-8-244-176 password]# bak2db.pl -Z standalone1 -D "cn=Directory Manager" -w password -P LDAP -a /var/lib/dirsrv/slapd-standalone1/bak/standalone1-2019_4_1_9_54_19 -n userRoot
Successfully added task entry "cn=restore_2019_4_1_9_54_42, cn=restore, cn=tasks, cn=config"

3) Server is up and running.

[root@host-8-244-176 password]# ldapsearch -D "cn=Directory Manager" -p 38901 -h `hostname` -b "dc=example,dc=com" -w password
# extended LDIF
#
# LDAPv3
# base <dc=example,dc=com> with scope subtree
# filter: (objectclass=*)
# requesting: ALL
#
....

Observation: If the backend name 'userRoot' is specified while restoring, it creates an empty server; if it is not specified, every entry is restored.

Marking it as Verified.

Comment 12 errata-xmlrpc 2019-08-06 12:58:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2152