Bug 1166252

Summary: RHEL7.1 ns-slapd segfault when ipa-replica-install restarts dirsrv
Product: Red Hat Enterprise Linux 7
Reporter: Scott Poore <spoore>
Component: 389-ds-base
Assignee: mreynolds
Status: CLOSED ERRATA
QA Contact: Viktor Ashirov <vashirov>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 7.1
CC: mreynolds, nhosoi, nkinder, nsoman, rmeggins, spoore
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: 389-ds-base-1.3.3.1-10.el7
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-03-05 09:37:22 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 994690    
Attachments:
Description Flags
abrt output email for the crash none

Description Scott Poore 2014-11-20 16:25:28 UTC
Description of problem:

I'm seeing ns-slapd segfault during ipa-server-install when it stops and starts the directory server.

time:           Thu 20 Nov 2014 11:42:41 AM IST
cmdline:        /usr/sbin/ns-slapd -D /etc/dirsrv/slapd-TESTRELM-TEST -i /var/run/dirsrv/slapd-TESTRELM-TEST.pid -w /var/run/dirsrv/slapd-TESTRELM-TEST.startpid
uid:            994 (dirsrv)
abrt_version:   2.1.11
backtrace_rating: 4
crash_function: slapi_plugin_op_finished
event_log:      
executable:     /usr/sbin/ns-slapd
hostname:       vm-idm-039.testrelm.test
kernel:         3.10.0-205.el7.x86_64
last_occurrence: 1416463961
pid:            24107
pkg_arch:       x86_64
pkg_epoch:      0
pkg_name:       389-ds-base
pkg_release:    9.el7
pkg_version:    1.3.3.1
pwd:            /var/log/dirsrv/slapd-TESTRELM-TEST
runlevel:       N 3
username:       dirsrv


Version-Release number of selected component (if applicable):
389-ds-base-1.3.3.1-9.el7.x86_64

How reproducible:
Seems to be always, based on the automated tests I ran last night.

Steps to Reproduce:
1.  install IPA master
2.  install IPA replica
3.

Actual results:
ns-slapd crash occurs during replica install at end when dirsrv is stopped.

Expected results:
no crash

Additional info:

I'll attach full crash report separately.

Comment 2 Scott Poore 2014-11-20 16:28:38 UTC
Created attachment 959442 [details]
abrt output email for the crash

Comment 3 Rich Megginson 2014-11-20 16:34:32 UTC
Mark, this looks like the plugin reload issue you are working on.  If so, please assign it to yourself and link to the upstream ticket.

Comment 4 mreynolds 2014-11-20 17:46:04 UTC
This is related to dynamic plugins, and plugin tasks.  Investigating...

Comment 5 mreynolds 2014-11-20 19:10:44 UTC
Scott,

Do you know if "nsslapd-dynamic-plugins" is set to "on" in cn=config?  It's off by default, so I'm just double checking.  I can't reproduce it with it set to "off", but I can crash it when it's "on".  That crash is a known issue that is fixed upstream via:

https://fedorahosted.org/389/ticket/47451

So I just want to verify that this is not another new crash.

Thanks,
Mark

Comment 6 Scott Poore 2014-11-20 19:24:57 UTC
Hmmm... Looks like on a different environment (which should be set up the same way), it is not set to on:

[root@vm-idm-060 ~]# ldapsearch -xLLL -D "cn=Directory Manager" -w Secret123 -b cn=config|grep nsslapd-dynamic-plugins
nsslapd-dynamic-plugins: off
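
A narrower form of the check above, as a minimal sketch: it asks only for the one attribute on the cn=config entry itself and prints just its value. The helper name attr_value is hypothetical, and the bind DN is the one used in this report; the helper only handles plain single-token values (not base64 "attr::" lines).

```shell
# Print the value of one attribute from `ldapsearch -LLL` output read on stdin.
# $1 = attribute name (matched case-insensitively).
attr_value() {
    awk -v attr="$1" 'BEGIN { key = tolower(attr) ":" }
        tolower($1) == key { print $2; exit }'
}

# Example invocation against a live server (-W prompts for the password):
#   ldapsearch -xLLL -D "cn=Directory Manager" -W -b cn=config -s base \
#       '(objectClass=*)' nsslapd-dynamic-plugins | attr_value nsslapd-dynamic-plugins
```

Using -s base with an explicit attribute list avoids walking the whole cn=config subtree just to grep one line out of it.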

Need me to get more logs, the core file, or need access to a server with the crash?

Thanks,
Scott

Comment 7 mreynolds 2014-11-24 18:15:42 UTC
Scott, 

Can you verify the ipa version you are using?

Can you verify it is crashing again?  Is it consistent?  I'm trying to get some beaker boxes to do some testing, but beaker hasn't been cooperating recently. 

So, if you don't mind running the test again that would be great.  Even better is if you can enable the audit log on the DS server that is expected to crash.

Thanks,
Mark

Comment 8 Scott Poore 2014-11-24 18:32:53 UTC
Mark,

Version of ipa is:  ipa-server-4.1.0-6.el7.x86_64

I'll kick off another job.  The install we use though should already be setting the server up by default to capture everything for a crash.  It may be missing something so I'll check again.  Or is there something different to set for enabling the audit log?

I'll run the tests again and reserve the system.  Yes, the problems were consistent last week.  I haven't run a test today though.  I'll see.

Thanks,
Scott

Comment 9 Scott Poore 2014-11-24 20:51:36 UTC
Ok, I am seeing the same issues still so it appears to be consistent.  I don't see much more and the /var/log/dirsrv/slapd-*/audit log is empty.  

Is there a specific nsslapd-errorlog-level I should set?  right now it's set to:
nsslapd-errorlog-level: 16384


This is what I've got from the install to enable 389 debugging:

Installed:
  389-ds-base-debuginfo.x86_64 0:1.3.3.1-9.el7                                  

Complete!
:: [   PASS   ] :: Command 'yum -y --enablerepo *debuginfo install 389-ds-base-debuginfo' (Expected 0,1, got 0)
:: [  BEGIN   ] :: Running 'sysctl -w fs.suid_dumpable=1'
fs.suid_dumpable = 1
:: [   PASS   ] :: Command 'sysctl -w fs.suid_dumpable=1' (Expected 0, got 0)
:: [  BEGIN   ] :: Running 'echo 'ulimit -c unlimited' >> /etc/sysconfig/dirsrv'
:: [   PASS   ] :: Command 'echo 'ulimit -c unlimited' >> /etc/sysconfig/dirsrv' (Expected 0, got 0)
:: [  BEGIN   ] :: Running 'echo 'LimitCORE=infinity' >> /etc/sysconfig/dirsrv.systemd'
:: [   PASS   ] :: Command 'echo 'LimitCORE=infinity' >> /etc/sysconfig/dirsrv.systemd' (Expected 0, got 0)
:: [  BEGIN   ] :: Running 'systemctl daemon-reload'
:: [   PASS   ] :: Command 'systemctl daemon-reload' (Expected 0, got 0)

Am I missing something there?

Comment 10 mreynolds 2014-11-24 21:08:11 UTC
The audit log is not enabled by default.  This is set by:

ldapmodify ... ...
dn: cn=config
changetype: modify
replace: nsslapd-auditlog-logging-enabled
nsslapd-auditlog-logging-enabled: on

The audit log would definitely not be empty if it was enabled, as it writes the enabling of the audit logging to the audit log.
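
The elided "ldapmodify ... ..." invocation above could be filled in roughly as follows. This is only a sketch: the function name audit_log_ldif is hypothetical, and the URI and bind DN in the usage comment are assumptions to adjust for the instance under test.

```shell
# Print the LDIF that switches the audit log on or off.
# $1 = "on" or "off" (defaults to "on").
audit_log_ldif() {
    cat <<EOF
dn: cn=config
changetype: modify
replace: nsslapd-auditlog-logging-enabled
nsslapd-auditlog-logging-enabled: ${1:-on}
EOF
}

# Example invocation (assumed URI and bind DN; -W prompts for the password):
#   audit_log_ldif on | ldapmodify -x -H ldap://localhost:389 -D "cn=Directory Manager" -W
```

ldapmodify reads the LDIF from standard input when no -f file is given, so the same function can be reused with "off" to disable logging again after the test run.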

And, thanks for the testing!  Any chance you can run it under valgrind too? :-)  I just got my beaker boxes, but I'm not going to get to any testing until tomorrow.

Thanks,
Mark

Comment 11 mreynolds 2014-11-24 21:33:21 UTC
Skip those tests, I found the cause of the problem and it affects all tasks.  

This is caused by the following ticket, and it will be addressed in this ticket:

https://fedorahosted.org/389/ticket/47451

Comment 12 mreynolds 2014-12-01 19:46:08 UTC
Fixed upstream

Comment 13 Sankar Ramalingam 2014-12-09 13:15:40 UTC
Please add steps to reproduce in DS alone setup.

Comment 14 mreynolds 2014-12-09 15:37:16 UTC
Verification steps:

[1]  Enable automember and memberOf plugins
[2]  Restart Server
[3]  Run automember export task 3 times
[4]  Run memberOf fixup task 3 times
[5]  If the server is still running the fix is verified.
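
Steps [3] and [4] can be scripted by adding task entries under cn=tasks,cn=config. The sketch below only generates the LDIF; the function names, task names, filter, and output paths are hypothetical, and the task entry layout is an assumption based on the 389 Directory Server task documentation, so check it against the installed version.

```shell
# Print an add-entry LDIF for one automember export task.
# $1 = task name, $2 = base DN, $3 = search filter, $4 = output LDIF path.
automember_export_task_ldif() {
    cat <<EOF
dn: cn=$1,cn=automember export updates,cn=tasks,cn=config
changetype: add
objectClass: top
objectClass: extensibleObject
cn: $1
basedn: $2
filter: $3
scope: sub
ldif: $4

EOF
}

# Print an add-entry LDIF for one memberOf fixup task.
# $1 = task name, $2 = base DN to fix up.
memberof_task_ldif() {
    cat <<EOF
dn: cn=$1,cn=memberof task,cn=tasks,cn=config
changetype: add
objectClass: top
objectClass: extensibleObject
cn: $1
basedn: $2

EOF
}

# Emit three export tasks followed by three fixup tasks for suffix $1.
verification_ldif() {
    for i in 1 2 3; do
        automember_export_task_ldif "export$i" "$1" "(uid=*)" "/tmp/automember_export$i.ldif"
    done
    for i in 1 2 3; do
        memberof_task_ldif "fixup$i" "$1"
    done
}

# Example invocation (assumed suffix and bind DN; -W prompts for the password):
#   verification_ldif dc=example,dc=com | ldapmodify -x -D "cn=Directory Manager" -W
```

Tasks run asynchronously once their entries are added, so piping all six at once submits them back to back; to run them strictly one after another, add each entry separately and wait for it to finish before adding the next.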

Comment 15 mreynolds 2014-12-15 18:57:21 UTC
There are still some outstanding issues that need to be fixed.  Moving out of POST

Comment 16 mreynolds 2014-12-16 20:30:40 UTC
Fixed upstream

Comment 18 Scott Poore 2015-01-28 20:51:26 UTC
I can verify I have not seen any more crashes from ipa-replica-install since this update.

Sanity-only check for IPA via the ipa-replica-manage test suite.  The suite was run against 5 hosts (1 master and 4 replicas), with some test cases uninstalling and re-installing replicas, so ipa-replica-install was run many times with no crashes.

Example ipa-replica-install output seen:


:: [ 12:17:23 ] :: RUN ipa-replica-install
:: [  BEGIN   ] :: Running ' /usr/sbin/ipa-replica-install -U --setup-ca --setup-dns --forwarder=<FORWARDER> -w Secret123 -p Secret123 /opt/rhqa_ipa/replica-info-ipaqavme.testrelm.test.gpg'
Check connection from replica to remote master 'ipaqavmd.testrelm.test':
   Directory Service: Unsecure port (389): OK
   Directory Service: Secure port (636): OK
   Kerberos KDC: TCP (88): OK
   Kerberos Kpasswd: TCP (464): OK
   HTTP Server: Unsecure port (80): OK
   HTTP Server: Secure port (443): OK

The following list of ports use UDP protocol and would need to be
checked manually:
   Kerberos KDC: UDP (88): SKIPPED
   Kerberos Kpasswd: UDP (464): SKIPPED

Connection from replica to master is OK.
Start listening on required ports for remote master check
Get credentials to log in to remote master
Check SSH connection to remote master
Execute check on remote master
Check connection from master to remote replica 'ipaqavme.testrelm.test':
   Directory Service: Unsecure port (389): OK
   Directory Service: Secure port (636): OK
   Kerberos KDC: TCP (88): OK
   Kerberos KDC: UDP (88): OK
   Kerberos Kpasswd: TCP (464): OK
   Kerberos Kpasswd: UDP (464): OK
   HTTP Server: Unsecure port (80): OK
   HTTP Server: Secure port (443): OK

Connection from master to replica is OK.

Checking forwarders, please wait ...
WARNING: DNS forwarder <FORWARDER> does not return DNSSEC signatures in answers
Please fix forwarder configuration to enable DNSSEC support.
(For BIND 9 add directive "dnssec-enable yes;" to "options {}")
WARNING: DNSSEC validation will be disabled
WARNING: conflicting time&date synchronization service 'chronyd' will
be disabled in favor of ntpd

Run connection check to master
Connection check OK
Using reverse zone(s) <REVERSEZONE>
Configuring NTP daemon (ntpd)
  [1/4]: stopping ntpd
  [2/4]: writing configuration
  [3/4]: configuring ntpd to start on boot
  [4/4]: starting ntpd
Done configuring NTP daemon (ntpd).
Configuring directory server (dirsrv): Estimated time 1 minute
  [1/35]: creating directory server user
  [2/35]: creating directory server instance
  [3/35]: adding default schema
  [4/35]: enabling memberof plugin
  [5/35]: enabling winsync plugin
  [6/35]: configuring replication version plugin
  [7/35]: enabling IPA enrollment plugin
  [8/35]: enabling ldapi
  [9/35]: configuring uniqueness plugin
  [10/35]: configuring uuid plugin
  [11/35]: configuring modrdn plugin
  [12/35]: configuring DNS plugin
  [13/35]: enabling entryUSN plugin
  [14/35]: configuring lockout plugin
  [15/35]: creating indices
  [16/35]: enabling referential integrity plugin
  [17/35]: configuring ssl for ds instance
  [18/35]: configuring certmap.conf
  [19/35]: configure autobind for root
  [20/35]: configure new location for managed entries
  [21/35]: configure dirsrv ccache
  [22/35]: enable SASL mapping fallback
  [23/35]: restarting directory server
  [24/35]: setting up initial replication
Starting replication, please wait until this has completed.

Update in progress, 1 seconds elapsed
Update in progress, 2 seconds elapsed
Update in progress, 3 seconds elapsed
Update succeeded

  [25/35]: updating schema
  [26/35]: setting Auto Member configuration
  [27/35]: enabling S4U2Proxy delegation
  [28/35]: importing CA certificates from LDAP
  [29/35]: initializing group membership
  [30/35]: adding master entry
  [31/35]: configuring Posix uid/gid generation
  [32/35]: adding replication acis
  [33/35]: enabling compatibility plugin
  [34/35]: tuning directory server
  [35/35]: configuring directory to start on boot
Done configuring directory server (dirsrv).
Configuring certificate server (pki-tomcatd): Estimated time 3 minutes 30 seconds
  [1/22]: creating certificate server user
  [2/22]: configuring certificate server instance

MARK-LWD-LOOP -- 2015-01-24 12:19:49 --
  [3/22]: stopping certificate server instance to update CS.cfg
  [4/22]: backing up CS.cfg
  [5/22]: disabling nonces
  [6/22]: set up CRL publishing
  [7/22]: enable PKIX certificate path discovery and validation
  [8/22]: starting certificate server instance
  [9/22]: creating RA agent certificate database
  [10/22]: importing CA chain to RA certificate database
  [11/22]: fixing RA database permissions
  [12/22]: setting up signing cert profile
  [13/22]: set certificate subject base
  [14/22]: enabling Subject Key Identifier
  [15/22]: enabling Subject Alternative Name
  [16/22]: enabling CRL and OCSP extensions for certificates
  [17/22]: setting audit signing renewal to 2 years
  [18/22]: configuring certificate server to start on boot
  [19/22]: configure certmonger for renewals
  [20/22]: configure certificate renewals
  [21/22]: configure Server-Cert certificate renewal
  [22/22]: Configure HTTP to proxy connections
Done configuring certificate server (pki-tomcatd).
Restarting the directory and certificate servers
Configuring Kerberos KDC (krb5kdc): Estimated time 30 seconds
  [1/9]: adding sasl mappings to the directory
  [2/9]: writing stash file from DS
  [3/9]: configuring KDC
  [4/9]: creating a keytab for the directory
  [5/9]: creating a keytab for the machine
  [6/9]: adding the password extension to the directory
  [7/9]: enable GSSAPI for replication
  [8/9]: starting the KDC
  [9/9]: configuring KDC to start on boot
Done configuring Kerberos KDC (krb5kdc).
Configuring kadmin
  [1/2]: starting kadmin 
  [2/2]: configuring kadmin to start on boot
Done configuring kadmin.
Configuring ipa_memcached
  [1/2]: starting ipa_memcached 
  [2/2]: configuring ipa_memcached to start on boot
Done configuring ipa_memcached.
Configuring the web interface (httpd): Estimated time 1 minute
  [1/15]: setting mod_nss port to 443
  [2/15]: setting mod_nss protocol list to TLSv1.0 - TLSv1.1
  [3/15]: setting mod_nss password file
  [4/15]: enabling mod_nss renegotiate
  [5/15]: adding URL rewriting rules
  [6/15]: configuring httpd
  [7/15]: configure certmonger for renewals
  [8/15]: setting up ssl
  [9/15]: importing CA certificates from LDAP
  [10/15]: publish CA cert
  [11/15]: creating a keytab for httpd
  [12/15]: clean up any existing httpd ccache
  [13/15]: configuring SELinux for httpd
  [14/15]: restarting httpd
  [15/15]: configuring httpd to start on boot
Done configuring the web interface (httpd).
Configuring ipa-otpd
  [1/2]: starting ipa-otpd 
  [2/2]: configuring ipa-otpd to start on boot
Done configuring ipa-otpd.
Applying LDAP updates
Restarting Directory server to apply updates
  [1/2]: stopping directory server
  [2/2]: starting directory server
Done.
Restarting the directory server
Restarting the KDC
Restarting the certificate server
Configuring DNS (named)
  [1/9]: generating rndc key file
  [2/9]: setting up reverse zone
  [3/9]: setting up our own record
  [4/9]: adding NS record to the zones
  [5/9]: setting up CA record
  [6/9]: setting up kerberos principal
  [7/9]: setting up named.conf
  [8/9]: configuring named to start on boot
  [9/9]: changing resolv.conf to point to ourselves
Done configuring DNS (named).

Restarting named

Global DNS configuration in LDAP server is empty
You can use 'dnsconfig-mod' command to set global DNS options that
would override settings in local named.conf files

Restarting the web server
:: [   PASS   ] :: Command ' /usr/sbin/ipa-replica-install -U --setup-ca --setup-dns --forwarder=<FORWARDER> -w Secret123 -p Secret123 /opt/rhqa_ipa/replica-info-ipaqavme.testrelm.test.gpg' (Expected 0, got 0)
:: [ 12:24:37 ] :: Check ldap_sasl_authid is not added to sssd.conf
:: [ 12:24:37 ] :: Verifying BZ 878420
:: [   PASS   ] :: File '/etc/sssd/sssd.conf' should not contain 'ldap_sasl_authid' 
:: [   PASS   ] :: File '/var/log/messages' should not contain 'sssd_be\[.*\]: segfault' 
:: [   PASS   ] :: BZ 878420 not found 
:: [ 12:24:38 ] :: Check SSSD is running
:: [ 12:24:38 ] :: Verifying BZ 878288
:: [   PASS   ] :: BZ 878288 not found 
:: [ 12:24:39 ] :: Check and workaround for BZ983075
:: [   PASS   ] :: File '/etc/dirsrv/slapd-TESTRELM-TEST/certmap.conf' should not contain 'ipaca.*,None' 
:: [ 12:24:39 ] :: Workaround for bug 1136882 for encoded packet size too big
modifying entry "cn=config"

:: [  BEGIN   ] :: Running 'ipactl stop'
ipa: INFO: The ipactl command was successful
Stopping ipa-otpd Service
Stopping pki-tomcatd Service
Stopping httpd Service
Stopping ipa_memcached Service
Stopping named Service
Stopping kadmin Service
Stopping krb5kdc Service
Stopping Directory Service
:: [   PASS   ] :: Command 'ipactl stop' (Expected 0, got 0)
:: [  BEGIN   ] :: Running 'ipactl start'
ipa: INFO: The ipactl command was successful
Starting Directory Service
Starting krb5kdc Service
Starting kadmin Service
Starting named Service
Starting ipa_memcached Service
Starting httpd Service
Starting pki-tomcatd Service
Starting ipa-otpd Service
:: [   PASS   ] :: Command 'ipactl start' (Expected 0, got 0)

Comment 19 Sankar Ramalingam 2015-01-30 13:02:42 UTC
As described in comment #14, I enabled the memberOf and automember plugins.
Then I added the automember export task and the memberOf fixup task three times each and did not see any crash of the server. I could also restart the server successfully, with no error messages in the logs.

I tested the above scenario with nsslapd-dynamic-plugins on and off. 

Marking the bug as Verified based on my testing with the latest 389-ds-base-1.3.3.1-13 and the previous comment from Scott for IPA.

Comment 21 errata-xmlrpc 2015-03-05 09:37:22 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-0416.html