Bug 1445776

Summary: Deployment of FreeIPA of F26 fails with tomcat errors
Product: [Fedora] Fedora Reporter: Stephen Gallagher <sgallagh>
Component: 389-ds-baseAssignee: Rich Megginson <rmeggins>
Status: CLOSED NOTABUG QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: urgent Docs Contact:
Priority: unspecified    
Version: 26CC: abokovoy, awilliam, edewata, gmarr, ipa-maint, jcholast, jhrozek, mreynolds, nhosoi, nkinder, pvoborni, rcritten, rmeggins, robatino, ssorce, tbordaz, tkrizek
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: 389-ds-base-1.3.6.5-1.fc26 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-05-01 16:27:44 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1349186    

Description Stephen Gallagher 2017-04-26 13:42:33 UTC
Description of problem:
Attempting to deploy FreeIPA 4.4.4 from updates-testing (currently) fails.

Version-Release number of selected component (if applicable):
freeipa-4.4.4-1.fc26
pki-core-10.4.0-1.fc26
389-ds-base-1.3.6.4-1.fc26
tomcat-8.0.43-1.fc26

How reproducible:
Only tried once

Steps to Reproduce:
1. Install Fedora Server 26 pre-release (I used https://kojipkgs.fedoraproject.org/compose/branched/Fedora-26-20170420.n.0/compose/Server/x86_64/iso/Fedora-Server-dvd-x86_64-26-20170420.n.0.iso ) with defaults except setting a sensible hostname and domain name.
2. Log in as root and run:
```
rolectl deploy domaincontroller --name `hostname -d`
```
(or use ipa-server-install directly which is what rolectl does under the hood after generating random passwords and passing arguments to set up the DNS server)

Actual results:
The deployment fails and the journal/ipa-server-install.log will show numerous failures trying to start the Certificate Authority.

Expected results:
FreeIPA should install correctly.


Additional info:


Apr 26 08:54:22 dc1.prebeta.f26.ipa server[6986]: PKIListener: org.apache.catalina.core.StandardServer[before_init]
Apr 26 08:54:23 dc1.prebeta.f26.ipa server[6986]: PKIListener: org.apache.catalina.core.StandardServer[after_init]
Apr 26 08:54:23 dc1.prebeta.f26.ipa server[6986]: PKIListener: org.apache.catalina.core.StandardServer[before_start]
Apr 26 08:54:23 dc1.prebeta.f26.ipa server[6986]: PKIListener: org.apache.catalina.core.StandardServer[configure_start]
Apr 26 08:54:23 dc1.prebeta.f26.ipa server[6986]: PKIListener: org.apache.catalina.core.StandardServer[start]
Apr 26 08:54:23 dc1.prebeta.f26.ipa roled[970]: 2017-04-26 08:54:23 ERROR: b'ipa         : DEBUG    Waiting until the CA is running'
Apr 26 08:54:23 dc1.prebeta.f26.ipa roled[970]: 2017-04-26 08:54:23 ERROR: b'ipa         : DEBUG    request POST http://dc1.prebeta.f26.ipa:8080/ca/admin/ca/getStatus'
Apr 26 08:54:23 dc1.prebeta.f26.ipa roled[970]: 2017-04-26 08:54:23 ERROR: b"ipa         : DEBUG    request body ''"
Apr 26 08:54:25 dc1.prebeta.f26.ipa server[6986]: SSLAuthenticatorWithFallback: Creating SSL authenticator with fallback
Apr 26 08:54:25 dc1.prebeta.f26.ipa server[6986]: SSLAuthenticatorWithFallback: Setting container
Apr 26 08:54:25 dc1.prebeta.f26.ipa server[6986]: SSLAuthenticatorWithFallback: Initializing authenticators
Apr 26 08:54:25 dc1.prebeta.f26.ipa server[6986]: SSLAuthenticatorWithFallback: Starting authenticators
Apr 26 08:54:26 dc1.prebeta.f26.ipa server[6986]: CMSEngine.initializePasswordStore() begins
Apr 26 08:54:26 dc1.prebeta.f26.ipa server[6986]: CMSEngine.initializePasswordStore(): tag=internaldb
Apr 26 08:54:26 dc1.prebeta.f26.ipa server[6986]: testLDAPConnection connecting to dc1.prebeta.f26.ipa:389
Apr 26 08:54:26 dc1.prebeta.f26.ipa server[6986]: CMSEngine.initializePasswordStore(): tag=replicationdb
Apr 26 08:54:26 dc1.prebeta.f26.ipa server[6986]: testLDAPConnection connecting to dc1.prebeta.f26.ipa:389
Apr 26 08:54:26 dc1.prebeta.f26.ipa server[6986]: testLDAPConnection: The specified user cn=Replication Manager masterAgreement1-dc1.prebeta.f26.ipa-pki-tomcat,cn=config does not exist
Apr 26 08:54:26 dc1.prebeta.f26.ipa server[6986]: CMSEngine: init(): password test execution failed for replicationdbwith NO_SUCH_USER.  This may not be a latest instance.  Ignoring ..
Apr 26 08:54:27 dc1.prebeta.f26.ipa audit[6336]: ANOM_ABEND auid=4294967295 uid=389 gid=389 ses=4294967295 subj=system_u:system_r:dirsrv_t:s0 pid=6336 comm="ns-slapd" exe="/usr/sbin/ns-slapd" sig=11 res=1
Apr 26 08:54:27 dc1.prebeta.f26.ipa kernel: ns-slapd[7055]: segfault at 10 ip 00007fe20651762a sp 00007fe1d608da50 error 4 in libslapd.so.0.1.0[7fe20645e000+128000]
Apr 26 08:54:27 dc1.prebeta.f26.ipa systemd[1]: dirsrv: Main process exited, code=killed, status=11/SEGV
Apr 26 08:54:27 dc1.prebeta.f26.ipa systemd[1]: dirsrv: Unit entered failed state.
Apr 26 08:54:27 dc1.prebeta.f26.ipa audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=dirsrv@PREBETA-F26-IPA comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
Apr 26 08:54:27 dc1.prebeta.f26.ipa systemd[1]: dirsrv: Failed with result 'signal'.
Apr 26 08:54:27 dc1.prebeta.f26.ipa server[6986]: Could not connect to LDAP server host dc1.prebeta.f26.ipa port 389 Error netscape.ldap.LDAPException: Unable to create socket: java.net.ConnectException: Connection refused (Connection refused) (-1)
Apr 26 08:54:27 dc1.prebeta.f26.ipa server[6986]: PKIListener: org.apache.catalina.core.StandardServer[after_start]
Apr 26 08:54:27 dc1.prebeta.f26.ipa server[6986]: PKIListener: Subsystem CA is running.
Apr 26 08:54:27 dc1.prebeta.f26.ipa roled[970]: 2017-04-26 08:54:27 ERROR: b'ipa         : DEBUG    response status 500'
Apr 26 08:54:27 dc1.prebeta.f26.ipa roled[970]: 2017-04-26 08:54:27 ERROR: b"ipa         : DEBUG    response headers {'content-length': '2351', 'content-language': 'en', 'server': 'Apache-Coyote/1.1', 'connection': 'close', 'date': 'Wed, 26 Apr 2017 12:54:27 GMT', 'content-type': '
Apr 26 08:54:27 dc1.prebeta.f26.ipa roled[970]: 2017-04-26 08:54:27 ERROR: b'ipa         : DEBUG    response body \'<!DOCTYPE html><html><head><title>Apache Tomcat/8.0.43 - Error report</title><style type="text/css">H1 {font-family:Tahoma,Arial,sans-serif;color:white;background-col
Apr 26 08:54:27 dc1.prebeta.f26.ipa roled[970]: 2017-04-26 08:54:27 ERROR: b'ipa         : DEBUG    The CA status is: check interrupted due to error: Retrieving CA status failed with status 500'
Apr 26 08:54:27 dc1.prebeta.f26.ipa roled[970]: 2017-04-26 08:54:27 ERROR: b'ipa         : DEBUG    Waiting for CA to start...'
Apr 26 08:54:28 dc1.prebeta.f26.ipa roled[970]: 2017-04-26 08:54:28 ERROR: b'ipa         : DEBUG    request POST http://dc1.prebeta.f26.ipa:8080/ca/admin/ca/getStatus'
Apr 26 08:54:28 dc1.prebeta.f26.ipa roled[970]: 2017-04-26 08:54:28 ERROR: b"ipa         : DEBUG    request body ''"
Apr 26 08:54:28 dc1.prebeta.f26.ipa roled[970]: 2017-04-26 08:54:28 ERROR: b'ipa         : DEBUG    response status 500'
Apr 26 08:54:28 dc1.prebeta.f26.ipa roled[970]: 2017-04-26 08:54:28 ERROR: b"ipa         : DEBUG    response headers {'content-length': '2351', 'content-language': 'en', 'server': 'Apache-Coyote/1.1', 'connection': 'close', 'date': 'Wed, 26 Apr 2017 12:54:28 GMT', 'content-type': '
Apr 26 08:54:28 dc1.prebeta.f26.ipa roled[970]: 2017-04-26 08:54:28 ERROR: b'ipa         : DEBUG    response body \'<!DOCTYPE html><html><head><title>Apache Tomcat/8.0.43 - Error report</title><style type="text/css">H1 {font-family:Tahoma,Arial,sans-serif;color:white;background-col
Apr 26 08:54:28 dc1.prebeta.f26.ipa roled[970]: 2017-04-26 08:54:28 ERROR: b'ipa         : DEBUG    The CA status is: check interrupted due to error: Retrieving CA status failed with status 500'
Apr 26 08:54:28 dc1.prebeta.f26.ipa roled[970]: 2017-04-26 08:54:28 ERROR: b'ipa         : DEBUG    Waiting for CA to start...'
Apr 26 08:54:29 dc1.prebeta.f26.ipa roled[970]: 2017-04-26 08:54:29 ERROR: b'ipa         : DEBUG    request POST http://dc1.prebeta.f26.ipa:8080/ca/admin/ca/getStatus'
Apr 26 08:54:29 dc1.prebeta.f26.ipa roled[970]: 2017-04-26 08:54:29 ERROR: b"ipa         : DEBUG    request body ''"
Apr 26 08:54:29 dc1.prebeta.f26.ipa roled[970]: 2017-04-26 08:54:29 ERROR: b'ipa         : DEBUG    response status 500'
Apr 26 08:54:29 dc1.prebeta.f26.ipa roled[970]: 2017-04-26 08:54:29 ERROR: b"ipa         : DEBUG    response headers {'content-length': '2351', 'content-language': 'en', 'server': 'Apache-Coyote/1.1', 'connection': 'close', 'date': 'Wed, 26 Apr 2017 12:54:29 GMT', 'content-type': '
Apr 26 08:54:29 dc1.prebeta.f26.ipa roled[970]: 2017-04-26 08:54:29 ERROR: b'ipa         : DEBUG    response body \'<!DOCTYPE html><html><head><title>Apache Tomcat/8.0.43 - Error report</title><style type="text/css">H1 {font-family:Tahoma,Arial,sans-serif;color:white;background-col
Apr 26 08:54:29 dc1.prebeta.f26.ipa roled[970]: 2017-04-26 08:54:29 ERROR: b'ipa         : DEBUG    The CA status is: check interrupted due to error: Retrieving CA status failed with status 500'
Apr 26 08:54:29 dc1.prebeta.f26.ipa roled[970]: 2017-04-26 08:54:29 ERROR: b'ipa         : DEBUG    Waiting for CA to start...'

Comment 1 Stephen Gallagher 2017-04-26 13:44:32 UTC
Proposing as a Beta blocker via the Beta criterion:
"The core functional requirements for all Featured Server Roles must be met, without any workarounds being necessary."

Since the FreeIPA server cannot be installed successfully, it cannot serve any of its functions.

Comment 2 Stephen Gallagher 2017-04-26 13:45:45 UTC
(Note to blocker review crew: currently this is only broken in updates-testing, but it's serious enough that I want to make sure it's kept on the radar).

Comment 3 Petr Vobornik 2017-04-26 14:39:11 UTC
Reproduced. I think it is a directory server issue: 
  389-ds-base-1.3.6.4-1.fc26.x86_64

PKI cannot communicate with DS.

 systemctl status dirsrv.COM.service 
● dirsrv.COM.service - 389 Directory Server TEST.EXAMPLE.COM.
   Loaded: loaded (/usr/lib/systemd/system/dirsrv@.service; disabled; vendor preset: disabled)
   Active: failed (Result: signal) since Wed 2017-04-26 16:13:28 CEST; 11min ago
 Main PID: 14647 (code=killed, signal=SEGV)
   Status: "slapd started: Ready to process requests"


ns-slapd[14647]: [26/Apr/2017:16:13:04.563485142 +0200] - INFO - ldbm_back_ldbm2index - ipaca: Indexing VLV: caRejectedRenewal-pki-tomcatIndex
ns-slapd[14647]: [26/Apr/2017:16:13:04.565677302 +0200] - INFO - ldbm_back_ldbm2index - ipaca: Indexing VLV: caRejectedRevocation-pki-tomcatIndex
ns-slapd[14647]: [26/Apr/2017:16:13:04.567787915 +0200] - INFO - ldbm_back_ldbm2index - ipaca: Indexing VLV: caRenewal-pki-tomcatIndex
ns-slapd[14647]: [26/Apr/2017:16:13:04.569939377 +0200] - INFO - ldbm_back_ldbm2index - ipaca: Indexing VLV: caRevocation-pki-tomcatIndex
ns-slapd[14647]: [26/Apr/2017:16:13:04.580337710 +0200] - INFO - ldbm_back_ldbm2index - ipaca: Finished indexing.
systemd[1]: dirsrv.COM.service: Main process exited, code=killed, status=11/SEGV
systemd[1]: dirsrv.COM.service: Unit entered failed state.
systemd[1]: dirsrv.COM.service: Failed with result 'signal'

Comment 6 Stephen Gallagher 2017-04-26 14:53:29 UTC
Correction, I had PKI 10.3.5-12.fc26, not 10.4

Comment 7 Petr Vobornik 2017-04-26 14:54:45 UTC
Thierry could you confirm my theory?

Comment 8 thierry bordaz 2017-04-26 15:10:49 UTC
Petr,

389-ds is SIGSEV but no core was dumped. I will try to reproduce it, would you give me more details how to reproduce ?

Also I noticed that SELinux is enforced and that 389-ds hit (autotuning) 1436689.
I do not know if it is leading to the crash. But if it is, running in permissive would be a temporary workaround.

Comment 9 Petr Vobornik 2017-04-26 15:19:58 UTC
I just
- installed freeipa-server-dns with updates testing enabled on minimal install
- ipa-server-install --setup-dns + mandatory options

From my script:
  sudo ipa-server-install -U -n $IPA_DOMAIN -r $IPA_REALM  -p $IPA_PASSWORD  -a $IPA_PASSWORD --setup-dns --forwarder=$DNS_FORWARDER --no-reverse

More info can be found on vm in comment 5 - including SELinux denieals (there is quite a lot of them related to various services).

Comment 10 thierry bordaz 2017-04-26 16:00:41 UTC
I hit that bug 2 days ago and opened https://pagure.io/389-ds-base/issue/49233
William identified the RC and proposed a fix that is still under review.

The crash is related to the use of persistent search where the connection handling with the new nunc-stans components had a invalid heap access (access after freed block).

A temporary workaround could be to prevent dogtag to open a persistent search.

Comment 11 thierry bordaz 2017-04-27 06:25:47 UTC
Upstream ticket https://pagure.io/389-ds-base/issue/49233 pushed

Comment 12 thierry bordaz 2017-04-27 16:03:26 UTC
Just to clarify where this fix applies.
It fixes a regression introduced in #49097 (in connection.c:connectio_threadmain OP_FLAG_PS calls slapi_pblock_init rather that memset the pblock)

So this bug does not exist in RHEL7.4

Comment 13 Adam Williamson 2017-05-01 16:18:38 UTC
So the original 389-ds-base was withdrawn and a subsequent one, which AFAIK passed all manual and automated testing, has now been pushed stable:

https://bodhi.fedoraproject.org/updates/FEDORA-2017-15e2a038b2

are we tracking anything else here or should we close this now?

Comment 14 Stephen Gallagher 2017-05-01 16:27:44 UTC
Looks good to me.

Comment 15 Geoffrey Marr 2017-05-01 18:24:50 UTC
Discussed during the 2017-05-01 blocker review meeting: [1]

The decision to delay the classification of this bug was made as we believe a fix has been released. We will revisit it next week.

[1] https://meetbot.fedoraproject.org/fedora-blocker-review/2017-05-01/f26-blocker-review.2017-05-01-16.02.txt