|Summary:||During ipa-server-upgrade, CA tries to connect to ns-slapd right after directory server is restarted, and fails|
|Product:||Red Hat Enterprise Linux 8||Reporter:||Paul Nickerson <pnickerson>|
|Component:||ipa||Assignee:||Thomas Woerner <twoerner>|
|Status:||NEW ---||QA Contact:||ipa-qe <ipa-qe>|
|Version:||8.3||CC:||abokovoy, fcami, fdc, ftrivino, ipa-maint, jcholast, jhrozek, mhjacks, pvoborni, rcritten, ssorce, tscherf, twoerner|
|Fixed In Version:||Doc Type:||If docs needed, set a value|
|Doc Text:||Story Points:||---|
|oVirt Team:||---||RHEL 7.3 requirements from Atomic Host:|
|Cloudforms Team:||---||Target Upstream Version:|
Description Paul Nickerson 2021-05-04 19:49:48 UTC
Created attachment 1779495 [details] Logs and command outputs Description of problem: When ipaupgrade runs, PKI Tomcat's CA tries to connect to 389 Directory Server's ns-slapd (LDAP) and fails with connection refused because ns-slapd happens to be restarting in that 4 second time window. CA dies at this moment. Later on, ipaupgrade tries to connect to CA and fails with internal server error, and errors out. Version-Release number of selected component (if applicable): ipa --version: VERSION: 4.8.7, API_VERSION: 2.239 Red Hat Enterprise Linux 8.3 (Ootpa) How reproducible: Happens every time I try to run "ipactl restart". Haven't tried this on a fresh system. Steps to Reproduce: 1. ipactl restart Actual results: Failure reported and all services are stopped. Expected results: Success reported and all services are running. Additional info: I had need to run "ipactl restart", and got this message: IPA version error: data needs to be upgraded (expected version '4.8.7-16.module+el8.3.0+10289+b6566038', current version '4.8.7-13.module+el8.3.0+8376+0bba7131') Automatically running upgrade, for details see /var/log/ipaupgrade.log Be patient, this may take a few minutes. After a little time, this completes with failure: [Migrating certificate profiles to LDAP] IPA server upgrade failed: Inspect /var/log/ipaupgrade.log and run command ipa-server-upgrade manually. Unexpected error - see /var/log/ipaupgrade.log for details: RemoteRetrieveError: Failed to authenticate to CA REST API /var/log/ipaupgrade.log 2021-05-04T18:17:39Z DEBUG SUCCESS: port: 8443 2021-05-04T18:17:39Z DEBUG Start of email@example.com complete 2021-05-04T18:17:40Z DEBUG args=['/bin/systemctl', 'stop', 'dirsrv@SEMI-CASHSTAR-NET.service'] 2021-05-04T18:17:42Z DEBUG args=['/bin/systemctl', 'start', 'dirsrv@SEMI-CASHSTAR-NET.service'] 2021-05-04T18:17:45Z DEBUG args=['/bin/systemctl', 'is-active', 'dirsrv@SEMI-CASHSTAR-NET.service'] 2021-05-04T18:17:55Z DEBUG request GET https://ipa01.semi.cashstar.net:8443/ca/rest/account/login 2021-05-04T18:17:55Z DEBUG request body '' 2021-05-04T18:17:55Z DEBUG response status 500 CA subsystem unavailable. Check CA debug log. /var/log/pki/pki-tomcat/ca/debug.2021-05-04.log 2021-05-04 18:17:43 [main] INFO: Initializing CA subsystem 2021-05-04 18:17:43 [main] FINE: LdapBoundConnection: Connecting to ipa01.semi.cashstar.net:636 with client cert auth 2021-05-04 18:17:43 [main] SEVERE: Unable to create socket: java.net.ConnectException: Connection refused (Connection refused) 2021-05-04 18:17:43 [main] SEVERE: Shutting down. 2021-05-04 18:17:43 [main] INFO: Shutting down CA subsystem journalctl --unit dirsrv@SEMI-CASHSTAR-NET.service May 04 18:17:42 ipa01.semi.cashstar.net systemd: Starting 389 Directory Server SEMI-CASHSTAR-NET.... May 04 18:17:44 ipa01.semi.cashstar.net ns-slapd: [04/May/2021:18:17:44.029650333 +0000] - INFO - slapd_extract_cert - CA CERT NAME: SEMI.CASHSTAR.NET IPA CA I ran this to verify: while true; do date; netstat -pan | egrep "636|389" | grep " LISTEN "; sleep 1; done Tue May 4 18:17:40 UTC 2021 tcp6 0 0 :::636 :::* LISTEN 893452/ns-slapd tcp6 0 0 :::389 :::* LISTEN 893452/ns-slapd Tue May 4 18:17:41 UTC 2021 Tue May 4 18:17:42 UTC 2021 Tue May 4 18:17:43 UTC 2021 Tue May 4 18:17:44 UTC 2021 Tue May 4 18:17:45 UTC 2021 tcp6 0 0 :::636 :::* LISTEN 893792/ns-slapd tcp6 0 0 :::389 :::* LISTEN 893792/ns-slapd The LDAP directory server ns-slapd is not listening on port 636 at 18:17:43 because dirsrv@SEMI-CASHSTAR-NET.service is still starting up at that time. And at that moment, CA tries to connect to ns-slapd on port 636, the connection is refused, and CA shuts down. And then 12 seconds later, ipa-server-upgrade tries to connect to CA and gets an Internal Server Error response. I think PKI Tomcat, which got started by ipa-server-upgrade at 18:17:39, just so happened to start CA at 18:17:43, which is right in the middle of the wrong time. This looks like a race condition. A race condition which is reliably happening every time for me.
Comment 1 Paul Nickerson 2021-05-04 20:34:03 UTC
I figured out a workaround that works on my system. Edit this file: /usr/lib/python3.6/site-packages/ipaserver/install/server/upgrade.py Near the top, add this: import time Find this line: logger.info('[Verifying that KDC configuration is using ipa-kdb backend]') And add this right afterword: logger.info('But first, I sleep so that I don\'t stop 389 Directory Server LDAP ns-slapd while PKI Tomcat CA is trying to use it.') time.sleep(20) logger.info('I have woken.') Save your changes, and run "ipactl restart" (or maybe ipa-server-upgrade would work). On my system, 20 seconds worked, and 4 was not enough. The idea is that "systemctl stop dirsrv@SEMI-CASHSTAR-NET.service" should not run until "[main] INFO: CA subsystem started" shows up in the ca debug log. And waiting 20 seconds is one dirty way to get it done.