I updated a Fedora 34 system last night. After rebooting, the IPA service does not start:
[root@ipa2 ~]# systemctl status ipa
× ipa.service - Identity, Policy, Audit
Loaded: loaded (/usr/lib/systemd/system/ipa.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Fri 2021-04-09 09:17:14 CEST; 5s ago
Process: 1092 ExecStart=/usr/sbin/ipactl start (code=exited, status=1/FAILURE)
Main PID: 1092 (code=exited, status=1/FAILURE)
Apr 09 09:17:06 ipa2.linux.schnell.er ipactl: Assuming stale, cleaning and proceeding
Apr 09 09:17:07 ipa2.linux.schnell.er ipactl: Failed to read data from service file: Failed to get list of services to probe status!
Apr 09 09:17:07 ipa2.linux.schnell.er ipactl: Configured hostname 'ipa2.linux.schnell.er' does not match any master server in LDAP:
Apr 09 09:17:07 ipa2.linux.schnell.er ipactl: No master found because of error: no such entry
Apr 09 09:17:07 ipa2.linux.schnell.er ipactl: Shutting down
Apr 09 09:17:14 ipa2.linux.schnell.er ipactl: Starting Directory Service
Apr 09 09:17:14 ipa2.linux.schnell.er systemd: ipa.service: Main process exited, code=exited, status=1/FAILURE
Apr 09 09:17:14 ipa2.linux.schnell.er systemd: ipa.service: Failed with result 'exit-code'.
Apr 09 09:17:14 ipa2.linux.schnell.er systemd: Failed to start Identity, Policy, Audit.
Apr 09 09:17:14 ipa2.linux.schnell.er systemd: ipa.service: Consumed 1.444s CPU time.
[root@ipa2 ~]# ipactl stop
Stopping Directory Service
ipa: INFO: The ipactl command was successful
[root@ipa2 ~]# ipactl start
Starting Directory Service
Failed to read data from service file: Failed to get list of services to probe status!
Configured hostname 'ipa2.linux.schnell.er' does not match any master server in LDAP:
No master found because of error: no such entry
After downgrading 389-ds-base-libs and 389-ds-base from 2.0.4-1.fc34 to 2.0.3-3.fc34 everything is working fine.
We've seen the same problem in openQA testing. We have tests in openQA that deploy a domain controller (and client) as an earlier release, then upgrade to the release under test and check the controller still works; that type of test failed on the 389-ds-base-2.0.4-1.fc34 update, where it tested upgrade from current F33 to F34 with the update included:
The same type of test also failed for the F33 to F35 upgrade on today's Rawhide compose, where 389-ds-base-2.0.4-1.fc35 landed:
The test for update from F34 to Rawhide did not fail, I believe because it actually started with 389-ds-base 2.0.4-1.fc34 (looks like that test runs with updates-testing enabled, which it probably shouldn't).
ab wondered if this could be caused by an unclean shutdown during upgrade. The openQA tests upgrade according to the official guidelines: they run `dnf --releasever=XX system-upgrade download` then `dnf system-upgrade reboot`. They make no attempt to trigger any reboot 'manually' after that; they rely on the upgrade process to boot, run the upgrade, and then reboot. If that doesn't happen - i.e. if the system does not eventually return to a login screen without further action on the part of the test system - the test just times out and dies.
Neither the video nor the log of a failed test shows any unclean shutdown. You can download https://openqa.fedoraproject.org/tests/849781/file/role_deploy_domain_controller_check-var_log.tar.gz and verify this; running `journalctl -b-1` on the journal it contains shows the log messages from the upgrade boot. They show the IPA upgrade script attempting to run during the upgrade and failing in this way, followed by a clean completion of the upgrade process and a clean shutdown.
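For anyone who wants to check a journal for this failure without reading it line by line, here is a minimal sketch in Python. It assumes the journal has already been dumped to plain text (for example with something like `journalctl -D <extracted journal directory> -b -1`, where the directory path inside the tarball is an assumption). The marker strings are copied from the `ipactl` output quoted in the original report; the messages logged by the upgrade script may be worded differently, so treat the list as a starting point. The function names are made up for illustration.

```python
from typing import Iterable, List

# Message fragments copied from the ipactl output quoted in this report.
# Assumption: the same fragments appear in the journal being checked.
FAILURE_MARKERS = (
    "Failed to get list of services to probe status!",
    "does not match any master server in LDAP",
    "No master found because of error: no such entry",
)


def find_failure_lines(lines: Iterable[str]) -> List[str]:
    """Return the lines that contain at least one known failure marker."""
    return [
        line.rstrip("\n")
        for line in lines
        if any(marker in line for marker in FAILURE_MARKERS)
    ]


def shows_ipa_failure(lines: Iterable[str]) -> bool:
    """Return True only if every marker appears somewhere in the lines."""
    text = "\n".join(lines)
    return all(marker in text for marker in FAILURE_MARKERS)
```

Calling `shows_ipa_failure(open("journal.txt"))` on a text dump (the file name is hypothetical) gives a quick yes/no, and `find_failure_lines` returns the matching lines with their timestamps for context.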
The problem seems to have been introduced by Issue 4469 - Backend redesign phase 3a (https://directory.fedoraproject.org/docs/389ds/design/backend-redesign-phase3.html).
More specifically, #4469 changed how the entryrdn database is accessed. After the upgrade the entryrdn database looks valid, but the upgraded DS cannot rebuild a DN from it.
To process any request, the server needs to fetch entries by their DN. It uses the entryrdn DB for this, and that lookup fails. The reason for the failure has not yet been identified.
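To make the failure mode concrete, here is a toy model in Python. It is not the 389-ds code or its on-disk format. It assumes, as a simplification, that an entryrdn-style index records each entry's RDN together with its parent's entry ID, so a DN is rebuilt by walking up the parent chain. The index type, the function name, and the example entries are all hypothetical.

```python
from typing import Dict, Optional, Tuple

# Toy stand-in for an entryrdn-style index: entry ID -> (RDN, parent ID).
# The suffix entry has no parent, so its parent ID is None.
Index = Dict[int, Tuple[str, Optional[int]]]


def rebuild_dn(index: Index, entry_id: int) -> str:
    """Rebuild a DN by following parent links from the entry up to the suffix."""
    rdns = []
    seen = set()
    current: Optional[int] = entry_id
    while current is not None:
        if current in seen:
            raise ValueError(f"parent loop at entry ID {current}")
        seen.add(current)
        if current not in index:
            # A broken or unreadable link means no DN can be produced at all.
            raise LookupError(f"no such entry: ID {current}")
        rdn, parent = index[current]
        rdns.append(rdn)
        current = parent
    return ",".join(rdns)


# Hypothetical entries, for illustration only.
example_index: Index = {
    1: ("dc=example,dc=test", None),
    2: ("cn=etc", 1),
    3: ("cn=masters", 2),
}
```

In this model, `rebuild_dn(example_index, 3)` walks 3 -> 2 -> 1 and joins the RDNs. If any link in the chain cannot be read, the lookup raises instead of returning a DN, which is roughly the situation described above: the data is present, but entries cannot be reached by DN.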