Bug 1930038 - 'ipa-server-install --uninstall --ignore-topology-disconnect --ignore-last-of-role' fails with org.freedesktop.DBus.Error.NoReply: Did not receive a reply
Keywords:
Status: POST
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: ipa
Version: 8.4
Hardware: Unspecified
OS: Unspecified
Target Milestone: rc
Assignee: Thomas Woerner
QA Contact: ipa-qe
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-02-18 09:47 UTC by Sudhir Menon
Modified: 2021-04-13 10:00 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Type: Bug
Target Upstream Version:



Description Sudhir Menon 2021-02-18 09:47:01 UTC
Description of problem: 'ipa-server-install --uninstall --ignore-topology-disconnect --ignore-last-of-role' fails with org.freedesktop.DBus.Error.NoReply: Did not receive a reply

Version-Release number of selected component (if applicable):
ipa-client-4.9.2-1.module+el8.4.0+9973+3d202164.x86_64
ipa-healthcheck-core-0.7-3.module+el8.4.0+9007+5084bdd8.noarch
ipa-selinux-4.9.2-1.module+el8.4.0+9973+3d202164.noarch
ipa-server-4.9.2-1.module+el8.4.0+9973+3d202164.x86_64
ipa-server-trust-ad-4.9.2-1.module+el8.4.0+9973+3d202164.x86_64
389-ds-base-1.4.3.16-11.module+el8.4.0+9969+312e177c.x86_64
krb5-server-1.18.2-8.el8.x86_64

How reproducible: Always

Steps to Reproduce:
1. 'ipa-server-install', '--uninstall', '-U', '--ignore-topology-disconnect', '--ignore-last-of-role'

Actual results:
[ipatests.pytest_ipa.integration.host.Host.master.cmd404] DEBUG RUN ['ipa-server-install', '--uninstall', '-U', '--ignore-topology-disconnect', '--ignore-last-of-role']
2021-02-18T04:08:47-0500 [ipatests.pytest_ipa.integration.host.Host.master.cmd403] DEBUG /usr/lib/python3.6/site-packages/ipaserver/plugins/dogtag.py:1973: The subsystem in PKIConnection.__init__() has been deprecated (https://www.dogtagpki.org/wiki/PKI_10.8_Python_Changes).
2021-02-18T04:08:47-0500 [ipatests.pytest_ipa.integration.host.Host.master.cmd403] DEBUG Deleting this server will leave your installation without a CRL generation master.
2021-02-18T04:08:47-0500 [ipatests.pytest_ipa.integration.host.Host.master.cmd403] DEBUG Updating DNS system records
2021-02-18T04:08:48-0500 [ipatests.pytest_ipa.integration.host.Host.master.cmd403] DEBUG Forcing removal of master.testrealm.test
2021-02-18T04:08:48-0500 [ipatests.pytest_ipa.integration.host.Host.master.cmd403] DEBUG Ignoring topology connectivity errors.
2021-02-18T04:08:48-0500 [ipatests.pytest_ipa.integration.host.Host.master.cmd403] DEBUG Deleting this server is not allowed as it would leave your installation without a KRA.
2021-02-18T04:08:48-0500 [ipatests.pytest_ipa.integration.host.Host.master.cmd403] DEBUG Ignoring these warnings and proceeding with removal
 [ipatests.pytest_ipa.integration.host.Host.master.cmd403] DEBUG ------------------------------------------
 [ipatests.pytest_ipa.integration.host.Host.master.cmd403] DEBUG Deleted IPA server "master.testrealm.test"
 [ipatests.pytest_ipa.integration.host.Host.master.cmd403] DEBUG ------------------------------------------
 [ipatests.pytest_ipa.integration.host.Host.master.cmd403] DEBUG org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
 [ipatests.pytest_ipa.integration.host.Host.master.cmd403] DEBUG The ipa-server-install command failed. See /var/log/ipaserver-uninstall.log for more information
 [ipatests.pytest_ipa.integration.host.Host.master.cmd403] DEBUG Exit code: 1
 [ipatests.pytest_ipa.integration.host.Host.master.cmd403] ERROR stderr: /usr/lib/python3.6/site-packages/ipaserver/plugins/dogtag.py:1973: The subsystem in PKIConnection.__init__() has been deprecated (https://www.dogtagpki.org/wiki/PKI_10.8_Python_Changes).
Forcing removal of master.testrealm.test
Ignoring topology connectivity errors.
Deleting this server is not allowed as it would leave your installation without a KRA.
Ignoring these warnings and proceeding with removal
org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
The ipa-server-install command failed. See /var/log/ipaserver-uninstall.log for more information

Expected results:
Uninstall should be successful.

Additional info: https://pagure.io/freeipa/issue/8506

Comment 2 Rob Crittenden 2021-02-18 12:52:26 UTC
The log files generated during the failure are necessary.

Comment 8 Rob Crittenden 2021-03-17 21:55:32 UTC
Re-assigning to certmonger component.

I believe what is happening is the CA is in a bad way so certs are stuck in SUBMITTING. I don't know if this is because something else holds the lock file or not.

A change was made to certmonger in Aug 2020 (certmonger-0.79.12+) to not send a SIGKILL when certmonger wants to stop waiting on a child process. This was causing the IPA renewal lock file to be left in an unknown state. It was really just a race condition as it seemed like the processes were usually nearly, but not quite, finished.

The problem in this case is that we want to stop tracking a certificate, so we don't care whether it is issued or not. certmonger uses waitpid() to determine when the helper process has finished. Here that won't happen until the submission is complete (the request is in SUBMITTING), so the call exceeds the DBus timeout. Heck, I don't know for sure that IPA would ever finish the request.

I'm going to investigate sending a SIGTERM that can be caught by the helper so it can clean itself up. Ideally after a timeout, but the DBus request timeout is something extremely short like 25 seconds. I'll consider adding retry code to the certmonger calls in IPA.
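
For illustration only, the "SIGTERM that the helper can catch, ideally with a timeout" idea might look roughly like the Python sketch below. This is not certmonger code (certmonger is written in C); the HELPER script, stop_helper() and the 5-second grace period are hypothetical names and values chosen for the example. The helper traps SIGTERM and exits cleanly; the parent waits a bounded time and only then falls back to SIGKILL.

```python
import signal
import subprocess
import sys

# Hypothetical stand-in for a CA helper: it traps SIGTERM, would release
# its lock file in the handler, and exits with status 0.
HELPER = r"""
import signal, sys, time

def on_term(signum, frame):
    # A real helper would clean up its lock file here.
    sys.exit(0)

signal.signal(signal.SIGTERM, on_term)
print("ready", flush=True)
time.sleep(60)  # pretend to be stuck in a long submission
"""


def stop_helper(proc, grace=5.0):
    """Send SIGTERM, wait up to `grace` seconds, then fall back to SIGKILL."""
    proc.send_signal(signal.SIGTERM)
    try:
        return proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        proc.kill()
        return proc.wait()


def run_demo():
    proc = subprocess.Popen(
        [sys.executable, "-c", HELPER],
        stdout=subprocess.PIPE,
        text=True,
    )
    proc.stdout.readline()  # block until the helper has installed its handler
    rc = stop_helper(proc)
    proc.stdout.close()
    return rc
```

With a grace period shorter than the caller's DBus timeout, the parent would get an answer back in bounded time whether or not the helper cooperates.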

The reproducer output is in http://freeipa-org-pr-ci.s3-website.eu-central-1.amazonaws.com/jobs/3a555fa6-875f-11eb-92ec-fa163e05ce82/report.html from PR https://github.com/freeipa/freeipa/pull/5573

Comment 9 Rob Crittenden 2021-03-17 21:56:59 UTC
I reproduce this upstream by running test_integration/test_ipahealthcheck.py::TestIpaHealthCheck 5 times. It almost always fails in at least one of the invocations.

Comment 10 Rob Crittenden 2021-03-22 12:00:26 UTC
Moving back to IPA.

The DBus timeouts are seen when IPA is trying to uninstall itself. During uninstall, the certificates it issued are untracked by calling the certmonger DBus command remove_request. remove_request waits for the CA helper to complete. Since this wait exceeds the DBus timeout, the exception is raised.
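
As a rough sketch of the "retry code around the certmonger calls" idea from comment 8: the snippet below is not IPA code. NoReplyError is a hypothetical stand-in for the DBus NoReply exception, and make_flaky() just simulates a remove_request call that times out a fixed number of times. Note that a retry only helps if the helper eventually finishes, which may not be the case in the scenario described in this bug.

```python
import time


class NoReplyError(Exception):
    """Hypothetical stand-in for a DBus NoReply timeout error."""


def call_with_retry(func, attempts=3, delay=0.0):
    """Call func(); on NoReplyError retry up to `attempts` times in total."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except NoReplyError as e:
            last_error = e
            if attempt < attempts:
                time.sleep(delay)
    # Every attempt timed out: re-raise the last error for the caller.
    raise last_error


def make_flaky(failures):
    """Build a fake remove_request that times out `failures` times, then succeeds."""
    state = {"calls": 0}

    def remove_request():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise NoReplyError("Did not receive a reply")
        return state["calls"]  # number of calls it took to succeed

    return remove_request
```

For example, call_with_retry(make_flaky(2)) succeeds on the third call, while call_with_retry(make_flaky(5)) gives up and raises NoReplyError.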

The certs really have no chance of being issued in this case because of the way the test works. It moves forward in time to test that ipa-healthcheck correctly reports that the certs are soon to expire. Then it moves back to current time and tries to uninstall.

certmonger may wake up during the period that ipa-healthcheck is running and try to renew the certs, then the time changes back. The CA is basically hosed because if any certs are renewed in the future then nothing will work because they are not yet valid.
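
To make the "not yet valid" point concrete, here is a small sketch with made-up dates. The 700-day clock shift and 730-day lifetime are assumptions for illustration, not values taken from the test. A cert renewed while the clock is in the future gets a notBefore in the future, so it fails a validity check once the clock is moved back.

```python
from datetime import datetime, timedelta, timezone


def is_currently_valid(not_before, not_after, now):
    """A cert is usable only while now lies inside [not_before, not_after]."""
    return not_before <= now <= not_after


# Illustrative values only.
real_now = datetime(2021, 2, 18, 9, 0, tzinfo=timezone.utc)
shifted_now = real_now + timedelta(days=700)   # clock moved forward by the test

# A cert renewed while the clock was shifted starts its validity "in the future".
not_before = shifted_now
not_after = shifted_now + timedelta(days=730)

valid_while_shifted = is_currently_valid(not_before, not_after, shifted_now)
valid_after_reset = is_currently_valid(not_before, not_after, real_now)
```

Here valid_while_shifted is True and valid_after_reset is False, which matches the situation described above.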

So the test should be modified to stop the CA before running ipa-healthcheck, and to run the uninstall while still in the future time, to prevent certificate issuance.

In short: the test needs to be fixed.

Comment 11 Rob Crittenden 2021-03-22 12:02:32 UTC
Upstream ticket:
https://pagure.io/freeipa/issue/8506

