I recently implemented an openQA test which does the following:

* Starting from a clean Fedora 25 Server install, deploy the domain controller role
* On another system, starting from a clean Fedora 25 Server install, enrol as a client in the domain (using realmd)
* Once the client is enrolled, upgrade the Server system to Fedora 26, then upgrade the client system to Fedora 26
* Run through the usual server and client tests on Fedora 26

The server part of this test always fails right at the end, when the role is decommissioned with `rolectl decommission domaincontroller/domain.local`. In contrast, when a similar test is run entirely on Fedora 26 (and, indeed, entirely on Fedora 25) the decommissioning works successfully. It's only when an upgrade is involved that the decommissioning fails.

The failure seems to be related to something done to the firewall configuration during decommissioning, as the system journal contains these lines at the relevant time:

Jun 25 11:49:32 ipa001.domain.local firewalld[654]: WARNING: '/usr/sbin/iptables-restore --wait=2 -n' failed:
Jun 25 11:49:32 ipa001.domain.local firewalld[654]: WARNING: '/usr/sbin/ip6tables-restore --wait=2 -n' failed:
Jun 25 11:49:32 ipa001.domain.local firewalld[654]: ERROR: COMMAND_FAILED

/var/log/rolekit doesn't provide anything useful, though - the last message in it is:

2017-06-25 14:49:31 ERROR: b'Client uninstall complete.'

which I believe is passed along from the FreeIPA client uninstallation process.

I will attach a tarball containing the complete contents of /var/log from the server to this report. You can use 'journalctl --file' to read the journal files under /var/log/journal.
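For anyone reading the attached logs, here is a minimal sketch of pulling just the firewalld lines out of the extracted journal files. It assumes the tarball has been unpacked and that you pass the directory that directly contains the `*.journal` files. The function name and the `JOURNALCTL` override variable are made up for illustration and are not part of any tool.

```shell
# Command to run; overridable so the function can be exercised without systemd.
JOURNALCTL="${JOURNALCTL:-journalctl}"

# Print only the firewalld lines from the journal files in the given directory.
# $1: directory that directly contains the extracted *.journal files
#     (hypothetical location, wherever the tarball was unpacked).
firewalld_messages() {
    # journalctl expands the glob itself, so it is passed quoted.
    "$JOURNALCTL" --no-pager --file="$1/*.journal" | grep -F 'firewalld['
}
```

Usage would be along the lines of `firewalld_messages ./var/log/journal/<machine-id>`, with the path adjusted to wherever the files ended up.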
Created attachment 1292752: /var/log from an affected case
This has been happening ever since. It'd be really nice if this test would actually run successfully. Did you ever get to look at it, sgallagh?
So, this is indeed still happening, now that we've solved various other issues that got in the way. https://openqa.stg.fedoraproject.org/tests/261752 shows the very same error: https://openqa.stg.fedoraproject.org/tests/261752#step/role_deploy_domain_controller_check/23
sgallagh has said on IRC that this can be reproduced simply across a reboot (i.e. deploy, reboot, attempt to decommission -> fail) and seems to be an issue in firewalld.
Proposing as a Beta freeze exception issue. It appears to me the criteria do not in fact cover decommissioning roles, though this may possibly be an oversight.
OK, I've finally tracked down the failure. The issue is definitely occurring within firewalld. It can be reproduced with the following steps, which do not require rolekit:

1) Install a system with firewalld enabled
2) `firewall-cmd --add-service freeipa-ldap --permanent`
3) `firewall-cmd --add-service freeipa-ldaps --permanent`
4) Reboot the system
5) Verify that both services are enabled with `firewall-cmd --list-all`
6) `firewall-cmd --remove-service freeipa-ldaps` (Succeeds)
7) `firewall-cmd --remove-service freeipa-ldap` (Returns "Error: COMMAND_FAILED")

It appears that firewalld doesn't properly handle removing the second of two permanent services whose definitions have overlapping entries. The freeipa-ldap and freeipa-ldaps services are almost identical, each providing numerous ports. They differ only in the LDAP TCP port, which is 389 for freeipa-ldap and 636 for freeipa-ldaps. So after removing one of the two services, firewalld cannot properly handle removing the other one. This is what causes the FreeIPA decommissioning to fail.

You can reverse steps 6 and 7 above, and whichever removal comes second will always fail.

(I was clued in that it might be related to the freeipa-ldap/s interaction because the postgresql role did not exhibit the same behavior.)
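To see the overlap between the two service definitions directly, something like the following sketch could be used. It assumes `firewall-cmd --permanent --service=NAME --get-ports` is available in the installed firewalld and prints the ports as a space-separated list. The function names and the `FIREWALL_CMD` override variable are illustrative only.

```shell
# Command to run; overridable so the functions can be exercised with a stub.
FIREWALL_CMD="${FIREWALL_CMD:-firewall-cmd}"

# Print the ports of one service definition, one per line, sorted.
# Assumes --get-ports prints a space-separated list such as "80/tcp 443/tcp".
service_ports() {
    "$FIREWALL_CMD" --permanent --service="$1" --get-ports \
        | tr ' ' '\n' | sed '/^$/d' | sort
}

# Print the ports that two service definitions have in common.
shared_ports() {
    tmp_a=$(mktemp)
    tmp_b=$(mktemp)
    service_ports "$1" > "$tmp_a"
    service_ports "$2" > "$tmp_b"
    # comm -12 keeps only the lines present in both sorted files.
    comm -12 "$tmp_a" "$tmp_b"
    rm -f "$tmp_a" "$tmp_b"
}
```

Running `shared_ports freeipa-ldap freeipa-ldaps` on an affected system should list every port the two definitions share, which, going by the description above, is everything except the LDAP port itself.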
What version of firewalld? I think this was fixed in version firewalld-0.5.2-1.
I can reproduce the issue with firewalld-0.5.1-2.fc28.noarch, and I can confirm that firewalld-0.5.2-1.fc28.noarch resolves it. Please get it submitted in Bodhi ASAP.
firewalld-0.5.2-1.fc28 has been submitted as an update to Fedora 28. https://bodhi.fedoraproject.org/updates/FEDORA-2018-fa59de3ded
firewalld-0.5.2-1.fc28 has been pushed to the Fedora 28 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-fa59de3ded
firewalld-0.5.2-2.fc28 has been submitted as an update to Fedora 28. https://bodhi.fedoraproject.org/updates/FEDORA-2018-980d3f6ad7
Discussed during blocker review [1]:

AcceptedFreezeException (Beta) - decommissioning isn't actually part of the release criteria, but is a significant function of server roles, and pulling this in will improve openQA test coverage

[1] https://meetbot-raw.fedoraproject.org/fedora-meeting-1/2018-03-22/
firewalld-0.5.2-2.fc28 has been pushed to the Fedora 28 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-980d3f6ad7
openQA testing confirmed the fix for this: https://openqa.stg.fedoraproject.org/tests/263230
firewalld-0.5.2-2.fc28 has been pushed to the Fedora 28 stable repository. If problems still persist, please make note of it in this bug report.
Can the fix for this also be sent to F27? I just started running FreeIPA upgrade tests on stable release updates, and it told me that F27 is still affected by this.
firewalld-0.4.4.5-4.fc26 has been submitted as an update to Fedora 26. https://bodhi.fedoraproject.org/updates/FEDORA-2018-0f5c19f004
firewalld-0.4.4.5-4.fc27 has been submitted as an update to Fedora 27. https://bodhi.fedoraproject.org/updates/FEDORA-2018-c65cf564c3
So I went ahead and backported the commit that looked most obviously like the fix for this - https://github.com/firewalld/firewalld/commit/54835164f610593eedd71f0a7ae62ac5258d2187 - for F26 and F27 and submitted an update. The F27 openQA test result - https://openqa.stg.fedoraproject.org/tests/289311 - seems to confirm the fix: that test is failing in all other F27 update tests right now, but passes (well, soft fails, which is more or less a pass) with this update. sgallagh, could you confirm and upkarma? I'd like to push this out so we don't have this bug causing the test to fail on *every* F27 update.
(In reply to Adam Williamson from comment #19)
> So I went ahead and backported the commit that looked most obviously like
> the fix for this -
> https://github.com/firewalld/firewalld/commit/54835164f610593eedd71f0a7ae62ac5258d2187
> - for F26 and F27 and submitted an update. The F27 openQA test result -
> https://openqa.stg.fedoraproject.org/tests/289311 - seems to confirm the
> fix: that test is failing in all other F27 update tests right now, but
> passes (well, soft fails, which is more or less a pass) with this update.

Thanks! LGTM.

> sgallagh, could you confirm and upkarma? I'd like to push this out so we
> don't have this bug causing the test to fail on *every* F27 update.

Adding needinfo for sgallagh.
Yes, I saw this and am looking into it right now, in fact. I'll get the karma out shortly.
Confirmed, this update resolves the firewalld bug. I tested by doing the following:

Installed F27 with all stable updates
firewall-cmd --add-service=freeipa-ldap --permanent
firewall-cmd --add-service=freeipa-ldaps --permanent
Rebooted
Updated to the fixed package (I did this after the reboot to confirm that the initial state was recoverable)
firewall-cmd --remove-service=freeipa-ldap
firewall-cmd --remove-service=freeipa-ldaps

All went well.
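The removal half of that check can be wrapped up roughly as below, for anyone who wants to repeat it after the reboot. This is only a sketch. The function name and the `FIREWALL_CMD` override variable are invented for illustration, and it covers only the runtime `--remove-service` calls, not the install, reboot or package update.

```shell
# Command to run; overridable so the function can be exercised with a stub.
FIREWALL_CMD="${FIREWALL_CMD:-firewall-cmd}"

# Try to remove each named service from the runtime configuration, in order,
# and report the outcome. Returns non-zero if any removal failed.
remove_services() {
    failed=0
    for svc in "$@"; do
        if "$FIREWALL_CMD" --remove-service="$svc" >/dev/null 2>&1; then
            echo "ok: $svc"
        else
            echo "FAILED: $svc"
            failed=1
        fi
    done
    return "$failed"
}
```

With an affected firewalld, `remove_services freeipa-ldap freeipa-ldaps` would be expected to report the second service as FAILED. With the fixed package both should come back ok.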
firewalld-0.4.4.5-4.fc26 has been pushed to the Fedora 26 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-0f5c19f004
firewalld-0.4.4.5-4.fc27 has been pushed to the Fedora 27 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-c65cf564c3
firewalld-0.4.4.5-4.fc27 has been pushed to the Fedora 27 stable repository. If problems still persist, please make note of it in this bug report.
firewalld-0.4.4.5-4.fc26 has been pushed to the Fedora 26 stable repository. If problems still persist, please make note of it in this bug report.