Bug 1921855

Summary: Undercloud infrastructure systems IPA enrollment task is not idempotent
Product: Red Hat OpenStack
Reporter: Andrea Veri <averi>
Component: ansible-tripleo-ipa
Assignee: Ade Lee <alee>
Status: CLOSED ERRATA
QA Contact: Jeremy Agee <jagee>
Severity: high
Docs Contact:
Priority: urgent
Version: 16.1 (Train)
CC: ahasson, alee, chrisbro, dwilde, hrybacki, jagee, jamsmith
Target Milestone: z6
Keywords: Triaged, ZStream
Target Release: 16.1 (Train on RHEL 8.2)
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: ansible-tripleo-ipa-0.2.1-1.20210406193439.3bb3c53.el8ost.noarch.rpm
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
or errata advisory 71440
Last Closed: 2021-05-26 13:50:41 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Andrea Veri 2021-01-28 17:55:00 UTC
Description of problem:

Whenever you run a deployment, ansible-freeipa tries to enroll all the undercloud infrastructure systems, even if they are already correctly enrolled against IDM.

Version-Release number of selected component (if applicable):

16.1

How reproducible:
100%

Steps to Reproduce:
1. Have the TLS everywhere template sourced in your deployment
2. Have an OpenStack environment backed by AAA managed by Red Hat IPA (IDM)
3. Run a deployment

Actual results:

2021-01-26 21:34:04,139 p=1004216 u=mistral n=ansible | ok: [undercloud] => {"ansible_facts": {"base_server_domain": "REDACTED.redhat.com", "base_server_fqdn": "compute-az1-17.REDACTED.redhat.com", "base_server_short_name": "compute-az1-17", "enroll_base_server": true}, "changed": false}
2021-01-26 21:34:04,177 p=1004216 u=mistral n=ansible | TASK [tripleo_ipa_registration : get host raw data and keytab info] ************
2021-01-26 21:34:04,177 p=1004216 u=mistral n=ansible | Tuesday 26 January 2021  21:34:04 +0000 (0:00:00.702)       0:48:23.118 ******* 
2021-01-26 21:34:05,349 p=1004216 u=mistral n=ansible | ok: [undercloud] => {"changed": false, "cmd": ["ipa", "host-show", "--raw", "--all", "compute-az1-17.REDACTED.redhat.com"], "delta": "0:00:00.824479", "end": "2021-01-26 21:34:05.317295", "failed_when_result": false, "msg": "non-zero return code", "rc": 1, "start": "2021-01-26 21:34:04.492816", "stderr": "ipa: ERROR: Ticket expired", "stderr_lines": ["ipa: ERROR: Ticket expired"], "stdout": "", "stdout_lines": []}
2021-01-26 21:34:05,386 p=1004216 u=mistral n=ansible | TASK [tripleo_ipa_registration : remove stale host if present] *****************
2021-01-26 21:34:05,386 p=1004216 u=mistral n=ansible | Tuesday 26 January 2021  21:34:05 +0000 (0:00:01.208)       0:48:24.327 ******* 
2021-01-26 21:34:05,444 p=1004216 u=mistral n=ansible | skipping: [undercloud] => {"changed": false, "skip_reason": "Conditional result was False"}
2021-01-26 21:34:05,480 p=1004216 u=mistral n=ansible | TASK [tripleo_ipa_registration : add new host with random one-time password] ***
2021-01-26 21:34:05,480 p=1004216 u=mistral n=ansible | Tuesday 26 January 2021  21:34:05 +0000 (0:00:00.093)       0:48:24.420 ******* 
2021-01-26 21:34:09,472 p=1004216 u=mistral n=ansible | [WARNING]: Module did not set no_log for random_password

Expected results:

The task should succeed and report that the system is already correctly enrolled.


Additional info:

There are 2 bugs from what I can see here:

1. There's no task that makes sure the host has a valid keytab
2. The "add new host with random one-time password" task should be triggered based on the output of a successful ipa command, that is, on ipa returning an object that can be inspected and acted upon. It should not be triggered because "get host raw data and keytab info" returned a failure for reasons external to the output of the ipa command itself (here, "Ticket expired").
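
The second point can be sketched as a small decision function. This is only an illustration of the intended behaviour, not the fix that shipped in ansible-tripleo-ipa. The function name, the outcome names, the "has_keytab: TRUE" check and the "not found" match are all assumptions made for the example; only the "Ticket expired" case is taken from the log above.

```python
# Sketch only: names and string matches below are assumptions for
# illustration, not the actual ansible-tripleo-ipa implementation.

ALREADY_ENROLLED = "already_enrolled"
NEEDS_ENROLLMENT = "needs_enrollment"
QUERY_FAILED = "query_failed"


def classify_host_show(rc: int, stdout: str, stderr: str) -> str:
    """Classify the result of `ipa host-show --raw --all <fqdn>`."""
    if rc == 0:
        # The command answered, so inspect the object it returned.
        lines = [line.strip().lower() for line in stdout.splitlines()]
        if "has_keytab: true" in lines:  # assumed field format
            return ALREADY_ENROLLED
        return NEEDS_ENROLLMENT
    if "not found" in stderr.lower():
        # Assumed wording of the "no such host" error.
        return NEEDS_ENROLLMENT
    # Any other failure (for example "Ticket expired") says nothing
    # about the host, so it must not trigger a new enrollment.
    return QUERY_FAILED


if __name__ == "__main__":
    # stderr taken from the log in this report
    print(classify_host_show(1, "", "ipa: ERROR: Ticket expired"))
```

With this split, only NEEDS_ENROLLMENT would lead to the "add new host with random one-time password" task; QUERY_FAILED would stop the run so the credential problem can be fixed first.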

Comment 4 Andrea Veri 2021-02-05 12:56:47 UTC
Ade, as per our meeting last night, I'd like to add this comment as a reminder that kdestroy is a required step whenever the user trying to obtain a new Kerberos ticket already has a principal in place, whether that principal is expired or not. What we observed last night was that kinit -kt /etc/novajoin/krb5.keytab nova/`hostname` wasn't superseding the former principal, which continued to report a "Ticket expired" error.

Comment 5 Andrea Veri 2021-02-10 12:43:57 UTC
Ade, I believe I found what the issue was with our production environment. A summary follows:

As per [1], it seems the IPA command used for querying the status of an IDM node runs
as root; as such, the workaround to avoid BZ #1921855 should be run against the root user. The core of the
issue was the Kerberos ticket cache, which contained an expired ticket for the Nova principal we use to
perform IDM operations within the undercloud (you can see it with klist -A, not plain klist). The expired ticket cannot
be renewed with a plain kinit (or kinit -R) but requires a kdestroy (or kdestroy -A) followed by a new kinit because (from kinit's man page):

-R     requests renewal of the ticket-granting ticket.  Note that an expired ticket cannot be renewed,
       even if the ticket is still within its renewable life.

IPA will then perform its query via the code specified at [2].

[1] https://opendev.org/x/tripleo-ipa/src/branch/master/tripleo_ipa/roles/tripleo_ipa_registration/tasks/main.yml#L36
[2] https://github.com/openstack/tripleo-heat-templates/blob/stable/train/deployment/ipa/ipaservices-baremetal-ansible.yaml#L97-L122
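
The workaround described in this comment (destroy the stale cache, then obtain a fresh ticket from the keytab) could be scripted roughly as follows. This is a minimal sketch and would need to be run as root, since that is the user the query runs as. The function names and the injectable `run` parameter are hypothetical; the keytab path and the nova/<hostname> principal form come from comment 4, and the example host name is a placeholder.

```python
import subprocess
from typing import Callable, List


def ticket_refresh_commands(keytab: str, principal: str) -> List[List[str]]:
    """Return the commands that replace a stale ticket cache."""
    return [
        ["kdestroy", "-A"],                   # drop every cached ticket
        ["kinit", "-kt", keytab, principal],  # fresh ticket from the keytab
    ]


def refresh_ticket(keytab: str, principal: str,
                   run: Callable = subprocess.run) -> None:
    """Run kdestroy, then kinit (hypothetical helper)."""
    destroy, init = ticket_refresh_commands(keytab, principal)
    # Assumption: kdestroy may exit non-zero when there is nothing to
    # destroy, so its exit status is ignored here.
    run(destroy, check=False)
    # kinit must succeed, otherwise later ipa commands will keep failing.
    run(init, check=True)


if __name__ == "__main__":
    for cmd in ticket_refresh_commands("/etc/novajoin/krb5.keytab",
                                       "nova/undercloud.example.com"):
        print(" ".join(cmd))
```

Passing `run` as a parameter keeps the sequence testable without a Kerberos environment; in real use the default `subprocess.run` would execute the two commands.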

Comment 28 errata-xmlrpc 2021-05-26 13:50:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.1.6 bug fix and enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:2097

Comment 30 Red Hat Bugzilla 2023-09-18 00:24:26 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days