Bug 1866562

Summary: Deletion of compute node fails on "tripleo_ipa_cleanup : delete hosts" step
Product: Red Hat OpenStack Reporter: Marian Krcmarik <mkrcmari>
Component: openstack-tripleo-heat-templates Assignee: Lance Bragstad <lbragsta>
Status: CLOSED ERRATA QA Contact: Jeremy Agee <jagee>
Severity: high Docs Contact:
Priority: high    
Version: 16.1 (Train) CC: alee, chjones, gcharot, gregraka, jamsmith, lbragsta, mburns, michele, rheslop, rmascena
Target Milestone: z2 Keywords: Triaged
Target Release: 16.1 (Train on RHEL 8.2)   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: openstack-tripleo-heat-templates-11.3.2-1.20200820034358.3dd00ef.el8ost Doc Type: Known Issue
Doc Text:
Currently, you cannot scale down or delete compute nodes if Red Hat OpenStack Platform is deployed with TLS Everywhere using tripleo-ipa. This is because the cleanup role, traditionally delegated to the undercloud as localhost, is now being invoked from the Workflow service (mistral) container. For more information, see https://access.redhat.com/solutions/5336241
Story Points: ---
Clone Of:
: 1868767 (view as bug list) Environment:
Last Closed: 2020-10-28 15:38:46 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1868767, 1874847    
Attachments:
Description Flags
ansible.log none
scale_steps_tasks.yaml none

Description Marian Krcmarik 2020-08-05 22:06:57 UTC
Created attachment 1710577 [details]
ansible.log

Description of problem:
Deleting a Compute node during the Compute node replacement procedure fails in an environment deployed with TLS-Everywhere (tripleo-ipa), specifically in a multistack DCN topology.
Example of the command executed:
openstack overcloud node delete --stack dcn2 ${OC_NODE_ID_COMP2_1} --yes

Fails on:
TASK [tripleo_ipa_cleanup : delete hosts, subhosts and services from freeIPA] ***
Wednesday 05 August 2020  09:17:49 +0000 (0:00:00.227)       0:00:05.213 ****** 
fatal: [dcn2-compute2-1]: FAILED! => {"changed": false, "module_stderr": "sudo: unable to open /run/sudo/ts/mistral: Permission denied\nsudo: a password is required\n", "module_stdout": "", "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error", "rc": 1}

The failing Ansible task looks like this:
- name: delete hosts, subhosts and services from freeIPA
  cleanup_ipa_services:
    keytab: "{{ tripleo_ipa_keytab }}"
    hosts: "{{ tripleo_ipa_hosts_to_delete }}"
  become: yes

I removed "become: yes" because the task seems to run as the root user anyway. It then failed with a different error, and I do not know whether that one is caused by a lack of privileges or by another problem. The output is:
TASK [tripleo_ipa_cleanup : delete hosts, subhosts and services from freeIPA] ***
Wednesday 05 August 2020  21:08:13 +0000 (0:00:00.101)       0:00:03.641 ****** 
fatal: [dcn2-compute2-1]: FAILED! => {"changed": false, "msg": "'KRB5CCNAME'"}
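
Note on the 'KRB5CCNAME' message: a bare key name wrapped in single quotes is what Python produces when a KeyError is converted to a string. That would be consistent with the module looking up the KRB5CCNAME environment variable while it is not set. This is only an assumption, as the module source was not checked here. A minimal sketch of the pattern follows; read_ccache_name is a hypothetical helper for illustration and not part of cleanup_ipa_services:

```python
def read_ccache_name(environ):
    """Hypothetical helper: index the environment mapping directly,
    the way a module might if it expects KRB5CCNAME to always be set."""
    try:
        return {"failed": False, "ccache": environ["KRB5CCNAME"]}
    except KeyError as exc:
        # str() of a KeyError is the repr of the missing key, quotes included.
        return {"failed": True, "msg": str(exc)}


# Variable missing: the message has the same shape as the one in the log above.
result_unset = read_ccache_name({})
# Variable present (the cache path is an illustrative value only).
result_set = read_ccache_name({"KRB5CCNAME": "FILE:/tmp/krb5cc_example"})

print(result_unset)
print(result_set)
```

If this is what happens, the second failure would point to a missing Kerberos credential cache in the task environment rather than to a privilege problem, but that still needs to be confirmed against the module code.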



Version-Release number of selected component (if applicable):
ansible-tripleo-ipa-0.2.1-0.20200611104546.c22fc8d.el8ost.noarch

How reproducible:
Always

Steps to Reproduce:
1. Deploy OSP 16.1 with TLS-E using the tripleo-ipa method
2. Try to delete a compute node from the stack (see the command above)

Actual results:
It fails:
"module_stderr": "sudo: unable to open /run/sudo/ts/mistral: Permission denied\nsudo: a password is required\n", "module_stdout": "", "msg": "MODULE FAILURE

Expected results:
Successful removal

Additional info:

Comment 1 Marian Krcmarik 2020-08-05 22:07:36 UTC
Created attachment 1710578 [details]
scale_steps_tasks.yaml

Comment 18 errata-xmlrpc 2020-10-28 15:38:46 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.1 bug fix and enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:4284