Description of problem:
While testing a blacklist stack update for the topology 3cont_3db_3msg_2net_2comp_3ceph with the networker nodes blacklisted, the stack update fails.

Version-Release number of selected component (if applicable):
2018-09-06.1

How reproducible:
100%

Steps to Reproduce:
1. Deploy said topology
2. Create blacklist.yaml that includes the networker nodes
3. Trigger stack update

Actual results:
The action raised an exception [action_ex_id=cc1e4b01-b516-4ebf-983a-b79d40d4060b, action_cls='<class 'mistral.actions.action_factory.GetOvercloudConfig'>', attributes='{}', params='{u'container_config': u'overcloud-config', u'container': u'overcloud'}']
u'fd6d150b-4086-4d97-8f80-5e8b295f5242'
Warning: Permanently added '192.168.24.13' (ECDSA) to the list of known hosts.

Expected results:
UPDATE_COMPLETE

Additional info:
My first thought is that we need https://review.openstack.org/#/c/589290/ so that the enable_ssh_admin action honors the blacklist. I'm not 100% sure, but in the logs I can see that the networker node blacklist isn't honored for this action.
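To make the suspected gap concrete, here is a minimal Python sketch of the filtering that the action would need to apply before touching any host. It is not the code from the review above: the function name filter_blacklisted, the server dictionaries, and the hostnames are all hypothetical, and mapping 192.168.24.13 to a networker node is an assumption made only for illustration.

```python
def filter_blacklisted(servers, blacklist):
    """Return only the servers whose name is not in the blacklist.

    Both arguments are hypothetical shapes chosen for this sketch:
    servers is a list of dicts with "name" and "ip" keys, and
    blacklist is a list of server names (or None for no blacklist).
    """
    blocked = set(blacklist or [])
    return [s for s in servers if s["name"] not in blocked]


# Illustrative inventory; names and addresses are assumptions.
servers = [
    {"name": "controller-0", "ip": "192.168.24.10"},
    {"name": "networker-0", "ip": "192.168.24.13"},
    {"name": "compute-0", "ip": "192.168.24.20"},
]

# Only the non-blacklisted hosts should be contacted over SSH.
targets = filter_blacklisted(servers, ["networker-0"])
print([s["ip"] for s in targets])
```

If the blacklist were honored this way, a blacklisted networker node would never be reached over SSH during the update, so seeing it in the output would point at the missing filter.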
Marius is currently reproducing the error in CI so we can debug on a live system. Also, this trace is the actual error we need to look at: https://paste.fedoraproject.org/paste/G1taMU87ldBieqX5E6wG~w (found in the mistral executor)
VERIFIED: openstack-tripleo-common-9.3.1-0.20180920204842.el7ost.noarch
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2019:0045