Bug 1966884

Summary: overcloud deployment fails at "Write octavia inventory" task if all controller nodes are blacklisted
Product: Red Hat OpenStack
Reporter: Takashi Kajinami <tkajinam>
Component: openstack-tripleo-heat-templates
Assignee: Brent Eagles <beagles>
Status: CLOSED WONTFIX
QA Contact: Joe H. Rahme <jhakimra>
Severity: medium
Priority: medium
Version: 16.1 (Train)
CC: beagles, frigo, gthiemon, mburns
Target Milestone: zstream
Keywords: Triaged
Target Release: 16.1 (Train on RHEL 8.2)
Hardware: Unspecified
OS: Unspecified
Last Closed: 2022-04-12 13:52:38 UTC
Type: Bug

Description Takashi Kajinami 2021-06-02 06:36:08 UTC
Description of problem:


Currently, overcloud deployment always fails if all controller nodes are blacklisted using the DeploymentServerBlacklist parameter.

~~~
parameter_defaults:
  DeploymentServerBlacklist:
    - controller-0
    - controller-1
    - controller-2
~~~

According to ansible.log, the "Write octavia inventory" task fails because of incomplete data.

/var/lib/mistral/overcloud/ansible.log
~~~
2021-06-02 06:07:42,494 p=25775 u=mistral n=ansible | TASK [Write group_vars file] ***************************************************
2021-06-02 06:07:42,495 p=25775 u=mistral n=ansible | Wednesday 02 June 2021  06:07:42 +0000 (0:00:00.987)       0:08:37.523 ******** 
2021-06-02 06:07:43,259 p=25775 u=mistral n=ansible | ok: [undercloud] => {"changed": false, "checksum": "ac8976750d6624ed9563fc28b6021e03125e3277", "dest": "/var/lib/mistral/overcloud/octavia-ansible/group_vars/octavia_vars.yaml", "gid": 1003, "group": "tripleo-admin", "mode": "0664", "owner": "tripleo-admin", "path": "/var/lib/mistral/overcloud/octavia-ansible/group_vars/octavia_vars.yaml", "secontext": "system_u:object_r:var_lib_t:s0", "size": 1726, "state": "file", "uid": 1002}
2021-06-02 06:07:43,264 p=25775 u=mistral n=ansible | TASK [Write octavia inventory] *************************************************
2021-06-02 06:07:43,264 p=25775 u=mistral n=ansible | Wednesday 02 June 2021  06:07:43 +0000 (0:00:00.769)       0:08:38.293 ******** 
2021-06-02 06:07:43,526 p=25775 u=mistral n=ansible | fatal: [undercloud]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'dict object' has no attribute 'hostname'\n\nThe error appears to be in '/var/lib/mistral/overcloud/external_deploy_steps_tasks_step5.yaml': line 73, column 5, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n    name: Write group_vars file\n  - copy:\n    ^ here\n"}
2021-06-02 06:07:43,526 p=25775 u=mistral n=ansible | NO MORE HOSTS LEFT *************************************************************
2021-06-02 06:07:43,528 p=25775 u=mistral n=ansible | PLAY RECAP *********************************************************************
2021-06-02 06:07:43,528 p=25775 u=mistral n=ansible | compute-0                  : ok=288  changed=110  unreachable=0    failed=0    skipped=167  rescued=0    ignored=0   
2021-06-02 06:07:43,528 p=25775 u=mistral n=ansible | compute-1                  : ok=273  changed=110  unreachable=0    failed=0    skipped=163  rescued=0    ignored=0   
2021-06-02 06:07:43,528 p=25775 u=mistral n=ansible | undercloud                 : ok=103  changed=31   unreachable=0    failed=1    skipped=9    rescued=0    ignored=0
~~~
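
When ansible.log is long, the failing task can be found by pairing each `fatal:` line with the most recent `TASK [...]` banner before it. The following is a minimal sketch that assumes the log line layout shown in the excerpt above; the function name `find_failed_tasks` and both regular expressions are illustrative and are not part of TripleO or Ansible.

```python
import re

# Task banner, e.g. "... n=ansible | TASK [Write octavia inventory] ******"
TASK_RE = re.compile(r"\|\s*TASK \[(?P<name>.+?)\]\s*\*+")
# Failure line, e.g. "... n=ansible | fatal: [undercloud]: FAILED! => {...}"
FATAL_RE = re.compile(r"\|\s*fatal: \[(?P<host>[^\]]+)\]: FAILED!")


def find_failed_tasks(lines):
    """Return (task_name, host) pairs, one per 'fatal' line in the log.

    task_name is the last TASK banner seen before the failure, or None
    if a failure appears before any banner.
    """
    failures = []
    current_task = None
    for line in lines:
        task_match = TASK_RE.search(line)
        if task_match:
            current_task = task_match.group("name")
            continue
        fatal_match = FATAL_RE.search(line)
        if fatal_match:
            failures.append((current_task, fatal_match.group("host")))
    return failures
```

Because the function accepts any iterable of lines, an open file handle for /var/lib/mistral/overcloud/ansible.log can be passed to it directly. On the excerpt above it would report the "Write octavia inventory" task on the undercloud host, even though the "offending line" text inside the error message names the preceding "Write group_vars file" task.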

The issue is not reproduced when I blacklist only one controller node (controller-0).


Version-Release number of selected component (if applicable):
The issue can be reproduced in a fresh RHOSP 16.1.6 deployment.

How reproducible:
Always

Steps to Reproduce:
1. Deploy the overcloud with Octavia enabled.
2. Update the overcloud with all controller nodes blacklisted.

Actual results:
Deployment fails at the "Write octavia inventory" task

Expected results:
Deployment completes without any failure

Additional info:

Comment 12 Brent Eagles 2022-04-12 13:52:38 UTC
I don't think we'll get a resolution other than Takashi's workaround, so I'm going to close this bug. We can re-open it if it needs further scrutiny or development.