Bug 1639038 - When all members of a Ceph group are blacklisted stack update fails due to malformed ceph-ansible inventory
Summary: When all members of a Ceph group are blacklisted stack update fails due to malformed ceph-ansible inventory
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 14.0 (Rocky)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: beta
Target Release: 14.0 (Rocky)
Assignee: Giulio Fidente
QA Contact: Gurenko Alex
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-10-14 16:43 UTC by Gurenko Alex
Modified: 2019-01-11 11:54 UTC
CC: 8 users

Fixed In Version: openstack-tripleo-heat-templates-9.0.1-0.20181013060867.ffbe879.el7ost
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-01-11 11:53:55 UTC
Target Upstream Version:
Embargoed:
agurenko: automate_bug+


Attachments


Links
System ID Private Priority Status Summary Last Updated
Launchpad 1798044 0 None None None 2018-10-16 08:28:15 UTC
OpenStack gerrit 610736 0 None MERGED Skip hosts group in ceph-ansible inventory when all are blacklisted 2020-11-18 18:07:07 UTC
Red Hat Product Errata RHEA-2019:0045 0 None None None 2019-01-11 11:54:03 UTC

Description Gurenko Alex 2018-10-14 16:43:42 UTC
Description of problem: When all Ceph client nodes are blacklisted, stack update fails due to a malformed ceph-ansible inventory file.


Version-Release number of selected component (if applicable): 2018-10-10.1


How reproducible:


Steps to Reproduce:
1. Deploy 3 controllers, 2 computes, 3 ceph topology
2. Blacklist all compute nodes
3. Try to perform stack update

Actual results:

TASK [run nodes-uuid] **********************************************************
Saturday 13 October 2018  00:09:02 -0400 (0:00:00.048)       0:08:06.138 ****** 
fatal: [undercloud]: FAILED! => {"changed": true, "cmd": "ANSIBLE_LOG_PATH=\"/var/lib/mistral/overcloud/ceph-ansible/nodes_uuid_command.log\" ANSIBLE_CONFIG=\"/var/lib/mistral/overcloud/ansible.cfg\" ANSIBLE_REMOTE_TEMP=/tmp/nodes_uuid_tmp ansible-playbook --private-key /var/lib/mistral/overcloud/ssh_private_key -i /var/lib/mistral/overcloud/ceph-ansible/inventory.yml /var/lib/mistral/overcloud/ceph-ansible/nodes_uuid_playbook.yml", "delta": "0:01:58.263777", "end": "2018-10-13 00:11:01.088439", "msg": "non-zero return code", "rc": 4, "start": "2018-10-13 00:09:02.824662", "stderr": " [WARNING]:  * Failed to parse /var/lib/mistral/overcloud/ceph-\nansible/inventory.yml with yaml plugin: Invalid \"hosts\" entry for \"clients\"\ngroup, requires a dictionary, found \"<type 'NoneType'>\" instead.\n [WARNING]:  * Failed to parse /var/lib/mistral/overcloud/ceph-\nansible/inventory.yml with ini plugin: /var/lib/mistral/overcloud/ceph-\nansible/inventory.yml:5: Expected key=value host variable assignment, got:\ntripleo-admin\n [WARNING]:  * Failed to parse /var/lib/mistral/overcloud/ceph-\nansible/inventory.yml with auto plugin: no root 'plugin' key found,\n'/var/lib/mistral/overcloud/ceph-ansible/inventory.yml' is not a valid YAML\ninventory plugin config file\n [WARNING]: Unable to parse /var/lib/mistral/overcloud/ceph-\nansible/inventory.yml as an inventory source\n [WARNING]: No inventory was parsed, only implicit localhost is available", "stderr_lines": [" [WARNING]:  * Failed to parse /var/lib/mistral/overcloud/ceph-", "ansible/inventory.yml with yaml plugin: Invalid \"hosts\" entry for \"clients\"", "group, requires a dictionary, found \"<type 'NoneType'>\" instead.", " [WARNING]:  * Failed to parse /var/lib/mistral/overcloud/ceph-", "ansible/inventory.yml with ini plugin: /var/lib/mistral/overcloud/ceph-", "ansible/inventory.yml:5: Expected key=value host variable assignment, got:", "tripleo-admin", " [WARNING]:  * Failed to parse /var/lib/mistral/overcloud/ceph-", 
"ansible/inventory.yml with auto plugin: no root 'plugin' key found,", "'/var/lib/mistral/overcloud/ceph-ansible/inventory.yml' is not a valid YAML", "inventory plugin config file", " [WARNING]: Unable to parse /var/lib/mistral/overcloud/ceph-", "ansible/inventory.yml as an inventory source", " [WARNING]: No inventory was parsed, only implicit localhost is available"], "stdout": "\nPLAY [all] *********************************************************************\n\nTASK [set nodes data] **********************************************************\nSaturday 13 October 2018  00:09:04 -0400 (0:00:00.098)       0:00:00.098 ****** \nok: [mgrs:]\nok: [hosts:]\nok: [controller-2:]\n\nTASK [register machine id] *****************************************************\nSaturday 13 October 2018  00:09:04 -0400 (0:00:00.084)       0:00:00.182 ****** \nfatal: [hosts:]: UNREACHABLE! => {\"changed\": false, \"msg\": \"SSH Error: data could not be sent to remote host \\\"hosts:\\\". Make sure this host can be reached over ssh\", \"unreachable\": true}\nfatal: [controller-2:]: UNREACHABLE! => {\"changed\": false, \"msg\": \"SSH Error: data could not be sent to remote host \\\"controller-2:\\\". Make sure this host can be reached over ssh\", \"unreachable\": true}\nfatal: [mgrs:]: UNREACHABLE! => {\"changed\": false, \"msg\": \"SSH Error: data could not be sent to remote host \\\"mgrs:\\\". 
Make sure this host can be reached over ssh\", \"unreachable\": true}\n\nPLAY RECAP *********************************************************************\ncontroller-2:              : ok=1    changed=0    unreachable=1    failed=0   \nhosts:                     : ok=1    changed=0    unreachable=1    failed=0   \nmgrs:                      : ok=1    changed=0    unreachable=1    failed=0   \n\nSaturday 13 October 2018  00:11:01 -0400 (0:01:56.496)       0:01:56.678 ****** \n=============================================================================== ", "stdout_lines": ["", "PLAY [all] *********************************************************************", "", "TASK [set nodes data] **********************************************************", "Saturday 13 October 2018  00:09:04 -0400 (0:00:00.098)       0:00:00.098 ****** ", "ok: [mgrs:]", "ok: [hosts:]", "ok: [controller-2:]", "", "TASK [register machine id] *****************************************************", "Saturday 13 October 2018  00:09:04 -0400 (0:00:00.084)       0:00:00.182 ****** ", "fatal: [hosts:]: UNREACHABLE! => {\"changed\": false, \"msg\": \"SSH Error: data could not be sent to remote host \\\"hosts:\\\". Make sure this host can be reached over ssh\", \"unreachable\": true}", "fatal: [controller-2:]: UNREACHABLE! => {\"changed\": false, \"msg\": \"SSH Error: data could not be sent to remote host \\\"controller-2:\\\". Make sure this host can be reached over ssh\", \"unreachable\": true}", "fatal: [mgrs:]: UNREACHABLE! => {\"changed\": false, \"msg\": \"SSH Error: data could not be sent to remote host \\\"mgrs:\\\". 
Make sure this host can be reached over ssh\", \"unreachable\": true}", "", "PLAY RECAP *********************************************************************", "controller-2:              : ok=1    changed=0    unreachable=1    failed=0   ", "hosts:                     : ok=1    changed=0    unreachable=1    failed=0   ", "mgrs:                      : ok=1    changed=0    unreachable=1    failed=0   ", "", "Saturday 13 October 2018  00:11:01 -0400 (0:01:56.496)       0:01:56.678 ****** ", "=============================================================================== "]}


Expected results:

Stack update completes successfully.

Additional info:
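For reference, the yaml-plugin warning in the output above ('Invalid "hosts" entry for "clients" group, requires a dictionary, found NoneType') is what Ansible reports when a group is written out with a bare `hosts:` key. An illustrative reconstruction of such a malformed inventory follows (not the actual generated file; host names are hypothetical):

```yaml
# Malformed: every member of "clients" was blacklisted, so the group
# was still emitted but with an empty "hosts:" key. YAML parses that
# value as null (NoneType), while the yaml inventory plugin requires
# a mapping of hostnames, hence the parse failure.
clients:
  hosts:
mgrs:
  hosts:
    controller-0:
    controller-1:
    controller-2:
```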

Comment 2 Giulio Fidente 2018-10-15 09:43:29 UTC
Do I understand correctly this is only an issue when all clients (computes) are blacklisted?

Comment 3 Gurenko Alex 2018-10-15 10:52:43 UTC
(In reply to Giulio Fidente from comment #2)
> Do I understand correctly this is only an issue when all clients (computes)
> are blacklisted?

Cannot say for sure right now, I will try to run this job with only 1 node blacklisted and see the result.

Comment 4 Marius Cornea 2018-10-15 12:57:08 UTC
(In reply to Giulio Fidente from comment #2)
> Do I understand correctly this is only an issue when all clients (computes)
> are blacklisted?

Yes, that's an issue only when all clients are blacklisted.
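The merged fix (gerrit 610736, "Skip hosts group in ceph-ansible inventory when all are blacklisted") takes the approach its title describes: omit a group from the generated inventory entirely rather than emitting it with an empty hosts mapping. A minimal Python sketch of that idea, with hypothetical function and variable names (this is not the actual tripleo-heat-templates code):

```python
def build_inventory(groups, blacklist):
    """Build an Ansible-style inventory dict from {group: {host: vars}},
    skipping any group whose members are all in the blacklist.

    Emitting a group with no hosts would produce a bare "hosts:" key
    (parsed as NoneType), which the yaml inventory plugin rejects."""
    inventory = {}
    for group, hosts in groups.items():
        remaining = {h: v for h, v in hosts.items() if h not in blacklist}
        if not remaining:
            continue  # skip the group entirely instead of "hosts: null"
        inventory[group] = {"hosts": remaining}
    return inventory
```

With this approach, blacklisting every compute simply drops the "clients" group from the inventory, and the remaining groups still parse cleanly.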

Comment 19 Gurenko Alex 2018-11-12 09:17:18 UTC
We have a passing job in CI that was used to catch this issue.

Verified on puddle 2018-11-07.2

Comment 24 errata-xmlrpc 2019-01-11 11:53:55 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:0045

