Bug 1509157 - Unable to list known health checks
Summary: Unable to list known health checks
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.7.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 3.7.z
Assignee: Luke Meyer
QA Contact: Wenkai Shi
URL:
Whiteboard:
Depends On:
Blocks: 1538407
Reported: 2017-11-03 08:29 UTC by Wenkai Shi
Modified: 2018-04-05 09:31 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: In order for the adhoc.yml playbook to list health checks, they are all loaded, and one failed to load in that environment. Consequence: Listing health checks failed with an error. Fix: The problem health check was adjusted. Result: Listing works again.
Clone Of:
Clones: 1538407
Environment:
Last Closed: 2018-04-05 09:30:40 UTC
Target Upstream Version:


Links
System                  ID              Priority  Status  Summary  Last Updated
Red Hat Product Errata  RHBA-2018:0636  None      None    None     2018-04-05 09:31:32 UTC

Description Wenkai Shi 2017-11-03 08:29:50 UTC
Description of problem:
Unable to list known health checks

Version-Release number of the following components:
openshift-ansible-3.7.0-0.190.0.git.0.129e91a.el7

How reproducible:
100%

Steps to Reproduce:
1. Run /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-checks/adhoc.yml playbook.

Actual results:
# grep -nir "openshift_deployment_type" hosts
34:openshift_deployment_type=openshift-enterprise

# ansible-playbook -i hosts -v /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-checks/adhoc.yml
...
TASK [List known health checks] **************************************************************************************************************************************************************
fatal: [localhost]: FAILED! => {"changed": false, "failed": true, "msg": "This check expects the 'openshift_deployment_type' inventory variable to be defined\nin order to proceed, but it is undefined. There may be a bug\nin Ansible, the checks, or their dependencies.", "playbook_context": "adhoc"}
	to retry, use: --limit @/usr/share/ansible/openshift-ansible/playbooks/byo/openshift-checks/adhoc.retry

PLAY RECAP ***********************************************************************************************************************************************************************************
localhost                  : ok=1    changed=0    unreachable=0    failed=1   
...

Expected results:
Should list known health checks successfully
Additional info:

Comment 1 Wenkai Shi 2017-11-03 09:37:14 UTC
Same result with ansible-2.4.0.0-5.el7 and ansible-2.3.2.0-2.el7.

Comment 2 Tim Bielawa 2017-11-03 14:31:12 UTC
I ran into an issue like this recently while writing up the 'add container provider' playbooks for openshift-management. Specifically, in my playbook it was not picking up variables I had defined in my inventory, just like you have.

I looked quickly at the playbook you referenced, playbooks/byo/openshift-checks/adhoc.yml and confirmed you ran into the same 'issue' as me. That is to say, inventory variables are not being picked up.

This is caused by the definition of the 'hosts' in the adhoc checks playbook:

> - name: OpenShift health checks
>   hosts: localhost
>   connection: local

When `hosts` is 'localhost', I found that Ansible does not consider the provided '-i hosts' inventory file.

I do not know if historically Ansible used to read the inventory file, but I know that the present behavior is to ignore the inventory. I'm tagging Russell to check this out, too.

Comment 4 Luke Meyer 2018-01-15 19:58:38 UTC
The play runs against localhost, and the code aborts because openshift_deployment_type is not defined for localhost. Defining it (or just deployment_type) in the [OSEv3:vars] subsection does not define it for localhost. You can set it for localhost explicitly by adding at the top of the inventory file:

localhost openshift_deployment_type=openshift-enterprise

...or by specifying it as an extra variable. Obviously, you shouldn't have to do that to get this to list checks. Unrelated changes to initialization code broke this path.
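
To illustrate the scoping described above, here is a minimal sketch in Python. It is a deliberately simplified model of INI inventory group variables, not Ansible's actual parser or variable-precedence logic. The host name master.example.com and the helpers parse_inventory and vars_for are hypothetical and exist only for this illustration.

```python
# Simplified model of INI inventory variable scoping (NOT Ansible's real code).
# The host "master.example.com" is a made-up example.
INVENTORY = """\
[OSEv3:children]
masters

[OSEv3:vars]
openshift_deployment_type=openshift-enterprise

[masters]
master.example.com
"""


def parse_inventory(text):
    """Parse a tiny subset of the INI inventory format."""
    groups = {}      # group name -> set of host names
    children = {}    # group name -> set of child group names
    group_vars = {}  # group name -> dict of variables
    host_vars = {}   # host name -> dict of inline variables
    section, kind = "ungrouped", "hosts"
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            # "[OSEv3:vars]" -> section "OSEv3", kind "vars"
            section, _, suffix = line[1:-1].partition(":")
            kind = suffix or "hosts"
        elif kind == "vars":
            key, _, value = line.partition("=")
            group_vars.setdefault(section, {})[key.strip()] = value.strip()
        elif kind == "children":
            children.setdefault(section, set()).add(line)
        else:
            # Host line: a name followed by optional key=value pairs.
            name, *pairs = line.split()
            groups.setdefault(section, set()).add(name)
            host_vars.setdefault(name, {}).update(p.split("=", 1) for p in pairs)
    return groups, children, group_vars, host_vars


def vars_for(host, inventory):
    """Merge variables of every group containing the host, then host variables."""
    groups, children, group_vars, host_vars = inventory

    def contains(group):
        return host in groups.get(group, ()) or any(
            contains(child) for child in children.get(group, ()))

    merged = {}
    for group, values in group_vars.items():
        if contains(group):
            merged.update(values)
    merged.update(host_vars.get(host, {}))
    return merged


inv = parse_inventory(INVENTORY)
print(vars_for("master.example.com", inv))  # member of OSEv3 via masters
print(vars_for("localhost", inv))           # not in OSEv3, so no group variables

# The workaround: define the variable on a localhost line at the top.
fixed = parse_inventory(
    "localhost openshift_deployment_type=openshift-enterprise\n" + INVENTORY)
print(vars_for("localhost", fixed))
```

In this model, localhost receives nothing from [OSEv3:vars] because it is not a member of that group, which matches the failure in the report. Adding the explicit localhost line gives it the variable directly.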

Comment 6 Luke Meyer 2018-01-18 12:33:26 UTC
Merged to master
3.7 backport https://github.com/openshift/openshift-ansible/pull/6772

Comment 8 Wenkai Shi 2018-01-25 02:47:50 UTC
Verified with openshift-ansible-3.7.26-1.git.0.f87f1af.el7: known health checks can be listed with command [1].

[1]. ansible-playbook -i hosts -v /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-checks/adhoc.yml

...
     Message:  This playbook is meant to run health checks, but no checks were requested. Set the `openshift_checks` variable to a comma-separated list of check names or a YAML list. Available checks:
                 curator
                 diagnostics
                 disk_availability
                 docker_image_availability
                 docker_storage
                 elasticsearch
                 etcd_imagedata_size
                 etcd_traffic
                 etcd_volume
                 fluentd
                 fluentd_config
                 kibana
                 logging
                 logging_index_time
                 memory_availability
                 ovs_version
                 package_availability
                 package_update
                 package_version
               
               Tags can be used as a shortcut to select multiple checks. Available tags and the checks they select:
                 @etcd = etcd_imagedata_size,etcd_traffic,etcd_volume
                 @health = curator,diagnostics,docker_storage,elasticsearch,etcd_traffic,etcd_volume,fluentd,fluentd_config,kibana,logging_index_time,ovs_version
                 @logging = curator,elasticsearch,fluentd,kibana,logging_index_time
                 @preflight = disk_availability,docker_image_availability,docker_storage,memory_availability,package_availability,package_update,package_version
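
As a rough illustration of how a value for `openshift_checks` that mixes check names and tags could expand into a list of checks, here is a small Python sketch. The tag table is copied from the output above (two tags only, for brevity). The function resolve_checks is hypothetical and is not taken from openshift-ansible; the real resolution logic may differ.

```python
# Hypothetical illustration only; not the openshift-ansible implementation.
# Tag contents are copied from the "Available tags" output above.
TAGS = {
    "etcd": ["etcd_imagedata_size", "etcd_traffic", "etcd_volume"],
    "logging": ["curator", "elasticsearch", "fluentd", "kibana",
                "logging_index_time"],
}


def resolve_checks(requested):
    """Expand a comma-separated string or a list of names and @tags."""
    if isinstance(requested, str):
        requested = requested.split(",")
    resolved = []
    for name in (item.strip() for item in requested):
        if not name:
            continue
        # "@etcd" expands to every check under that tag; a plain name stays as-is.
        expanded = TAGS[name[1:]] if name.startswith("@") else [name]
        for check in expanded:
            if check not in resolved:  # keep first occurrence, drop duplicates
                resolved.append(check)
    return resolved


print(resolve_checks("@etcd,disk_availability"))
print(resolve_checks(["etcd_volume", "@etcd"]))
```

In this sketch an unknown tag raises KeyError, and a check that is both named directly and selected by a tag appears only once.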

Comment 12 errata-xmlrpc 2018-04-05 09:30:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0636

