Bug 1616047 - [3.11.0-0.14.0] logging installation using openshift ansible logging playbook fails.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 3.11.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 3.11.0
Assignee: ewolinet
QA Contact: Anping Li
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-08-14 19:35 UTC by Siva Reddy
Modified: 2018-10-11 07:25 UTC

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
When determining which nodes need a sysctl set as part of installing an ES5 cluster, we were not excluding etcd hosts. Those hosts may not be nodes, so they can be missing the fact we use to check whether a node matches an Ansible host in the inventory.
Clone Of:
Environment:
Last Closed: 2018-10-11 07:24:57 UTC
Target Upstream Version:
Embargoed:


Attachments
The ansible log with -vvv (172.70 KB, text/plain), 2018-08-15 14:11 UTC, Siva Reddy
inventory file (6.33 KB, text/plain), 2018-08-15 14:39 UTC, Siva Reddy


Links
System: Red Hat Product Errata
ID: RHBA-2018:2652
Private: 0
Priority: None
Status: None
Summary: None
Last Updated: 2018-10-11 07:25:28 UTC

Description Siva Reddy 2018-08-14 19:35:41 UTC
Description of problem:
    The install fails when trying to install logging on a 3.11 OCP cluster.

Version-Release number of selected component (if applicable):
oc v3.11.0-0.14.0
kubernetes v1.11.0+d4cacc0
features: Basic-Auth GSSAPI Kerberos SPNEGO

openshift v3.11.0-0.14.0
kubernetes v1.11.0+d4cacc0

How reproducible:
Always

Steps to Reproduce:
1. Install logging using openshift-ansible playbook.
ansible-playbook -i inventory openshift-ansible/playbooks/openshift-logging/config.yml

-- inventory file
[OSEv3:vars]
deployment_type=openshift-enterprise

openshift_deployment_type=openshift-enterprise
openshift_release=v3.11
openshift_docker_additional_registries=registry.reg-aws.openshift.com

oreg_url=registry.reg-aws.openshift.com:443/openshift3/ose-${component}:${version}
openshift_logging_image_version=v3.11
openshift_logging_install_logging=true
openshift_logging_es_cluster_size=1
openshift_logging_es_pvc_dynamic=true
openshift_logging_es_pvc_size=50Gi
openshift_logging_es_pvc_storage_class_name=gp2
openshift_logging_fluentd_read_from_head=false
openshift_logging_curator_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_logging_kibana_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_logging_es_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_logging_es_memory_limit=8Gi
openshift_logging_install_eventrouter=true


Actual results:
   The install fails at task
TASK [Evaluate oo_elasticsearch_nodes] **************************************************************************************************************************************
skipping: [ec2-34-217-107-181.us-west-2.compute.amazonaws.com] => (item=ec2-34-217-107-181.us-west-2.compute.amazonaws.com) 
fatal: [ec2-34-217-107-181.us-west-2.compute.amazonaws.com]: FAILED! => {"msg": "The conditional check 'hostvars[item]['openshift']['common']['ip'] in openshift_logging_elasticsearch_hosts' failed. The error was: error while evaluating conditional (hostvars[item]['openshift']['common']['ip'] in openshift_logging_elasticsearch_hosts): 'ansible.vars.hostvars.HostVarsVars object' has no attribute 'openshift'\n\nThe error appears to have been in '/root/openshift-ansible/playbooks/openshift-logging/private/config.yml': line 82, column 7, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n    # files for specific nodes based on <node>.status.addresses[@.type==InternalIP].address\n    - name: Evaluate oo_elasticsearch_nodes\n      ^ here\n"}







Comment 1 Rich Megginson 2018-08-14 20:15:40 UTC
Eric - any ideas?

Comment 2 ewolinet 2018-08-14 22:26:21 UTC
Can you provide the full output of running the logging playbook with -vvv?

Comment 3 Siva Reddy 2018-08-15 14:11:22 UTC
Created attachment 1476156
The ansible log with -vvv

Comment 4 ewolinet 2018-08-15 14:33:15 UTC
The fact collection looks to have those structures in it...

Siva,
What does your inventory group section look like ([masters], [nodes], etc.)?

Comment 5 Siva Reddy 2018-08-15 14:39:37 UTC
Created attachment 1476167
inventory file

Comment 6 ewolinet 2018-08-15 14:58:04 UTC
I think that task is failing when it's evaluating the etcd group, since an etcd host is not a node. If so, the fix would be to iterate only over the list of nodes, not groups['OSEv3'].

Will try to recreate this locally.
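
The suspected failure mode can be modelled outside Ansible. The following is a minimal Python sketch, not the playbook code or the eventual fix: the host names, IP addresses, and the helper evaluate_es_nodes are hypothetical, and plain dictionaries stand in for Ansible's hostvars and groups. It only illustrates why a lookup of hostvars[item]['openshift']['common']['ip'] breaks on a host that never gathered node facts, and why restricting the loop to node hosts avoids that.

```python
# Toy model of the inventory. Host names and IPs are hypothetical,
# not taken from the attached inventory file.
groups = {
    "OSEv3": ["master-1", "node-1", "etcd-1"],
    "nodes": ["master-1", "node-1"],
}

# Node hosts carry the openshift.common.ip fact; the standalone etcd host does not.
hostvars = {
    "master-1": {"openshift": {"common": {"ip": "10.0.0.1"}}},
    "node-1": {"openshift": {"common": {"ip": "10.0.0.2"}}},
    "etcd-1": {},
}

openshift_logging_elasticsearch_hosts = ["10.0.0.2"]


def evaluate_es_nodes(hosts):
    """Return the hosts whose node IP appears in the Elasticsearch host list."""
    return [
        host
        for host in hosts
        if hostvars[host]["openshift"]["common"]["ip"]
        in openshift_logging_elasticsearch_hosts
    ]


# Iterating over every host in OSEv3 reaches the etcd host, which has no
# 'openshift' fact, so the lookup raises (the analogue of the Ansible error).
try:
    evaluate_es_nodes(groups["OSEv3"])
    failed = False
except KeyError:
    failed = True

# Iterating only over node hosts never touches the host with the missing fact.
es_nodes = evaluate_es_nodes(groups["nodes"])
```

In this model the first call fails on etcd-1 while the second returns only the matching node, which is the behaviour the proposed change to the loop is meant to produce.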

Comment 7 Siva Reddy 2018-08-15 15:05:31 UTC
Eric, yes, that looks like the case. I made the etcd group empty and re-ran the playbook. It finished successfully.

Comment 8 ewolinet 2018-08-15 15:44:38 UTC
Siva, that's what I observed locally as well. The plan is to update the loop to only iterate over the list of node hosts; I will open up a PR with the fix afterwards.

Comment 11 Siva Reddy 2018-08-23 16:37:50 UTC
Verified this bug on

openshift v3.11.0-0.20.0
oc v3.11.0-0.19.0
kubernetes v1.11.0+d4cacc0
features: Basic-Auth GSSAPI Kerberos SPNEGO

openshift-ansible git log
# cd openshift-ansible
# git log
commit d003bdf30c23a47f485f9f1f394a3d247c88da06
Merge: ac09fc2 921df2b
Author: OpenShift Merge Robot <openshift-merge-robot.github.com>
Date:   Thu Aug 23 08:56:20 2018 -0700


      The test was performed on an OCP cluster with split etcd and master, where the etcd host was not a node. The installation completed successfully.

# oc project 
Using project "openshift-logging" on server 
# oc get pods
NAME                                      READY     STATUS    RESTARTS   AGE
logging-es-data-master-lzw28btf-1-5xpt7   2/2       Running   0          6m
logging-fluentd-4lz76                     1/1       Running   0          6m
logging-fluentd-gl8hm                     1/1       Running   0          6m
logging-fluentd-n4dfw                     1/1       Running   0          6m
logging-fluentd-qqn2j                     1/1       Running   0          6m
logging-kibana-1-vc9v7                    2/2       Running   0          7m

Comment 13 errata-xmlrpc 2018-10-11 07:24:57 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2652

