Bug 1547210

Summary:	3.9.0-0.46.0 logging deploy fails when openshift_logging_es_nodeselector not specified
Product:	OpenShift Container Platform	Reporter:	Mike Fiedler <mifiedle>
Component:	Installer	Assignee:	Vadim Rutkovsky <vrutkovs>
Status:	CLOSED ERRATA	QA Contact:	Mike Fiedler <mifiedle>
Severity:	medium	Docs Contact:
Priority:	unspecified
Version:	3.9.0	CC:	anli, aos-bugs, jcantril, jokerman, juzhao, mmccomas, pruan, rmeggins, vrutkovs
Target Milestone:	---	Keywords:	Reopened
Target Release:	3.9.0
Hardware:	x86_64
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:	Cause: Default nodeselector for elasticsearch was not set Consequence: Installation with default settings and enabled logging failed Fix: Correct handling of a default nodeselector was implemented Result: Logging install completes without a nodeselector set	Story Points:	---
Clone Of:		Environment:
Last Closed:	2018-12-13 19:26:51 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description Mike Fiedler 2018-02-20 18:15:32 UTC

Description of problem:

The latest logging installer is requiring openshift_logging_es_nodeselector where it was not required before.  The default should be to allow ES to schedule normally.   The syntax of this variable is also a JSON map which would not be very friendly if this really is a required inventory var.

TASK [openshift_logging_elasticsearch : Ensure that ElasticSearch has nodes to run on] *****************************************************
fatal: [ip-172-31-35-49]: FAILED! => {"msg": "The conditional check 'openshift_schedulable_node_labels | lib_utils_oo_has_no_matching_selector(openshift_logging_es_nodeselector)' failed. The error was: error while evaluating conditional (openshift_schedulable_node_labels | lib_utils_oo_has_no_matching_selector(openshift_logging_es_nodeselector)): {{ groups['oo_nodes_to_config'] | lib_utils_oo_get_node_labels(hostvars) }}: 'dict object' has no attribute 'oo_nodes_to_config'\n\nThe error appears to have been in '/usr/share/ansible/openshift-ansible/roles/openshift_logging_elasticsearch/tasks/main.yaml': line 2, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n---\n- name: Ensure that ElasticSearch has nodes to run on\n  ^ here\n"}



Version-Release number of selected component (if applicable): openshift-ansible and logging v3.9.0-0.46.0


How reproducible: Always


Steps to Reproduce:
1. With openshift-ansible 3.9.0-0.46.0, install logging with the inventory below.


Actual results:

Install fails with the error above.

Expected results:

Install succeeds as in past 3.9 builds (and past releases)

Additional info:

[OSEv3:children]
masters
etcd

[masters]
ip-172-31-35-49

[etcd]
ip-172-31-35-49

[OSEv3:vars]
deployment_type=openshift-enterprise

openshift_deployment_type=openshift-enterprise
openshift_release=v3.9
openshift_docker_additional_registries=registry.reg-aws.openshift.com


openshift_logging_install_logging=true
openshift_logging_master_url=https://ec2-54-201-71-102.us-west-2.compute.amazonaws.com:8443
openshift_logging_master_public_url=https://ec2-54-201-71-102.us-west-2.compute.amazonaws.com:8443
openshift_logging_kibana_hostname=kibana.apps.0220-y8n.qe.rhcloud.com
openshift_logging_namespace=logging
openshift_logging_image_prefix=registry.reg-aws.openshift.com:443/openshift3/
openshift_logging_image_version=v3.9.0-0.46.0
openshift_logging_es_cluster_size=3
openshift_logging_es_pvc_dynamic=true
openshift_logging_es_pvc_size=50Gi
openshift_logging_es_pvc_storage_class_name=gp2
openshift_logging_fluentd_read_from_head=false
openshift_logging_use_mux=false

Comment 1 Jeff Cantrill 2018-02-20 20:42:54 UTC

@Vadum,

Who advised this change was required?  From the logging team perspective, it is unnecessary to define specific nodes upon which to run the logging pods.  Is this a requirement driven by someone else?

Comment 2 Scott Dodson 2018-02-20 20:59:57 UTC

Mike,

Do you have nodes defined or is that an incomplete inventory pasted in comment 0?

Comment 3 Mike Fiedler 2018-02-20 21:06:52 UTC

Apologies, incomplete inventory.   Full inventory:

[OSEv3:children]                                                      
masters                                                               
etcd                                                                  

[masters]                                                             
ip-172-31-35-49                    

[etcd]                                                                
ip-172-31-35-49                    

[OSEv3:vars]                                                          
deployment_type=openshift-enterprise                                  

openshift_deployment_type=openshift-enterprise                                                                                              
openshift_release=v3.9                                                
openshift_docker_additional_registries=registry.reg-aws.openshift.com                                                                       


openshift_logging_install_logging=true                                
openshift_logging_master_url=https://ec2-54-201-71-102.us-west-2.compute.amazonaws.com:8443                                                 
openshift_logging_master_public_url=https://ec2-54-201-71-102.us-west-2.compute.amazonaws.com:8443                                          
openshift_logging_kibana_hostname=kibana.apps.0220-y8n.qe.rhcloud.com                                                                       
openshift_logging_namespace=logging                                   
openshift_logging_image_prefix=registry.reg-aws.openshift.com:443/openshift3/                                                               
openshift_logging_image_version=v3.9                                  
openshift_logging_es_cluster_size=3                                   
openshift_logging_es_pvc_dynamic=true                                 
openshift_logging_es_pvc_size=50Gi                                    
openshift_logging_es_pvc_storage_class_name=gp2                                                                                             
openshift_logging_fluentd_read_from_head=false                                                                                              
openshift_logging_use_mux=false

Comment 5 Mike Fiedler 2018-02-20 21:11:22 UTC

Sorry - disregard comment 3.  So I should have an actual [nodes] section now for logging install?   Trying that.

Comment 6 Mike Fiedler 2018-02-20 21:21:23 UTC

It works with a [nodes] section.

Comment 7 Vadim Rutkovsky 2018-02-20 21:54:59 UTC

Sorry, I didn't expect this change to cause so many side-effects. The idea behind this is to stop the component install before it would hang up due to incorrect nodeselector or unschedulable nodes. The code now reads ansible config to find out which labels do the nodes have. This would change after https://github.com/openshift/openshift-ansible/pull/7172 is merged - openshift-ansible would dynamically find out node labels, so there would be no need to add '[nodes]' group and specify all node labels

Comment 8 Mike Fiedler 2018-02-21 12:17:57 UTC

Re-opening and assigning to myself to verify once https://github.com/openshift/openshift-ansible/pull/7172 merges

Comment 9 Jeff Cantrill 2018-02-21 14:18:14 UTC

*** Bug 1547375 has been marked as a duplicate of this bug. ***

Comment 10 Jeff Cantrill 2018-02-22 15:12:00 UTC

*** Bug 1547972 has been marked as a duplicate of this bug. ***

Comment 11 Mike Fiedler 2018-02-22 19:13:10 UTC

I am assigning this back for the time being.   This has had a major impact on the functional QE teams automation and seems to be a breaking change after feature freeze.   If the pull in comment 7 fixes this we can move it back to ON_QA.   Is that targeted for 3.9.0?

Comment 12 Scott Dodson 2018-02-28 20:19:32 UTC

https://github.com/openshift/openshift-ansible/pull/7241 in openshift-ansible-3.9.1-1

Comment 13 Anping Li 2018-03-06 07:25:41 UTC

The logging can be deployed without the openshift_logging_es_nodeselector with openshift3/ose-ansible/images/v3.9.2-1, So move bug to verified.

Comment 16 errata-xmlrpc 2018-12-13 19:26:51 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3748