Bug 1491636 - [3.6] openshift_logging_es_ops_nodeselector didn't take affect
Summary: [3.6] openshift_logging_es_ops_nodeselector didn't take affect
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 3.6.1
Hardware: Unspecified
OS: Unspecified
unspecified
medium
Target Milestone: ---
: 3.6.z
Assignee: Jan Wozniak
QA Contact: Anping Li
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2017-09-14 10:21 UTC by Anping Li
Modified: 2020-01-31 19:03 UTC (History)
10 users (show)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
undefined
Clone Of:
Environment:
Last Closed: 2017-11-21 05:41:13 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
The inventory file I used (849 bytes, application/x-gzip)
2017-10-23 13:32 UTC, Anping Li
no flags Details


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2017:3255 0 normal SHIPPED_LIVE OpenShift Container Platform atomic-openshift-utils bug fix and enhancement 2017-11-21 10:40:59 UTC

Description Anping Li 2017-09-14 10:21:33 UTC
Description of problem:
When specify both  openshift_logging_es_ops_nodeselector and openshift_logging_es_nodeselector.
openshift_logging_es_ops_nodeselector wasn't take affect, the nodeSelector for es-ops deploymentconfig is from openshift_logging_es_nodeselector.



Version-Release number of selected component (if applicable):
openshift-ansible-3.6.173.0.33

How reproducible:
always

Steps to Reproduce:
1. deploy logging using the following items
openshift_logging_image_version=v3.6
openshift_logging_namespace=logging
openshift_logging_use_ops=true
openshift_logging_es_cluster_size=1
openshift_logging_es_ops_cluster_size=1

openshift_logging_es_nodeselector={'logging-node':'1'}
openshift_logging_es_ops_nodeselector={'logging-node':'2'}


Actual results:

The nodeSelector for es-ops deploymentconfig is from openshift_logging_es_nodeselector.

#  oc get dc logging-es-data-master-7yz1mgn3  -o yaml |grep -A 2 nodeSelector
      nodeSelector:
        logging-node: "1"
      restartPolicy: Always
#  oc get dc logging-es-ops-data-master-5d2wii6h  -o yaml |grep -A 2 nodeSelector
      nodeSelector:
        logging-node: "1"
      restartPolicy: Always



Expected results:


Additional info:

Comment 1 Jan Wozniak 2017-09-14 12:55:48 UTC
And probably not only the nodeselector, looks like bunch of ansible variables openshift_logging_es_ops_* are not picked up based on the component but always the openshift_logging_es_* ones.

https://github.com/openshift/openshift-ansible/blob/c515805a1ecfa980bc56af4eb58c5655fde6be44/roles/openshift_logging_elasticsearch/tasks/main.yaml#L271-L285

I will try to work on a fix

Comment 2 Jan Wozniak 2017-09-14 15:57:32 UTC
The only relevant config not honored is "openshift_logging_es_nodeselector".

Others not forwarded are "openshift_logging_es_ca_ext" and "openshift_logging_es_ops_storage_group" but they are not used anywhere. 

PR with a fix pending:
https://github.com/openshift/openshift-ansible/pull/5416

Comment 3 openshift-github-bot 2017-09-19 21:32:14 UTC
Commits pushed to master at https://github.com/openshift/openshift-ansible

https://github.com/openshift/openshift-ansible/commit/3566f8c0e8ec12995096b3ea9256266c252f3fb8
Bug 1491636 - honor openshift_logging_es_ops_nodeselector

https://github.com/openshift/openshift-ansible/commit/a01ee6767c673fe2ee144eb851a5488fcf40f2ec
Merge pull request #5416 from wozniakjan/bug1491636/honor_ops_nodeselector

Automatic merge from submit-queue

Bug 1491636 - honor openshift_logging_es_ops_nodeselector

https://bugzilla.redhat.com/show_bug.cgi?id=1491636

Comment 5 Anping Li 2017-09-29 07:49:24 UTC
No nodeselector was configured in es-ops deploymentconfig. 

1) set variables
openshift_logging_es_nodeselector={'logging-node':'1'}
openshift_logging_es_ops_nodeselector={'logging-node':'2'}

2) check the nodeSelector
# oc get dc logging-es-data-master-8xc21lmr -o json |grep -A 2 nodeSelector
                "nodeSelector": {
                    "logging-node": "1"
                },
# oc get dc logging-es-ops-data-master-9iujvkxc -o json |grep -A 2 nodeSelector

Comment 6 Anping Li 2017-09-29 07:50:11 UTC
I was using openshift-ansible-3.7.0-0.134.0.git.0.6f43fc3.el7.noarch for comment 5

Comment 7 Anping Li 2017-09-29 07:58:58 UTC
The variable was not honored openshift_logging_es_ops_memory_limit=2Gi too.

Comment 8 Jan Wozniak 2017-10-02 11:45:12 UTC
There was another PR [1] to openshift-ansible setting certain variables as 'facts'  due to the deprecation of '*_hosted_*' vars. Among those is the openshift_logging_es_ops_nodeselector.

Due to ansible variable precedence, the only way to override facts is with passing --extra-var on the command line [2]. Setting openshift_logging_es_ops_nodeselector in the inventory no longer sets it sufficiently because it will be overwritten when ansible sets the fact.

Currently, for the testing purposes, you can run the playbook as follows:
$ ansible-playbook playbooks/byo/openshift-cluster/openshift-logging.yml --extra-vars "openshift_logging_es_ops_nodeselector={'ns_key':'ns_val'}"


Maybe ewolinetz or sdodson could explain what is the rationale behind the facts setting and what is the expected behavior there?



[1] https://github.com/openshift/openshift-ansible/pull/5176

[2] http://docs.ansible.com/ansible/latest/playbooks_variables.html#variable-precedence-where-should-i-put-a-variable

Comment 9 Jeff Cantrill 2017-10-05 20:18:37 UTC
Moving this to on_qa.  Per comment #6 shouldnt we be using a v3.6 of openshift-ansible instead of 3.7?

Comment 10 Anping Li 2017-10-11 08:35:17 UTC
Jeff, Yes, the issue was pull in due to the deprecation of '*_hosted_*' vars . The openshift_logging_es_ops_nodeselector works with the released openshift-ansible-v3.6.173.0.21-17.

It still won't work with latest available puddle pacakge openshift-ansible-3.6.173.0.45 even with --extra-vars  "openshift_logging_es_ops_nodeselector={'logging-node':'2'}


Test Result:

With latest v3.6 packages openshift-ansible-3.6.173.0.45. The nodeSelector still use values from openshift_logging_es_nodeselector.

ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/openshift-logging.yml --extra-vars  "openshift_logging_es_ops_nodeselector={'logging-node':'2'}"


$ oc get dc logging-es-ops-data-master-15qoianl -o json |grep -A 1 Select
                "nodeSelector": {
                    "logging-node": "1"

Comment 11 Anping Li 2017-10-11 08:37:29 UTC
(In reply to Jeff Cantrill from comment #9)
> Moving this to on_qa.  Per comment #6 shouldnt we be using a v3.6 of
> openshift-ansible instead of 3.7?

I had expected there is fix in v3.7. so I tested with v3.7 packages.

Comment 12 ewolinet 2017-10-11 14:05:25 UTC
(In reply to Anping Li from comment #10)
> Jeff, Yes, the issue was pull in due to the deprecation of '*_hosted_*' vars
> . The openshift_logging_es_ops_nodeselector works with the released
> openshift-ansible-v3.6.173.0.21-17.

To clarify, the *_hosted_* vars are not actually deprecated yet, they are still used. However we will warn users if they are being set in v3.7

Also, we only warn about variables that would have 'hosted' in the variable name. So openshift_logging_es_ops_nodeselector should not be impacted by deprecated vars. 




How are you setting the openshift_logging_es_nodeselector? If it is with -e on the command line, that will overwrite the openshift_logging role trying to set it to openshift_logging_es_ops_nodeselector when calling openshift_logging_elasticsearch for an ops install.

Comment 13 Anping Li 2017-10-23 13:29:28 UTC
Forgive me use v3.7 as example. The v3.7 have same problem. I set the inventory variable in inventory file as following. 


openshift_logging_install_logging=true.
openshift_logging_image_version=v3.7
openshift_logging_namespace=logging
openshift_logging_es_cluster_size=1
openshift_logging_es_memory_limit=2Gi
openshift_logging_es_nodeselector={'logging-node': '1'}

openshift_logging_use_ops=true
openshift_logging_es_ops_cluster_size=1
openshift_logging_es_ops_nodeselector={'logging-node':'2'}

Comment 14 Anping Li 2017-10-23 13:32:13 UTC
Created attachment 1342209 [details]
The inventory file I used

Comment 15 Jan Wozniak 2017-10-24 11:27:24 UTC
created two more PRs to unify behavior between master and 3.6

1) https://github.com/openshift/openshift-ansible/pull/5857 - remove setting of nodeselector as 'fact' to allow overriding default value in the inventory

2) https://github.com/openshift/openshift-ansible/pull/5858 - adding missing statement forwarding nodeselector to the DC

Comment 17 openshift-github-bot 2017-10-25 09:41:21 UTC
Commits pushed to master at https://github.com/openshift/openshift-ansible

https://github.com/openshift/openshift-ansible/commit/1d86e3a156dcf7950d1b2c4877d82f3a183b2033
Bug 1491636 - honor node selectors

The deprecation of `*_hosted_*` vars made logging node selectors set in
the inventory to be ignored.

Node selectors were set as 'facts' and they have higher priority than
inventory variables. Setting logging node selectors could be achieved only
by using command line --extra-vars.

https://github.com/openshift/openshift-ansible/commit/3dea8bceeb98d5a30f0223f7109cc78caa3a6b8f
Merge pull request #5857 from wozniakjan/bz_1491636/logging/nodeselector

Automatic merge from submit-queue.

Bug 1491636 - honor node selectors

The deprecation of `*_hosted_*` vars made logging node selectors set in the inventory to be ignored.

Node selectors were set as 'facts' and they have higher priority than inventory variables. Setting logging node selectors could be achieved only by using command line --extra-vars.

It is related to the same BZ as https://github.com/openshift/openshift-ansible/pull/5858 but the issue is different.

Comment 19 Anping Li 2017-11-06 10:47:18 UTC
The fix pass testing on openshift-ansible-3.6.173.0.65-1.git.0.63cbf23.el7.noarch


openshift_logging_install_logging=true
openshift_logging_image_version=v3.6
openshift_logging_es_nodeselector={'logging-node': '1'}
openshift_logging_use_ops=true
openshift_logging_es_ops_nodeselector={'logging-node':'2'}


# oc get dc  --selector="logging-infra"="elasticsearch" -o json |grep -A 2 Selector 
                        "nodeSelector": {
                            "logging-node": "1"
                        },
--
                        "nodeSelector": {
                            "logging-node": "2"
                        },

Comment 20 Anping Li 2017-11-06 10:48:18 UTC
@Samuel, The fix in in openshift-ansible only. Could we move this bug to the installer errata?

Comment 25 errata-xmlrpc 2017-11-21 05:41:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:3255


Note You need to log in before you can comment on or make changes to this bug.