Description of problem:
Node scaleup fails with variable parsing error on node label

Version-Release number of the following components:
atomic-openshift-clients-3.6.173.0.96-1.git.0.8f6ff22.el7.x86_64
openshift-ansible-docs-3.6.173.0.96-1.git.0.2954b4a.el7.noarch
atomic-openshift-sdn-ovs-3.6.173.0.96-1.git.0.8f6ff22.el7.x86_64
atomic-openshift-docker-excluder-3.6.173.0.96-1.git.0.8f6ff22.el7.noarch
atomic-openshift-excluder-3.6.173.0.96-1.git.0.8f6ff22.el7.noarch
tuned-profiles-atomic-openshift-node-3.6.173.0.96-1.git.0.8f6ff22.el7.x86_64
openshift-ansible-callback-plugins-3.6.173.0.96-1.git.0.2954b4a.el7.noarch
openshift-ansible-lookup-plugins-3.6.173.0.96-1.git.0.2954b4a.el7.noarch
atomic-openshift-node-3.6.173.0.96-1.git.0.8f6ff22.el7.x86_64
openshift-ansible-filter-plugins-3.6.173.0.96-1.git.0.2954b4a.el7.noarch
openshift-ansible-playbooks-3.6.173.0.96-1.git.0.2954b4a.el7.noarch
atomic-openshift-utils-3.6.173.0.96-1.git.0.2954b4a.el7.noarch
openshift-ansible-3.6.173.0.96-1.git.0.2954b4a.el7.noarch
atomic-openshift-3.6.173.0.96-1.git.0.8f6ff22.el7.x86_64
openshift-ansible-roles-3.6.173.0.96-1.git.0.2954b4a.el7.noarch
atomic-openshift-master-3.6.173.0.96-1.git.0.8f6ff22.el7.x86_64

Ansible tested with below:
ansible-2.4.2.0-2.el7.noarch
ansible-2.2.3.0-1.el7.noarch

How reproducible:
Repeatable by customer, unconfirmed by support team

Actual results:
TASK [openshift_facts : Gather Cluster facts and set is_containerized if needed] ***
fatal: [y44461.example.net]: FAILED! => {
    "changed": false,
    "failed": true,
    "module_stderr": "Shared connection to y44461.example.net closed.\r\n",
    "module_stdout": "Traceback (most recent call last):\r\n File \"/tmp/ansible_aVAPjT/ansible_module_openshift_facts.py\", line 2560, in <module>\r\n main()\r\n File \"/tmp/ansible_aVAPjT/ansible_module_openshift_facts.py\", line 2547, in main\r\n protected_facts_to_overwrite)\r\n File \"/tmp/ansible_aVAPjT/ansible_module_openshift_facts.py\", line 1955, in __init__\r\n protected_facts_to_overwrite)\r\n File \"/tmp/ansible_aVAPjT/ansible_module_openshift_facts.py\", line 2013, in generate_facts\r\n facts = build_kubelet_args(facts)\r\n File \"/tmp/ansible_aVAPjT/ansible_module_openshift_facts.py\", line 1217, in build_kubelet_args\r\n labels_str = list(map(lambda x: '='.join(x), facts['node']['labels'].items()))\r\n File \"/tmp/ansible_aVAPjT/ansible_module_openshift_facts.py\", line 1217, in <lambda>\r\n labels_str = list(map(lambda x: '='.join(x), facts['node']['labels'].items()))\r\nTypeError: sequence item 1: expected string or Unicode, int found\r\n"
}

Tried new node definition like:

[new_nodes]
y44461.example.net openshift_schedulable=false

and:

[new_nodes]
y44461.example.net openshift_schedulable=false openshift_node_labels="{'region': 'nonproduction', 'zone': 'DK2', 'logging-infra-fluentd': 'true'}"
Might be related? https://github.com/openshift/openshift-ansible/issues/6459
Created https://github.com/openshift/openshift-ansible/pull/7687, but not entirely sure if that would resolve it. Steven, could you attach facts (stored in "/etc/ansible/facts.d" on each node) to this bug so that I could verify the PR would fix it?
Right, the cached fact "logging-es-node": 4 on y44461 is breaking it. The PR would fix that.
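For illustration, the failure at line 1217 of the traceback can be reproduced in isolation. The sketch below is not the code from the PR; `labels_to_args` is a hypothetical helper that shows one possible way to avoid the error, by coercing keys and values to strings before joining.

```python
def labels_to_args(labels):
    """Render labels as 'key=value' strings, coercing both sides to str.

    Hypothetical helper for illustration only; not taken from openshift-ansible.
    """
    return ['='.join((str(key), str(value))) for key, value in labels.items()]


# A label set containing an integer value, as found in the cached facts.
labels = {'region': 'nonproduction', 'logging-es-node': 4}

try:
    # Same expression as line 1217 in the traceback: str.join() rejects the int.
    list(map(lambda x: '='.join(x), labels.items()))
except TypeError as err:
    print('original expression fails: %s' % err)

# With explicit str() coercion the integer value is rendered as "4".
print(labels_to_args(labels))
```

The coercion only changes how the label is rendered; it does not address how the integer ended up in the cached facts in the first place.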
Fix is available in openshift-ansible-3.6.173.0.113-1
I can't figure out where the int value "4" comes from; there appears to be no label 'logging-es-node': 4 set in the attached inventory file. Vadim, any hints on how to reproduce the bug?
Gan - yes - set the label 'logging-es-node': 4 in the inventory and run once. It will create cached facts on the node.

Vadim - shall we create a new BZ so that ansible checks for and rejects such settings in the inventory up front?
(In reply to Dmitry Zhukovski from comment #9)
> Gan - yes - set label 'logging-es-node': 4 in inventory and run once. It
> will make cached facts on the node.
>
> Vadim - shall create new BZ that ansible should check and not accept such
> settings in inventory initially ?

I don't have a clear reproducer for this, unfortunately. One of the machines in the customer installation had these facts set (the inventory didn't contain them). It seems the instructions in https://github.com/openshift/openshift-ansible/issues/6459 would reproduce it.
(In reply to Dmitry Zhukovski from comment #9)
> Vadim - shall create new BZ that ansible should check and not accept such
> settings in inventory initially ?

There is no simple and generic way to resolve this: 4 might be a valid value for a var in an ansible inventory. This issue would be mitigated once a YAML inventory is used instead of the INI format, since Ansible would complain about the value being invalid before the playbook is run. See https://trello.com/c/pCfMmOae/638-3-document-yaml-based-group-vars
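As a rough sketch of the kind of up-front check discussed here, label values could be inspected before a run. This is not part of openshift-ansible or Ansible; `find_non_string_labels` is a hypothetical helper, and it assumes the label string is a Python-style dict literal like the ones used in the inventory lines in this bug.

```python
import ast


def find_non_string_labels(raw):
    """Return (key, value) pairs where the key or the value is not a str.

    Hypothetical pre-flight check; assumes `raw` is a dict literal such as
    the value passed to openshift_node_labels in an INI inventory.
    """
    labels = ast.literal_eval(raw)
    return [(key, value) for key, value in labels.items()
            if not isinstance(key, str) or not isinstance(value, str)]


raw = "{'region': 'nonproduction', 'logging-es-node': 4}"
for key, value in find_non_string_labels(raw):
    print('non-string label: %r=%r' % (key, value))
```

A check like this is specific to label dictionaries, which is consistent with the point above that a generic rule rejecting integers in the inventory would not work.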
Verified in openshift-ansible-3.6.173.0.113-1.git.0.8a42ef5.el7.noarch.rpm

Installation succeeded with int values set in openshift_node_labels:

openshift_node_labels="{'role': 'node',1: 2}"
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:2007