Description of problem:
Fact gathering during upgrade from 3.4 -> 3.5 failed.

Version-Release number of the following components:
rpm -q openshift-ansible ansible
openshift-ansible-3.5.78-1.git.0.f7be576.el7.noarch
ansible-2.2.3.0-1.el7.noarch

[hermanh@lsrv0071 ~]$ ansible --version
ansible 2.2.3.0
  config file = /etc/ansible/ansible.cfg
  configured module search path = Default w/o overrides

How reproducible:
For the customer, always.

Steps to Reproduce:
1. Try to upgrade using the upgrade.yml playbook

Actual results:
2017-06-29 11:55:50,474 p=45692 u=ocpauto | TASK [openshift_facts : Gather Cluster facts and set is_containerized if needed] ***
2017-06-29 11:55:52,467 p=45692 u=ocpauto | ok: [atom0010.linux.rabobank.nl]
2017-06-29 11:55:52,475 p=45692 u=ocpauto | ok: [atom0008.linux.rabobank.nl]
2017-06-29 11:55:52,482 p=45692 u=ocpauto | fatal: [atom0001.linux.rabobank.nl]: FAILED! => {
    "changed": false,
    "failed": true,
    "module_stderr": "Shared connection to atom0001.linux.rabobank.nl closed.\r\n",
    "module_stdout": "Traceback (most recent call last):\r\n File \"/tmp/ansible_CHxjyW/ansible_module_openshift_facts.py\", line 2495, in <module>\r\n main()\r\n File \"/tmp/ansible_CHxjyW/ansible_module_openshift_facts.py\", line 2482, in main\r\n protected_facts_to_overwrite)\r\n File \"/tmp/ansible_CHxjyW/ansible_module_openshift_facts.py\", line 1913, in __init__\r\n protected_facts_to_overwrite)\r\n File \"/tmp/ansible_CHxjyW/ansible_module_openshift_facts.py\", line 1979, in generate_facts\r\n facts = set_builddefaults_facts(facts)\r\n File \"/tmp/ansible_CHxjyW/ansible_module_openshift_facts.py\", line 1686, in set_builddefaults_facts\r\n delete_empty_keys(facts['master']['admission_plugin_config']['BuildDefaults']['configuration']['env'])\r\nKeyError: 'env'\r\n"
}

MSG:

MODULE FAILURE

Expected results:
Facts to be gathered correctly

Additional info:
Please attach logs from ansible-playbook with the -vvv flag
Created attachment 1296105 [details] Ansible log
The customer used the following workaround: commenting out the line below in the Python facts module (/usr/share/ansible/openshift-ansible/roles/openshift_facts/library/openshift_facts.py) fixed the issue and allowed us to upgrade:

-------
1422 #delete_empty_keys(facts['master']['admission_plugin_config']['BuildDefaults']['configuration']['env'])
-------
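The workaround above removes the cleanup call entirely. A narrower alternative would be to guard the lookup so the cleanup only runs when 'env' is actually present. The following is a minimal sketch of that idea, not the module's code and not necessarily what the eventual fix does: delete_empty_keys here is a hypothetical stand-in that assumes env entries are dicts with a 'value' key, and clean_builddefaults_env is an illustrative name that does not exist in openshift_facts.py.

```python
def delete_empty_keys(keylist):
    """Hypothetical stand-in: drop entries whose 'value' is empty, in place."""
    keylist[:] = [item for item in keylist if item.get('value')]


def clean_builddefaults_env(facts):
    """Illustrative helper: run the cleanup only when the 'env' key exists."""
    config = (facts.get('master', {})
              .get('admission_plugin_config', {})
              .get('BuildDefaults', {})
              .get('configuration', {}))
    if 'env' in config:
        delete_empty_keys(config['env'])
    return facts


# Facts shaped like the failing case: 'configuration' has no 'env' key.
facts_without_env = {'master': {'admission_plugin_config': {'BuildDefaults': {
    'configuration': {'apiVersion': 'v1', 'kind': 'BuildDefaultsConfig'}}}}}
clean_builddefaults_env(facts_without_env)  # does not raise KeyError

# Facts with an 'env' list: the entry with an empty value is removed.
facts_with_env = {'master': {'admission_plugin_config': {'BuildDefaults': {
    'configuration': {'env': [{'name': 'HTTP_PROXY', 'value': ''},
                              {'name': 'NO_PROXY', 'value': 'localhost'}]}}}}}
clean_builddefaults_env(facts_with_env)
```

Unlike commenting the line out, a guard of this kind would keep the empty-value cleanup for configurations that do define env.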
Pull request open on https://github.com/openshift/openshift-ansible/pull/5375
Branch has been merged into:
* master
* release-1.5

Still waiting on the release-3.6 branch.
QE tried to reproduce the bug in three ways on atomic-openshift-utils-3.5.119-1.git.0.9e9bb4e.el7.noarch, but did not get a failed upgrade.

1. Reproduce the bug with the following steps:
1) Install OCP 3.4 with the build config specified in the inventory hosts file.
<--snip-->
openshift_buildoverrides_nodeselectors={'registry': 'enabled','router': 'enabled'}
openshift_builddefaults_nodeselectors={'registry': 'enabled','router': 'enabled'}
<--snip-->
2) Upgrade 3.4 to 3.5 with the above inventory file.

However, I hit another issue in step 1: the above settings did not take effect in the 3.4 installer. I confirmed with the related QE that setting builddefaults and buildoverrides was released in 3.5 and is not supported in 3.4.

2. Although adding the variables only during the 3.4 to 3.5 upgrade, and not during the 3.4 installation, is not a supported upgrade path, I tried to reproduce the issue as follows:
1) Install OCP 3.4 without any build config in the inventory hosts file.
2) Upgrade 3.4 to 3.5 with the following variables added to the inventory hosts file.
<--snip-->
openshift_buildoverrides_nodeselectors={'registry': 'enabled','router': 'enabled'}
openshift_builddefaults_nodeselectors={'registry': 'enabled','router': 'enabled'}

Unfortunately, the upgrade succeeded and the variables did not take effect.

3. Try to reproduce the issue with the following steps:
1) Install OCP 3.4 without any build config in the inventory file.
2) Change master-config.yaml to add the build config, then restart the master service.
admissionConfig:
  pluginConfig:
    BuildDefaults:
      configuration:
        apiVersion: v1
        kind: BuildDefaultsConfig
        nodeSelector:
          registry: enabled
          router: enabled
    BuildOverrides:
      configuration:
        apiVersion: v1
        kind: BuildOverridesConfig
        forcePull: true
        nodeSelector:
          registry: enabled
          router: enabled
3) Upgrade 3.4 to 3.5 with the following variables added to the inventory hosts file.
<--snip-->
openshift_buildoverrides_nodeselectors={'registry': 'enabled','router': 'enabled'}
openshift_builddefaults_nodeselectors={'registry': 'enabled','router': 'enabled'}

Unfortunately, the upgrade succeeded and the original config was kept.

@Tim AFAIK, the above variables are only for the advanced install, not for upgrade. Would you mind giving me more hints on how this issue was reproduced?
The issue could not be reproduced with the config in the attached hosts file, so QE switched to the JSON format to configure builddefaults. With that, it can be reproduced on atomic-openshift-utils-3.5.120-1.git.0.c60f69a.el7.noarch.

TASK [openshift_facts : Gather Cluster facts and set is_containerized if needed] ********************************************************************************************
fatal: [x.x.x.x]: FAILED! => {
    "changed": false,
    "failed": true,
    "module_stderr": "Shared connection to x.x.x.x closed.\r\n",
    "module_stdout": "Traceback (most recent call last):\r\n File \"/tmp/ansible_Hl3xm7/ansible_module_openshift_facts.py\", line 2489, in <module>\r\n main()\r\n File \"/tmp/ansible_Hl3xm7/ansible_module_openshift_facts.py\", line 2476, in main\r\n protected_facts_to_overwrite)\r\n File \"/tmp/ansible_Hl3xm7/ansible_module_openshift_facts.py\", line 1911, in __init__\r\n protected_facts_to_overwrite)\r\n File \"/tmp/ansible_Hl3xm7/ansible_module_openshift_facts.py\", line 1977, in generate_facts\r\n facts = set_builddefaults_facts(facts)\r\n File \"/tmp/ansible_Hl3xm7/ansible_module_openshift_facts.py\", line 1682, in set_builddefaults_facts\r\n delete_empty_keys(facts['master']['admission_plugin_config']['BuildDefaults']['configuration']['env'])\r\nKeyError: 'env'\r\n",
    "rc": 0
}

MSG:

MODULE FAILURE
Version: atomic-openshift-utils-3.5.125-1.git.0.1c43b24.el7.noarch

Steps:
1. Install OCP 3.4 with the builddefaults setting in the inventory hosts file.
openshift_builddefaults_json={"BuildDefaults":{"configuration":{"apiVersion":"v1","nodeSelector":{"registry": "enabled"},"kind":"BuildDefaultsConfig"}}}
2. Upgrade 3.4 to 3.5 with the above hosts file.

The upgrade still hit the error.

TASK [openshift_facts : Gather Cluster facts and set is_containerized if needed] ********************************************************************************************
fatal: [x.x.x.x]: FAILED! => {
    "changed": false,
    "failed": true,
    "module_stderr": "Shared connection to x.x.x.x closed.\r\n",
    "module_stdout": "Traceback (most recent call last):\r\n File \"/tmp/ansible_ZHVTSp/ansible_module_openshift_facts.py\", line 2491, in <module>\r\n main()\r\n File \"/tmp/ansible_ZHVTSp/ansible_module_openshift_facts.py\", line 2478, in main\r\n protected_facts_to_overwrite)\r\n File \"/tmp/ansible_ZHVTSp/ansible_module_openshift_facts.py\", line 1913, in __init__\r\n protected_facts_to_overwrite)\r\n File \"/tmp/ansible_ZHVTSp/ansible_module_openshift_facts.py\", line 1979, in generate_facts\r\n facts = set_builddefaults_facts(facts)\r\n File \"/tmp/ansible_ZHVTSp/ansible_module_openshift_facts.py\", line 1684, in set_builddefaults_facts\r\n delete_empty_keys(facts['master']['admission_plugin_config']['BuildDefaults']['configuration']['env'])\r\nKeyError: 'env'\r\n",
    "rc": 0
}

Assigning back. Please let me know if my steps are not correct.
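For illustration, the inventory value in the steps above defines apiVersion, nodeSelector and kind under configuration, but no env. The sketch below parses that exact value and performs the same key lookup the traceback ends on. It assumes the module uses the parsed JSON as facts['master']['admission_plugin_config'] without filling in defaults, which this report suggests but does not confirm; the variable names are illustrative only.

```python
import json

# Value of openshift_builddefaults_json copied from the reproduction steps.
inventory_value = (
    '{"BuildDefaults":{"configuration":{"apiVersion":"v1",'
    '"nodeSelector":{"registry": "enabled"},"kind":"BuildDefaultsConfig"}}}'
)

# Assumption: the parsed JSON ends up as the admission plugin config as-is.
admission_plugin_config = json.loads(inventory_value)
configuration = admission_plugin_config['BuildDefaults']['configuration']

# List the keys that are actually defined under 'configuration'.
print(sorted(configuration))

# The same lookup the traceback ends on; 'env' is not among the keys.
try:
    configuration['env']
    lookup_error = None
except KeyError as exc:
    lookup_error = exc

print(type(lookup_error).__name__, lookup_error)
```

If that assumption holds, any JSON-format builddefaults setting that omits env would hit the same lookup.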
Version: atomic-openshift-utils-3.5.128-1.git.0.f183c7b.el7.noarch

Steps:
1. Install OCP 3.4 with the builddefaults setting in the inventory hosts file.
openshift_builddefaults_json={"BuildDefaults":{"configuration":{"apiVersion":"v1","nodeSelector":{"registry": "enabled"},"kind":"BuildDefaultsConfig"}}}
2. Upgrade 3.4 to 3.5 with the above hosts file.

The upgrade failed with the same error again.
Any news on when this will be included in an erratum?
I have tried the temporary workaround described in comment #2:

- Comment out the line delete_empty_keys(facts['master']['admission_plugin_config']['BuildDefaults']['configuration']['env']) in /usr/share/ansible/openshift-ansible/roles/openshift_facts/library/openshift_facts.py

But it didn't work. We're still getting:

"module_stdout": "Traceback (most recent call last):\r\n File \"/tmp/ansible_6_ygCW/ansible_module_openshift_facts.py\", line 2491, in <module>\r\n main()\r\n File \"/tmp/ansible_6_ygCW/ansible_module_openshift_facts.py\", line 2478, in main\r\n protected_facts_to_overwrite)\r\n File \"/tmp/ansible_6_ygCW/ansible_module_openshift_facts.py\", line 1913, in __init__\r\n protected_facts_to_overwrite)\r\n File \"/tmp/ansible_6_ygCW/ansible_module_openshift_facts.py\", line 1979, in generate_facts\r\n facts = set_builddefaults_facts(facts)\r\n File \"/tmp/ansible_6_ygCW/ansible_module_openshift_facts.py\", line 1684, in set_builddefaults_facts\r\n delete_empty_keys(facts['master']['admission_plugin_config']['BuildDefaults']['configuration']['env'])\r\nKeyError: 'env'\r\n",

Is there anything else we can do to skip this and move on with the upgrade?
https://github.com/openshift/openshift-ansible/pull/6184 proposed fix against master
@Scott Regarding the three cloned bugs: I tried to reproduce them but could not, because a fresh install fails when openshift_builddefaults_json is specified [1].

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1515746
Add cases.
This fix is in openshift-ansible-3.7.10-1, openshift-ansible-3.6.173.0.81-1, and openshift-ansible-3.5.146-1, all of which have shipped. It's also fixed in master.