Bug 1469387 - fact gathering fails during upgrade from 3.4 to 3.5 on delete_empty_keys(facts['master']['admission_plugin_config']['BuildDefaults']['configuration']['env']) KeyError: 'env'
Status: MODIFIED
Product: OpenShift Container Platform
Classification: Red Hat
Component: Upgrade
Version: 3.5.0
Hardware: Unspecified Unspecified
Priority: urgent Severity: urgent
Target Milestone: ---
Target Release: 3.8.0
Assigned To: Scott Dodson
QA Contact: liujia
Depends On:
Blocks: 1491718 1515457 1515458 1515459
Reported: 2017-07-11 03:50 EDT by Javier Ramirez
Modified: 2017-12-13 10:25 EST

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
The configuration management for builddefaults attempts to remove environment variables that were previously defined but have since been removed from the configuration. When no environment variables had been configured, this failed because the 'env' key did not exist. The process now skips the cleanup when the 'env' key does not exist.
Story Points: ---
Clone Of:
: 1491718 1515457 1515458 1515459
Environment:
Last Closed:
Type: Bug


Attachments
Ansible log (2.26 MB, application/x-gzip)
2017-07-11 03:51 EDT, Javier Ramirez
Description Javier Ramirez 2017-07-11 03:50:23 EDT
Description of problem:
Fact gathering failed during the upgrade from 3.4 to 3.5:

Version-Release number of the following components:
rpm -q openshift-ansible ansible
openshift-ansible-3.5.78-1.git.0.f7be576.el7.noarch
ansible-2.2.3.0-1.el7.noarch
[hermanh@lsrv0071 ~]$ ansible --version
ansible 2.2.3.0
  config file = /etc/ansible/ansible.cfg
  configured module search path = Default w/o overrides

How reproducible:
Always, for the customer

Steps to Reproduce:
1. Try to upgrade using the upgrade.yml playbook


Actual results:
2017-06-29 11:55:50,474 p=45692 u=ocpauto |  TASK [openshift_facts : Gather Cluster facts and set is_containerized if needed] ***
2017-06-29 11:55:52,467 p=45692 u=ocpauto |  ok: [atom0010.linux.rabobank.nl]
2017-06-29 11:55:52,475 p=45692 u=ocpauto |  ok: [atom0008.linux.rabobank.nl]
2017-06-29 11:55:52,482 p=45692 u=ocpauto |  fatal: [atom0001.linux.rabobank.nl]: FAILED! => {
    "changed": false,
    "failed": true,
    "module_stderr": "Shared connection to atom0001.linux.rabobank.nl closed.\r\n",
    "module_stdout": "Traceback (most recent call last):\r\n  File \"/tmp/ansible_CHxjyW/ansible_module_openshift_facts.py\", line 2495, in <module>\r\n    main()\r\n  File \"/tmp/ansible_CHxjyW/ansible_module_openshift_facts.py\", line 2482, in main\r\n    protected_facts_to_overwrite)\r\n  File \"/tmp/ansible_CHxjyW/ansible_module_openshift_facts.py\", line 1913, in __init__\r\n    protected_facts_to_overwrite)\r\n  File \"/tmp/ansible_CHxjyW/ansible_module_openshift_facts.py\", line 1979, in generate_facts\r\n    facts = set_builddefaults_facts(facts)\r\n  File \"/tmp/ansible_CHxjyW/ansible_module_openshift_facts.py\", line 1686, in set_builddefaults_facts\r\n    delete_empty_keys(facts['master']['admission_plugin_config']['BuildDefaults']['configuration']['env'])\r\nKeyError: 'env'\r\n"
}

MSG:

MODULE FAILURE
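
The escaped "module_stdout" value above is easier to read once the \r\n sequences are expanded. The following is a minimal sketch of doing that with the standard json module; the payload is a shortened copy of the log (middle traceback frames left out), and the helper name readable_traceback is made up for illustration.

```python
import json

# Shortened copy of the failure payload from the log above; the middle
# frames of the traceback are omitted here for brevity.
raw_result = r'''{
    "changed": false,
    "failed": true,
    "module_stdout": "Traceback (most recent call last):\r\n  File \"/tmp/ansible_CHxjyW/ansible_module_openshift_facts.py\", line 1686, in set_builddefaults_facts\r\n    delete_empty_keys(facts['master']['admission_plugin_config']['BuildDefaults']['configuration']['env'])\r\nKeyError: 'env'\r\n"
}'''


def readable_traceback(result_json):
    """Return the module_stdout traceback as a list of plain text lines."""
    result = json.loads(result_json)
    # splitlines() treats each \r\n pair as a single line break.
    return result["module_stdout"].splitlines()


lines = readable_traceback(raw_result)
print("\n".join(lines))
```

The last printed line is the exception itself, which is the part to search for when comparing logs from different runs.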

Expected results:
Facts to be gathered correctly

Additional info:
Please attach logs from ansible-playbook with the -vvv flag
Comment 1 Javier Ramirez 2017-07-11 03:51 EDT
Created attachment 1296105 [details]
Ansible log
Comment 2 Javier Ramirez 2017-07-11 03:52:13 EDT
The customer used the following workaround:
Commenting out the line below in the Python facts module (/usr/share/ansible/openshift-ansible/roles/openshift_facts/library/openshift_facts.py) fixed the issue and allowed us to upgrade:
-------
 1422              #delete_empty_keys(facts['master']['admission_plugin_config']['BuildDefaults']['configuration']['env'])
-------
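
As an alternative to commenting the line out, the Doc Text of this bug describes skipping the cleanup when the 'env' key does not exist. The sketch below illustrates that idea only; it is not the merged patch. Both clean_builddefaults_env and the delete_empty_keys body shown here are hypothetical stand-ins written for this example, assuming 'env' is a list of name/value entries.

```python
def delete_empty_keys(keylist):
    """Hypothetical stand-in: drop entries whose 'value' is empty, in place."""
    keylist[:] = [item for item in keylist if item.get("value")]


def clean_builddefaults_env(facts):
    """Run the cleanup only when an 'env' list is actually present."""
    configuration = (
        facts.get("master", {})
        .get("admission_plugin_config", {})
        .get("BuildDefaults", {})
        .get("configuration", {})
    )
    # Guard: without this check, configuration['env'] raises KeyError
    # when no environment variables were ever configured.
    if "env" in configuration:
        delete_empty_keys(configuration["env"])
    return facts


# A config with only a nodeSelector (no 'env') passes through untouched.
no_env = {"master": {"admission_plugin_config": {"BuildDefaults": {
    "configuration": {"nodeSelector": {"registry": "enabled"}}}}}}
clean_builddefaults_env(no_env)

# A config with 'env' has its empty entries removed.
with_env = {"master": {"admission_plugin_config": {"BuildDefaults": {
    "configuration": {"env": [
        {"name": "HTTP_PROXY", "value": ""},
        {"name": "NO_PROXY", "value": "localhost"},
    ]}}}}}
clean_builddefaults_env(with_env)
```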
Comment 4 Tim Bielawa 2017-09-12 10:54:15 EDT
Pull request open on https://github.com/openshift/openshift-ansible/pull/5375
Comment 5 Tim Bielawa 2017-09-14 09:25:53 EDT
Branch has been merged into

* master
* release-1.5

Still waiting on release-3.6 branch
Comment 7 liujia 2017-09-20 06:26:35 EDT
QE tried to reproduce the bug in three ways on atomic-openshift-utils-3.5.119-1.git.0.9e9bb4e.el7.noarch, but the upgrade never failed.

1. Tried to reproduce the bug with the following steps:

1). Install OCP 3.4 with the build config specified in the inventory hosts file.
<--snip-->
openshift_buildoverrides_nodeselectors={'registry': 'enabled','router': 'enabled'}
openshift_builddefaults_nodeselectors={'registry': 'enabled','router': 'enabled'}
<--snip-->
2). Upgrade 3.4 to 3.5 with the above inventory file.

However, I hit another issue in step 1: the settings above did not take effect with the 3.4 installer. I confirmed with the responsible QE that "setting builddefaults and buildoverrides" was released in 3.5 and is not supported in 3.4.

2. Adding the variables only during the 3.4 to 3.5 upgrade, and not during the 3.4 installation, is not supported for upgrade. Still, I tried to reproduce the issue as follows:
1). Install OCP 3.4 without any build config in the inventory hosts file.
2). Upgrade 3.4 to 3.5 with the following variables added to the inventory hosts file.
<--snip-->
openshift_buildoverrides_nodeselectors={'registry': 'enabled','router': 'enabled'}
openshift_builddefaults_nodeselectors={'registry': 'enabled','router': 'enabled'}

Unfortunately, the upgrade succeeded and the variables did not take effect.

3. Tried to reproduce the issue with the following steps:
1) Install OCP 3.4 without any build config in the inventory file.
2) Change master-config.yaml to add the build config, then restart the master service.
admissionConfig:
  pluginConfig:
    BuildDefaults:
      configuration:
        apiVersion: v1
        kind: BuildDefaultsConfig
        nodeSelector:
          registry: enabled
          router: enabled
    BuildOverrides:
      configuration:
        apiVersion: v1
        kind: BuildOverridesConfig
        forcePull: true
        nodeSelector:
          registry: enabled
          router: enabled
3) Upgrade 3.4 to 3.5 with the following variables added to the inventory hosts file.
<--snip-->
openshift_buildoverrides_nodeselectors={'registry': 'enabled','router': 'enabled'}
openshift_builddefaults_nodeselectors={'registry': 'enabled','router': 'enabled'}

Unfortunately, the upgrade succeeded and the configuration kept its original values.


@Tim
AFAIK, the variables above apply only to the advanced install, not to upgrade. Would you mind giving me more hints on how to reproduce this issue?
Comment 8 liujia 2017-09-27 05:10:36 EDT
The bug could not be reproduced with the config from the attached hosts file, so QE switched to the JSON format to configure builddefaults. With that change, it can be reproduced on version atomic-openshift-utils-3.5.120-1.git.0.c60f69a.el7.noarch.

TASK [openshift_facts : Gather Cluster facts and set is_containerized if needed] ********************************************************************************************
fatal: [x.x.x.x]: FAILED! => {
    "changed": false,
    "failed": true,
    "module_stderr": "Shared connection to x.x.x.x closed.\r\n",
    "module_stdout": "Traceback (most recent call last):\r\n  File \"/tmp/ansible_Hl3xm7/ansible_module_openshift_facts.py\", line 2489, in <module>\r\n    main()\r\n  File \"/tmp/ansible_Hl3xm7/ansible_module_openshift_facts.py\", line 2476, in main\r\n    protected_facts_to_overwrite)\r\n  File \"/tmp/ansible_Hl3xm7/ansible_module_openshift_facts.py\", line 1911, in __init__\r\n    protected_facts_to_overwrite)\r\n  File \"/tmp/ansible_Hl3xm7/ansible_module_openshift_facts.py\", line 1977, in generate_facts\r\n    facts = set_builddefaults_facts(facts)\r\n  File \"/tmp/ansible_Hl3xm7/ansible_module_openshift_facts.py\", line 1682, in set_builddefaults_facts\r\n    delete_empty_keys(facts['master']['admission_plugin_config']['BuildDefaults']['configuration']['env'])\r\nKeyError: 'env'\r\n",
    "rc": 0
}

MSG:

MODULE FAILURE
Comment 9 liujia 2017-09-27 06:06:02 EDT
Version:
atomic-openshift-utils-3.5.125-1.git.0.1c43b24.el7.noarch

Steps:
1. Install ocp 3.4 with builddefaults setting in inventory hosts file.
openshift_builddefaults_json={"BuildDefaults":{"configuration":{"apiVersion":"v1","nodeSelector":{"registry": "enabled"},"kind":"BuildDefaultsConfig"}}}

2. Upgrade 3.4 to 3.5 with above hosts file

The upgrade still hit the error.
TASK [openshift_facts : Gather Cluster facts and set is_containerized if needed] ********************************************************************************************
fatal: [x.x.x.x]: FAILED! => {
    "changed": false,
    "failed": true,
    "module_stderr": "Shared connection to x.x.x.x closed.\r\n",
    "module_stdout": "Traceback (most recent call last):\r\n  File \"/tmp/ansible_ZHVTSp/ansible_module_openshift_facts.py\", line 2491, in <module>\r\n    main()\r\n  File \"/tmp/ansible_ZHVTSp/ansible_module_openshift_facts.py\", line 2478, in main\r\n    protected_facts_to_overwrite)\r\n  File \"/tmp/ansible_ZHVTSp/ansible_module_openshift_facts.py\", line 1913, in __init__\r\n    protected_facts_to_overwrite)\r\n  File \"/tmp/ansible_ZHVTSp/ansible_module_openshift_facts.py\", line 1979, in generate_facts\r\n    facts = set_builddefaults_facts(facts)\r\n  File \"/tmp/ansible_ZHVTSp/ansible_module_openshift_facts.py\", line 1684, in set_builddefaults_facts\r\n    delete_empty_keys(facts['master']['admission_plugin_config']['BuildDefaults']['configuration']['env'])\r\nKeyError: 'env'\r\n",
    "rc": 0
}

Assigning back; please let me know if my steps are not correct.
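
For reference, the openshift_builddefaults_json value used in step 1 has no 'env' key under 'configuration'. The sketch below parses that exact value and shows the lookup that fails; it assumes the value reaches the failing lookup unchanged, which this report does not confirm in detail.

```python
import json

# The inventory value from step 1, copied as-is.
inventory_value = (
    '{"BuildDefaults":{"configuration":{"apiVersion":"v1",'
    '"nodeSelector":{"registry": "enabled"},"kind":"BuildDefaultsConfig"}}}'
)

config = json.loads(inventory_value)
configuration = config["BuildDefaults"]["configuration"]

# Only apiVersion, kind and nodeSelector are defined; 'env' is absent.
present_keys = sorted(configuration)
print(present_keys)

# The same kind of lookup as in the traceback raises KeyError here.
try:
    configuration["env"]
    lookup_error = None
except KeyError as exc:
    lookup_error = exc
print(type(lookup_error).__name__)
```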
Comment 11 liujia 2017-09-28 03:43:11 EDT
Version:
atomic-openshift-utils-3.5.128-1.git.0.f183c7b.el7.noarch

Steps:
1. Install ocp 3.4 with builddefaults setting in inventory hosts file.
openshift_builddefaults_json={"BuildDefaults":{"configuration":{"apiVersion":"v1","nodeSelector":{"registry": "enabled"},"kind":"BuildDefaultsConfig"}}}

2. Upgrade 3.4 to 3.5 with above hosts file

The upgrade failed with the same error again.
Comment 13 Javier Ramirez 2017-11-02 10:20:54 EDT
Any news on when this will be included in an erratum?
Comment 15 Nicolas Nosenzo 2017-11-17 11:21:29 EST
I tried the temporary workaround described in comment #2:

- Comment out the line
delete_empty_keys(facts['master']['admission_plugin_config']['BuildDefaults']['configuration']['env'])
in /usr/share/ansible/openshift-ansible/roles/openshift_facts/library/openshift_facts.py

But it didn't work; we're still getting:

    "module_stdout": "Traceback (most recent call last):\r\n  File \"/tmp/ansible_6_ygCW/ansible_module_openshift_facts.py\", line 2491, in <module>\r\n    main()\r\n  File \"/tmp/ansible_6_ygCW/ansible_module_openshift_facts.py\", line 2478, in main\r\n    protected_facts_to_overwrite)\r\n  File \"/tmp/ansible_6_ygCW/ansible_module_openshift_facts.py\", line 1913, in __init__\r\n    protected_facts_to_overwrite)\r\n  File \"/tmp/ansible_6_ygCW/ansible_module_openshift_facts.py\", line 1979, in generate_facts\r\n    facts = set_builddefaults_facts(facts)\r\n  File \"/tmp/ansible_6_ygCW/ansible_module_openshift_facts.py\", line 1684, in set_builddefaults_facts\r\n    delete_empty_keys(facts['master']['admission_plugin_config']['BuildDefaults']['configuration']['env'])\r\nKeyError: 'env'\r\n",

Is there anything else we can do to skip this and move on with the upgrade?
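
The traceback above still shows the delete_empty_keys call being executed, so the copy of the module that Ansible ran still contained an active call. One way to check a given copy of openshift_facts.py is to list the uncommented calls in its text. The following is only a sketch: active_cleanup_calls is a made-up helper name, and the sample text stands in for the real file contents.

```python
import re

# Matches a line whose first non-blank text is a delete_empty_keys( call,
# so commented-out lines and the function definition are skipped.
CALL = re.compile(r"^\s*delete_empty_keys\(")


def active_cleanup_calls(source_text):
    """Return 1-based line numbers of uncommented delete_empty_keys calls."""
    return [
        number
        for number, line in enumerate(source_text.splitlines(), start=1)
        if CALL.match(line)
    ]


sample = """\
def delete_empty_keys(keylist):
    pass
            #delete_empty_keys(facts['master']['admission_plugin_config'])
            delete_empty_keys(facts['master']['admission_plugin_config'])
"""
found = active_cleanup_calls(sample)
print(found)
```

To apply it to a real file, read the file's text with open(path).read() and pass it to the helper; an empty list means no active call remains in that copy.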
Comment 18 Scott Dodson 2017-11-20 11:46:10 EST
https://github.com/openshift/openshift-ansible/pull/6184 proposed fix against master
Comment 26 liujia 2017-11-21 05:49:01 EST
@Scott

Regarding the three cloned bugs, I tried to reproduce them but could not, because a fresh install fails when openshift_builddefaults_json is specified [1].

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1515746
Comment 27 liujia 2017-12-13 01:04:08 EST
Add cases.
