Bug 1469387 - fact gathering fails during upgrade from 3.4 to 3.5 on delete_empty_keys(facts['master']['admission_plugin_config']['BuildDefaults']['configuration']['env']) KeyError: 'env'
Summary: fact gathering fails during upgrade from 3.4 to 3.5 on delete_empty_keys(fact...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cluster Version Operator
Version: 3.5.0
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 3.5.z
Assignee: Scott Dodson
QA Contact: liujia
URL:
Whiteboard:
Depends On:
Blocks: 1491718 1515457 1515458 1515459
Reported: 2017-07-11 07:50 UTC by Javier Ramirez
Modified: 2021-03-11 15:26 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
The configuration management for builddefaults attempts to remove environment variables that were previously defined but have since been removed from the configuration. In situations where no environment variables have been configured this was failing because the 'env' key did not exist. The process has now been updated to skip the cleanup when the env key does not exist.
Clone Of:
Clones: 1491718 1515457 1515458 1515459
Environment:
Last Closed: 2018-01-09 20:18:58 UTC
Target Upstream Version:
Embargoed:


Attachments
Ansible log (2.26 MB, application/x-gzip)
2017-07-11 07:51 UTC, Javier Ramirez

Description Javier Ramirez 2017-07-11 07:50:23 UTC
Description of problem:
fact gathering during upgrade from 3.4 -> 3.5 failed:

Version-Release number of the following components:
rpm -q openshift-ansible ansible
openshift-ansible-3.5.78-1.git.0.f7be576.el7.noarch
ansible-2.2.3.0-1.el7.noarch
[hermanh@lsrv0071 ~]$ ansible --version
ansible 2.2.3.0
  config file = /etc/ansible/ansible.cfg
  configured module search path = Default w/o overrides

How reproducible:
For the customer always

Steps to Reproduce:
1. Try to upgrade using the upgrade.yml playbook


Actual results:
2017-06-29 11:55:50,474 p=45692 u=ocpauto |  TASK [openshift_facts : Gather Cluster facts and set is_containerized if needed] ***
2017-06-29 11:55:52,467 p=45692 u=ocpauto |  ok: [atom0010.linux.rabobank.nl]
2017-06-29 11:55:52,475 p=45692 u=ocpauto |  ok: [atom0008.linux.rabobank.nl]
2017-06-29 11:55:52,482 p=45692 u=ocpauto |  fatal: [atom0001.linux.rabobank.nl]: FAILED! => {
    "changed": false,
    "failed": true,
    "module_stderr": "Shared connection to atom0001.linux.rabobank.nl closed.\r\n",
    "module_stdout": "Traceback (most recent call last):\r\n  File \"/tmp/ansible_CHxjyW/ansible_module_openshift_facts.py\", line 2495, in <module>\r\n    main()\r\n  File \"/tmp/ansible_CHxjyW/ansible_module_openshift_facts.py\", line 2482, in main\r\n    protected_facts_to_overwrite)\r\n  File \"/tmp/ansible_CHxjyW/ansible_module_openshift_facts.py\", line 1913, in __init__\r\n    protected_facts_to_overwrite)\r\n  File \"/tmp/ansible_CHxjyW/ansible_module_openshift_facts.py\", line 1979, in generate_facts\r\n    facts = set_builddefaults_facts(facts)\r\n  File \"/tmp/ansible_CHxjyW/ansible_module_openshift_facts.py\", line 1686, in set_builddefaults_facts\r\n    delete_empty_keys(facts['master']['admission_plugin_config']['BuildDefaults']['configuration']['env'])\r\nKeyError: 'env'\r\n"
}

MSG:

MODULE FAILURE

Expected results:
Facts to be gathered correctly

Additional info:
Please attach logs from ansible-playbook with the -vvv flag

Comment 1 Javier Ramirez 2017-07-11 07:51:16 UTC
Created attachment 1296105
Ansible log

Comment 2 Javier Ramirez 2017-07-11 07:52:13 UTC
The customer used the following workaround:
Commenting out the line below in the Python facts module (/usr/share/ansible/openshift-ansible/roles/openshift_facts/library/openshift_facts.py) fixed the issue and allowed us to upgrade:
-------
 1422              #delete_empty_keys(facts['master']['admission_plugin_config']['BuildDefaults']['configuration']['env'])
-------
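
A less invasive alternative to commenting the line out is to guard the lookup. The following is a minimal sketch only, not the code from the eventual fix: delete_empty_keys here is a simplified stand-in for the helper in openshift_facts.py, and cleanup_builddefaults_env is a hypothetical name used purely for illustration.

```python
def delete_empty_keys(keylist):
    """Stand-in helper (assumption): drop entries whose 'value' is empty, in place."""
    keylist[:] = [item for item in keylist if item.get('value')]


def cleanup_builddefaults_env(facts):
    """Hypothetical wrapper: run the cleanup only when the 'env' key exists."""
    config = (facts.get('master', {})
                   .get('admission_plugin_config', {})
                   .get('BuildDefaults', {})
                   .get('configuration', {}))
    # Skip the cleanup entirely when no environment variables are configured.
    if 'env' in config:
        delete_empty_keys(config['env'])
    return facts
```

With this guard, a configuration that has no 'env' list passes through unchanged instead of raising KeyError.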

Comment 4 Tim Bielawa 2017-09-12 14:54:15 UTC
Pull request open on https://github.com/openshift/openshift-ansible/pull/5375

Comment 5 Tim Bielawa 2017-09-14 13:25:53 UTC
Branch has been merged into

* master
* release-1.5

Still waiting on release-3.6 branch

Comment 7 liujia 2017-09-20 10:26:35 UTC
QE tried to reproduce the bug in three ways on atomic-openshift-utils-3.5.119-1.git.0.9e9bb4e.el7.noarch, but did not get a failed upgrade.

1. Tried to reproduce the bug with the following steps:

1). Install ocp 3.4 with specified build config in inventory hosts file.
<--snip-->
openshift_buildoverrides_nodeselectors={'registry': 'enabled','router': 'enabled'}
openshift_builddefaults_nodeselectors={'registry': 'enabled','router': 'enabled'}
<--snip-->
2). Upgrade 3.4 to 3.5 with above inventory file.

But I hit another issue in step 1: the above settings did not take effect with the 3.4 installer. I confirmed with the related QE that setting builddefaults and buildoverrides was released in 3.5 and is not supported in 3.4.

2. Adding the variables only during the 3.4 to 3.5 upgrade, and not during the 3.4 installation, is not a supported upgrade path, but I tried to reproduce the issue that way as follows:
1). Install ocp 3.4 without any build config in inventory hosts file.
2). Upgrade 3.4 to 3.5 with following variables added in inventory hosts file.
<--snip-->
openshift_buildoverrides_nodeselectors={'registry': 'enabled','router': 'enabled'}
openshift_builddefaults_nodeselectors={'registry': 'enabled','router': 'enabled'}

Unfortunately, the upgrade succeeded and the variables did not take effect.

3. Tried to reproduce the issue with the following steps:
1) Install ocp 3.4 without any build config in inventory file.
2) Change master-config.yaml to add build config and then restart master service.
admissionConfig:
  pluginConfig:
    BuildDefaults:
      configuration:
        apiVersion: v1
        kind: BuildDefaultsConfig
        nodeSelector:
          registry: enabled
          router: enabled
    BuildOverrides:
      configuration:
        apiVersion: v1
        kind: BuildOverridesConfig
        forcePull: true
        nodeSelector:
          registry: enabled
          router: enabled
3) Upgrade 3.4 to 3.5 with the following variables added to the inventory hosts file.
<--snip-->
openshift_buildoverrides_nodeselectors={'registry': 'enabled','router': 'enabled'}
openshift_builddefaults_nodeselectors={'registry': 'enabled','router': 'enabled'}

Unfortunately, the upgrade succeeded and the original config was kept.


@Tim
AFAIK, the above variables are only for advanced install, not for upgrade. Could you give me some hints on how this issue was reproduced?

Comment 8 liujia 2017-09-27 09:10:36 UTC
Because the bug could not be reproduced with the config in the attached hosts file, QE switched to the JSON format to configure builddefaults. With that, it can be reproduced on atomic-openshift-utils-3.5.120-1.git.0.c60f69a.el7.noarch.

TASK [openshift_facts : Gather Cluster facts and set is_containerized if needed] ********************************************************************************************
fatal: [x.x.x.x]: FAILED! => {
    "changed": false,
    "failed": true,
    "module_stderr": "Shared connection to x.x.x.x closed.\r\n",
    "module_stdout": "Traceback (most recent call last):\r\n  File \"/tmp/ansible_Hl3xm7/ansible_module_openshift_facts.py\", line 2489, in <module>\r\n    main()\r\n  File \"/tmp/ansible_Hl3xm7/ansible_module_openshift_facts.py\", line 2476, in main\r\n    protected_facts_to_overwrite)\r\n  File \"/tmp/ansible_Hl3xm7/ansible_module_openshift_facts.py\", line 1911, in __init__\r\n    protected_facts_to_overwrite)\r\n  File \"/tmp/ansible_Hl3xm7/ansible_module_openshift_facts.py\", line 1977, in generate_facts\r\n    facts = set_builddefaults_facts(facts)\r\n  File \"/tmp/ansible_Hl3xm7/ansible_module_openshift_facts.py\", line 1682, in set_builddefaults_facts\r\n    delete_empty_keys(facts['master']['admission_plugin_config']['BuildDefaults']['configuration']['env'])\r\nKeyError: 'env'\r\n",
    "rc": 0
}

MSG:

MODULE FAILURE

Comment 9 liujia 2017-09-27 10:06:02 UTC
Version:
atomic-openshift-utils-3.5.125-1.git.0.1c43b24.el7.noarch

Steps:
1. Install ocp 3.4 with builddefaults setting in inventory hosts file.
openshift_builddefaults_json={"BuildDefaults":{"configuration":{"apiVersion":"v1","nodeSelector":{"registry": "enabled"},"kind":"BuildDefaultsConfig"}}}

2. Upgrade 3.4 to 3.5 with above hosts file

The upgrade still hit the error.
TASK [openshift_facts : Gather Cluster facts and set is_containerized if needed] ********************************************************************************************
fatal: [x.x.x.x]: FAILED! => {
    "changed": false,
    "failed": true,
    "module_stderr": "Shared connection to x.x.x.x closed.\r\n",
    "module_stdout": "Traceback (most recent call last):\r\n  File \"/tmp/ansible_ZHVTSp/ansible_module_openshift_facts.py\", line 2491, in <module>\r\n    main()\r\n  File \"/tmp/ansible_ZHVTSp/ansible_module_openshift_facts.py\", line 2478, in main\r\n    protected_facts_to_overwrite)\r\n  File \"/tmp/ansible_ZHVTSp/ansible_module_openshift_facts.py\", line 1913, in __init__\r\n    protected_facts_to_overwrite)\r\n  File \"/tmp/ansible_ZHVTSp/ansible_module_openshift_facts.py\", line 1979, in generate_facts\r\n    facts = set_builddefaults_facts(facts)\r\n  File \"/tmp/ansible_ZHVTSp/ansible_module_openshift_facts.py\", line 1684, in set_builddefaults_facts\r\n    delete_empty_keys(facts['master']['admission_plugin_config']['BuildDefaults']['configuration']['env'])\r\nKeyError: 'env'\r\n",
    "rc": 0
}

Assigning back; please let me know if my steps are not correct.
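
The failure can be illustrated outside Ansible. This is a minimal sketch, assuming the facts dictionary holds the parsed openshift_builddefaults_json value under facts['master']['admission_plugin_config']; lookup_env is a hypothetical helper and not part of openshift_facts.py.

```python
import json

# Value of openshift_builddefaults_json from the inventory above (no 'env' key).
BUILDDEFAULTS_JSON = (
    '{"BuildDefaults":{"configuration":{"apiVersion":"v1",'
    '"nodeSelector":{"registry": "enabled"},"kind":"BuildDefaultsConfig"}}}'
)


def lookup_env(facts):
    """Hypothetical helper: the same chained lookup as the failing line."""
    return facts['master']['admission_plugin_config']['BuildDefaults']['configuration']['env']


# Assumption: the module stores the parsed JSON under admission_plugin_config.
facts = {'master': {'admission_plugin_config': json.loads(BUILDDEFAULTS_JSON)}}

try:
    lookup_env(facts)
    error = None
except KeyError as exc:
    # The configuration has nodeSelector but no env list, so the lookup fails.
    error = str(exc)

print(error)
```

Any BuildDefaults configuration that omits the env list takes the same path, which matches the nodeSelector-only setup used in the steps above.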

Comment 11 liujia 2017-09-28 07:43:11 UTC
Version:
atomic-openshift-utils-3.5.128-1.git.0.f183c7b.el7.noarch

Steps:
1. Install ocp 3.4 with builddefaults setting in inventory hosts file.
openshift_builddefaults_json={"BuildDefaults":{"configuration":{"apiVersion":"v1","nodeSelector":{"registry": "enabled"},"kind":"BuildDefaultsConfig"}}}

2. Upgrade 3.4 to 3.5 with above hosts file

The upgrade failed with the same error again.

Comment 13 Javier Ramirez 2017-11-02 14:20:54 UTC
Any news on when this will be included in an erratum?

Comment 15 Nicolas Nosenzo 2017-11-17 16:21:29 UTC
I have tried the temporary workaround described in comment #2:

- Comment out the line
delete_empty_keys(facts['master']['admission_plugin_config']['BuildDefaults']['configuration']['env'])
in /usr/share/ansible/openshift-ansible/roles/openshift_facts/library/openshift_facts.py

But it didn't work, we're still getting:

    "module_stdout": "Traceback (most recent call last):\r\n  File \"/tmp/ansible_6_ygCW/ansible_module_openshift_facts.py\", line 2491, in <module>\r\n    main()\r\n  File \"/tmp/ansible_6_ygCW/ansible_module_openshift_facts.py\", line 2478, in main\r\n    protected_facts_to_overwrite)\r\n  File \"/tmp/ansible_6_ygCW/ansible_module_openshift_facts.py\", line 1913, in __init__\r\n    protected_facts_to_overwrite)\r\n  File \"/tmp/ansible_6_ygCW/ansible_module_openshift_facts.py\", line 1979, in generate_facts\r\n    facts = set_builddefaults_facts(facts)\r\n  File \"/tmp/ansible_6_ygCW/ansible_module_openshift_facts.py\", line 1684, in set_builddefaults_facts\r\n    delete_empty_keys(facts['master']['admission_plugin_config']['BuildDefaults']['configuration']['env'])\r\nKeyError: 'env'\r\n",

Is there anything else we can do to skip this and move on with the upgrade?

Comment 18 Scott Dodson 2017-11-20 16:46:10 UTC
https://github.com/openshift/openshift-ansible/pull/6184 proposed fix against master

Comment 26 liujia 2017-11-21 10:49:01 UTC
@Scott

About the three cloned bugs: I tried to reproduce them but could not, because a fresh install fails when openshift_builddefaults_json is specified [1].

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1515746

Comment 27 liujia 2017-12-13 06:04:08 UTC
Add cases.

Comment 29 Scott Dodson 2018-01-09 20:16:29 UTC
This fix is in openshift-ansible-3.7.10-1, openshift-ansible-3.6.173.0.81-1, openshift-ansible-3.5.146-1 all of which have shipped.

It's also fixed in master.

