Bug 1857212

Summary: install-config is not a reliable way to determine the cluster version, can cause operator failure
Product: Migration Toolkit for Containers
Reporter: Erik Nelson <ernelson>
Component: General
Assignee: John Matthews <jmatthew>
Status: VERIFIED ---
QA Contact: Xin jiang <xjiang>
Severity: unspecified
Docs Contact: Avital Pinnick <apinnick>
Priority: unspecified
Version: 1.3.0
CC: jmatthew, sregidor, xjiang
Target Milestone: ---   
Target Release: 1.3.z   
Hardware: Unspecified   
OS: Unspecified   
Clone Of: 1857211 Environment:
Bug Depends On: 1857211    
Bug Blocks:    

Description Erik Nelson 2020-07-15 13:09:42 UTC
+++ This bug was initially created as a clone of Bug #1857211 +++

Description of problem:
We currently use the "install-config" key of the kube-system/cluster-config-v1 config map to determine the cluster version (https://github.com/konveyor/mig-operator/blob/release-1.2.3/roles/migrationcontroller/tasks/main.yml#L62).

This is not reliable: there is no guarantee that the install-config key will exist. Operator failures with the following error have been reported in customer clusters:

TASK [migrationcontroller : set_fact] ******************************************
task path: /opt/ansible/roles/migrationcontroller/tasks/main.yml:62
fatal: [localhost]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'dict object' has no attribute 'metadata'\n\nThe error appears to be in '/opt/ansible/roles/migrationcontroller/tasks/main.yml': line 62, column 7, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n    - set_fact:\n      ^ here\n"}

TASK [migrationcontroller : k8s_status]

Also see the failed lookup: https://github.com/konveyor/mig-operator/blob/release-1.2.3/roles/migrationcontroller/tasks/main.yml#L68
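
To illustrate the failure mode, here is a minimal Python sketch of a lookup that tolerates a missing config map or a missing install-config key instead of failing. It is not the operator's code (the operator is an Ansible role). The function name get_install_config and the dict shape of the lookup result are assumptions made for illustration only.

```python
from typing import Any, Mapping, Optional

# Key named in this report; everything else below is hypothetical.
INSTALL_CONFIG_KEY = "install-config"


def get_install_config(lookup_result: Optional[Mapping[str, Any]]) -> Optional[str]:
    """Return the raw install-config text, or None when it is unavailable.

    lookup_result is assumed to be the config map as a plain dict, or an
    empty dict / None when the config map does not exist.
    """
    if not lookup_result:
        # Config map missing entirely: nothing to read.
        return None
    data = lookup_result.get("data") or {}
    value = data.get(INSTALL_CONFIG_KEY)
    # Treat a missing, non-string, or blank value as "unavailable".
    if isinstance(value, str) and value.strip():
        return value
    return None
```

A caller would treat a None result as "install-config is unavailable" and fall back to another version source rather than aborting the task.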

We have a more reliable source proposed here: https://github.com/konveyor/mig-operator/issues/380

Version-Release number of selected component (if applicable):
1.2.3

How reproducible:
It is not yet known what causes the install-config key to be missing, but when it is, the operator fails on this task every time.

Comment 2 Sergio 2020-11-16 17:19:05 UTC
Verified using MTC 1.3.2

openshift-migration-rhel7-operator@sha256:fb3b30d76eaf0f87220e2fddd34905f5b4ec544b1e5bb7481124a684a62739ff


MTC installed without problems after the install-config key was deleted from the kube-system/cluster-config-v1 config map. Installation was also verified after deleting the config map entirely.