Bug 1847191 - CAM cluster version detection not working for some OCP 3.9 clusters "/version" endpoint Major/Minor fields missing
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Migration Tooling
Version: 3.9.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.4.z
Assignee: Derek Whatley
QA Contact: Xin jiang
URL:
Whiteboard:
Depends On: 1846521
Blocks:
Reported: 2020-06-15 21:01 UTC by John Matthews
Modified: 2020-06-30 06:54 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1846521
Environment:
Last Closed: 2020-06-30 06:54:18 UTC
Target Upstream Version:
Embargoed:



Links
System                  ID              Private  Priority  Status  Summary  Last Updated
Red Hat Product Errata  RHBA-2020:2765  0        None      None    None     2020-06-30 06:54:20 UTC

Description John Matthews 2020-06-15 21:01:37 UTC
+++ This bug was initially created as a clone of Bug #1846521 +++

Description of problem:
After providing MigCluster connection details, CAM hits a "Reconcile failed" error, shown on the MigCluster, caused by: "error":"strconv.Atoi: parsing \"\": invalid syntax"

This problem arises on clusters where the k8s major/minor version fields are empty in the response from the k8s `/version` endpoint:

Response Body: {
  "major": "",
  "minor": "",
  "gitVersion": "v1.9.1+a0ce1bc657",
  "gitCommit": "a0ce1bc",
  "gitTreeState": "clean",
  "buildDate": "2018-04-26T16:48:23Z",
  "goVersion": "go1.9.4",
  "compiler": "gc",
  "platform": "linux/amd64"
}
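
The failure can be reproduced outside CAM with a minimal Go sketch. It assumes the version fields are converted with strconv.Atoi, as the error message suggests. The versionInfo type and the parseMinorNaive helper are hypothetical names used for illustration; they are not taken from the CAM source.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// versionInfo holds the fields of the /version response that matter here.
// Hypothetical type for illustration only.
type versionInfo struct {
	Major      string `json:"major"`
	Minor      string `json:"minor"`
	GitVersion string `json:"gitVersion"`
}

// parseMinorNaive decodes the response body and converts the "minor"
// field directly, without guarding against an empty string.
func parseMinorNaive(body []byte) (int, error) {
	var v versionInfo
	if err := json.Unmarshal(body, &v); err != nil {
		return 0, err
	}
	return strconv.Atoi(v.Minor)
}

func main() {
	// Trimmed copy of the response body reported above.
	body := []byte(`{"major": "", "minor": "", "gitVersion": "v1.9.1+a0ce1bc657"}`)
	_, err := parseMinorNaive(body)
	fmt.Println(err) // strconv.Atoi: parsing "": invalid syntax
}
```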


Version-Release number of selected component (if applicable):
CAM v1.2.0
OCP 3.9.27 (other OCP 3.9 versions also affected)

How reproducible:
Always, on select OCP 3.9 clusters where kube major/minor versions aren't available (known to fail on OCP 3.9.27)

Steps to Reproduce:
1. Install CAM 1.2.0
2. Go to web UI (CLI also works) 
3. Set up a MigCluster ("Cluster" in web UI) using connection details of an OCP 3.9.27 cluster
4. Verify that an error condition is raised on the MigCluster due to reconcile failure. The error message will mention the "atoi" conversion.

Actual results:
Reconcile fails on the MigCluster, which blocks all other actions on this MigCluster.

Expected results:
Reconcile works as expected on the MigCluster, and migrations can be performed.


Additional info:

This is a known issue with some OCP 3.9 versions: https://github.com/openshift/origin/pull/19731 

We are developing a fix to overcome the missing major/minor version fields by instead parsing gitVersion, which appears to still be available on OCP 3.9 clusters missing the major/minor fields.
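
The gitVersion fallback could look roughly like the Go sketch below. This is an illustration of the idea only, not the code from the actual fix. The majorMinor function, the gitVersionRe pattern, and the assumption that gitVersion always starts with "v<major>.<minor>" are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// gitVersionRe captures the leading major and minor numbers of a
// gitVersion string such as "v1.9.1+a0ce1bc657". Assumed format.
var gitVersionRe = regexp.MustCompile(`^v?(\d+)\.(\d+)`)

// majorMinor prefers the explicit major/minor fields and falls back to
// parsing gitVersion when either field is empty or not a plain integer.
func majorMinor(major, minor, gitVersion string) (int, int, error) {
	maj, errMaj := strconv.Atoi(major)
	mnr, errMnr := strconv.Atoi(minor)
	if errMaj == nil && errMnr == nil {
		return maj, mnr, nil
	}

	m := gitVersionRe.FindStringSubmatch(gitVersion)
	if m == nil {
		return 0, 0, fmt.Errorf("cannot determine version from gitVersion %q", gitVersion)
	}
	if maj, errMaj = strconv.Atoi(m[1]); errMaj != nil {
		return 0, 0, errMaj
	}
	if mnr, errMnr = strconv.Atoi(m[2]); errMnr != nil {
		return 0, 0, errMnr
	}
	return maj, mnr, nil
}

func main() {
	// Values from the OCP 3.9.27 response body in this report.
	maj, mnr, err := majorMinor("", "", "v1.9.1+a0ce1bc657")
	fmt.Println(maj, mnr, err) // 1 9 <nil>
}
```

Trying the explicit fields first keeps the behavior unchanged on clusters that populate them, so the fallback only takes effect on the affected clusters.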

--- Additional comment from Derek Whatley on 2020-06-15 19:56:27 UTC ---

Plugin PR: https://github.com/konveyor/openshift-migration-plugin/pull/65
Controller PR: https://github.com/konveyor/mig-controller/pull/567

--- Additional comment from Derek Whatley on 2020-06-15 20:02:21 UTC ---

PRs are merged.

Comment 1 Derek Whatley 2020-06-17 14:29:46 UTC
We have confirmed that this issue isn't present in 3.9.99+. 

Note to QE: Make sure you're using an old enough version of OCP (3.9.27 is known to exhibit the issue) when verifying the fix.

Comment 11 whu 2020-06-19 14:15:13 UTC
This bug has been fixed in the CAM 1.2.3 stage image.

The image information:
      openshift-migration-rhel7-operator@sha256:58b41647b27dfc1791bacedd998b02725bce6ffe5ae9577c7b33a0cc9e33a408
    - name: HOOK_RUNNER_REPO
      value: openshift-migration-hook-runner-rhel7@sha256
    - name: HOOK_RUNNER_TAG
      value: 86a048f0ee9726b4331d10190dc5851330b66c0326d94652ac07f33a501ae323
    - name: MIG_CONTROLLER_REPO
      value: openshift-migration-controller-rhel8@sha256
    - name: MIG_CONTROLLER_TAG
      value: b7eadaaae8f2173328aa4782795e0911ac1e546d7d3dd72d4eb36e855fd4c6bf
    - name: MIG_UI_REPO
      value: openshift-migration-ui-rhel8@sha256
    - name: MIG_UI_TAG
      value: 6abfaea8ac04e3b5bbf9648a3479b420b4baec35201033471020c9cae1fe1e11
    - name: MIGRATION_REGISTRY_REPO
      value: openshift-migration-registry-rhel8@sha256
    - name: MIGRATION_REGISTRY_TAG
      value: ea6301a15277d448c8756881c7e2e712893ca8041c913476640f52da9e76cad9
    - name: VELERO_REPO
      value: openshift-migration-velero-rhel8@sha256
    - name: VELERO_TAG
      value: 1a33e327dd610f0eebaaeae5b3c9b4170ab5db572b01a170be35b9ce946c0281
    - name: VELERO_PLUGIN_REPO
      value: openshift-migration-plugin-rhel8@sha256
    - name: VELERO_PLUGIN_TAG
      value: 8dbf92e2f0de49049cb376e6941ab49846ed122b6b9328881fe490fb0905fa38
    - name: VELERO_AWS_PLUGIN_REPO
      value: openshift-migration-velero-plugin-for-aws-rhel8@sha256
    - name: VELERO_AWS_PLUGIN_TAG
      value: 22c58f575ce2f54bf995fced82f89ba173329d9b88409cf371122f9ae8cabda1
    - name: VELERO_GCP_PLUGIN_REPO
      value: openshift-migration-velero-plugin-for-gcp-rhel8@sha256
    - name: VELERO_GCP_PLUGIN_TAG
      value: 37c0b170d168fcebb104e465621e4ce97515d82549cd37cb42be94e3e55a4271
    - name: VELERO_AZURE_PLUGIN_REPO
      value: openshift-migration-velero-plugin-for-microsoft-azure-rhel8@sha256
    - name: VELERO_AZURE_PLUGIN_TAG
      value: dd92ad748a84754e5d78287e29576a5b95448e929824e86e80c60857d0c7aff9
    - name: VELERO_RESTIC_RESTORE_HELPER_REPO
      value: openshift-migration-velero-restic-restore-helper-rhel8@sha256
    - name: VELERO_RESTIC_RESTORE_HELPER_TAG
      value: e9459138ec3531eefbefa181dae3fd93fe5cf210b2a0bd3bca7ba38fbec97f60


source cluster information:
$ oc version
oc v3.9.0+191fece
kubernetes v1.9.1+a0ce1bc657
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://master.sregidor-ocp3-tgt.qe.devcluster.openshift.com:443
openshift v3.9.27
kubernetes v1.9.1+a0ce1bc657


After adding the source cluster to CAM:

$ oc get migcluster source-cluster  -o yaml -n openshift-migration
apiVersion: migration.openshift.io/v1alpha1
kind: MigCluster
metadata:
  annotations:
    openshift.io/touch: cc4dc2f7-b235-11ea-8861-0a580a81020a
  creationTimestamp: "2020-06-19T14:04:29Z"
  generation: 2
  name: source-cluster
  namespace: openshift-migration
  resourceVersion: "227365"
  selfLink: /apis/migration.openshift.io/v1alpha1/namespaces/openshift-migration/migclusters/source-cluster
  uid: 99e5072c-f9cf-4095-a2e1-ff3b1d4e5a7e
spec:
  insecure: true
  isHostCluster: false
  serviceAccountSecretRef:
    name: source-cluster-7722m
    namespace: openshift-config
  url: https://master.sregidor-ocp3-tgt.qe.devcluster.openshift.com:443
status:
  conditions:
  - category: Required
    lastTransitionTime: "2020-06-19T14:04:30Z"
    message: The cluster is ready.
    status: "True"
    type: Ready
  observedDigest: 53f57da30f388652c5febfed3700b3346aaf470d0b52a6a1efaa34370d3c9cb2

Comment 13 errata-xmlrpc 2020-06-30 06:54:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2765

