Bug 1861904

Summary: Plan validation GVK warnings should not show kinds that are ignored via exclude_resources
Product: OpenShift Container Platform Reporter: Erik Nelson <ernelson>
Component: Migration ToolingAssignee: Derek Whatley <dwhatley>
Status: CLOSED ERRATA QA Contact: Xin jiang <xjiang>
Severity: low Docs Contact:
Priority: unspecified    
Version: 4.4CC: chezhang, mberube, rjohnson, sregidor, whu
Target Milestone: ---   
Target Release: 4.4.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1861908 (view as bug list) Environment:
Last Closed: 2020-08-05 10:51:25 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1861908    
Bug Blocks:    

Description Erik Nelson 2020-07-29 21:00:08 UTC
Description of problem:
The GVK warnings should not include warnings that the controller has been configured to ignore via exclude_resources, which is set on the controller by the operator as an environment variable. This could potentially lead to user confusion.

For example, by default, the CAM 1.2.4 operator excludes all kinds that are part of the service catalog's group. Often the GVK warning will trigger for service catalog objects, but as of 1.2.4, they're being ignored anyway, so we shouldn't warn.

Version-Release number of selected component (if applicable):
CAM 1.2.4

How reproducible:
Every time

Steps to Reproduce:
1. Use a source cluster that would trigger a GVK warning due to service catalog resources
2. Deploy CAM with 1.2.4 operator, this will have the service catalog resources as part of exclude_resources

Actual results:
catalog GVK warning will be written to the CR

Expected results:
catalog GVK warning should not be present, because the controller is configured to ignore them.

Comment 2 Derek Whatley 2020-07-31 19:19:39 UTC
PR is up, under review. https://github.com/konveyor/mig-controller/pull/611

Comment 3 Derek Whatley 2020-07-31 19:45:23 UTC
PR is merged, cherry-picked to release-1.2.4 branch. 

https://github.com/konveyor/mig-controller/commit/a562f9501179a1b9b6cf475f1a4fdf4bc045f721

Comment 7 Xin jiang 2020-08-03 09:30:26 UTC
LGTM

Comment 8 Sergio 2020-08-03 09:52:00 UTC
Using 1.2.4 to verify the issue

Image: openshift-migration-rhel7-operator@sha256:5d74961e57f46100a12ecc717cd4d7242d5bcbc590d1489232fade937943b76a

    - name: MIG_CONTROLLER_REPO
      value: openshift-migration-controller-rhel8@sha256
    - name: MIG_CONTROLLER_TAG
      value: ed0bf64f09fec40963eca9c0b78677591511edd1d6ea6ba57677a7dfdbeb0045
    - name: MIG_UI_REPO
      value: openshift-migration-ui-rhel8@sha256
    - name: MIG_UI_TAG
      value: 6abfaea8ac04e3b5bbf9648a3479b420b4baec35201033471020c9cae1fe1e11
    - name: MIGRATION_REGISTRY_REPO
      value: openshift-migration-registry-rhel8@sha256
    - name: MIGRATION_REGISTRY_TAG
      value: 37536b4487d3668a7105737695a0651e6be64720bc72a69da74153a8443ac9e1
    - name: VELERO_REPO
      value: openshift-migration-velero-rhel8@sha256
    - name: VELERO_TAG
      value: 461ea0c165ed525d4276056f6aab879dcf011facb00e94acc88ae6e9f33f1637
    - name: VELERO_PLUGIN_REPO
      value: openshift-migration-plugin-rhel8@sha256
    - name: VELERO_PLUGIN_TAG
      value: 40fee0819d750149b282b58019f4a118e296a754414fceaa4a1162deebee4898



When we deploy an application using the catalog and we migrate it, we get the following status in the migplan:

status:
  conditions:
  - category: Warn
    lastTransitionTime: "2020-08-03T09:43:37Z"
    message: 'Some namespaces contain GVKs incompatible with destination cluster.
      See: `incompatibleNamespaces` for details'
    status: "True"
    type: GVKsIncompatible
  - category: Required
    lastTransitionTime: "2020-08-03T09:43:38Z"
    message: The `persistentVolumes` list has been updated with discovered PVs.
    reason: Done
    status: "True"
    type: PvsDiscovered
  - category: Required
    lastTransitionTime: "2020-08-03T09:43:39Z"
    message: The storage resources have been created.
    status: "True"
    type: StorageEnsured
  - category: Required
    lastTransitionTime: "2020-08-03T09:43:42Z"
    message: The migration registry resources have been created.
    status: "True"
    type: RegistriesEnsured
  - category: Required
    lastTransitionTime: "2020-08-03T09:43:42Z"
    message: The migration plan is ready.
    status: "True"
    type: Ready
  excludedResources:
  - imagetags
  - templateinstances
  - clusterserviceversions
  - packagemanifests
  - subscriptions
  - servicebrokers
  - servicebindings
  - serviceclasses
  - serviceinstances
  - serviceplans
  incompatibleNamespaces:
  - gvks:
    - group: servicecatalog.k8s.io
      kind: serviceinstances
      version: v1beta1
    name: ocp-myservice
  observedDigest: d59560acff2ec81237d28633cdd45d15f6c89a80a8bda568e91179eb360fe6cf



There is a warning for serviceinstances even if this kind of resources are ignored. We move the BZ to ASSIGNED status.

Comment 9 Derek Whatley 2020-08-03 22:18:05 UTC
I was able to verify that there is a mis-ordering of events causing this behavior. 


===================
Steps to reproduce:
===================
 - Create a MigPlan. GVK incompat diff will be calculated early in validations. This step reads from 'Plan.Status.ExcludedResources', which is apparently not filled out until later in reconcile
 - First run of GVK diff doesn't take into account the 'Plan.Status.ExcludedResources' since the list didn't arrive until after GVK diff ran
 - Force another reconcile on MigPlan, now that 'Plan.Status.ExcludedResources' is present, GVK diff will produce the expected output for `Plan.Status. IncompatibleNamespaces'


I'm working up a fix now.

Comment 10 Derek Whatley 2020-08-03 22:35:28 UTC
To illustrate the issue:

---------------------------
One moment I see this:
---------------------------

  status:
    conditions:
    [...]
    excludedResources:
    - imagetags
    incompatibleNamespaces:
    - gvks:
      - group: servicecatalog.k8s.io
        kind: serviceinstances
        version: v1beta1
      name: bz1861904


---------------------------
The next moment I see this:
---------------------------

  status:
    conditions:
    [...]
    excludedResources:
    - imagetags
    incompatibleNamespaces:
    excludedResources:
    - imagetags
    - templateinstances
    - clusterserviceversions
    - packagemanifests
    - subscriptions
    - servicebrokers
    - servicebindings
    - serviceclasses
    - serviceinstances
    - serviceplans
    incompatibleNamespaces:
    - gvks:
      - group: servicecatalog.k8s.io
        kind: serviceinstances
        version: v1beta1
      name: bz1861904


---------------------------
And if I modify the PV action:
---------------------------

  status:
    conditions:
    [...]
    excludedResources:
    - imagetags
    incompatibleNamespaces:
    excludedResources:
    - imagetags
    - templateinstances
    - clusterserviceversions
    - packagemanifests
    - subscriptions
    - servicebrokers
    - servicebindings
    - serviceclasses
    - serviceinstances
    - serviceplans

Comment 11 Derek Whatley 2020-08-03 23:02:13 UTC
Posted new fix PR, waiting on review. https://github.com/konveyor/mig-controller/pull/617

Comment 12 Derek Whatley 2020-08-03 23:12:19 UTC
PR merged and cherry-picked to release-1.2.4 branch of mig-controller.

----------------
serviceinstance
----------------

oc get serviceinstances -n bz1861904 -o yaml | head                                                                                                                                
apiVersion: v1
items:
- apiVersion: servicecatalog.k8s.io/v1beta1
  kind: ServiceInstance
  metadata:
    creationTimestamp: 2020-07-30T17:15:15Z
    finalizers:
    - kubernetes-incubator/service-catalog
    generateName: mysql-persistent-
    generation: 1


-----------------
migplan status
-----------------
  status:
    conditions:
    - category: Required
      lastTransitionTime: 2020-08-03T23:10:02Z
      message: The `persistentVolumes` list has been updated with discovered PVs.
      reason: Done
      status: "True"
      type: PvsDiscovered
    - category: Required
      lastTransitionTime: 2020-08-03T23:10:04Z
      message: The storage resources have been created.
      status: "True"
      type: StorageEnsured
    - category: Required
      lastTransitionTime: 2020-08-03T23:10:06Z
      message: The migration registry resources have been created.
      status: "True"
      type: RegistriesEnsured
    - category: Required
      lastTransitionTime: 2020-08-03T23:10:06Z
      message: The migration plan is ready.
      status: "True"
      type: Ready
    excludedResources:
    - imagetags
    - templateinstances
    - clusterserviceversions
    - packagemanifests
    - subscriptions
    - servicebrokers
    - servicebindings
    - serviceclasses
    - serviceinstances
    - serviceplans

Comment 15 Xin jiang 2020-08-04 06:27:31 UTC
Verified with the latest images.

----------------
serviceinstance
----------------
$ oc get serviceinstances -o yaml -n test | head
apiVersion: v1
items:
- apiVersion: servicecatalog.k8s.io/v1beta1
  kind: ServiceInstance
  metadata:
    creationTimestamp: 2020-08-04T06:13:59Z
    finalizers:
    - kubernetes-incubator/service-catalog
    generateName: mariadb-ephemeral-
    generation: 1


-----------------
migplan status
-----------------
  spec:
    destMigClusterRef:
      name: host
      namespace: openshift-migration
    migStorageRef:
      name: automatic
      namespace: openshift-migration
    namespaces:
    - test
    srcMigClusterRef:
      name: source-cluster
      namespace: openshift-migration
  status:
    conditions:
    - category: Required
      lastTransitionTime: "2020-08-04T06:23:47Z"
      message: The `persistentVolumes` list has been updated with discovered PVs.
      reason: Done
      status: "True"
      type: PvsDiscovered
    - category: Required
      lastTransitionTime: "2020-08-04T06:23:48Z"
      message: The storage resources have been created.
      status: "True"
      type: StorageEnsured
    - category: Required
      lastTransitionTime: "2020-08-04T06:23:49Z"
      message: The migration registry resources have been created.
      status: "True"
      type: RegistriesEnsured
    - category: Required
      lastTransitionTime: "2020-08-04T06:23:49Z"
      message: The migration plan is ready.
      status: "True"
      type: Ready
    excludedResources:
    - imagetags
    - templateinstances
    - clusterserviceversions
    - packagemanifests
    - subscriptions
    - servicebrokers
    - servicebindings
    - serviceclasses
    - serviceinstances
    - serviceplans
    observedDigest: 862edc9a279a46fe6a08620d034d8a2768ca22d52f28df542d31223d26850531
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

Comment 17 errata-xmlrpc 2020-08-05 10:51:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Cluster Application Migration (CAM) Tool Image Release Advisory 1.2.4), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:3320