Bug 2000644
| Summary: | Invalid migration plan causes "controller" pod to crash | | |
|---|---|---|---|
| Product: | Migration Toolkit for Containers | Reporter: | Sergio <sregidor> |
| Component: | Controller | Assignee: | Jason Montleon <jmontleo> |
| Status: | CLOSED ERRATA | QA Contact: | Xin jiang <xjiang> |
| Severity: | urgent | Docs Contact: | Avital Pinnick <apinnick> |
| Priority: | urgent | | |
| Version: | 1.6.0 | CC: | ernelson, jmontleo, prajoshi, rjohnson, ssingla, whu, xjiang |
| Target Milestone: | --- | | |
| Target Release: | 1.6.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-09-29 14:35:53 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
It also happens if the malformed field is 'srcMigClusterRef'.

Probably a duplicate of this other BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1951869

We believe this to be resolved as of: https://github.com/konveyor/mig-controller/pull/1186

Verified using:
SOURCE CLUSTER: AWS OCP 3.11 (MTC 1.5.1) NFS
TARGET CLUSTER: AWS OCP 4.9 (MTC 1.6.0) OCS4

```
openshift-migration-rhel8-operator@sha256:ef00e934ed578a4acb429f8710284d10acf2cf98f38a2b2268bbea8b5fd7139c
- name: MIG_CONTROLLER_REPO
  value: openshift-migration-controller-rhel8@sha256
- name: MIG_CONTROLLER_TAG
  value: 27f465b2cd38cee37af5c3d0fd745676086fe0391e3c459d4df18dd3a12e7051
- name: MIG_UI_REPO
  value: openshift-migration-ui-rhel8@sha256
- name: MIG_UI_TAG
```
Now we get two Critical conditions and the migration controller no longer crashes.
For the destination cluster:

```yaml
status:
  conditions:
  - category: Critical
    lastTransitionTime: "2021-09-08T11:04:13Z"
    message: 'The `dstMigClusterRef` must reference a valid `migcluster`, subject: openshift-migration/foo.'
    reason: NotFound
    status: "True"
    type: InvalidDestinationClusterRef
  - category: Critical
    lastTransitionTime: "2021-09-08T11:04:13Z"
    message: 'Reconcile failed: [destination cluster not found]. See controller logs for details.'
    status: "True"
    type: ReconcileFailed
```
For the source cluster:

```yaml
status:
  conditions:
  - category: Critical
    lastTransitionTime: "2021-09-08T11:07:07Z"
    message: 'The `srcMigClusterRef` must reference a valid `migcluster`, subject: openshift-migration/foo.'
    reason: NotFound
    status: "True"
    type: InvalidSourceClusterRef
  - category: Critical
    lastTransitionTime: "2021-09-08T11:07:07Z"
    message: 'Reconcile failed: [source cluster not found]. See controller logs for details.'
    status: "True"
    type: ReconcileFailed
```
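The expected behavior shown above (a "Not ready" plan carrying Critical conditions instead of a crashing pod) amounts to scanning the plan's condition list. A minimal sketch in Go, using simplified stand-in types rather than the real mig-controller API:

```go
package main

import "fmt"

// Condition mirrors the fields shown in the MigPlan status above.
// This is an illustrative stand-in, not the actual mig-controller type.
type Condition struct {
	Category string
	Type     string
	Status   string
}

// hasCriticalCondition reports whether any active Critical condition
// is present, i.e. whether the plan should be considered "Not ready".
func hasCriticalCondition(conds []Condition) bool {
	for _, c := range conds {
		if c.Category == "Critical" && c.Status == "True" {
			return true
		}
	}
	return false
}

func main() {
	conds := []Condition{
		{Category: "Critical", Type: "InvalidSourceClusterRef", Status: "True"},
		{Category: "Critical", Type: "ReconcileFailed", Status: "True"},
	}
	fmt.Println(hasCriticalCondition(conds)) // prints: true
}
```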
Moved to VERIFIED status.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Migration Toolkit for Containers (MTC) 1.6.0 security & bugfix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:3694
Description of problem:
When a migration plan is malformed and its 'destMigClusterRef' points to a non-existent migcluster resource, the migration controller starts crashing.

Version-Release number of selected component (if applicable):
SOURCE CLUSTER: AWS 3.11 (MTC 1.5.1)
TARGET CLUSTER: AWS 4.9 (MTC 1.6.0)

How reproducible: Always

Steps to Reproduce:
1. Create an empty namespace in the source cluster:
```
$ oc new-project empty-project
```
2. Create a migration plan to migrate this namespace. Use this name for the migplan: "migplan-ocp-40171-malformed-crds"
3. Patch the migplan to make it malformed:
```
$ oc -n openshift-migration patch migplans migplan-ocp-40171-malformed-crds -p '{"spec":{"destMigClusterRef": {"name": "foo"}}}' --type='merge'
```

Actual results:
The migration controller pod starts crashing:
```
$ oc get pods
NAME                                    READY   STATUS             RESTARTS       AGE
migration-controller-6d854fd675-ntczb   1/2     CrashLoopBackOff   5 (103s ago)   6m32s
```

Expected results:
The migration plan should become "Not ready" and should have a Critical condition informing about the problem.
Additional info:
This is the crash in the migration controller pod:

```
{"level":"info","ts":1630591811310,"logger":"controller-runtime.manager.controller.migcluster-controller","msg":"Starting workers","worker count":1}
{"level":"info","ts":1630591811310,"logger":"controller-runtime.manager.controller.migplan-controller","msg":"Starting workers","worker count":1}
{"level":"info","ts":1630591811311,"logger":"controller-runtime.manager.controller.directimagemigration-controller","msg":"Starting workers","worker count":1}
{"level":"info","ts":1630591811312,"logger":"controller-runtime.manager.controller.migstorage-controller","msg":"Starting Controller"}
{"level":"info","ts":1630591811312,"logger":"controller-runtime.manager.controller.migstorage-controller","msg":"Starting workers","worker count":1}
{"level":"info","ts":1630591811313,"logger":"controller-runtime.manager.controller.miganalytic-controller","msg":"Starting workers","worker count":2}
{"level":"info","ts":1630591811313,"logger":"controller-runtime.manager.controller.directvolumemigration-controller","msg":"Starting workers","worker count":1}
{"level":"info","ts":1630591811313,"logger":"controller-runtime.manager.controller.mighook-controller","msg":"Starting workers","worker count":1}
{"level":"info","ts":1630591811313,"logger":"controller-runtime.manager.controller.directvolumemigrationprogress-controller","msg":"Starting workers","worker count":5}
{"level":"info","ts":1630591811314,"logger":"controller-runtime.manager.controller.migmigration-controller","msg":"Starting workers","worker count":1}
E0902 14:10:11.378266       1 runtime.go:78] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 556 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic(0x230f860, 0x3b865b0)
	/remote-source/app/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:74 +0x95
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/remote-source/app/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:48 +0x86
panic(0x230f860, 0x3b865b0)
	/opt/rh/go-toolset-1.16/root/usr/lib/go-toolset-1.16-golang/src/runtime/panic.go:965 +0x1b9
github.com/konveyor/mig-controller/pkg/apis/migration/v1alpha1.(*MigCluster).BuildRestConfig(0x0, 0x2a4ab40, 0xc00139bdc0, 0x25e3ce0, 0x8, 0xc00487e720)
	/remote-source/app/pkg/apis/migration/v1alpha1/migcluster_types.go:423 +0x3a
github.com/konveyor/mig-controller/pkg/apis/migration/v1alpha1.(*MigCluster).GetClient(0x0, 0x2a4ab40, 0xc00139bdc0, 0xc00139bdc0, 0x0, 0x0, 0x0)
	/remote-source/app/pkg/apis/migration/v1alpha1/migcluster_types.go:209 +0x5a
github.com/konveyor/mig-controller/pkg/controller/migplan.ReconcileMigPlan.getPotentialFilePermissionConflictNamespaces(0x2a4a868, 0xc000bb4aa0, 0x2a1fb70, 0xc00054df40, 0xc0002a8230, 0x0, 0x0, 0xc001d10500, 0x3c7c9c0, 0x0, ...)
	/remote-source/app/pkg/controller/migplan/validation.go:307 +0x2c5
github.com/konveyor/mig-controller/pkg/controller/migplan.ReconcileMigPlan.validateNamespaces(0x2a4a868, 0xc000bb4aa0, 0x2a1fb70, 0xc00054df40, 0xc0002a8230, 0x0, 0x0, 0x2a2a020, 0xc002026e70, 0xc001d10500, ...)
	/remote-source/app/pkg/controller/migplan/validation.go:451 +0x389
github.com/konveyor/mig-controller/pkg/controller/migplan.ReconcileMigPlan.validate(0x2a4a868, 0xc000bb4aa0, 0x2a1fb70, 0xc00054df40, 0xc0002a8230, 0x0, 0x0, 0x2a2a020, 0xc002026e70, 0xc001d10500, ...)
	/remote-source/app/pkg/controller/migplan/validation.go:158 +0x39c
github.com/konveyor/mig-controller/pkg/controller/migplan.(*ReconcileMigPlan).Reconcile(0xc00054df80, 0x2a2a020, 0xc002026e70, 0xc000557590, 0x13, 0xc000c09cc0, 0x20, 0xc002026e00, 0x0, 0x0, ...)
	/remote-source/app/pkg/controller/migplan/migplan_controller.go:261 +0x553
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc0008c52c0, 0x2a29f78, 0xc000702000, 0x23a4e20, 0xc001e9e240)
	/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:263 +0x30d
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc0008c52c0, 0x2a29f78, 0xc000702000, 0xc00011de00)
	/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:235 +0x205
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.1(0x2a29f78, 0xc000702000)
	/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:198 +0x4a
k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1()
	/remote-source/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185 +0x37
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0xc00011df50)
	/remote-source/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155 +0x5f
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc003e15f50, 0x29dbba0, 0xc002026db0, 0xc000702001, 0xc000bca120)
	/remote-source/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156 +0x9b
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc00011df50, 0x3b9aca00, 0x0, 0xc80a01, 0xc000bca120)
	/remote-source/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0x98
k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext(0x2a29f78, 0xc000702000, 0xc001f75de0, 0x3b9aca00, 0x0, 0x1)
	/remote-source/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185 +0xa6
k8s.io/apimachinery/pkg/util/wait.UntilWithContext(0x2a29f78, 0xc000702000, 0xc001f75de0, 0x3b9aca00)
	/remote-source/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:99 +0x57
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1
	/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:195 +0x497
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x118 pc=0x196c9da]
goroutine 556 [running]:
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/remote-source/app/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:55 +0x109
panic(0x230f860, 0x3b865b0)
	/opt/rh/go-toolset-1.16/root/usr/lib/go-toolset-1.16-golang/src/runtime/panic.go:965 +0x1b9
github.com/konveyor/mig-controller/pkg/apis/migration/v1alpha1.(*MigCluster).BuildRestConfig(0x0, 0x2a4ab40, 0xc00139bdc0, 0x25e3ce0, 0x8, 0xc00487e720)
	/remote-source/app/pkg/apis/migration/v1alpha1/migcluster_types.go:423 +0x3a
github.com/konveyor/mig-controller/pkg/apis/migration/v1alpha1.(*MigCluster).GetClient(0x0, 0x2a4ab40, 0xc00139bdc0, 0xc00139bdc0, 0x0, 0x0, 0x0)
	/remote-source/app/pkg/apis/migration/v1alpha1/migcluster_types.go:209 +0x5a
github.com/konveyor/mig-controller/pkg/controller/migplan.ReconcileMigPlan.getPotentialFilePermissionConflictNamespaces(0x2a4a868, 0xc000bb4aa0, 0x2a1fb70, 0xc00054df40, 0xc0002a8230, 0x0, 0x0, 0xc001d10500, 0x3c7c9c0, 0x0, ...)
	/remote-source/app/pkg/controller/migplan/validation.go:307 +0x2c5
github.com/konveyor/mig-controller/pkg/controller/migplan.ReconcileMigPlan.validateNamespaces(0x2a4a868, 0xc000bb4aa0, 0x2a1fb70, 0xc00054df40, 0xc0002a8230, 0x0, 0x0, 0x2a2a020, 0xc002026e70, 0xc001d10500, ...)
	/remote-source/app/pkg/controller/migplan/validation.go:451 +0x389
github.com/konveyor/mig-controller/pkg/controller/migplan.ReconcileMigPlan.validate(0x2a4a868, 0xc000bb4aa0, 0x2a1fb70, 0xc00054df40, 0xc0002a8230, 0x0, 0x0, 0x2a2a020, 0xc002026e70, 0xc001d10500, ...)
	/remote-source/app/pkg/controller/migplan/validation.go:158 +0x39c
github.com/konveyor/mig-controller/pkg/controller/migplan.(*ReconcileMigPlan).Reconcile(0xc00054df80, 0x2a2a020, 0xc002026e70, 0xc000557590, 0x13, 0xc000c09cc0, 0x20, 0xc002026e00, 0x0, 0x0, ...)
	/remote-source/app/pkg/controller/migplan/migplan_controller.go:261 +0x553
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc0008c52c0, 0x2a29f78, 0xc000702000, 0x23a4e20, 0xc001e9e240)
	/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:263 +0x30d
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc0008c52c0, 0x2a29f78, 0xc000702000, 0xc00011de00)
	/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:235 +0x205
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.1(0x2a29f78, 0xc000702000)
	/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:198 +0x4a
k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1()
	/remote-source/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185 +0x37
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0xc00011df50)
	/remote-source/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155 +0x5f
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc003e15f50, 0x29dbba0, 0xc002026db0, 0xc000702001, 0xc000bca120)
	/remote-source/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156 +0x9b
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc00011df50, 0x3b9aca00, 0x0, 0xc80a01, 0xc000bca120)
	/remote-source/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0x98
k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext(0x2a29f78, 0xc000702000, 0xc001f75de0, 0x3b9aca00, 0x0, 0x1)
	/remote-source/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185 +0xa6
k8s.io/apimachinery/pkg/util/wait.UntilWithContext(0x2a29f78, 0xc000702000, 0xc001f75de0, 0x3b9aca00)
	/remote-source/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:99 +0x57
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1
	/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:195 +0x497
```
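The trace shows `(*MigCluster).BuildRestConfig` being invoked with a nil receiver (first argument `0x0`): the lookup for the dangling cluster reference finds nothing, and the validation code dereferences the result anyway. A minimal sketch of this failure mode and the guard that avoids it (all types and names here are illustrative, not the actual mig-controller code):

```go
package main

import "fmt"

// MigCluster is a simplified stand-in for the real CRD type.
type MigCluster struct {
	Name string
}

// BuildRestConfig dereferences the receiver, so calling it on a nil
// *MigCluster panics with "invalid memory address or nil pointer
// dereference" -- the crash seen in the stack trace above.
func (c *MigCluster) BuildRestConfig() string {
	return "config-for-" + c.Name
}

// getCluster mimics a lookup that returns nil when the referenced
// migcluster does not exist ("foo" is the dangling reference used
// in the reproduction steps).
func getCluster(name string) *MigCluster {
	if name == "foo" {
		return nil
	}
	return &MigCluster{Name: name}
}

func main() {
	cluster := getCluster("foo")

	// Without a nil check, cluster.BuildRestConfig() would panic and
	// crash the controller. Guarding the lookup lets the reconciler
	// surface a Critical condition instead.
	if cluster == nil {
		fmt.Println("InvalidDestinationClusterRef: NotFound")
		return
	}
	fmt.Println(cluster.BuildRestConfig())
}
```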