Bug 2137301 - Migration stuck on DirectVolume stage during storage class conversion on source cluster
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Migration Toolkit for Containers
Classification: Red Hat
Component: Controller
Version: 1.7.5
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: 1.7.5
Assignee: Pranav Gaikwad
QA Contact: mohamed
Docs Contact: Anjana Suparna Sriram
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-10-24 12:00 UTC by mohamed
Modified: 2022-10-31 11:10 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-10-31 11:10:35 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
SCC migration steps (54.74 KB, image/png)
2022-10-24 12:00 UTC, mohamed


Links
System ID Private Priority Status Summary Last Updated
Github migtools mig-controller pull 1321 0 None Merged Bug 2137301: Fix references to client 2022-10-24 15:46:59 UTC
Github migtools mig-controller pull 1322 0 None Merged [release-1.7.5] Bug 2137301: fix references to client (#1321) 2022-10-24 16:16:03 UTC
Red Hat Product Errata RHBA-2022:7262 0 None None None 2022-10-31 11:10:40 UTC

Description mohamed 2022-10-24 12:00:47 UTC
Created attachment 1919952 [details]
SCC migration steps

Description of problem:

Performing an SCC on the source cluster causes the migration controller to go into a CrashLoopBackOff state.

The migration is stuck indefinitely on the "Creating Rsync daemon pods on the target cluster" stage, and a 500 error message is displayed to the user.

Creating another type of migration is not possible after this occurs, because the creation wizard fails to fetch the namespaces.


Version-Release number of selected component (if applicable):

GCP 4.6 -> GCP 4.9

How reproducible:
Always

Steps to Reproduce:
1. Create a new storage class conversion (SCC) plan on the source cluster
2. Perform the migration
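For context, a storage class conversion in MTC is normally generated by the UI wizard as a MigPlan whose source and destination refer to the same cluster. The sketch below is illustrative only: the cluster ref names and the namespace are assumptions, not taken from this report (only the plan name "test-sc" and the MigStorage "automatic" appear in the logs later).

```yaml
apiVersion: migration.openshift.io/v1alpha1
kind: MigPlan
metadata:
  name: test-sc                  # plan name used in this report
  namespace: openshift-migration
spec:
  # For a storage class conversion, both cluster refs point at the same cluster.
  srcMigClusterRef:
    name: host                   # assumption: in-cluster MigCluster name
    namespace: openshift-migration
  destMigClusterRef:
    name: host
    namespace: openshift-migration
  migStorageRef:
    name: automatic              # matches the MigStorage seen in the logs
    namespace: openshift-migration
  namespaces:
    - my-app                     # assumption: namespace whose PVCs are converted
```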

Actual results:
1. The migration gets stuck on the direct volume stage (Creating Rsync daemon pods on the target cluster)

2. When navigating to the migration details page, the following message is displayed to the user: "Danger alert:Error loading debug data for plan "test-sc"
Request failed with status code 500"

3. When trying to create a new migration plan (full, state, or another SCC), the creation fails because the wizard cannot fetch the namespaces.

Expected results:
The migration should complete successfully without errors.

Additional info:

$ oc -n openshift-migration get pods

NAME                                               READY   STATUS             RESTARTS        AGE
migration-controller-86bdfc89b9-64r69              1/2     CrashLoopBackOff   20 (2m3s ago)   165m
migration-log-reader-7cfdbcf68b-xjddq              2/2     Running            0               165m
migration-operator-ff767bd8f-c67gw                 1/1     Running            0               166m
migration-ui-c6cd68c7c-2vg2t                       1/1     Running            0               165m
openshift-adp-controller-manager-6cfbdb444-kxcnc   1/1     Running            0               166m
restic-88j7c                                       1/1     Running            0               166m
restic-hzktl                                       1/1     Running            0               166m
restic-p9gkh                                       1/1     Running            0               166m
restic-q7kvg                                       1/1     Running            0               166m
restic-x2g97                                       1/1     Running            0               166m
velero-78c677bddb-tvd9t                            1/1     Running            0               166m

from the logs:

{"level":"info","ts":1666612415.4290595,"logger":"directvolume","msg":"[RUN] (Step 15/20) Creating Rsync daemon pods on the target cluster","dvm":"migration-6346a-6wvv7","migMigration":"migration-6346a","phase":"CreateRsyncTransferPods"}
{"level":"info","ts":1666612415.4791296,"logger":"directvolume","msg":"Getting Rsync Password from Secret on host MigCluster","dvm":"migration-6346a-6wvv7","migMigration":"migration-6346a","phase":"CreateRsyncTransferPods","secret":""}
{"level":"info","ts":1666612415.6127393,"logger":"storage","msg":"CR","migStorage":"automatic","conditions":{"conditions":[{"type":"Ready","status":"True","category":"Required","message":"The storage is ready.","lastTransitionTime":"2022-10-24T09:11:11Z"}]}}
E1024 11:53:35.896631       1 runtime.go:78] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 716 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic({0x1fb2100, 0x3a3f710})
	/remote-source/mig-controller/app/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:74 +0x85
k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0xc0035fdac0})
	/remote-source/mig-controller/app/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:48 +0x75
panic({0x1fb2100, 0x3a3f710})
	/usr/lib/golang/src/runtime/panic.go:1038 +0x215
github.com/konveyor/mig-controller/pkg/controller/directvolumemigration.(*Task).getSecurityContext(0xc00558b988, {0x27068c8, 0xc00423ed80}, {0xc002233bc2, 0x31}, 0x0)
	/remote-source/mig-controller/app/pkg/controller/directvolumemigration/rsync.go:231 +0xf5
github.com/konveyor/mig-controller/pkg/controller/directvolumemigration.(*Task).getRsyncTransferServerMutations(0xc00558b988, {0x27068c8, 0xc00423ed80}, {0xc002233bc2, 0x31})
	/remote-source/mig-controller/app/pkg/controller/directvolumemigration/rsync.go:198 +0x16a
github.com/konveyor/mig-controller/pkg/controller/directvolumemigration.(*Task).ensureRsyncTransferServer(0x269a420)
	/remote-source/mig-controller/app/pkg/controller/directvolumemigration/rsync.go:355 +0x55a
github.com/konveyor/mig-controller/pkg/controller/directvolumemigration.(*Task).Run(0xc00558b988, {0x269a420, 0xc0025fb1a0})
	/remote-source/mig-controller/app/pkg/controller/directvolumemigration/task.go:330 +0xb9a
github.com/konveyor/mig-controller/pkg/controller/directvolumemigration.(*ReconcileDirectVolumeMigration).migrate(0xc000dd5c80, {0x269a420, 0xc0025fb1a0}, 0xc001615900)
	/remote-source/mig-controller/app/pkg/controller/directvolumemigration/migrate.go:46 +0x265
github.com/konveyor/mig-controller/pkg/controller/directvolumemigration.(*ReconcileDirectVolumeMigration).Reconcile(0xc000dd5c80, {0x269a420, 0xc0025fb1a0}, {{{0xc0027917d0, 0x20d44c0}, {0xc0027917b8, 0xc0037407c0}}})
	/remote-source/mig-controller/app/pkg/controller/directvolumemigration/directvolumemigration_controller.go:148 +0x734
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc000985040, {0x269a378, 0xc00060e000}, {0x2042be0, 0xc0035fdac0})
	/remote-source/mig-controller/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:263 +0x303
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc000985040, {0x269a378, 0xc00060e000})
	/remote-source/mig-controller/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:235 +0x205
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.1({0x269a378, 0xc00060e000})
	/remote-source/mig-controller/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:198 +0x46
k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1()
	/remote-source/mig-controller/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185 +0x25
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0x7fec02414e08)
	/remote-source/mig-controller/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155 +0x67
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0x0, {0x2659240, 0xc0025fb0e0}, 0x1, 0xc000c0e120)
	/remote-source/mig-controller/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156 +0xb6
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0x0, 0x3b9aca00, 0x0, 0x0, 0x0)
	/remote-source/mig-controller/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0x89
k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext({0x269a378, 0xc00060e000}, 0xc003587fd0, 0x0, 0x0, 0x0)
	/remote-source/mig-controller/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185 +0x99
k8s.io/apimachinery/pkg/util/wait.UntilWithContext({0x269a378, 0xc00060e000}, 0x5c206d, 0x1c)
	/remote-source/mig-controller/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:99 +0x2b
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1
	/remote-source/mig-controller/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:195 +0x4b0
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x128 pc=0x1a5c1f5]

goroutine 716 [running]:
k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0xc0035fdac0})
	/remote-source/mig-controller/app/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:55 +0xd8
panic({0x1fb2100, 0x3a3f710})
	/usr/lib/golang/src/runtime/panic.go:1038 +0x215
github.com/konveyor/mig-controller/pkg/controller/directvolumemigration.(*Task).getSecurityContext(0xc00558b988, {0x27068c8, 0xc00423ed80}, {0xc002233bc2, 0x31}, 0x0)
	/remote-source/mig-controller/app/pkg/controller/directvolumemigration/rsync.go:231 +0xf5
github.com/konveyor/mig-controller/pkg/controller/directvolumemigration.(*Task).getRsyncTransferServerMutations(0xc00558b988, {0x27068c8, 0xc00423ed80}, {0xc002233bc2, 0x31})
	/remote-source/mig-controller/app/pkg/controller/directvolumemigration/rsync.go:198 +0x16a
github.com/konveyor/mig-controller/pkg/controller/directvolumemigration.(*Task).ensureRsyncTransferServer(0x269a420)
	/remote-source/mig-controller/app/pkg/controller/directvolumemigration/rsync.go:355 +0x55a
github.com/konveyor/mig-controller/pkg/controller/directvolumemigration.(*Task).Run(0xc00558b988, {0x269a420, 0xc0025fb1a0})
	/remote-source/mig-controller/app/pkg/controller/directvolumemigration/task.go:330 +0xb9a
github.com/konveyor/mig-controller/pkg/controller/directvolumemigration.(*ReconcileDirectVolumeMigration).migrate(0xc000dd5c80, {0x269a420, 0xc0025fb1a0}, 0xc001615900)
	/remote-source/mig-controller/app/pkg/controller/directvolumemigration/migrate.go:46 +0x265
github.com/konveyor/mig-controller/pkg/controller/directvolumemigration.(*ReconcileDirectVolumeMigration).Reconcile(0xc000dd5c80, {0x269a420, 0xc0025fb1a0}, {{{0xc0027917d0, 0x20d44c0}, {0xc0027917b8, 0xc0037407c0}}})
	/remote-source/mig-controller/app/pkg/controller/directvolumemigration/directvolumemigration_controller.go:148 +0x734
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc000985040, {0x269a378, 0xc00060e000}, {0x2042be0, 0xc0035fdac0})
	/remote-source/mig-controller/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:263 +0x303
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc000985040, {0x269a378, 0xc00060e000})
	/remote-source/mig-controller/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:235 +0x205
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.1({0x269a378, 0xc00060e000})
	/remote-source/mig-controller/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:198 +0x46
k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1()
	/remote-source/mig-controller/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185 +0x25
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0x7fec02414e08)
	/remote-source/mig-controller/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155 +0x67
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0x0, {0x2659240, 0xc0025fb0e0}, 0x1, 0xc000c0e120)
	/remote-source/mig-controller/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156 +0xb6
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0x0, 0x3b9aca00, 0x0, 0x0, 0x0)
	/remote-source/mig-controller/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0x89
k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext({0x269a378, 0xc00060e000}, 0xc003587fd0, 0x0, 0x0, 0x0)
	/remote-source/mig-controller/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185 +0x99
k8s.io/apimachinery/pkg/util/wait.UntilWithContext({0x269a378, 0xc00060e000}, 0x5c206d, 0x1c)
	/remote-source/mig-controller/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:99 +0x2b
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1
	/remote-source/mig-controller/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:195 +0x4b0

Comment 1 Pranav Gaikwad 2022-10-24 13:50:31 UTC
Caused by using the wrong clients in DVM (direct volume migration). Fixed in https://github.com/migtools/mig-controller/pull/1321
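The stack trace bottoms out in getSecurityContext at rsync.go:231, which matches a call through an unset client. As an illustration only (the type and function names below are hypothetical stand-ins, not the controller's real API), a minimal Go sketch shows how calling a method on a nil client produces exactly this panic, and how an explicit nil check turns the crash into a reconcile error instead:

```go
package main

import (
	"errors"
	"fmt"
)

// Client is a stand-in (hypothetical name) for the per-cluster client that
// the DVM task dereferences; in the real controller this would be a
// controller-runtime client for the source or destination cluster.
type Client interface {
	GetSecret(namespace string) (string, error)
}

type hostClient struct{}

func (hostClient) GetSecret(namespace string) (string, error) {
	return "rsync-secret-for-" + namespace, nil
}

// getSecurityContext sketches the failure mode: when the task is wired with
// the wrong (nil) client, calling through it panics with the nil pointer
// dereference seen in the trace. Guarding with a nil check surfaces an
// error the controller can retry or report instead of crashing.
func getSecurityContext(c Client, namespace string) (string, error) {
	if c == nil {
		return "", errors.New("no client configured for namespace " + namespace)
	}
	return c.GetSecret(namespace)
}

func main() {
	// Correctly wired client: succeeds.
	s, err := getSecurityContext(hostClient{}, "test-sc")
	fmt.Println(s, err)

	// Wrong wiring (nil client): an error instead of a controller crash.
	_, err = getSecurityContext(nil, "test-sc")
	fmt.Println(err)
}
```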

Comment 2 mohamed 2022-10-24 14:11:21 UTC
SCC = storage class conversion

Comment 8 mohamed 2022-10-25 13:24:32 UTC
Verified using MTC prestage 
metadata_nvr: openshift-migration-operator-metadata-container-v1.7.5-28

GCP 4.6 -> GCP 4.9

Comment 12 errata-xmlrpc 2022-10-31 11:10:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Migration Toolkit for Containers (MTC) 1.7.5 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:7262

