Bug 2137301

Summary: Migration stuck on DirectVolume stage during storage class conversion on source cluster
Product: Migration Toolkit for Containers
Reporter: mohamed <midays>
Component: Controller
Assignee: Pranav Gaikwad <pgaikwad>
Status: CLOSED ERRATA
QA Contact: mohamed <midays>
Severity: urgent
Docs Contact: Anjana Suparna Sriram <asriram>
Priority: unspecified
Version: 1.7.5
CC: jmatthew, pgaikwad, rjohnson
Target Milestone: ---
Keywords: Regression
Target Release: 1.7.5
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-10-31 11:10:35 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description            Flags
SCC migration steps    none

Description mohamed 2022-10-24 12:00:47 UTC
Created attachment 1919952
SCC migration steps

Description of problem:

Performing a storage class conversion (SCC) on the source cluster causes the migration controller pod to go into the CrashLoopBackOff state.

The migration stays stuck indefinitely on the "Creating Rsync daemon pods on the target cluster" stage, and an error message with status code 500 is displayed to the user.

After this occurs, no other type of migration can be created, because the creation wizard fails to fetch the namespaces.


Version-Release number of selected component (if applicable):

GCP 4.6 -> GCP 4.9

How reproducible:
Always

Steps to Reproduce:
1. Create a new storage class conversion (SCC) migration on the source cluster
2. Run the migration

Actual results:
1. The migration gets stuck on the direct volume stage ("Creating Rsync daemon pods on the target cluster").

2. The migration details page displays the following message: "Danger alert: Error loading debug data for plan "test-sc"
Request failed with status code 500"

3. Creating a new migration plan (full, state, or another SCC) fails because the wizard cannot fetch the namespaces.

Expected results:
The migration should be executed successfully without errors

Additional info:

$ oc -n openshift-migration get pods

NAME                                               READY   STATUS             RESTARTS        AGE
migration-controller-86bdfc89b9-64r69              1/2     CrashLoopBackOff   20 (2m3s ago)   165m
migration-log-reader-7cfdbcf68b-xjddq              2/2     Running            0               165m
migration-operator-ff767bd8f-c67gw                 1/1     Running            0               166m
migration-ui-c6cd68c7c-2vg2t                       1/1     Running            0               165m
openshift-adp-controller-manager-6cfbdb444-kxcnc   1/1     Running            0               166m
restic-88j7c                                       1/1     Running            0               166m
restic-hzktl                                       1/1     Running            0               166m
restic-p9gkh                                       1/1     Running            0               166m
restic-q7kvg                                       1/1     Running            0               166m
restic-x2g97                                       1/1     Running            0               166m
velero-78c677bddb-tvd9t                            1/1     Running            0               166m

From the migration-controller logs:

{"level":"info","ts":1666612415.4290595,"logger":"directvolume","msg":"[RUN] (Step 15/20) Creating Rsync daemon pods on the target cluster","dvm":"migration-6346a-6wvv7","migMigration":"migration-6346a","phase":"CreateRsyncTransferPods"}
{"level":"info","ts":1666612415.4791296,"logger":"directvolume","msg":"Getting Rsync Password from Secret on host MigCluster","dvm":"migration-6346a-6wvv7","migMigration":"migration-6346a","phase":"CreateRsyncTransferPods","secret":""}
{"level":"info","ts":1666612415.6127393,"logger":"storage","msg":"CR","migStorage":"automatic","conditions":{"conditions":[{"type":"Ready","status":"True","category":"Required","message":"The storage is ready.","lastTransitionTime":"2022-10-24T09:11:11Z"}]}}
E1024 11:53:35.896631       1 runtime.go:78] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 716 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic({0x1fb2100, 0x3a3f710})
	/remote-source/mig-controller/app/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:74 +0x85
k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0xc0035fdac0})
	/remote-source/mig-controller/app/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:48 +0x75
panic({0x1fb2100, 0x3a3f710})
	/usr/lib/golang/src/runtime/panic.go:1038 +0x215
github.com/konveyor/mig-controller/pkg/controller/directvolumemigration.(*Task).getSecurityContext(0xc00558b988, {0x27068c8, 0xc00423ed80}, {0xc002233bc2, 0x31}, 0x0)
	/remote-source/mig-controller/app/pkg/controller/directvolumemigration/rsync.go:231 +0xf5
github.com/konveyor/mig-controller/pkg/controller/directvolumemigration.(*Task).getRsyncTransferServerMutations(0xc00558b988, {0x27068c8, 0xc00423ed80}, {0xc002233bc2, 0x31})
	/remote-source/mig-controller/app/pkg/controller/directvolumemigration/rsync.go:198 +0x16a
github.com/konveyor/mig-controller/pkg/controller/directvolumemigration.(*Task).ensureRsyncTransferServer(0x269a420)
	/remote-source/mig-controller/app/pkg/controller/directvolumemigration/rsync.go:355 +0x55a
github.com/konveyor/mig-controller/pkg/controller/directvolumemigration.(*Task).Run(0xc00558b988, {0x269a420, 0xc0025fb1a0})
	/remote-source/mig-controller/app/pkg/controller/directvolumemigration/task.go:330 +0xb9a
github.com/konveyor/mig-controller/pkg/controller/directvolumemigration.(*ReconcileDirectVolumeMigration).migrate(0xc000dd5c80, {0x269a420, 0xc0025fb1a0}, 0xc001615900)
	/remote-source/mig-controller/app/pkg/controller/directvolumemigration/migrate.go:46 +0x265
github.com/konveyor/mig-controller/pkg/controller/directvolumemigration.(*ReconcileDirectVolumeMigration).Reconcile(0xc000dd5c80, {0x269a420, 0xc0025fb1a0}, {{{0xc0027917d0, 0x20d44c0}, {0xc0027917b8, 0xc0037407c0}}})
	/remote-source/mig-controller/app/pkg/controller/directvolumemigration/directvolumemigration_controller.go:148 +0x734
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc000985040, {0x269a378, 0xc00060e000}, {0x2042be0, 0xc0035fdac0})
	/remote-source/mig-controller/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:263 +0x303
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc000985040, {0x269a378, 0xc00060e000})
	/remote-source/mig-controller/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:235 +0x205
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.1({0x269a378, 0xc00060e000})
	/remote-source/mig-controller/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:198 +0x46
k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1()
	/remote-source/mig-controller/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185 +0x25
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0x7fec02414e08)
	/remote-source/mig-controller/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155 +0x67
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0x0, {0x2659240, 0xc0025fb0e0}, 0x1, 0xc000c0e120)
	/remote-source/mig-controller/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156 +0xb6
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0x0, 0x3b9aca00, 0x0, 0x0, 0x0)
	/remote-source/mig-controller/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0x89
k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext({0x269a378, 0xc00060e000}, 0xc003587fd0, 0x0, 0x0, 0x0)
	/remote-source/mig-controller/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185 +0x99
k8s.io/apimachinery/pkg/util/wait.UntilWithContext({0x269a378, 0xc00060e000}, 0x5c206d, 0x1c)
	/remote-source/mig-controller/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:99 +0x2b
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1
	/remote-source/mig-controller/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:195 +0x4b0
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x128 pc=0x1a5c1f5]

goroutine 716 [running]:
k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0xc0035fdac0})
	/remote-source/mig-controller/app/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:55 +0xd8
panic({0x1fb2100, 0x3a3f710})
	/usr/lib/golang/src/runtime/panic.go:1038 +0x215
github.com/konveyor/mig-controller/pkg/controller/directvolumemigration.(*Task).getSecurityContext(0xc00558b988, {0x27068c8, 0xc00423ed80}, {0xc002233bc2, 0x31}, 0x0)
	/remote-source/mig-controller/app/pkg/controller/directvolumemigration/rsync.go:231 +0xf5
github.com/konveyor/mig-controller/pkg/controller/directvolumemigration.(*Task).getRsyncTransferServerMutations(0xc00558b988, {0x27068c8, 0xc00423ed80}, {0xc002233bc2, 0x31})
	/remote-source/mig-controller/app/pkg/controller/directvolumemigration/rsync.go:198 +0x16a
github.com/konveyor/mig-controller/pkg/controller/directvolumemigration.(*Task).ensureRsyncTransferServer(0x269a420)
	/remote-source/mig-controller/app/pkg/controller/directvolumemigration/rsync.go:355 +0x55a
github.com/konveyor/mig-controller/pkg/controller/directvolumemigration.(*Task).Run(0xc00558b988, {0x269a420, 0xc0025fb1a0})
	/remote-source/mig-controller/app/pkg/controller/directvolumemigration/task.go:330 +0xb9a
github.com/konveyor/mig-controller/pkg/controller/directvolumemigration.(*ReconcileDirectVolumeMigration).migrate(0xc000dd5c80, {0x269a420, 0xc0025fb1a0}, 0xc001615900)
	/remote-source/mig-controller/app/pkg/controller/directvolumemigration/migrate.go:46 +0x265
github.com/konveyor/mig-controller/pkg/controller/directvolumemigration.(*ReconcileDirectVolumeMigration).Reconcile(0xc000dd5c80, {0x269a420, 0xc0025fb1a0}, {{{0xc0027917d0, 0x20d44c0}, {0xc0027917b8, 0xc0037407c0}}})
	/remote-source/mig-controller/app/pkg/controller/directvolumemigration/directvolumemigration_controller.go:148 +0x734
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc000985040, {0x269a378, 0xc00060e000}, {0x2042be0, 0xc0035fdac0})
	/remote-source/mig-controller/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:263 +0x303
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc000985040, {0x269a378, 0xc00060e000})
	/remote-source/mig-controller/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:235 +0x205
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.1({0x269a378, 0xc00060e000})
	/remote-source/mig-controller/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:198 +0x46
k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1()
	/remote-source/mig-controller/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185 +0x25
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0x7fec02414e08)
	/remote-source/mig-controller/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155 +0x67
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0x0, {0x2659240, 0xc0025fb0e0}, 0x1, 0xc000c0e120)
	/remote-source/mig-controller/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156 +0xb6
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0x0, 0x3b9aca00, 0x0, 0x0, 0x0)
	/remote-source/mig-controller/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0x89
k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext({0x269a378, 0xc00060e000}, 0xc003587fd0, 0x0, 0x0, 0x0)
	/remote-source/mig-controller/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185 +0x99
k8s.io/apimachinery/pkg/util/wait.UntilWithContext({0x269a378, 0xc00060e000}, 0x5c206d, 0x1c)
	/remote-source/mig-controller/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:99 +0x2b
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1
	/remote-source/mig-controller/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:195 +0x4b0
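
The first application frame in the trace above is getSecurityContext at rsync.go:231. Its last printed argument is 0x0 and the fault address is the small offset 0x128, which is consistent with a field being read through a nil pointer. The report does not say which value was nil, and the following is not the mig-controller code or the actual fix. It is only a minimal Go sketch of the general guard pattern, using hypothetical names (transferOptions, securityContext, securityContextFor), in which a nil input is turned into an error that a reconcile loop could report instead of panicking:

```go
package main

import (
	"errors"
	"fmt"
)

// transferOptions is a hypothetical stand-in for whatever value the real
// function receives; it is NOT a mig-controller type.
type transferOptions struct {
	RunAsUser *int64
}

// securityContext is a hypothetical, simplified result type.
type securityContext struct {
	RunAsUser int64
}

var errNilOptions = errors.New("transfer options are nil")

// securityContextFor checks its pointer argument before reading any field,
// so a nil input produces an error rather than a SIGSEGV.
func securityContextFor(opts *transferOptions) (*securityContext, error) {
	if opts == nil {
		return nil, errNilOptions
	}
	sc := &securityContext{}
	if opts.RunAsUser != nil {
		sc.RunAsUser = *opts.RunAsUser
	}
	return sc, nil
}

func main() {
	// A nil argument, analogous to the trailing 0x0 in the trace,
	// is reported as an error instead of crashing the process.
	if _, err := securityContextFor(nil); err != nil {
		fmt.Println("reconcile error:", err)
	}

	uid := int64(1000)
	sc, err := securityContextFor(&transferOptions{RunAsUser: &uid})
	fmt.Println(sc.RunAsUser, err)
}
```

With a guard of this kind the controller process would keep running and the failure would show up as a reconcile error, rather than sending the pod into CrashLoopBackOff.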

Comment 1 Pranav Gaikwad 2022-10-24 13:50:31 UTC
Caused by using the wrong clients in DVM (direct volume migration). Fixed in https://github.com/migtools/mig-controller/pull/1321

Comment 2 mohamed 2022-10-24 14:11:21 UTC
SCC = storage class conversion

Comment 8 mohamed 2022-10-25 13:24:32 UTC
Verified using MTC prestage 
metadata_nvr: openshift-migration-operator-metadata-container-v1.7.5-28

GCP 4.6 -> GCP 4.9

Comment 12 errata-xmlrpc 2022-10-31 11:10:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Migration Toolkit for Containers (MTC) 1.7.5 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:7262