Description of problem:
Migrations get stuck at the StageBackup stage when triggered in indirect mode. Direct mode works fine.

Version-Release number of selected component (if applicable):
Source: GCP 4.6, MTC 1.7.2 + OADP 1.0.3
Target: GCP 4.10, MTC 1.7.2 + OADP 1.0.3

How reproducible:
Always

Steps to Reproduce:
1. Deploy an application in the source cluster
2. Trigger a migration with indirect mode

Actual results:
Migrations get stuck at the StageBackup stage.

$ oc get migmigration migration-52827 -o yaml
spec:
  migPlanRef:
    name: test4
    namespace: openshift-migration
  quiescePods: true
  stage: false
status:
  conditions:
  - category: Advisory
    lastTransitionTime: "2022-05-31T10:15:05Z"
    message: 'Step: 30/49'
    reason: StageBackupCreated
    status: "True"
    type: Running
  - category: Required
    lastTransitionTime: "2022-05-31T10:13:31Z"
    message: The migration is ready.
    status: "True"
    type: Ready
  - category: Required
    durable: true
    lastTransitionTime: "2022-05-31T10:14:05Z"
    message: The migration registries are healthy.
    status: "True"
    type: RegistriesHealthy
  - category: Advisory
    durable: true
    lastTransitionTime: "2022-05-31T10:14:37Z"
    message: '[1] Stage pods created.'
    status: "True"
    type: StagePodsCreated
  itinerary: Final
  observedDigest: 6a51be85e3b968769b1713084a928b5114ec8e9b3c26662cf534ade8ed78b794
  phase: StageBackupCreated
  pipeline:
  - completed: "2022-05-31T10:14:06Z"
    message: Completed
    name: Prepare
    started: "2022-05-31T10:13:31Z"
  - completed: "2022-05-31T10:14:26Z"
    message: Completed
    name: Backup
    progress:
    - 'Backup openshift-migration/migration-52827-initial-nrqvg: 41 out of estimated total of 41 objects backed up (17s)'
    started: "2022-05-31T10:14:06Z"
  - message: Waiting for stage backup to complete.
    name: StageBackup
    phase: StageBackupCreated
    progress:
    - 'Backup openshift-migration/migration-52827-stage-z8w4d: 0 out of estimated total of 5 objects backed up (52m56s)'
    - 'PodVolumeBackup openshift-migration/migration-52827-stage-z8w4d-f76h9: 0 bytes out of 0 bytes backed up (52m40s)'
    started: "2022-05-31T10:14:26Z"
  - message: Not started
    name: StageRestore
  - message: Not started
    name: Restore
  - message: Not started
    name: Cleanup
  startTimestamp: "2022-05-31T10:13:31Z"

$ oc logs migration-log-reader-5d6d95499b-72bvn -c color
openshift-migration velero-57c48b4bb-n9s4x velero time="2022-05-31T11:08:24Z" level=info msg="Found 1 backups in the backup location that do not exist in the cluster and need to be synced" backupLocation=automatic-c6mbt controller=backup-sync logSource="pkg/controller/backup_sync_controller.go:204"
openshift-migration velero-57c48b4bb-n9s4x velero time="2022-05-31T11:08:24Z" level=info msg="Attempting to sync backup into cluster" backup=migration-58d98-initial-rsrdt backupLocation=automatic-c6mbt controller=backup-sync logSource="pkg/controller/backup_sync_controller.go:212"
openshift-migration velero-57c48b4bb-n9s4x velero time="2022-05-31T11:08:24Z" level=error msg="Error getting backup metadata from backup store" backup=migration-58d98-initial-rsrdt backupLocation=automatic-c6mbt controller=backup-sync error="rpc error: code = Unknown desc = storage: object doesn't exist" error.file="/remote-source/src/github.com/vmware-tanzu/velero/pkg/persistence/object_store.go:289" error.function="github.com/vmware-tanzu/velero/pkg/persistence.(*objectBackupStore).GetBackupMetadata" logSource="pkg/controller/backup_sync_controller.go:216"
openshift-migration velero-57c48b4bb-n9s4x velero time="2022-05-31T11:08:24Z" level=info msg="Validating backup storage location" backup-storage-location=automatic-c6mbt controller=backup-storage-location logSource="pkg/controller/backup_storage_location_controller.go:114"
openshift-migration velero-57c48b4bb-n9s4x velero time="2022-05-31T11:08:24Z" level=info msg="Found 1 backups in the backup location that do not exist in the cluster and need to be synced" backupLocation=automatic-gt8v9 controller=backup-sync logSource="pkg/controller/backup_sync_controller.go:204"
openshift-migration velero-57c48b4bb-n9s4x velero time="2022-05-31T11:08:24Z" level=info msg="Attempting to sync backup into cluster" backup=migration-58d98-initial-rsrdt backupLocation=automatic-gt8v9 controller=backup-sync logSource="pkg/controller/backup_sync_controller.go:212"
openshift-migration velero-57c48b4bb-n9s4x velero time="2022-05-31T11:08:24Z" level=info msg="Backup storage location valid, marking as available" backup-storage-location=automatic-c6mbt controller=backup-storage-location logSource="pkg/controller/backup_storage_location_controller.go:121"
openshift-migration velero-57c48b4bb-n9s4x velero time="2022-05-31T11:08:24Z" level=info msg="Validating backup storage location" backup-storage-location=automatic-gt8v9 controller=backup-storage-location logSource="pkg/controller/backup_storage_location_controller.go:114"
openshift-migration velero-57c48b4bb-n9s4x velero time="2022-05-31T11:08:24Z" level=error msg="Error getting backup metadata from backup store" backup=migration-58d98-initial-rsrdt backupLocation=automatic-gt8v9 controller=backup-sync error="rpc error: code = Unknown desc = storage: object doesn't exist" error.file="/remote-source/src/github.com/vmware-tanzu/velero/pkg/persistence/object_store.go:289" error.function="github.com/vmware-tanzu/velero/pkg/persistence.(*objectBackupStore).GetBackupMetadata" logSource="pkg/controller/backup_sync_controller.go:216"
openshift-migration velero-57c48b4bb-n9s4x velero time="2022-05-31T11:08:24Z" level=info msg="Backup storage location valid, marking as available" backup-storage-location=automatic-gt8v9 controller=backup-storage-location logSource="pkg/controller/backup_storage_location_controller.go:121"
openshift-migration migration-controller-56d764884-7fxkd mtc {"level":"info","ts":1653995305.3079288,"logger":"migration","msg":"Checking registry health","migMigration":"migration-52827"}
openshift-migration migration-controller-56d764884-7fxkd mtc {"level":"info","ts":1653995305.389897,"logger":"migration","msg":"Found 2/2 registries in healthy condition.","migMigration":"migration-52827","message":""}
openshift-migration migration-controller-56d764884-7fxkd mtc {"level":"info","ts":1653995305.390091,"logger":"migration","msg":"[RUN] (Step 30/49) Waiting for stage backup to complete.","migMigration":"migration-52827","phase":"StageBackupCreated"}
openshift-migration migration-controller-56d764884-7fxkd mtc {"level":"info","ts":1653995305.8250961,"logger":"migration","msg":"Velero Backup progress report","migMigration":"migration-52827","phase":"StageBackupCreated","backup":"openshift-migration/migration-52827-stage-z8w4d","backupProgress":["Backup openshift-migration/migration-52827-stage-z8w4d: 0 out of estimated total of 5 objects backed up (53m21s)","PodVolumeBackup openshift-migration/migration-52827-stage-z8w4d-f76h9: 0 bytes out of 0 bytes backed up (53m5s)"]}
openshift-migration migration-controller-56d764884-7fxkd mtc {"level":"info","ts":1653995305.8251326,"logger":"migration","msg":"Stage Backup on source cluster is incomplete. Waiting.","migMigration":"migration-52827","phase":"StageBackupCreated","backup":"openshift-migration/migration-52827-stage-z8w4d","backupPhase":"InProgress","backupProgress":"0/5","backupWarnings":0,"backupErrors":0}

Expected results:
Migrations should be successful.

Additional info:
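The stall is visible in the controller's `backupProgress` strings: the stage backup reports 0 of 5 objects after more than 50 minutes. A small illustrative helper (not part of MTC; the parsing heuristic is my own) can flag this pattern when scanning those progress strings:

```python
import re

# Matches progress strings of the form reported above, e.g.
#   'Backup ns/name: 0 out of estimated total of 5 objects backed up (52m56s)'
PROGRESS_RE = re.compile(
    r"Backup \S+: (?P<done>\d+) out of estimated total of "
    r"(?P<total>\d+) objects backed up \((?P<elapsed>[^)]+)\)"
)

def is_stalled(progress: str, min_elapsed_minutes: int = 30) -> bool:
    """Flag a backup that still reports zero objects after a long elapsed time."""
    m = PROGRESS_RE.search(progress)
    if not m:
        return False
    done = int(m.group("done"))
    # Crude elapsed parse: "52m56s" -> 52 minutes (ignores hours for brevity).
    mins = re.search(r"(\d+)m", m.group("elapsed"))
    elapsed = int(mins.group(1)) if mins else 0
    return done == 0 and elapsed >= min_elapsed_minutes

print(is_stalled(
    "Backup openshift-migration/migration-52827-stage-z8w4d: "
    "0 out of estimated total of 5 objects backed up (52m56s)"
))  # -> True
```

Against the healthy Backup line above ("41 out of estimated total of 41 objects backed up (17s)") the same check returns False, so it separates the stuck stage backup from normal progress.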
We're expecting a fix for this to land in OADP 1.0.3 on 6/14.
Verified with MTC 1.7.2 + OADP 1.0.3 (Stage)

$ oc get csv -n openshift-migration
NAME                   DISPLAY                                     VERSION   REPLACES   PHASE
mtc-operator.v1.7.2    Migration Toolkit for Containers Operator   1.7.2                Succeeded
oadp-operator.v1.0.3   OADP Operator                               1.0.3                Succeeded

$ oc get migplan -n openshift-migration test-indirect -o yaml
apiVersion: migration.openshift.io/v1alpha1
kind: MigPlan
metadata:
  annotations:
    migration.openshift.io/selected-migplan-type: full
  name: test-indirect
  namespace: openshift-migration
spec:
  destMigClusterRef:
    name: host
    namespace: openshift-migration
  indirectImageMigration: true
  indirectVolumeMigration: true
  migStorageRef:
    name: automatic
    namespace: openshift-migration
  namespaces:
  - ocp-django
  persistentVolumes:
  - capacity: 1Gi
    name: pvc-ecd2c872-946b-4710-b733-21f347bcb7ea
    proposedCapacity: 1Gi
    pvc:
      accessModes:
      - ReadWriteOnce
      hasReference: true
      name: postgresql
      namespace: ocp-django
    selection:
      action: copy
      copyMethod: filesystem
      storageClass: standard
    storageClass: standard
    supported:
      actions:
      - skip
      - copy
      copyMethods:
      - filesystem
      - snapshot
  srcMigClusterRef:
    name: source-cluster
    namespace: openshift-migration

$ oc get migmigration -n openshift-migration migration-f3c5c -o yaml
apiVersion: migration.openshift.io/v1alpha1
kind: MigMigration
metadata:
  labels:
    migration.openshift.io/migplan-name: test-indirect
    migration.openshift.io/migration-uid: 3ede7239-d48e-43db-8ed6-99af1eff761e
  name: migration-f3c5c
  namespace: openshift-migration
spec:
  migPlanRef:
    name: test-indirect
    namespace: openshift-migration
  quiescePods: true
  stage: false
status:
  conditions:
  - category: Advisory
    durable: true
    lastTransitionTime: "2022-06-06T06:46:40Z"
    message: The migration has completed successfully.
    reason: Completed
    status: "True"
    type: Succeeded
  itinerary: Final
  observedDigest: b0320c2fc4ba12a915d7133d0b2bc798024ce836ae4c87417098721261076177
  phase: Completed
  pipeline:
  - completed: "2022-06-06T06:43:53Z"
    message: Completed
    name: Prepare
    started: "2022-06-06T06:43:20Z"
  - completed: "2022-06-06T06:44:19Z"
    message: Completed
    name: Backup
    progress:
    - 'Backup openshift-migration/migration-f3c5c-initial-gkll5: 41 out of estimated total of 41 objects backed up (17s)'
    started: "2022-06-06T06:43:53Z"
  - completed: "2022-06-06T06:45:36Z"
    message: Completed
    name: StageBackup
    progress:
    - 'Backup openshift-migration/migration-f3c5c-stage-xp444: 6 out of estimated total of 6 objects backed up (21s)'
    - 'PodVolumeBackup openshift-migration/migration-f3c5c-stage-xp444-mbcwp: 46.74 MB out of 46.74 MB backed up (5s)'
    started: "2022-06-06T06:44:19Z"
  - completed: "2022-06-06T06:46:31Z"
    message: Completed
    name: StageRestore
    progress:
    - 'Restore openshift-migration/migration-f3c5c-stage-trbw2: 6 out of estimated total of 6 objects restored (19s)'
    - 'PodVolumeRestore openshift-migration/migration-f3c5c-stage-trbw2-jnnx2: 46.74 MB out of 46.74 MB restored (6s)'
    - 'Pod ocp-django/stage-postgresql-6k959: Container sleep-0 '
    started: "2022-06-06T06:45:36Z"
  - completed: "2022-06-06T06:46:40Z"
    message: Completed
    name: Restore
    progress:
    - 'Restore openshift-migration/migration-f3c5c-final-zmgbp: 37 out of estimated total of 37 objects restored (4s)'
    - All the stage pods are restored, waiting for restore to Complete
    started: "2022-06-06T06:46:31Z"
  - completed: "2022-06-06T06:46:40Z"
    message: Completed
    name: Cleanup
    started: "2022-06-06T06:46:40Z"
  startTimestamp: "2022-06-06T06:43:20Z"

Indirect migrations are working fine; moving this to Verified status.
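The success criterion used in the verification above is that the MigMigration reaches phase Completed with every pipeline step carrying a `completed` timestamp. A minimal sketch of that check, operating on the parsed status (e.g. from `oc get migmigration ... -o json`); the helper name is hypothetical, not an MTC API:

```python
# Illustrative check: True only when the phase is Completed and no
# pipeline step is missing its 'completed' timestamp.
def migration_succeeded(status: dict) -> bool:
    if status.get("phase") != "Completed":
        return False
    return all("completed" in step for step in status.get("pipeline", []))

# Minimal sample mirroring the shape of the verified migration's status:
status = {
    "phase": "Completed",
    "pipeline": [
        {"name": "Prepare", "completed": "2022-06-06T06:43:53Z"},
        {"name": "Backup", "completed": "2022-06-06T06:44:19Z"},
        {"name": "StageBackup", "completed": "2022-06-06T06:45:36Z"},
        {"name": "StageRestore", "completed": "2022-06-06T06:46:31Z"},
        {"name": "Restore", "completed": "2022-06-06T06:46:40Z"},
        {"name": "Cleanup", "completed": "2022-06-06T06:46:40Z"},
    ],
}
print(migration_succeeded(status))  # -> True
```

The stuck migration in the original report would fail this check twice over: its phase is StageBackupCreated, and StageBackup through Cleanup have no `completed` timestamps.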
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Migration Toolkit for Containers (MTC) 1.7.3 security and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5840