Bug 2091965
| Summary: | [MTC] Migrations gets stuck at StageBackup stage for indirect runs [OADP-BL] | | |
|---|---|---|---|
| Product: | Migration Toolkit for Containers | Reporter: | Prasad Joshi <prajoshi> |
| Component: | Controller | Assignee: | Jason Montleon <jmontleo> |
| Status: | CLOSED ERRATA | QA Contact: | Prasad Joshi <prajoshi> |
| Severity: | urgent | Docs Contact: | Richard Hoch <rhoch> |
| Priority: | urgent | | |
| Version: | 1.7.2 | CC: | ernelson, rjohnson |
| Target Milestone: | --- | | |
| Target Release: | 1.7.2 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2022-08-02 07:45:50 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
We're expecting a fix for this to land in OADP 1.0.3 on 6/14.

Verified with MTC 1.7.2 + OADP 1.0.3 (Stage)
$ oc get csv -n openshift-migration
NAME                   DISPLAY                                     VERSION   REPLACES   PHASE
mtc-operator.v1.7.2    Migration Toolkit for Containers Operator   1.7.2                Succeeded
oadp-operator.v1.0.3   OADP Operator                               1.0.3                Succeeded
$ oc get migplan -n openshift-migration test-indirect -o yaml
apiVersion: migration.openshift.io/v1alpha1
kind: MigPlan
metadata:
  annotations:
    migration.openshift.io/selected-migplan-type: full
  name: test-indirect
  namespace: openshift-migration
spec:
  destMigClusterRef:
    name: host
    namespace: openshift-migration
  indirectImageMigration: true
  indirectVolumeMigration: true
  migStorageRef:
    name: automatic
    namespace: openshift-migration
  namespaces:
  - ocp-django
  persistentVolumes:
  - capacity: 1Gi
    name: pvc-ecd2c872-946b-4710-b733-21f347bcb7ea
    proposedCapacity: 1Gi
    pvc:
      accessModes:
      - ReadWriteOnce
      hasReference: true
      name: postgresql
      namespace: ocp-django
    selection:
      action: copy
      copyMethod: filesystem
      storageClass: standard
    storageClass: standard
    supported:
      actions:
      - skip
      - copy
      copyMethods:
      - filesystem
      - snapshot
  srcMigClusterRef:
    name: source-cluster
    namespace: openshift-migration
$ oc get migmigration -n openshift-migration migration-f3c5c -o yaml
apiVersion: migration.openshift.io/v1alpha1
kind: MigMigration
metadata:
  labels:
    migration.openshift.io/migplan-name: test-indirect
    migration.openshift.io/migration-uid: 3ede7239-d48e-43db-8ed6-99af1eff761e
  name: migration-f3c5c
  namespace: openshift-migration
spec:
  migPlanRef:
    name: test-indirect
    namespace: openshift-migration
  quiescePods: true
  stage: false
status:
  conditions:
  - category: Advisory
    durable: true
    lastTransitionTime: "2022-06-06T06:46:40Z"
    message: The migration has completed successfully.
    reason: Completed
    status: "True"
    type: Succeeded
  itinerary: Final
  observedDigest: b0320c2fc4ba12a915d7133d0b2bc798024ce836ae4c87417098721261076177
  phase: Completed
  pipeline:
  - completed: "2022-06-06T06:43:53Z"
    message: Completed
    name: Prepare
    started: "2022-06-06T06:43:20Z"
  - completed: "2022-06-06T06:44:19Z"
    message: Completed
    name: Backup
    progress:
    - 'Backup openshift-migration/migration-f3c5c-initial-gkll5: 41 out of estimated total of 41 objects backed up (17s)'
    started: "2022-06-06T06:43:53Z"
  - completed: "2022-06-06T06:45:36Z"
    message: Completed
    name: StageBackup
    progress:
    - 'Backup openshift-migration/migration-f3c5c-stage-xp444: 6 out of estimated total of 6 objects backed up (21s)'
    - 'PodVolumeBackup openshift-migration/migration-f3c5c-stage-xp444-mbcwp: 46.74 MB out of 46.74 MB backed up (5s)'
    started: "2022-06-06T06:44:19Z"
  - completed: "2022-06-06T06:46:31Z"
    message: Completed
    name: StageRestore
    progress:
    - 'Restore openshift-migration/migration-f3c5c-stage-trbw2: 6 out of estimated total of 6 objects restored (19s)'
    - 'PodVolumeRestore openshift-migration/migration-f3c5c-stage-trbw2-jnnx2: 46.74 MB out of 46.74 MB restored (6s)'
    - 'Pod ocp-django/stage-postgresql-6k959: Container sleep-0 '
    started: "2022-06-06T06:45:36Z"
  - completed: "2022-06-06T06:46:40Z"
    message: Completed
    name: Restore
    progress:
    - 'Restore openshift-migration/migration-f3c5c-final-zmgbp: 37 out of estimated total of 37 objects restored (4s)'
    - All the stage pods are restored, waiting for restore to Complete
    started: "2022-06-06T06:46:31Z"
  - completed: "2022-06-06T06:46:40Z"
    message: Completed
    name: Cleanup
    started: "2022-06-06T06:46:40Z"
  startTimestamp: "2022-06-06T06:43:20Z"
Indirect migrations are working fine. Moving this to verified status.
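For anyone re-verifying, the per-step timings can be read off the pipeline timestamps in the MigMigration status above. The following is a minimal sketch, not part of MTC: the `PIPELINE` list is copied by hand from that status, and `parse_ts` and `step_durations` are hypothetical helper names introduced here for illustration.

```python
from datetime import datetime

# (name, started, completed) copied from the MigMigration status above.
PIPELINE = [
    ("Prepare", "2022-06-06T06:43:20Z", "2022-06-06T06:43:53Z"),
    ("Backup", "2022-06-06T06:43:53Z", "2022-06-06T06:44:19Z"),
    ("StageBackup", "2022-06-06T06:44:19Z", "2022-06-06T06:45:36Z"),
    ("StageRestore", "2022-06-06T06:45:36Z", "2022-06-06T06:46:31Z"),
    ("Restore", "2022-06-06T06:46:31Z", "2022-06-06T06:46:40Z"),
    ("Cleanup", "2022-06-06T06:46:40Z", "2022-06-06T06:46:40Z"),
]


def parse_ts(value):
    """Parse the UTC timestamp format used in the status (e.g. 2022-06-06T06:43:20Z)."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")


def step_durations(pipeline):
    """Return {step name: duration in whole seconds} for each pipeline step."""
    return {
        name: int((parse_ts(done) - parse_ts(start)).total_seconds())
        for name, start, done in pipeline
    }


if __name__ == "__main__":
    for name, seconds in step_durations(PIPELINE).items():
        print(f"{name}: {seconds}s")
```

In this run StageBackup takes a little over a minute, in contrast to the 50+ minutes with no progress in the original report.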
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Migration Toolkit for Containers (MTC) 1.7.3 security and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5840
Description of problem:
Migrations get stuck at the StageBackup stage when triggered in indirect mode. Direct mode works fine.

Version-Release number of selected component (if applicable):
Source GCP 4.6 MTC 1.7.2 + OADP 1.0.3
Target GCP 4.10 MTC 1.7.2 + OADP 1.0.3

How reproducible:
Always

Steps to Reproduce:
1. Deploy an application in source cluster
2. Trigger migration with indirect mode

Actual results:
Migrations are getting stuck at StageBackup stage.

$ oc get migmigration migration-52827 -o yaml
spec:
  migPlanRef:
    name: test4
    namespace: openshift-migration
  quiescePods: true
  stage: false
status:
  conditions:
  - category: Advisory
    lastTransitionTime: "2022-05-31T10:15:05Z"
    message: 'Step: 30/49'
    reason: StageBackupCreated
    status: "True"
    type: Running
  - category: Required
    lastTransitionTime: "2022-05-31T10:13:31Z"
    message: The migration is ready.
    status: "True"
    type: Ready
  - category: Required
    durable: true
    lastTransitionTime: "2022-05-31T10:14:05Z"
    message: The migration registries are healthy.
    status: "True"
    type: RegistriesHealthy
  - category: Advisory
    durable: true
    lastTransitionTime: "2022-05-31T10:14:37Z"
    message: '[1] Stage pods created.'
    status: "True"
    type: StagePodsCreated
  itinerary: Final
  observedDigest: 6a51be85e3b968769b1713084a928b5114ec8e9b3c26662cf534ade8ed78b794
  phase: StageBackupCreated
  pipeline:
  - completed: "2022-05-31T10:14:06Z"
    message: Completed
    name: Prepare
    started: "2022-05-31T10:13:31Z"
  - completed: "2022-05-31T10:14:26Z"
    message: Completed
    name: Backup
    progress:
    - 'Backup openshift-migration/migration-52827-initial-nrqvg: 41 out of estimated total of 41 objects backed up (17s)'
    started: "2022-05-31T10:14:06Z"
  - message: Waiting for stage backup to complete.
    name: StageBackup
    phase: StageBackupCreated
    progress:
    - 'Backup openshift-migration/migration-52827-stage-z8w4d: 0 out of estimated total of 5 objects backed up (52m56s)'
    - 'PodVolumeBackup openshift-migration/migration-52827-stage-z8w4d-f76h9: 0 bytes out of 0 bytes backed up (52m40s)'
    started: "2022-05-31T10:14:26Z"
  - message: Not started
    name: StageRestore
  - message: Not started
    name: Restore
  - message: Not started
    name: Cleanup
  startTimestamp: "2022-05-31T10:13:31Z"

$ oc logs migration-log-reader-5d6d95499b-72bvn -c color
openshift-migration velero-57c48b4bb-n9s4x velero time="2022-05-31T11:08:24Z" level=info msg="Found 1 backups in the backup location that do not exist in the cluster and need to be synced" backupLocation=automatic-c6mbt controller=backup-sync logSource="pkg/controller/backup_sync_controller.go:204"
openshift-migration velero-57c48b4bb-n9s4x velero time="2022-05-31T11:08:24Z" level=info msg="Attempting to sync backup into cluster" backup=migration-58d98-initial-rsrdt backupLocation=automatic-c6mbt controller=backup-sync logSource="pkg/controller/backup_sync_controller.go:212"
openshift-migration velero-57c48b4bb-n9s4x velero time="2022-05-31T11:08:24Z" level=error msg="Error getting backup metadata from backup store" backup=migration-58d98-initial-rsrdt backupLocation=automatic-c6mbt controller=backup-sync error="rpc error: code = Unknown desc = storage: object doesn't exist" error.file="/remote-source/src/github.com/vmware-tanzu/velero/pkg/persistence/object_store.go:289" error.function="github.com/vmware-tanzu/velero/pkg/persistence.(*objectBackupStore).GetBackupMetadata" logSource="pkg/controller/backup_sync_controller.go:216"
openshift-migration velero-57c48b4bb-n9s4x velero time="2022-05-31T11:08:24Z" level=info msg="Validating backup storage location" backup-storage-location=automatic-c6mbt controller=backup-storage-location logSource="pkg/controller/backup_storage_location_controller.go:114"
openshift-migration velero-57c48b4bb-n9s4x velero time="2022-05-31T11:08:24Z" level=info msg="Found 1 backups in the backup location that do not exist in the cluster and need to be synced" backupLocation=automatic-gt8v9 controller=backup-sync logSource="pkg/controller/backup_sync_controller.go:204"
openshift-migration velero-57c48b4bb-n9s4x velero time="2022-05-31T11:08:24Z" level=info msg="Attempting to sync backup into cluster" backup=migration-58d98-initial-rsrdt backupLocation=automatic-gt8v9 controller=backup-sync logSource="pkg/controller/backup_sync_controller.go:212"
openshift-migration velero-57c48b4bb-n9s4x velero time="2022-05-31T11:08:24Z" level=info msg="Backup storage location valid, marking as available" backup-storage-location=automatic-c6mbt controller=backup-storage-location logSource="pkg/controller/backup_storage_location_controller.go:121"
openshift-migration velero-57c48b4bb-n9s4x velero time="2022-05-31T11:08:24Z" level=info msg="Validating backup storage location" backup-storage-location=automatic-gt8v9 controller=backup-storage-location logSource="pkg/controller/backup_storage_location_controller.go:114"
openshift-migration velero-57c48b4bb-n9s4x velero time="2022-05-31T11:08:24Z" level=error msg="Error getting backup metadata from backup store" backup=migration-58d98-initial-rsrdt backupLocation=automatic-gt8v9 controller=backup-sync error="rpc error: code = Unknown desc = storage: object doesn't exist" error.file="/remote-source/src/github.com/vmware-tanzu/velero/pkg/persistence/object_store.go:289" error.function="github.com/vmware-tanzu/velero/pkg/persistence.(*objectBackupStore).GetBackupMetadata" logSource="pkg/controller/backup_sync_controller.go:216"
openshift-migration velero-57c48b4bb-n9s4x velero time="2022-05-31T11:08:24Z" level=info msg="Backup storage location valid, marking as available" backup-storage-location=automatic-gt8v9 controller=backup-storage-location logSource="pkg/controller/backup_storage_location_controller.go:121"
openshift-migration migration-controller-56d764884-7fxkd mtc {"level":"info","ts":1653995305.3079288,"logger":"migration","msg":"Checking registry health","migMigration":"migration-52827"}
openshift-migration migration-controller-56d764884-7fxkd mtc {"level":"info","ts":1653995305.389897,"logger":"migration","msg":"Found 2/2 registries in healthy condition.","migMigration":"migration-52827","message":""}
openshift-migration migration-controller-56d764884-7fxkd mtc {"level":"info","ts":1653995305.390091,"logger":"migration","msg":"[RUN] (Step 30/49) Waiting for stage backup to complete.","migMigration":"migration-52827","phase":"StageBackupCreated"}
openshift-migration migration-controller-56d764884-7fxkd mtc {"level":"info","ts":1653995305.8250961,"logger":"migration","msg":"Velero Backup progress report","migMigration":"migration-52827","phase":"StageBackupCreated","backup":"openshift-migration/migration-52827-stage-z8w4d","backupProgress":["Backup openshift-migration/migration-52827-stage-z8w4d: 0 out of estimated total of 5 objects backed up (53m21s)","PodVolumeBackup openshift-migration/migration-52827-stage-z8w4d-f76h9: 0 bytes out of 0 bytes backed up (53m5s)"]}
openshift-migration migration-controller-56d764884-7fxkd mtc {"level":"info","ts":1653995305.8251326,"logger":"migration","msg":"Stage Backup on source cluster is incomplete. Waiting.","migMigration":"migration-52827","phase":"StageBackupCreated","backup":"openshift-migration/migration-52827-stage-z8w4d","backupPhase":"InProgress","backupProgress":"0/5","backupWarnings":0,"backupErrors":0}

Expected results:
Migrations should be successful.

Additional info:
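The stuck state shows up in the pipeline progress strings: nothing is backed up while the elapsed time keeps growing. As a rough illustration, the sketch below flags such entries. It is not MTC code. It assumes the progress-string shape seen in this report; the regular expression, the function names, and the 10-minute threshold are all assumptions made for this example.

```python
import re

# Matches progress strings of the shape seen in this report, e.g.
# "Backup ns/name: 0 out of estimated total of 5 objects backed up (52m56s)".
PROGRESS_RE = re.compile(
    r"^(?P<kind>\w+) (?P<name>\S+): (?P<done>[\d.]+)(?: \w+)? out of "
    r"(?:estimated total of )?(?P<total>[\d.]+) .*\((?P<elapsed>[0-9hms]+)\)$"
)


def elapsed_seconds(text):
    """Convert an elapsed string such as '52m56s' or '17s' to seconds."""
    match = re.fullmatch(r"(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?", text)
    if match is None:
        raise ValueError(f"unrecognized elapsed time: {text!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds


def is_stalled(line, threshold_seconds=600):
    """True if nothing has been backed up after threshold_seconds (arbitrary default)."""
    match = PROGRESS_RE.match(line)
    if match is None:
        return False  # not a progress string this sketch understands
    nothing_done = float(match["done"]) == 0
    return nothing_done and elapsed_seconds(match["elapsed"]) >= threshold_seconds


# Progress strings taken from the stuck migration in this report.
STUCK = [
    "Backup openshift-migration/migration-52827-stage-z8w4d: 0 out of estimated total of 5 objects backed up (52m56s)",
    "PodVolumeBackup openshift-migration/migration-52827-stage-z8w4d-f76h9: 0 bytes out of 0 bytes backed up (52m40s)",
]
# Progress string taken from the verified run.
HEALTHY = "Backup openshift-migration/migration-f3c5c-stage-xp444: 6 out of estimated total of 6 objects backed up (21s)"

if __name__ == "__main__":
    for entry in STUCK + [HEALTHY]:
        print(is_stalled(entry), entry)
```

Both entries from the stuck migration are flagged, while the entry from the verified run is not.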