Description of problem:

When we run a migration with the "verify" option enabled and there is corrupted data, the migmigration resource does not display a `ResticVerifyErrors` warning. Instead, we get a `VeleroStageRestorePartiallyFailed` warning.

Version-Release number of selected component (if applicable):

TARGET CLUSTER: AWS OCP4.9 GP2 (MTC 1.6.0) (CONTROLLER + UI)
SOURCE CLUSTER: AWS OCP3.11 GP2 (MTC 1.5.1)
REPLICATION REPOSITORY: AWS S3

How reproducible:

Always

Steps to Reproduce:
1. Follow "OCP-30240 - Data validation with indirect migration" test steps

Actual results:

The migmigration resource displays this warning:

12:41:14.479 - INFO: "category": "Warn",
12:41:14.480 - INFO: "durable": true,
12:41:14.482 - INFO: "lastTransitionTime": "2021-09-17T12:40:55Z",
12:41:14.483 - INFO: "message": "Stage Restore openshift-migration/ocp-30240-datavalidation-mig-1631882348-stage-lnsfb: partially failed on destination cluster",
12:41:14.484 - INFO: "status": "True",
12:41:14.485 - INFO: "type": "VeleroStageRestorePartiallyFailed"
12:41:14.485 - INFO: },

The podvolumerestore resource displays this error:

status:
  completionTimestamp: "2021-09-17T12:49:22Z"
  message: |-
    error restoring volume: error running restic restore, cmd=restic restore --repo=s3:s3-us-east-2.amazonaws.com/camreplication/velero/restic/ocp-30240-datavalidation --password-file=/tmp/credentials/openshift-migration/velero-restic-credentials-repository-password --cache-dir=/scratch/.cache/restic 8b2700ff --target=. --delete --verify, stdout=restoring <Snapshot 8b2700ff of [/host_pods/fecb458b-17b4-11ec-97b2-0ef8128329e3/volumes/kubernetes.io~aws-ebs/pvc-feb97997-17b4-11ec-97b2-0ef8128329e3] at 2021-09-17 12:49:03.908299474 +0000 UTC by root@velero> to .
    verifying files in .
    , stderr=ignoring error for /index.html: Invalid file size: expected 21 got 0
    Fatal: There were 1 errors
    : exit status 1
  phase: Failed
  progress:
    bytesDone: 21
    totalBytes: 21
  startTimestamp: "2021-09-17T12:49:20Z"

The restore resource displays this error:

status:
  completionTimestamp: "2021-09-17T12:49:24Z"
  errors: 1
  phase: PartiallyFailed
  progress:
    itemsRestored: 6
    totalItems: 6
  startTimestamp: "2021-09-17T12:49:07Z"
  warnings: 2

If we describe the restore resource we get this error:

Errors:
  Velero:     pod volume restore failed: error restoring volume: error running restic restore, cmd=restic restore --repo=s3:s3-us-east-2.amazonaws.com/camreplication/velero/restic/ocp-30240-datavalidation --password-file=/tmp/credentials/openshift-migration/velero-restic-credentials-repository-password --cache-dir=/scratch/.cache/restic 8b2700ff --target=. --delete --verify, stdout=restoring <Snapshot 8b2700ff of [/host_pods/fecb458b-17b4-11ec-97b2-0ef8128329e3/volumes/kubernetes.io~aws-ebs/pvc-feb97997-17b4-11ec-97b2-0ef8128329e3] at 2021-09-17 12:49:03.908299474 +0000 UTC by root@velero> to . verifying files in . , stderr=ignoring error for /index.html: Invalid file size: expected 21 got 0 Fatal: There were 1 errors : exit status 1
  Cluster:    <none>
  Namespaces: <none>

In the restic pod we can see this error:

time="2021-09-17T12:47:06Z" level=info msg="Ran command=restic restore --repo=s3:s3-us-east-2.amazonaws.com/camreplication/velero/restic/ocp-30240-datavalidation --password-file=/tmp/credentials/openshift-migration/velero-restic-credentials-repository-password --cache-dir=/scratch/.cache/restic 35cc7b9a --target=. --delete --verify, stdout=restoring <Snapshot 35cc7b9a of [/host_pods/fecb458b-17b4-11ec-97b2-0ef8128329e3/volumes/kubernetes.io~aws-ebs/pvc-feb97997-17b4-11ec-97b2-0ef8128329e3] at 2021-09-17 12:46:22.060127014 +0000 UTC by root@velero> to .\nverifying files in .\nfinished verifying 1 files in .\n, stderr=" controller=pod-volume-restore logSource="pkg/controller/pod_volume_restore_controller.go:399" name=ocp-30240-datavalidation-mig-1631882720-stage-phas-stage-zm5drc namespace=openshift-migration restore=openshift-migration/ocp-30240-datavalidation-mig-1631882720-stage-phas-stage-zfqfh
time="2021-09-17T12:47:06Z" level=info msg="Restore completed" controller=pod-volume-restore logSource="pkg/controller/pod_volume_restore_controller.go:323" name=ocp-30240-datavalidation-mig-1631882720-stage-phas-stage-zm5drc namespace=openshift-migration restore=openshift-migration/ocp-30240-datavalidation-mig-1631882720-stage-phas-stage-zfqfh
time="2021-09-17T12:47:06Z" level=info msg="Restore starting" controller=pod-volume-restore logSource="pkg/controller/pod_volume_restore_controller.go:282" name=ocp-30240-datavalidation-mig-1631882720-stage-phas-stage-z9gz8w namespace=openshift-migration restore=openshift-migration/ocp-30240-datavalidation-mig-1631882720-stage-phas-stage-zfqfh
time="2021-09-17T12:47:08Z" level=info msg="Ran command=restic restore --repo=s3:s3-us-east-2.amazonaws.com/camreplication/velero/restic/ocp-30240-datavalidation --password-file=/tmp/credentials/openshift-migration/velero-restic-credentials-repository-password --cache-dir=/scratch/.cache/restic 2cb7df66 --target=. --delete --verify, stdout=restoring <Snapshot 2cb7df66 of [/host_pods/fecb458b-17b4-11ec-97b2-0ef8128329e3/volumes/kubernetes.io~aws-ebs/pvc-feb4a4b2-17b4-11ec-97b2-0ef8128329e3] at 2021-09-17 12:46:19.488067273 +0000 UTC by root@velero> to .\nverifying files in .\nfinished verifying 2 files in .\n, stderr=" controller=pod-volume-restore logSource="pkg/controller/pod_volume_restore_controller.go:399" name=ocp-30240-datavalidation-mig-1631882720-stage-phas-stage-z9gz8w namespace=openshift-migration restore=openshift-migration/ocp-30240-datavalidation-mig-1631882720-stage-phas-stage-zfqfh
time="2021-09-17T12:47:08Z" level=info msg="Restore completed" controller=pod-volume-restore logSource="pkg/controller/pod_volume_restore_controller.go:323" name=ocp-30240-datavalidation-mig-1631882720-stage-phas-stage-z9gz8w namespace=openshift-migration restore=openshift-migration/ocp-30240-datavalidation-mig-1631882720-stage-phas-stage-zfqfh
time="2021-09-17T12:49:20Z" level=info msg="Restore starting" controller=pod-volume-restore logSource="pkg/controller/pod_volume_restore_controller.go:282" name=ocp-30240-datavalidation-mig-1631882901-stage-jzq7g-5mmnv namespace=openshift-migration restore=openshift-migration/ocp-30240-datavalidation-mig-1631882901-stage-jzq7g
time="2021-09-17T12:49:22Z" level=error msg="Error restoring volume" controller=pod-volume-restore error="error running restic restore, cmd=restic restore --repo=s3:s3-us-east-2.amazonaws.com/camreplication/velero/restic/ocp-30240-datavalidation --password-file=/tmp/credentials/openshift-migration/velero-restic-credentials-repository-password --cache-dir=/scratch/.cache/restic 8b2700ff --target=. --delete --verify, stdout=restoring <Snapshot 8b2700ff of [/host_pods/fecb458b-17b4-11ec-97b2-0ef8128329e3/volumes/kubernetes.io~aws-ebs/pvc-feb97997-17b4-11ec-97b2-0ef8128329e3] at 2021-09-17 12:49:03.908299474 +0000 UTC by root@velero> to .\nverifying files in .\n, stderr=ignoring error for /index.html: Invalid file size: expected 21 got 0\nFatal: There were 1 errors\n\n: exit status 1" error.file="/go/src/github.com/vmware-tanzu/velero/pkg/controller/pod_volume_restore_controller.go:397" error.function="github.com/vmware-tanzu/velero/pkg/controller.(*podVolumeRestoreController).restorePodVolume" logSource="pkg/controller/pod_volume_restore_controller.go:310" name=ocp-30240-datavalidation-mig-1631882901-stage-jzq7g-5mmnv namespace=openshift-migration restore=openshift-migration/ocp-30240-datavalidation-mig-1631882901-stage-jzq7g
time="2021-09-17T12:49:22Z" level=info msg="Restore starting" controller=pod-volume-restore logSource="pkg/controller/pod_volume_restore_controller.go:282" name=ocp-30240-datavalidation-mig-1631882901-stage-jzq7g-j2nxn namespace=openshift-migration restore=openshift-migration/ocp-30240-datavalidation-mig-1631882901-stage-jzq7g
time="2021-09-17T12:49:24Z" level=info msg="Ran command=restic restore --repo=s3:s3-us-east-2.amazonaws.com/camreplication/velero/restic/ocp-30240-datavalidation --password-file=/tmp/credentials/openshift-migration/velero-restic-credentials-repository-password --cache-dir=/scratch/.cache/restic e3a4a92a --target=. --delete --verify, stdout=restoring <Snapshot e3a4a92a of [/host_pods/fecb458b-17b4-11ec-97b2-0ef8128329e3/volumes/kubernetes.io~aws-ebs/pvc-feb4a4b2-17b4-11ec-97b2-0ef8128329e3] at 2021-09-17 12:49:01.578486552 +0000 UTC by root@velero> to .\nverifying files in .\nfinished verifying 2 files in .\n, stderr=" controller=pod-volume-restore logSource="pkg/controller/pod_volume_restore_controller.go:399" name=ocp-30240-datavalidation-mig-1631882901-stage-jzq7g-j2nxn namespace=openshift-migration restore=openshift-migration/ocp-30240-datavalidation-mig-1631882901-stage-jzq7g
time="2021-09-17T12:49:24Z" level=info msg="Restore completed" controller=pod-volume-restore logSource="pkg/controller/pod_volume_restore_controller.go:323" name=ocp-30240-datavalidation-mig-1631882901-stage-jzq7g-j2nxn namespace=openshift-migration restore=openshift-migration/ocp-30240-datavalidation-mig-1631882901-stage-jzq7g

Expected results:

The migmigration resource should display a `ResticVerifyErrors` warning instead of a `VeleroStageRestorePartiallyFailed` warning.

Additional info:
This PR fixes the issue: https://github.com/openshift/velero/pull/147
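For reference, the misreported condition can be checked mechanically. The following is a minimal Python sketch, not MTC code: the helper name `verify_error_surfaced` is ours, and the condition dicts simply mirror the migmigration excerpt shown above.

```python
# Hypothetical helper: classify how a verify failure was surfaced in a
# migmigration's status.conditions. Condition shape mirrors the excerpt
# in this report; this is illustrative only, not MTC controller code.

def verify_error_surfaced(conditions):
    """Return 'specific' if a ResticVerifyErrors condition is set,
    'generic' if only VeleroStageRestorePartiallyFailed is, else 'none'."""
    types = {c["type"] for c in conditions if c.get("status") == "True"}
    if "ResticVerifyErrors" in types:
        return "specific"
    if "VeleroStageRestorePartiallyFailed" in types:
        return "generic"
    return "none"

# Conditions as reported by the buggy MTC 1.6.0 run above:
buggy = [{
    "category": "Warn",
    "durable": True,
    "lastTransitionTime": "2021-09-17T12:40:55Z",
    "status": "True",
    "type": "VeleroStageRestorePartiallyFailed",
}]
print(verify_error_surfaced(buggy))  # prints "generic" -- the bug: no ResticVerifyErrors
```

With the fix applied, a run hitting a restic verify failure should classify as "specific" instead.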
Verified with MTC 1.7.0 (pre-stage)

metadata_nvr: openshift-migration-operator-metadata-container-v1.7.0-33
image: registry.redhat.io/rhmtc/openshift-migration-controller-rhel8@sha256:c08211e7fa12e8a12ea07ed0f31bf24bdd4cc95b166a34d826e3240a0534e622

The migmigration resource is displaying the correct warning:

$ oc get migmigration ocp-30240-datavalidation-mig-1645536593 -o yaml
- category: Warn
  durable: true
  lastTransitionTime: "2022-02-22T13:32:27Z"
  message: There were verify errors found in 1 Restic volume restores. See restore `ocp-30240-datavalidation-mig-1645536593-stage-hqzds` for details
  status: "True"
  type: ResticVerifyErrors

Moving status to Verified.
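The fixed condition embeds the failing restore's name in backticks, so it can be pulled out for follow-up (e.g. to inspect that restore with `oc get restore`). A small sketch, assuming the message format shown above; the regex and function name are ours, not part of MTC:

```python
import re

# Hypothetical helper: extract the restore name quoted in a
# ResticVerifyErrors condition message like the one shown above.
def restore_from_message(message):
    m = re.search(r"See restore `([^`]+)` for details", message)
    return m.group(1) if m else None

msg = ("There were verify errors found in 1 Restic volume restores. "
       "See restore `ocp-30240-datavalidation-mig-1645536593-stage-hqzds` for details")
print(restore_from_message(msg))
# prints "ocp-30240-datavalidation-mig-1645536593-stage-hqzds"
```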
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Migration Toolkit for Containers (MTC) 1.7.0 release advisory), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2022:1043