Bug 1844638 - Automatic rollback of migrated workloads is not configurable
Summary: Automatic rollback of migrated workloads is not configurable
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Migration Tooling
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.4.z
Assignee: Derek Whatley
QA Contact: Xin jiang
URL:
Whiteboard:
Depends On: 1845092
Blocks:
 
Reported: 2020-06-05 20:58 UTC by Derek Whatley
Modified: 2020-06-17 00:04 UTC
CC List: 6 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Cloned To: 1845092
Environment:
Last Closed: 2020-06-17 00:04:13 UTC
Target Upstream Version:
Embargoed:


Attachments: None


Links
  System: Red Hat Product Errata
  ID: RHEA-2020:2571
  Last Updated: 2020-06-17 00:04:22 UTC

Description Derek Whatley 2020-06-05 20:58:09 UTC
Description of problem:
When a migration fails, the user has no option to leave partially migrated workloads in place on the destination cluster so that the migration can be finished manually.

Version-Release number of selected component (if applicable):
1.2.0

How reproducible:
Always

Steps to Reproduce:
1. Run a migration
2. Encounter an error

Actual results:
The migration enters a failed state and then runs the "FailedItinerary", which deletes the migrated resources from the target cluster and scales the workloads back up on the source cluster.


Expected results:
The user is given a configuration option to enable or disable the automatic rollback functionality.


Additional info:
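A minimal sketch of what such a switch could look like in the MigrationController resource, using the mig_failure_rollback attribute described later in this bug (comment 6). The apiVersion, metadata names, and values below are illustrative assumptions, not taken from the fix itself:

    apiVersion: migration.openshift.io/v1alpha1
    kind: MigrationController
    metadata:
      name: migration-controller
      namespace: openshift-migration
    spec:
      # Hypothetical setting: leave partially migrated workloads in place on the
      # destination cluster instead of rolling them back automatically on failure.
      mig_failure_rollback: false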

Comment 1 Derek Whatley 2020-06-05 20:58:51 UTC
PRs adding a config switch for this are up, working on getting them tested.

mig-controller: https://github.com/konveyor/mig-controller/pull/560
mig-operator: https://github.com/konveyor/mig-operator/pull/370

Comment 2 Derek Whatley 2020-06-08 20:28:41 UTC
PRs merged and cherry-picked to release-1.2.2 branches, waiting for next build.


https://github.com/konveyor/mig-controller/pull/560
https://github.com/konveyor/mig-operator/pull/370

Comment 6 Sergio 2020-06-10 15:25:06 UTC
Verified using CAM 1.2.2 stage

To verify the issue, we configured a very short restic timeout and removed the restic pods to force a failure in the 'StageRestoreCreated' stage, then ran a quiesced migration of a Django application.

By default (without configuring the mig_failure_rollback attribute in the MigrationController resource), the migration failed and the pods in the source application remained quiesced.

When we configured "mig_failure_rollback: true" and ran the migration again, the PVC was deleted in the target cluster, and the pods in the source cluster were scaled up again and worked fine.
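For reference, a sketch of a MigrationController spec for this kind of verification run. The mig_failure_rollback attribute is the one verified here; restic_timeout (used to force the failure) and the surrounding names and values are assumptions for illustration:

    apiVersion: migration.openshift.io/v1alpha1
    kind: MigrationController
    metadata:
      name: migration-controller
      namespace: openshift-migration
    spec:
      # Roll back partially migrated workloads on failure (the behavior verified above).
      mig_failure_rollback: true
      # Assumed parameter: a deliberately short restic timeout to force a failure
      # during the StageRestoreCreated phase.
      restic_timeout: 30s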


    - name: MIG_CONTROLLER_REPO
      value: openshift-migration-controller-rhel8@sha256
    - name: MIG_CONTROLLER_TAG
      value: 3923f6000eaff8c5f02d778e1d7b93515a8bc23990d54f917c30a108f7a37b3a
    - name: MIG_UI_REPO
      value: openshift-migration-ui-rhel8@sha256
    - name: MIG_UI_TAG
      value: 6abfaea8ac04e3b5bbf9648a3479b420b4baec35201033471020c9cae1fe1e11
    - name: MIGRATION_REGISTRY_REPO
      value: openshift-migration-registry-rhel8@sha256
    - name: MIGRATION_REGISTRY_TAG
      value: ea6301a15277d448c8756881c7e2e712893ca8041c913476640f52da9e76cad9
    - name: VELERO_REPO
      value: openshift-migration-velero-rhel8@sha256
    - name: VELERO_TAG
      value: 1a33e327dd610f0eebaaeae5b3c9b4170ab5db572b01a170be35b9ce946c0281
    - name: VELERO_PLUGIN_REPO
      value: openshift-migration-plugin-rhel8@sha256
    - name: VELERO_PLUGIN_TAG
      value: 7eba00127497c4ca6452f9be0c167c2276bed462b648edf51d8bbe7265392879

Comment 8 errata-xmlrpc 2020-06-17 00:04:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:2571

