1906724 – A rollback after a "move" migration does not delete the PVs

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Migration Toolkit for Containers
Classification:	Red Hat
Component:	General
Sub Component:
Version:	1.4.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	high
Target Milestone:	---
Target Release:	1.4.0
Assignee:	Derek Whatley
QA Contact:	Xin jiang
Docs Contact:	Avital Pinnick
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2020-12-11 09:39 UTC by Sergio
Modified:	2021-02-11 12:55 UTC
CC List:	6 users
Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed:	2021-02-11 12:54:49 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2020:5329	0	None	None	None	2021-02-11 12:55:07 UTC

Description Sergio 2020-12-11 09:39:53 UTC

Description of problem:
If we run a migration using the "move" action and then roll back that migration, the PVs are not deleted from the destination cluster.

Version-Release number of selected component (if applicable):
MTC 1.4.0 stage

How reproducible:
Always

Steps to Reproduce:
1. Deploy an application that uses an external NFS volume (one that can be moved).
2. Migrate the application to the destination cluster, using the "move" action to migrate the PVs.
3. After the migration ends, roll back the migration.


Actual results:
The PVs are not deleted in the destination cluster.


$ oc get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS     CLAIM                                            STORAGECLASS                  REASON   AGE
pv18                                       100Gi      RWO,ROX,RWX    Retain           Released   testmove2/nginx-html                                                                    81m
pv29                                       100Gi      RWO,ROX,RWX    Retain           Released   testmove2/nginx-logs                                                                    81m

After the rollback the PVs are still present, with a "Released" status.
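
The same check can be scripted. The following is a minimal sketch, assuming the JSON from `oc get pv -o json` has already been loaded into a Python dict. The function name `leftover_pvs` and the sample data (including the extra `pv01` entry) are illustrative and not part of MTC.

```python
def leftover_pvs(pv_list, namespace):
    """Return the names of Released PVs whose claimRef points at `namespace`."""
    names = []
    for pv in pv_list.get("items", []):
        claim = pv.get("spec", {}).get("claimRef") or {}
        phase = pv.get("status", {}).get("phase")
        if phase == "Released" and claim.get("namespace") == namespace:
            names.append(pv["metadata"]["name"])
    return sorted(names)


# Sample trimmed to the fields the check reads; pv01 is a made-up unrelated PV.
sample = {
    "items": [
        {"metadata": {"name": "pv18"},
         "spec": {"claimRef": {"namespace": "testmove2", "name": "nginx-html"}},
         "status": {"phase": "Released"}},
        {"metadata": {"name": "pv29"},
         "spec": {"claimRef": {"namespace": "testmove2", "name": "nginx-logs"}},
         "status": {"phase": "Released"}},
        {"metadata": {"name": "pv01"},
         "spec": {},
         "status": {"phase": "Available"}},
    ]
}

print(leftover_pvs(sample, "testmove2"))  # ['pv18', 'pv29']
```

An empty result for the migrated namespace after a rollback would indicate that no moved PVs were left behind.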

If we migrate the same application again with the "move" action, the PVs left behind by the rollback are already present in the destination cluster. After migrate -> rollback -> migrate, the resulting PVCs stay in "Pending":

$ oc get pvc
NAME         STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
nginx-html   Pending   pv18     0                                        29s
nginx-logs   Pending   pv29     0                                        29s
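
One plausible explanation for the Pending state, assuming the standard Kubernetes binding rule applies here: a Released PV keeps the `claimRef` (including the UID) of the PVC that was deleted during the rollback, so a new PVC with the same name but a different UID cannot bind to it. The report itself does not confirm this as the cause. The sketch below models that rule with plain dicts; `can_bind` and the UID values are hypothetical.

```python
def can_bind(pv, pvc):
    """Simplified model of PV/PVC binding based on the PV's claimRef."""
    claim_ref = pv["spec"].get("claimRef")
    if claim_ref is None:
        return True  # no previous claim recorded on the PV
    if not claim_ref.get("uid"):
        # Pre-bound PV without a UID: match on namespace and name only.
        return (claim_ref.get("namespace"), claim_ref.get("name")) == (
            pvc["metadata"]["namespace"], pvc["metadata"]["name"])
    # A recorded UID must match the claim exactly.
    return claim_ref["uid"] == pvc["metadata"]["uid"]


# PV left over from the first migration; the UIDs are invented for the example.
released_pv = {"metadata": {"name": "pv18"},
               "spec": {"claimRef": {"namespace": "testmove2",
                                     "name": "nginx-html",
                                     "uid": "uid-first-migration"}}}
# PVC created by the second migration: same name, new UID.
new_pvc = {"metadata": {"namespace": "testmove2",
                        "name": "nginx-html",
                        "uid": "uid-second-migration"}}

print(can_bind(released_pv, new_pvc))  # False
```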


Expected results:
The PVs should be deleted after the rollback, and when we migrate the application again after the rollback the PVs should have the right content.


Additional info:

Right now we always roll back PVs with the "Retain" policy, since MTC always changes the policy to "Retain" before moving the PV.

This MTC behavior could change in the future (it has not been decided yet), and the original policy could be restored at the end of the migration. We should therefore probably take other policies into account, such as "Recycle", which could end in all the data being deleted from the storage if not handled carefully.
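
One way to guard against that risk can be sketched as follows, under the assumption that a rollback might someday encounter PVs whose policy is not "Retain". The idea is to switch the policy to "Retain" before the claim is released, so the storage backend never recycles or deletes the data, and only then remove the PV object. The helper `rollback_steps` is hypothetical: it only plans the actions as strings and does not describe how MTC is implemented.

```python
KNOWN_POLICIES = ("Retain", "Delete", "Recycle")


def rollback_steps(pv_name, reclaim_policy):
    """Return the ordered actions for rolling back one moved PV (planning only)."""
    if reclaim_policy not in KNOWN_POLICIES:
        raise ValueError(f"unknown reclaim policy: {reclaim_policy!r}")
    steps = []
    if reclaim_policy != "Retain":
        # Protect the data first: with Delete or Recycle, releasing the
        # claim would let the backend remove or scrub the volume contents.
        steps.append(f"set reclaim policy of pv/{pv_name} to Retain")
    steps.append(f"delete the pvc bound to pv/{pv_name}")
    steps.append(f"delete pv/{pv_name}")
    return steps


for step in rollback_steps("pv18", "Recycle"):
    print(step)
```

For a PV that is already "Retain", the plan reduces to deleting the PVC and then the PV object.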

Comment 1 Derek Whatley 2020-12-11 21:55:20 UTC

My first thought is that PVCs that are moved do not get the same correlation labels that we attach to other migrated resources. I'll check into this.

Comment 2 Derek Whatley 2021-01-06 22:08:07 UTC

I was able to reproduce this today. The PVC is deleted but the PV remains. I believe this is a more general problem with rollback not handling cluster-scoped resources correctly. I noticed that the delete loop we have in place only appears to look at namespaced resources.

https://github.com/konveyor/mig-controller/blob/ff4ed4f194cdbe511b01ecbe50dbe0022da0f7f4/pkg/controller/migmigration/restore.go#L584-L623

Will work on a fix tomorrow.
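
The gap described in this comment can be illustrated with a small model. This is a sketch under stated assumptions, not the mig-controller code: resources are plain dicts, migrated resources carry a label identifying the migration (the key `example.io/migrated-by` is made up), and cluster-scoped resources such as PVs have no `metadata.namespace`.

```python
MIGRATION_LABEL = "example.io/migrated-by"  # hypothetical label key


def select_for_rollback(resources, migration_id, namespaces, include_cluster_scoped):
    """Return (kind, name) pairs of resources a rollback should delete."""
    selected = []
    for res in resources:
        meta = res["metadata"]
        if meta.get("labels", {}).get(MIGRATION_LABEL) != migration_id:
            continue  # not created by this migration
        namespace = meta.get("namespace")
        if namespace is None:
            # Cluster-scoped resource: a namespace-only loop never sees it.
            if include_cluster_scoped:
                selected.append((res["kind"], meta["name"]))
        elif namespace in namespaces:
            selected.append((res["kind"], meta["name"]))
    return selected


resources = [
    {"kind": "PersistentVolumeClaim",
     "metadata": {"name": "nginx-html", "namespace": "testmove2",
                  "labels": {MIGRATION_LABEL: "mig-1"}}},
    {"kind": "PersistentVolume",
     "metadata": {"name": "pv18", "labels": {MIGRATION_LABEL: "mig-1"}}},
]

# Namespaced-only selection misses the PV; the second call picks it up.
print(select_for_rollback(resources, "mig-1", {"testmove2"}, False))
print(select_for_rollback(resources, "mig-1", {"testmove2"}, True))
```

With `include_cluster_scoped` set to false, the selection mirrors the reported behavior: the PVC is deleted but the PV remains.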

Comment 3 Derek Whatley 2021-01-08 19:23:10 UTC

Fix merged and cherry-picked into release-1.4.0 branch.

https://github.com/konveyor/mig-controller/commit/f669d75d66689c94baa92ed6ada60b6e91daeb67

Comment 7 Sergio 2021-01-21 13:29:29 UTC

Verified using MTC 1.4.0

openshift-migration-rhel7-operator@sha256:ae21f9a062bc660957807dfc540bf8c91f52b8d076b674990ad38aca6c76b7b4
    - name: MIG_CONTROLLER_REPO
      value: openshift-migration-controller-rhel8@sha256
    - name: MIG_CONTROLLER_TAG
      value: 4168777e430bada2e4a303e9df86529a62d184aa99593f93ceba8c86a5ed460f

After the rollback the moved PVs in the target cluster are deleted.

We move the issue to VERIFIED status.

Comment 9 errata-xmlrpc 2021-02-11 12:54:49 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Migration Toolkit for Containers (MTC) tool image release advisory 1.4.0), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:5329
