Bug 1906730

Summary: PV policy has changed after running a migration using the "move" action
Product: Migration Toolkit for Containers
Reporter: Sergio <sregidor>
Component: General
Assignee: Scott Seago <sseago>
Status: CLOSED ERRATA
QA Contact: Xin jiang <xjiang>
Severity: medium
Docs Contact: Avital Pinnick <apinnick>
Priority: medium
Version: 1.4.0
CC: chezhang, ernelson, rjohnson, rpattath, whu, xjiang
Target Milestone: ---   
Target Release: 1.4.0   
Hardware: Unspecified   
OS: Unspecified   
Doc Type: If docs needed, set a value
Last Closed: 2021-02-11 12:54:49 UTC
Type: Bug

Description Sergio 2020-12-11 10:07:48 UTC
Description of problem:
When we move a PV whose reclaim policy is not "Retain", the policy is changed to "Retain", and it stays that way in the destination cluster after the migration ends.

Version-Release number of selected component (if applicable):
MTC 1.4.0

How reproducible:
Always

Steps to Reproduce:
1. Deploy an application using an external NFS volume (one that can be migrated)
2. Set "Recycle" as the reclaim policy of the PVs used by this application

oc edit pv ${PV_NAME}

For instance:
$ oc get pv pv18
NAME      CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS    CLAIM                  STORAGECLASS   REASON    AGE
pv18      100Gi      RWO,ROX,RWX    Recycle          Bound     testmove2/nginx-html                            26m


3. Migrate the application, using the "move" action to migrate the PV.
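
Step 2 can also be done non-interactively with `oc patch` instead of `oc edit`. This is a minimal sketch, not part of the original report: the helper names `reclaim_policy_patch` and `set_reclaim_policy` are hypothetical, and pv18 is only the example PV shown above.

```shell
#!/bin/sh
# Hypothetical helper: print the JSON patch that sets a PV's reclaim policy.
# $1 is the policy name, e.g. Retain, Recycle or Delete.
reclaim_policy_patch() {
    printf '{"spec":{"persistentVolumeReclaimPolicy":"%s"}}' "$1"
}

# Hypothetical helper: apply that patch to the named PV.
# $1 is the PV name, $2 is the policy. Requires a logged-in oc session.
set_reclaim_policy() {
    oc patch pv "$1" -p "$(reclaim_policy_patch "$2")"
}

# Example usage (needs cluster access, so it is left commented out):
#   set_reclaim_policy pv18 Recycle
#   oc get pv pv18 -o jsonpath='{.spec.persistentVolumeReclaimPolicy}'
```

Building the patch in a separate function makes it possible to inspect the JSON before anything is sent to the cluster.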


Actual results:

In both the source and destination clusters, the PV is configured with the "Retain" policy instead of the original "Recycle".

$ oc get pv pv18
NAME      CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS    CLAIM                  STORAGECLASS   REASON    AGE
pv18      100Gi      RWO,ROX,RWX    Retain           Bound     testmove2/nginx-html                            31m


Expected results:
Configuring the PV with the "Retain" policy is necessary to avoid undesired behavior while the volume is being migrated.

Maybe the solution is not to restore the original policy, but to add a note to the MTC "move" action documentation explaining that this is a side effect of the "move" action, and that the original policy must be restored manually if it is wanted back.


Additional info:

Be aware that restoring the original policy could lead to undesired interactions with rollback. It needs to be handled carefully to avoid deleting the volume data, for instance when the original policy is "Recycle".

Comment 1 Scott Seago 2021-01-04 18:27:54 UTC
What's not yet clear is what we want to do to fix this. There are a few options:
1) Do nothing; document this as a side effect of move operations
2) Annotate PVCs where we modify policy and restore the original policy post-restore. As mentioned in the bug, this has implications for rollback. It's possible that rollback will lose PV data if we do this and later attempt rollback.
3) Do the work from 2) above and also exclude these PVCs from rollback; document that "rollback" doesn't roll back moved PVs
4) Annotate as in 2) but don't update the policy. Provide some sort of post-migration action that will restore the PVC policy once we no longer plan to do any more rollback operations. Document that once this is done, rollback is dangerous for moved PVs. Would this be a shell script or some other manual operation?

Comment 2 Scott Seago 2021-01-05 19:41:41 UTC
As discussed with John this morning, I think the right approach here is choice #4.

We'll create an annotation for any PVC where we modify the policy, but we won't actually modify the policy post-migration. Reverting the policy post-migration brings with it the risk that rollback will delete user data. We don't want that, but we want to provide users with the information necessary to manually restore the pre-migration policy. To do this we want to add an annotation indicating the previous policy any time we modify policy as part of a migration action.

Comment 3 Scott Seago 2021-01-08 21:02:07 UTC
Fix is here: https://github.com/konveyor/openshift-velero-plugin/pull/67

Comment 7 Sergio 2021-01-22 15:48:46 UTC
Verified using MTC 1.4.0

openshift-migration-rhel7-operator@sha256:41d653c7fea749cb6add3aadb5716b9a443ab83adef88504915177bbc5aa0fda
    - name: MIG_CONTROLLER_REPO
      value: openshift-migration-controller-rhel8@sha256
    - name: MIG_CONTROLLER_TAG
      value: 94b2886c52a97b39b6643cca5af7948a90be8a89f5110d596be117f34d54abff



Two PVCs were moved: one with the Retain policy, the other with the Recycle policy.

After the migration both PVCs had the Retain policy.

After the migration the PVC that originally had the Recycle policy had this annotation:

metadata:
  annotations:
    migration.openshift.io/orig-reclaim-policy: Recycle


We need to take into account that if we execute a move migration and then a rollback, the Recycle PV will have the "Retain" policy after the rollback instead of the original one. It will still have the annotation too. The rollback will not restore the original policy.
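
Once no further rollback is planned, the recorded policy can be put back by hand. The following is a sketch under assumptions, not a procedure taken from the fix: the helper names are hypothetical, and it assumes the `migration.openshift.io/orig-reclaim-policy` annotation can be read from the PV object itself (this report describes it on the PVC, so adjust the `oc get` call if that is where it lives in your cluster). As noted above, restoring "Recycle" before a rollback risks deleting the volume data.

```shell
#!/bin/sh
ORIG_POLICY_ANNOTATION='migration.openshift.io/orig-reclaim-policy'

# Hypothetical helper: build the jsonpath expression that reads the annotation.
# Dots inside the annotation key must be backslash-escaped for jsonpath.
orig_policy_jsonpath() {
    escaped=$(printf '%s' "$ORIG_POLICY_ANNOTATION" | sed 's/\./\\./g')
    printf '{.metadata.annotations.%s}' "$escaped"
}

# Hypothetical helper: read the recorded policy from the PV ($1) and patch it back.
# Assumption: the annotation is present on the PV; returns 1 if it is missing.
restore_reclaim_policy() {
    pv_name=$1
    orig=$(oc get pv "$pv_name" -o jsonpath="$(orig_policy_jsonpath)")
    if [ -z "$orig" ]; then
        echo "no original reclaim policy recorded on pv/$pv_name" >&2
        return 1
    fi
    oc patch pv "$pv_name" \
        -p "$(printf '{"spec":{"persistentVolumeReclaimPolicy":"%s"}}' "$orig")"
}

# Example usage (needs cluster access, so it is left commented out):
#   restore_reclaim_policy pv18
```

The function refuses to patch when the annotation is absent, so a PV whose policy was never modified by the migration is left untouched.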

Moved to VERIFIED status.

Comment 9 errata-xmlrpc 2021-02-11 12:54:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Migration Toolkit for Containers (MTC) tool image release advisory 1.4.0), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:5329