Bug 2189408 - [FaaS-Migration] After migration of the provider, the new provider also starts uninstalling after some time
Keywords:
Status: VERIFIED
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: odf-managed-service
Version: 4.13
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Ritesh Chikatwar
QA Contact: Neha Berry
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-04-25 06:50 UTC by suchita
Modified: 2023-08-09 17:00 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | rchikatw odf-managed-service-migration pull 34 | 0 | None | Merged | Cleaning up code | 2023-04-25 13:04:23 UTC

Description suchita 2023-04-25 06:50:16 UTC
Description of problem:
The FaaS provider also enters the uninstalling state 20-30 minutes after a successful provider migration.

Version-Release number of selected component (if applicable):
$ oc get csv -n fusion-storage
NAME                                      DISPLAY                       VERSION             REPLACES                                  PHASE
managed-fusion-agent.v2.0.11              Managed Fusion Agent          2.0.11                                                        Succeeded
observability-operator.v0.0.20            Observability Operator        0.0.20              observability-operator.v0.0.19            Succeeded
ocs-operator.v4.13.0-168.stable           OpenShift Container Storage   4.13.0-168.stable                                             Succeeded
ose-prometheus-operator.4.10.0            Prometheus Operator           4.10.0                                                        Succeeded
route-monitor-operator.v0.1.500-6152b76   Route Monitor Operator        0.1.500-6152b76     route-monitor-operator.v0.1.498-e33e391   Pending

clusterVersion:
NAME      VERSION   
version   4.12.13

How reproducible:
2/2

Steps to Reproduce:
1. Install an appliance-mode cluster.
2. Install the FaaS agent provider.
3. Use the migrate.sh script to migrate the cluster:
 ./migrate.sh -provider <oldClusterID> <newClusterID> -d -dev
 (repo checked out after PR#33 was merged)
4. Wait for the provider migration to complete.
5. Keep a watch on the status of all clusters using the rosa list clusters command.
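
Step 5 can be automated with a small check. The Python sketch below is illustrative only. It assumes that `rosa list clusters --output json` prints a JSON array of cluster objects with `name` and `state` fields, and that a cluster being removed reports the state `uninstalling`; both assumptions should be checked against the installed rosa version. The function name and the sample cluster names are hypothetical.

```python
import json
from typing import List


def uninstalling_clusters(rosa_json: str) -> List[str]:
    """Return the names of clusters whose state is "uninstalling".

    Assumes rosa_json is a JSON array of objects with "name" and "state"
    fields (an assumption about the rosa JSON output, not confirmed here).
    """
    clusters = json.loads(rosa_json)
    return [c.get("name", "") for c in clusters if c.get("state") == "uninstalling"]


# Hypothetical sample standing in for the captured rosa output.
sample = json.dumps([
    {"name": "old-appliance", "state": "uninstalling"},
    {"name": "new-faas-provider", "state": "ready"},
])
print(uninstalling_clusters(sample))
```

After a migration, only the old provider is expected in this list; the new FaaS provider appearing in it would indicate the problem described in this bug.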


Actual results:
The provider migration completes and the old provider and its service are deleted, but the new FaaS agent provider cluster is deleted as well.

Expected results:
The provider migration completes and the old provider and its service are deleted. The new FaaS agent provider cluster should remain in a ready state.

Additional info:

Workaround:
The root cause is noted in another bug, https://bugzilla.redhat.com/show_bug.cgi?id=2189409,
which causes the new cluster to be deleted while the older appliance provider is being deleted.

The workaround: as soon as the migration script shows the message below, go to the AWS console -> Volumes and add a filter for volumes with the new provider's name. On the details page of each mon and OSD volume, go to Tags -> Manage tags and delete the tag whose key contains the name of the old provider.
Example: key "kubernetes.io/cluster/sgatfane-p1425-svjlq" with value "owned", where sgatfane-p1425 is the appliance cluster name.
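
The tag selection in this workaround can be sketched in Python. This is an illustration under assumptions, not part of the migration tooling: the function name `stale_cluster_tag_keys` is hypothetical, the second sample tag is invented, and the sketch assumes the old provider's tag key has the form `kubernetes.io/cluster/<cluster name>-<suffix>`, as in the example above. It matches on that prefix, which is stricter than "key contains the name".

```python
from typing import Dict, List

CLUSTER_TAG_PREFIX = "kubernetes.io/cluster/"


def stale_cluster_tag_keys(tags: List[Dict[str, str]], old_cluster_name: str) -> List[str]:
    """Return keys of "owned" cluster tags that belong to the old cluster."""
    wanted_prefix = CLUSTER_TAG_PREFIX + old_cluster_name + "-"
    stale = []
    for tag in tags:
        key = tag.get("Key", "")
        if key.startswith(wanted_prefix) and tag.get("Value") == "owned":
            stale.append(key)
    return stale


# Tags of one volume; the second entry is a hypothetical new-provider tag.
volume_tags = [
    {"Key": "kubernetes.io/cluster/sgatfane-p1425-svjlq", "Value": "owned"},
    {"Key": "kubernetes.io/cluster/newprovider-abcde", "Value": "owned"},
    {"Key": "kubernetes.io/created-for/pvc/namespace", "Value": "openshift-storage"},
]
print(stale_cluster_tag_keys(volume_tags, "sgatfane-p1425"))
```

The returned keys are the ones to delete, either on the Manage tags page or with `aws ec2 delete-tags`.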

Comment 1 Rewant 2023-04-25 10:43:49 UTC
The script fetches the tag key of the AWS EBS volumes using

aws ec2 describe-volumes --volume-id $volumeID --filters Name=tag:kubernetes.io/created-for/pvc/namespace,Values=openshift-storage  --region $region --query "Volumes[*].Tags" | jq .[] | jq -r '.[]| select (.Value == "owned")|.Key'

The returned key is then used to replace the tag for the name. If the default output format in aws configure is not set to json, this pipeline fails.

If the old provider's owned tag is not deleted, the EBS volumes are deleted when the old cluster is deleted. We think this might be the reason the new provider got deleted.

We added the --output flag to each command that fetches the tags. That should solve the issue.
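
To illustrate why the output format matters, here is a minimal Python sketch of the same selection the jq filter performs. It assumes the input is what the `--query "Volumes[*].Tags"` call returns when JSON output is in effect (one list of tags per volume). The function name `owned_tag_keys` and the sample data are hypothetical and are not taken from the migration script.

```python
import json
from typing import List


def owned_tag_keys(describe_volumes_output: str) -> List[str]:
    """Return the keys of all tags whose value is "owned".

    Expects a JSON list with one list of {"Key": ..., "Value": ...}
    objects per volume. Raises ValueError when the input is not JSON,
    which is what happens with text or table output.
    """
    try:
        volumes = json.loads(describe_volumes_output)
    except json.JSONDecodeError as err:
        raise ValueError("expected JSON output from the AWS CLI") from err
    return [
        tag["Key"]
        for tags in volumes
        for tag in (tags or [])
        if tag.get("Value") == "owned"
    ]


# Hypothetical sample: one volume carrying the appliance cluster's "owned" tag.
sample_json = json.dumps([[
    {"Key": "kubernetes.io/created-for/pvc/namespace", "Value": "openshift-storage"},
    {"Key": "kubernetes.io/cluster/sgatfane-p1425-svjlq", "Value": "owned"},
]])
print(owned_tag_keys(sample_json))
```

Passing tab-separated text output to the same function raises `ValueError`, which mirrors the failure described above when the default output is not json.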

PR: https://github.com/rchikatw/odf-managed-service-migration/pull/34

Comment 2 Ritesh Chikatwar 2023-04-25 13:04:24 UTC
Suchita,

PR https://github.com/rchikatw/odf-managed-service-migration/pull/34 is merged. Please take the latest changes and verify the migration.

Comment 3 suchita 2023-04-26 07:13:09 UTC
Yesterday's migrated provider hit this issue even with the workaround applied (removal of the tags containing the appliance-mode provider name from the volumes).
The issue is still observed, though not immediately after migration; it typically appears 12-14 hours after the first FaaS provider creation.

I will update further after migration with changes from PR#34

Comment 5 suchita 2023-05-08 08:06:28 UTC
After observing 4 migration setups (migrated with PR#34 or later), this uninstallation of the provider was not observed.
Marking this BZ as verified.

