Bug 1982604 - MigAnalytic fails to get source cluster resources, all reported as 0 in ocp 3.9 sometimes
Keywords:
Status: CLOSED DUPLICATE of bug 1982729
Alias: None
Product: Migration Toolkit for Containers
Classification: Red Hat
Component: General
Version: 1.5.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: low
Target Milestone: ---
Target Release: 1.6.0
Assignee: Pranav Gaikwad
QA Contact: Xin jiang
Docs Contact: Avital Pinnick
URL:
Whiteboard:
Depends On:
Blocks: 1982729
 
Reported: 2021-07-15 09:12 UTC by whu
Modified: 2021-08-26 18:55 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1982729
Environment:
Last Closed: 2021-08-26 18:55:00 UTC
Target Upstream Version:
Embargoed:



Description whu 2021-07-15 09:12:19 UTC
Description of problem:
When creating a migplan for an application with PVs in an AWS OCP 3.9 cluster, MigAnalytic intermittently fails to get the source cluster resources and reports all of them as 0. The failure occurs very frequently.

Version-Release number of selected component (if applicable):
MTC 1.5.0
image: quay-enterprise-quay-enterprise.apps.cam-tgt-21420.qe.devcluster.openshift.com/admin/openshift-migration-rhel7-operator:v1.5.0-23
Source cluster : AWS OCP 3.9 (controller)
Target cluster: AWS OCP 4.8

How reproducible:
1.  Prepare an nginx application in 3.9 source cluster
$ ansible-playbook deploy-app.yml -e use_role=ocp-nginxpv -e namespace=ocp-24706-basicvolmig

2. Create an indirect migration plan for the nginx application

3. Check the analytics message in the migration plan

Actual results:
The migplan reaches Ready status, but it shows the warning message "Failed gathering extended PV usage information for PVs [nginx-logs nginx-html]". MigAnalytic reports all values as 0.

Expected results:
The migplan reaches Ready status without warning or error messages, and MigAnalytic reports the actual resource values instead of 0.

Additional info:
$ oc get pvc -n ocp-24706-basicvolmig
NAME         STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
nginx-html   Bound    pvc-602321ba-e51c-11eb-b736-0e2f00abf38f   1Gi        RWO            gp2            1h
nginx-logs   Bound    pvc-601ee567-e51c-11eb-b736-0e2f00abf38f   1Gi        RWO            gp2            1h

$ oc get migplan ocp-24706-basicvolmig-migplan-1626319591  -o yaml
apiVersion: migration.openshift.io/v1alpha1
kind: MigPlan
metadata:
  ……...
  name: ocp-24706-basicvolmig-migplan-1626319591
  namespace: openshift-migration
spec:
  destMigClusterRef:
    name: target-cluster
    namespace: openshift-migration
  indirectImageMigration: true
  indirectVolumeMigration: true
  migStorageRef:
    name: automatic
    namespace: openshift-migration
  namespaces:
  - ocp-24706-basicvolmig
  persistentVolumes:
  - capacity: 1Gi
    name: pvc-601ee567-e51c-11eb-b736-0e2f00abf38f
    proposedCapacity: "0"
    pvc:
      accessModes:
      - ReadWriteOnce
      hasReference: true
      name: nginx-logs
      namespace: ocp-24706-basicvolmig
    selection:
      action: copy
      copyMethod: filesystem
      storageClass: gp2
    storageClass: gp2
    supported:
      actions:
      - copy
      - move
      copyMethods:
      - filesystem
      - snapshot
  - capacity: 1Gi
    name: pvc-602321ba-e51c-11eb-b736-0e2f00abf38f
    proposedCapacity: "0"
    pvc:
      accessModes:
      - ReadWriteOnce
      hasReference: true
      name: nginx-html
      namespace: ocp-24706-basicvolmig
    selection:
      action: copy
      copyMethod: filesystem
      storageClass: gp2
    storageClass: gp2
    supported:
      actions:
      - copy
      - move
      copyMethods:
      - filesystem
      - snapshot
  srcMigClusterRef:
    name: host
    namespace: openshift-migration
status:
  conditions:
  - category: Required
    lastTransitionTime: 2021-07-15T03:26:36Z
    message: The `persistentVolumes` list has been updated with discovered PVs.
    reason: Done
    status: "True"
    type: PvsDiscovered
  - category: Required
    lastTransitionTime: 2021-07-15T03:26:36Z
    message: The storage resources have been created.
    reason: Done
    status: "True"
    type: StorageEnsured
  - category: Warn
    lastTransitionTime: 2021-07-15T04:11:44Z
    message: Failed gathering extended PV usage information for PVs [nginx-logs nginx-html],
      please see MigAnalytic openshift-migration/ocp-24706-basicvolmig-migplan-1626319591-szwd6
      for details
    reason: FailedRunningDf
    status: "True"
    type: ExtendedPVAnalysisFailed
  - category: Required
    lastTransitionTime: 2021-07-15T03:26:36Z
    message: The migration plan is ready.
    status: "True"
    type: Ready
  destStorageClasses:
  - accessModes:
    - ReadWriteOnce
    default: true
    name: gp2
    provisioner: kubernetes.io/aws-ebs
  - accessModes:
    - ReadWriteOnce
    name: gp2-csi
    provisioner: ebs.csi.aws.com
  excludedResources:
………

$ oc get miganalytic ocp-24706-basicvolmig-migplan-1626319591-szwd6  -n openshift-migration -o yaml
apiVersion: migration.openshift.io/v1alpha1
kind: MigAnalytic
metadata:
 ……...
  name: ocp-24706-basicvolmig-migplan-1626319591-szwd6
  namespace: openshift-migration
spec:
  analyzeExtendedPVCapacity: true
  analyzeImageCount: false
  analyzeK8SResources: false
  analyzePVCapacity: false
  migPlanRef:
    name: ocp-24706-basicvolmig-migplan-1626319591
    namespace: openshift-migration
status:
  analytics:
    excludedk8sResourceTotal: 0
    imageCount: 0
    imageSizeTotal: "0"
    incompatiblek8sResourceTotal: 0
    k8sResourceTotal: 0
    namespaces:
    - excludedK8SResourceTotal: 0
      imageCount: 0
      imageSizeTotal: "0"
      incompatibleK8SResourceTotal: 0
      k8sResourceTotal: 0
      namespace: ocp-24706-basicvolmig
      persistentVolumes:
      - actualCapacity: "0"
        comment: No change in PV capacity is needed.
        name: nginx-logs
        proposedCapacity: "0"
        requestedCapacity: 1Gi
      - actualCapacity: "0"
        comment: No change in PV capacity is needed.
        name: nginx-html
        proposedCapacity: "0"
        requestedCapacity: 1Gi
      pvCapacity: "0"
      pvCount: 0
    percentComplete: 100
    plan: ocp-24706-basicvolmig-migplan-1626319591
    pvCapacity: "0"
    pvCount: 0
  conditions:
  - category: Warn
    lastTransitionTime: 2021-07-15T03:26:34Z
    message: Failed gathering extended PV usage information for PVs [nginx-logs nginx-html]
    reason: FailedRunningDf
    status: "True"
    type: ExtendedPVAnalysisFailed
  - category: Required
    lastTransitionTime: 2021-07-15T03:26:34Z
    message: The analytic is ready.
    status: "True"
    type: Ready
  observedGeneration: 1
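
For anyone triaging the same symptom, a minimal Python sketch of how the MigAnalytic output above could be checked automatically. It assumes the object was exported with `-o json` and loaded into a dict; the function name `summarize_analytic` and the trimmed `sample` dict are illustrative only and not part of MTC. The field names and values are copied from the status shown above.

```python
def summarize_analytic(analytic: dict) -> dict:
    """Report whether extended PV analysis failed and which PVs show zero usage.

    `analytic` is assumed to be a MigAnalytic object loaded from JSON.
    """
    status = analytic.get("status", {})

    # A Warn condition of this type with status "True" marks the failure.
    failed = any(
        c.get("type") == "ExtendedPVAnalysisFailed" and c.get("status") == "True"
        for c in status.get("conditions", [])
    )

    # Collect (namespace, pv name) pairs whose measured usage came back as "0".
    zero_pvs = []
    for ns in status.get("analytics", {}).get("namespaces", []):
        for pv in ns.get("persistentVolumes", []):
            if pv.get("actualCapacity") == "0":
                zero_pvs.append((ns.get("namespace"), pv.get("name")))

    return {"analysis_failed": failed, "zero_capacity_pvs": zero_pvs}


# Trimmed copy of the status reported above (illustrative input only).
sample = {
    "status": {
        "analytics": {
            "namespaces": [
                {
                    "namespace": "ocp-24706-basicvolmig",
                    "persistentVolumes": [
                        {"name": "nginx-logs", "actualCapacity": "0",
                         "requestedCapacity": "1Gi"},
                        {"name": "nginx-html", "actualCapacity": "0",
                         "requestedCapacity": "1Gi"},
                    ],
                }
            ]
        },
        "conditions": [
            {"type": "ExtendedPVAnalysisFailed", "status": "True",
             "reason": "FailedRunningDf"},
            {"type": "Ready", "status": "True"},
        ],
    }
}

summary = summarize_analytic(sample)
print(summary)
```

A zero `actualCapacity` together with the `ExtendedPVAnalysisFailed` condition suggests the usage check did not run, rather than the volume being empty.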

Comment 1 whu 2021-07-15 09:16:22 UTC
A similar bug: https://bugzilla.redhat.com/show_bug.cgi?id=1918504

Comment 2 Jason Montleon 2021-07-15 17:22:13 UTC
Copying my notes from the 1.4.6 bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1982729#c2

This isn't so much a problem with Analytics as with the PV resize feature, which attempts to use the restic daemonset to determine the actual disk usage of the volumes.

IIUC, the failures are limited to using analytics for PV resize when migrating from older OCP releases (3.7, 3.9) and the application pod comes into existence after the restic daemonset was started.

restic uses a hostPath mount to peer into the volume, and bind remount does not exist on these versions, so if the application comes up after the daemonset, restic is oblivious to it.

Possible solutions might include restarting the daemonset before running the analytic (I think this would be costly performance-wise on large clusters), or creating a pod on the node to run the size check instead of using the restic daemonset, so that the checking pod always starts after the application.
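
The ordering condition described above can be sketched in Python. This is only an illustration: the helper names and the timestamps are hypothetical, and it assumes the two values are pod `status.startTime` strings in the same UTC format as the timestamps elsewhere in this report.

```python
from datetime import datetime, timezone


def parse_k8s_time(ts: str) -> datetime:
    """Parse a Kubernetes-style UTC timestamp such as 2021-07-15T03:26:36Z."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def started_after_restic(app_pod_start: str, restic_pod_start: str) -> bool:
    """Return True when the application pod started after the restic pod.

    On the affected releases this is the case in which the restic pod
    would not see the application's volume.
    """
    return parse_k8s_time(app_pod_start) > parse_k8s_time(restic_pod_start)


# Hypothetical start times, for illustration only.
restic_start = "2021-07-14T08:00:00Z"
app_start = "2021-07-15T02:30:00Z"

print(started_after_restic(app_start, restic_start))
```

A check like this would only identify pods at risk; it does not replace either of the possible solutions mentioned above.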

Comment 3 whu 2021-07-16 09:46:49 UTC
Additional information

After clicking the "refresh" button, the analytics were updated and showed the correct values.
However, the warning about PV resize is still present in the migplan.

Comment 4 Pranav Gaikwad 2021-07-16 19:11:47 UTC
I captured a note describing this limitation in our upstream docs here: https://github.com/konveyor/mig-operator/pull/716

Comment 5 Pranav Gaikwad 2021-08-26 18:55:00 UTC
This is a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1982729 and we do not intend to fix the underlying problem in MTC as it only affects certain cases on 3.9 clusters and there is a workaround available to unblock the users. The documentation around this is already merged. As a result, I am closing this issue.

*** This bug has been marked as a duplicate of bug 1982729 ***

