Description of problem:
Stage pod creation failed with OutOfcpu in the source cluster, and the Stage action has been stuck for a long time (more than 35 minutes) instead of failing.

Version-Release number of selected component (if applicable):
MTC 1.3.0

How reproducible:
Always

Steps to Reproduce:
1. To reproduce this issue, add a resource quota that limits the CPU resource for the application's namespace in the source cluster.
2. Create a migplan to migrate the application and launch the Stage action.

Actual results:
The Stage action is stuck in the Stage Running status for a long time.

Expected results:
The Stage should fail after the Stage process has retried for a period of time.

Additional info:

Source cluster:
$ oc get pod -n ocp-29918-hooks
NAME                                            READY   STATUS     RESTARTS   AGE
nginx-deployment-69ff56478c-d6rxn               1/1     Running    0          125m
stage-nginx-deployment-69ff56478c-d6rxn-2lxvt   0/1     OutOfcpu   0          15m

Target cluster:
$ oc get migmigration -n openshift-migration c530ab50-fcb6-11ea-a51b-e794783a4dec
NAME                                   READY   PLAN              STAGE   ITINERARY   PHASE              AGE
c530ab50-fcb6-11ea-a51b-e794783a4dec   True    ocp-29918-hooks   true    Stage       StagePodsCreated   45m
As of 1.4.z+, we think a status condition should be raised if the stage pod doesn't become healthy within a certain timeout period. Let's verify that this is the case as of 1.6.0, and then we can close this as fixed.
https://github.com/konveyor/mig-controller/pull/747

This PR reports that the stage pod failed and fails the migration. Closing this since the fix has been released. Please re-open if the problem is not solved.