Bug 1829401 - Failed hooks cannot be run again
Summary: Failed hooks cannot be run again
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Migration Tooling
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.4.0
Assignee: Jason Montleon
QA Contact: Xin jiang
URL: https://github.com/konveyor/mig-contr...
Whiteboard:
Depends On:
Blocks:
Reported: 2020-04-29 13:59 UTC by Sergio
Modified: 2020-05-28 11:11 UTC

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-05-28 11:10:47 UTC
Target Upstream Version:
Embargoed:




Links
System                   ID               Private   Priority   Status   Summary   Last Updated
Red Hat Product Errata   RHEA-2020:2326   0         None       None     None      2020-05-28 11:11:15 UTC

Description Sergio 2020-04-29 13:59:54 UTC
Description of problem:
When a hook attached to a migration plan fails and we cancel and retrigger the migration, no new job is created for the hook. Because the previous failed job has already reached the max backoff limit, the playbook is never run again.

Version-Release number of selected component (if applicable):
KONVEYOR 1.2

How reproducible:
Always.

Steps to Reproduce:
1. Create a migration plan with a hook that will always fail (any hook with syntax errors, for instance)
2. Migrate the plan
3. The migration will fail (there is a bug that can leave the migration stuck forever instead of failing it; if that happens, cancel the migration)
4. Run the migration again

Actual results:
The new run does not create a new job for the hook. Since the old job has already reached the max backoff limit, the hook is never executed.

Expected results:
Since we are running the migration again, a new job should be created for the hook and the hook should be run again.
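
The reported behavior can be modeled with a small sketch. This is not the mig-controller code: the `Job` class, the `plan` and `hook` label keys, and `ensure_hook_job` are hypothetical, and the sketch assumes the controller reuses whatever job it finds for the plan and hook. The backoff limit value is also an assumption.

```python
from dataclasses import dataclass

BACKOFF_LIMIT = 6  # assumed value; Kubernetes Jobs default backoffLimit to 6


@dataclass
class Job:
    labels: dict
    failed_attempts: int = 0

    @property
    def exhausted(self) -> bool:
        # Simplified model: once the backoff limit is used up, no more retries.
        return self.failed_attempts >= BACKOFF_LIMIT


def ensure_hook_job(jobs, plan, hook):
    """Return the job for (plan, hook), creating one only if none exists."""
    wanted = {"plan": plan, "hook": hook}
    for job in jobs:
        if all(job.labels.get(k) == v for k, v in wanted.items()):
            return job  # reused even if it has already failed for good
    job = Job(labels=wanted)
    jobs.append(job)
    return job


jobs = []
first = ensure_hook_job(jobs, "hookfail", "postrestore")
first.failed_attempts = BACKOFF_LIMIT  # the hook always fails

# Second migration of the same plan: the exhausted job is found and reused.
second = ensure_hook_job(jobs, "hookfail", "postrestore")
```

In this model the second run gets back the exhausted job, so nothing new is scheduled and the hook never runs.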

Additional info:

Comment 1 Jason Montleon 2020-04-30 14:27:58 UTC
I added a label we can look for based on the migmigration UID. Since each run creates a new migmigration, the label stays unique per run and allows a new copy of the job to be created.
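
A minimal sketch of the idea, not the actual patch: jobs are represented here as plain label dictionaries, and the label keys (`plan`, `hook`, `migmigration`) and helper names are hypothetical. The only point it illustrates is that including the migration UID in the lookup makes a rerun miss the old job and create a new one.

```python
import uuid


def hook_job_labels(plan, hook, migration_uid):
    """Labels for a hook job; the migration UID makes them unique per run."""
    return {"plan": plan, "hook": hook, "migmigration": migration_uid}


def find_job(jobs, labels):
    """Return the first job whose labels contain every wanted key/value."""
    for job in jobs:
        if all(job.get(k) == v for k, v in labels.items()):
            return job
    return None


def ensure_hook_job(jobs, plan, hook, migration_uid):
    """Reuse the job of this migration run, or create one if it has none."""
    labels = hook_job_labels(plan, hook, migration_uid)
    job = find_job(jobs, labels)
    if job is None:
        job = dict(labels)
        jobs.append(job)
    return job


jobs = []
run1 = str(uuid.uuid4())  # stands in for the UID of the first migmigration
run2 = str(uuid.uuid4())  # a rerun creates a new migmigration with a new UID

a = ensure_hook_job(jobs, "hookfail", "postrestore", run1)
b = ensure_hook_job(jobs, "hookfail", "postrestore", run2)
c = ensure_hook_job(jobs, "hookfail", "postrestore", run2)
```

Here `a` and `b` are different jobs because the runs differ, while `c` is the same job as `b`, so reconciling the same run twice does not create duplicates.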

Comment 5 Sergio 2020-05-08 10:38:31 UTC
Verified using CAM 1.2 stage

A failed hook is retriggered using a new job for the hook

$ oc get jobs
NAME                         COMPLETIONS   DURATION   AGE
hookfail-postrestore-h4877   0/1           13m        13m
hookfail-postrestore-lt6fg   0/1           29m        29m

Comment 7 errata-xmlrpc 2020-05-28 11:10:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:2326

