1829401 – Failed hooks cannot be run again

Summary: Failed hooks cannot be run again

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Migration Tooling
Sub Component:
Version:	4.4
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	medium
Target Milestone:	---
Target Release:	4.4.0
Assignee:	Jason Montleon
QA Contact:	Xin jiang
Docs Contact:
URL:	https://github.com/konveyor/mig-contr...
Whiteboard:
Depends On:
Blocks:

Reported:	2020-04-29 13:59 UTC by Sergio
Modified:	2020-05-28 11:11 UTC
CC List:	8 users
Fixed In Version:
Doc Type:	No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed:	2020-05-28 11:10:47 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHEA-2020:2326	0	None	None	None	2020-05-28 11:11:15 UTC

Description Sergio 2020-04-29 13:59:54 UTC

Description of problem:
When a hook is added to a migration plan and that hook fails, cancelling and retriggering the migration does not create a new job for the hook. Since the previous failed job has already reached its max backoff limit, the playbook is never run.

Version-Release number of selected component (if applicable):
KONVEYOR 1.2

How reproducible:
Always.

Steps to Reproduce:
1. Create a migration plan with a hook that will always fail (any hook with syntax errors, for instance)
2. Migrate the plan
3. The migration will fail (a separate bug can leave the migration stuck forever instead of failing; if that happens, cancel the migration)
4. Run the migration again

Actual results:
The new run does not create a new job for the hook. Since the old job has already reached its max backoff limit, the hook is never executed.

Expected results:
Since we are running the migration again, a new job should be created for the hook and the hook should be run again.

Additional info:

Comment 1 Jason Montleon 2020-04-30 14:27:58 UTC

I added a label, based on the migmigration UID, that we can look for. Since each run creates a new migmigration, the label stays unique and allows a new copy of the hook job to be created.
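
For illustration only, here is a minimal Python model of the lookup described above. It is not the mig-controller code (which is not Python); the label keys, the `Job` class, and the `ensure_hook_job` helper are all hypothetical names made up for this sketch. It assumes the controller reuses any existing job that matches its label selector, and shows how adding a per-migration UID to the selector would force a fresh job on each run.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical label keys; the real keys used by mig-controller may differ.
HOOK_LABEL = "example/hook-name"
MIGRATION_UID_LABEL = "example/migmigration-uid"


@dataclass
class Job:
    """Simplified stand-in for a Kubernetes Job (assumption, not the real API)."""
    name: str
    labels: Dict[str, str] = field(default_factory=dict)
    failed: bool = False


def find_job(jobs: List[Job], selector: Dict[str, str]) -> Optional[Job]:
    """Return the first job whose labels contain every selector pair."""
    for job in jobs:
        if all(job.labels.get(k) == v for k, v in selector.items()):
            return job
    return None


def ensure_hook_job(jobs: List[Job], hook_name: str, migration_uid: str,
                    use_uid_label: bool) -> Job:
    """Reuse a matching job if one exists, otherwise create a new one."""
    selector = {HOOK_LABEL: hook_name}
    if use_uid_label:
        # With the UID in the selector, jobs from earlier runs never match.
        selector[MIGRATION_UID_LABEL] = migration_uid
    existing = find_job(jobs, selector)
    if existing is not None:
        return existing
    labels = {HOOK_LABEL: hook_name, MIGRATION_UID_LABEL: migration_uid}
    job = Job(name=f"{hook_name}-{len(jobs)}", labels=labels)
    jobs.append(job)
    return job
```

Under these assumptions, calling `ensure_hook_job` twice with different migration UIDs and `use_uid_label=False` returns the same (possibly failed) job both times, while `use_uid_label=True` yields a second, separate job for the second run.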

Comment 5 Sergio 2020-05-08 10:38:31 UTC

Verified using CAM 1.2 stage.

A failed hook is retriggered using a new job for the hook:

$ oc get jobs
NAME                         COMPLETIONS   DURATION   AGE
hookfail-postrestore-h4877   0/1           13m        13m
hookfail-postrestore-lt6fg   0/1           29m        29m

Comment 7 errata-xmlrpc 2020-05-28 11:10:47 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:2326
