Bug 1826298
| Summary: | even when I cancel ReX job, remediation still shows it as running | | |
|---|---|---|---|
| Product: | Red Hat Satellite | Reporter: | Jan Hutař <jhutar> |
| Component: | RH Cloud - Cloud Connector | Assignee: | Adam Ruzicka <aruzicka> |
| Status: | CLOSED ERRATA | QA Contact: | Lukáš Hellebrandt <lhellebr> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 6.7.0 | CC: | aruzicka, ehelms, inecas, jyejare, sdunning |
| Target Milestone: | 6.8.0 | Keywords: | Triaged |
| Target Release: | Unused | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | python3-receptor-satellite-1.0.2 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-10-27 13:01:20 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Jan Hutař
2020-04-21 11:46:31 UTC
QUERY:
------
Check the working and non-working scenarios.

Working scenario:
=================
Steps:
------
1. Schedule a remediation run from RH Cloud.
2. While the job is still running, cancel the job on the Satellite side; it is canceled (state "failed").
3. Check the Remediations status on RH Cloud.

Observation:
------------
RH Cloud shows the remediation status as Failed and shows a status similar to the Satellite job: `Job has been canceled by user`.

Non-working scenario (the actual bug steps in the description):
===============================================================
Steps:
------
1. Schedule a remediation run while the previous one is still running (2 consecutive runs).
2. Cancel all the jobs on the Satellite side and make sure they are canceled (state "failed").
3. Check the Remediations status on RH Cloud.

Observation:
------------
The remediation status shows running, and also succeeded in the end. Not sure if that is because the playbook execution finished before the job actually got canceled. What is the correct way of verifying this bug (working / non-working)?

Talked about this with Jitendra on Friday. This whole thing is hard to time correctly. On Jitendra's test machines, all the jobs showed up in Satellite as canceled by the user, but at the same time exited with exit code 0. Anything that exits with 0 is considered successful and so it was reported to cloud. In this specific case, I'd say the behavior of cloud + receptor is correct. If the process exits with 0 on its own, it doesn't matter whether we tried to cancel it or not; it still managed to end successfully on its own. The key here is to find more things to remediate (maybe something involving a reboot?) which will take longer and give us more room to actually cancel the job in Satellite. An alternative would be to tweak the template that is being used in Satellite to include a sleep or something at the end, but that would be rather hacky.

Verified:
---------
1. For a single cancellation of a job, see the `comment 6 - Working Scenario` section.
2. For consecutive cancellation of jobs, here are the updates, this time with a job that takes more time to finish:

Steps:
------
1. Schedule remediation runs while the previous one is still running (consecutive runs).
2. Cancel all the jobs on the Satellite side and make sure they are canceled (state "failed").
3. Check the Remediations status on RH Cloud.

Observation:
------------
1. RH Cloud shows the remediation status of the job as Failed, and all job task statuses show as Failed in the Satellite job tasks.
2. Both job task logs show `Job has been canceled by user`.

Note:
-----
I tested this scenario with 2 consecutive jobs.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: Satellite 6.8 release), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:4366
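To illustrate the reasoning above about why the canceled jobs were still reported as successful: the reported remediation status effectively follows the playbook's exit code rather than whether a cancel was requested. The following is a minimal sketch of that logic only; the names (JobResult, report_remediation_status) are hypothetical and are not taken from the actual python3-receptor-satellite code.

```python
# Minimal sketch of the exit-code-based status reporting described above.
# JobResult and report_remediation_status are hypothetical names, not the
# real python3-receptor-satellite API.
from dataclasses import dataclass


@dataclass
class JobResult:
    exit_code: int          # exit status of the playbook run
    cancel_requested: bool  # whether a user asked Satellite to cancel the job


def report_remediation_status(result: JobResult) -> str:
    # If the playbook exits with 0 on its own, the run is reported to
    # cloud.redhat.com as successful, regardless of any cancel request that
    # arrived while it was still running.
    if result.exit_code == 0:
        return "success"
    # A non-zero exit (for example, one caused by an effective cancellation)
    # is reported as a failure.
    return "failure"


# Example: a job that was canceled in Satellite but still finished with
# exit code 0 is reported as successful, matching the behavior in this bug.
print(report_remediation_status(JobResult(exit_code=0, cancel_requested=True)))
```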