Bug 1785904

Summary: RunHostJob tasks may get stuck forever if the main task (RunHostsJob) is terminated abnormally.
Product: Red Hat Satellite
Reporter: Hao Chang Yu <hyu>
Component: Remote Execution
Assignee: satellite6-bugs <satellite6-bugs>
Status: CLOSED WONTFIX
QA Contact: Peter Ondrejka <pondrejk>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 6.6.0
CC: aruzicka, inecas, mkalyat
Target Milestone: Unspecified
Keywords: Triaged
Target Release: Unused
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-08-10 21:37:00 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Hao Chang Yu 2019-12-22 06:44:00 UTC
Description of problem:
RunHostJob tasks may get stuck forever if the main task (RunHostsJob) is terminated abnormally.

Steps to Reproduce:
1. Create an Ansible job invocation that will run on hundreds of hosts.
2. Wait until the main task (RunHostsJob) is in the "running" state and some sub tasks (RunHostJob <hostname>) have been created.
3. Restart foreman-tasks.
4. Check the status of the main task. Confirm that it has failed with the following error:

2: Actions::RemoteExecution::RunHostsJob (skipped) [ 258.62s / 0.06s ] Sub plans
Started at: 2019-12-22 03:57:27 UTC
Ended at: 2019-12-22 04:01:45 UTC
Real time: 258.62s
Execution time (excluding suspended state): 0.06s

Input:

---
job_invocation:
  id: 129
  name: Ansible Commands
  description: Run echo date
job_category: Ansible Commands
job_invocation_id: 129
current_request_id: 
current_timezone: Australia/Brisbane
current_user_id: 4
current_organization_id: 1
current_location_id: 
Output:

--- {}
Error:

StandardError

Abnormal termination (previous state: running)


5. Notice that the sub tasks are stuck in the "running" state and never finish.


Actual results:
The sub tasks are stuck in the "running" state forever.


Expected results:
The sub tasks should either be cancelled or be allowed to proceed.
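
To make the two acceptable outcomes concrete, here is a minimal plain-Ruby sketch. It is not Foreman or Dynflow code: the SubTask struct, the resolve_orphans helper, the :cancel and :proceed policy names, and the "stopped" state are all hypothetical and exist only for illustration. It assumes an orphaned sub task is one that is "running" but was never triggered.

```ruby
# Hypothetical model of a sub task; not a Foreman class.
SubTask = Struct.new(:host, :state, :triggered)

# Apply one of the two expected outcomes to orphaned sub tasks.
#   :cancel  -> orphaned sub tasks are marked "stopped"
#   :proceed -> orphaned sub tasks are marked as triggered so they can run
def resolve_orphans(sub_tasks, policy)
  sub_tasks.map do |task|
    # Sub tasks that were already triggered are left untouched.
    next task unless task.state == "running" && !task.triggered

    case policy
    when :cancel  then SubTask.new(task.host, "stopped", false)
    when :proceed then SubTask.new(task.host, "running", true)
    else raise ArgumentError, "unknown policy: #{policy.inspect}"
    end
  end
end

tasks = [
  SubTask.new("host1.example.com", "running", true),  # triggered before the restart
  SubTask.new("host2.example.com", "running", false)  # orphaned by the restart
]

cancelled = resolve_orphans(tasks, :cancel)
proceeded = resolve_orphans(tasks, :proceed)
```

Under either policy no sub task is left both "running" and untriggered, which is the state described under "Actual results".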

Comment 3 Hao Chang Yu 2019-12-22 06:50:33 UTC
foreman-rake console

> ForemanTasks::RemoteTask.all.size
=> 149

> ForemanTasks::RemoteTask.triggered.size
=> 0

I think sub tasks are stuck because the main task got terminated before triggering them.
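
The suspected sequence can be sketched in plain Ruby. This is a simplified model, not the actual foreman-tasks implementation: FakeRemoteTask, run_parent, the "new" state name, and the batch size of 50 are assumptions made for illustration. It assumes the parent first creates one record per host and only afterwards triggers them in batches.

```ruby
# Hypothetical stand-in for a remote task record; not ForemanTasks::RemoteTask.
FakeRemoteTask = Struct.new(:host, :state)

# Create one record per host, then trigger the records batch by batch.
# die_after is the number of batches completed before the parent is
# terminated abnormally (0 means it dies before triggering anything).
def run_parent(hosts, batch_size:, die_after:)
  remote_tasks = hosts.map { |host| FakeRemoteTask.new(host, "new") }
  remote_tasks.each_slice(batch_size).with_index do |batch, index|
    break if index >= die_after # simulated abnormal termination
    batch.each { |remote_task| remote_task.state = "triggered" }
  end
  remote_tasks
end

hosts = (1..149).map { |n| "host#{n}.example.com" }
remote_tasks = run_parent(hosts, batch_size: 50, die_after: 0)

puts remote_tasks.size                                    # prints 149
puts remote_tasks.count { |rt| rt.state == "triggered" }  # prints 0
```

With die_after: 0 the model gives the same counts as the console session above: every record exists, but none was triggered, so nothing will ever move the sub tasks forward.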

Comment 6 Adam Ruzicka 2020-01-03 12:11:07 UTC
Created redmine issue https://projects.theforeman.org/issues/28631 from this bug

Comment 8 Mike McCune 2021-07-13 21:54:48 UTC
Upon review of our valid but aging backlog, the Satellite Team has concluded that this Bugzilla does not meet the criteria for a resolution in the near term, and is planning to close it in a month. This message may be a repeat of a previous update and the bug is again being considered to be closed. If you have any concerns about this, please contact your Red Hat Account team. Thank you.

Comment 9 Mike McCune 2021-08-10 21:34:00 UTC
Thank you for your interest in Red Hat Satellite. We have evaluated this request, and while we recognize that it is a valid request, we do not expect this to be implemented in the product in the foreseeable future. This is due to other priorities for the product, and not a reflection on the request itself. We are therefore closing this out as WONTFIX. If you have any concerns about this feel free to contact your Red Hat Account Team. Thank you.