Description of problem:
There is no way to cancel a task that is still scheduling its child tasks.

Version-Release number of selected component (if applicable):
satellite-6.2.1-1.3.el7sat.noarch

How reproducible:
always

Steps to Reproduce:
1. Schedule an errata upgrade task via katello-agent on a large number of systems (say 10k)
2. It takes multiple hours for the parent task to start all 10k child tasks

Actual results:
During this time you are not able to cancel the parent task (e.g. when you have noticed that something in your infrastructure is broken and needs to be fixed first).

Expected results:
It should be possible to cancel a task even if it is still starting its child tasks.
Created redmine issue http://projects.theforeman.org/issues/17528 from this bug
*** Bug 1417180 has been marked as a duplicate of this bug. ***
This bug can be quite a nuisance for customers in the following scenario:
- the user wants to run a job on, say, 5k clients
- the user fires it with a typo / error and wants to cancel it
- since sub-tasks are created at a rate of less than 1 task per second (1 task/sec seems to be the upper limit), the user has to wait almost 2 hours before the task can be cancelled; the time when the cancel will succeed can be estimated only with some tolerance (i.e. the job can be executed on a system _before_ a cancellation attempt succeeds)
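The wait estimate above is simple arithmetic, sketched here in Python. The function name and the 0.7 tasks/sec figure are assumptions chosen for illustration; the comment only states that the rate stays below 1 task per second.

```python
def scheduling_window_seconds(host_count: int, tasks_per_second: float) -> float:
    """Rough time needed to create all sub-tasks at a steady creation rate."""
    if tasks_per_second <= 0:
        raise ValueError("tasks_per_second must be positive")
    return host_count / tasks_per_second


if __name__ == "__main__":
    hosts = 5000
    # 1.0 is the upper limit mentioned above; 0.7 is an assumed slower rate.
    for rate in (1.0, 0.7):
        hours = scheduling_window_seconds(hosts, rate) / 3600
        print(f"{hosts} hosts at {rate} tasks/sec: about {hours:.1f} h")
```

At the 1 task/sec upper limit, 5k hosts take roughly 83 minutes to schedule; any slower rate pushes the window towards the 2 hours mentioned above.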
Worth testing as well:
- launch a job invocation on >100 hosts with the time span set to some higher value
- then ensure that individual dynflow tasks are picked up even while the foreman sub-tasks are still being generated (i.e. dynflow picks up the 1st foreman (sub)task while the 100th foreman (sub)task has not been generated yet)
- to check that:
  - open the foreman task for the job execution, click the sub-tasks link, and sort the sub-tasks by start time
  - open the oldest sub-task, click through to the dynflow console, then to the Execution history tab
  - the "start execution" timestamp is the time when dynflow picked this job up from foreman

Current behaviour:
- when the time span is set, dynflow picks up all tasks _after_ the last one is generated (and they are generated at a rate of less than 1 task per second)

Expected behaviour:
- even with the time span set, dynflow picks up tasks while foreman is still generating them
Upstream bug assigned to aruzicka
Moving this bug to POST for triage into Satellite 6 since the upstream issue http://projects.theforeman.org/issues/17528 has been resolved.
*** Bug 1446725 has been marked as a duplicate of this bug. ***
Testing in Sat 6.3 snap 22, when starting a remote job on 4000 hosts and subsequently canceling, the sub-tasks stop executing after n*100 blocks as expected, but the parent task remains in running status forever and hosts with sub-tasks not yet executed remain in state ?N/A. So putting back to ASSIGNED for further investigation.
There is a related issue in remote execution tracked here as well https://bugzilla.redhat.com/show_bug.cgi?id=1516651 - resolving that BZ should also move this BZ to ON_QA state again.
Based on comment 18, I am moving this to ON_QA since https://bugzilla.redhat.com/show_bug.cgi?id=1516651 has been verified.
Checked again in Sat 6.3 snap 30: when the job is canceled while in progress, it finishes scheduling the current batch of hosts (100) and the rest of the tasks are not started.
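The verified behaviour can be illustrated with a minimal Python sketch. This is not the Dynflow or foreman-tasks implementation; `schedule_in_batches`, `start_subtask` and the `threading.Event` used as a cancel flag are hypothetical names introduced only to show cancellation being checked between batches of 100.

```python
import threading
from typing import Callable, List, Sequence


def schedule_in_batches(
    hosts: Sequence[str],
    start_subtask: Callable[[str], None],
    cancel_event: threading.Event,
    batch_size: int = 100,
) -> List[str]:
    """Start sub-tasks batch by batch, checking for cancellation between batches."""
    started: List[str] = []
    for offset in range(0, len(hosts), batch_size):
        # The cancel flag is only consulted here, so a batch that has
        # already begun is always finished before scheduling stops.
        if cancel_event.is_set():
            break
        for host in hosts[offset:offset + batch_size]:
            start_subtask(host)
            started.append(host)
    return started


if __name__ == "__main__":
    cancel = threading.Event()
    seen: List[str] = []

    def fake_start(host: str) -> None:
        # Stand-in for starting a real sub-task; cancels after the 250th host.
        seen.append(host)
        if len(seen) == 250:
            cancel.set()

    all_hosts = [f"host{i}" for i in range(4000)]
    done = schedule_in_batches(all_hosts, fake_start, cancel)
    print(len(done))  # 300: the batch in progress completes, the rest never start
```

Checking the flag once per batch rather than once per host keeps the check cheap, at the cost of up to one extra batch of sub-tasks starting after the cancel request.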
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:0336