Description of problem:
(note: this bug is related to the way a couple of tasks are managed, not the tasks plugin itself. I wasn't sure of the best category)
If the Satellite server has issues that result in an unclean shutdown, it's possible for the long-running tasks to not get cleaned up at termination. This prevents the long-running tasks from restarting when the server restarts. Users have to shut down foreman-tasks again, clean the long-running tasks, and then restart.
The long-running tasks are "Listen On Candlepin Events" and "Monitor Event Queue". I haven't seen issues with "Insights Email Notifications" but it may fall into the same boat.
Version-Release number of selected component (if applicable): 6.2.10
Steps to reproduce:
1) Take a look at the "Listen On Candlepin Events" (LOCE) task; it should be in running-pending.
2) systemctl kill -s 9 foreman-tasks
3) Take a look at LOCE; it is still in running-pending even though the executor is now dead (this simulates an unclean shutdown). Note the UUID of the LOCE task.
4) systemctl restart foreman-tasks
5) Wait for a while (to let foreman-tasks fully initialize)
6) Refresh tasks list
Actual results:
LOCE task sticks around and is still kept in running-pending.
Expected results:
LOCE task is switched to stopped-$whatever, another LOCE task is spawned and is in running-pending.
Created redmine issue http://projects.theforeman.org/issues/21207 from this bug
Upstream bug assigned to aruzicka
Please disregard the previous "steps to reproduce" in comment #5
I've failed to reproduce this issue, but I've seen it happen several times. Basically, there are two variants of this bug.
After an unclean shutdown:
1) there is no LOCE running.
2) there are a couple of LOCEs running.
The new way of handling long-running tasks in Dynflow should take care of both of those and make sure there is always exactly one instance running.
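To make the "exactly one instance" invariant concrete, here is a minimal sketch in Python. It is not Dynflow's actual implementation: the `Task` class, its `executor_alive` flag, the state names and the `ensure_singleton` function are all hypothetical, chosen only to illustrate how both variants above (no LOCE running, several LOCEs running) could be reconciled at startup.

```python
from dataclasses import dataclass, field
from typing import List
import uuid


@dataclass
class Task:
    # Hypothetical model of a long-running task record (not the real schema).
    label: str
    state: str            # "running", "paused" or "stopped"
    executor_alive: bool  # False when the owning executor was killed
    id: str = field(default_factory=lambda: str(uuid.uuid4()))


def ensure_singleton(tasks: List[Task], label: str) -> Task:
    """Stop stale or surplus tasks with `label`; return the single live one."""
    live = []
    for task in tasks:
        if task.label != label or task.state == "stopped":
            continue
        if task.executor_alive and task.state == "running":
            live.append(task)
        else:
            # Orphaned by an unclean shutdown (or stuck in paused): retire it.
            task.state = "stopped"
    # Variant 2: more than one live instance, keep only the first.
    for extra in live[1:]:
        extra.state = "stopped"
    if live:
        return live[0]
    # Variant 1: nothing is running, so spawn a fresh instance.
    fresh = Task(label=label, state="running", executor_alive=True)
    tasks.append(fresh)
    return fresh
```

Under these assumptions, running `ensure_singleton(tasks, "Listen On Candlepin Events")` after every executor start would leave exactly one running task for that label, whether the previous shutdown left none or several behind.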
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory, and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2018:0336
Reopening, since the katello patch isn't present in 6.3.1. That is why no LOCE task was running after a foreman-tasks restart, both at a customer and on an internal system.
- a missing LOCE task can have serious consequences (e.g. candlepin in maintenance mode, so no new systems register)
- backport seems easy
I am asking for z-stream.
(sadly, there is no reproducer available ATM)
I'm running into this on 6.2.15. How do I get LOCE into running/pending state again, please? I did a "shutdown -h now" and assumed this would run a "katello-service stop" or similar. In any event I appear to have had an unclean shutdown and I'm stuck with a LOCE in paused/pending state after the reboot.
What steps are required?
*** Bug 1605025 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.