Bug 1652056 - Enhance resiliency mechanism to avoid memory recycler leading to tasks paused with 'Abnormal termination (previous state: running)' error
Summary: Enhance resiliency mechanism to avoid memory recycler leading to tasks paused...
Keywords:
Status: NEW
Alias: None
Product: Red Hat Satellite 6
Classification: Red Hat
Component: Tasks Plugin
Version: 6.3.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: Unspecified
Assignee: satellite6-bugs
QA Contact: Peter Ondrejka
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-11-21 13:42 UTC by Ivan Necas
Modified: 2019-11-15 14:08 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:




Links
System | ID | Priority | Status | Summary | Last Updated
Foreman Issue Tracker | 25528 | Normal | New | Enhance resiliency mechanism to avoid memory recycler leading to tasks paused with 'Abnormal termination (previous sta... | 2019-11-11 07:55:02 UTC
Red Hat Bugzilla | 1652060 | unspecified | CLOSED | Singleton actions may not start after unclean shutdown | 2019-11-14 09:19:45 UTC

Internal Links: 1652060

Description Ivan Necas 2018-11-21 13:42:35 UTC
Description of problem:
With the memory recycler enabled, tasks are more likely to be interrupted
during execution. To keep the recycling process transparent, we should
handle this situation better so that the user doesn't have to deal with
the error explicitly.

Version-Release number of selected component (if applicable):
6.3.0

How reproducible:
Occasionally

Steps to Reproduce:
1. Set a memory limit in /etc/sysconfig/foreman-tasks (EXECUTOR_MEMORY_LIMIT=2gb; to reproduce more easily, decrease
EXECUTOR_MEMORY_MONITOR_DELAY so that restarts happen more often)
2. Restart foreman-tasks
3. Start using Satellite in a larger environment (continuous registration of hosts + content view publishes in combination with multiple capsules)
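
For illustration, the settings from step 1 might look roughly like the fragment below in /etc/sysconfig/foreman-tasks. This is only a sketch: the 2gb limit comes from this report, while the delay value of 600 is an arbitrary assumption chosen for the example (the unit is assumed to be seconds), so pick whatever suits the environment.

```shell
# /etc/sysconfig/foreman-tasks (fragment)

# Memory limit for the executor process; value taken from step 1 of this report.
EXECUTOR_MEMORY_LIMIT=2gb

# Delay before memory monitoring starts. 600 is an assumed, illustrative
# value (unit assumed to be seconds); a lower value should make the
# executor restarts, and therefore the problem, show up sooner.
EXECUTOR_MEMORY_MONITOR_DELAY=600
```

The foreman-tasks service has to be restarted afterwards (step 2) for the new values to apply.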

Actual results:

After some time, some tasks can end up in the paused/error state with `Abnormal termination (previous state: running)`.


Expected results:
We should analyse these cases and find a way to resume the affected tasks
before requiring the user to intervene manually.

Additional info:

We will try to find a more reliable reproducer as we develop the fix for this issue.

Comment 1 Adam Ruzicka 2018-11-21 13:57:53 UTC
Created redmine issue http://projects.theforeman.org/issues/25528 from this bug

