Bug 1628638 - The termination procedure after memory threshold exceeded can get stuck, waiting infinitely for some events to occur
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Satellite 6
Classification: Red Hat
Component: Tasks Plugin
Version: Unspecified
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: Released
Assignee: satellite6-bugs
QA Contact: Jan Hutař
URL:
Whiteboard:
Depends On: 1654217 1665461
Blocks:
Reported: 2018-09-13 15:41 UTC by Ivan Necas
Modified: 2019-10-14 11:01 UTC

Fixed In Version: tfm-rubygem-dynflow-1.1.1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1661291 1684687
Environment:
Last Closed: 2019-05-14 12:38:03 UTC


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2019:1222 None None None 2019-05-14 12:38:10 UTC
Foreman Issue Tracker 25021 None None None 2018-09-24 12:23:50 UTC
Red Hat Bugzilla 1654975 None CLOSED Dynflow executor termination may hang if there is an action which keeps the executor occupied 2019-10-14 11:01:12 UTC
Red Hat Knowledge Base (Solution) 3891181 None None None 2019-02-06 14:25:35 UTC

Internal Links: 1654975

Description Ivan Necas 2018-09-13 15:41:00 UTC
Description of problem:


Version-Release number of selected component (if applicable):


How reproducible:
Occasionally

Steps to Reproduce:
1. Set a memory threshold on the dynflow executor (EXECUTOR_MEMORY_LIMIT=2gb in /etc/sysconfig/foreman-tasks in Satellite 6.3, /etc/sysconfig/dynflowd in 6.4).
2. Run some substantial load on the tasking system; continuous resyncing and publishing of CVs might be a good start.
3. Wait for the memory usage of the dynflow process to cross the limit. Watch production.log for the 'Memory level exceeded' message.
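
A minimal sketch of how step 3 could be checked from a script rather than by eye. It only filters log lines for the message text quoted above; the function name and the sample line are hypothetical, and the real log line format is not shown in this report.

```python
from typing import Iterable, List

# Message text quoted in step 3 of this report.
MARKER = "Memory level exceeded"


def find_memory_limit_events(lines: Iterable[str], marker: str = MARKER) -> List[str]:
    """Return the log lines that contain the marker, without trailing newlines.

    `lines` can be any iterable of strings, for example an open file object
    for production.log (the full path is not given in this report).
    """
    return [line.rstrip("\n") for line in lines if marker in line]
```

Passing an open file object for production.log as `lines` returns every line that mentions the message, which gives a starting timestamp for checking whether the executor actually exited afterwards.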

Actual results:
The process keeps running, the termination never finishes, and tasks no longer proceed.

Expected results:
The process exits in a timely manner and a new one gets started.

Additional info:
I've written the reproducer based on customer observations; I haven't reproduced it locally in production just yet. However, we know about places where we wait in the termination phase without any timeout.
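
To illustrate the kind of problem described, here is a generic sketch of a bounded wait during termination. It is not the Dynflow code or the upstream fix (Dynflow is written in Ruby); the names `terminate`, `done`, and `force_exit` are hypothetical and only show the difference between waiting forever and waiting with a timeout.

```python
import threading
from typing import Callable


def terminate(done: threading.Event, timeout_s: float, force_exit: Callable[[], None]) -> str:
    """Wait for a clean shutdown signal, but never longer than timeout_s seconds.

    An unbounded `done.wait()` would block forever if the event is never set,
    which is the kind of hang this report describes. With a timeout, the caller
    can fall back to a forced exit instead.
    """
    if done.wait(timeout=timeout_s):
        return "clean"
    # The expected event did not arrive in time: stop waiting and force the exit.
    force_exit()
    return "forced"
```

The design choice here is simply that every wait in a shutdown path has an upper bound and a fallback, so a missing event delays the exit instead of blocking it indefinitely.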

Comment 2 sthirugn@redhat.com 2018-09-14 15:31:56 UTC
Increasing the severity.

Comment 4 Ivan Necas 2018-09-24 12:23:47 UTC
Created redmine issue https://projects.theforeman.org/issues/25021 from this bug

Comment 5 Ivan Necas 2018-10-11 16:22:43 UTC
Fixed upstream in https://github.com/Dynflow/dynflow/pull/297

Comment 28 Mike McCune 2019-03-01 22:16:01 UTC
This bug was cloned and is still going to be included in the 6.4.3 release. It no longer has the sat-6.4.z+ flag and the 6.4.3 Target Milestone set; these are now on the 6.4.z cloned bug. Please see the Clones field to track the progress of this bug in the 6.4.3 release.

Comment 31 errata-xmlrpc 2019-05-14 12:38:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:1222

