Bug 1877917 - dynflow_executor.output grows extremely large in a short period of time.
Summary: dynflow_executor.output grows extremely large in a short period of time.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Dynflow
Version: 6.7.0
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: 6.9.3
Assignee: Adam Ruzicka
QA Contact: Lukáš Hellebrandt
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-09-10 18:37 UTC by Dylan Gross
Modified: 2024-12-20 19:15 UTC
CC List: 19 users

Fixed In Version: tfm-rubygem-dynflow-1.4.8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1962840
Environment:
Last Closed: 2021-07-01 14:56:48 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
GitHub | Dynflow/dynflow issue 388 | 0 | None | open | Dynflow collects stack trace across action suspends | 2021-05-04 10:23:26 UTC
Red Hat Knowledge Base (Solution) | 5385251 | 0 | None | None | None | 2020-09-10 18:48:20 UTC
Red Hat Product Errata | RHBA-2021:2636 | 0 | None | None | None | 2021-07-01 14:57:24 UTC

Description Dylan Gross 2020-09-10 18:37:03 UTC
Description of problem:

   The /var/log/foreman/dynflow_executor.output grows extremely large in a short period of time (600+ GB in under 24 hours).

Version-Release number of selected component (if applicable):

   Red Hat Satellite 6.7.2

How reproducible:   Unknown - Not reproducible at will


Actual results:

   The dynflow_executor.output grew over half a TB in a day.

Expected results:

   The dynflow_executor.output logs relevant info and remains at a reasonable size.

Additional info:

  There was obviously some underlying condition with the dynflow_executor that caused this massive amount of atypical logging.   

  Impact to the system came from two different scenarios. While this particular system had logging on an alternate location that could accommodate a few hundred GB, this would likely fill /var on most systems. The second scenario is executing foreman-debug, which runs xzcat against the (admittedly impressively compressed) file and expands it in the foreman-debug output location (default=/var/tmp). This second scenario can be worked around by specifying an alternate location, if you're aware of the situation ahead of time; see the sketch below.
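  A stopgap until the fix is applied is to rotate the runaway file. The following is a minimal sketch, not something Satellite ships: the drop-in path and the 1 GB threshold are illustrative assumptions, and copytruncate is needed because the executor keeps the file handle open, so a plain rename would not free the space.

# /etc/logrotate.d/dynflow_executor_output -- hypothetical drop-in
/var/log/foreman/dynflow_executor.output {
    # rotate once the file exceeds 1 GB (illustrative threshold)
    size 1G
    # keep at most three old compressed copies
    rotate 3
    compress
    missingok
    # truncate in place; the executor holds the fd open, so a rename would not free space
    copytruncate
}

  For the foreman-debug scenario, the alternate location can be passed with -d (the path here is an example):

# foreman-debug -d /mnt/scratch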

Comment 13 asml_gperumal 2021-04-08 08:07:35 UTC
Hi,
Is this bug applicable to Satellite 6.7.5?

Comment 17 Lukáš Hellebrandt 2021-06-10 10:10:02 UTC
Verified with Sat 6.9.3 snap 1.0.

Used reproducer (don't forget to change TASKS_ROOT to the current foreman-tasks version): https://gist.github.com/adamruzicka/3a4681f488e5978c7bd49e544f1ce124 (thanks, Adam)

In 6.8, upon invoking the reproducer, there were 5816 lines added to the production.log (basically an absurdly long traceback). In 6.9.3, it's only 59 lines (a shorter and perhaps more useful traceback). No regression was found (manually, automation results not available) => VERIFIED.
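
One way to get such line counts is to snapshot production.log around the reproducer run; a trivial sketch, assuming nothing else writes to the log in between:

# note the count, invoke the reproducer, then re-run and subtract
wc -l < /var/log/foreman/production.log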

Comment 18 Mike McCune 2021-06-10 22:25:14 UTC
Looking at 6.8.6 and 6.9.2 they each have the same Dynflow version:


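On Satellite 6.8.6: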
# rpm -q satellite tfm-rubygem-dynflow
satellite-6.8.6-1.el7sat.noarch
tfm-rubygem-dynflow-1.4.7-1.fm2_1.el7sat.noarch


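On Satellite 6.9.2: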
# rpm -q satellite tfm-rubygem-dynflow
satellite-6.9.2-1.el7sat.noarch
tfm-rubygem-dynflow-1.4.7-1.fm2_1.el7sat.noarch

So whatever change resolved this issue, it isn't in Dynflow proper.

Adam, any ideas on what else may have resolved this?

Comment 19 Adam Ruzicka 2021-06-11 06:34:31 UTC
@Mike Did you try reproducing it, or is this just a response to what Lukáš wrote in comment #17?

Sidekiq-based deployments (6.8 and up) seem to be immune to this due to the split of workers into separate processes. To verify this on 6.9.3 we had to jump through a few hoops to trigger the bug at all.
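
For reference, the process split can be seen by listing the per-worker systemd units; a sketch assuming the dynflow-sidekiq@ template units that Sidekiq-based installs (6.8 and up) use:

# each orchestrator/worker runs as its own process
systemctl list-units 'dynflow-sidekiq@*'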

Comment 24 errata-xmlrpc 2021-07-01 14:56:48 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Satellite 6.9.3 Async Bug Fix Update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:2636

Comment 26 Red Hat Bugzilla 2023-09-15 00:47:54 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days.

