Bug 1817728

Summary: Default task polling is too frequent at scale
Product: Red Hat Satellite
Reporter: Mike McCune <mmccune>
Component: Tasks Plugin
Assignee: Adam Ruzicka <aruzicka>
Status: CLOSED ERRATA
QA Contact: Peter Ondrejka <pondrejk>
Severity: medium
Priority: unspecified
Version: 6.6.0
CC: aruzicka, egolov, hyu, jdickers, ktordeur, momran, pmoravec, wclark
Target Milestone: 6.8.0
Keywords: Triaged
Target Release: Unused
Hardware: Unspecified
OS: Unspecified
Fixed In Version: tfm-rubygem-foreman-tasks-1.1.1
Doc Type: If docs needed, set a value
Clone Of:
: 1832587
Last Closed: 2020-10-27 13:01:14 UTC
Type: Bug
Attachments:
Description                                                 Flags
tfm-rubygem-foreman-tasks hotfix RPM for Satellite 6.7.0    none
UPDATED tfm-rubygem-foreman-tasks hotfix RPM                none
UPDATED hotfix RPM                                          none
FURTHER UPDATED Hotfix RPM                                  none

Description Mike McCune 2020-03-26 19:58:30 UTC
The default polling back-off algorithm can still generate an excessive number of status polls during long-running tasks:


  def poll_intervals
    [0.5, 1, 2, 4, 8, 16] 
  end 

When there are large numbers of tasks, this can produce 'storms' of status checks while long-running tasks are in progress.

An updated, expanded array has shown improvement in practice by giving longer tasks progressively longer intervals between status checks:

    [0.5, 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024] 
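
To give a rough sense of the difference, here is a minimal sketch that counts how many status checks a single task would make under each array. It assumes that the n-th poll waits for the n-th interval and that the last interval repeats once the array is exhausted; that behaviour is an assumption for illustration and is not confirmed by this report. The names `polls_within`, `OLD_INTERVALS` and `NEW_INTERVALS` are hypothetical.

```ruby
# Hypothetical constants holding the two interval arrays quoted in this bug.
OLD_INTERVALS = [0.5, 1, 2, 4, 8, 16].freeze
NEW_INTERVALS = [0.5, 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024].freeze

# Count the polls that fit into `duration` seconds.
# ASSUMPTION: poll n waits intervals[n] seconds, and the last interval
# is reused once the array runs out.
def polls_within(intervals, duration)
  elapsed = 0.0
  polls = 0
  loop do
    interval = intervals[polls] || intervals.last
    break if elapsed + interval > duration

    elapsed += interval
    polls += 1
  end
  polls
end

one_hour = 3600
puts "old array: #{polls_within(OLD_INTERVALS, one_hour)} polls"  # 229
puts "new array: #{polls_within(NEW_INTERVALS, one_hour)} polls"  # 13
```

Under these assumptions a one-hour task drops from 229 status checks to 13, and the saving is multiplied by the number of tasks running concurrently.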

We are going to update the default array to the above value and offer a configuration setting so it can be tuned to specific environments.
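
As an illustration of what a tunable value could look like, the sketch below parses a comma-separated string of intervals and falls back to the expanded default when the input is missing or invalid. The helper `parse_poll_intervals`, the constant name and the comma-separated format are all hypothetical; this is not the setting that was eventually shipped.

```ruby
# Expanded default array proposed in this bug.
DEFAULT_POLL_INTERVALS = [0.5, 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024].freeze

# Hypothetical helper: turn a string such as "1, 5, 30" into an array of
# positive floats. Returns the default when the input is nil, blank,
# non-numeric, or contains a non-positive value.
def parse_poll_intervals(raw, default: DEFAULT_POLL_INTERVALS)
  return default if raw.nil? || raw.strip.empty?

  values = raw.split(',').map { |v| Float(v.strip) }
  return default if values.empty? || values.any? { |v| v <= 0 }

  values
rescue ArgumentError
  # Float() raises ArgumentError for non-numeric or empty entries.
  default
end

p parse_poll_intervals('1, 5, 30')
p parse_poll_intervals('not a number')
```

Falling back to the default rather than raising keeps a mistyped value from stopping task polling altogether, at the cost of silently ignoring the bad input.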

Comment 3 Adam Ruzicka 2020-03-27 11:22:19 UTC
Created redmine issue https://projects.theforeman.org/issues/29423 from this bug

Comment 4 Bryan Kearney 2020-03-27 16:02:28 UTC
Upstream bug assigned to aruzicka

Comment 6 wclark 2020-04-17 18:40:18 UTC
Created attachment 1679728
tfm-rubygem-foreman-tasks hotfix RPM for Satellite 6.7.0

Comment 7 wclark 2020-04-17 18:44:49 UTC
A hotfix is available for Satellite 6.7.0.

INSTALLATION INSTRUCTIONS:

1. Make a complete backup or snapshot of the Satellite server

2. Download the attached file tfm-rubygem-foreman-tasks-0.17.5.2-2.HOTFIXRHBZ1817728.el7sat.noarch.rpm and copy it to the Satellite server

3. # yum install ./tfm-rubygem-foreman-tasks-0.17.5.2-2.HOTFIXRHBZ1817728.el7sat.noarch.rpm --disableplugin=foreman-protector

4. # systemctl restart dynflowd httpd

Comment 10 wclark 2020-04-28 21:09:43 UTC
Created attachment 1682631
UPDATED tfm-rubygem-foreman-tasks hotfix RPM

This also includes the hotfix for https://bugzilla.redhat.com/show_bug.cgi?id=1824931

The filename has changed, but the installation instructions are otherwise the same as before.

Comment 12 Bryan Kearney 2020-05-06 13:45:58 UTC
Moving this bug to POST for triage into Satellite 6 since the upstream issue https://projects.theforeman.org/issues/29423 has been resolved.

Comment 13 wclark 2020-06-01 16:27:54 UTC
Created attachment 1694147
UPDATED hotfix RPM

INSTALLATION INSTRUCTIONS:

1. Make a complete backup or snapshot of the Satellite server

2. Download the attached file tfm-rubygem-foreman-tasks-0.17.5.2-4.HOTFIXRHBZ1817728.el7sat.noarch.rpm and copy it to the Satellite server

3. # yum install ./tfm-rubygem-foreman-tasks-0.17.5.2-4.HOTFIXRHBZ1817728.el7sat.noarch.rpm --disableplugin=foreman-protector

4. # systemctl restart dynflowd httpd

Comment 14 wclark 2020-06-03 16:57:09 UTC
Created attachment 1694862
FURTHER UPDATED Hotfix RPM

IMPORTANT UPDATE:

The previously provided Hotfix RPM included a work-in-progress fix for BZ1817728 "Default task polling is too frequent at scale", which differs significantly from the final merged version of the fix. Ordinarily, Satellite Engineering does not ship code that has not yet been merged upstream, but after testing, an exception was made in this case to deliver an immediate fix to our customers. The fix has since undergone further revisions, so this Hotfix has been rebuilt to match the final merged version.

Please find attached the latest update to the Hotfix, along with installation instructions both for installing the Hotfix for the first time and for updating an earlier Hotfix version.

INSTALLATION INSTRUCTIONS [FRESH INSTALL OF HOTFIX]:

1. Make a complete backup or snapshot of the Satellite server

2. Download the attached file tfm-rubygem-foreman-tasks-0.17.5.2-5.HOTFIXRHBZ1817728.el7sat.noarch.rpm and copy it to the Satellite server

3. # yum install ./tfm-rubygem-foreman-tasks-0.17.5.2-5.HOTFIXRHBZ1817728.el7sat.noarch.rpm --disableplugin=foreman-protector

4. # systemctl restart dynflowd httpd

INSTALLATION INSTRUCTIONS [UPDATING ANY EARLIER HOTFIX VERSION]:

Follow steps 1-4 above, then additionally run:

5. # echo "Setting.where(name: 'foreman_tasks_polling_intervals').delete_all" | foreman-rake console

Comment 15 Peter Ondrejka 2020-07-14 11:12:10 UTC
Verified on Satellite 6.8 snap 8: the polling interval setting has been added and is applied.

Comment 18 errata-xmlrpc 2020-10-27 13:01:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Satellite 6.8 release), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:4366