Bug 1600920 - Insights Plan Execution is slow at scale
Summary: Insights Plan Execution is slow at scale
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Ansible - Configuration Management
Version: 6.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: 6.4.0
Assignee: Marek Hulan
QA Contact: sbadhwar
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-07-13 11:40 UTC by sbadhwar
Modified: 2019-11-05 23:28 UTC (History)

Fixed In Version: foreman_ansible-2.2.7-1,foreman_remote_execution-1.5.6-1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-10-16 19:14:39 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Foreman Issue Tracker | 24262 | 0 | Normal | Closed | Insights Plan Execution is slow at scale | 2020-05-05 12:12:42 UTC
Foreman Issue Tracker | 24866 | 0 | Normal | Closed | Caching for job invocation macros | 2020-05-05 12:12:42 UTC

Internal Links: 1601155

Description sbadhwar 2018-07-13 11:40:17 UTC
Description of problem:
When trying to run an Insights plan with 1 rule for "Kernel vulnerable to local privilege escalation via DCCP module (CVE-2017-6074)" on nearly 6K hosts, the execution of the plan seems to be very slow.

While this plan is executing, no other kinds of jobs (repo sync, client registration, etc.) are running in parallel.

So far, the job seems to have executed on only about 300 hosts in an hour. Looking at the database, task planning itself appears to be very slow: only 301 tasks have been planned.

SELECT COUNT(*) FROM foreman_tasks_tasks WHERE parent_task_id='0f814ce4-2a8e-4be6-83a5-4359bcbb98eb';
-[ RECORD 1 ]
count | 301

I am not sure which component this issue belongs under, so I am marking it for the tasks plugin. Please correct it if necessary.

Version-Release number of selected component (if applicable):
Satellite 6.4 Snap 10

How reproducible:
Always


Steps to Reproduce:
1. Create an Insights plan that applies to about 6K hosts.
2. Execute the plan by clicking on Run Playbook.

Actual results:
Planning of the tasks for the execution plan appears to be slow, and the tasks themselves seem to run nearly serialized: given two tasks A and B, B starts only after A has finished.


Additional info:
Foreman debug: http://debugs.theforeman.org/foreman-debug-vX0MA.tar.xz

Comment 3 Adam Ruzicka 2018-07-16 12:07:52 UTC
The code causing the slowness lives in foreman-ansible, so I am changing the component, even though the fix for this may land elsewhere.

Notes:
The slowness is caused by the way we render the template to run for Insights. When retrieving the playbook from the Insights service, we get a single uber-playbook which should remediate all the issues for all the hosts in the plan. When rendering a template for a single host, we retrieve the uber-playbook, pick out the parts relevant to that host, and run those.

The issue occurs when there are many hosts. We render the template for each host separately and therefore retrieve the uber-playbook N times for N hosts. As the number of hosts in the Insights plan grows, the uber-playbook and the time needed to retrieve it grow as well. In the testing setup, retrieval took roughly 15 seconds for 6k hosts.
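
For a rough sense of scale, the arithmetic can be sketched as below. This is only a back-of-the-envelope model: it assumes the retrieval time stays at a constant 15 seconds per host and that hosts are rendered one after another. The function and constant names are made up for illustration.

```python
# Rough cost model for the uncached case. The figures come from this report;
# treating the retrieval time as a constant 15 s per host is an assumption.
HOSTS = 6000
SECONDS_PER_RETRIEVAL = 15.0


def uncached_retrieval_seconds(hosts: int, seconds_per_retrieval: float) -> float:
    """Total retrieval time when every host render fetches the playbook again."""
    return hosts * seconds_per_retrieval


total_seconds = uncached_retrieval_seconds(HOSTS, SECONDS_PER_RETRIEVAL)
hosts_per_hour = 3600 / SECONDS_PER_RETRIEVAL

print(f"total retrieval time: {total_seconds / 3600:.1f} hours")  # 25.0 hours
print(f"hosts planned per hour: {hosts_per_hour:.0f}")  # 240
```

Under these assumptions the estimate of about 240 hosts per hour is in the same range as the roughly 300 hosts per hour observed in the description.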

Assuming the uber-playbook won't change within a single job invocation and for a single plan, we should try to cache the uber-playbook using the job invocation and the plan as the two cache keys.
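
The caching idea could look roughly like the sketch below. It is written in Python purely for illustration and is not the foreman-ansible code; `PlaybookCache`, `fake_fetch`, and the integer IDs are hypothetical names, and the fetch function stands in for the slow call to the Insights service.

```python
from typing import Callable, Dict, Tuple


class PlaybookCache:
    """Caches one playbook per (job invocation, plan) pair. Hypothetical helper."""

    def __init__(self, fetch: Callable[[int], str]) -> None:
        self._fetch = fetch  # slow remote retrieval (assumed signature)
        self._store: Dict[Tuple[int, int], str] = {}

    def get(self, job_invocation_id: int, plan_id: int) -> str:
        key = (job_invocation_id, plan_id)
        if key not in self._store:
            # Only the first host rendered for this key pays the retrieval cost.
            self._store[key] = self._fetch(plan_id)
        return self._store[key]


fetch_calls = []


def fake_fetch(plan_id: int) -> str:
    """Stand-in for the remote retrieval; records each call."""
    fetch_calls.append(plan_id)
    return f"playbook for plan {plan_id}"


cache = PlaybookCache(fake_fetch)
rendered = [cache.get(job_invocation_id=1, plan_id=42) for _ in range(6000)]
print(len(fetch_calls))  # prints 1
```

Keying on the job invocation as well as the plan means a later invocation of the same plan retrieves a fresh copy, which matches the assumption that the playbook is only stable within a single invocation.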

Comment 4 Adam Ruzicka 2018-07-16 12:10:19 UTC
Created redmine issue http://projects.theforeman.org/issues/24262 from this bug

Comment 5 Marek Hulan 2018-08-06 14:22:47 UTC
As discussed with Adam, it would be good to start caching the playbook for each host since in our case it's always the same.

Comment 6 Satellite Program 2018-09-07 16:07:03 UTC
Upstream bug assigned to mhulan

Comment 7 Satellite Program 2018-09-07 16:07:07 UTC
Upstream bug assigned to mhulan

Comment 8 Satellite Program 2018-09-13 08:06:56 UTC
Moving this bug to POST for triage into Satellite 6 since the upstream issue https://projects.theforeman.org/issues/24262 has been resolved.

Comment 10 Patrick Creech 2018-09-21 00:42:39 UTC
Both package versions (or greater) are downstream.

Comment 12 Bryan Kearney 2018-10-16 19:14:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2927

