Description of problem:

When trying to run an Insights plan with one rule for "Kernel vulnerable to local privilege escalation via DCCP module (CVE-2017-6074)" on nearly 6K hosts, execution of the plan is very slow. While this plan is executing, no other kinds of jobs are running in parallel (repo sync, client registration, etc.). The job has currently executed on only 300 hosts in an hour, and looking at the database, the planning of tasks is very slow: only 301 tasks have been planned.

SELECT COUNT(*) FROM foreman_tasks_tasks WHERE parent_task_id='0f814ce4-2a8e-4be6-83a5-4359bcbb98eb';
-[ RECORD 1 ]
count | 301

I'm not sure which component this issue should go under, so I'm marking it for the tasks plugin. Please correct it if necessary.

Version-Release number of selected component (if applicable):
Satellite 6.4 Snap 10

How reproducible:
Always

Steps to Reproduce:
1. Create an Insights plan with applicability to about 6K hosts.
2. Execute the plan by clicking on Run Playbook.

Actual results:
The planning of the tasks for the execution plan is slow, and the tasks run nearly serialized: if there are two tasks A and B, then B starts only after A has finished.

Additional info:
Foreman debug: http://debugs.theforeman.org/foreman-debug-vX0MA.tar.xz
The code causing the slowness lives in foreman-ansible, so I'm changing the component, even though the fix may land elsewhere.

Notes:

The slowness is caused by the way we render the template to run for Insights. When retrieving the playbook from the Insights service, we get a single uber-playbook that should remediate all the issues for all the hosts in the plan. When rendering a template for a single host, we retrieve the uber-playbook, pick out the parts relevant to that host, and run them.

The issue occurs when there are a lot of hosts: we render the template for each host separately, and therefore retrieve the uber-playbook N times for N hosts. As the number of hosts in the Insights plan grows, the uber-playbook, and the time needed to retrieve it, grows as well. In the testing setup, a single retrieval took roughly 15 seconds for 6K hosts. Assuming the uber-playbook won't change within a single job invocation and for a single plan, we should use those two identifiers as keys to cache the uber-playbook.
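A minimal sketch of the proposed caching, keyed on (job invocation, plan); all names here (fetch_uber_playbook, retrieve) are hypothetical and this is not the actual Foreman code:

```python
# Illustrative sketch only: cache the uber-playbook per
# (job_invocation_id, plan_id) so that rendering templates for N hosts
# triggers a single expensive retrieval instead of N.
# The function and parameter names are hypothetical, not Foreman's.

_playbook_cache = {}

def fetch_uber_playbook(job_invocation_id, plan_id, retrieve):
    """Return the uber-playbook for this (job invocation, plan) pair,
    calling `retrieve` (the slow Insights-service call) only on a miss."""
    key = (job_invocation_id, plan_id)
    if key not in _playbook_cache:
        _playbook_cache[key] = retrieve(plan_id)
    return _playbook_cache[key]

# Simulated slow retrieval; records how many times it is actually called.
calls = []

def retrieve(plan_id):
    calls.append(plan_id)
    return f"uber-playbook for plan {plan_id}"

# Rendering templates for 6000 hosts hits the Insights service only once.
for _host in range(6000):
    playbook = fetch_uber_playbook("job-1", "plan-1", retrieve)

print(len(calls))  # the expensive retrieval happened exactly once
```

With this shape, per-host rendering still picks its relevant parts out of the playbook; only the retrieval is shared, and the cache entry naturally expires with the job invocation.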
Created redmine issue http://projects.theforeman.org/issues/24262 from this bug
As discussed with Adam, it would be good to start caching the playbook for each host since in our case it's always the same.
Upstream bug assigned to mhulan
Moving this bug to POST for triage into Satellite 6 since the upstream issue https://projects.theforeman.org/issues/24262 has been resolved.
https://github.com/theforeman/foreman-packaging/pull/2980 - rex
https://github.com/theforeman/foreman-packaging/pull/2982 - ansible
Both package versions (or greater) are now downstream.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2018:2927