Red Hat Bugzilla – Bug 1288152
Errata Install to Content Host takes too long and doesn't scale well
Last modified: 2018-05-25 11:24:48 EDT
Description of problem:
When applying (installing) errata on a Content Host from the server, it takes a very long time.
For example, installing ~556 errata on a single content host was observed to take ~44 minutes. Most of that time (~36 minutes) was spent in the 'initiating the install' phase of the task (i.e. executing the pulp consumer content install). While this may not sound too bad at first glance, it won't scale well because the behavior is linear. As a result, if there were 100 content hosts, that same action could take ~3 days.
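The ~3 day estimate follows from simple multiplication. A minimal sketch of that arithmetic, assuming the hosts are processed strictly one after another with no parallelism (the function name is hypothetical, used only for illustration):

```python
def serial_duration_minutes(minutes_per_host: float, host_count: int) -> float:
    """Total wall time if every host costs the same and hosts run serially.

    Assumption: strictly linear scaling, as described in the report.
    """
    return minutes_per_host * host_count


# Figures taken from the report: ~44 minutes per host, 100 content hosts.
total = serial_duration_minutes(44, 100)
print(f"{total:.0f} minutes = {total / 60:.1f} hours = {total / 1440:.1f} days")
# prints "4400 minutes = 73.3 hours = 3.1 days"
```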
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. install Satellite 6.1.4
2. import a manifest
3. enable and sync the RHEL 6Server RPM repo
4. create a content view containing the RHEL 6Server RPM repo and publish it
5. register a RHEL 6.5 client to the above content view
6. initiate an errata install from the UI
(Hosts -> Content Hosts -> select the host -> Errata -> scroll through
all of the errata, select them and 'Apply Selected')
Observe that the errata are installed on the content host; however, it takes a very long time to initiate and execute the task.
We need to investigate why it is taking so long to initiate the errata install and look for ways to optimize it to improve performance.
Attaching a foreman debug from the server where the above scenario was executed.
Created attachment 1101863
foreman-debug: installing errata on single host
Do you have any other info on what the bottleneck was? CPU? Memory? Disk IO?
The Pulp upstream bug status is at NEW. Updating the external tracker on this bug.
The Pulp upstream bug priority is at High. Updating the external tracker on this bug.
I do not have additional data at this time.
Based on https://bugzilla.redhat.com/show_bug.cgi?id=1269509, it looks like this issue has already been resolved in 6.1.5.
My mistake. This issue is server-side and unrelated to bug 1269509.
The Pulp upstream bug priority is at Normal. Updating the external tracker on this bug.
The Pulp upstream bug status is at ASSIGNED. Updating the external tracker on this bug.
The problem stems from the fact that Pulp takes each errata and turns it into a list of packages. As the number of errata grows, the amount of time it takes to translate them into package lists grows.
The proper solution is to smarten up the Katello agent/pulpplugin for Gofer to be aware of errata as a content type. This way Pulp can completely avoid having to generate lists of packages. Yum is very good at figuring out what packages belong to what errata.
I have tested this solution with yum 3.4.3 on RHEL7.
I have also tested this with yum 3.2.22 and yum-security 1.1.16 on RHEL 5.10.
I have confirmed that yum 3.2.22 and yum-security 1.1.16 have shipped with RHEL 5 since update 5.
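To illustrate the idea of handing errata IDs straight to yum rather than expanding them into package lists on the server, here is a rough sketch that builds a yum command line using the `--advisory` option from the yum security plugin. This is not the agent's actual implementation, which is not shown in this report. The helper name and the advisory IDs are made-up placeholders, and repeating `--advisory` once per erratum is an assumption about how the option is best passed.

```python
from typing import List


def build_yum_advisory_command(advisory_ids: List[str]) -> List[str]:
    """Build a yum invocation that lets yum resolve errata to packages.

    Hypothetical helper: the caller passes only errata IDs, so no
    server-side package list is needed.
    """
    if not advisory_ids:
        raise ValueError("at least one advisory ID is required")
    command = ["yum", "-y", "update"]
    for advisory_id in advisory_ids:
        # --advisory is provided by the yum security plugin (built into
        # yum on RHEL 7, a separate package on RHEL 5 and 6).
        command.append(f"--advisory={advisory_id}")
    return command


# Placeholder IDs for illustration only; they do not refer to real errata.
example = build_yum_advisory_command(["RHSA-0000:0001", "RHBA-0000:0002"])
print(" ".join(example))
```

The resulting list could be passed to `subprocess.run` on a content host; it is only printed here so the sketch has no side effects.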
The Pulp upstream bug status is at POST. Updating the external tracker on this bug.
The fix for this issue requires:
- the agent on each content host to be updated
- yum to be up to date on each content host (https://bugzilla.redhat.com/show_bug.cgi?id=1246026)
- yum-plugin-security package to be installed on RHEL 5 and 6 content hosts
The Pulp upstream bug status is at MODIFIED. Updating the external tracker on this bug.
The Pulp upstream bug status is at ON_QA. Updating the external tracker on this bug.
This is going to require adding a:
on the katello-agent-package for RHEL5 and RHEL6. We do not need that Requires on RHEL7 as this is built into yum.
Created redmine issue http://projects.theforeman.org/issues/15366 from this bug
The Pulp upstream bug status is at CLOSED - CURRENTRELEASE. Updating the external tracker on this bug.
A possible workaround for anyone who encounters this issue is to use remote execution rather than the Katello agent to install errata. This requires some setup of SSH key pairs (private/public keys between the capsule and clients). Errata installation is performed the same way as before, except that on the Content Hosts > Errata page the user selects the dropdown arrow on the "Apply Selected" button and then selects "via remote execution".
*** Bug 1290867 has been marked as a duplicate of this bug. ***
*** Bug 1418174 has been marked as a duplicate of this bug. ***
*** Bug 1466985 has been marked as a duplicate of this bug. ***
I've been able to reproduce this issue on Sat 6.2.14.
This Satellite was installed on 6.2.13 and then upgraded to 6.2.14 with no issues.
Then I provisioned 4 hosts (VMs) from Satellite with RHEL 7.3 GA, no 'yum update' during the install. Next, I switched every one of them to RHEL 7Server so they'd have many errata to apply.
One of the hosts then had these errata applicable:
- 60 security, 185 bugfix, 26 enhancement. I had installed a few additional packages on this host only, prior to this test.
The remaining 3 hosts had these errata:
- 34 security, 123 bugfix, 19 enhancement.
On the Content Hosts page I selected these 4 hosts and did a bulk action to apply all errata.
The 4 tasks emerging from this bulk action (1 task per host) took 35, 77, 82, and 89 minutes to complete. Pulp displayed >100% CPU use for 45 minutes.
This Satellite has 4 cores (8 threads total) and 16 GB RAM. All storage is on a fairly fast SATA SSD. Disk I/O was never the bottleneck for this Errata Applicability calculation.
No swapping occurred on Satellite during this whole process.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory, and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.