Bug 1853076
| Summary: | large capsule syncs cause slow processing of dynflow tasks/steps | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Satellite | Reporter: | Waldirio M Pinheiro <wpinheir> | ||||
| Component: | Capsule - Content | Assignee: | Justin Sherrill <jsherril> | ||||
| Status: | CLOSED ERRATA | QA Contact: | Vladimír Sedmík <vsedmik> | ||||
| Severity: | high | Docs Contact: | |||||
| Priority: | high | ||||||
| Version: | 6.7.0 | CC: | ahumbe, arahaman, avnkumar, dhjoshi, dsynk, ehelms, fperalta, hyu, iballou, ikaur, jhutar, joboyer, jsherril, ktordeur, rbertolj, saydas, smajumda, wclark | ||||
| Target Milestone: | 6.8.0 | Keywords: | PrioBumpGSS, Triaged | ||||
| Target Release: | Unused | ||||||
| Hardware: | All | ||||||
| OS: | All | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | rubygem-katello-3.16.0-0.16.rc4.1 | Doc Type: | If docs needed, set a value | ||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | |||||||
| : | 1857359 (view as bug list) | Environment: | |||||
| Last Closed: | 2020-10-27 13:03:46 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Attachments: |
Description
Waldirio M Pinheiro
2020-07-01 21:35:46 UTC
Created redmine issue https://projects.theforeman.org/issues/30286 from this bug

Dear Team, is there actually a workaround available for this issue? My customer is also facing it and would like to understand what could be an ETA for a (hot)fix? Thanks in advance, Cisco.

Moving this bug to POST for triage into Satellite since the upstream issue https://projects.theforeman.org/issues/30286 has been resolved.

Created attachment 1700153
HOTFIX RPM for Satellite 6.7.1
HOTFIX is attached. Please find installation instructions below:

1. Take a backup or snapshot of the Satellite server
2. Download the Hotfix RPM and copy it to the Satellite server
3. # yum install tfm-rubygem-katello-3.14.0.21-6.HOTFIXRHBZ1830403RHBZ1789911RHBZ1853076.el7sat.noarch.rpm --disableplugin=foreman-protector
4. # systemctl restart httpd dynflow

By default, the Hotfix configures a batch size of 25 for Pulp sub-tasks during Capsule sync. This reduces the amount of polling from Dynflow --> Pulp and lowers the load on both services, since neither has to track or communicate with the other about thousands of sub-tasks at once.

The batch size is also configurable, so you may find a more optimal value for your deployment. To configure it, navigate to Administer --> Settings --> Content and modify the parameter labeled "Batch size to sync repositories in."

I'm very sorry Vláďo, I was not able to work on this :-/

To verify this BZ I compared two setups:

1) Satellite + Capsule 6.7.0 snap 20
2) Satellite + Capsule 6.8.0 snap 14

In each setup, 6 repos (RHEL7Server, RHEL7Server-Optional, RHSCL for RHEL7, RHEL8-BaseOS, RHEL8-AppStream, test_simple_errata) were published into 40 content views each (240 content views in total) and were synchronized (Complete Sync) from Sat to Caps. I used the default batch size setting 'foreman_proxy_batch_size'=25 in case 2). Four hosts were registered and unregistered through the capsule.
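The batching described in the hotfix comment can be pictured with a minimal sketch. This is not Katello's code: the `in_batches` helper and the `repo-N` names are hypothetical, and treating the 240 published copies as the list of repositories to sync is an assumption made only for illustration. It shows how a batch size of 25 groups the work into far fewer units to track.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def in_batches(items: Sequence[T], batch_size: int = 25) -> Iterator[List[T]]:
    """Yield consecutive slices of `items`, each holding at most `batch_size` entries."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# Hypothetical workload: 6 repos x 40 content views = 240 repository copies.
repos = [f"repo-{n}" for n in range(6 * 40)]
batches = list(in_batches(repos, batch_size=25))

print(len(batches))      # 10 batches instead of 240 individually tracked items
print(len(batches[-1]))  # the final batch holds the remaining 15
```

A smaller batch size means fewer sub-tasks in flight at any moment; a larger one means fewer batches overall. Which value suits a given deployment is something to measure rather than assume.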
Results:

| Metric | 6.7.0-20 | 6.8.0-14 |
|---|---|---|
| Overall sync time [hh:mm:ss] | 28:44:45 | 26:54:33 |
| Host registration time | 13 s | 11 s |
| Host unregistration time | 2 s | 2 s |
| Average errata enumeration time | 163 s | 19.5 s |
| Average CPU load during sync | 1.88 | 1.89 |
| Median CPU load during sync | 0.87 | 0.22 |
| REX command run time (hostnamectl) | 27-44 s | 4-8 s |

Conclusion: We can see a huge improvement in the errata enumeration time for newly registered hosts (this requires the workaround from BZ#1771921), and REX times also improved significantly. Overall sync time improved slightly (by 1h50m), while the average CPU load remained almost the same. Median CPU load was lower on 6.8 because higher peaks and longer valleys occurred during the sync.

I haven't noticed any large or fast-growing log files during or after the sync. The size of /var/log/foreman was 3.5 MB, and the whole /var/log directory occupied ~1 GB of space on both instances after the sync.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: Satellite 6.8 release), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:4366
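The deltas quoted in the verification results can be rechecked with a short calculation. The sketch below uses only the figures reported in this bug; the `parse_hms` helper is a hypothetical name introduced for the example.

```python
from datetime import timedelta


def parse_hms(text: str) -> timedelta:
    """Convert an 'hh:mm:ss' string (hours may exceed 24) into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


# Overall sync times reported for 6.7.0-20 and 6.8.0-14.
sync_670 = parse_hms("28:44:45")
sync_680 = parse_hms("26:54:33")
saved = sync_670 - sync_680
print(saved)  # 1:50:12

# Average errata enumeration time: 163 s on 6.7.0 vs 19.5 s on 6.8.0.
errata_speedup = 163 / 19.5
print(round(errata_speedup, 1))  # 8.4
```

So the sync finished roughly 1h50m sooner, and errata enumeration was about eight times faster, which matches the conclusion stated in the verification comment.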