Description of problem:

Regenerate applicability tasks (both for a repo and for a consumer) request from mongo repo_content_units data that are not required for the calculation. This has a very evident negative performance impact, because:
- it puts mongo under higher load
- the task keeps redundant data in a map

When restricting the search to return only the required fields, I get 35-50%(!!!) better times in reg.app. tasks - both on my small Satellite and on the customer data.

In particular:
- the rpms queried at https://github.com/pulp/pulp_rpm/blob/2.18-release/plugins/pulp_rpm/plugins/profilers/yum.py#L291 fetch NEVRA + checksum information - this is because TYPE_ID_RPM = "rpm" and:

> db.content_types.find({'id': 'rpm'}, {'unit_key': 1})
{ "_id" : ObjectId("5e3560edb6dd5248dcdd9406"), "unit_key" : [ "name", "epoch", "version", "release", "arch", "checksumtype", "checksum" ] }

- so the "rpms" map keeps NEVRA information + checksumtype + checksum for each rpm unit returned
- but the NEVRA information alone is sufficient for reg.app. calculations (cf. https://github.com/pulp/pulp_rpm/blob/2.18-release/plugins/pulp_rpm/plugins/profilers/yum.py#L481-L493)
- so my "monkey improvement" is:
  - add a new auxiliary content type with a unit_key containing NEVRA only
  - have yum.py#L291 call get_repo_units with this new content type instead of TYPE_ID_RPM

The particular patch is in Additional info. Testing this on the customer data showed a 35-50% improvement in task duration (details will follow in the next update).

Version-Release number of selected component (if applicable):
Sat 6.5 / 6.6 / 6.7

How reproducible:
100% on customer data

Steps to Reproduce:
On customer data, fire repo reg.applicability to recalculate it for a repo (consumer reg.app. tasks are optimised to skip the recalculation if no change is detected, so they are not suitable for such performance testing - but figures I got from smarter testing show the same improvement rates). To fire such a request manually:

1. Have repo_regen_request.template.json:

{ "repo_criteria": { "filters": {"id": {"$in": ["REPO"]}} }, "parallel": true }

2. Run:

pulpAdminPassword=$(grep ^default_password /etc/pulp/server.conf | cut -d' ' -f2)
repo_id=YourOrganization-Red_Hat_Enterprise_Linux_Server-Red_Hat_Enterprise_Linux_7_Server_RPMs_x86_64_7Server   # fill in a big repo id
cat repo_regen_request.template.json | sed "s/REPO/$repo_id/g" > repo_regen_request.json
curl -u admin:$pulpAdminPassword -X POST -d @repo_regen_request.json https://$(hostname -f)/pulp/api/v2/repositories/actions/content/regenerate_applicability/

3. The output will be like:

{"group_id": "f82229ca-c2be-4da6-89a2-b45ad7e94605", "_href": "/pulp/api/v2/task_groups/f82229ca-c2be-4da6-89a2-b45ad7e94605/"}

Note the _href and append "state_summary/" after it.

4. Repeatedly query the task group status:

curl -u admin:$pulpAdminPassword https://$(hostname -f)/pulp/api/v2/task_groups/f82229ca-c2be-4da6-89a2-b45ad7e94605/state_summary/

until the number of "finished" tasks equals the "total" tasks.

5. Measure the duration.
Optionally, measure how long these tasks took in /var/log/messages:

Mar 10 11:24:46 pmoravec-sat651-for-02523362 pulp: celery.app.trace:INFO: [4b47b05c] Task pulp.server.managers.consumer.applicability.batch_regenerate_applicability[4b47b05c-86c7-4cf3-b128-56bb85b844b9] succeeded in 40.393520645s: None

Actual results:

MyOrg-Red_Hat_Software_Collections__for_RHEL_Server_-Red_Hat_Software_Collections_RPMs_for_Red_Hat_Enterprise_Linux_7_Server_x86_64_7Server:
  0:53:54 whole exec. time
  batch_regenerate_applicability: 1243 tasks, 24243.3s in sum, 20.7031s on average

MyOrg-Red_Hat_Enterprise_Linux_Server-Red_Hat_Enterprise_Linux_7_Server_-_Optional_RPMs_x86_64_7Server:
  2:01:54 whole exec. time
  batch_regenerate_applicability: 1240 tasks, 58765.2s in sum, 44.7905s on average

MyOrg-Red_Hat_Enterprise_Linux_Server-Red_Hat_Enterprise_Linux_7_Server_RPMs_x86_64_7Server:
  3:01:27 whole exec. time
  batch_regenerate_applicability: 1246 tasks, 86869.2s in sum, 69.7184s on average

Expected results:

With the patch applied:

MyOrg-Red_Hat_Software_Collections__for_RHEL_Server_-Red_Hat_Software_Collections_RPMs_for_Red_Hat_Enterprise_Linux_7_Server_x86_64_7Server:
  0:25:24 whole exec. time
  batch_regenerate_applicability: 1243 tasks, 12107.1s in sum, 9.74021s on average
  - improvement by 50% (!!)

MyOrg-Red_Hat_Enterprise_Linux_Server-Red_Hat_Enterprise_Linux_7_Server_-_Optional_RPMs_x86_64_7Server:
  1:17:16 whole exec. time
  batch_regenerate_applicability: 1240 tasks, 36917.6s in sum, 29.7722s on average
  - improvement by 35%

MyOrg-Red_Hat_Enterprise_Linux_Server-Red_Hat_Enterprise_Linux_7_Server_RPMs_x86_64_7Server:
  1:47:54 whole exec. time
  batch_regenerate_applicability: 1246 tasks, 51643.5s in sum, 41.4474s on average
  - improvement by 40%

Additional info:

The patch:

1) Add the new content type:

mongo pulp_database
db.content_types.insert({'id': 'rpm_nevra', 'unit_key': [ 'name', 'epoch', 'version', 'release', 'arch' ], 'display_name': 'RPM_NEVRA', 'description': 'auxiliary for reg.app.', '_ns': 'content_types', 'search_indexes': [ 'downloaded']})

2) Apply the patch:

--- /usr/lib/python2.7/site-packages/pulp_rpm/common/ids.py.orig	2020-03-09 21:50:25.972970511 +0100
+++ /usr/lib/python2.7/site-packages/pulp_rpm/common/ids.py	2020-03-09 21:51:03.225812600 +0100
@@ -17,6 +17,7 @@ EXPORT_DISTRIBUTOR_ID = 'export_distribu
 TYPE_ID_ISO = 'iso'
 TYPE_ID_RPM = 'rpm'
+TYPE_ID_RPM_NEVRA = 'rpm_nevra'
 TYPE_ID_SRPM = 'srpm'
 
 UNIT_KEY_RPM = (
     "name", "epoch", "version", "release", "arch", "checksum", "checksumtype")
--- /usr/lib/python2.7/site-packages/pulp_rpm/plugins/profilers/yum.py.orig	2020-03-09 22:01:30.882251718 +0100
+++ /usr/lib/python2.7/site-packages/pulp_rpm/plugins/profilers/yum.py	2020-03-09 22:16:02.048726781 +0100
@@ -8,7 +8,7 @@
 from pulp.server.db import model
 from pulp.server.db.model.criteria import UnitAssociationCriteria
 from pulp_rpm.common.constants import VIRTUAL_MODULEMDS
-from pulp_rpm.common.ids import TYPE_ID_ERRATA, TYPE_ID_RPM, TYPE_ID_MODULEMD
+from pulp_rpm.common.ids import TYPE_ID_ERRATA, TYPE_ID_RPM, TYPE_ID_RPM_NEVRA, TYPE_ID_MODULEMD
 from pulp_rpm.plugins.db import models
 from pulp_rpm.yum_plugin import util
@@ -288,7 +288,7 @@ class YumProfiler(Profiler):
         # Create lookup table of available RPMs for errata applicability, find applicable RPMs
         # and modules.
         additional_unit_fields = ['is_modular']
-        rpms = conduit.get_repo_units(bound_repo_id, TYPE_ID_RPM, additional_unit_fields)
+        rpms = conduit.get_repo_units(bound_repo_id, TYPE_ID_RPM_NEVRA, additional_unit_fields)
         available_rpm_nevras = {'modular': set(), 'non-modular': set()}
         for rpm in rpms:
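For illustration only, a minimal pymongo sketch of why the restricted fields help - it compares the rough payload size fetched with the full TYPE_ID_RPM unit_key projection vs. a NEVRA-only projection. This is not the patch itself; the collection name units_rpm, a local auth-less mongo connection, and the sampling limit are assumptions about a default Pulp 2 layout:

from pymongo import MongoClient

db = MongoClient("localhost", 27017).pulp_database

# fields per the content_types unit_key shown above
full_fields = ["name", "epoch", "version", "release", "arch", "checksumtype", "checksum"]
nevra_fields = ["name", "epoch", "version", "release", "arch"]

def sample_payload(fields, limit=1000):
    # fetch up to `limit` rpm units with only the given fields projected and
    # return a rough byte count of what the task would have to keep in its map
    projection = dict((f, 1) for f in fields)
    return sum(len(str(doc)) for doc in db.units_rpm.find({}, projection).limit(limit))

print("full unit_key projection : %d" % sample_payload(full_fields))
print("NEVRA-only projection    : %d" % sample_payload(nevra_fields))

The smaller projection is what the auxiliary rpm_nevra content type effectively gives get_repo_units.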
Tested a few approaches on the customer data:

1) original pulp + mongo content
2) Hao's patch + the index added
3) additionally, applied an improved (and working) patch "query NEVRA+modular info only" (patch follows)
4) additionally, removed modularity profiles for non-RHEL8 systems (roughly speaking)

2) is in #c4 + #c6.

3) is a patch based on my original idea, but a less monkey/intrusive approach that really computes applicability correctly (the original patch did not):

diff -rup a/usr/lib/python2.7/site-packages/pulp/plugins/conduits/profiler.py b/usr/lib/python2.7/site-packages/pulp/plugins/conduits/profiler.py
--- a/usr/lib/python2.7/site-packages/pulp/plugins/conduits/profiler.py	2020-03-11 08:58:31.212958350 +0100
+++ b/usr/lib/python2.7/site-packages/pulp/plugins/conduits/profiler.py	2020-03-11 08:57:53.226133703 +0100
@@ -34,7 +34,7 @@ class ProfilerConduit(MultipleRepoUnitsM
         bindings = manager.find_by_consumer(consumer_id)
         return [b['repo_id'] for b in bindings]
 
-    def get_repo_units(self, repo_id, content_type_id, additional_unit_fields=None):
+    def get_repo_units(self, repo_id, content_type_id, additional_unit_fields=None, only_unit_fields=None):
         """
         Searches for units in the given repository with given content type
         and returns a plugin unit containing unit id, unit key and any additional
@@ -55,7 +55,10 @@ class ProfilerConduit(MultipleRepoUnitsM
         """
         additional_unit_fields = additional_unit_fields or []
         try:
-            unit_key_fields = units_controller.get_unit_key_fields_for_type(content_type_id)
+            if only_unit_fields is None:
+                unit_key_fields = units_controller.get_unit_key_fields_for_type(content_type_id)
+            else:
+                unit_key_fields = only_unit_fields
             serializer = units_controller.get_model_serializer_for_type(content_type_id)
 
             # Query repo association manager to get all units of given type
diff -rup a/usr/lib/python2.7/site-packages/pulp_rpm/plugins/profilers/yum.py b/usr/lib/python2.7/site-packages/pulp_rpm/plugins/profilers/yum.py
--- a/usr/lib/python2.7/site-packages/pulp_rpm/plugins/profilers/yum.py	2020-03-11 08:58:25.396985197 +0100
+++ b/usr/lib/python2.7/site-packages/pulp_rpm/plugins/profilers/yum.py	2020-03-11 08:58:06.792071081 +0100
@@ -288,7 +288,7 @@ class YumProfiler(Profiler):
         # Create lookup table of available RPMs for errata applicability, find applicable RPMs
         # and modules.
         additional_unit_fields = ['is_modular']
-        rpms = conduit.get_repo_units(bound_repo_id, TYPE_ID_RPM, additional_unit_fields)
+        rpms = conduit.get_repo_units(bound_repo_id, TYPE_ID_RPM, additional_unit_fields, NVREA_KEYS)
         available_rpm_nevras = {'modular': set(), 'non-modular': set()}
         for rpm in rpms:

4) there are consumer profiles for modularity on e.g. RHEL6 or RHEL7 systems that are empty but still taken into account, like this consumer_unit_profiles document:

{
    "_id" : ObjectId("5e66464fb6dd526718ac1d61"),
    "profile" : [ ],
    "_ns" : "consumer_unit_profiles",
    "profile_hash" : "4f53cda18c2baa0c0354bb5f9a3ecbe5ed12ab4d8e11ba873c2f11161202b945",
    "consumer_id" : "c3ad7949-834b-4dc5-9839-95f39c85924c",
    "content_type" : "modulemd",
    "id" : "5e66464fb6dd526718ac1d61"
}

Note the empty profile and content_type = modulemd. Removing those profiles via:

db.consumer_unit_profiles.remove({'profile': []})

was my trick. Additionally, consumer_unit_profiles of nonexisting consumers (consumer_id not seen in the consumers collection) can be deleted - the attached case has a clean_orphaned_consumer_profiles.sh script for that. In 4), I cleaned the consumer unit profiles of both types of orphans.

Testbed used on the customer data:
of "RHEL7 software collections" repo (1243 batch applicabilities tasks invoked) - concurrently, run reg.app. of 200 consumers Results from this testbed: #tasks sum_time avg_time =============================================================== orig:regenerate_applicability_for_consumers 200 2954.68 14.7734 orig:batch_regenerate_applicability 1243 29841.8 24.0079 =============================================================== hao:regenerate_applicability_for_consumers 200 1947.58 9.73789 hao:batch_regenerate_applicability 1243 26499.1 21.3187 =============================================================== hao+pmoravec-nevra:regenerate_applicability_for_consumers 200 1682.96 8.4148 hao+pmoravec-nevra:batch_regenerate_applicability 1234 25435.2 20.612 =============================================================== hao+pmoravec-NEVRA+orphans:regenerate_applicability_for_consumers 200 1634.87 8.17437 hao+pmoravec-NEVRA+orphans:batch_regenerate_applicability 1234 25196.9 20.4189 So overall improvement: - reg.app. of consumers improved by 44% - reg.app. of the repo improved by 15.5%
Created attachment 1669696 [details]
clean_orphaned_consumer_profiles.sh

The bash script to clean profiles of non-existing consumers (cf. point 4).
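(Not the attached script itself - just a pymongo sketch of the same two cleanups from point 4 above, i.e. empty modulemd profiles and profiles whose consumer no longer exists. It assumes pymongo >= 3, a local auth-less connection, and that the consumers collection carries the consumer id in its "id" field; run it against a backed-up database only.)

from pymongo import MongoClient

db = MongoClient("localhost", 27017).pulp_database

# 1) drop the empty (modulemd) profiles - same idea as db.consumer_unit_profiles.remove({'profile': []})
removed_empty = db.consumer_unit_profiles.delete_many({"profile": []})
print("removed %d empty profiles" % removed_empty.deleted_count)

# 2) drop profiles whose consumer_id has no matching document in the consumers collection
existing_ids = [c["id"] for c in db.consumers.find({}, {"id": 1})]
removed_orphans = db.consumer_unit_profiles.delete_many(
    {"consumer_id": {"$nin": existing_ids}})
print("removed %d profiles of nonexisting consumers" % removed_orphans.deleted_count)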
The Pulp upstream bug status is at NEW. Updating the external tracker on this bug.
The Pulp upstream bug priority is at Normal. Updating the external tracker on this bug.
Continuing the tests from #c7, now against Hao's improved patch per https://github.com/pulp/pulp_rpm/pull/1640. Cumulative results (my patch + the 2 cleanups + Hao's improved patch):

                                                        #tasks  sum_time[s]  avg_time[s]
hao-improved:regenerate_applicability_for_consumers        200      1589.71      7.94857
hao-improved:batch_regenerate_applicability               1232      15331.8      12.4446

So the cumulative improvement is almost 50%, largely thanks to Hao's patch. Kudos!
The patches have been merged upstream.
The Pulp upstream bug status is at MODIFIED. Updating the external tracker on this bug.
To make the HF efficient, it was confirmed that one index needs to be added to mongo, as per PR https://github.com/pulp/pulp_rpm/pull/1659:

mongo pulp_database --eval "db.erratum_pkglists.createIndex( { repo_id: 1 } )"

Comparing the performance of reg.app. tasks with the index, then without it, and after adding it again (on the HF packages over 6.5.3):

                                                    #tasks  sum_time[s]  avg_time[s]
===============================================================
index:regenerate_applicability_for_consumers           200      1430.48      7.15242
index:batch_regenerate_applicability                  1237      12530.3      10.1296
===============================================================
NoIndex:regenerate_applicability_for_consumers         200      1729.54      8.64772
NoIndex:batch_regenerate_applicability                1243      12700.3      10.2175
===============================================================
indexAgain:regenerate_applicability_for_consumers      200      1255.74       6.2787
indexAgain:batch_regenerate_applicability             1237      12319.4      9.95908

Just using the index:
- repo reg.app. (batch reg.app.) is improved by 1-3%
- consumer reg.app. is improved by 17-27%
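A small pymongo sketch for checking whether that index is already present and creating it if not (equivalent to the mongo --eval one-liner above; the local auth-less connection is an assumption):

from pymongo import MongoClient

db = MongoClient("localhost", 27017).pulp_database

# look for an index whose key is exactly (repo_id, ascending)
has_index = any(
    [field for field, _ in info["key"]] == ["repo_id"]
    for info in db.erratum_pkglists.index_information().values()
)

if not has_index:
    db.erratum_pkglists.create_index([("repo_id", 1)])
    print("created the repo_id index on erratum_pkglists")
else:
    print("repo_id index already present")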
Hotfix is available for Satellite 6.6.2

INSTALLATION INSTRUCTIONS:

1. Make a backup or snapshot of the Satellite server.

2. Add the following index to MongoDB to improve performance of some queries:
# mongo pulp_database --eval "db.erratum_pkglists.createIndex( { repo_id: 1 } )"

3. Download the attached files and copy them to the Satellite server:
pulp-server-2.19.1.1-2.HOTFIXRHBZ1812031.el7sat.noarch.rpm
pulp-rpm-plugins-2.19.1.1-3.HOTFIXRHBZ1812031.el7sat.noarch.rpm

4. Install the packages:
# yum update pulp-server-2.19.1.1-2.HOTFIXRHBZ1812031.el7sat.noarch.rpm pulp-rpm-plugins-2.19.1.1-3.HOTFIXRHBZ1812031.el7sat.noarch.rpm --disableplugin=foreman-protector

5. Restart the pulp services (ideally when no pulp task is in progress):
# for i in pulp_celerybeat pulp_resource_manager pulp_streamer pulp_workers; do service $i restart; done
Created attachment 1674130 [details] pulp-server hotfix RPM for Satellite 6.6.2
Created attachment 1674131 [details] pulp-rpm-plugins hotfix RPM for Satellite 6.6.2
Hotfix is available for Satellite 6.5.3

INSTALLATION INSTRUCTIONS:

1. Make a backup or snapshot of the Satellite server.

2. Add the following index to MongoDB to improve performance of some queries:
# mongo pulp_database --eval "db.erratum_pkglists.createIndex( { repo_id: 1 } )"

3. Download the attached files and copy them to the Satellite server:
pulp-server-2.18.1.1-2.HOTFIXRHBZ1812031.el7sat.noarch.rpm
pulp-rpm-plugins-2.18.1.6-2.HOTFIXRHBZ1812031.el7sat.noarch.rpm

4. Install the packages:
# yum update pulp-server-2.18.1.1-2.HOTFIXRHBZ1812031.el7sat.noarch.rpm pulp-rpm-plugins-2.18.1.6-2.HOTFIXRHBZ1812031.el7sat.noarch.rpm --disableplugin=foreman-protector

5. Restart the pulp services (ideally when no pulp task is in progress):
# for i in pulp_celerybeat pulp_resource_manager pulp_streamer pulp_workers; do service $i restart; done
Created attachment 1674133 [details] pulp-server hotfix RPM for Satellite 6.5.3
Created attachment 1674134 [details] pulp-rpm-plugins hotfix RPM for Satellite 6.5.3
Created attachment 1679713 [details] pulp-rpm-plugins hotfix RPM for Satellite 6.7.0
Hotfix is available for Satellite 6.7.0

This replaces a previously published hotfix with an updated (.3) version of pulp-server as well as an additional index added to the database:
pulp-server-2.21.0-3.HOTFIXRHBZ1812031.el7sat.noarch.rpm

INSTALLATION INSTRUCTIONS:

1. Make a backup or snapshot of the Satellite server.

2. Add the following indexes to MongoDB to improve performance of some queries:
# mongo pulp_database --eval "db.erratum_pkglists.createIndex( { repo_id: 1 } )"
# mongo pulp_database --eval "db.consumer_unit_profiles.createIndex( { id: 1 } )"

3. Download the attached files and copy them to the Satellite server:
pulp-server-2.21.0-3.HOTFIXRHBZ1812031.el7sat.noarch.rpm
pulp-rpm-plugins-2.21.0.4-2.HOTFIXRHBZ1812031.el7sat.noarch.rpm

4. Install the packages:
# yum update pulp-server-2.21.0-3.HOTFIXRHBZ1812031.el7sat.noarch.rpm pulp-rpm-plugins-2.21.0.4-2.HOTFIXRHBZ1812031.el7sat.noarch.rpm --disableplugin=foreman-protector

5. Restart the pulp services (ideally when no pulp task is in progress):
# for i in pulp_celerybeat pulp_resource_manager pulp_streamer pulp_workers; do service $i restart; done
Created attachment 1682553 [details] pulp-server hotfix RPM for Satellite 6.7.0
The Pulp upstream bug status is at CLOSED - CURRENTRELEASE. Updating the external tracker on this bug.
The Pulp upstream bug status is at ON_QA. Updating the external tracker on this bug.
NOTE: The hotfix for this bug delivered in 6.6.2 is also applicable to 6.6.3, as the version of pulp did not change in the 6.6.3 delivery. To apply the hotfix to this bug on a Satellite 6.6.3 system, follow the instructions here: https://bugzilla.redhat.com/show_bug.cgi?id=1812031#c17
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: Satellite 6.8 release), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:4366