Red Hat Bugzilla – Bug 1573892
regenerate applicability of a consumer takes many minutes
Last modified: 2018-10-05 17:08:16 EDT
Description of problem:
In a customer setup reproduced internally, a single pulp.server.managers.consumer.applicability.regenerate_applicability_for_consumers task takes many minutes for a single consumer. This makes Satellite unusable when many systems are updated, because each system sends its package profile to Satellite, which triggers the regenerate applicability (reg.app.) task.

Coredumps taken during the task execution showed the task spends almost the whole time in the same code that https://bugzilla.redhat.com/show_bug.cgi?id=1523433 refers to. Applying the patch from that BZ led to approx. 1/4 time improvement, but a reg.app. task running for 5 minutes or so is still far too long.

A typical consumer is bound to a few repos and (after adding some debug logging) the majority of time is spent on RHEL6 and RHEL6 Extras repos calculation (for the consumer):

Apr 24 07:41:21 dell-per820-2 pulp: celery.worker.strategy:INFO: Received task: pulp.server.managers.consumer.applicability.regenerate_applicability_for_consumers[c990e256-a580-476f-8205-044fd8ee807d]
Apr 24 07:41:21 dell-per820-2 pulp: pulp.server.managers.consumer.applicability:INFO: PavelM: regenerate_applicability for bound_repo_id ORG-Linux-Red_Hat_Enterprise_Linux_Server-Red_Hat_Enterprise_Linux_6_Server_RPMs_x86_64_6Server
Apr 24 07:45:02 dell-per820-2 pulp: pulp.server.managers.consumer.applicability:INFO: PavelM: regenerate_applicability for bound_repo_id ORG-Linux-Red_Hat_Enterprise_Linux_Server-Red_Hat_Satellite_Tools_6_2_for_RHEL_6_Server_RPMs_x86_64
Apr 24 07:45:03 dell-per820-2 pulp: pulp.server.managers.consumer.applicability:INFO: PavelM: regenerate_applicability for bound_repo_id ORG-Linux-Red_Hat_Enterprise_Linux_Server-Red_Hat_Enterprise_Linux_6_Server_-_RH_Common_RPMs_x86_64_6Server
Apr 24 07:45:04 dell-per820-2 pulp: pulp.server.managers.consumer.applicability:INFO: PavelM: regenerate_applicability for bound_repo_id ORG-Linux-Red_Hat_Enterprise_Linux_Server-Red_Hat_Enterprise_Linux_6_Server_-_Optional_RPMs_x86_64_6Server
Apr 24 07:47:58 dell-per820-2 pulp: pulp.server.managers.consumer.applicability:INFO: PavelM: regenerate_applicability for bound_repo_id ORG-Linux-Red_Hat_Enterprise_Linux_Server-Red_Hat_Enterprise_Linux_6_Server_-_Extras_RPMs_x86_64
Apr 24 07:47:58 dell-per820-2 pulp: celery.worker.job:INFO: Task pulp.server.managers.consumer.applicability.regenerate_applicability_for_consumers[c990e256-a580-476f-8205-044fd8ee807d] succeeded in 397.601516957s: None

See some further observations in:
https://bugzilla.redhat.com/show_bug.cgi?id=1523433#c17
https://bugzilla.redhat.com/show_bug.cgi?id=1523433#c18

Reproducer:
https://bugzilla.redhat.com/show_bug.cgi?id=1523433#c19 (beaker default password)

Version-Release number of selected component (if applicable):
6.3.1

How reproducible:
100%

Steps to Reproduce:
1. Use the reproducer machine / bz1523433#c19 (beaker default password)
2. Check the time the reg.app. task will take

Actual results:
>5 minutes for the specified consumers

Expected results:
below 1 minute (?) will be a win

Additional info:
Problem seen on 6.2.14; the customer's upgrade to 6.3.1 didn't help here. Reproducer machines are being updated to 6.3.1.

Sizes of some mongo collections:

# for i in consumer_bindings consumers consumer_unit_profiles erratum_pkglists repo_content_units repo_profile_applicability repos units_erratum units_package_group units_rpm ; do echo $i $(mongo pulp_database --eval "db.${i}.count()" | grep "^[0-9]"); done
consumer_bindings 15766
consumers 4641
consumer_unit_profiles 4628
erratum_pkglists 188923
repo_content_units 16372066
repo_profile_applicability 197668
repos 4433
units_erratum 23975
units_package_group 89676
units_rpm 238287
#

(is the 16M repo_content_units the key slow down factor?)
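To read the per-repo cost off the debug log above, the gaps between consecutive "regenerate_applicability for bound_repo_id" lines can be computed. The following is only a sketch: it assumes each debug line is written when processing of that repo starts (the first one coincides with task receipt), it assumes the year is 2018 because syslog timestamps carry none, and the short repo labels and the helper names are made up here for readability.

```python
from datetime import datetime

# Timestamps copied from the log excerpt; repo IDs shortened to
# hypothetical labels for readability.
LOG = [
    ("Apr 24 07:41:21", "RHEL 6 Server RPMs"),
    ("Apr 24 07:45:02", "Satellite Tools 6.2 for RHEL 6"),
    ("Apr 24 07:45:03", "RHEL 6 Server - RH Common RPMs"),
    ("Apr 24 07:45:04", "RHEL 6 Server - Optional RPMs"),
    ("Apr 24 07:47:58", "RHEL 6 Server - Extras RPMs"),
]
TASK_END = "Apr 24 07:47:58"  # timestamp of the "Task ... succeeded" line


def parse(stamp, year=2018):
    """Parse a syslog timestamp; the year is an assumption, syslog omits it."""
    return datetime.strptime("{} {}".format(year, stamp), "%Y %b %d %H:%M:%S")


def seconds_per_repo(entries, task_end):
    """Return (label, seconds until the next log line) for each repo entry."""
    stamps = [parse(stamp) for stamp, _ in entries] + [parse(task_end)]
    return [
        (label, int((stamps[i + 1] - stamps[i]).total_seconds()))
        for i, (_, label) in enumerate(entries)
    ]


if __name__ == "__main__":
    for label, seconds in seconds_per_repo(LOG, TASK_END):
        print("{:>4d}s  {}".format(seconds, label))
```

Under the stated assumption the gaps add up to 397 seconds, which matches the 397.6s reported by celery for the whole task. If the debug line were instead written after each repo finishes, every gap would belong to the repo on the following line.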
Another observation from the same customer / user scenario: remove orphans takes >6 hours (and still running with 100%CPU on mongo).
(In reply to Pavel Moravec from comment #3)
> Another observation from the same customer / user scenario: remove orphans
> takes >6 hours (and still running with 100%CPU on mongo).

Remove orphans took over 2 days there :-S
The Pulp upstream bug status is at MODIFIED. Updating the external tracker on this bug.
The Pulp upstream bug priority is at Normal. Updating the external tracker on this bug.
All upstream Pulp bugs are at MODIFIED+. Moving this bug to POST.
There must be a bug in the serializers part of the patch, as a units search of a repo fails. Try searches like:

pulpAdminPassword=$(grep ^default_password /etc/pulp/server.conf | cut -d' ' -f2)
repo=whatever-Repository-you-have
curl -i -H "Content-Type: application/json" -X POST -d "{\"criteria\":{\"type_ids\":[\"erratum\"],\"fields\":{\"unit\":[],\"association\":[\"unit_id\"]}}}" -u admin:$pulpAdminPassword https://$(hostname -f)/pulp/api/v2/repositories/${repo}/search/units/

(this POST request is issued by katello when processing Katello::Api::Rhsm::CandlepinProxiesController#get requests)
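For anyone scripting this check, the same POST can be assembled without the shell quoting. This is a minimal sketch using only the Python 3 standard library: the function name, host, repo ID and password below are placeholders, and the request is only built, not sent (sending it would need urllib.request.urlopen and a trusted CA for the Satellite certificate).

```python
import base64
import json
import urllib.request


def build_unit_search_request(host, repo_id, user, password):
    """Build (but do not send) the erratum unit search POST shown above."""
    # Same criteria as in the curl command: erratum units, no unit fields,
    # only the unit_id association field.
    criteria = {
        "criteria": {
            "type_ids": ["erratum"],
            "fields": {"unit": [], "association": ["unit_id"]},
        }
    }
    url = "https://{}/pulp/api/v2/repositories/{}/search/units/".format(
        host, repo_id
    )
    # HTTP basic auth, equivalent to curl's -u user:password
    token = base64.b64encode(
        "{}:{}".format(user, password).encode("utf-8")
    ).decode("ascii")
    return urllib.request.Request(
        url,
        data=json.dumps(criteria).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Basic " + token,
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values, not taken from the reproducer machine.
    req = build_unit_search_request(
        "satellite.example.com", "some-repo-id", "admin", "secret"
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```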
Created attachment 1446278 [details]
tested patch

Tested the cumulative patch of pulp_rpm PRs 1107 (without unit tests) and 1111 - see attached, applied via:

cd /usr/lib/python2.7/site-packages/pulp_rpm
cat /root/bz1573892-improvement-and-serializers.patch | patch -p3

(the above can be shared as an officially _untested_ patch; "yum reinstall pulp-rpm-plugins" is a rollback)

My testing results are all green:
(*) reg.app. on the "benchmarked" consumers was still similarly significantly faster (3-20 times, now)
(*) reg.app. properly updates errata applicability (played with downloading & upgrading & removing a package)
(*) errata search works fine:

repo=someRepoName
curl -H "Content-Type: application/json" -X POST -d "{\"criteria\":{\"type_ids\":[\"erratum\"]}}" -u admin:$pulpAdminPassword https://$(hostname -f)/pulp/api/v2/repositories/${repo}/search/units/

(*) recursive units association works fine (tested per #c28)
(*) previously failing errata search per #c31 works fine
(*) tried several CV actions, all work OK
The Pulp upstream bug status is at CLOSED - CURRENTRELEASE. Updating the external tracker on this bug.
Requesting needsinfo from upstream developer ttereshc@redhat.com because the 'FailedQA' flag is set.
There was a missed CP, as we didn't associate https://pulp.plan.io/issues/3886 with this bug. Moving back to POST.
Ignore above comment, I was looking at the wrong RPM/repo
Created attachment 1475668 [details]
verification screenshot

Verified in Satellite 6.3.3 Snap 2.

Regenerate Applicability now takes less than a minute for RHEL 6 and RHEL 7 systems. RHEL 6 systems had 135 applicable updates. RHEL 7 systems had 376 applicable updates. See the attached screenshot for task execution times.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:2550