Description of problem:

Occasionally, the Actions::Pulp::Consumer::GenerateApplicability task will raise the following:

traceback: |
  Traceback (most recent call last):
    File "/usr/lib/python2.7/site-packages/celery/app/trace.py", line 240, in trace_task
      R = retval = fun(*args, **kwargs)
    File "/usr/lib/python2.7/site-packages/pulp/server/async/tasks.py", line 473, in __call__
      return super(Task, self).__call__(*args, **kwargs)
    File "/usr/lib/python2.7/site-packages/pulp/server/async/tasks.py", line 103, in __call__
      return super(PulpTask, self).__call__(*args, **kwargs)
    File "/usr/lib/python2.7/site-packages/celery/app/trace.py", line 437, in __protected_call__
      return self.run(*args, **kwargs)
    File "/usr/lib/python2.7/site-packages/pulp/server/managers/consumer/applicability.py", line 107, in regenerate_applicability_for_consumers
      manager.regenerate_applicability(profile_hash, content_type, profile_id, repo_id)
    File "/usr/lib/python2.7/site-packages/pulp/server/managers/consumer/applicability.py", line 272, in regenerate_applicability
      applicability)
    File "/usr/lib/python2.7/site-packages/pulp/server/managers/consumer/applicability.py", line 384, in create
      applicability.save()
    File "/usr/lib/python2.7/site-packages/pulp/server/db/model/consumer.py", line 234, in save
      self._id = self.get_collection().insert(new_document)
    File "/usr/lib64/python2.7/site-packages/pymongo/collection.py", line 2200, in insert
      check_keys, manipulate, write_concern)
    File "/usr/lib64/python2.7/site-packages/pymongo/collection.py", line 530, in _insert
      check_keys, manipulate, write_concern, op_id, bypass_doc_val)
    File "/usr/lib64/python2.7/site-packages/pymongo/collection.py", line 513, in _insert_one
      _check_write_command_response([(0, result)])
    File "/usr/lib64/python2.7/site-packages/pymongo/helpers.py", line 300, in _check_write_command_response
      raise DuplicateKeyError(error.get("errmsg"), 11000, error)
  DuplicateKeyError: insertDocument :: caused by :: 11000 E11000 duplicate key error index: pulp_database.repo_profile_applicability.$profile_hash_-1_repo_id_-1 dup key: { : "29989a2ea7c148716ba750976ca1229628d4f76b45edd3d8378934a6ebc9a135", : "OrgName-Red_Hat_Software_Collections__for_RHEL_Server_-Red_Hat_Software_Collections_RPMs_for_Red_Hat_Enterprise_Linux_7_Server_x86_64_7Server" }

This seems to happen only occasionally, so I suspect it's a race condition. This bug causes tasks to hang around forever because of bz 1425159: if a user hits this bug, they can't cancel the task via the web UI due to 1425159, and we have to dive into the dynflow console to hit cancel.

Version-Release number of selected component (if applicable):
6.2.10

# rpm -qa | grep pulp | sort
pulp-admin-client-2.8.7.12-1.el7sat.noarch
pulp-docker-plugins-2.0.3.1-1.el7sat.noarch
pulp-katello-1.0.2-1.el7sat.noarch
pulp-puppet-plugins-2.8.7.1-1.el7sat.noarch
pulp-puppet-tools-2.8.7.1-1.el7sat.noarch
pulp-rpm-admin-extensions-2.8.7.14-2.el7sat.noarch
pulp-rpm-plugins-2.8.7.14-2.el7sat.noarch
pulp-rpm-yumplugins-2.8.7.14-2.el7sat.noarch
pulp-selinux-2.8.7.12-1.el7sat.noarch
pulp-server-2.8.7.12-1.el7sat.noarch
python-pulp-bindings-2.8.7.12-1.el7sat.noarch
python-pulp-client-lib-2.8.7.12-1.el7sat.noarch
python-pulp-common-2.8.7.12-1.el7sat.noarch
python-pulp-docker-common-2.0.3.1-1.el7sat.noarch
python-pulp-oid_validation-2.8.7.12-1.el7sat.noarch
python-pulp-puppet-common-2.8.7.1-1.el7sat.noarch
python-pulp-repoauth-2.8.7.12-1.el7sat.noarch
python-pulp-rpm-common-2.8.7.14-2.el7sat.noarch
python-pulp-streamer-2.8.7.12-1.el7sat.noarch
rubygem-smart_proxy_pulp-1.2.2-1.el7sat.noarch

How reproducible:
rarely
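For illustration, the suspected race is the classic check-then-insert pattern: two applicability tasks both observe that no RepoProfileApplicability document exists for a given (profile_hash, repo_id) pair and both try to insert one, so the unique index rejects the second insert with E11000. A minimal, self-contained sketch of the failure and the insert-then-catch fix (plain Python standing in for MongoDB's unique index; `DuplicateKeyError`, `FakeCollection`, and `regenerate_fixed` here are illustrative names, not Pulp's API):

```python
import threading


class DuplicateKeyError(Exception):
    """Stand-in for pymongo.errors.DuplicateKeyError (E11000)."""


class FakeCollection:
    """Dict-backed stand-in for a Mongo collection with a unique index."""

    def __init__(self):
        self._docs = {}
        self._lock = threading.Lock()

    def find_one(self, key):
        return self._docs.get(key)

    def insert(self, key, doc):
        # Enforce uniqueness on the key, like the
        # profile_hash_-1_repo_id_-1 index does.
        with self._lock:
            if key in self._docs:
                raise DuplicateKeyError("E11000 duplicate key error")
            self._docs[key] = dict(doc)

    def update(self, key, doc):
        with self._lock:
            self._docs[key] = dict(doc)


def regenerate_fixed(coll, key, applicability):
    # Pattern used by the fix: attempt the insert, and fall back to an
    # update when the unique index reports the document already exists.
    try:
        coll.insert(key, applicability)
    except DuplicateKeyError:
        coll.update(key, applicability)


coll = FakeCollection()
key = ("profile-hash", "repo-id")

# The race: worker A's find_one() saw "missing", but worker B's insert
# landed first, so A's naive insert hits the unique index and the whole
# GenerateApplicability task fails.
coll.insert(key, {"applicability": "from worker B"})
try:
    coll.insert(key, {"applicability": "from worker A"})
except DuplicateKeyError as exc:
    print("naive insert failed:", exc)

# The fixed path absorbs the conflict instead of failing the task.
regenerate_fixed(coll, key, {"applicability": "from worker A"})
print(coll.find_one(key))
```

The key point is that the existence check and the insert are not atomic, so the only safe place to decide "create vs. update" is after the insert attempt itself.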
The Pulp upstream bug status is at NEW. Updating the external tracker on this bug.
The Pulp upstream bug priority is at Normal. Updating the external tracker on this bug.
The Pulp upstream bug priority is at High. Updating the external tracker on this bug.
The Pulp upstream bug status is at ASSIGNED. Updating the external tracker on this bug.
The Pulp upstream bug status is at POST. Updating the external tracker on this bug.
The Pulp upstream bug status is at MODIFIED. Updating the external tracker on this bug.
All upstream Pulp bugs are at MODIFIED+. Moving this bug to POST.
*** Bug 1475894 has been marked as a duplicate of this bug. ***
The Pulp upstream bug status is at ON_QA. Updating the external tracker on this bug.
The Pulp upstream bug status is at CLOSED - CURRENTRELEASE. Updating the external tracker on this bug.
*** Bug 1488062 has been marked as a duplicate of this bug. ***
Created attachment 1325504 [details] pulp-server hotfix RPM for 6.2.11 (el7)
Created attachment 1325505 [details] pulp-server hotfix RPM for 6.2.11 (el6)
Created attachment 1326020 [details] pulp-server hotfix RPM for 6.2.11 (el7)
NOT supported patch/fix for 6.2.12:

cd /usr/lib/python2.7/site-packages
wget https://github.com/pulp/pulp/commit/7a90d861d6d63bd8e6f17c03ca095f768127758c.patch
cat 7a90d861d6d63bd8e6f17c03ca095f768127758c.patch | patch -p2
for i in pulp_resource_manager pulp_workers pulp_celerybeat; do service $i restart; done
Created attachment 1350777 [details] pulp-server hotfix RPM for 6.2.12 (el7)
Created attachment 1350782 [details] pulp-server hotfix RPM for 6.2.12 (el6)
Generic hotfix RPMs for this BZ to fix the issue in 6.2.12 have been attached.
*** HOTFIX INSTRUCTIONS ***

1) Download pulp-server RPM from bug for either EL6 or EL7
2) katello-service stop
3) rpm -Uvh pulp-server-2.8.7.15-2.HOTFIXRHBZ1468022.el7sat.noarch.rpm
4) katello-service start
I've installed that hotfix. katello-service start went smoothly, although I'm still seeing a lot of warning messages for Tomcat and non-fatal error messages for Squid. The latter should be fairly easy to fix by just removing those two obsolete directives, but that's beside the point of this BZ. Thank you for the hotfix. Will there be one for 6.2.13?
Hit same issue on upgrading 6.2.12 -> 6.3 snap25

Nov 17 16:00:01 qe-sat6-upgrade-rhel7 pulp: celery.worker.job:ERROR: (4270-61088) Traceback (most recent call last):
Nov 17 16:00:01 qe-sat6-upgrade-rhel7 pulp: celery.worker.job:ERROR: (4270-61088)   File "/usr/lib/python2.7/site-packages/celery/app/trace.py", line 240, in trace_task
Nov 17 16:00:01 qe-sat6-upgrade-rhel7 pulp: celery.worker.job:ERROR: (4270-61088)     R = retval = fun(*args, **kwargs)
Nov 17 16:00:01 qe-sat6-upgrade-rhel7 pulp: celery.worker.job:ERROR: (4270-61088)   File "/usr/lib/python2.7/site-packages/pulp/server/async/tasks.py", line 473, in __call__
Nov 17 16:00:01 qe-sat6-upgrade-rhel7 pulp: celery.worker.job:ERROR: (4270-61088)     return super(Task, self).__call__(*args, **kwargs)
Nov 17 16:00:01 qe-sat6-upgrade-rhel7 pulp: celery.worker.job:ERROR: (4270-61088)   File "/usr/lib/python2.7/site-packages/pulp/server/async/tasks.py", line 103, in __call__
Nov 17 16:00:01 qe-sat6-upgrade-rhel7 pulp: celery.worker.job:ERROR: (4270-61088)     return super(PulpTask, self).__call__(*args, **kwargs)
Nov 17 16:00:01 qe-sat6-upgrade-rhel7 pulp: celery.worker.job:ERROR: (4270-61088)   File "/usr/lib/python2.7/site-packages/celery/app/trace.py", line 437, in __protected_call__
Nov 17 16:00:01 qe-sat6-upgrade-rhel7 pulp: celery.worker.job:ERROR: (4270-61088)     return self.run(*args, **kwargs)
Nov 17 16:00:01 qe-sat6-upgrade-rhel7 pulp: celery.worker.job:ERROR: (4270-61088)   File "/usr/lib/python2.7/site-packages/pulp/server/managers/consumer/applicability.py", line 107, in regenerate_applicability_for_consumers
Nov 17 16:00:01 qe-sat6-upgrade-rhel7 pulp: celery.worker.job:ERROR: (4270-61088)     manager.regenerate_applicability(profile_hash, content_type, profile_id, repo_id)
Nov 17 16:00:01 qe-sat6-upgrade-rhel7 pulp: celery.worker.job:ERROR: (4270-61088)   File "/usr/lib/python2.7/site-packages/pulp/server/managers/consumer/applicability.py", line 272, in regenerate_applicability
Nov 17 16:00:01 qe-sat6-upgrade-rhel7 pulp: celery.worker.job:ERROR: (4270-61088)     applicability)
Nov 17 16:00:01 qe-sat6-upgrade-rhel7 pulp: celery.worker.job:ERROR: (4270-61088)   File "/usr/lib/python2.7/site-packages/pulp/server/managers/consumer/applicability.py", line 384, in create
Nov 17 16:00:01 qe-sat6-upgrade-rhel7 pulp: celery.worker.job:ERROR: (4270-61088)     applicability.save()
Nov 17 16:00:01 qe-sat6-upgrade-rhel7 pulp: celery.worker.job:ERROR: (4270-61088)   File "/usr/lib/python2.7/site-packages/pulp/server/db/model/consumer.py", line 234, in save
Nov 17 16:00:01 qe-sat6-upgrade-rhel7 pulp: celery.worker.job:ERROR: (4270-61088)     self._id = self.get_collection().insert(new_document)
Nov 17 16:00:01 qe-sat6-upgrade-rhel7 pulp: celery.worker.job:ERROR: (4270-61088)   File "/usr/lib64/python2.7/site-packages/pymongo/collection.py", line 2200, in insert
Nov 17 16:00:01 qe-sat6-upgrade-rhel7 pulp: celery.worker.job:ERROR: (4270-61088)     check_keys, manipulate, write_concern)
Nov 17 16:00:01 qe-sat6-upgrade-rhel7 pulp: celery.worker.job:ERROR: (4270-61088)   File "/usr/lib64/python2.7/site-packages/pymongo/collection.py", line 530, in _insert
Nov 17 16:00:01 qe-sat6-upgrade-rhel7 pulp: celery.worker.job:ERROR: (4270-61088)     check_keys, manipulate, write_concern, op_id, bypass_doc_val)
Nov 17 16:00:01 qe-sat6-upgrade-rhel7 pulp: celery.worker.job:ERROR: (4270-61088)   File "/usr/lib64/python2.7/site-packages/pymongo/collection.py", line 513, in _insert_one
Nov 17 16:00:01 qe-sat6-upgrade-rhel7 pulp: celery.worker.job:ERROR: (4270-61088)     _check_write_command_response([(0, result)])
Nov 17 16:00:01 qe-sat6-upgrade-rhel7 pulp: celery.worker.job:ERROR: (4270-61088)   File "/usr/lib64/python2.7/site-packages/pymongo/helpers.py", line 300, in _check_write_command_response
Nov 17 16:00:01 qe-sat6-upgrade-rhel7 pulp: celery.worker.job:ERROR: (4270-61088)     raise DuplicateKeyError(error.get("errmsg"), 11000, error)
Nov 17 16:00:01 qe-sat6-upgrade-rhel7 pulp: celery.worker.job:ERROR: (4270-61088) DuplicateKeyError: insertDocument :: caused by :: 11000 E11000 duplicate key error index: pulp_database.repo_profile_applicability.$profile_hash_-1_repo_id_-1 dup key: { : "0b583821e55e8f92bb3c7b932740bb02b0819726d7c2dcfb835a4c44b226ed21", : "Default_Organization-Red_Hat_Enterprise_Linux_Server-Red_Hat_Enterprise_Linux_6_Server_RPMs_x86_64_6Server" }
This is actually part of pulp 2.13.4; moving to ON_DEV.
qa_notes: 6.2.13-1:
- Was able to safely reproduce the DuplicateKeyError by creating 2 mock consumers and uploading the package profiles for them while manually adding a sleep(30) to the applicability profile creation branch (to widen the window in which both tasks end up trying to create the profile).
qa_notes: waiting for https://bugzilla.redhat.com/show_bug.cgi?id=1446712 since it blocks my reproducer. Anyway, the fix looks sane to me and I believe the DuplicateKeyError will be properly handled now.
Will/can hotfix RPMs be made for 6.2.13 as well? Or will the ones made previously for 6.2.12 work for 6.2.13 as well? Thanks.
(In reply to Matt from comment #60)
> Will/can hotfix RPMs be made for 6.2.13 as well? Or will the ones made
> previously for 6.2.12 work for 6.2.13 as well?
>
> Thanks.

I don't know of plans to release a hotfix for 6.2.13. Applying the 6.2.12 hotfix to 6.2.13 will 1) hit package version requirement problems, and 2) revert another fix related to reconnection to mongo.

Unofficial, unsupported patch that works for me even on 6.2.13 (again, it is unsupported):

# cat ~/bz1468022.sat6213.patch
--- pulp-server-2.8.7.17-1/usr/lib/python2.7/site-packages/pulp/server/managers/consumer/applicability.py	2017-12-30 09:08:52.523887747 +0100
+++ pulp-server-2.8.7.15-2.HOTFIXRHBZ1468022/usr/lib/python2.7/site-packages/pulp/server/managers/consumer/applicability.py	2017-12-30 09:04:09.523498007 +0100
@@ -7,6 +7,7 @@
 from logging import getLogger
 from uuid import uuid4
 from celery import task
+from pymongo.errors import DuplicateKeyError
 
 from pulp.plugins.conduits.profiler import ProfilerConduit
 from pulp.plugins.config import PluginCallConfiguration
@@ -260,16 +261,20 @@ class ApplicabilityRegenerationManager(o
             _logger.debug(msg)
             return
 
-        if existing_applicability:
-            # Update existing applicability object
-            existing_applicability.applicability = applicability
-            existing_applicability.save()
-        else:
+        try:
             # Create a new RepoProfileApplicability object and save it in the db
             RepoProfileApplicability.objects.create(profile_hash,
                                                     bound_repo_id,
-                                                    unit_profile['profile'],
+                                                    profile,
                                                     applicability)
+        except DuplicateKeyError:
+            # Update existing applicability
+            if not existing_applicability:
+                applicability_dict = RepoProfileApplicability.get_collection().find_one(
+                    {'repo_id': bound_repo_id, 'profile_hash': profile_hash})
+                existing_applicability = RepoProfileApplicability(**applicability_dict)
+            existing_applicability.applicability = applicability
+            existing_applicability.save()
 
     @staticmethod
     def _get_existing_repo_content_types(repo_id):
@@ -456,6 +461,8 @@ class RepoProfileApplicabilityManager(ob
     # Remove all RepoProfileApplicability objects that reference these profile hashes
     if missing_profile_hashes:
         rpa_collection.remove({'profile_hash': {'$in': missing_profile_hashes}})
+
+# Instantiate one of the managers on the object it manages for convenience
 RepoProfileApplicability.objects = RepoProfileApplicabilityManager()

# cd /
# cat ~/bz1468022.sat6213.patch | patch -p1
patching file usr/lib/python2.7/site-packages/pulp/server/managers/consumer/applicability.py

# and to apply the patch:
for i in pulp_resource_manager pulp_workers pulp_celerybeat; do service $i restart; done

(but again, this isn't officially supported)
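For completeness: the same conflict can also be resolved with a single upsert instead of insert-then-catch. The sketch below uses an illustrative in-memory stand-in (`FakeApplicabilityCollection` is not Pulp's API); with real pymongo the equivalent call is `collection.update_one(filter, {'$set': doc}, upsert=True)`, with the caveat that concurrent upserts against a unique index can themselves still raise DuplicateKeyError and may need one retry:

```python
class FakeApplicabilityCollection:
    """In-memory stand-in keyed on (profile_hash, repo_id)."""

    def __init__(self):
        self._docs = {}

    def update_one(self, profile_hash, repo_id, applicability, upsert=False):
        # Update the document in place if it exists; otherwise create it
        # when upsert is requested. Neither branch can raise a
        # duplicate-key error, unlike a bare insert.
        key = (profile_hash, repo_id)
        if key in self._docs:
            self._docs[key]["applicability"] = applicability
        elif upsert:
            self._docs[key] = {"profile_hash": profile_hash,
                               "repo_id": repo_id,
                               "applicability": applicability}

    def find_one(self, profile_hash, repo_id):
        return self._docs.get((profile_hash, repo_id))


coll = FakeApplicabilityCollection()
# First writer creates the document, second writer overwrites in place.
coll.update_one("abc123", "repo-1", {"rpm": ["pkg-1"]}, upsert=True)
coll.update_one("abc123", "repo-1", {"rpm": ["pkg-2"]}, upsert=True)
print(coll.find_one("abc123", "repo-1")["applicability"])
```

The insert-then-catch approach in the patch has the advantage of keeping the existing `create()`/`save()` code paths intact, which matters for a minimal hotfix.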
*** Bug 1530687 has been marked as a duplicate of this bug. ***
VERIFIED on sat6.3.0-33
- had to disable streamer log blacklisting (pulp/server/logs.py)
- had to introduce some time.sleep() into the `regenerate_applicability` method in order to have enough time to send out the package & repository profile from a different host
- had to add some _logger.info() messages to be sure my request ended up in the DuplicateKeyError branch
- was able to reliably invoke the duplicate-key scenario, which was now correctly handled by pulp
Did this fix make it into 6.2.14, and if not might there be a hotfix pending, please?
Hi Ben, it looks like this was cloned to bug 1515195 and should be included in 6.2.14 as part of that bugzilla.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:0336