Bug 734782

Summary: DuplicateKeyError: E11000 duplicate key error index
Product: Red Hat Update Infrastructure for Cloud Providers Reporter: James Slagle <jslagle>
Component: RHUAAssignee: John Matthews <jmatthew>
Status: CLOSED ERRATA QA Contact: mkovacik
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 2.0.1CC: igulina, jmatthew, kbidarka, sghai, tsanders
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Do not document
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-03-01 22:05:37 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 746803    
Attachments:
Description Flags
pulp logs none

Description James Slagle 2011-08-31 12:47:32 UTC
When kicking off a sync of all the repos for RHUA 2.0 using the issued CDN cert got the following for one of the repos:

DuplicateKeyError: E11000 duplicate key error index: pulp_database.packages.$name_-1_epoch_-1_version_-1_release_-1_arch_-1_filename_-1_checksum_-1  dup key: { : "selinux-policy-devel", : "0", : "2.4.6", : "255.el5_4.1", : "noarch", : "selinux-policy-devel-2.4.6-255.el5_4.1.noarch.rpm", : { sha256: "85d4a7922bb47963b7bb6f5f423bc601b299afad" } }
2011-08-30 20:13:17,047 1671:140266130777856: pulp.server.tasking.task:ERROR: task:380 Task failed: Task 41dd7142-d35c-11e0-813b-123139078252: _sync(rhel-server-5-5.7-x86_64, synchronizer=<pulp.server.api.synchronizers.YumSynchronizer object at 0x7f922807a590>, skip_dict={}, max_speed=None, threads=None, progress_callback=<bound method RepoSyncTask.progress_callback of <pulp.server.api.repo_sync_task.RepoSyncTask object at 0x7f923480af10>>)
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/pulp/server/tasking/task.py", line 329, in run
    result = self.callable(*self.args, **self.kwargs)
  File "/usr/lib/python2.6/site-packages/pulp/server/api/repo_sync.py", line 225, in _sync
    progress_callback, synchronizer, max_speed, threads)
  File "/usr/lib/python2.6/site-packages/pulp/server/api/repo_sync.py", line 308, in fetch_content
    added_packages = synchronizer.add_packages_from_dir(repo_dir, repo_id, skip_dict)
  File "/usr/lib/python2.6/site-packages/pulp/server/api/synchronizers.py", line 134, in add_packages_from_dir
    package = self.import_package(package, repo_id, repo_defined=True)
  File "/usr/lib/python2.6/site-packages/pulp/server/api/synchronizers.py", line 241, in import_package
    repo_defined=repo_defined)
  File "/usr/lib/python2.6/site-packages/pulp/server/api/package.py", line 54, in create
    self.collection.insert(p, safe=True)
  File "/usr/lib/python2.6/site-packages/pulp/server/db/connection.py", line 80, in retry
    return method(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/pymongo/collection.py", line 270, in insert
    check_keys, safe, kwargs), safe)
  File "/usr/lib64/python2.6/site-packages/pymongo/connection.py", line 642, in _send_message
    return self.__check_response_to_last_error(response)
  File "/usr/lib64/python2.6/site-packages/pymongo/connection.py", line 611, in __check_response_to_last_error
    raise DuplicateKeyError(error["err"])

Comment 1 James Slagle 2011-08-31 12:50:11 UTC
Created attachment 520814 [details]
pulp logs

Attaching tarball of pulp logs on the system.

Comment 2 James Slagle 2011-08-31 13:12:42 UTC
It's possible this could be a race condition as multiple repos were sync'ing at the same time.  I'm trying a resync of the affected repo to see if it goes away.

Comment 3 James Slagle 2011-08-31 13:14:19 UTC
A resync did indeed fix it

Comment 4 John Matthews 2011-09-30 18:11:30 UTC
Agree this is a race condition, proposed fix is to catch the DuplicateKeyError and recover from that.

Comment 5 John Matthews 2011-10-06 16:37:40 UTC
We were already catching a DuplicateKeyError, then looking up the package and continuing as long as it was found.

The situation we are seeing in this bz is that:
 1) We tried to create a package and it raised a DuplicateKeyError
 2) We looked up the package and did _not_ find a match

This is confusing, the exception tells us the package already exists, looking it up should have returned a match.

Below is more data from logs about what we are seeing:

2011-08-30 20:12:53,944 1671:140266130777856: pulp.server.api.synchronizers:ERROR: synchronizers:253 E11000 duplicate key error index: pulp_database.packages.$name_-1_epoch_-1_version_-1_release_-1_arch_-1_filename_-1_checksum_-1  dup key: { : "selinux-policy-devel", : "0", : "2.4.6", : "255.el5_4.1", : "noarch", : "selinux-policy-devel-2.4.6-255.el5_4.1.noarch.rpm", : { sha256: "85d4a7922bb47963b7bb6f5f423bc601b299afad" } }
2011-08-30 20:12:53,975 1671:140266130777856: pulp.server.api.synchronizers:ERROR: synchronizers:254 Caught DuplicateKeyError yet we didn't find a matching package in database
2011-08-30 20:12:53,975 1671:140266130777856: pulp.server.api.synchronizers:ERROR: synchronizers:256 Originally tried to create: name=selinux-policy-devel, epoch=0, version=2.4.6, arch=noarch, hashtype=sha256, checksum=85d4a7922bb47963b7bb6f5f423bc601b299afad, file_name=selinux-policy-devel-2.4.6-255.el5_4.1.noarch.rpm

Comment 6 John Matthews 2011-10-07 21:13:52 UTC
Added retry logic to rhui branch with below commit

http://git.fedorahosted.org/git/?p=pulp.git;a=commitdiff;h=5b27362736ead4321e33eb9376552551f824487d

My thinking is that we should not rely on DuplicateKeyError exception to signal that the entry has completed being added to MongoDB.  The thought of using a series of retries, where Pulp waits a random amount of time from 1 to 10 seconds on each retry is being done to work around any timing issues of a write being started so DuplicateKeyError is thrown and it completing.

Comment 7 John Matthews 2011-10-10 17:26:04 UTC
Here is a link to our question on mongodb-users about this issue

http://groups.google.com/group/mongodb-user/browse_thread/thread/39d115a6ff44f719/6b0ccd6b644835a7#6b0ccd6b644835a7

Comment 8 wes hayutin 2011-10-17 20:01:52 UTC
set tracker bug. 746803

Comment 10 Sachin Ghai 2011-10-21 11:28:25 UTC
Verified with following RHUI ISO:

RHEL-6.1-RHUI-2.0.1-20111017.0-Server-x86_64-DVD1.iso

I started a sync of RHUI2.0 repo and sync other repos too. However the reported is issue no longer reproducible now.


Red Hat Repositories
  Red Hat Update Infrastructure 2.0 (RPMs) (6.0-i386)
  Red Hat Update Infrastructure 2.0 (RPMs) (6.1-i386)
  Red Hat Update Infrastructure 2.0 (RPMs) (6Server-i386)
  Red Hat Update Infrastructure 2.0 (RPMs) (6.0-x86_64)
  Red Hat Update Infrastructure 2.0 (RPMs) (6Server-x86_64)
  Red Hat Update Infrastructure 2.0 (RPMs) (6.1-x86_64)
  Red Hat Enterprise Linux Server 6 (RPMs) (6.0-i386)
  Red Hat Enterprise Linux Server 6 (RPMs) (6.0-x86_64)
  Red Hat Enterprise Linux Server 6 (RPMs) (6Server-x86_64)
  Red Hat Enterprise Linux Server 6 (RPMs) (6.1-x86_64)
  Red Hat Enterprise Linux Server 6 (RPMs) (6Server-i386)
  Red Hat Enterprise Linux Server 6 (RPMs) (6.1-i386)
  Red Hat Enterprise Linux Server 5 (RPMs) (5.6-i386)
  Red Hat Enterprise Linux Server 5 (RPMs) (5.7-i386)
  Red Hat Enterprise Linux Server 5 (RPMs) (5.6-x86_64)
  Red Hat Enterprise Linux Server 5 (RPMs) (5.7-x86_64)
  Red Hat Enterprise Linux Server 6 Optional (RPMs) (6.0-i386)
  Red Hat Enterprise Linux Server 6 Optional (RPMs) (6.0-x86_64)
  Red Hat Enterprise Linux Server 6 Optional (RPMs) (6Server-i386)
  Red Hat Enterprise Linux Server 6 Optional (RPMs) (6.1-x86_64)
  Red Hat Enterprise Linux Server 6 Optional (RPMs) (6Server-x86_64)
  Red Hat Enterprise Linux Server 6 Optional (RPMs) (6.1-i386)
  Red Hat Enterprise Linux Server 5 (RPMs) (5Server-i386)
  Red Hat Enterprise Linux Server 5 (RPMs) (5Server-x86_64)

Comment 11 James Slagle 2011-10-31 19:16:41 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Do not document

Comment 15 errata-xmlrpc 2017-03-01 22:05:37 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:0367