Bug 2247864

Summary: Syncing from https://galaxy.ansible.com/ failed with "get() returned more than one Collection -- it returned 2!"
Product: Red Hat Satellite
Reporter: matt jia <mjia>
Component: Pulp
Assignee: satellite6-bugs <satellite6-bugs>
Status: CLOSED DUPLICATE
QA Contact: Satellite QE Team <sat-qe-bz-list>
Severity: medium
Docs Contact:
Priority: high
Version: 6.13.4
CC: ahumbe, dalley, dkliban, egolov, gformisa, ggainey, hakon.gislason, hyu, iballou, lmjachky, pmoravec, rchan, redhat, rlavi
Target Milestone: Unspecified   
Target Release: Unused   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2024-01-12 14:56:20 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description matt jia 2023-11-03 23:18:15 UTC
Description of problem:

Here is the traceback:

error:
    traceback: |2
        File "/usr/lib/python3.9/site-packages/pulpcore/tasking/pulpcore_worker.py", line 452, in _perform_task
          result = func(*args, **kwargs)
        File "/usr/lib/python3.9/site-packages/pulp_ansible/app/tasks/collections.py", line 180, in sync
          repo_version = d_version.create()
        File "/usr/lib/python3.9/site-packages/pulpcore/plugin/stages/declarative_version.py", line 161, in create
          loop.run_until_complete(pipeline)
        File "/usr/lib64/python3.9/asyncio/base_events.py", line 647, in run_until_complete
          return future.result()
        File "/usr/lib/python3.9/site-packages/pulpcore/plugin/stages/api.py", line 225, in create_pipeline
          await asyncio.gather(*futures)
        File "/usr/lib/python3.9/site-packages/pulpcore/plugin/stages/api.py", line 43, in __call__
          await self.run()
        File "/usr/lib/python3.9/site-packages/pulpcore/plugin/stages/content_stages.py", line 198, in run
          await sync_to_async(process_batch)()
        File "/usr/lib/python3.9/site-packages/asgiref/sync.py", line 435, in __call__
          ret = await asyncio.wait_for(future, timeout=None)
        File "/usr/lib64/python3.9/asyncio/tasks.py", line 442, in wait_for
          return await fut
        File "/usr/lib64/python3.9/concurrent/futures/thread.py", line 58, in run
          result = self.fn(*self.args, **self.kwargs)
        File "/usr/lib/python3.9/site-packages/asgiref/sync.py", line 476, in thread_handler
          return func(*args, **kwargs)
        File "/usr/lib/python3.9/site-packages/pulpcore/plugin/stages/content_stages.py", line 106, in process_batch
          self._pre_save(batch)
        File "/usr/lib/python3.9/site-packages/pulp_ansible/app/tasks/collections.py", line 1042, in _pre_save
          collection, created = Collection.objects.get_or_create(
        File "/usr/lib/python3.9/site-packages/django/db/models/manager.py", line 85, in manager_method
          return getattr(self.get_queryset(), name)(*args, **kwargs)
        File "/usr/lib/python3.9/site-packages/django/db/models/query.py", line 581, in get_or_create
          return self.get(**kwargs), False
        File "/usr/lib/python3.9/site-packages/django/db/models/query.py", line 439, in get
          raise self.model.MultipleObjectsReturned(
    description: get() returned more than one Collection -- it returned 2!

I'm not sure how the duplicates were created; it could be a race condition during the sync. Either way, it would be good to have a foreman-rake task to clean up the duplicates.
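
For readers unfamiliar with the failure mode in the traceback, here is a simplified pure-Python model of get-then-create semantics. It is not Django's or pulp_ansible's code: the `rows` list stands in for the table, and the `namespace`/`name` lookup fields are hypothetical, since the actual arguments are not visible in the traceback. It only illustrates that once two rows match the lookup, the "get" half raises before any "create" can happen.

```python
class MultipleObjectsReturned(Exception):
    """Stand-in for the Django exception of the same name."""


def get(rows, **lookup):
    """Return the single row matching the lookup (simplified model of get())."""
    matches = [r for r in rows if all(r.get(k) == v for k, v in lookup.items())]
    if len(matches) > 1:
        raise MultipleObjectsReturned(
            f"get() returned more than one Collection -- it returned {len(matches)}!"
        )
    if not matches:
        raise LookupError("no matching row")
    return matches[0]


def get_or_create(rows, **lookup):
    """Try get() first; create the row only when nothing matches."""
    try:
        return get(rows, **lookup), False
    except LookupError:
        row = dict(lookup)
        rows.append(row)
        return row, True
```

With a duplicated row already present, every later call with the same lookup fails in the same way, which would be consistent with the sync failing repeatedly until the duplicates are removed.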


Version-Release number of selected component (if applicable):


How reproducible:

Hard

Steps to Reproduce:

Sync all the collections from galaxy


Actual results:

Failed

Expected results:

Success

Additional info:

Comment 3 Adrian Gerth 2023-11-17 10:57:29 UTC
Could this be related to: https://community.theforeman.org/t/pulp3-ansible-collection-sync-error/35815 ?

Comment 4 Daniel Alley 2023-11-17 15:04:59 UTC
Matt, any chance this is a result of corruption during a (very early, before we fixed this issue) RHEL 7 -> RHEL 8 upgrade? Previously reported as https://bugzilla.redhat.com/show_bug.cgi?id=2161929

Comment 5 Adrian Gerth 2023-11-17 18:14:24 UTC
Daniel, I just had a look at the linked bugzilla and it appears to be the same issue:
```
[root@katello ~]# su -ls /usr/bin/bash -c 'reindexdb -a' postgres
load average: 0.57 1.38 0.69
load average: 0.57 1.38 0.69
/etc/profile: line 88: TMOUT: readonly variable
reindexdb: reindexing database "candlepin"
reindexdb: reindexing database "foreman"
reindexdb: error: reindexing of database "foreman" failed: ERROR:  could not create unique index "index_fact_names_on_name_and_type"
DETAIL:  Key (name, type)=(ssh::rsa::key, PuppetFactName) is duplicated.
```
I just wonder why it fails now, since this was never an issue with foreman 3.7 and katello 4.9.

Comment 6 Daniel Alley 2023-11-17 18:17:57 UTC
I don't know.  Maybe it's perfectly fine until the corrupted entries are re-accessed?

Comment 7 Adrian Gerth 2023-11-17 18:22:03 UTC
I did the in-place upgrade on foreman 3.3 and katello 4.5 (roughly 14 months ago). I'd think the entries in the database would have been accessed earlier, as the collections have been synced regularly.
Anyway, is there a proposed way forward?

Comment 8 Daniel Alley 2023-11-17 18:48:38 UTC
I think the duplicate entries just need to be manually cleaned up prior to performing the reindex operation, the same as with https://bugzilla.redhat.com/show_bug.cgi?id=2161929
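
As a rough illustration of "clean up the duplicates, then reindex", here is a minimal sketch using an in-memory SQLite database. The table `fact_names_demo` and index `idx_demo_name_type` are hypothetical stand-ins, not the real Foreman schema, and keeping the lowest `id` per key is an assumption about which row should survive. On a real PostgreSQL database, take a backup first and adapt the query to the affected table.

```python
import sqlite3


def make_demo_db():
    """Build a throwaway table containing one duplicated (name, type) pair."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE fact_names_demo (id INTEGER PRIMARY KEY, name TEXT, type TEXT)"
    )
    conn.executemany(
        "INSERT INTO fact_names_demo (name, type) VALUES (?, ?)",
        [
            ("ssh::rsa::key", "PuppetFactName"),
            ("ssh::rsa::key", "PuppetFactName"),  # the duplicate
            ("kernel", "PuppetFactName"),
        ],
    )
    conn.commit()
    return conn


def dedupe_and_index(conn):
    """Delete all but the lowest id per (name, type), then add the unique index."""
    conn.execute(
        """
        DELETE FROM fact_names_demo
        WHERE id NOT IN (
            SELECT MIN(id) FROM fact_names_demo GROUP BY name, type
        )
        """
    )
    conn.execute(
        "CREATE UNIQUE INDEX idx_demo_name_type ON fact_names_demo (name, type)"
    )
    conn.commit()
```

Once the unique index exists again, inserting another row with the same key is rejected by the database instead of silently creating a second copy.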

Comment 9 Adrian Gerth 2023-11-17 19:35:20 UTC
I just checked back on it and sure, the facts can be easily deleted, but sadly that is not where it stops. The next issues arise in 'katello_erratum_packages', whose rows are referenced multiple times from 'katello_module_stream_erratum_packages'. I'm really not sure that 'just deleting the duplicates' is a feasible way forward. Any thoughts?
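
To make the concern about referenced rows concrete, here is a minimal sketch of merging duplicates that other rows point to: child references are moved to the surviving row before the duplicate is deleted. The tables `parent_demo` and `child_demo` are hypothetical and run on in-memory SQLite; they are not the Katello schema, and nothing here has been verified against a real Satellite database.

```python
import sqlite3


def make_demo_db():
    """Parents 1 and 2 share a key; children reference both of them."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE parent_demo (id INTEGER PRIMARY KEY, key TEXT)")
    conn.execute(
        "CREATE TABLE child_demo ("
        "id INTEGER PRIMARY KEY, "
        "parent_id INTEGER REFERENCES parent_demo(id))"
    )
    conn.executemany(
        "INSERT INTO parent_demo (id, key) VALUES (?, ?)",
        [(1, "a"), (2, "a"), (3, "b")],
    )
    conn.executemany(
        "INSERT INTO child_demo (parent_id) VALUES (?)", [(1,), (2,), (2,), (3,)]
    )
    conn.commit()
    return conn


def merge_duplicates(conn):
    """Repoint children to the lowest-id parent per key, then drop the extras."""
    survivor = {}  # key -> id of the row that is kept
    remap = {}     # duplicate id -> surviving id
    rows = conn.execute("SELECT id, key FROM parent_demo ORDER BY id").fetchall()
    for row_id, key in rows:
        keep = survivor.setdefault(key, row_id)
        if keep != row_id:
            remap[row_id] = keep
    with conn:  # one transaction: commit on success, roll back on error
        for dup, keep in remap.items():
            conn.execute(
                "UPDATE child_demo SET parent_id = ? WHERE parent_id = ?", (keep, dup)
            )
            conn.execute("DELETE FROM parent_demo WHERE id = ?", (dup,))
    return remap
```

This only covers a single level of references. If the child table has its own unique constraints, repointing may create new collisions there, which is part of why a blanket "delete the duplicates" step can be harder than it sounds.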

Comment 10 Adrian Gerth 2023-11-17 19:39:28 UTC
Especially not if all we're talking about is issues with the content sync of ansible collections.

Comment 18 Daniel Alley 2024-01-12 14:56:20 UTC

*** This bug has been marked as a duplicate of bug 2161929 ***

Comment 19 Robin Chan 2024-01-15 17:05:52 UTC
The Pulp upstream bug is closed. Updating the external tracker on this bug.

Comment 20 Red Hat Bugzilla 2024-05-15 04:25:06 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days