Bug 1991557 - Many Postgres ERRORs (duplicate key) especially on RedHat repo sync
Summary: Many Postgres ERRORs (duplicate key) especially on RedHat repo sync
Keywords:
Status: MODIFIED
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Repositories
Version: 6.9.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: Unspecified
Assignee: Justin Sherrill
QA Contact: Cole Higgins
URL:
Whiteboard:
Depends On:
Blocks: 1957813
Reported: 2021-08-09 12:23 UTC by Brad Buckingham
Modified: 2022-04-07 14:25 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:



Links
System ID Private Priority Status Summary Last Updated
Foreman Issue Tracker 33451 0 Normal Closed Many Postgres ERRORs (duplicate key) especially on RedHat repo sync 2022-02-11 20:09:23 UTC

Description Brad Buckingham 2021-08-09 12:23:34 UTC
Description of problem:

As part of the Satellite 6.9 Pulp 3 High-Touch Beta, several duplicate key errors were observed on Red Hat repository syncs.

During this HTB, a Satellite 6.9 instance was migrated to Pulp 3, the switchover to Pulp 3 was performed, and content management workflows were executed.

Version-Release number of selected component (if applicable):
6.9.1

Errors observed:

The errors on Thursday occurred between 03:00 and 04:00, which was the first time I ran the Red Hat repositories sync after the Pulp 3 switchover.

To give an idea of the tables and errors involved, I summarized the count per hour and error:

root@hostname:/var/opt/rh/rh-postgresql12/lib/pgsql/data/log# grep 'UTC ERROR:' *.log | cut -c-1000 | sed 's/\(2021-.* [0-9].:\).*UTC/\1/' | sort | uniq -c
      2 postgresql-Mon.log:2021-06-28 08: ERROR:  duplicate key value violates unique constraint "index_katello_rpms_on_pulp_id"
    843 postgresql-Thu.log:2021-07-01 03: ERROR:  duplicate key value violates unique constraint "index_katello_errata_on_pulp_id"
   2111 postgresql-Thu.log:2021-07-01 03: ERROR:  duplicate key value violates unique constraint "index_katello_erratum_cves_on_erratum_id_and_cve_id_and_href"
   3990 postgresql-Thu.log:2021-07-01 03: ERROR:  duplicate key value violates unique constraint "katello_erratum_bz_eid_bid_href"
  18171 postgresql-Thu.log:2021-07-01 03: ERROR:  duplicate key value violates unique constraint "katello_erratum_packages_eid_nvrea_n_f"
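
For anyone who wants the same per-hour summary without the grep/sed pipeline, here is a minimal Python sketch. It assumes each log line starts with a timestamp of the form "2021-07-01 03:12:45 UTC" followed by "ERROR:", which is inferred from the sed expression above and not confirmed against the actual log_line_prefix setting. The function name summarize and the sample lines are illustrative only.

```python
import re
from collections import Counter

# Assumed line shape (inferred from the sed pattern, not confirmed):
#   2021-07-01 03:12:45 UTC ERROR:  duplicate key value violates ...
LINE_RE = re.compile(
    r'^(\d{4}-\d{2}-\d{2} \d{2}):\d{2}:\d{2}.*? UTC ERROR:\s+(.*)$'
)

def summarize(lines):
    """Count ERROR messages per (date and hour, message) pair."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            hour, message = match.group(1), match.group(2)
            # Truncate long messages, like `cut -c-1000` in the pipeline.
            counts[(hour, message[:1000])] += 1
    return counts

# Illustrative sample lines, not taken from the real logs.
sample = [
    '2021-07-01 03:00:01 UTC ERROR:  duplicate key value violates unique constraint "index_katello_errata_on_pulp_id"',
    '2021-07-01 03:59:59 UTC ERROR:  duplicate key value violates unique constraint "index_katello_errata_on_pulp_id"',
    '2021-07-01 03:10:00 UTC DETAIL:  Key (pulp_id)=(x) already exists.',
    '2021-06-28 08:15:00 UTC ERROR:  duplicate key value violates unique constraint "index_katello_rpms_on_pulp_id"',
]
counts = summarize(sample)
for (hour, message), n in sorted(counts.items()):
    print(n, hour, message)
```

To run it against real files, pass an open file object (or several chained together) to summarize instead of the sample list.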

Additional info:

https://bugzilla.redhat.com/show_bug.cgi?id=1991527 - contains additional errors reported

See https://bugzilla.redhat.com/show_bug.cgi?id=1991527#c3

Comment 2 Justin Sherrill 2021-09-01 19:57:33 UTC
It looks like these errors appeared in the logs during sync. This is actually expected when syncing multiple repos at the same time. Rails now supports upsert and upsert_all (as of Rails 6.0), which should allow us to avoid this and possibly improve performance as well. As long as the actual sync didn't fail (which it doesn't seem to have), I don't consider this a regression. It should, however, be a small change we can make in 6.10.
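
To illustrate the difference between a plain insert and an upsert, here is a minimal sketch at the SQL level. It is not the Katello code and does not use Rails: it uses Python's sqlite3 module as a stand-in for PostgreSQL (the ON CONFLICT clause shown is accepted by both, and needs SQLite 3.24 or later). The table rpms, its columns, and the helper names are hypothetical simplifications.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified table with a unique index on pulp_id.
conn.execute(
    "CREATE TABLE rpms (id INTEGER PRIMARY KEY, pulp_id TEXT NOT NULL, name TEXT)"
)
conn.execute("CREATE UNIQUE INDEX index_rpms_on_pulp_id ON rpms (pulp_id)")

def plain_insert(pulp_id, name):
    # Fails with a uniqueness error if another writer inserted pulp_id first.
    conn.execute("INSERT INTO rpms (pulp_id, name) VALUES (?, ?)", (pulp_id, name))

def upsert(pulp_id, name):
    # On a conflicting pulp_id, update the existing row instead of erroring.
    conn.execute(
        "INSERT INTO rpms (pulp_id, name) VALUES (?, ?) "
        "ON CONFLICT (pulp_id) DO UPDATE SET name = excluded.name",
        (pulp_id, name),
    )

plain_insert("abc", "bash")
try:
    plain_insert("abc", "bash")  # second writer indexing the same unit
    duplicate_raised = False
except sqlite3.IntegrityError:
    duplicate_raised = True      # the analogue of the "duplicate key" ERROR

upsert("abc", "bash-5.1")        # no error, existing row is updated
rows = conn.execute("SELECT pulp_id, name FROM rpms").fetchall()
```

With the plain insert, the database reports an error for the losing writer even when the application rescues it, which is why the messages show up in the Postgres log. With the upsert, the conflict is resolved inside the statement and nothing is logged as an error.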

Comment 3 Justin Sherrill 2021-09-09 20:30:02 UTC
Created redmine issue https://projects.theforeman.org/issues/33451 from this bug

Comment 4 Bryan Kearney 2021-09-10 00:05:21 UTC
Upstream bug assigned to jsherril@redhat.com

Comment 5 Bryan Kearney 2021-09-10 00:05:23 UTC
Upstream bug assigned to jsherril@redhat.com

Comment 6 Justin Sherrill 2021-09-10 11:38:44 UTC
After some investigation, it turns out some tables are missing uniqueness constraints, which are probably too risky to add with migrations at this point in 6.10. I'm proposing we move this to 7.0. Since this isn't a regression from 6.8/6.9 or earlier, I don't see it as a big issue.
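
The comment above does not say why adding the constraints is risky. One plausible reason, stated here as an assumption, is that a table which never had a unique index may already contain duplicate rows, in which case a migration that creates the index fails. A minimal sketch of checking for that up front, again using Python's sqlite3 as a stand-in for PostgreSQL, with a hypothetical table erratum_cves and a hypothetical helper duplicate_keys:

```python
import sqlite3

def duplicate_keys(conn, table, columns):
    """Return (key columns..., count) for every key that occurs more than once.

    Table and column names are interpolated directly, so pass trusted
    identifiers only.
    """
    cols = ", ".join(columns)
    sql = (
        f"SELECT {cols}, COUNT(*) FROM {table} "
        f"GROUP BY {cols} HAVING COUNT(*) > 1"
    )
    return conn.execute(sql).fetchall()

conn = sqlite3.connect(":memory:")
# Hypothetical table that was created without a uniqueness constraint.
conn.execute("CREATE TABLE erratum_cves (erratum_id INTEGER, cve_id TEXT, href TEXT)")
conn.executemany(
    "INSERT INTO erratum_cves VALUES (?, ?, ?)",
    [(1, "CVE-A", "h1"), (1, "CVE-A", "h1"), (2, "CVE-B", "h2")],
)

key = ["erratum_id", "cve_id", "href"]
dups = duplicate_keys(conn, "erratum_cves", key)

try:
    conn.execute(
        "CREATE UNIQUE INDEX index_erratum_cves_unique "
        "ON erratum_cves (erratum_id, cve_id, href)"
    )
    index_created = True
except sqlite3.DatabaseError:
    # Existing duplicates make the index creation fail.
    index_created = False
```

Under that assumption, a migration would have to deduplicate the affected rows before creating the index, and an upsert can only target a conflict on columns that have such an index.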

Comment 7 Peter Vreman 2021-09-14 08:54:07 UTC
Some input from a user's point of view:

For me the errors did not impact the functionality.

The minor impact:
- The Postgres log grows in size.
- Many red herrings in the log might hide real errors and distract during troubleshooting.

Still, it would be good to have this documented as expected in 6.10, with improvements on the roadmap for 7.x.

Comment 11 Bryan Kearney 2022-02-17 16:05:02 UTC
Moving this bug to POST for triage into Satellite since the upstream issue https://projects.theforeman.org/issues/33451 has been resolved.

Comment 14 Justin Sherrill 2022-04-06 19:56:07 UTC
I would recommend punting this to 6.12. The change has already been merged upstream, but it was a very large change and presents a good bit of risk. Cherry-picking it back to 6.11 would not be advisable. However, since it's merged upstream, it will automatically appear in 6.12 and have more testing/baking done.

