Bug 1991527 - Many Postgres ERRORs (duplicate key) especailly on RedHat repo sync
Summary: Many Postgres ERRORs (duplicate key) especailly on RedHat repo sync
Keywords:
Status: NEW
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Documentation
Version: 6.9.0
Hardware: Unspecified
OS: Unspecified
unspecified
medium vote
Target Milestone: Unspecified
Assignee: Nobody
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1957813
TreeView+ depends on / blocked
 
Reported: 2021-08-09 11:04 UTC by Brad Buckingham
Modified: 2022-05-23 08:35 UTC (History)
4 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker SATDOC-625 0 None None None 2022-02-14 20:13:52 UTC

Description Brad Buckingham 2021-08-09 11:04:54 UTC
Description of problem:

As part of the Satellite 6.9 Pulp 3 High-Touch Beta, several duplicate key errors were observed on Red Hat repository syncs.

During this HTB, a Satellite 6.9 instance went through the process of migration to Pulp 3, a switchover to Pulp 3 was performed and content management workflows executed.

Version-Release number of selected component (if applicable):
6.9.1

Errors observed:

root@hostname:/var/opt/rh/rh-postgresql12/lib/pgsql/data/log# grep -c ERROR *.log
postgresql-Fri.log:39
postgresql-Mon.log:7
postgresql-Sat.log:0
postgresql-Sun.log:0
postgresql-Thu.log:35071
postgresql-Tue.log:5955
postgresql-Wed.log:33

The errors on Thursday are between 03:00am-04:00am that was the first time I was running the RedHat Repositories sync after the pulp3 switchover

To give an idea on the tables and error i summarized the count per hour and error

root@hostname:/var/opt/rh/rh-postgresql12/lib/pgsql/data/log# grep 'UTC ERROR:' *.log | cut -c-1000 | sed 's/\(2021-.* [0-9].:\).*UTC/\1/' | sort | uniq -c
     21 postgresql-Fri.log:2021-07-02 03: ERROR:  duplicate key value violates unique constraint "core_artifact_sha256_key"
      5 postgresql-Fri.log:2021-07-02 09: ERROR:  duplicate key value violates unique constraint "core_artifact_sha256_key"
      4 postgresql-Fri.log:2021-07-02 12: ERROR:  duplicate key value violates unique constraint "core_artifact_sha256_key"
      1 postgresql-Fri.log:2021-07-02 13: ERROR:  duplicate key value violates unique constraint "core_artifact_sha256_key"
      1 postgresql-Fri.log:2021-07-02 13: ERROR:  duplicate key value violates unique constraint "dynflow_coordinator_records_pkey"
     10 postgresql-Fri.log:2021-07-02 14: ERROR:  duplicate key value violates unique constraint "core_artifact_sha256_key"
      1 postgresql-Mon.log:2021-06-07 14: ERROR:  duplicate key value violates unique constraint "dynflow_coordinator_records_pkey"
      1 postgresql-Mon.log:2021-06-14 13: ERROR:  duplicate key value violates unique constraint "dynflow_coordinator_records_pkey"
      1 postgresql-Mon.log:2021-06-28 07: ERROR:  duplicate key value violates unique constraint "dynflow_coordinator_records_pkey"
      1 postgresql-Mon.log:2021-06-28 08: ERROR:  duplicate key value violates unique constraint "dynflow_coordinator_records_pkey"
      2 postgresql-Mon.log:2021-06-28 08: ERROR:  duplicate key value violates unique constraint "index_katello_rpms_on_pulp_id"
      1 postgresql-Mon.log:2021-06-28 13: ERROR:  duplicate key value violates unique constraint "dynflow_coordinator_records_pkey"
      1 postgresql-Thu.log:2021-07-01 03: ERROR:  could not extend file "base/16926/640886": No space left on device
      2 postgresql-Thu.log:2021-07-01 03: ERROR:  could not write to file "base/pgsql_tmp/pgsql_tmp12388.4": No space left on device
      2 postgresql-Thu.log:2021-07-01 03: ERROR:  could not write to file "base/pgsql_tmp/pgsql_tmp14002.4": No space left on device
      2 postgresql-Thu.log:2021-07-01 03: ERROR:  could not write to file "base/pgsql_tmp/pgsql_tmp14313.2": No space left on device
    263 postgresql-Thu.log:2021-07-01 03: ERROR:  duplicate key value violates unique constraint "core_artifact_sha256_key"
    843 postgresql-Thu.log:2021-07-01 03: ERROR:  duplicate key value violates unique constraint "index_katello_errata_on_pulp_id"
   2111 postgresql-Thu.log:2021-07-01 03: ERROR:  duplicate key value violates unique constraint "index_katello_erratum_cves_on_erratum_id_and_cve_id_and_href"
   3990 postgresql-Thu.log:2021-07-01 03: ERROR:  duplicate key value violates unique constraint "katello_erratum_bz_eid_bid_href"
  18171 postgresql-Thu.log:2021-07-01 03: ERROR:  duplicate key value violates unique constraint "katello_erratum_packages_eid_nvrea_n_f"
   1932 postgresql-Thu.log:2021-07-01 03: ERROR:  duplicate key value violates unique constraint "rpm_modulemd_name_stream_version_context_arch_f6598e4e_uniq"
      6 postgresql-Thu.log:2021-07-01 03: ERROR:  duplicate key value violates unique constraint "rpm_package_pkgId_key"
   7662 postgresql-Thu.log:2021-07-01 03: ERROR:  duplicate key value violates unique constraint "rpm_updaterecord_digest_key"
     37 postgresql-Thu.log:2021-07-01 05: ERROR:  duplicate key value violates unique constraint "core_artifact_sha256_key"
     11 postgresql-Thu.log:2021-07-01 08: ERROR:  duplicate key value violates unique constraint "core_artifact_sha256_key"
     15 postgresql-Thu.log:2021-07-01 11: ERROR:  duplicate key value violates unique constraint "core_artifact_sha256_key"
      4 postgresql-Thu.log:2021-07-01 12: ERROR:  duplicate key value violates unique constraint "core_artifact_sha256_key"
      4 postgresql-Thu.log:2021-07-01 15: ERROR:  duplicate key value violates unique constraint "core_artifact_sha256_key"
      4 postgresql-Thu.log:2021-07-01 16: ERROR:  duplicate key value violates unique constraint "core_artifact_sha256_key"
     99 postgresql-Tue.log:2021-06-29 10: ERROR:  duplicate key value violates unique constraint "core_artifact_pkey"
    164 postgresql-Tue.log:2021-06-29 11: ERROR:  duplicate key value violates unique constraint "core_artifact_pkey"
    705 postgresql-Tue.log:2021-06-29 11: ERROR:  duplicate key value violates unique constraint "core_artifact_sha256_key"
    185 postgresql-Tue.log:2021-06-29 12: ERROR:  duplicate key value violates unique constraint "core_artifact_pkey"
    913 postgresql-Tue.log:2021-06-29 12: ERROR:  duplicate key value violates unique constraint "core_artifact_sha256_key"
     25 postgresql-Tue.log:2021-06-29 13: ERROR:  duplicate key value violates unique constraint "core_artifact_pkey"
   1422 postgresql-Tue.log:2021-06-29 13: ERROR:  duplicate key value violates unique constraint "core_artifact_sha256_key"
      3 postgresql-Tue.log:2021-06-29 13: ERROR:  duplicate key value violates unique constraint "rpm_distributiontree_header_version_release_n_5404b190_uniq"
    539 postgresql-Tue.log:2021-06-29 13: ERROR:  duplicate key value violates unique constraint "rpm_modulemddefaults_digest_key"
     90 postgresql-Tue.log:2021-06-29 13: ERROR:  duplicate key value violates unique constraint "rpm_packagecategory_digest_key"
     87 postgresql-Tue.log:2021-06-29 13: ERROR:  duplicate key value violates unique constraint "rpm_packageenvironment_digest_key"
    389 postgresql-Tue.log:2021-06-29 13: ERROR:  duplicate key value violates unique constraint "rpm_packagegroup_digest_key"
     20 postgresql-Tue.log:2021-06-29 13: ERROR:  duplicate key value violates unique constraint "rpm_packagelangpacks_digest_key"
    133 postgresql-Tue.log:2021-06-29 13: ERROR:  duplicate key value violates unique constraint "rpm_repometadatafile_data_type_checksum_relat_c9d7364a_uniq"
    584 postgresql-Tue.log:2021-06-29 13: ERROR:  duplicate key value violates unique constraint "rpm_updaterecord_digest_key"
    259 postgresql-Tue.log:2021-06-29 14: ERROR:  duplicate key value violates unique constraint "core_artifact_sha256_key"
    338 postgresql-Tue.log:2021-06-29 15: ERROR:  duplicate key value violates unique constraint "core_artifact_sha256_key"
      1 postgresql-Wed.log:2021-06-30 06: ERROR:  duplicate key value violates unique constraint "dynflow_coordinator_records_pkey"
      1 postgresql-Wed.log:2021-06-30 07: ERROR:  duplicate key value violates unique constraint "dynflow_coordinator_records_pkey"
      1 postgresql-Wed.log:2021-06-30 16: ERROR:  duplicate key value violates unique constraint "core_artifact_sha256_key"
      5 postgresql-Wed.log:2021-06-30 17: ERROR:  duplicate key value violates unique constraint "core_artifact_sha256_key"
      1 postgresql-Wed.log:2021-06-30 17: ERROR:  duplicate key value violates unique constraint "dynflow_coordinator_records_pkey"
      1 postgresql-Wed.log:2021-06-30 17: ERROR:  update or delete on table "core_reservedresource" violates foreign key constraint "core_taskreservedres_resource_id_ee0b7c62_fk_core_rese" on table "core_taskreservedresource"
      4 postgresql-Wed.log:2021-06-30 18: ERROR:  duplicate key value violates unique constraint "core_artifact_sha256_key"
      8 postgresql-Wed.log:2021-06-30 19: ERROR:  duplicate key value violates unique constraint "core_artifact_sha256_key"
     11 postgresql-Wed.log:2021-06-30 20: ERROR:  duplicate key value violates unique constraint "core_artifact_sha256_key"
root@hostname:/var/opt/rh/rh-postgresql12/lib/pgsql/data/log#

Comment 2 Brad Buckingham 2021-08-09 11:12:43 UTC
Bug 1989182 was also created for a duplicate constraint on rpm_variant_variant_id_ .

Comment 3 Peter Vreman 2021-08-09 11:47:42 UTC
The duplicate keys are seen in different database (base on ttable names) and might be good to be split in a pulp3 BZ and katello BZ

E.g. the following is katello code:
~~~
   3990 postgresql-Thu.log:2021-07-01 03: ERROR:  duplicate key value violates unique constraint "katello_erratum_bz_eid_bid_href"
  18171 postgresql-Thu.log:2021-07-01 03: ERROR:  duplicate key value violates unique constraint "katello_erratum_packages_eid_nvrea_n_f"
~~~



Input for a Reproducer, i sync the following RHEL7 repos, simaliar setup repos also for RHEL8.
- RHEL 7.6 E4S
  Server
  Optional
  Sat6.8-Tools
  Sat6.9-Tools
- RHEL 7.7 EUS
  Server
  Optional
  Sat6.8-Tools
  Sat6.9-Tools
- RHEL 7.7 E4S
  Server
  Optional
  Sat6.8-Tools
  Sat6.9-Tools
- RHEL 7.9
  Server
  Optional
  Sat6.8-Tools
  Sat6.9-Tools
- RHEL 7Server
  Server
  Optional
  Sat6.8-Tools
  Sat6.9-Tools
- RHSCL


There are several upstream tickets like https://pulp.plan.io/issues/8967 and https://pulp.plan.io/issues/9029 that are talking about syncing repos with same content.
That kind of matches my setup where 7Server=7.9 have and also 7.7-EUS=7.7-E4S share same content.

And with RHEL8 you even have 3 repos RHEL8.4=RHEL8.4-EUS=RHEL-8.4-E4S with same content

Hope this input helps for engineering to build reproducers

Comment 4 Peter Vreman 2021-08-09 11:54:02 UTC
I as user of the end-to-end Satellite Repo+CV+CVV functionality did not notice any functional restricts from the errors
The ContentView and CompositeContentView creates were working without issues and on the client (1x RHEL7.9, 1x RHEL8.4, 1x 7Server (the Sat6 and Capsule itself)) i did not see any abnormal behaviour that content was missing

The major issue i noticed is the filling of the postgres filesystem huge postgres logs that are unreadable due to excesive (up to 1500000 chars) long lines

Comment 5 Brad Buckingham 2021-08-09 12:25:56 UTC
Thanks Peter!

I've created bug 1991557 to track the katello errors.  I have also downgraded the severity since the error itself is not resulting in failure of the sync/publish.

Comment 6 Tanya Tereshchenko 2021-08-09 18:40:49 UTC
If there are no sync/publish failures, the current BZ is unrelated to the Bug 1989182, https://pulp.plan.io/issues/8967 or https://pulp.plan.io/issues/9029. In all those, task will fail.
I suspect that it's more of implementation approach to solve race conditions. We'll see what we can do.

Comment 7 Tanya Tereshchenko 2021-09-10 17:28:13 UTC
In order to solve potential race conditions during sync/upload/cleanup operations, various db records might be attempted to be created when they are already there.
Also for performance reasons, bulk operations are used and in some cases, can cause `duplicate key...` records in the postgres logs.
For the sake of data consistency, currently there are no plans to adjust anything in this area.

This comment is relevant for the Pulp related records (do not have word `katello` or `dynflow` in them).

I propose to close this BZ.

Comment 8 Peter Vreman 2021-09-14 07:24:10 UTC
If it is on purpose that ERRORs in log files are expected it shall at least be documented and explained why they are harmless.

That i have opened a BZ for it is a sign that at least some users care about clean logs.


Note You need to log in before you can comment on or make changes to this bug.