Bug 2043045

Summary: When pulp syncs an erratum, it creates duplicate records for all errata, which confuses katello mail notifications
Product: Red Hat Satellite
Reporter: momran
Component: Pulp
Assignee: satellite6-bugs <satellite6-bugs>
Status: CLOSED DUPLICATE
QA Contact: Lai <ltran>
Severity: medium
Docs Contact:
Priority: high
Version: 6.10.1
CC: ahumbe, dalley, jsenkyri, jsherril, momran, pmoravec
Target Milestone: Unspecified
Keywords: Triaged
Target Release: Unused
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-06-30 07:48:42 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
  tools repodata for tiny reproducer (flags: none)

Description momran 2022-01-20 14:28:17 UTC
Description of problem:

The errata counts shown on the Sync Status page in the Satellite 6.10 web UI do not match the counts in the e-mail notifications for completed repo syncs launched by daily sync plans. For example:

- [satellite] Sync Summary for Red Hat Enterprise Linux 7 Server - Optional RPMs x86_64 7Server

SYNC SUMMARY
The synchronization of "Red Hat Enterprise Linux 7 Server - Optional RPMs x86_64 7Server" has completed. Below is a summary of new errata. A large number of errata were synced for this repository, so only the first 100 are shown.

New Errata
201  SECURITY
222  BUGFIX
36   ENHANCEMENT


- [satellite] Sync Summary for Red Hat Enterprise Linux 7 Server RPMs x86_64 7Server

SYNC SUMMARY
The synchronization of "Red Hat Enterprise Linux 7 Server RPMs x86_64 7Server" has completed. Below is a summary of new errata. A large number of errata were synced for this repository, so only the first 100 are shown.

New Errata
207  SECURITY
358  BUGFIX
67   ENHANCEMENT


Whereas the sync status page in Satellite's web UI shows:

Red Hat Enterprise Linux 7 Server - Optional RPMs x86_64 7Server	18 minutes ago	10 minutes	Added Rpms: 2, Errata: 1	Syncing Complete.
Red Hat Enterprise Linux 7 Server RPMs x86_64 7Server			20 minutes ago	20 minutes	Added Rpms: 2, Errata: 1	Syncing Complete.

When the repo sync is launched manually, both the sync status page and the e-mail notifications align.
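
To put a number on the mismatch, here is a minimal sketch that totals the "New Errata" lines of a notification mail and compares the total with the errata count from the Sync Status row. The function names and the regular expressions are assumptions made for illustration; they are not part of Satellite or Katello.

```python
import re


def parse_mail_summary(text):
    """Return {errata type: count} from the 'New Errata' lines of a summary mail."""
    counts = {}
    for line in text.splitlines():
        m = re.match(r"\s*(\d+)\s+(SECURITY|BUGFIX|ENHANCEMENT)\s*$", line)
        if m:
            counts[m.group(2)] = int(m.group(1))
    return counts


def parse_ui_errata(status):
    """Return the errata count from a Sync Status result string, 0 if none is listed."""
    m = re.search(r"Errata:\s*(\d+)", status)
    return int(m.group(1)) if m else 0


# Values copied from the Optional RPMs example above.
mail = """New Errata
201  SECURITY
222  BUGFIX
36   ENHANCEMENT
"""
ui = "Added Rpms: 2, Errata: 1"

mail_total = sum(parse_mail_summary(mail).values())
ui_total = parse_ui_errata(ui)
print(mail_total, ui_total, mail_total == ui_total)
```

For the Optional RPMs repo this gives 459 errata in the mail against 1 in the web UI.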


Version-Release number of selected component (if applicable):


How reproducible:
-----------------

Always with the ‘Red Hat Enterprise Linux 7 Server - Optional RPMs x86_64 7Server’ and ‘Red Hat Enterprise Linux 7 Server RPMs x86_64 7Server’ repos, but not with RHEL 8 repos. The same behavior was successfully reproduced in a test environment. The customer also reported encountering the same problem with EPEL and Oracle Linux 7/8 repos (though this has not been reproduced in a test environment yet).


Steps to Reproduce:
-------------------

1. Enable the ‘Red Hat Enterprise Linux 7 Server - Optional RPMs x86_64 7Server’ and ‘Red Hat Enterprise Linux 7 Server RPMs x86_64 7Server’ repos on Satellite 6.10.1.
2. Synchronize the repos to Satellite.
3. Set up a daily sync plan for these 2 repos.
4. Observe the e-mail notifications for completed repo syncs launched by the daily sync plan. Cross check with the Sync Status page in Satellite's web UI.


Actual results:
---------------

The errata counts on the Sync Status page in the Satellite 6.10 web UI and in the e-mail notifications for completed repo syncs launched by daily sync plans do not align.


Expected results:
-----------------

The errata counts on the Sync Status page in the Satellite 6.10 web UI and in the e-mail notifications for completed repo syncs launched by daily sync plans should align.

Comment 1 momran 2022-01-20 14:41:52 UTC
- Performed the following on the ‘Red Hat Enterprise Linux 7 Server - Optional RPMs x86_64 7Server’ and ‘Red Hat Enterprise Linux 7 Server RPMs x86_64 7Server’ repos:
    - Verify Content Checksum
    - Advanced Sync -> Complete Sync

  Outcome:

  - [satellite] Sync Summary for Red Hat Enterprise Linux 7 Server - Optional RPMs x86_64 7Server

SYNC SUMMARY
The synchronization of "Red Hat Enterprise Linux 7 Server - Optional RPMs x86_64 7Server" has completed. Below is a summary of new errata. A large number of errata were synced for this repository, so only the first 100 are shown.

New Errata:
200  SECURITY
222  BUGFIX
36   ENHANCEMENT


  - [satellite] Sync Summary for Red Hat Enterprise Linux 7 Server RPMs x86_64 7Server

The synchronization of "Red Hat Enterprise Linux 7 Server RPMs x86_64 7Server" has completed. Below is a summary of new errata. A large number of errata were synced for this repository, so only the first 100 are shown.

New Errata:
206   SECURITY
358   BUGFIX
67    ENHANCEMENT


Whereas the sync status page in Satellite's web UI shows:

Red Hat Enterprise Linux 7 Server - Optional RPMs x86_64 7Server	22 minutes ago	9 minutes	No content added.	Syncing Complete.
Red Hat Enterprise Linux 7 Server RPMs x86_64 7Server			22 minutes ago	19 minutes	No content added.	Syncing Complete.

So, even with a manually initiated Advanced Sync -> Complete Sync of the 2 repos, the errata counts on the Sync Status page in the Satellite 6.10 web UI and in the e-mail notifications do not align.

Comment 2 momran 2022-01-25 16:06:18 UTC
On 24.1.2022:
-------------

- [satellite] Sync Summary for Red Hat Enterprise Linux 7 Server RPMs x86_64 7Server

SYNC SUMMARY
The synchronization of "Red Hat Enterprise Linux 7 Server RPMs x86_64 7Server" has completed. Below is a summary of new errata.

New Errata:
1  SECURITY
0  BUGFIX
0  ENHANCEMENT


- [satellite] Sync Summary for Red Hat Enterprise Linux 7 Server - Optional RPMs x86_64 7Server

SYNC SUMMARY
The synchronization of "Red Hat Enterprise Linux 7 Server - Optional RPMs x86_64 7Server" has completed. Below is a summary of new errata.

New Errata:
1  SECURITY
0  BUGFIX
0  ENHANCEMENT

So, the following actions, performed on the ‘Red Hat Enterprise Linux 7 Server - Optional RPMs x86_64 7Server’ and ‘Red Hat Enterprise Linux 7 Server RPMs x86_64 7Server’ repos, fixed the issue temporarily:
    - Verify Content Checksum
    - Advanced Sync -> Complete Sync

But on the next run of the repo sync plans, the issue reappeared:

On 25.1.2022:
-------------

- [satellite] Sync Summary for Red Hat Enterprise Linux 7 Server RPMs x86_64 7Server

SYNC SUMMARY
The synchronization of "Red Hat Enterprise Linux 7 Server RPMs x86_64 7Server" has completed. Below is a summary of new errata. A large number of errata were synced for this repository, so only the first 100 are shown.

New Errata:
207  SECURITY
358  BUGFIX
67  ENHANCEMENT


- [satellite] Sync Summary for Red Hat Enterprise Linux 7 Server - Optional RPMs x86_64 7Server

SYNC SUMMARY
The synchronization of "Red Hat Enterprise Linux 7 Server - Optional RPMs x86_64 7Server" has completed. Below is a summary of new errata. A large number of errata were synced for this repository, so only the first 100 are shown.

New Errata:
201  SECURITY
222  BUGFIX
36   ENHANCEMENT

Comment 4 Justin Sherrill 2022-02-21 13:49:52 UTC
Hi, 

It looks like the errata mailer is just sending the list of all errata in the repository, which is what I thought it did, and I was surprised that it's telling the user that these are 'New Errata'. That is incorrect.

Ideally we change this to actually return the newly synced errata; at a bare minimum, we should make the email text correct.

Comment 5 Pavel Moravec 2022-06-29 15:28:54 UTC
(In reply to Justin Sherrill from comment #4)

Katello is behaving properly here. It queries Pulp before and after the sync and marks whatever is new or different(!) as New Errata. And imho the mail notification feature *should* be sending *just new/modified* errata, not all.

The "bad guy" here is Pulp: when it detects that there is an erratum to newly sync, it creates records for *all* errata from the repo and updates the repository to use the new records. That confuses Katello.

See example:

katello_repository_errata table :
  id  | erratum_id | repository_id |     created_at      |     updated_at      |                            erratum_pulp3_href
 2914 |          1 |             2 | 2022-06-29 07:28:08 | 2022-06-29 07:28:08 | /pulp/api/v3/content/rpm/advisories/b84ca7e0-1669-435d-90fe-db95df1fd7df/

changed to:

 2914 |          1 |             2 | 2022-06-29 11:49:13 | 2022-06-29 11:49:13 | /pulp/api/v3/content/rpm/advisories/62eb7f9e-e5d0-4956-ad83-611fd8639b13/

Both advisory URLs contain the *same* erratum; these are duplicate records. The new record was *created* during the new sync (which synced/updated just one different erratum).
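
A minimal sketch of why this confuses a before/after comparison, assuming the comparison is keyed on the Pulp href of each erratum. This is not Katello's actual code (Katello is Ruby); the function name is made up, and the second erratum and its href are hypothetical placeholders.

```python
def new_or_changed(before, after):
    """Return erratum ids whose Pulp href is new or differs from the pre-sync snapshot."""
    return sorted(eid for eid, href in after.items() if before.get(eid) != href)


# erratum_id -> erratum_pulp3_href, as in katello_repository_errata
before = {
    1: "/pulp/api/v3/content/rpm/advisories/b84ca7e0-1669-435d-90fe-db95df1fd7df/",
}
after = {
    # Same erratum, but Pulp now serves it under a freshly created record.
    1: "/pulp/api/v3/content/rpm/advisories/62eb7f9e-e5d0-4956-ad83-611fd8639b13/",
    # The one genuinely new erratum (hypothetical id and href).
    2: "/pulp/api/v3/content/rpm/advisories/00000000-0000-0000-0000-000000000002/",
}

print(new_or_changed(before, after))
```

Both ids come back as new, although only erratum 2 was actually added by the sync.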


And from the Pulp side:
1) have a repo with 1 erratum synced
2) update the remote repo to have some more packages and *one* more erratum
3) check the number of errata in pulp:

certs="--cacert /etc/pki/katello/certs/katello-server-ca.crt --cert /etc/pki/katello/certs/pulp-client.crt --key /etc/pki/katello/private/pulp-client.key"
hname=$(hostname -f)
curl -s $certs https://${hname}/pulp/api/v3/content/rpm/advisories/ | json_reformat | head
{
    "count": 4940,
..

4) sync the repo to fetch the new RPMs and some errata (one new here)
5) check # errata in pulp:

curl -s $certs https://${hname}/pulp/api/v3/content/rpm/advisories/ | json_reformat | head
{
    "count": 4942,
..

Just *one* new erratum appeared in the repo, but there are *two* new entries.

6) repeat it once more by adding *one* more new erratum to sync:

curl -s $certs https://${hname}/pulp/api/v3/content/rpm/advisories/ | json_reformat | head
{
    "count": 4945,
..

and *three* new entries appear! One for each of the three errata in the repo.
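
The counts above fit a simple model: a sync that detects a change adds one advisory record per erratum in the repo, not one per new erratum. A rough sketch of that arithmetic follows. The function is illustrative only and says nothing about how Pulp is implemented; the `change_detected` switch reflects the observation in comment 6 that a sync with nothing new creates no advisories.

```python
def advisories_after_sync(count_before, errata_in_repo, change_detected=True):
    """Model: a sync that detects a change adds one record per erratum in the repo."""
    return count_before + (errata_in_repo if change_detected else 0)


count = 4940  # advisories in Pulp while the repo holds 1 erratum
for errata_in_repo in (2, 3):  # repo grows by one erratum per step
    count = advisories_after_sync(count, errata_in_repo)
    print(count)
```

Starting from 4940, this gives 4942 and then 4945, matching the curl output.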



Satellite's reproducer:
1) Put the three directories tools_repodata_[1-3] (attached to this bug) into /var/www/html/pub/. These are repodata of the RHEL7 Sat6.9 tools repo, each version adding one more erratum.
2) create a product and a repo:
hammer product create --name "tools_test_product" --organization-id 1
hammer repository create --organization-id 1 --product tools_test_product --name tools_repo_1 --content-type yum --url http://localhost/pub/tools_repodata --download-policy on_demand

3) point tools_repodata to the first "version" of the repodata:
i=1; rm -f /var/www/html/pub/tools_repodata; ln -s /var/www/html/pub/tools_repodata_${i} /var/www/html/pub/tools_repodata; chown -R apache:apache /var/www/html/pub/tools*

4) sync the repo:
hammer repository sync --organization-id 1 --product tools_test_product --name tools_repo_1

5) Check the number of advisories in pulp (and for katello, check records in katello_repository_errata table in foreman DB)

6) point tools_repodata to the second "version" of the repodata:
i=2; rm -f /var/www/html/pub/tools_repodata; ln -s /var/www/html/pub/tools_repodata_${i} /var/www/html/pub/tools_repodata; chown -R apache:apache /var/www/html/pub/tools*

7) sync the repo again and check the pulp advisories count - it increased by *two* (and the repo<->advisories mapping in pulp points to the *new* advisory records - which katello_repository_errata also confirms)

8) repeat the test with the third "version" of the repo (i=3)


Needinfo to David to explain the pulp behaviour.

Comment 6 Pavel Moravec 2022-06-29 15:32:37 UTC
Created attachment 1893438 [details]
tools repodata for tiny reproducer

Three versions of the repodata for a reproducer.

I forgot to mention that when a repo sync finds nothing to sync (no new errata, or no new content at all? I don't know which), then no new advisory is created (which is expected).

Put simply, new content (or perhaps a new advisory in particular?) detected during a repo sync is a necessary condition for the reproducer / bug.

Comment 7 Pavel Moravec 2022-06-30 06:19:15 UTC
(Not 100% sure this is correct; there is only indirect evidence for it.)

For newly synced repositories on Sat6.10, all the errata are added to the mail notification / all errata are duplicated in pulp advisories.

For content originally migrated from pulp-2, the errata migrated from pulp-2 are *not* included in the mail notification / they are not duplicated among pulp advisories. (this is just indirectly observed).

Comment 8 Pavel Moravec 2022-06-30 07:48:42 UTC
Closing this as a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=2044314 that *should* fix the errata mails (though my attempt to backport to 6.10 failed: I still saw all mails there).

The reason why Pulp creates a new advisory record is that the same erratum has subtle differences in its updateinfo details across the three repodata versions, so Pulp calculates a different digest for it. The old record is removed during orphan removal, so no "advisories leak" happens.
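
To illustrate the digest effect, here is a minimal sketch of content-addressed storage in which a record's identity is a hash over its fields. The digest recipe (SHA-256 over canonical JSON), the field names, the erratum id, and the `store` dict are all assumptions for illustration; Pulp's actual digest computation is not shown in this bug.

```python
import hashlib
import json


def advisory_digest(advisory):
    """Hash a canonical JSON form of the advisory; any field change alters the digest."""
    canonical = json.dumps(advisory, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical advisory as found in two repodata versions.
v1 = {"id": "RHBA-0000:0001", "title": "example update", "updated": "2022-01-01 00:00:00"}
v2 = dict(v1, updated="2022-01-01 00:00:01")  # subtle difference in the updateinfo details

store = {}  # digest -> advisory record (hypothetical content store)
for adv in (v1, v2):
    store.setdefault(advisory_digest(adv), adv)

print(len(store))
```

Under these assumptions the two versions hash differently, so the store ends up with two records for what is logically one erratum until the older one is cleaned up as an orphan.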

*** This bug has been marked as a duplicate of bug 2044314 ***