Description of problem:
Bulk capsule syncs failed when they were trying to remove the same pulp repos ('Actions::Katello::CapsuleContent::RemoveUnneededRepos').
Please see the below task information:
> 'Actions::Katello::Repository::CapsuleGenerateAndSync'
> there are 7 'Actions::Katello::CapsuleContent::Sync' sub-tasks in it, which matches the number of capsules in the pnt-sysops org
> each of the 'CapsuleContent::Sync' sub-tasks has one 'Actions::Katello::CapsuleContent::ConfigureCapsule'
> and a lot of 'Actions::Pulp::Repository::Refresh'
> under these 'Actions::Pulp::Repository::Refresh', some have a skipped 'Actions::Pulp::Repository::RefreshDistributor'
> under 'Actions::Katello::CapsuleContent::ConfigureCapsule', there is one 'Actions::Katello::CapsuleContent::RemoveUnneededRepos', under which a large number of 'Actions::Pulp::Repository::Destroy' are listed
> 'Actions::Pulp::Repository::Destroy' <-- lots of these are skipped or failed
The tasks paused after a large number of Repository::Destroy actions failed. They cannot be resumed until the user presses the skip link next to every single errored Repository::Destroy. The parent task can be resumed only when all of the errored actions have been skipped. After resuming, the user got Repository::Create errors.
More information:
http://paste-platops.itos.redhat.com/pzpwqg41u/yyulhy#line-8
http://paste-platops.itos.redhat.com/p6cfi7rzn/sqiixo#line-15
While reading the 'katello/repository/capsule_generate_and_sync.rb' file, I noticed that each 'CapsuleGenerateAndSync' task syncs all the repositories between Katello and the Capsule's Pulp every time. If the repositories do not match, Katello either creates the needed repos or destroys the unneeded repos. This seems ok when only one sync task is running for a Capsule, but a conflict could happen when there are multiple sync tasks for the same Capsule (I don't know how Foreman handles multiple tasks, so I could be wrong).
For example, suppose the following sync tasks are running at about the same time.
Sync task 1:
1) Get a list of repos from Pulp. Pulp returns Repo1, Repo2 (A)
2) Get a list of repos from Katello. Katello returns Repo1 (B)
3) Unneeded repos = A - B
4) Delete unneeded repos from Pulp by calling Pulp API
5) Pulp async tasks created and queueing (Pulp task 1)
Sync task 2:
1) Get a list of repos from Pulp. Pulp returns Repo1, Repo2 (A)
2) Get a list of repos from Katello. Katello returns Repo1 (B)
3) Unneeded repos = A - B.
4) This is the same set of unneeded repos as in Sync task 1, because Pulp task 1 is still queued
5) Pulp runs Pulp task 1. The unneeded repos are now deleted from Pulp.
6) Call the Pulp API to delete the unneeded repos. The API returns 404 because the repos have already been deleted
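The interleaving above can be replayed with a small simulation. This is only a sketch, not Katello or Pulp code: FakePulp, request_delete, run_queued_tasks and unneeded_repos are hypothetical names, and the 202 status for an accepted delete is an assumption made for illustration. Only the 404 on the second delete comes from the steps above.

```python
class FakePulp:
    """Hypothetical in-memory stand-in for a Capsule's Pulp server."""

    def __init__(self, repos):
        self.repos = set(repos)
        self.queue = []  # async delete tasks that have not run yet

    def list_repos(self):
        return set(self.repos)

    def request_delete(self, repo_id):
        # Assumed behaviour: 404 if the repo is already gone,
        # otherwise queue an async delete and answer 202.
        if repo_id not in self.repos:
            return 404
        self.queue.append(repo_id)
        return 202

    def run_queued_tasks(self):
        # Pulp works through its queue; the repos disappear only now.
        for repo_id in self.queue:
            self.repos.discard(repo_id)
        self.queue.clear()


def unneeded_repos(pulp, katello_repos):
    # Unneeded repos = A - B
    return pulp.list_repos() - set(katello_repos)


pulp = FakePulp(["Repo1", "Repo2"])
katello = ["Repo1"]

# Sync task 1, steps 1-5: compute the difference and queue the deletes.
unneeded_1 = unneeded_repos(pulp, katello)
status_1 = {repo: pulp.request_delete(repo) for repo in unneeded_1}

# Sync task 2, steps 1-4: same difference, Pulp task 1 is still queued.
unneeded_2 = unneeded_repos(pulp, katello)

# Sync task 2, step 5: Pulp runs Pulp task 1.
pulp.run_queued_tasks()

# Sync task 2, step 6: the delete call now hits a missing repo.
status_2 = {repo: pulp.request_delete(repo) for repo in unneeded_2}

print(status_1)  # {'Repo2': 202}
print(status_2)  # {'Repo2': 404}
```

Both tasks compute the same difference because the list is read before the queued delete has run, so the second task acts on stale information.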
I can still reproduce the 409 conflict error when bulk syncing repos to a Capsule.
The first sync task ran successfully, but the later tasks got the following errors because the new repos had already been created by the first task:
7: Actions::Pulp::Repository::Create (skipped) [ 6.98s / 0.25s ]
9: Actions::Pulp::Repository::Create (skipped) [ 6.10s / 1.27s ]
11: Actions::Pulp::Repository::Create (skipped) [ 5.10s / 0.21s ]
13: Actions::Pulp::Repository::Create (skipped) [ 4.21s / 0.19s ]
15: Actions::Pulp::Repository::Create (skipped) [ 4.21s / 1.19s ]
17: Actions::Pulp::Consumer::SyncCapsule (success) [ 51.74s / 9.72s ]
19: Actions::Katello::CapsuleContent::RemoveOrphans (success) [ 0.15s / 0.15s ]
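The 409 conflicts on Repository::Create follow the same pattern in the other direction: each task works out which repos are missing on the Capsule before any of them has been created. A minimal sketch, assuming a hypothetical FakePulp class and create_repo call (not the real Pulp bindings), with 201 assumed as the status for a successful create and made-up repo names:

```python
class FakePulp:
    """Hypothetical in-memory stand-in for a Capsule's Pulp server."""

    def __init__(self, repos):
        self.repos = set(repos)

    def list_repos(self):
        return set(self.repos)

    def create_repo(self, repo_id):
        # Assumed behaviour: 409 if the repo already exists, else 201.
        if repo_id in self.repos:
            return 409
        self.repos.add(repo_id)
        return 201


def missing_repos(pulp, katello_repos):
    # Repos Katello expects on the Capsule that Pulp does not have yet.
    return sorted(set(katello_repos) - pulp.list_repos())


pulp = FakePulp(["Repo1"])
katello = ["Repo1", "Repo2", "Repo3"]

# Both sync tasks read the repo lists before either has created anything.
missing_1 = missing_repos(pulp, katello)
missing_2 = missing_repos(pulp, katello)

# The first task creates the repos successfully.
results_1 = {repo: pulp.create_repo(repo) for repo in missing_1}

# The later task tries to create the same repos and gets conflicts.
results_2 = {repo: pulp.create_repo(repo) for repo in missing_2}

print(results_1)  # {'Repo2': 201, 'Repo3': 201}
print(results_2)  # {'Repo2': 409, 'Repo3': 409}
```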
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2017:2466