Bug 1716405

Summary:	Make "capsule_remove_unneeded_repos" tolerant of single capsule failure
Product:	Red Hat Satellite	Reporter:	Dylan Gross <dgross>
Component:	Capsule - Content	Assignee:	Samir Jha <sajha>
Status:	CLOSED ERRATA	QA Contact:	Lukas Pramuk <lpramuk>
Severity:	medium	Docs Contact:
Priority:	unspecified
Version:	6.5.0	CC:	pcreech, sajha
Target Milestone:	6.6.0	Keywords:	Triaged
Target Release:	Unused
Hardware:	All
OS:	Linux
Whiteboard:
Fixed In Version:	tfm-rubygem-katello-3.12.0.7-1	Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2019-10-22 19:51:07 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description Dylan Gross 2019-06-03 12:43:21 UTC

1. Proposed title of this feature request

   Make /usr/sbin/foreman-rake katello:capsule_remove_unneeded_repositories
 tolerant of a single capsule failure

3. What is the nature and description of the request?

  When /usr/sbin/foreman-rake katello:capsule_remove_unneeded_repositories
 is run on a Satellite, it appears to exit if it encounters a capsule to which it cannot connect to it's pulp server.  The job then appears to end immediately rather than cleaning up the capsule to which it still could connect.

4. Why does the customer need this? (List the business requirements here)

  The cleanup job runs regularly, but has little apparent effect if the pulp on an unrelated capsule is down.

5. How would the customer like to achieve this? (List the functional requirements here)

   Two suggestions 
   a) return a warning if the pulp on a capsule is down and proceed to the next capsule.
   b) Allow an argument specification so that specific capsules can be targeted.

6. For each functional requirement listed, specify how Red Hat and the customer can test to confirm the requirement is successfully implemented.

   The cleanup script will execute to completion even if some capsule in the enviornment is down.

7. Is there already an existing RFE upstream or in Red Hat Bugzilla?

  No

8. Does the customer have any specific timeline dependencies and which release would they like to target (i.e. RHEL5, RHEL6)?

  No

9. Is the sales team involved in this request and do they have any additional input?

  No

10. List any affected packages or components.

   

11. Would the customer be able to assist in testing this functionality if implemented?

Comment 6 Bryan Kearney 2019-07-08 16:08:21 UTC

Moving this bug to POST for triage into Satellite 6 since the upstream issue https://projects.theforeman.org/issues/27169 has been resolved.

Comment 8 Lukas Pramuk 2019-10-01 17:29:26 UTC

VERIFIED.

@Satellite 6.6.0 Snap23
tfm-rubygem-katello-3.12.0.26-1.el7sat.noarch

by the following manual reproducer:

1) Have a Satellite with 2 external capsules assigned to the Library 

2) Sync some decent repo to both capsules and check for content at capsules:

# pulp-admin rpm repo list | grep -e Name: -e Rpm:
Display Name:        Red Hat Enterprise Linux 7 Server RPMs x86_64 7Server
  Rpm:                    26313
Display Name:        RHEL 7.7
  Rpm:                    5225

3) Remove Library from both capsules

4) Make sure capsule #2 is destined to fail 

# service pulp_workers stop
Redirecting to /bin/systemctl stop pulp_workers.service

5) Run weekly clean up task manually

# foreman-rake katello:delete_orphaned_content
Orphaned content deletion started in background.
Orphaned content deletion started in background.
Orphaned content deletion started in background.

>>> content deletion tasks is now generated per each capsule

6) At capsule #1 check that clean up succeded

# pulp-admin rpm repo list | grep -e Name: -e Rpm:
<empty>

>>> only task for broken capsule is failing while for other capsules cleanup succeeds

Comment 9 Bryan Kearney 2019-10-22 19:51:07 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:3172