Bug 1914215 - must-gather fails to delete the completed state compute-xx-debug pods after successful completion
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Container Storage
Classification: Red Hat Storage
Component: must-gather
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: OCS 4.7.0
Assignee: Pulkit Kundra
QA Contact: Neha Berry
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-01-08 11:34 UTC by Neha Berry
Modified: 2021-05-19 09:18 UTC (History)

Fixed In Version: 4.7.0-721.ci
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-05-19 09:17:47 UTC
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift ocs-operator pull 1013 | 0 | None | closed | Bug 1914215: [release-4.7] Make must-gather more efficient | 2021-02-17 14:29:02 UTC
Github | openshift ocs-operator pull 1033 | 0 | None | closed | Must-gather: update uninstall script | 2021-02-15 05:45:44 UTC
Github | openshift ocs-operator pull 1039 | 0 | None | closed | Bug 1914215: [release-4.7] Must-gather: update uninstall script | 2021-02-15 05:45:44 UTC
Github | openshift ocs-operator pull 967 | 0 | None | closed | Make must-gather more efficient | 2021-02-13 05:50:32 UTC
Red Hat Product Errata | RHSA-2021:2041 | 0 | None | None | None | 2021-05-19 09:18:12 UTC

Description Neha Berry 2021-01-08 11:34:02 UTC
Description of problem (please be as detailed as possible and provide log
snippets):
----------------------------------------------------------
Since OCS 4.7, multiple clusters show that the debug pods created as part of must-gather collection are not deleted. They stay on the cluster indefinitely, until a new must-gather is initiated or they are deleted manually.



NAME                                                              READY   STATUS      RESTARTS   AGE     IP             NODE        NOMINATED NODE   READINESS GATES
compute-0-debug                                                   0/1     Completed   0          17h     10.1.161.41    compute-0   <none>           <none>
compute-1-debug                                                   0/1     Completed   0          17h     10.1.160.196   compute-1   <none>           <none>
compute-2-debug                                                   0/1     Completed   0          17h     10.1.161.44    compute-2   <none>           <none>
compute-3-debug                                                   0/1     Completed   0          17h     10.1.160.209   compute-3   <none>           <none>
compute-4-debug                                                   0/1     Completed   0          17h     10.1.161.31    compute-4   <none>           <none>
compute-5-debug                                                   0/1     Completed   0          17h     10.1.160.210   compute-5   <none>           <none>


Version of all relevant components (if applicable):
=====================================================
OCP: 4.7.0-0.nightly-2021-01-07-181010
OCS: ocs-operator.v4.7.0-229.ci, 4.7.0-228.ci, etc.

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
=================================================
No

Is there any workaround available to the best of your knowledge?
========================================================
The pods can be deleted manually, but cleanup should be part of must-gather.
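
The manual cleanup could look roughly like the sketch below. It assumes the leftover pods are in the openshift-storage namespace, that their names end in -debug, and that their STATUS column reads Completed, as in the pod listing in this report. The two function names are hypothetical helpers and are not part of must-gather or oc.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are illustrative, not part of must-gather or oc.

# Read `oc get pods --no-headers` rows on stdin (NAME READY STATUS ...) and
# print the names of pods that end in "-debug" and show STATUS "Completed".
filter_completed_debug_pods() {
  awk '$1 ~ /-debug$/ && $3 == "Completed" { print $1 }'
}

# Delete the matching pods in the given namespace.
# Assumption: the namespace defaults to openshift-storage.
delete_completed_debug_pods() {
  local ns="${1:-openshift-storage}"
  oc get pods -n "$ns" --no-headers \
    | filter_completed_debug_pods \
    | while read -r pod; do
        oc delete pod "$pod" -n "$ns"
      done
}
```

Calling delete_completed_debug_pods with no argument targets openshift-storage. The filter can be checked on its own, without touching the cluster, by piping saved `oc get pods` output into filter_completed_debug_pods.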

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
================================================================
2

Is this issue reproducible?
===========================
Yes

Can this issue reproduce from the UI?
===========================================
NA

If this is a regression, please provide more details to justify this:
=================================================
These pods used to be removed in 4.6, IIRC.

Steps to Reproduce:
=======================
1. Install OCS 4.7
2. Initiate a must-gather log collection for OCS
$ oc adm must-gather --image=quay.io/rhceph-dev/ocs-must-gather:latest-4.7 | tee terminal-must-gather
3. Check whether debug pods are left behind even after successful must-gather (MG) completion

$ oc get pods -o wide -n openshift-storage


Actual results:
=================
The debug pods are left behind in the Completed state.

Expected results:
====================
All resources created as part of MG should be deleted by MG.

must-gather should delete every resource it creates, including Completed pods, once log collection completes.
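
One common way to get this behaviour in a shell-based gather script is an exit trap. The sketch below only illustrates that pattern and is not the change made in the linked pull requests. Every name in it (track_pod, cleanup_pods, run_gather, DELETE_CMD, NAMESPACE) is hypothetical, and it assumes debug pods are named <node>-debug, as in the pod listing in this report.

```shell
#!/usr/bin/env bash
# Sketch only: all names here are illustrative, not taken from ocs-operator.

# Space-separated list of pods created by this run.
CREATED_PODS=""

# Record a pod so that cleanup_pods removes it later.
track_pod() {
  CREATED_PODS="$CREATED_PODS $1"
}

# Delete every tracked pod, then clear the list.
# DELETE_CMD can be overridden (e.g. "echo delete") for a dry run.
cleanup_pods() {
  local ns="${NAMESPACE:-openshift-storage}"
  local pod
  for pod in $CREATED_PODS; do
    ${DELETE_CMD:-oc delete pod} "$pod" -n "$ns" || true
  done
  CREATED_PODS=""
}

# Hypothetical entry point: takes node names as arguments.
run_gather() {
  # Install the trap first so cleanup runs when the script exits,
  # whether or not collection succeeded.
  trap cleanup_pods EXIT
  local node
  for node in "$@"; do
    track_pod "${node}-debug"
    # ... per-node log collection would go here ...
  done
}
```

A script built this way would end with something like `run_gather compute-0 compute-1`. Setting DELETE_CMD="echo delete" prints the deletions instead of performing them.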

Additional info:
=====================

Comment 9 errata-xmlrpc 2021-05-19 09:17:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat OpenShift Container Storage 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2041

