Bug 1898808 - Rook-Ceph crash collector pod should not run on non-ocs node
Summary: Rook-Ceph crash collector pod should not run on non-ocs node
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Container Storage
Classification: Red Hat Storage
Component: rook
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: OCS 4.7.0
Assignee: Sébastien Han
QA Contact: Aviad Polak
URL:
Whiteboard:
Duplicates: 1965749
Depends On:
Blocks: 1938134
 
Reported: 2020-11-18 06:08 UTC by Mudit Agarwal
Modified: 2024-10-01 17:05 UTC (History)
13 users

Fixed In Version: 4.7.0-722.ci
Doc Type: Known Issue
Doc Text:
.Crash-collector pod gets removed from the node
Previously, when Ceph pods moved to a different node, the crash-collector pod would continue to run on the previous node. Now, the crash-collector pod gets removed from the node if no Ceph pod is present.
Clone Of:
Environment:
Last Closed: 2021-05-19 09:16:24 UTC
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github openshift rook pull 166 0 None closed Bug 1898808: ceph: remove crash collector if ceph pod moved 2021-02-16 09:54:16 UTC
Github red-hat-storage ocs-ci pull 4518 0 None None None 2021-08-16 06:14:56 UTC
Github rook rook pull 7160 0 None closed ceph: remove crash collector if ceph pod moved 2021-02-16 09:54:16 UTC
Red Hat Product Errata RHSA-2021:2041 0 None None None 2021-05-19 09:17:01 UTC

Description Mudit Agarwal 2020-11-18 06:08:48 UTC
Description of problem (please be as detailed as possible and provide log
snippets):

The Rook-Ceph crash collector can run on an unlabeled (non-OCS) node; it shouldn't.

Version of all relevant components (if applicable):


Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?


Is there any workaround available to the best of your knowledge?


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
1

Is this issue reproducible?
yes

Can this issue be reproduced from the UI?


If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1.
2.
3.


Actual results:


Expected results:


Additional info:

Comment 3 Shrivaibavi Raghaventhiran 2020-11-18 06:13:17 UTC
Logs: http://rhsqe-repo.lab.eng.blr.redhat.com/OCS/ocs-qe-bugs/bz-1898501/

Comment 4 Sébastien Han 2020-11-26 13:47:42 UTC
Mudit, the crash collector will run anywhere Ceph daemons run, is this really an issue with the crash collector here?

Comment 6 Travis Nielsen 2020-12-01 23:16:59 UTC
@Seb In the original BZ (https://bugzilla.redhat.com/show_bug.cgi?id=1883828), the crash collector pod was found to be running on a non-OCS node. I suspect there was a ceph pod running on that node at some point, though at the time of analysis the crash collector was the only one on that node. Does Rook remove the crash collector from a node if all the ceph daemons are removed as well? The bug seems to be that the crash collector remained even after ceph pods were no longer running there.
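The behavior Travis describes, and that the linked fix title ("remove crash collector if ceph pod moved") implies, amounts to a per-node check during reconciliation: any node that still hosts a crash collector but no longer hosts any Ceph daemon pod should have its collector deleted. The sketch below is a hypothetical, simplified illustration of that decision, not Rook's actual reconciler code; the function and node names are invented for the example.

```go
package main

import "fmt"

// staleCrashCollectorNodes returns the nodes that still host a crash-collector
// deployment but no longer host any Ceph daemon pod. During reconciliation,
// the collectors on these nodes would be candidates for deletion.
// (Illustrative sketch only; Rook's real logic works on Kubernetes objects.)
func staleCrashCollectorNodes(crashCollectorNodes, cephDaemonNodes []string) []string {
	hasDaemon := make(map[string]bool)
	for _, n := range cephDaemonNodes {
		hasDaemon[n] = true
	}
	var stale []string
	for _, n := range crashCollectorNodes {
		if !hasDaemon[n] {
			stale = append(stale, n)
		}
	}
	return stale
}

func main() {
	// Crash collectors were scheduled on three nodes, but after a move the
	// Ceph daemons only remain on node-1: the other two collectors are stale.
	stale := staleCrashCollectorNodes(
		[]string{"node-1", "node-2", "node-3"},
		[]string{"node-1"},
	)
	fmt.Println(stale)
}
```

Before the fix, nothing performed this cleanup step, so a collector could outlive the last Ceph daemon on its node, which matches the symptom reported in the original BZ.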

Comment 7 Sébastien Han 2020-12-04 15:17:01 UTC
@Travis indeed, this is not handled, I've managed to repro, working on a fix.

Comment 9 Mudit Agarwal 2021-02-02 14:14:44 UTC
Seb, is this already fixed?

Comment 10 Sébastien Han 2021-02-03 13:34:57 UTC
Mudit, no, it's not; I'm working on it.

Comment 15 Mudit Agarwal 2021-04-21 10:02:25 UTC
Doc text is added.

Comment 17 Sébastien Han 2021-04-21 13:28:25 UTC
Doc text lgtm.

Comment 20 errata-xmlrpc 2021-05-19 09:16:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat OpenShift Container Storage 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2041

Comment 21 Ashish Singh 2021-06-02 16:44:53 UTC
*** Bug 1965749 has been marked as a duplicate of this bug. ***

