Bug 1999952 - Automate the creation of cephobjectstoreuser for obc metrics collector
Summary: Automate the creation of cephobjectstoreuser for obc metrics collector
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: ocs-operator
Version: 4.9
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ODF 4.10.0
Assignee: Jiffin
QA Contact: akarsha
URL:
Whiteboard:
Depends On:
Blocks: 2011326
Reported: 2021-09-01 06:25 UTC by Jiffin
Modified: 2023-08-09 17:00 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
.Automated the creation of cephobjectstoreuser for object bucket claim metrics collector
With this update, the cephobjectstoreuser named `prometheus-user`, which is used to collect data from the RGW server, is created automatically.
Clone Of:
Environment:
Last Closed: 2022-04-13 18:49:40 UTC
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | red-hat-storage ocs-operator pull 1336 | 0 | None | open | Automate prometheususer creation for ob metrics collector | 2021-09-14 07:28:44 UTC
Red Hat Product Errata | RHSA-2022:1372 | 0 | None | None | None | 2022-04-13 18:50:35 UTC

Description Jiffin 2021-09-01 06:25:01 UTC
Description of problem
======================

Currently, the obc-metrics-collector requires, as a prerequisite, a cephobjectstoreuser named "prometheus-user" with certain permissions. It would be better to automate that workflow than to do it manually.

Version of all relevant components
===================================
4.9

Does this issue impact your ability to continue to work with the product
========================================================================


Is there any workaround available to the best of your knowledge?
================================================================


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
========================================


Is this issue reproducible?
===========================

Yes

Can this issue be reproduced from the UI?
=========================================

Yes

If this is a regression
=======================

Additional info
===============

To add permissions to the user, the Rook PR https://github.com/rook/rook/pull/8211 needs to merge first.
After that, a small fix in OCS-Op is needed to create the user. The fix is not intrusive for any existing workflow.
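
For illustration, the kind of resource this request wants created automatically can be sketched as below. This is a minimal Python sketch, not the ocs-operator implementation. It assumes the `ceph.rook.io/v1` CephObjectStoreUser resource with `store`, `displayName` and `capabilities` fields under `spec`; the capability values, the display name, and the store name passed in the usage line are all assumptions chosen for the example, since the bug only says "certain permissions".

```python
import json


def build_prometheus_user(store_name, namespace="openshift-storage"):
    """Return a CephObjectStoreUser manifest for "prometheus-user" as a dict.

    The capability values below are hypothetical placeholders; the bug
    report does not list the exact permissions the collector needs.
    """
    return {
        "apiVersion": "ceph.rook.io/v1",
        "kind": "CephObjectStoreUser",
        "metadata": {"name": "prometheus-user", "namespace": namespace},
        "spec": {
            "store": store_name,
            "displayName": "prometheus-user",  # assumed display name
            "capabilities": {  # assumed values, for illustration only
                "user": "read",
                "bucket": "read",
            },
        },
    }


if __name__ == "__main__":
    # The store name here is an assumption; use the name of your object store.
    manifest = build_prometheus_user("ocs-storagecluster-cephobjectstore")
    print(json.dumps(manifest, indent=2))
```

The printed JSON could be applied by hand with `oc apply -f -`, which is the manual step this bug asks to eliminate by having the operator create the resource itself.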

Comment 2 Jiffin 2021-09-14 07:28:45 UTC
The Rook PR that this depends on was merged in v1.7.3, and the OCS-Op PR has been posted: https://github.com/red-hat-storage/ocs-operator/pull/1336

Comment 11 Mudit Agarwal 2021-10-19 13:07:46 UTC
This is not ready for 4.9, moving it to 4.10
For 4.9, we have opened a doc BZ (#2015382) to document the procedure.

Comment 14 Martin Bukatovic 2021-10-26 08:37:30 UTC
(In reply to Mudit Agarwal from comment #11)
> This is not ready for 4.9, moving it to 4.10
> For 4.9, we have opened a doc BZ (#2015382) to document the procedure.

Per comment 3, this is not acceptable as long as RHSTOR-1879 is not pushed out as well.

Moreover, we are going to receive this via the upstream rebase anyway, so I would like to see exactly how we would handle that, no matter which decision is taken in the end.

Please discuss next steps with the QE owner of RHSTOR-1879; until that happens, I'm moving it back. A one-sided override like that is simply not acceptable. Please do not do it again.

Comment 18 Martin Bukatovic 2021-10-26 08:39:10 UTC
The status of this BZ is disputed, and the decision needs to involve both the QE and DEV owners of this bug and RHSTOR-1879.

Comment 21 Mudit Agarwal 2021-10-26 10:45:35 UTC
(In reply to Martin Bukatovic from comment #14)
> (In reply to Mudit Agarwal from comment #11)
> > This is not ready for 4.9, moving it to 4.10
> > For 4.9, we have opened a doc BZ (#2015382) to document the procedure.
> 
> Per comment 3, this is not acceptable as long as RHSTOR-1879 is not pushed
> out as well.
> Moreover we are going to receive this via upstream rebase anyway, so I would
> like to see how would we handle that exactly no matter which decision will
> be taken in the end.

We are not going to receive the complete fix via upstream; only the rook fix is in upstream.
If we had been able to fix this before the dev freeze, it would not have been moved.

> Please consult next steps with QE owner of RHSTOR-1879, until that happens,
> I'm moving it back. One sided override like that is simply not acceptable.
> Please not do it again.

This is a dev preview feature, which means regression only. 
Moreover, this is not a blocker and thus doesn't qualify after we enter dev freeze.
If you want to see this in 4.9, please mark it a blocker with proper justification saying why we should not release 4.9 without this fix.

Comment 26 Martin Bukatovic 2021-10-27 15:42:29 UTC
(In reply to Mudit Agarwal from comment #21)
> We are not going to receive the complete fix via upstream, only rook fix is
> in upstream.
> If we were able to fix this within the dev freeze time, then it would not
> have moved.

I believe that the solution you proposed makes sense from a technical perspective,
but I would still like it to be discussed and aligned with the people assigned
to RHSTOR-1879, including QE, since during the bug triage meeting (when this bug
was acked) we agreed that these tasks are closely related and that work on them
will be coordinated.

> > Please consult next steps with QE owner of RHSTOR-1879, until that happens,
> > I'm moving it back. One sided override like that is simply not acceptable.
> > Please not do it again.
> 
> This is a dev preview feature, which means regression only. 
> Moreover, this is not a blocker and thus doesn't qualify after we enter dev
> freeze.

Dev freeze feature level doesn't afaik imply regression testing only. If that has
been changed, could you provide a reference to program approved definition?

> If you want to see this in 4.9, please mark it a blocker with proper
> justification saying why we should not release 4.9 without this fix.

I'm not against pushing it out. Actually, I would not have provided qa ack if
I hadn't been told it is related to a new feature. But based on what we agreed
on before, the expected course of action here would be:

- consider impact of dropping this BZ on RHSTOR-1879
- sync with dev and qe owners of RHSTOR-1879, and note in this bug that it
  has happened

Maybe there is some existing agreement I'm not aware of, but if that is
the case, let's reference it here.

Looking into known state of this BZ and RHSTOR-1879, I would assume that
neither should be part of the 4.9 release.

Comment 29 Mudit Agarwal 2021-10-27 16:04:22 UTC
(In reply to Martin Bukatovic from comment #26)
> (In reply to Mudit Agarwal from comment #21)
> > We are not going to receive the complete fix via upstream, only rook fix is
> > in upstream.
> > If we were able to fix this within the dev freeze time, then it would not
> > have moved.
> 
> I believe that the solution you proposed makes sense from technical
> perspective,
> but I would still like to have it consulted and aligned with people assigned
> to RHSTOR-1879, including QE, since during bug triage meeting (when this bug
> was acked) we agreed that these tasks are closely related and work on it will
> be coordinated.

I am not proposing any solution; I am just saying that the work is incomplete and we don't have time to finish it in the current release.
Hence we want to move this out, given that it is not a blocker for the release.
Regarding the acks during the triage meeting: when we acked it we were not in a blocker-only phase, but we are now.

> > > Please consult next steps with QE owner of RHSTOR-1879, until that happens,
> > > I'm moving it back. One sided override like that is simply not acceptable.
> > > Please not do it again.
> > 
> > This is a dev preview feature, which means regression only. 
> > Moreover, this is not a blocker and thus doesn't qualify after we enter dev
> > freeze.
> 
> Dev freeze feature level doesn't afaik imply regression testing only. If
> that has
> been changed, could you provide a reference to program approved definition?

Yeah, sorry, my bad: it is not regression only, but functionality that MAY NOT be fully tested.

> > If you want to see this in 4.9, please mark it a blocker with proper
> > justification saying why we should not release 4.9 without this fix.
> 
> I'm not against pushing it out. Actually I would have not provided qa ack if
> I haven't been told it is related to a new feature. But based on what we
> agreed
> on before, the expected course of action here would be:
> 
> - consider impact of dropping this BZ on RHSTOR-1879
> - sync with dev and qe owners of RHSTOR-1879, and note in this bug that it
>   has happened

The impact is mentioned in the comments above: the user will need to perform some manual steps, which will be documented via the doc BZ.

Comment 45 Mudit Agarwal 2021-11-08 13:40:06 UTC
After an offline discussion with Eran and Elad, I am moving this out of 4.9.
I have added it as a known issue and will try to fix it in 4.9.z.

Comment 50 Mudit Agarwal 2022-02-01 13:22:37 UTC
Please test with any of the latest 4.10 builds.

Comment 56 akarsha 2022-03-17 06:27:39 UTC
Version:
OCP: 4.10.0-0.nightly-2022-03-16-000645
ODF: 4.10.0-194
CEPH: 16.2.7-76.el8cp (f4d6ada772570ae8b05c62ad79e222fbd3f04188) pacific (stable)

prometheus-user was created in the openshift-storage namespace and is listed as shown in the sample output. The OBC-related metrics are exported, as shown in the attached screenshots [1] and [2] in comment 54 and comment 55.
Based on these observations, moving the bug to the verified state.

Sample output:

$ oc get cephobjectstoreuser -n openshift-storage
NAME                                     AGE
noobaa-ceph-objectstore-user             23h
ocs-storagecluster-cephobjectstoreuser   23h
prometheus-user                          23h
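
The check above could also be scripted. The following is a minimal Python sketch that parses the tabular output shown in the sample and confirms that `prometheus-user` is listed. The function names are hypothetical, and the sample text is embedded directly rather than obtained by calling `oc`, so the sketch does not depend on a cluster.

```python
# Sample output copied from the verification above.
SAMPLE_OUTPUT = """\
NAME                                     AGE
noobaa-ceph-objectstore-user             23h
ocs-storagecluster-cephobjectstoreuser   23h
prometheus-user                          23h
"""


def resource_names(table_text):
    """Return the first column of each row, skipping the header line."""
    lines = [line for line in table_text.splitlines() if line.strip()]
    return [line.split()[0] for line in lines[1:]]


def has_prometheus_user(table_text):
    """Report whether a row named "prometheus-user" appears in the table."""
    return "prometheus-user" in resource_names(table_text)


if __name__ == "__main__":
    print(has_prometheus_user(SAMPLE_OUTPUT))
```

Against a live cluster, the same functions could be fed the captured standard output of the `oc get cephobjectstoreuser -n openshift-storage` command.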

Comment 58 Jiffin 2022-04-12 05:23:12 UTC
The doc text looks good to me

Comment 60 errata-xmlrpc 2022-04-13 18:49:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.10.0 enhancement, security & bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:1372

Comment 61 Mudit Agarwal 2022-04-15 07:24:37 UTC
Doc text was added, thanks Bipin.

