Bug 1892642

Summary: oauth-server password metrics do not appear in UI after initial OCP installation
Product: OpenShift Container Platform
Reporter: Paul Needle <pneedle>
Component: oauth-apiserver
Assignee: Standa Laznicka <slaznick>
Status: CLOSED ERRATA
QA Contact: pmali
Severity: low
Docs Contact:
Priority: low
Version: 4.6
CC: alegrand, anpicker, aos-bugs, erooth, kakkoyun, lcosic, mfojtik, pkrupa, spasquie, surbania, xxia
Target Milestone: ---
Keywords: UserExperience
Target Release: 4.8.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Some metrics were not properly initialized in code. Consequence: These metrics did not appear in searches in the Prometheus UI. Fix: Initialize the metrics that appeared to be missing. Result: All oauth-server metrics should appear in Prometheus UI metrics searches, even if they have not yet been updated.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-07-27 22:33:58 UTC
Type: Bug
Attachments:
- Auto-complete without the metrics. This screenshot was taken before UWM setup (flags: none)
- Auto-complete with the metrics. This screenshot was taken after UWM setup (flags: none)

Description Paul Needle 2020-10-29 11:29:49 UTC
Description of problem:

The `openshift_auth_basic_password_count` and `openshift_auth_basic_password_count_result` metrics were reintroduced into OCP 4 through https://bugzilla.redhat.com/show_bug.cgi?id=1822122. They appear in the CLI after OCP installation. However, they do not appear in the PromQL prompt in Monitoring -> Metrics until after UWM setup.

This is also the case in the Prometheus UI.

Version-Release number of selected component (if applicable):

OCP 4.4 to 4.6.

How reproducible:

Every time.

Steps to Reproduce:

1. After OCP installation, run the following to list the two metrics:

----
$ token=$(oc sa get-token prometheus-k8s -n openshift-monitoring)
----

----
$ oc get route oauth-openshift -n openshift-authentication
----

----
$ curl -k -H "Authorization: Bearer $token"  https://<oauth_route_output_in_previous_command>/metrics > result
----

----
$ grep -i openshift_auth_basic result
----

In my test example, the output was as follows. The `openshift_auth_basic_password_count` and `openshift_auth_basic_password_count_result` metrics are included:

----
# HELP openshift_auth_basic_password_count [ALPHA] Counts basic password authentication attempts
# TYPE openshift_auth_basic_password_count counter
openshift_auth_basic_password_count 4
# HELP openshift_auth_basic_password_count_result [ALPHA] Counts basic password authentication attempts by result
# TYPE openshift_auth_basic_password_count_result counter
openshift_auth_basic_password_count_result{result="failure"} 2
openshift_auth_basic_password_count_result{result="success"} 2
----

2a. Go to 'Monitoring -> Metrics' in the OCP web console and type `openshift_auth`. The auto-complete feature provides suggestions, but the `openshift_auth_basic_password_count` and `openshift_auth_basic_password_count_result` metrics are not included in that list.

2b. Type in the full metric names `openshift_auth_basic_password_count` and `openshift_auth_basic_password_count_result`. No results are returned.

3. Go to the third-party Prometheus UI. The same is true there.

4. Set up monitoring for user-defined projects. I did this by running the https://github.com/openshift/cluster-monitoring-operator/blob/master/hack/uwm_setup.sh script.

5. Type in the full metric names `openshift_auth_basic_password_count` and `openshift_auth_basic_password_count_result` in the Prometheus UI and results are now provided. The same is then true in the OCP web console.

6. After this, the auto-complete includes the two metrics `openshift_auth_basic_password_count` and `openshift_auth_basic_password_count_result` in the suggested metrics list if you type `openshift_auth`.
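
The CLI check in step 1 can also be scripted. The following is a minimal sketch, assuming the `/metrics` output was saved to a file as in step 1. `has_metric` and `check_metrics` are hypothetical helper names introduced here for illustration; they are not part of `oc` or Prometheus.

```shell
# Hypothetical helper: succeeds if file $1 contains a sample line for metric $2.
# Matches "name value" and "name{labels} value"; "# HELP" and "# TYPE" lines
# do not match because they start with "#".
has_metric() {
  grep -Eq "^$2(\{| )" "$1"
}

# Report the two metrics from this bug against a saved /metrics dump ($1).
check_metrics() {
  for m in openshift_auth_basic_password_count \
           openshift_auth_basic_password_count_result; do
    if has_metric "$1" "$m"; then
      echo "present: $m"
    else
      echo "missing: $m"
    fi
  done
}
```

With the file from step 1, this would be run as `check_metrics result`.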

Actual results:

The `openshift_auth_basic_password_count` and `openshift_auth_basic_password_count_result` metrics do not appear in the metrics UI after an initial OCP installation.
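
The absence can also be checked without the UI, through the Prometheus HTTP API instant-query endpoint (`/api/v1/query`). The following is a rough sketch, not a verified procedure for this cluster: `PROM_HOST` and `TOKEN` are assumed to be set by the caller (for example, to a monitoring route host and the token from step 1), and `query_metric` and `is_empty_result` are hypothetical helper names.

```shell
# Hypothetical helper: succeeds if the /api/v1/query response on stdin
# contains no series. Assumes the compact JSON that Prometheus emits;
# a real JSON parser would be more robust.
is_empty_result() {
  grep -Fq '"result":[]'
}

# Hypothetical helper: run an instant query for the metric name in $1.
# PROM_HOST and TOKEN are assumptions and must be set by the caller.
query_metric() {
  curl -sk -G -H "Authorization: Bearer $TOKEN" \
    --data-urlencode "query=$1" \
    "https://$PROM_HOST/api/v1/query"
}
```

For example, `query_metric openshift_auth_basic_password_count | is_empty_result && echo "no series"` would print `no series` while the metric is absent from Prometheus.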

Expected results:

The two metrics appear in the UI after an initial OCP installation, as they do in the CLI.

Additional info:

I have attached two screenshots:

- ocp_4_6_1_openshift_auth_promql_autocomplete.png. This shows the auto-complete after initial installation without the two metrics.

- ocp_4_6_1_openshift_auth_promql_autocomplete_after_uwm_setup.png. This shows the auto-complete after running the UWM setup script. The suggested metrics list includes the two metrics.

Comment 1 Paul Needle 2020-10-29 11:31:07 UTC
This relates to https://github.com/openshift/openshift-docs/issues/21085.

Can you please advise whether this is by design? I will update the documentation accordingly once the cause has been identified.

Thanks,
Paul.

Comment 2 Paul Needle 2020-10-29 11:41:31 UTC
Created attachment 1725048 [details]
Auto-complete without the metrics. This screenshot was taken before UWM setup

Comment 3 Paul Needle 2020-10-29 11:41:57 UTC
Created attachment 1725049 [details]
Auto-complete with the metrics. This screenshot was taken after UWM setup

Comment 4 Paul Needle 2020-10-29 13:13:06 UTC
From a conversation with our Monitoring Engineering team, I understand that the metrics aren't scraped by prometheus-k8s because the namespace doesn't have the openshift.io/cluster-monitoring="true" label.

Moving the BZ component to oauth-apiserver.

Paul.

Comment 5 Standa Laznicka 2020-10-29 13:37:19 UTC
```
oc get ns openshift-authentication -o json | jq '.metadata.labels'
{
  "olm.operatorgroup.uid/2c885697-4676-49a0-8570-908efe54c41d": "",
  "openshift.io/cluster-monitoring": "true"
}
```

Comment 6 Simon Pasquier 2020-10-29 14:57:22 UTC
@Paul can you run the same command on your cluster?

oc get ns openshift-authentication -o json | jq '.metadata.labels'

Comment 7 Paul Needle 2020-10-29 16:27:29 UTC
@Simon I ran this on a newly deployed cluster which exhibits the issue. The output is as follows:

----
$ oc get ns openshift-authentication -o json | jq '.metadata.labels'
{
  "olm.operatorgroup.uid/65d0d381-1625-483f-b83e-cab4e0002507": "",
  "openshift.io/cluster-monitoring": "true"
}
----

Comment 8 Simon Pasquier 2020-10-29 17:07:02 UTC
@Paul a must-gather would be handy :)

Comment 9 Paul Needle 2020-10-30 11:11:05 UTC
@Simon I have now sent you a link to the must-gather output, which was too large to attach to this BZ.

The archive is from a fresh OCP 4.6.1 installation.

Comment 10 Paul Needle 2020-11-03 10:04:23 UTC
From a review by our Monitoring team, I understand that these metrics likely only appear after a user login attempt because the metrics counters are not initialised at startup.

Can someone please review whether it is possible to initialise the metrics counters at startup?

Thanks,
Paul.

Comment 19 errata-xmlrpc 2021-07-27 22:33:58 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438