Bug 2013528 - mapi_current_pending_csr is always set to 1 on OpenShift Container Platform 4.8
Summary: mapi_current_pending_csr is always set to 1 on OpenShift Container Platform 4.8
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cloud Compute
Version: 4.8
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.10.0
Assignee: Joel Speed
QA Contact: sunzhaohua
URL:
Whiteboard:
Depends On:
Blocks: 2019754 2047702
Reported: 2021-10-13 06:09 UTC by Simon Reber
Modified: 2022-04-11 08:33 UTC (History)
CC List: 2 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: When the last pending CSR was reconciled, no further reconciliation happened to update the value of the metric.
Consequence: The metric would often report 1 when in fact there were no pending CSRs.
Fix: Ensure the metric is updated at the end of each reconcile loop.
Result: The metric should now be up to date at all times.
Clone Of:
Clones: 2019754 2047702
Environment:
Last Closed: 2022-03-10 16:19:33 UTC
Target Upstream Version:
Embargoed:


Attachments
Metrics from mapi_current_pending_csr showing that it's set to one shortly after Cluster installation (64.07 KB, image/png)
2021-10-13 06:12 UTC, Simon Reber


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-machine-approver pull 135 | 0 | None | Closed | VM provisioning still failing due to lack of provider refresh | 2022-05-11 09:06:37 UTC
Red Hat Knowledge Base (Solution) | 6411541 | 0 | None | None | None | 2021-10-13 06:29:42 UTC
Red Hat Product Errata | RHSA-2022:0056 | 0 | None | None | None | 2022-03-10 16:19:54 UTC

Description Simon Reber 2021-10-13 06:09:49 UTC
Description of problem:

After installing or updating to OpenShift Container Platform 4.8, the mapi_current_pending_csr metric from openshift-cluster-machine-approver always reports 1, even though no pending CSR exists:

> $ oc get clusterversion
> NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
> version   4.8.12    True        False         43h     Cluster version is 4.8.12

> $ oc get csr
> No resources found

> $ oc exec -c machine-approver-controller machine-approver-5fcfd56b9d-pc4g8 -- curl -H "Authorization: Bearer XXXXXX" -k https://localhost:9192/metrics | grep mapi_current_pending_csr
>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>                                  Dload  Upload   Total   Spent    Left  Speed
>   0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
> # HELP mapi_current_pending_csr Count of pending CSRs at the cluster level
> # TYPE mapi_current_pending_csr gauge
> mapi_current_pending_csr 1

This was observed on a freshly installed OpenShift Container Platform 4.8 cluster on AWS using IPI. The same behavior can be seen on an OpenShift Container Platform cluster running on OpenStack that was installed using UPI.

Version-Release number of selected component (if applicable):

 - OpenShift Container Platform 4.8.12

How reproducible:

 - So far always


Steps to Reproduce:
1. Install OpenShift Container Platform 4.8, wait some time and check metrics for `mapi_current_pending_csr`
2. Also make sure that there is no pending CSR in the Cluster
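
The comparison in the two steps above can be scripted. The following is a minimal Go sketch, not part of the original report: it assumes the /metrics output has already been captured as a string (for example with the curl command shown in the description), and the helper name gaugeValue is hypothetical. It only handles an unlabeled sample line such as the one this metric produces.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// gaugeValue is a hypothetical helper: it scans Prometheus text exposition
// output for an unlabeled sample of the named metric and returns its value.
// The second return value is false if the metric is absent or unparsable.
func gaugeValue(exposition, name string) (float64, bool) {
	sc := bufio.NewScanner(strings.NewReader(exposition))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		// Skip blank lines and "# HELP" / "# TYPE" comment lines.
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		fields := strings.Fields(line)
		if len(fields) >= 2 && fields[0] == name {
			v, err := strconv.ParseFloat(fields[1], 64)
			if err != nil {
				return 0, false
			}
			return v, true
		}
	}
	return 0, false
}

func main() {
	// Sample taken from the output quoted in the description.
	out := "# HELP mapi_current_pending_csr Count of pending CSRs at the cluster level\n" +
		"# TYPE mapi_current_pending_csr gauge\n" +
		"mapi_current_pending_csr 1\n"

	// Assumption: `oc get csr` returned "No resources found", so 0 are pending.
	pendingCSRs := 0

	v, ok := gaugeValue(out, "mapi_current_pending_csr")
	if ok && int(v) != pendingCSRs {
		fmt.Printf("mismatch: metric=%v, pending CSRs=%d\n", v, pendingCSRs)
	}
}
```

With the sample above, the program reports a mismatch, which is the symptom described in this bug.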

Actual results:

mapi_current_pending_csr is reporting 1 even though no pending CSR is reported

Expected results:

mapi_current_pending_csr should report 0 if there is no pending CSR

Additional info:
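
The cause and fix recorded in the Doc Text field of this bug can be illustrated with a small, self-contained Go sketch. This is not the cluster-machine-approver code: the gauge and cluster types and both reconcile functions are hypothetical stand-ins, and the assumption that the gauge was written before the CSR was approved is one plausible reading of "no further reconciliation happened to update the value of the metric".

```go
package main

import "fmt"

// gauge is a hypothetical stand-in for the mapi_current_pending_csr gauge.
type gauge struct{ value int }

func (g *gauge) Set(v int) { g.value = v }

// cluster is a hypothetical in-memory model holding the set of pending CSRs.
type cluster struct{ pending map[string]bool }

func (c *cluster) countPending() int { return len(c.pending) }

// reconcileStale models the assumed pre-fix behaviour: the gauge is written
// before the CSR is approved, and nothing updates it afterwards.
func reconcileStale(c *cluster, g *gauge, name string) {
	g.Set(c.countPending()) // still counts the CSR that is about to be approved
	delete(c.pending, name) // approve the CSR
}

// reconcileFixed models the fix from the Doc Text: the gauge is updated at
// the end of each reconcile, after the CSR has been handled.
func reconcileFixed(c *cluster, g *gauge, name string) {
	defer func() { g.Set(c.countPending()) }()
	delete(c.pending, name) // approve the CSR
}

func main() {
	g1 := &gauge{}
	c1 := &cluster{pending: map[string]bool{"csr-a": true}}
	reconcileStale(c1, g1, "csr-a")
	fmt.Println("stale: metric =", g1.value, "pending =", c1.countPending())

	g2 := &gauge{}
	c2 := &cluster{pending: map[string]bool{"csr-a": true}}
	reconcileFixed(c2, g2, "csr-a")
	fmt.Println("fixed: metric =", g2.value, "pending =", c2.countPending())
}
```

In this model the stale variant leaves the gauge at 1 after the last pending CSR is approved, while the fixed variant ends at 0, matching the actual and expected results above.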

Comment 1 Simon Reber 2021-10-13 06:12:37 UTC
Created attachment 1832479
Metrics from mapi_current_pending_csr showing that it's set to one shortly after Cluster installation

Hi all,

The attached screenshot shows the history of the mapi_current_pending_csr metric, which is set to 1 shortly after OpenShift Container Platform 4.8 is installed. There are no pending CSRs, and no Node was added, removed or changed after installation, so the cluster has not had a pending CSR for a long time.

Comment 5 sunzhaohua 2021-10-27 06:52:54 UTC
waiting for the new nightly build to test

Comment 7 sunzhaohua 2021-10-29 03:23:15 UTC
verified
clusterversion: 4.10.0-0.nightly-2021-10-28-150422

$ oc get csr | grep Pending 
$

$ oc exec -c machine-approver-controller  machine-approver-86bc4fc875-whdtc  -- curl -k -H "Authorization: Bearer `oc sa get-token prometheus-k8s -n openshift-monitoring`"   -H "Content-type: application/json" https://10.0.139.161:9192/metrics | grep "mapi_current_pending_csr"
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
# HELP mapi_current_pending_csr Count of pending CSRs at the cluster level
# TYPE mapi_current_pending_csr gauge
mapi_current_pending_csr 0

Comment 10 errata-xmlrpc 2022-03-10 16:19:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056

