Bug 1798214 - Send apiserver request-in-flight metrics to telemeter
Summary: Send apiserver request-in-flight metrics to telemeter
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: kube-apiserver
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 4.4.0
Assignee: Abu Kashem
QA Contact: Ke Wang
URL:
Whiteboard:
Duplicates: 1799057
Depends On:
Blocks: 1798215 1799057
Reported: 2020-02-04 20:21 UTC by Abu Kashem
Modified: 2020-05-04 11:33 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1798215 1799057 (view as bug list)
Environment:
Last Closed: 2020-05-04 11:33:13 UTC
Target Upstream Version:


Links
System | ID | Priority | Status | Summary | Last Updated
Github | openshift cluster-kube-apiserver-operator pull 737 | None | closed | Bug 1798214: Add a PrometheusRule to aggregate in-flight requests | 2020-11-12 01:17:58 UTC
Red Hat Product Errata | RHBA-2020:0581 | None | None | None | 2020-05-04 11:33:48 UTC

Description Abu Kashem 2020-02-04 20:21:00 UTC
Send apiserver request-in-flight metrics to telemeter

We want to have an idea of how loaded our API server(s) are. Use the metric apiserver_current_inflight_requests to track the peak number of requests in flight over time.

Comment 2 Ke Wang 2020-02-19 09:59:57 UTC
Holding verification because the telemeter server side has not been synced yet; see https://bugzilla.redhat.com/show_bug.cgi?id=1799057#c5.

Comment 3 Ke Wang 2020-02-21 08:31:11 UTC
Verified with the following OCP environment:
$ oc version
Client Version: v4.4.0
Server Version: 4.4.0-0.nightly-2020-02-18-132334
Kubernetes Version: v1.17.1

Verification steps:

1. Check that the code changes from PR https://github.com/openshift/cluster-kube-apiserver-operator/pull/737 are in:
$ oc get ServiceMonitor -n openshift-kube-apiserver -o yaml
apiVersion: v1
items:
- apiVersion: monitoring.coreos.com/v1
  kind: ServiceMonitor
  ...
  relabelings:
  - action: replace
    replacement: kube-apiserver
    targetLabel: apiserver
  ...

$ oc get PrometheusRule -n openshift-kube-apiserver -o yaml
...
    - name: apiserver-requests-in-flight
      rules:
      - expr: |
          max_over_time(sum(apiserver_current_inflight_requests{apiserver=~"openshift-apiserver|kube-apiserver"}) by (apiserver,requestKind)[2m:])
        record: cluster:apiserver_current_inflight_requests:sum:max_over_time:2m
...

$ oc -n openshift-monitoring get cm telemetry-config -oyaml | grep "cluster:apiserver_current_inflight_requests:sum:max_over_time:2m" 
    # cluster:apiserver_current_inflight_requests:sum:max_over_time:2m gives maximum number of requests in flight
    - '{__name__="cluster:apiserver_current_inflight_requests:sum:max_over_time:2m"}'

The code changes are present as expected.
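
As a rough illustration of what the recording rule above computes, the following Python sketch mimics the two stages of the expression: a sum by (apiserver, requestKind) at each evaluation step, then the maximum per group across the steps that fall inside the 2m window. The sample values and the instance label are made up for illustration, and the helper names (sum_by, max_over_window) are hypothetical. This is only a model of the arithmetic, not how Prometheus is implemented.

```python
from collections import defaultdict


def sum_by(samples, keys):
    """Sum sample values sharing the same values for `keys`.

    Models PromQL `sum(...) by (keys)` for one evaluation step.
    `samples` is a list of (labels_dict, value) pairs.
    """
    totals = defaultdict(float)
    for labels, value in samples:
        group = tuple(labels[k] for k in keys)
        totals[group] += value
    return dict(totals)


def max_over_window(evaluations):
    """Per group, keep the largest value seen across all evaluation steps.

    Models PromQL `max_over_time(<subquery>[2m:])`, where `evaluations`
    holds the inner expression's result at each step inside the window.
    """
    peaks = {}
    for totals in evaluations:
        for group, value in totals.items():
            peaks[group] = max(value, peaks.get(group, value))
    return peaks


# Hypothetical data: two evaluation steps, two kube-apiserver instances.
steps = [
    [
        ({"apiserver": "kube-apiserver", "requestKind": "mutating", "instance": "a"}, 1),
        ({"apiserver": "kube-apiserver", "requestKind": "mutating", "instance": "b"}, 2),
        ({"apiserver": "kube-apiserver", "requestKind": "readOnly", "instance": "a"}, 2),
        ({"apiserver": "kube-apiserver", "requestKind": "readOnly", "instance": "b"}, 1),
    ],
    [
        ({"apiserver": "kube-apiserver", "requestKind": "mutating", "instance": "a"}, 3),
        ({"apiserver": "kube-apiserver", "requestKind": "mutating", "instance": "b"}, 1),
        ({"apiserver": "kube-apiserver", "requestKind": "readOnly", "instance": "a"}, 1),
        ({"apiserver": "kube-apiserver", "requestKind": "readOnly", "instance": "b"}, 1),
    ],
]

group_keys = ("apiserver", "requestKind")
peaks = max_over_window([sum_by(step, group_keys) for step in steps])
print(peaks)
```

In this model the per-instance values are summed first, so the recorded value is the peak of the cluster-wide total for each (apiserver, requestKind) pair, not the peak of any single instance.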

2. Check that the feature works with Metrics.

Open the OCP cluster web console. In the left panel, navigate to Monitoring -> Metrics, enter the query ‘cluster:apiserver_current_inflight_requests:sum:max_over_time:2m’ in the query text area of the displayed page, and click ‘Run Queries’.
Four series, covering openshift-apiserver and kube-apiserver, are displayed. The Value column shows the peak number of in-flight requests over the last 2 minutes.

The feature works as expected.

Comment 4 Ke Wang 2020-03-05 04:28:47 UTC
Result of the ‘cluster:apiserver_current_inflight_requests:sum:max_over_time:2m’ query in Prometheus:

Element                                                                                                                   Value
cluster:apiserver_current_inflight_requests:sum:max_over_time:2m{apiserver="kube-apiserver",requestKind="mutating"}       4
cluster:apiserver_current_inflight_requests:sum:max_over_time:2m{apiserver="kube-apiserver",requestKind="readOnly"}       6
cluster:apiserver_current_inflight_requests:sum:max_over_time:2m{apiserver="openshift-apiserver",requestKind="mutating"}  1
cluster:apiserver_current_inflight_requests:sum:max_over_time:2m{apiserver="openshift-apiserver",requestKind="readOnly"}  3
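
The same result could also be read programmatically. The sketch below builds an instant-query URL for the Prometheus HTTP API (/api/v1/query) and parses a response body into a mapping keyed by (apiserver, requestKind). The base URL is a placeholder, the helper names are hypothetical, and the sample body only reproduces the two kube-apiserver values from the table above with a made-up timestamp. How to reach and authenticate against the in-cluster Prometheus is not covered here.

```python
import json
from urllib.parse import urlencode

RULE = "cluster:apiserver_current_inflight_requests:sum:max_over_time:2m"


def build_query_url(base_url, expr=RULE):
    """Return the instant-query URL for `expr` on a Prometheus server."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': expr})}"


def parse_vector(body):
    """Parse an instant-query response into {(apiserver, requestKind): value}."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector, got {data['resultType']}")
    result = {}
    for series in data["result"]:
        labels = series["metric"]
        _timestamp, raw_value = series["value"]  # sample values arrive as strings
        result[(labels.get("apiserver"), labels.get("requestKind"))] = float(raw_value)
    return result


# Sample body in the documented response shape; the timestamp is hypothetical.
SAMPLE_BODY = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": RULE, "apiserver": "kube-apiserver",
                        "requestKind": "mutating"},
             "value": [1583382527.0, "4"]},
            {"metric": {"__name__": RULE, "apiserver": "kube-apiserver",
                        "requestKind": "readOnly"},
             "value": [1583382527.0, "6"]},
        ],
    },
})

print(build_query_url("https://prometheus.example"))  # placeholder host
print(parse_vector(SAMPLE_BODY))
```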

Comment 5 Ke Wang 2020-03-05 08:58:03 UTC
*** Bug 1799057 has been marked as a duplicate of this bug. ***

Comment 7 errata-xmlrpc 2020-05-04 11:33:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0581

