Bug 1776133 - template-service-broker-operator doesn't notify users and admins via alerts in prometheus
Summary: template-service-broker-operator doesn't notify users and admins via alerts i...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Service Broker
Version: 4.3.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.3.z
Assignee: Jesus M. Rodriguez
QA Contact: Cuiping HUO
URL:
Whiteboard:
Depends On: 1782061
Blocks:
 
Reported: 2019-11-25 08:14 UTC by Cuiping HUO
Modified: 2020-06-17 20:28 UTC (History)
5 users (show)

Fixed In Version: jfan@redhat.com, jiazha@redhat.com, chezhang@redhat.com
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1782061 (view as bug list)
Environment:
Last Closed: 2020-06-17 20:27:11 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Github openshift template-service-broker-operator pull 102 0 None closed Bug 1776133: Allow the PrometheusAlert to fire. 2020-08-14 08:53:31 UTC
Github openshift template-service-broker-operator pull 64 0 None closed Bug 1776133: Allow TSBO to access endpoints 2020-08-14 08:53:31 UTC
Red Hat Product Errata RHBA-2020:2436 0 None None None 2020-06-17 20:28:20 UTC

Description Cuiping HUO 2019-11-25 08:14:14 UTC
Description of problem:
template-service-broker-operator doesn't notify users and admins via alerts in prometheus

Version-Release number of selected component (if applicable):
4.3.0-0.nightly-2019-11-24-183610
tsb operator commit.id:27d3ed355ca2ef9d6cf59df940ff7ecac27e6591

How reproducible:
Always

Steps to Reproduce:
1. Install tsb operator 
2. check prometheus rule for the template service broker operator
3. check prometheus targets for the template service broker operator

Actual results:
2. no rule for the template service broker operator
3. no targets for the template service broker operator

Expected results:
2. prometheus rule with alert name TemplateServiceBrokerEnabled can be found
3. prometheus targets for template-service-broker-operator with an Endpoint can be found

Additional info:
$ oc get csv -n openshift-template-service-broker
NAME                                                        DISPLAY                                      VERSION              REPLACES   PHASE
openshifttemplateservicebrokeroperator.4.3.0-201911220712   OpenShift Template Service Broker Operator   4.3.0-201911220712              Succeeded
$ oc image info registry-proxy.engineering.redhat.com/rh-osbs/openshift-ose-template-service-broker-operator:v4.3.0-201911220712 --filter-by-os linux/amd64| grep commit.id
               io.openshift.build.commit.id=27d3ed355ca2ef9d6cf59df940ff7ecac27e6591

Comment 2 Cuiping HUO 2019-12-12 10:31:31 UTC
When installing the TSB according to
https://docs.openshift.com/container-platform/4.2/applications/service_brokers/installing-template-service-broker.html

there is no requirement to have the monitoring label on the namespace, so I added the label manually.
$ oc get ns openshift-template-service-broker --show-labels
NAME                                STATUS   AGE   LABELS
openshift-template-service-broker   Active   5h    openshift.io/cluster-monitoring=true
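
The manual labelling step described above is usually done with `oc label`. The following is a minimal sketch, not taken from the original report: `has_monitoring_label` is a hypothetical helper that checks `--show-labels` output for the cluster-monitoring label, and the `oc` commands in the comments assume the namespace name used in this bug.

```shell
# Hypothetical helper: reads `oc get ns <name> --show-labels` output on stdin
# and succeeds if the cluster-monitoring label is set to "true".
has_monitoring_label() {
  awk 'NR > 1 {
         # The LABELS column is the last field; labels are comma-separated.
         n = split($NF, labels, ",")
         for (i = 1; i <= n; i++)
           if (labels[i] == "openshift.io/cluster-monitoring=true") found = 1
       }
       END { exit found ? 0 : 1 }'
}

# Assumed usage against a live cluster (not run here):
#   oc label namespace openshift-template-service-broker openshift.io/cluster-monitoring=true
#   oc get ns openshift-template-service-broker --show-labels | has_monitoring_label && echo labelled
```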

After that, the prometheus rule with alert name TemplateServiceBrokerEnabled can be found, and the targets for the template service broker operator also show the Endpoint:
alert: TemplateServiceBrokerEnabled
expr: templateservicebroker_info{namespace="openshift-template-service-broker",templateservicebroker="template-service-broker"}
  > 0
labels:
  severity: warning
annotations:
  summary: Indicates whether the Template Service Broker is enabled

$ oc get ep -n openshift-template-service-broker
NAME                                                 ENDPOINTS          AGE
apiserver                                            10.130.2.10:8443   100m
openshift-template-service-broker-operator-metrics   <none>             100m
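
In the listing above, the operator metrics Endpoints object has no addresses (`<none>`). A small filter can spot such objects. This is a sketch only: `endpoints_without_addresses` is a hypothetical helper name, and it assumes the default column layout of `oc get ep`.

```shell
# Hypothetical helper: reads `oc get ep -n <namespace>` output on stdin and
# prints the name of every Endpoints object whose ENDPOINTS column is <none>.
endpoints_without_addresses() {
  awk 'NR > 1 && $2 == "<none>" { print $1 }'
}

# Assumed usage against a live cluster (not run here):
#   oc get ep -n openshift-template-service-broker | endpoints_without_addresses
```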

But the metrics had a problem and TemplateServiceBrokerEnabled was not firing:
$ token=`oc -n openshift-monitoring sa get-token prometheus-k8s`
$ oc -n openshift-monitoring exec -c prometheus prometheus-k8s-1  -- curl -k -H "Authorization: Bearer $token" 'https://10.130.2.10:8443/metrics'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
# HELP apiserver_audit_event_total Counter of audit events generated and sent to the audit backend.
# TYPE apiserver_audit_event_total counter
apiserver_audit_event_total 0
# HELP apiserver_audit_requests_rejected_total Counter of apiserver requests rejected due to an error in audit logging backend.
# TYPE apiserver_audit_requests_rejected_total counter
apiserver_audit_requests_rejected_total 0
# HELP apiserver_client_certificate_expiration_seconds Distribution of the remaining lifetime on the certificate used to authenticate a request.
# TYPE apiserver_client_certificate_expiration_seconds histogram
apiserver_client_certificate_expiration_seconds_bucket{le="0"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="1800"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="3600"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="7200"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="21600"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="43200"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="86400"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="172800"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="345600"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="604800"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="2.592e+06"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="7.776e+06"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="1.5552e+07"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="3.1104e+07"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="+Inf"} 0
apiserver_client_certificate_expiration_seconds_sum 0
apiserver_client_certificate_expiration_seconds_count 0
# HELP apiserver_current_inflight_requests Maximal number of currently used inflight request limit of this apiserver per request kind in last second.
# TYPE apiserver_current_inflight_requests gauge
apiserver_current_inflight_requests{requestKind="mutating"} 0
apiserver_current_inflight_requests{requestKind="readOnly"} 0
# HELP apiserver_storage_data_key_generation_duration_seconds Latencies in seconds of data encryption key(DEK) generation operations.
# TYPE apiserver_storage_data_key_generation_duration_seconds histogram
apiserver_storage_data_key_generation_duration_seconds_bucket{le="5e-06"} 0
apiserver_storage_data_key_generation_duration_seconds_bucket{le="1e-05"} 0
apiserver_storage_data_key_generation_duration_seconds_bucket{le="2e-05"} 0
apiserver_storage_data_key_generation_duration_seconds_bucket{le="4e-05"} 0
apiserver_storage_data_key_generation_duration_seconds_bucket{le="8e-05"} 0
apiserver_storage_data_key_generation_duration_seconds_bucket{le="0.00016"} 0
apiserver_storage_data_key_generation_duration_seconds_bucket{le="0.00032"} 0
apiserver_storage_data_key_generation_duration_seconds_bucket{le="0.00064"} 0
apiserver_storage_data_key_generation_duration_seconds_bucket{le="0.00128"} 0
apiserver_storage_data_key_generation_duration_seconds_bucket{le="0.00256"} 0
apiserver_storage_data_key_generation_duration_seconds_bucket{le="0.00512"} 0
apiserver_storage_data_key_generation_duration_seconds_bucket{le="0.01024"} 0
apiserver_storage_data_key_generation_duration_seconds_bucket{le="0.02048"} 0
apiserver_storage_data_key_generation_duration_seconds_bucket{le="0.04096"} 0
apiserver_storage_data_key_generation_duration_seconds_bucket{le="+Inf"} 0
apiserver_storage_data_key_generation_duration_seconds_sum 0
apiserver_storage_data_key_generation_duration_seconds_count 0
# HELP apiserver_storage_data_key_generation_failures_total Total number of failed data encryption key(DEK) generation operations.
# TYPE apiserver_storage_data_key_generation_failures_total counter
apiserver_storage_data_key_generation_failures_total 0
# HELP apiserver_storage_data_key_generation_latencies_microseconds (Deprecated) Latencies in microseconds of data encryption key(DEK) generation operations.
# TYPE apiserver_storage_data_key_generation_latencies_microseconds histogram
apiserver_storage_data_key_generation_latencies_microseconds_bucket{le="5"} 0
apiserver_storage_data_key_generation_latencies_microseconds_bucket{le="10"} 0
apiserver_storage_data_key_generation_latencies_microseconds_bucket{le="20"} 0
apiserver_storage_data_key_generation_latencies_microseconds_bucket{le="40"} 0
apiserver_storage_data_key_generation_latencies_microseconds_bucket{le="80"} 0
apiserver_storage_data_key_generation_latencies_microseconds_bucket{le="160"} 0
apiserver_storage_data_key_generation_latencies_microseconds_bucket{le="320"} 0
apiserver_storage_data_key_generation_latencies_microseconds_bucket{le="640"} 0
apiserver_storage_data_key_generation_latencies_microseconds_bucket{le="1280"} 0
apiserver_storage_data_key_generation_latencies_microseconds_bucket{le="2560"} 0
apiserver_storage_data_key_generation_latencies_microseconds_bucket{le="5120"} 0
apiserver_storage_data_key_generation_latencies_microseconds_bucket{le="10240"} 0
apiserver_storage_data_key_generation_latencies_microseconds_bucket{le="20480"} 0
apiserver_storage_data_key_generation_latencies_microseconds_bucket{le="40960"} 0
apiserver_storage_data_key_generation_latencies_microseconds_bucket{le="+Inf"} 0
apiserver_storage_data_key_generation_latencies_microseconds_sum 0
apiserver_storage_data_key_generation_latencies_microseconds_count 0
# HELP apiserver_storage_envelope_transformation_cache_misses_total Total number of cache misses while accessing key decryption key(KEK).
# TYPE apiserver_storage_envelope_transformation_cache_misses_total counter
apiserver_storage_envelope_transformation_cache_misses_total 0
# HELP authenticated_user_requests Counter of authenticated requests broken out by username.
# TYPE authenticated_user_requests counter
authenticated_user_requests{username="other"} 392
# HELP etcd_helper_cache_entry_count (Deprecated) Counter of etcd helper cache entries. This can be different from etcd_helper_cache_miss_count because two concurrent threads can miss the cache and generate the same entry twice.
# TYPE etcd_helper_cache_entry_count counter
etcd_helper_cache_entry_count 0
# HELP etcd_helper_cache_entry_total Counter of etcd helper cache entries. This can be different from etcd_helper_cache_miss_count because two concurrent threads can miss the cache and generate the same entry twice.
# TYPE etcd_helper_cache_entry_total counter
etcd_helper_cache_entry_total 0
# HELP etcd_helper_cache_hit_count (Deprecated) Counter of etcd helper cache hits.
# TYPE etcd_helper_cache_hit_count counter
etcd_helper_cache_hit_count 0
# HELP etcd_helper_cache_hit_total Counter of etcd helper cache hits.
# TYPE etcd_helper_cache_hit_total counter
etcd_helper_cache_hit_total 0
# HELP etcd_helper_cache_miss_count (Deprecated) Counter of etcd helper cache miss.
# TYPE etcd_helper_cache_miss_count counter
etcd_helper_cache_miss_count 0
# HELP etcd_helper_cache_miss_total Counter of etcd helper cache miss.
# TYPE etcd_helper_cache_miss_total counter
etcd_helper_cache_miss_total 0
# HELP etcd_request_cache_add_duration_seconds Latency in seconds of adding an object to etcd cache
# TYPE etcd_request_cache_add_duration_seconds histogram
etcd_request_cache_add_duration_seconds_bucket{le="0.005"} 0
etcd_request_cache_add_duration_seconds_bucket{le="0.01"} 0
etcd_request_cache_add_duration_seconds_bucket{le="0.025"} 0
etcd_request_cache_add_duration_seconds_bucket{le="0.05"} 0
etcd_request_cache_add_duration_seconds_bucket{le="0.1"} 0
etcd_request_cache_add_duration_seconds_bucket{le="0.25"} 0
etcd_request_cache_add_duration_seconds_bucket{le="0.5"} 0
etcd_request_cache_add_duration_seconds_bucket{le="1"} 0
etcd_request_cache_add_duration_seconds_bucket{le="2.5"} 0
etcd_request_cache_add_duration_seconds_bucket{le="5"} 0
etcd_request_cache_add_duration_seconds_bucket{le="10"} 0
etcd_request_cache_add_duration_seconds_bucket{le="+Inf"} 0
etcd_request_cache_add_duration_seconds_sum 0
etcd_request_cache_add_duration_seconds_count 0
# HELP etcd_request_cache_add_latencies_summary (Deprecated) Latency in microseconds of adding an object to etcd cache
# TYPE etcd_request_cache_add_latencies_summary summary
etcd_request_cache_add_latencies_summary{quantile="0.5"} NaN
etcd_request_cache_add_latencies_summary{quantile="0.9"} NaN
etcd_request_cache_add_latencies_summary{quantile="0.99"} NaN
etcd_request_cache_add_latencies_summary_sum 0
etcd_request_cache_add_latencies_summary_count 0
# HELP etcd_request_cache_get_duration_seconds Latency in seconds of getting an object from etcd cache
# TYPE etcd_request_cache_get_duration_seconds histogram
etcd_request_cache_get_duration_seconds_bucket{le="0.005"} 0
etcd_request_cache_get_duration_seconds_bucket{le="0.01"} 0
etcd_request_cache_get_duration_seconds_bucket{le="0.025"} 0
etcd_request_cache_get_duration_seconds_bucket{le="0.05"} 0
etcd_request_cache_get_duration_seconds_bucket{le="0.1"} 0
etcd_request_cache_get_duration_seconds_bucket{le="0.25"} 0
etcd_request_cache_get_duration_seconds_bucket{le="0.5"} 0
etcd_request_cache_get_duration_seconds_bucket{le="1"} 0
etcd_request_cache_get_duration_seconds_bucket{le="2.5"} 0
etcd_request_cache_get_duration_seconds_bucket{le="5"} 0
etcd_request_cache_get_duration_seconds_bucket{le="10"} 0
etcd_request_cache_get_duration_seconds_bucket{le="+Inf"} 0
etcd_request_cache_get_duration_seconds_sum 0
etcd_request_cache_get_duration_seconds_count 0
# HELP etcd_request_cache_get_latencies_summary (Deprecated) Latency in microseconds of getting an object from etcd cache
# TYPE etcd_request_cache_get_latencies_summary summary
etcd_request_cache_get_latencies_summary{quantile="0.5"} NaN
etcd_request_cache_get_latencies_summary{quantile="0.9"} NaN
etcd_request_cache_get_latencies_summary{quantile="0.99"} NaN
etcd_request_cache_get_latencies_summary_sum 0
etcd_request_cache_get_latencies_summary_count 0
# HELP go_gc_duration_seconds A summary of the GC invocation durations.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 1.343e-05
go_gc_duration_seconds{quantile="0.25"} 2.1515e-05
go_gc_duration_seconds{quantile="0.5"} 3.243e-05
go_gc_duration_seconds{quantile="0.75"} 5.3755e-05
go_gc_duration_seconds{quantile="1"} 0.000303093
go_gc_duration_seconds_sum 0.003170415
go_gc_duration_seconds_count 56
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 31
# HELP go_info Information about the Go environment.
# TYPE go_info gauge
go_info{version="go1.12.12"} 1
# HELP go_memstats_alloc_bytes Number of bytes allocated and still in use.
# TYPE go_memstats_alloc_bytes gauge
go_memstats_alloc_bytes 8.297024e+06
# HELP go_memstats_alloc_bytes_total Total number of bytes allocated, even if freed.
# TYPE go_memstats_alloc_bytes_total counter
go_memstats_alloc_bytes_total 2.9249872e+08
# HELP go_memstats_buck_hash_sys_bytes Number of bytes used by the profiling bucket hash table.
# TYPE go_memstats_buck_hash_sys_bytes gauge
go_memstats_buck_hash_sys_bytes 1.525793e+06
# HELP go_memstats_frees_total Total number of frees.
# TYPE go_memstats_frees_total counter
go_memstats_frees_total 1.944657e+06
# HELP go_memstats_gc_cpu_fraction The fraction of this program's available CPU time used by the GC since the program started.
# TYPE go_memstats_gc_cpu_fraction gauge
go_memstats_gc_cpu_fraction 1.524085766737999e-05
# HELP go_memstats_gc_sys_bytes Number of bytes used for garbage collection system metadata.
# TYPE go_memstats_gc_sys_bytes gauge
go_memstats_gc_sys_bytes 2.414592e+06
# HELP go_memstats_heap_alloc_bytes Number of heap bytes allocated and still in use.
# TYPE go_memstats_heap_alloc_bytes gauge
go_memstats_heap_alloc_bytes 8.297024e+06
# HELP go_memstats_heap_idle_bytes Number of heap bytes waiting to be used.
# TYPE go_memstats_heap_idle_bytes gauge
go_memstats_heap_idle_bytes 5.2240384e+07
# HELP go_memstats_heap_inuse_bytes Number of heap bytes that are in use.
# TYPE go_memstats_heap_inuse_bytes gauge
go_memstats_heap_inuse_bytes 1.4082048e+07
# HELP go_memstats_heap_objects Number of allocated objects.
# TYPE go_memstats_heap_objects gauge
go_memstats_heap_objects 55958
# HELP go_memstats_heap_released_bytes Number of heap bytes released to OS.
# TYPE go_memstats_heap_released_bytes gauge
go_memstats_heap_released_bytes 3.6470784e+07
# HELP go_memstats_heap_sys_bytes Number of heap bytes obtained from system.
# TYPE go_memstats_heap_sys_bytes gauge
go_memstats_heap_sys_bytes 6.6322432e+07
# HELP go_memstats_last_gc_time_seconds Number of seconds since 1970 of last garbage collection.
# TYPE go_memstats_last_gc_time_seconds gauge
go_memstats_last_gc_time_seconds 1.5761464648140407e+09
# HELP go_memstats_lookups_total Total number of pointer lookups.
# TYPE go_memstats_lookups_total counter
go_memstats_lookups_total 0
# HELP go_memstats_mallocs_total Total number of mallocs.
# TYPE go_memstats_mallocs_total counter
go_memstats_mallocs_total 2.000615e+06
# HELP go_memstats_mcache_inuse_bytes Number of bytes in use by mcache structures.
# TYPE go_memstats_mcache_inuse_bytes gauge
go_memstats_mcache_inuse_bytes 6944
# HELP go_memstats_mcache_sys_bytes Number of bytes used for mcache structures obtained from system.
# TYPE go_memstats_mcache_sys_bytes gauge
go_memstats_mcache_sys_bytes 16384
# HELP go_memstats_mspan_inuse_bytes Number of bytes in use by mspan structures.
# TYPE go_memstats_mspan_inuse_bytes gauge
go_memstats_mspan_inuse_bytes 197136
# HELP go_memstats_mspan_sys_bytes Number of bytes used for mspan structures obtained from system.
# TYPE go_memstats_mspan_sys_bytes gauge
go_memstats_mspan_sys_bytes 327680
# HELP go_memstats_next_gc_bytes Number of heap bytes when next garbage collection will take place.
# TYPE go_memstats_next_gc_bytes gauge
go_memstats_next_gc_bytes 1.5886224e+07
# HELP go_memstats_other_sys_bytes Number of bytes used for other system allocations.
# TYPE go_memstats_other_sys_bytes gauge
go_memstats_other_sys_bytes 893143
# HELP go_memstats_stack_inuse_bytes Number of bytes in use by the stack allocator.
# TYPE go_memstats_stack_inuse_bytes gauge
go_memstats_stack_inuse_bytes 786432
# HELP go_memstats_stack_sys_bytes Number of bytes obtained from system for stack allocator.
# TYPE go_memstats_stack_sys_bytes gauge
go_memstats_stack_sys_bytes 786432
# HELP go_memstats_sys_bytes Number of bytes obtained from system.
# TYPE go_memstats_sys_bytes gauge
go_memstats_sys_bytes 7.2286456e+07
# HELP go_threads Number of OS threads created.
# TYPE go_threads gauge
go_threads 14
# HELP http_request_duration_microseconds The HTTP request latencies in microseconds.
# TYPE http_request_duration_microseconds summary
http_request_duration_microseconds{handler="prometheus",quantile="0.5"} NaN
http_request_duration_microseconds{handler="prometheus",quantile="0.9"} NaN
http_request_duration_microseconds{handler="prometheus",quantile="0.99"} NaN
http_request_duration_microseconds_sum{handler="prometheus"} 0
http_request_duration_microseconds_count{handler="prometheus"} 0
# HELP http_request_size_bytes The HTTP request sizes in bytes.
# TYPE http_request_size_bytes summary
http_request_size_bytes{handler="prometheus",quantile="0.5"} NaN
http_request_size_bytes{handler="prometheus",quantile="0.9"} NaN
http_request_size_bytes{handler="prometheus",quantile="0.99"} NaN
http_request_size_bytes_sum{handler="prometheus"} 0
http_request_size_bytes_count{handler="prometheus"} 0
# HELP http_response_size_bytes The HTTP response sizes in bytes.
# TYPE http_response_size_bytes summary
http_response_size_bytes{handler="prometheus",quantile="0.5"} NaN
http_response_size_bytes{handler="prometheus",quantile="0.9"} NaN
http_response_size_bytes{handler="prometheus",quantile="0.99"} NaN
http_response_size_bytes_sum{handler="prometheus"} 0
http_response_size_bytes_count{handler="prometheus"} 0
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 5.45
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 1.048576e+06
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 8
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 5.4632448e+07
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.57614266293e+09
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 6.71289344e+08
# HELP process_virtual_memory_max_bytes Maximum amount of virtual memory available in bytes.
# TYPE process_virtual_memory_max_bytes gauge
process_virtual_memory_max_bytes -1
# HELP template_service_broker_build_info A metric with a constant '1' value labeled by major, minor, git commit & git version from which Template Service Broker was built.
# TYPE template_service_broker_build_info gauge
template_service_broker_build_info{gitCommit="58203d1a763764c7eec567473304e831946ab8c3",gitVersion="v0.0.0-alpha.0-5-g58203d1a",major="",minor=""} 1
100 17287    0 17287    0     0   130k      0 --:--:-- --:--:-- --:--:--  129k
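
A full `/metrics` dump like the one above is hard to scan, and curl's progress meter is written to the terminal alongside it. Passing `-s` to curl suppresses the meter, and a small filter can narrow the output to one metric. The sketch below is an illustration only: `metric_samples` is a hypothetical helper, and the `oc` invocation in the comment reuses the address from this comment as an assumption.

```shell
# Hypothetical helper: reads Prometheus text-format metrics on stdin and prints
# only the sample lines (no "# HELP" / "# TYPE" comments) for one metric name.
metric_samples() {
  awk -v name="$1" '
    /^#/ { next }                     # skip HELP/TYPE comment lines
    {
      metric = $1
      sub(/[{].*/, "", metric)        # drop the {label="..."} part, if any
      if (metric == name) print       # exact name match, not a prefix match
    }'
}

# Assumed usage against a live cluster (not run here); -s hides the progress meter:
#   oc -n openshift-monitoring exec -c prometheus prometheus-k8s-1 -- \
#     curl -sk -H "Authorization: Bearer $token" 'https://10.130.2.10:8443/metrics' \
#     | metric_samples template_service_broker_build_info
```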

Comment 4 Cuiping HUO 2019-12-13 08:32:43 UTC
Verification failed on 3 points:
1. Alert TemplateServiceBrokerEnabled is not firing.
2. The prometheus rule with alert name TemplateServiceBrokerEnabled should have a message field.
3. The metric name is template_service_broker_build_info; there is no template_service_broker_enabled metric.

cluster version:4.3.0-0.nightly-2019-12-12-004325
tsb commit.id: 0227d00b0e5aaa2c48cc1f3756cb09b04bc83c1f

$ oc get ep -n openshift-template-service-broker
NAME                                                 ENDPOINTS                           AGE
apiserver                                            10.128.2.23:8443                    108s
openshift-template-service-broker-operator-metrics   10.131.0.16:8383,10.131.0.16:8686   10m

$ token=`oc -n openshift-monitoring sa get-token prometheus-k8s`
$ oc -n openshift-monitoring exec -c prometheus prometheus-k8s-1  -- curl -k -H "Authorization: Bearer $token" 'https://10.128.2.23:8443/metrics' | grep template_service_broker
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 17462    0 17462    0     0   136k      0 --:--:-- --:--:-- --:--:--  137k
# HELP template_service_broker_build_info A metric with a constant '1' value labeled by major, minor, git commit & git version from which Template Service Broker was built.
# TYPE template_service_broker_build_info gauge
template_service_broker_build_info{gitCommit="58203d1a763764c7eec567473304e831946ab8c3",gitVersion="v0.0.0-alpha.0-5-g58203d1a",major="",minor=""} 1


$ oc get csv -n openshift-template-service-broker
NAME                                                        DISPLAY                                      VERSION              REPLACES   PHASE
openshifttemplateservicebrokeroperator.4.3.0-201912122317   OpenShift Template Service Broker Operator   4.3.0-201912122317              Succeeded

Comment 5 Cuiping HUO 2019-12-19 06:57:58 UTC
The templateservicebroker_info metric works, but:
1. Alert TemplateServiceBrokerEnabled is not firing.
2. The prometheus rule with alert name TemplateServiceBrokerEnabled should have a message field.

$ oc get ep -n openshift-template-service-broker
NAME                                                 ENDPOINTS                           AGE
apiserver                                            10.128.2.27:8443                    4m51s
openshift-template-service-broker-operator-metrics   10.128.2.25:8383,10.128.2.25:8686   5m6s

$ oc -n openshift-monitoring exec prometheus-k8s-1 -c prometheus -- curl 'http://10.128.2.25:8686/metrics' | grep templateservicebroker_info
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   259  100   259    0     0   158k      0 --:--:-- --:--:-- --:--:--  252k
# HELP templateservicebroker_info Information about the TemplateServiceBroker custom resource.
# TYPE templateservicebroker_info gauge
templateservicebroker_info{namespace="openshift-template-service-broker",templateservicebroker="template-service-broker"} 1

$ oc get csv -n openshift-template-service-broker
NAME                                                        DISPLAY                                      VERSION              REPLACES   PHASE
openshifttemplateservicebrokeroperator.4.3.0-201912171717   OpenShift Template Service Broker Operator   4.3.0-201912171717              Succeeded
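
The rule expression quoted in Comment 2 compares `templateservicebroker_info` against 0. As a rough local check, the same comparison can be applied to scraped metrics text. This is a sketch under assumptions: `metric_is_positive` is a hypothetical helper, it ignores label matchers, and it assumes the value is the last field on the line (no trailing timestamp). It does not replace Prometheus rule evaluation; if it succeeds while the alert still does not fire, the cause may lie in scraping or rule loading rather than in the metric itself.

```shell
# Hypothetical helper: reads Prometheus text-format metrics on stdin and
# succeeds if any sample of the named metric has a value greater than 0.
# It only mimics the "> 0" comparison; label matchers are ignored.
metric_is_positive() {
  awk -v name="$1" '
    /^#/ { next }                     # skip HELP/TYPE comment lines
    {
      metric = $1
      sub(/[{].*/, "", metric)        # strip the label set, if any
      if (metric == name && $NF + 0 > 0) positive = 1
    }
    END { exit positive ? 0 : 1 }'
}

# Assumed usage against a live cluster (not run here):
#   oc -n openshift-monitoring exec prometheus-k8s-1 -c prometheus -- \
#     curl -s 'http://10.128.2.25:8686/metrics' \
#     | metric_is_positive templateservicebroker_info && echo "expression would match"
```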

Comment 7 Jesus M. Rodriguez 2020-05-15 03:32:52 UTC
Bug has been reviewed this sprint.

Comment 11 Cuiping HUO 2020-06-01 08:41:50 UTC
Verified.

template-service-broker-operator version: 4.3.24-202005300952
The templateservicebroker_info metric works and TemplateServiceBrokerEnabled is firing.
Since https://bugzilla.redhat.com/show_bug.cgi?id=1841099 was closed as WONTFIX, "prometheus rule with alert name TemplateServiceBrokerEnabled should have a message field" is no longer an issue.

$ oc get ep -n openshift-template-service-broker
NAME                                                 ENDPOINTS                           AGE
apiserver                                            10.129.2.12:8443                    85s
openshift-template-service-broker-operator-metrics   10.131.0.18:8383,10.131.0.18:8686   3m24s

$ token=`oc -n openshift-monitoring sa get-token prometheus-k8s`

$ oc -n openshift-monitoring exec -c prometheus prometheus-k8s-1  -- curl -k -H "Authorization: Bearer $token" 'https://10.129.2.12:8443/metrics' | grep template_service_broker
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 17244    0 17244    0     0   278k      0 --:--:-- --:--:-- --:--:--  280k
# HELP template_service_broker_build_info A metric with a constant '1' value labeled by major, minor, git commit & git version from which Template Service Broker was built.
# TYPE template_service_broker_build_info gauge
template_service_broker_build_info{gitCommit="58203d1a763764c7eec567473304e831946ab8c3",gitVersion="58203d1a",major="",minor=""} 1

Comment 15 errata-xmlrpc 2020-06-17 20:27:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2436

