Created attachment 1681367 [details]
no alert/rule on thanos-ruler UI

Description of problem:
Enabled techPreviewUserWorkload and created a PrometheusRule under a user namespace; there is no alert/rule on the thanos-ruler UI.

Steps:
# oc -n openshift-user-workload-monitoring get pod
NAME                                   READY   STATUS    RESTARTS   AGE
prometheus-operator-765866997c-6fn65   2/2     Running   0          52m
prometheus-user-workload-0             5/5     Running   1          52m
prometheus-user-workload-1             5/5     Running   1          52m
thanos-ruler-user-workload-0           3/3     Running   0          51m
thanos-ruler-user-workload-1           3/3     Running   0          51m

# oc new-project test3
# oc create -f - << EOF
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: test3.rules
spec:
  groups:
  - name: alerting rules
    rules:
    - alert: Watchdog
      expr: vector(1)
      labels:
        severity: none
      message: This is an alert meant to ensure that the entire alerting pipeline is functional.
EOF

Could find the rule in the rules-configmap-reloader container:
# oc -n openshift-user-workload-monitoring exec -c rules-configmap-reloader thanos-ruler-user-workload-0 -- cat /etc/thanos/rules/thanos-ruler-user-workload-rulefiles-0/test3-test3.rules.yaml
groups:
- name: alerting rules
  rules:
  - alert: Watchdog
    expr: vector(1)
    labels:
      namespace: test3
      severity: none

No alerts/rules from the API query:
# token=`oc sa get-token prometheus-k8s -n openshift-monitoring`
# oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -k -H "Authorization: Bearer $token" 'https://thanos-ruler.openshift-user-workload-monitoring.svc:9091/api/v1/alerts' | jq
{
  "status": "success",
  "data": {
    "alerts": null
  }
}

# oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -k -H "Authorization: Bearer $token" 'https://thanos-ruler.openshift-user-workload-monitoring.svc:9091/api/v1/rules' | jq
{
  "status": "success",
  "data": {
    "groups": null
  }
}

Version-Release number of selected component (if applicable):
4.5.0-0.nightly-2020-04-23-202137

How reproducible:
Always

Steps to Reproduce:
1. See the description
2.
3.

Actual results:
No alert/rule on the thanos-ruler UI.

Expected results:
Alert/rule on the thanos-ruler UI.

Additional info:
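To make re-checking this symptom easier, here is a minimal shell sketch; it is a convenience script, not part of the original report. It assumes the same objects as the description (the prometheus-k8s service account and prometheus-k8s-0 pod in openshift-monitoring, and the Watchdog alert from the test3.rules PrometheusRule); the jq filter and the pass/fail messages are illustrative.

#!/usr/bin/env bash
# Verification sketch based on the commands in the description above.
# Assumptions: the prometheus-k8s SA token works against thanos-ruler, and the
# user rule defines an alerting rule named "Watchdog" (both from this report).
set -euo pipefail

# "oc sa get-token" matches the client used in this report; newer oc releases
# may require "oc create token" instead.
token=$(oc sa get-token prometheus-k8s -n openshift-monitoring)

# Query the thanos-ruler rules API from the prometheus-k8s-0 pod and check
# whether the Watchdog alerting rule has been loaded.
if oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- \
     curl -sk -H "Authorization: Bearer $token" \
     'https://thanos-ruler.openshift-user-workload-monitoring.svc:9091/api/v1/rules' \
     | jq -e '.data.groups[]?.rules[]? | select(.name == "Watchdog")' > /dev/null; then
  echo "Watchdog rule is loaded in thanos-ruler"
else
  echo "Watchdog rule is missing (the symptom reported here)"
fi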
The same with the thanos-ruler sa:
# token=`oc sa get-token thanos-ruler -n openshift-user-workload-monitoring`
# oc -n openshift-user-workload-monitoring exec -c rules-configmap-reloader thanos-ruler-user-workload-0 -- curl -k -H "Authorization: Bearer $token" 'https://thanos
{
  "status": "success",
  "data": {
    "alerts": null
  }
}

# oc -n openshift-user-workload-monitoring exec -c rules-configmap-reloader thanos-ruler-user-workload-0 -- curl -k -H "Authorization: Bearer $token" 'https://thanos-ruler.openshift-user-workload-monitoring.svc:9091/api/v1/rules' | jq
{
  "status": "success",
  "data": {
    "groups": null
  }
}
(In reply to Junqi Zhao from comment #1)
> the same with thanos-ruler sa
> # oc -n openshift-user-workload-monitoring exec -c rules-configmap-reloader
> thanos-ruler-user-workload-0 -- curl -k -H "Authorization: Bearer $token"
> 'https://thanos
> {
>   "status": "success",
>   "data": {
>     "alerts": null
>   }

The truncated command above should be:
# oc -n openshift-user-workload-monitoring exec -c rules-configmap-reloader thanos-ruler-user-workload-0 -- curl -k -H "Authorization: Bearer $token" 'https://thanos-ruler.openshift-user-workload-monitoring.svc:9091/api/v1/alerts' | jq
{
  "status": "success",
  "data": {
    "alerts": null
  }
}
Not sure if the issue is related to Bug 1827489; after Bug 1827489 was fixed, there is no such issue with 4.5.0-0.nightly-2020-04-25-170442. Closing it.

# oc -n openshift-user-workload-monitoring logs thanos-ruler-user-workload-0 -c rules-configmap-reloader
2020/04/26 01:41:17 Watching directory: "/etc/thanos/rules/thanos-ruler-user-workload-rulefiles-0"

# token=`oc sa get-token prometheus-k8s -n openshift-monitoring`
# oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -k -H "Authorization: Bearer $token" 'https://thanos-ruler.openshift-user-workload-monitoring.svc:9091/api/v1/alerts' | jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   244  100   244    0     0   5741      0 --:--:-- --:--:-- --:--:--  5809
{
  "status": "success",
  "data": {
    "alerts": [
      {
        "labels": {
          "alertname": "Watchdog",
          "namespace": "load",
          "severity": "none"
        },
        "annotations": {},
        "state": "firing",
        "activeAt": "2020-04-26T01:53:41.308293746Z",
        "value": "1e+00",
        "partial_response_strategy": "ABORT"
      }
    ]
  }
}
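As a side note (not from the original verification): the curl progress table in the output above can be suppressed with -s, and the response can be narrowed to a single user namespace with a jq select. The namespace value used here is just the example from earlier in this bug:

# token=`oc sa get-token prometheus-k8s -n openshift-monitoring`
# oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -sk -H "Authorization: Bearer $token" 'https://thanos-ruler.openshift-user-workload-monitoring.svc:9091/api/v1/alerts' | jq '.data.alerts[]? | select(.labels.namespace == "test3")'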
Tested with 4.5.0-0.nightly-2020-05-05-205255; there are alerts/rules on the thanos-ruler UI.

# oc -n openshift-user-workload-monitoring logs thanos-ruler-user-workload-0 -c rules-configmap-reloader
2020/05/06 08:23:14 Watching directory: "/etc/thanos/rules/thanos-ruler-user-workload-rulefiles-0"
2020/05/06 09:06:27 config map updated

# token=`oc sa get-token thanos-ruler -n openshift-user-workload-monitoring`
# oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -k -H "Authorization: Bearer $token" 'https://thanos-ruler.openshift-user-workload-monitoring.svc:9091/api/v1/alerts' | jq
{
  "status": "success",
  "data": {
    "alerts": [
      {
        "labels": {
          "alertname": "Watchdog",
          "namespace": "test3",
          "severity": "none"
        },
        "annotations": {},
        "state": "firing",
        "activeAt": "2020-05-06T09:06:42.078951644Z",
        "value": "1e+00",
        "partial_response_strategy": "ABORT"
      }
    ]
  }
}
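For completeness, a hedged polling sketch (not part of the original test) that waits for the reloader's "config map updated" step to take effect before checking the alerts API. The thanos-ruler service account and the Watchdog/test3 names come from this bug; the 30 x 10s retry budget is arbitrary.

# Poll the thanos-ruler alerts API until the user-workload Watchdog alert appears.
token=$(oc sa get-token thanos-ruler -n openshift-user-workload-monitoring)
for i in $(seq 1 30); do
  if oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- \
       curl -sk -H "Authorization: Bearer $token" \
       'https://thanos-ruler.openshift-user-workload-monitoring.svc:9091/api/v1/alerts' \
       | jq -e '.data.alerts[]? | select(.labels.alertname == "Watchdog" and .labels.namespace == "test3")' > /dev/null; then
    echo "Watchdog alert from test3 is visible in thanos-ruler"
    break
  fi
  sleep 10   # rule evaluation and the configmap reload are not instantaneous
done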
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2409