Bug 1827530 - no alert/rule on thanos-ruler UI
Summary: no alert/rule on thanos-ruler UI
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Monitoring
Version: 4.5
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.5.0
Assignee: Simon Pasquier
QA Contact: Junqi Zhao
URL:
Whiteboard:
Depends On:
Blocks: 1819765 1828702
Reported: 2020-04-24 06:13 UTC by Junqi Zhao
Modified: 2020-07-13 17:31 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-07-13 17:30:56 UTC
Target Upstream Version:
Embargoed:


Attachments
no alert/rule on thanos-ruler UI (20.32 KB, image/png)
2020-04-24 06:13 UTC, Junqi Zhao


Links
System ID Private Priority Status Summary Last Updated
Github openshift thanos pull 26 0 None closed Bug 1827530: Bump to v0.12.2 2021-02-18 05:13:37 UTC
Github thanos-io thanos issues 2514 0 None closed rule: `/-/reload` is inaccessible on v0.12.1 when no prefix has been specified 2021-02-18 05:13:38 UTC
Github thanos-io thanos issues 2515 0 None closed cmd: rule: do not wrap reload endpoint with '/' 2021-02-18 05:13:37 UTC
Github thanos-io thanos pull 2533 0 None closed cmd: rule: do not wrap reload endpoint with prefix twice 2021-02-18 05:13:37 UTC
Red Hat Product Errata RHBA-2020:2409 0 None None None 2020-07-13 17:31:19 UTC

Description Junqi Zhao 2020-04-24 06:13:51 UTC
Created attachment 1681367 [details]
no alert/rule on thanos-ruler UI

Description of problem:
With techPreviewUserWorkload enabled, after creating a PrometheusRule in a user namespace, no alert/rule appears on the thanos-ruler UI.
Steps:
# oc -n openshift-user-workload-monitoring get pod
NAME                                   READY   STATUS    RESTARTS   AGE
prometheus-operator-765866997c-6fn65   2/2     Running   0          52m
prometheus-user-workload-0             5/5     Running   1          52m
prometheus-user-workload-1             5/5     Running   1          52m
thanos-ruler-user-workload-0           3/3     Running   0          51m
thanos-ruler-user-workload-1           3/3     Running   0          51m


# oc new-project test3
# oc create -f - << EOF
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: test3.rules
spec:
  groups:
  - name: alerting rules
    rules:
    - alert: Watchdog
      expr: vector(1)
      labels:
        severity: none
      message:
        This is an alert meant to ensure that the entire alerting pipeline is functional.
EOF

The rule is present in the rules-configmap-reloader container:
# oc -n openshift-user-workload-monitoring exec -c rules-configmap-reloader thanos-ruler-user-workload-0 -- cat /etc/thanos/rules/thanos-ruler-user-workload-rulefiles-0/test3-test3.rules.yaml 
groups:
- name: alerting rules
  rules:
  - alert: Watchdog
    expr: vector(1)
    labels:
      namespace: test3
      severity: none

The API queries return no alerts/rules:
# token=`oc sa get-token prometheus-k8s -n openshift-monitoring`
# oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -k -H "Authorization: Bearer $token" 'https://thanos-ruler.openshift-user-workload-monitoring.svc:9091/api/v1/alerts' | jq
{
  "status": "success",
  "data": {
    "alerts": null
  }
}
# oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -k -H "Authorization: Bearer $token" 'https://thanos-ruler.openshift-user-workload-monitoring.svc:9091/api/v1/rules' | jq
{
  "status": "success",
  "data": {
    "groups": null
  }
}
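
The two checks above can be folded into a small helper that reports when the ruler has nothing loaded. This is a minimal sketch and not part of the original report: `ruler_field_is_empty` is a hypothetical name, and it uses `tr` and `grep` instead of a JSON parser, so it only recognizes the `null` and `[]` shapes of a response like the ones shown here.

```shell
# Hypothetical helper (not from the original report).
# Usage: ruler_field_is_empty FIELD JSON
# Succeeds (exit 0) when "FIELD" is null or an empty list in the JSON text,
# i.e. when the ruler reports no alerts or no rule groups.
ruler_field_is_empty() {
  field=$1
  json=$2
  # Strip whitespace so pretty-printed and compact output look the same,
  # then look for  "FIELD":null  or  "FIELD":[]
  printf '%s' "$json" | tr -d ' \n\t' | grep -Eq "\"$field\":(null|\[\])"
}
```

Called as `ruler_field_is_empty alerts "$response"`, it succeeds for the output above and fails once the `alerts` list contains at least one entry.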



Version-Release number of selected component (if applicable):
4.5.0-0.nightly-2020-04-23-202137

How reproducible:
Always

Steps to Reproduce:
1. See the description

Actual results:
no alert/rule on thanos-ruler UI

Expected results:
alert/rule on thanos-ruler UI

Additional info:

Comment 1 Junqi Zhao 2020-04-24 06:21:10 UTC
Same result with the thanos-ruler service account:
# token=`oc sa get-token thanos-ruler -n openshift-user-workload-monitoring`
# oc -n openshift-user-workload-monitoring exec -c rules-configmap-reloader thanos-ruler-user-workload-0 -- curl -k -H "Authorization: Bearer $token" 'https://thanos
{
  "status": "success",
  "data": {
    "alerts": null
  }
}
# oc -n openshift-user-workload-monitoring exec -c rules-configmap-reloader thanos-ruler-user-workload-0 -- curl -k -H "Authorization: Bearer $token" 'https://thanos-ruler.openshift-user-workload-monitoring.svc:9091/api/v1/rules' | jq
{
  "status": "success",
  "data": {
    "groups": null
  }
}

Comment 2 Junqi Zhao 2020-04-24 06:25:48 UTC
(In reply to Junqi Zhao from comment #1)
> the same with thanos-ruler sa
> # oc -n openshift-user-workload-monitoring exec -c rules-configmap-reloader
> thanos-ruler-user-workload-0 -- curl -k -H "Authorization: Bearer $token"
> 'https://thanos
> {
>   "status": "success",
>   "data": {
>     "alerts": null
>   }
should be

# oc -n openshift-user-workload-monitoring exec -c rules-configmap-reloader thanos-ruler-user-workload-0 -- curl -k -H "Authorization: Bearer $token" 'https://thanos-ruler.openshift-user-workload-monitoring.svc:9091/api/v1/alerts' | jq
{
  "status": "success",
  "data": {
    "alerts": null
  }
}

Comment 7 Junqi Zhao 2020-04-26 02:02:27 UTC
Not sure whether this issue is related to Bug 1827489, but after Bug 1827489 was fixed, the issue no longer occurs with 4.5.0-0.nightly-2020-04-25-170442. Closing it.
# oc -n openshift-user-workload-monitoring logs thanos-ruler-user-workload-0 -c rules-configmap-reloader
2020/04/26 01:41:17 Watching directory: "/etc/thanos/rules/thanos-ruler-user-workload-rulefiles-0"

# token=`oc sa get-token prometheus-k8s -n openshift-monitoring`
# oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -k -H "Authorization: Bearer $token" 'https://thanos-ruler.openshift-user-workload-monitoring.svc:9091/api/v1/alerts' | jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   244  100   244    0     0   5741      0 --:--:-- --:--:-- --:--:--  5809
{
  "status": "success",
  "data": {
    "alerts": [
      {
        "labels": {
          "alertname": "Watchdog",
          "namespace": "load",
          "severity": "none"
        },
        "annotations": {},
        "state": "firing",
        "activeAt": "2020-04-26T01:53:41.308293746Z",
        "value": "1e+00",
        "partial_response_strategy": "ABORT"
      }
    ]
  }
}

Comment 14 Junqi Zhao 2020-05-06 09:16:00 UTC
Tested with 4.5.0-0.nightly-2020-05-05-205255; alerts/rules now appear on the thanos-ruler UI.
# oc -n openshift-user-workload-monitoring logs thanos-ruler-user-workload-0 -c rules-configmap-reloader
2020/05/06 08:23:14 Watching directory: "/etc/thanos/rules/thanos-ruler-user-workload-rulefiles-0"
2020/05/06 09:06:27 config map updated

# token=`oc sa get-token thanos-ruler -n openshift-user-workload-monitoring`
# oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -k -H "Authorization: Bearer $token" 'https://thanos-ruler.openshift-user-workload-monitoring.svc:9091/api/v1/alerts' | jq
{
  "status": "success",
  "data": {
    "alerts": [
      {
        "labels": {
          "alertname": "Watchdog",
          "namespace": "test3",
          "severity": "none"
        },
        "annotations": {},
        "state": "firing",
        "activeAt": "2020-05-06T09:06:42.078951644Z",
        "value": "1e+00",
        "partial_response_strategy": "ABORT"
      }
    ]
  }
}
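
In the output above, the reloader logs "config map updated" at 09:06:27 and the alert's activeAt is 09:06:42, so a query issued right after creating the rule may still come back empty. When scripting this verification, a small polling loop avoids a false negative. The following is only a sketch; `retry` is a hypothetical helper, not something from the original report.

```shell
# Hypothetical helper (not from the original report).
# Usage: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS runs have failed.
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    # Stop as soon as the command succeeds.
    if "$@"; then
      return 0
    fi
    # Sleep only if another attempt is still to come.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}
```

For example, `retry 12 5 check_alerts` would try up to 12 times, 5 seconds apart, where `check_alerts` stands for whatever command queries `/api/v1/alerts` and exits non-zero while the list is still null.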

Comment 15 errata-xmlrpc 2020-07-13 17:30:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409

