Bug 1828702 - End-to-end tests didn't detect failures to reload Thanos Ruler
Summary: End-to-end tests didn't detect failures to reload Thanos Ruler
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Monitoring
Version: 4.5
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: ---
Target Release: 4.5.0
Assignee: Simon Pasquier
QA Contact: Junqi Zhao
URL:
Whiteboard:
Depends On: 1827530
Blocks:
Reported: 2020-04-28 07:47 UTC by Simon Pasquier
Modified: 2022-06-20 07:36 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-07-13 17:32:07 UTC
Target Upstream Version:
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-monitoring-operator pull 773 | 0 | None | closed | Bug 1828702: wait for trusted CA bundle to be created | 2020-09-21 07:39:19 UTC
Red Hat Product Errata | RHBA-2020:2409 | 0 | None | None | None | 2020-07-13 17:32:18 UTC

Description Simon Pasquier 2020-04-28 07:47:34 UTC
Description of problem:
As noted in bug 1827530, the Thanos Ruler /-/reload endpoint is currently broken, which makes it impossible for Thanos Ruler to reload its rules. CI should have caught this issue.

Version-Release number of selected component (if applicable):
4.5

How reproducible:
Always

Steps to Reproduce:
N/A

Actual results:
E2E tests pass.

Expected results:
The e2e TestUserWorkloadMonitoring test should fail, because it provisions a PrometheusRule resource after Thanos Ruler is up and running, so the new rule can only be picked up through a reload.

Additional info:
The test doesn't fail because the Thanos Ruler statefulset gets redeployed soon after the PrometheusRule resource is provisioned.
When user workload monitoring is enabled, CMO generates a first version of the Thanos Ruler spec without the trusted CA bundle ConfigMap, because the ConfigMap isn't ready yet.
Once the trusted CA bundle ConfigMap is populated, CMO catches up and adds it to the Thanos Ruler custom resource, which in turn changes the hash of the Thanos Ruler statefulset (computed by the Prometheus Operator).
Eventually a new revision of the statefulset is rolled out with the expected rules, so the test passes without a reload ever having to succeed.
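
One way to close the race described above is to make the test wait until the trusted CA bundle ConfigMap is populated before it provisions the PrometheusRule, which is also what the title of the linked pull request suggests. The following is only a minimal Python sketch of that polling idea, not the operator's test code. The helper names `wait_for` and `ca_bundle_ready` are hypothetical, and the `ca-bundle.crt` key is an assumption about how the bundle is stored.

```python
import time
from typing import Callable, Optional


def wait_for(condition: Callable[[], bool],
             timeout: float = 300.0,
             interval: float = 5.0,
             sleep: Callable[[float], None] = time.sleep,
             clock: Callable[[], float] = time.monotonic) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse.

    `sleep` and `clock` are injectable so the helper can be tested
    without real waiting.
    """
    deadline = clock() + timeout
    while True:
        if condition():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


def ca_bundle_ready(configmap_data: Optional[dict],
                    key: str = "ca-bundle.crt") -> bool:
    """Return True once the ConfigMap exists and its bundle key is non-empty.

    `configmap_data` stands in for the `data` field of the ConfigMap, or
    None if the ConfigMap does not exist yet. The key name is an assumption.
    """
    return bool(configmap_data) and bool(configmap_data.get(key, "").strip())
```

In a test, `wait_for(lambda: ca_bundle_ready(fetch()))` would run before the PrometheusRule is created, where `fetch` is whatever (hypothetical) call returns the ConfigMap data from the cluster. Once the wait succeeds, the rule is provisioned against the final statefulset revision, so it can only be loaded through the reload path.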

Comment 4 Junqi Zhao 2020-05-18 06:25:13 UTC
Tested with 4.5.0-0.nightly-2020-05-17-201019: created rule files to trigger the reload, and no errors were found.

# oc -n openshift-user-workload-monitoring logs thanos-ruler-user-workload-1 -c rules-configmap-reloader
2020/05/18 05:46:53 Watching directory: "/etc/thanos/rules/thanos-ruler-user-workload-rulefiles-0"
2020/05/18 05:57:39 config map updated
2020/05/18 05:57:39 successfully triggered reload
2020/05/18 06:00:03 config map updated
2020/05/18 06:00:03 successfully triggered reload
2020/05/18 06:04:46 config map updated
2020/05/18 06:04:46 successfully triggered reload
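
The manual check above can be approximated in code by counting log lines. This is a rough Python sketch, assuming the reloader logs exactly one "config map updated" line per change and one "successfully triggered reload" line per successful reload, as in the output above. The function name `count_reloads` is hypothetical. The exact wording of a failure message is not shown in this report, so the sketch only compares the two counts.

```python
from typing import Iterable, Tuple

# Log excerpt copied from the rules-configmap-reloader output above.
SAMPLE_LOG = """\
2020/05/18 05:46:53 Watching directory: "/etc/thanos/rules/thanos-ruler-user-workload-rulefiles-0"
2020/05/18 05:57:39 config map updated
2020/05/18 05:57:39 successfully triggered reload
2020/05/18 06:00:03 config map updated
2020/05/18 06:00:03 successfully triggered reload
2020/05/18 06:04:46 config map updated
2020/05/18 06:04:46 successfully triggered reload
"""


def count_reloads(log_lines: Iterable[str]) -> Tuple[int, int]:
    """Return (updates_seen, reloads_succeeded) for the given log lines."""
    updates = 0
    successes = 0
    for line in log_lines:
        if "config map updated" in line:
            updates += 1
        elif "successfully triggered reload" in line:
            successes += 1
    return updates, successes


updates, successes = count_reloads(SAMPLE_LOG.splitlines())
print(updates, successes)  # prints "3 3" for the excerpt above
```

If the two counts differ, at least one ConfigMap update was not followed by a successful reload and the log deserves a closer look.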

Comment 5 errata-xmlrpc 2020-07-13 17:32:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409
