Bug 1955490 - Thanos ruler Statefulsets should have 2 replicas and hard affinity set
Summary: Thanos ruler Statefulsets should have 2 replicas and hard affinity set
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Monitoring
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.10.0
Assignee: Simon Pasquier
QA Contact: Junqi Zhao
Docs Contact: Brian Burt
URL:
Whiteboard:
Duplicates: 1950035 1997948 2016753
Depends On:
Blocks:
 
Reported: 2021-04-30 08:52 UTC by Simon Pasquier
Modified: 2022-05-20 14:09 UTC
CC List: 7 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Previously, the Thanos Ruler service would become unavailable when the node that contains the two Thanos Ruler pods experienced an outage. This situation occurred because the Thanos Ruler pods had only soft anti-affinity rules regarding node placement. Consequently, user-defined rules would not be evaluated until the node came back online. With this release, the Cluster Monitoring Operator (CMO) now configures hard anti-affinity rules to ensure that the two Thanos Ruler pods are scheduled on different nodes. As a result, a single-node outage no longer creates a gap in user-defined rule evaluation.
Clone Of:
Environment:
Last Closed: 2022-03-10 16:03:07 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github openshift cluster-monitoring-operator pull 1191 0 None closed Bug 1955490: Thanos ruler Statefulsets should have 2 replicas and hard affinity set 2021-10-25 07:59:50 UTC
Github openshift cluster-monitoring-operator pull 1341 0 None open Bug 1933847: enable hard affinity + PodDisruptionBudget for Prometheus and Thanos Ruler pods 2021-10-25 08:04:47 UTC
Red Hat Knowledge Base (Solution) 6959436 0 None None None 2022-05-20 14:09:37 UTC
Red Hat Product Errata RHSA-2022:0056 0 None None None 2022-03-10 16:03:35 UTC

Description Simon Pasquier 2021-04-30 08:52:21 UTC
Description of problem:

As mentioned in the conventions doc https://github.com/openshift/enhancements/blob/master/CONVENTIONS.md#high-availability, the Thanos Ruler StatefulSet should have a replica count of 2 with hard anti-affinity set until we bring the descheduler into the product.
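For illustration, the difference between the soft rule currently in place and the hard rule requested here looks roughly like this in the pod template (a sketch following the Kubernetes pod spec; the label selector is a placeholder, not necessarily the exact one CMO generates):

# Soft anti-affinity (current): the scheduler merely prefers to spread the pods.
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      podAffinityTerm:
        labelSelector:
          matchLabels:
            app.kubernetes.io/name: thanos-ruler
        topologyKey: kubernetes.io/hostname

# Hard anti-affinity (requested): a pod stays Pending rather than co-locating.
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          app.kubernetes.io/name: thanos-ruler
      topologyKey: kubernetes.io/hostname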

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Follow-up to bug 1949262.

Comment 1 Simon Pasquier 2021-04-30 12:05:52 UTC
*** Bug 1950035 has been marked as a duplicate of this bug. ***

Comment 3 Haoyu Sun 2021-06-04 09:49:23 UTC
The pull request is on hold because of an issue with hard affinity and persistent volumes, detailed in bug 1967614: https://bugzilla.redhat.com/show_bug.cgi?id=1967614
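For background (an illustration, not quoting bug 1967614): a provisioned PersistentVolume pins its pod to a particular node or zone through volume node affinity, so a bound PV combined with a required anti-affinity rule can leave a rescheduled pod with no valid node. Where each PV is pinned can be checked with, for example:

# oc get pv -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.nodeAffinity.required.nodeSelectorTerms}{"\n"}{end}'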

Comment 4 Damien Grisonnet 2021-06-10 12:16:23 UTC
The PR has been closed for the reasons mentioned above. Moving the BZ back to ASSIGNED.

Comment 5 Simon Pasquier 2021-08-26 07:52:52 UTC
*** Bug 1997948 has been marked as a duplicate of this bug. ***

Comment 12 Simon Pasquier 2021-10-25 07:55:56 UTC
*** Bug 2016753 has been marked as a duplicate of this bug. ***

Comment 14 Simon Pasquier 2021-11-23 07:55:30 UTC
https://github.com/openshift/cluster-monitoring-operator/pull/1341 has been merged

Comment 15 Junqi Zhao 2021-11-29 06:34:16 UTC
Checked with 4.10.0-0.nightly-2021-11-28-164900: the Thanos Ruler StatefulSet now has 2 replicas and hard anti-affinity set.
# oc -n openshift-user-workload-monitoring get pod -o wide | grep thanos-ruler
thanos-ruler-user-workload-0          3/3     Running   0          8m55s   10.129.2.65    ip-10-0-194-46.us-east-2.compute.internal    <none>           <none>
thanos-ruler-user-workload-1          3/3     Running   0          8m55s   10.128.2.123   ip-10-0-191-20.us-east-2.compute.internal    <none>           <none>
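
The NODE column above shows the two pods on different hosts; the same check, more directly (illustrative, using the thanos-ruler=user-workload label from the StatefulSet below):

# oc -n openshift-user-workload-monitoring get pod -l thanos-ruler=user-workload -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.nodeName}{"\n"}{end}'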

# oc -n openshift-user-workload-monitoring get sts thanos-ruler-user-workload -oyaml
...
spec:
  podManagementPolicy: Parallel
  replicas: 2
  revisionHistoryLimit: 10
...
  template:
    metadata:
      annotations:
        kubectl.kubernetes.io/default-container: thanos-ruler
        target.workload.openshift.io/management: '{"effect": "PreferredDuringScheduling"}'
      creationTimestamp: null
      labels:
        app.kubernetes.io/instance: user-workload
        app.kubernetes.io/managed-by: prometheus-operator
        app.kubernetes.io/name: thanos-ruler
        thanos-ruler: user-workload
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app.kubernetes.io/name: thanos-ruler
                thanos-ruler: user-workload
            namespaces:
            - openshift-user-workload-monitoring
            topologyKey: kubernetes.io/hostname
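
A side note on the rule above (illustration, not part of this verification): with requiredDuringSchedulingIgnoredDuringExecution, a replica that cannot land on a distinct node stays Pending instead of co-locating, and the scheduler records an event along the lines of "0/N nodes are available: ... didn't match pod anti-affinity rules", visible via:

# oc -n openshift-user-workload-monitoring describe pod thanos-ruler-user-workload-0 | grep -A5 'Events:'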

# oc -n openshift-user-workload-monitoring get pdb thanos-ruler-user-workload -oyaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  creationTimestamp: "2021-11-29T06:21:45Z"
  generation: 1
  labels:
    thanosRulerName: user-workload
  name: thanos-ruler-user-workload
  namespace: openshift-user-workload-monitoring
  resourceVersion: "149008"
  uid: 76c8db6f-f489-4493-8b43-84239abb9ff4
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: thanos-ruler
      thanos-ruler: user-workload
status:
  conditions:
  - lastTransitionTime: "2021-11-29T06:21:48Z"
    message: ""
    observedGeneration: 1
    reason: SufficientPods
    status: "True"
    type: DisruptionAllowed
  currentHealthy: 2
  desiredHealthy: 1
  disruptionsAllowed: 1
  expectedPods: 2
  observedGeneration: 1
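
Given minAvailable: 1 and expectedPods: 2 above, disruptionsAllowed comes out to 1: a voluntary disruption such as a node drain may evict one Thanos Ruler pod at a time, and a second eviction is blocked until the first pod is healthy again on another node. For example (illustrative; node name taken from the pod listing above, and older oc clients spell the last flag --delete-local-data):

# oc adm drain ip-10-0-194-46.us-east-2.compute.internal --ignore-daemonsets --delete-emptydir-data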

Comment 16 Kai-Uwe Rommel 2022-01-19 14:39:31 UTC
Any news on when this will be fixed? It's still present in the latest 4.9 releases.
The problem is annoying because it generates unnecessary alerts.

Comment 17 Kai-Uwe Rommel 2022-01-19 14:41:07 UTC
I mean, shouldn't it be an easy backport from the 4.10 nightlies into the current 4.9 stable releases?

Comment 18 Simon Pasquier 2022-01-19 16:36:22 UTC
We have no plans to backport the fix because it wouldn't be easy: we consider switching from soft anti-affinity to hard anti-affinity too risky for a z-stream release.

Comment 21 errata-xmlrpc 2022-03-10 16:03:07 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056

