Bug 1955589

Summary: thanos-querier should have a PodDisruptionBudget in HA topology
Product: OpenShift Container Platform Reporter: Damien Grisonnet <dgrisonn>
Component: Monitoring Assignee: Haoyu Sun <hasun>
Status: CLOSED ERRATA QA Contact: hongyan li <hongyli>
Severity: medium Docs Contact:
Priority: medium    
Version: 4.8 CC: alegrand, anpicker, erooth, juzhao, kakkoyun, lcosic, pkrupa
Target Milestone: --- Keywords: EasyFix
Target Release: 4.8.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-07-27 23:05:17 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Damien Grisonnet 2021-04-30 13:15:11 UTC
Description of problem:

In a high-availability topology, thanos-querier needs to have a PodDisruptionBudget to comply with the OpenShift `Upgrade and Reconfiguration` conventions: https://github.com/openshift/enhancements/blob/master/CONVENTIONS.md#upgrade-and-reconfiguration

Comment 3 Junqi Zhao 2021-06-11 07:40:35 UTC
Tested with 4.8.0-0.nightly-2021-06-10-224448; the thanos-querier PDB has been added.
# oc -n openshift-monitoring get pdb thanos-querier-pdb -oyaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  creationTimestamp: "2021-06-11T06:56:34Z"
  generation: 1
  labels:
    app.kubernetes.io/component: query-layer
    app.kubernetes.io/instance: thanos-querier
    app.kubernetes.io/name: thanos-query
    app.kubernetes.io/version: 0.20.2
  name: thanos-querier-pdb
  namespace: openshift-monitoring
  resourceVersion: "23052"
  uid: b4157d44-c690-40f5-be61-11cafbcdf4d3
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app.kubernetes.io/component: query-layer
      app.kubernetes.io/instance: thanos-querier
      app.kubernetes.io/name: thanos-query
status:
  conditions:
  - lastTransitionTime: "2021-06-11T06:56:34Z"
    message: ""
    observedGeneration: 1
    reason: SufficientPods
    status: "True"
    type: DisruptionAllowed
  currentHealthy: 2
  desiredHealthy: 1
  disruptionsAllowed: 1
  expectedPods: 2
  observedGeneration: 1

# oc -n openshift-monitoring get pod -o wide | grep thanos-querier
thanos-querier-8898b65b5-5cdrn                 5/5     Running   0          35m   10.131.0.14    ip-10-0-164-200.us-east-2.compute.internal   <none>           <none>
thanos-querier-8898b65b5-gnssl                 5/5     Running   0          35m   10.128.2.9     ip-10-0-196-179.us-east-2.compute.internal   <none>           <none>
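
For readers checking the status block above: with `minAvailable: 1` and two healthy pods, the reported `disruptionsAllowed: 1` follows from simple arithmetic. The sketch below is only an illustration of that relationship for an integer `minAvailable`; the function name and parameters are hypothetical, it is not the controller's actual code, and it ignores percentage values and `maxUnavailable`.

```python
def disruptions_allowed(current_healthy: int, min_available: int, expected_pods: int) -> int:
    """Illustrative model of the PDB status fields for an integer minAvailable.

    Assumption: desiredHealthy equals minAvailable, and the allowed
    disruptions are the healthy pods above that floor, never negative.
    """
    desired_healthy = min_available
    if expected_pods <= 0:
        # No pods matched by the selector: nothing may be disrupted.
        return 0
    return max(0, current_healthy - desired_healthy)


# Values taken from the status block above: currentHealthy=2, minAvailable=1, expectedPods=2.
print(disruptions_allowed(current_healthy=2, min_available=1, expected_pods=2))

# If one of the two replicas were unhealthy, no further voluntary disruption would be allowed.
print(disruptions_allowed(current_healthy=1, min_available=1, expected_pods=2))
```

Under this model, a node drain can evict one thanos-querier pod at a time, but not the last healthy one.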

Comment 4 hongyan li 2021-06-11 08:09:37 UTC
Tested with payload 4.8.0-0.nightly-2021-06-10-210437:
$ oc -n openshift-monitoring get PodDisruptionBudget thanos-querier-pdb -oyaml
----------------
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  generation: 1
  labels:
    app.kubernetes.io/component: query-layer
    app.kubernetes.io/instance: thanos-querier
    app.kubernetes.io/name: thanos-query
    app.kubernetes.io/version: 0.20.2
  name: thanos-querier-pdb
  namespace: openshift-monitoring
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app.kubernetes.io/component: query-layer
      app.kubernetes.io/instance: thanos-querier
      app.kubernetes.io/name: thanos-query
----------------------
$ oc -n openshift-monitoring get Pod |grep thanos-querier
thanos-querier-7979c4b9b5-9lnf7                5/5     Running   0          61m
thanos-querier-7979c4b9b5-vstlh                5/5     Running   0          61m

Comment 7 errata-xmlrpc 2021-07-27 23:05:17 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438