Description of problem: In a high-availability topology, thanos-querier needs a PodDisruptionBudget to align with the OpenShift `Upgrade and Reconfiguration` conventions: https://github.com/openshift/enhancements/blob/master/CONVENTIONS.md#upgrade-and-reconfiguration
Tested with 4.8.0-0.nightly-2021-06-10-224448, thanos-querier pdb is added

# oc -n openshift-monitoring get pdb thanos-querier-pdb -oyaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  creationTimestamp: "2021-06-11T06:56:34Z"
  generation: 1
  labels:
    app.kubernetes.io/component: query-layer
    app.kubernetes.io/instance: thanos-querier
    app.kubernetes.io/name: thanos-query
    app.kubernetes.io/version: 0.20.2
  name: thanos-querier-pdb
  namespace: openshift-monitoring
  resourceVersion: "23052"
  uid: b4157d44-c690-40f5-be61-11cafbcdf4d3
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app.kubernetes.io/component: query-layer
      app.kubernetes.io/instance: thanos-querier
      app.kubernetes.io/name: thanos-query
status:
  conditions:
  - lastTransitionTime: "2021-06-11T06:56:34Z"
    message: ""
    observedGeneration: 1
    reason: SufficientPods
    status: "True"
    type: DisruptionAllowed
  currentHealthy: 2
  desiredHealthy: 1
  disruptionsAllowed: 1
  expectedPods: 2
  observedGeneration: 1

# oc -n openshift-monitoring get pod -o wide | grep thanos-querier
thanos-querier-8898b65b5-5cdrn   5/5   Running   0   35m   10.131.0.14   ip-10-0-164-200.us-east-2.compute.internal   <none>   <none>
thanos-querier-8898b65b5-gnssl   5/5   Running   0   35m   10.128.2.9    ip-10-0-196-179.us-east-2.compute.internal   <none>   <none>
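The status values in the output above (currentHealthy: 2, desiredHealthy: 1, disruptionsAllowed: 1) are consistent with the usual PDB arithmetic: allowed disruptions are the healthy pods in excess of the required minimum. The following is a minimal sketch of that calculation, assuming an integer minAvailable (percentage values are not handled). The function name `disruptions_allowed` is hypothetical and is not part of any Kubernetes or OpenShift API.

```python
def disruptions_allowed(current_healthy: int, min_available: int, expected_pods: int) -> int:
    """Sketch of the PDB budget for an integer minAvailable (illustrative only)."""
    # With an integer minAvailable, the desired healthy count is that value.
    desired_healthy = min_available
    # No pods matched by the selector means nothing can be disrupted.
    if expected_pods <= 0:
        return 0
    # Never report a negative budget.
    return max(0, current_healthy - desired_healthy)


# Values taken from the PDB status in this comment.
print(disruptions_allowed(current_healthy=2, min_available=1, expected_pods=2))
```

With both thanos-querier replicas healthy, one pod can be evicted at a time; if only one replica were healthy, the sketch would return 0 and a voluntary eviction would be expected to be blocked.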
Test with payload 4.8.0-0.nightly-2021-06-10-210437

$ oc -n openshift-monitoring get PodDisruptionBudget thanos-querier-pdb -oyaml
----------------
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  generation: 1
  labels:
    app.kubernetes.io/component: query-layer
    app.kubernetes.io/instance: thanos-querier
    app.kubernetes.io/name: thanos-query
    app.kubernetes.io/version: 0.20.2
  name: thanos-querier-pdb
  namespace: openshift-monitoring
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app.kubernetes.io/component: query-layer
      app.kubernetes.io/instance: thanos-querier
      app.kubernetes.io/name: thanos-query
----------------------

$ oc -n openshift-monitoring get Pod |grep thanos-querier
thanos-querier-7979c4b9b5-9lnf7   5/5   Running   0   61m
thanos-querier-7979c4b9b5-vstlh   5/5   Running   0   61m
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2438