Bug 1955489

Summary:	Alertmanager Statefulsets should have 2 replicas and hard affinity set
Product:	OpenShift Container Platform	Reporter:	Simon Pasquier <spasquie>
Component:	Monitoring	Assignee:	Simon Pasquier <spasquie>
Status:	CLOSED ERRATA	QA Contact:	Junqi Zhao <juzhao>
Severity:	high	Docs Contact:	Brian Burt <bburt>
Priority:	low
Version:	4.8	CC:	anpicker, bburt, erooth, hongyli, jeder, rgudimet, wking
Target Milestone:	---	Keywords:	ServiceDeliveryImpact
Target Release:	4.10.0
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:	Previously, during {product-title} upgrades, the Alertmanager service might become unavailable because either the three Alertmanager pods were located on the same node or the nodes running the Alertmanager pods happened to reboot at the same time. This situation was possible because the Alertmanager pods had soft anti-affinity rules regarding node placement and no pod disruption budget. This release enables hard anti-affinity rules and pod disruption budgets to ensure no downtime during patch upgrades for the Alertmanager and other monitoring components. Consequence: alert notifications would not be dispatched for some time. Fix: the cluster monitoring operator configures hard anti-affinity rules to ensure that the Alertmanager pods are scheduled on different nodes. It also provisions a pod disruption budget to ensure that at least 1 Alertmanager pod is always running. Result: during upgrades, the nodes should reboot in sequence to ensure that at least 1 Alertmanager pod is always running.	Story Points:	---
Clone Of:		Environment:
Last Closed:	2022-03-10 16:03:07 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description Simon Pasquier 2021-04-30 08:49:47 UTC

Description of problem:

As mentioned in the conventions doc https://github.com/openshift/enhancements/blob/master/CONVENTIONS.md#high-availability, Alertmanager should have a replica count of 2 with hard affinities set until we bring the descheduler into our product.
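
To illustrate why 2 replicas are only safe when they are forced onto different nodes, here is a minimal Python sketch. It is not code from the cluster monitoring operator or the scheduler; the function names, the pod-to-node mapping shape and the node names are assumptions made for illustration.

```python
from collections import Counter


def violates_hard_anti_affinity(placement):
    """Return the nodes that host more than one replica.

    `placement` maps pod name -> node name (hypothetical input shape).
    With a per-hostname topology key, a hard rule allows at most one
    matching pod per node.
    """
    per_node = Counter(placement.values())
    return sorted(node for node, count in per_node.items() if count > 1)


def survives_single_node_reboot(placement):
    """True if at least one replica stays up whichever single node reboots."""
    nodes = set(placement.values())
    return all(
        any(node != down for node in placement.values())
        for down in nodes
    )


# Node names are made up for illustration.
spread = {"alertmanager-main-0": "node-a", "alertmanager-main-1": "node-b"}
stacked = {"alertmanager-main-0": "node-a", "alertmanager-main-1": "node-a"}

print(violates_hard_anti_affinity(spread))    # []
print(violates_hard_anti_affinity(stacked))   # ['node-a']
print(survives_single_node_reboot(spread))    # True
print(survives_single_node_reboot(stacked))   # False
```

With a soft rule the "stacked" placement is permitted, so a single node reboot can take down both replicas.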

Version-Release number of selected component (if applicable):
4.8

How reproducible:
Always

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Follow-up of bug 1949262. We need to be able to set minReadySeconds on statefulsets before the replica count can be decreased to 2 (see https://github.com/kubernetes/kubernetes/pull/100842).

Comment 11 Junqi Zhao 2021-11-25 03:40:22 UTC

Tested with the PR: the expected number of alertmanager pods changed to 2, but the pods cannot be started.
# oc -n openshift-monitoring get pdb alertmanager-main
NAME                MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
alertmanager-main   N/A             1                 0                     41m

# oc -n openshift-monitoring get pdb alertmanager-main -oyaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  creationTimestamp: "2021-11-25T02:52:56Z"
  generation: 1
  labels:
    app.kubernetes.io/component: alert-router
    app.kubernetes.io/name: alertmanager
    app.kubernetes.io/part-of: openshift-monitoring
    app.kubernetes.io/version: 0.22.2
  name: alertmanager-main
  namespace: openshift-monitoring
  resourceVersion: "95240"
  uid: 94eea939-798d-48a0-8f24-aa89aaa525c2
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      alertmanager: main
      app.kubernetes.io/component: alert-router
      app.kubernetes.io/name: alertmanager
      app.kubernetes.io/part-of: openshift-monitoring
status:
  conditions:
  - lastTransitionTime: "2021-11-25T03:34:55Z"
    message: ""
    observedGeneration: 1
    reason: InsufficientPods
    status: "False"
    type: DisruptionAllowed
  currentHealthy: 0
  desiredHealthy: 1
  disruptionsAllowed: 0
  expectedPods: 2
  observedGeneration: 1

# while true; do date; oc -n openshift-monitoring get pod | grep alertmanager; sleep 10s; echo -e "\n"; done
Wed Nov 24 22:29:54 EST 2021
alertmanager-main-0                            0/6     Terminating         0          3s
alertmanager-main-1                            0/6     ContainerCreating   0          3s


Wed Nov 24 22:30:10 EST 2021
alertmanager-main-0                            0/6     Terminating   0          1s
alertmanager-main-1                            0/6     Terminating   0          1s


Wed Nov 24 22:30:25 EST 2021
alertmanager-main-0                            0/6     Terminating   0          1s
alertmanager-main-1                            0/6     Terminating   0          1s


Wed Nov 24 22:30:41 EST 2021
alertmanager-main-1                            6/6     Terminating   0          6s


Wed Nov 24 22:30:56 EST 2021
alertmanager-main-1                            6/6     Terminating   0          6s


Wed Nov 24 22:31:11 EST 2021
alertmanager-main-0                            6/6     Terminating   0          4s
alertmanager-main-1                            6/6     Terminating   0          4s


Wed Nov 24 22:31:27 EST 2021
alertmanager-main-0                            0/6     Terminating   0          4s
alertmanager-main-1                            0/6     Terminating   0          4s


Wed Nov 24 22:31:42 EST 2021
alertmanager-main-0                            0/6     Terminating   0          3s
alertmanager-main-1                            0/6     Terminating   0          3s


Wed Nov 24 22:31:57 EST 2021
alertmanager-main-0                            0/6     ContainerCreating   0          0s
alertmanager-main-1                            0/6     Pending             0          0s


Wed Nov 24 22:32:13 EST 2021
alertmanager-main-0                            6/6     Terminating   0          6s

# oc -n openshift-monitoring get event | grep alertmanager-main
...
13s         Warning   FailedCreate               statefulset/alertmanager-main                            create Pod alertmanager-main-0 in StatefulSet alertmanager-main failed error: The POST operation against Pod could not be completed at this time, please try again.
13s         Warning   FailedCreate               statefulset/alertmanager-main                            create Pod alertmanager-main-0 in StatefulSet alertmanager-main failed error: The POST operation against Pod could not be completed at this time, please try again.
13s         Normal    SuccessfulCreate           statefulset/alertmanager-main                            create Pod alertmanager-main-0 in StatefulSet alertmanager-main successful
13s         Warning   FailedCreate               statefulset/alertmanager-main                            create Pod alertmanager-main-1 in StatefulSet alertmanager-main failed error: The POST operation against Pod could not be completed at this time, please try again.

Comment 14 Junqi Zhao 2021-11-25 03:52:09 UTC

# oc -n openshift-monitoring logs prometheus-operator-84c85586d6-bpf2r
...
level=info ts=2021-11-25T02:53:02.3094545Z caller=operator.go:814 component=alertmanageroperator key=openshift-monitoring/main msg="recreating AlertManager StatefulSet because the update operation wasn't possible" reason="Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', 'updateStrategy' and 'minReadySeconds' are forbidden"
level=info ts=2021-11-25T02:53:02.316758637Z caller=operator.go:741 component=alertmanageroperator key=openshift-monitoring/main msg="sync alertmanager"
level=info ts=2021-11-25T02:53:02.426700671Z caller=operator.go:814 component=alertmanageroperator key=openshift-monitoring/main msg="recreating AlertManager StatefulSet because the update operation wasn't possible" reason="Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', 'updateStrategy' and 'minReadySeconds' are forbidden"
level=info ts=2021-11-25T02:53:02.432181021Z caller=operator.go:741 component=alertmanageroperator key=openshift-monitoring/main msg="sync alertmanager"
level=info ts=2021-11-25T02:53:02.50330463Z caller=operator.go:814 component=alertmanageroperator key=openshift-monitoring/main msg="recreating AlertManager StatefulSet because the update operation wasn't possible" reason="Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', 'updateStrategy' and 'minReadySeconds' are forbidden"
level=info ts=2021-11-25T02:53:02.509158493Z caller=operator.go:741 component=alertmanageroperator key=openshift-monitoring/main msg="sync alertmanager"
level=info ts=2021-11-25T02:53:02.553180316Z caller=operator.go:814 component=alertmanageroperator key=openshift-monitoring/main msg="recreating AlertManager StatefulSet because the update operation wasn't possible" reason="Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', 'updateStrategy' and 'minReadySeconds' are forbidden"
...
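
The "Forbidden" reason in the log above lists the only StatefulSet spec fields that can be updated in place. A rough Python sketch of that rule follows. It is not the prometheus-operator code; the function names and the example specs are hypothetical, and the log does not say which field actually changed here.

```python
# Fields named as updatable in the "Forbidden" message from the log above.
MUTABLE_SPEC_FIELDS = {"replicas", "template", "updateStrategy", "minReadySeconds"}


def forbidden_changes(old_spec, new_spec):
    """Return the top-level spec fields that changed but cannot be updated in place."""
    keys = set(old_spec) | set(new_spec)
    return sorted(
        key
        for key in keys
        if key not in MUTABLE_SPEC_FIELDS and old_spec.get(key) != new_spec.get(key)
    )


def needs_recreate(old_spec, new_spec):
    """True if applying new_spec requires deleting and recreating the StatefulSet."""
    return bool(forbidden_changes(old_spec, new_spec))


# Made-up specs: a selector change is used only as an example of an
# immutable field; it is an assumption, not a finding from this bug.
old = {"replicas": 3, "selector": {"matchLabels": {"alertmanager": "main"}}}
new = {"replicas": 2, "selector": {"matchLabels": {"app.kubernetes.io/instance": "main"}}}

print(forbidden_changes(old, new))                      # ['selector']
print(needs_recreate({"replicas": 3}, {"replicas": 2}))  # False
```

If every sync sees a forbidden change, the operator falls back to recreating the StatefulSet each time, which is consistent with the repeated "recreating AlertManager StatefulSet" messages.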



The following should be removed from the alertmanager-main statefulset:
  finalizers:
  - foregroundDeletion
# oc -n openshift-monitoring get sts alertmanager-main -oyaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  annotations:
    prometheus-operator-input-hash: "14523878381744334873"
  creationTimestamp: "2021-11-25T03:49:58Z"
  deletionGracePeriodSeconds: 0
  deletionTimestamp: "2021-11-25T03:49:58Z"
  finalizers:
  - foregroundDeletion

Comment 16 Simon Pasquier 2021-11-26 10:00:13 UTC

Pull request submitted

Comment 19 Junqi Zhao 2021-12-13 10:15:04 UTC

Checked with 4.10.0-0.nightly-2021-12-12-184227; the fix is in it. Alertmanager Statefulsets have 2 replicas and hard affinity set.
# oc -n openshift-monitoring get pod | grep alertmanager-main
alertmanager-main-0                            6/6     Running   0          4h13m
alertmanager-main-1                            6/6     Running   0          4h12m

# oc -n openshift-monitoring get sts alertmanager-main -oyaml
...
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app.kubernetes.io/component: alert-router
                app.kubernetes.io/instance: main
                app.kubernetes.io/name: alertmanager
                app.kubernetes.io/part-of: openshift-monitoring
            namespaces:
            - openshift-monitoring
            topologyKey: kubernetes.io/hostname
# oc -n openshift-monitoring get pdb alertmanager-main 
NAME                MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
alertmanager-main   N/A             1                 1                     10h

# oc -n openshift-monitoring get pdb alertmanager-main  -oyaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  creationTimestamp: "2021-12-12T23:30:56Z"
  generation: 1
  labels:
    app.kubernetes.io/component: alert-router
    app.kubernetes.io/instance: main
    app.kubernetes.io/name: alertmanager
    app.kubernetes.io/part-of: openshift-monitoring
    app.kubernetes.io/version: 0.22.2
  name: alertmanager-main
  namespace: openshift-monitoring
  resourceVersion: "149472"
  uid: 74e9b3dd-a3c8-45fb-8b5a-6b627a0a3acd
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app.kubernetes.io/component: alert-router
      app.kubernetes.io/instance: main
      app.kubernetes.io/name: alertmanager
      app.kubernetes.io/part-of: openshift-monitoring
status:
  conditions:
  - lastTransitionTime: "2021-12-13T06:01:03Z"
    message: ""
    observedGeneration: 1
    reason: SufficientPods
    status: "True"
    type: DisruptionAllowed
  currentHealthy: 2
  desiredHealthy: 1
  disruptionsAllowed: 1
  expectedPods: 2
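
The status fields above (and in comment 11) are consistent with simple arithmetic on maxUnavailable. The Python sketch below is a simplified model, assuming an integer maxUnavailable and ignoring percentages and pods already being disrupted; the function name is hypothetical and this is not the disruption controller's actual code.

```python
def pdb_status(expected_pods, current_healthy, max_unavailable):
    """Model desiredHealthy and disruptionsAllowed for a maxUnavailable PDB."""
    # At least this many pods must stay healthy.
    desired_healthy = max(expected_pods - max_unavailable, 0)
    # Only healthy pods above that floor may be voluntarily evicted.
    disruptions_allowed = max(current_healthy - desired_healthy, 0)
    return {
        "desiredHealthy": desired_healthy,
        "disruptionsAllowed": disruptions_allowed,
    }


# Comment 11: expectedPods 2, currentHealthy 0, maxUnavailable 1.
print(pdb_status(2, 0, 1))  # {'desiredHealthy': 1, 'disruptionsAllowed': 0}

# Comment 19: expectedPods 2, currentHealthy 2, maxUnavailable 1.
print(pdb_status(2, 2, 1))  # {'desiredHealthy': 1, 'disruptionsAllowed': 1}
```

With 2 healthy replicas, one eviction at a time is allowed, so a node drain can proceed while the other Alertmanager pod keeps running.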

Comment 22 errata-xmlrpc 2022-03-10 16:03:07 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056