Description of problem:
The PodDisruptionBudgetAtLimit alert looks at: kube_poddisruptionbudget_status_expected_pods == kube_poddisruptionbudget_status_desired_healthy
With maxUnavailable (or minAvailable as a percentage), expectedPods is set to the number of replicas.
With an integer minAvailable, expectedPods is set to the actual current number of pods.
If you have, for example, a DC with 3 replicas and maxUnavailable = 2, desiredHealthy will be 1. The PDB is at its limit, but expected (3) will never equal desired healthy (1), so the alert will never fire.
In the first two cases (maxUnavailable, or minAvailable as a percentage), expectedPods can never equal desiredHealthy, so you would never get a PodDisruptionBudgetAtLimit alert for etcd-quorum-guard.
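To make that concrete, here is a sketch of the series the current expression compares for the etcd-quorum-guard PDB, assuming its usual shape of 3 replicas with maxUnavailable: 1 (the sample values are illustrative):

  kube_poddisruptionbudget_status_expected_pods{namespace="openshift-etcd", poddisruptionbudget="etcd-quorum-guard"}    3
  kube_poddisruptionbudget_status_desired_healthy{namespace="openshift-etcd", poddisruptionbudget="etcd-quorum-guard"}  2

3 == 2 can never be true, no matter how many quorum-guard pods are actually healthy.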
The same applies to the critical alert PodDisruptionBudgetLimit: kube_poddisruptionbudget_status_expected_pods < kube_poddisruptionbudget_status_desired_healthy
With maxUnavailable (or minAvailable as a percentage), expectedPods will never be less than desiredHealthy.
Is the alert wrong, or should all code paths in the PDB controller set expectedPods to the actual number of running pods?
To fix the alerts, we should compare current healthy (kube_poddisruptionbudget_status_current_healthy) to desired healthy.
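A minimal sketch of the corrected expressions, assuming the alert names and severities otherwise stay as they are today:

  # PodDisruptionBudgetAtLimit (warning): exactly as many healthy pods as the PDB requires
  expr: kube_poddisruptionbudget_status_current_healthy == kube_poddisruptionbudget_status_desired_healthy
  # PodDisruptionBudgetLimit (critical): fewer healthy pods than the PDB requires
  expr: kube_poddisruptionbudget_status_current_healthy < kube_poddisruptionbudget_status_desired_healthy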
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Cordon a master node
2. Delete an etcd-quorum-guard pod
3. Observe that no alert fires (see the query sketch below)
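One way to confirm that the PDB really is at its limit while the alert stays silent is to query the PDB status metrics directly (a sketch; the sample values assume 3 replicas with maxUnavailable: 1):

  kube_poddisruptionbudget_status_current_healthy{namespace="openshift-etcd"}   2
  kube_poddisruptionbudget_status_desired_healthy{namespace="openshift-etcd"}   2
  kube_poddisruptionbudget_status_expected_pods{namespace="openshift-etcd"}     3

current_healthy has dropped to desired_healthy, so no further disruption is allowed, yet expected_pods (3) still does not equal desired_healthy (2) and the existing alert stays quiet.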
This caused a regression in upgrade jobs: it effectively assumes that all master nodes must upgrade within 15 minutes.
Instead, this alert should use a more sophisticated expression:
count_over_time((kube_poddisruptionbudget_status_current_healthy < kube_poddisruptionbudget_status_desired_healthy)[15m:10s]) > 0
to ensure that the PDB was not violated for more than 10 seconds within a 15-minute window.
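A sketch of what such a rule could look like as a Prometheus alerting rule (the alert name, severity, and annotation text are illustrative, not the shipped rule):

  - alert: PodDisruptionBudgetLimit
    expr: |
      count_over_time((kube_poddisruptionbudget_status_current_healthy
        < kube_poddisruptionbudget_status_desired_healthy)[15m:10s]) > 0
    labels:
      severity: critical
    annotations:
      summary: The pod disruption budget had fewer healthy pods than it requires at some point in the last 15 minutes.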
A better idea: check the `cluster_version` metric; if `type` is `updating`, the alert should not fire.
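In PromQL, that suggestion could be expressed roughly like this (a sketch; it assumes the CVO exposes a cluster_version{type="updating"} series only while an update is in progress):

  # Suppress the alert while the cluster is updating:
  expr: |
    (kube_poddisruptionbudget_status_current_healthy < kube_poddisruptionbudget_status_desired_healthy)
      unless on() cluster_version{type="updating"}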
Not firing the alert during upgrades would be an issue as well. That is how we found the issue with the alert.
A customer had some bad PDBs that caused the MCP rollout to hang for hours on the 4.6.25 upgrade before someone noticed. Only then did we realize the alerts were broken.
The alert can now be seen with the latest payload:
[root@localhost ~]# oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.nightly-2021-06-11-024306   True        False         9m30s   Cluster version is 4.8.0-0.nightly-2021-06-11-024306
1) Cordon one of the master nodes:
[root@localhost ~]# oc adm cordon yinzhou-bug-pkv6w-master-0.c.openshift-qe.internal
[root@localhost ~]# oc get node
NAME                                                 STATUS                     ROLES    AGE   VERSION
yinzhou-bug-pkv6w-master-0.c.openshift-qe.internal   Ready,SchedulingDisabled   master   50m   v1.21.0-rc.0+a5ec692
2) Delete one of the etcd-quorum-guard pods:
[root@localhost ~]# oc delete po etcd-quorum-guard-b8668f655-28c4x -n openshift-etcd
pod "etcd-quorum-guard-b8668f655-28c4x" deleted
[root@localhost ~]# oc get po
NAME                                READY   STATUS    RESTARTS   AGE
etcd-quorum-guard-b8668f655-5z524   1/1     Running   0          49m
etcd-quorum-guard-b8668f655-ck6ps   0/1     Pending   0          14s
3) Wait for some time, then check the alert:
[root@localhost ~]# token=`oc sa get-token prometheus-k8s -n openshift-monitoring`
[root@localhost ~]# oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -k -H "Authorization: Bearer $token" 'https://prometheus-k8s.openshift-monitoring.svc:9091/api/v1/alerts' | jq
"description": "Pod openshift-etcd/etcd-quorum-guard-b8668f655-ck6ps has been in a non-ready state for longer than 15 minutes.",
"summary": "Pod has been in a non-ready state for more than 15 minutes."
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.