This will be particularly useful for PDBs. Today these only get the namespace label, but if arbitrary labels could be picked up from the PDB, it would make route-based alerting in Alertmanager easier.
Hi David, I've created a PR (linked) that exposes the PDB metrics via the allow-list. However, if you look at the upstream documentation, https://github.com/kubernetes/kube-state-metrics/blob/master/docs/poddisruptionbudget-metrics.md, you can see that this particular resource has no additional labels available to bolt on. We could treat this as the first step of an RFE: I can propose upstream that we add kube_poddisruptionbudget_labels and kube_poddisruptionbudget_annotations metrics, which I believe is the first step in solving your requirements. These metrics are already available on many other resources, such as namespaces: https://github.com/kubernetes/kube-state-metrics/blob/master/docs/namespace-metrics.md
OK, so what you are saying is that https://github.com/openshift/cluster-monitoring-operator/pull/1439/files#diff-b61f7d6e3529525eef15693c9529b4e065ac3e9d1af6308573e42e825fc1218bR37 won't expose the labels on the PDB, because KSM does not expose them (https://github.com/kubernetes/kube-state-metrics/blob/master/docs/poddisruptionbudget-metrics.md)? Having kube_poddisruptionbudget_labels/annotations makes sense, as one can then provide labels and hence do routing in Alertmanager based on them, which is very useful and is how we do routing. Another option would be to join this metric with the namespace labels, so that one could simply label the namespace to obtain the routing - but that's not how the other alerts are designed in OCP, so I guess we don't want to go down that route?
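For illustration, the namespace-join variant could be written roughly like this. This is only a sketch: kube_namespace_labels is the upstream namespace labels metric, and label_team is a hypothetical namespace label used purely as an example.

```promql
# Sketch only: fire when no disruptions are allowed, pulling a hypothetical
# "team" label from the namespace onto the series for Alertmanager routing.
(kube_poddisruptionbudget_status_pod_disruptions_allowed == 0)
  * on (namespace) group_left (label_team)
    kube_namespace_labels
```

The group_left carries the extra label from kube_namespace_labels onto the matched PDB status series, so a route in Alertmanager could then match on team.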
Hi David, yes, you are correct: there will be no additional series or labels beyond https://github.com/kubernetes/kube-state-metrics/blob/v2.2.3/docs/poddisruptionbudget-metrics.md exposed via KSM. We have now merged https://github.com/kubernetes/kube-state-metrics/pull/1623 to move this RFE forward and expose those additional series, and I'll merge https://github.com/openshift/cluster-monitoring-operator/pull/1439 as well. Regarding the join: that would indeed work, but the majority of the alerts are pulled from upstream, and I don't think that is the road we want to go down to fill individual use cases. Hopefully that is understandable. I think the above changes, in combination with https://issues.redhat.com/browse/OBSDA-2, will allow you to tweak the alerts according to your specific needs. Let me know if that satisfies this RFE and we can close it. Thanks
I think this is as good as it can get at this stage, thanks! This can be closed.
As mentioned, we need to wait for a KSM release to be cut that includes https://github.com/kubernetes/kube-state-metrics/pull/1623, and pull it into our downstream fork, before verifying this change.
Reassigning to @filip since the final piece of this ticket requires cutting a new release of KSM, which is scheduled for mid-December. That, in conjunction with the ability to override the default alerts (https://github.com/openshift/enhancements/pull/958) and https://github.com/openshift/cluster-monitoring-operator/pull/1439, should provide the customer with the ability to achieve what they want, and we can then close the RFE.
Tested with 4.10.0-0.nightly-2021-12-18-034942. kube_poddisruptionbudget_annotations and kube_poddisruptionbudget_labels are added, but we can only see the PDB labels in kube_poddisruptionbudget_labels; we can't see the PDB annotations in kube_poddisruptionbudget_annotations.

# oc -n openshift-monitoring get deploy kube-state-metrics -oyaml | grep metric-labels-allowlist
        - --metric-labels-allowlist=pods=[*],nodes=[*],namespaces=[*],persistentvolumes=[*],persistentvolumeclaims=[*],poddisruptionbudgets=[*],poddisruptionbudget=[*]

# token=`oc sa get-token prometheus-k8s -n openshift-monitoring`
# oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -k -H "Authorization: Bearer $token" 'https://prometheus-k8s.openshift-monitoring.svc:9091/api/v1/label/__name__/values' | jq | grep poddisruptionbudget
    "kube_poddisruptionbudget_annotations",
    "kube_poddisruptionbudget_labels",
    "kube_poddisruptionbudget_status_current_healthy",
    "kube_poddisruptionbudget_status_desired_healthy",
    "kube_poddisruptionbudget_status_expected_pods",
    "kube_poddisruptionbudget_status_observed_generation",
    "kube_poddisruptionbudget_status_pod_disruptions_allowed",

PDB file
**********************
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: zk-cm
  annotations:
    imageregistry: "https://hub.docker.com/"
    contactor: help
  labels:
    app.kubernetes.io/component: zookeeper
    app.kubernetes.io/instance: main
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: zookeeper
**********************

# oc -n default get pdb zk-pdb -oyaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  annotations:
    contactor: help
    imageregistry: https://hub.docker.com/
  creationTimestamp: "2021-12-20T10:50:51Z"
  generation: 1
  labels:
    app.kubernetes.io/component: zookeeper
    app.kubernetes.io/instance: main
  name: zk-pdb
  namespace: default
  resourceVersion: "211532"
  uid: ef4b4060-314c-46de-85fb-592b098c8c93
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: zookeeper
status:
  conditions:
  - lastTransitionTime: "2021-12-20T10:50:51Z"
    message: ""
    observedGeneration: 1
    reason: InsufficientPods
    status: "False"
    type: DisruptionAllowed
  currentHealthy: 0
  desiredHealthy: 2
  disruptionsAllowed: 0
  expectedPods: 0
  observedGeneration: 1
**********************

We can see the PDB labels in kube_poddisruptionbudget_labels:

kube_poddisruptionbudget_labels{container="kube-rbac-proxy-main", endpoint="https-main", job="kube-state-metrics", label_app_kubernetes_io_component="zookeeper", label_app_kubernetes_io_instance="main", namespace="default", poddisruptionbudget="zk-pdb", service="kube-state-metrics"} 1

but we can't find the PDB annotations in kube_poddisruptionbudget_annotations:

kube_poddisruptionbudget_annotations{container="kube-rbac-proxy-main", endpoint="https-main", job="kube-state-metrics", namespace="default", poddisruptionbudget="zk-pdb", service="kube-state-metrics"} 1

We also find that we cannot get annotations from the other kube_*_annotations metrics, for example kube_daemonset_annotations and kube_deployment_annotations:

# oc -n openshift-monitoring get ds node-exporter -o jsonpath="{.metadata.annotations}"
{"deprecated.daemonset.template.generation":"1"}

result from Prometheus:

kube_daemonset_annotations{container="kube-rbac-proxy-main", daemonset="node-exporter", endpoint="https-main", job="kube-state-metrics", namespace="openshift-monitoring", service="kube-state-metrics"} 1

# oc -n openshift-monitoring get deploy cluster-monitoring-operator -o jsonpath="{.metadata.annotations}"
{"deployment.kubernetes.io/revision":"1","include.release.openshift.io/self-managed-high-availability":"true","include.release.openshift.io/single-node-developer":"true"}

result from Prometheus:

kube_deployment_annotations{container="kube-rbac-proxy-main", deployment="cluster-monitoring-operator", endpoint="https-main", job="kube-state-metrics", namespace="openshift-monitoring", prometheus="openshift-monitoring/k8s", service="kube-state-metrics"} 1
@Junqi it is expected that the annotations aren't present in kube_poddisruptionbudget_annotations: we chose to expose kube_poddisruptionbudget_labels only, which should be enough to filter on.
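For example (a sketch, not a shipped rule), the exposed labels can be joined onto one of the PDB status series so that filtering and routing can key on them; label_app_kubernetes_io_component here comes from the test PDB in the previous comment:

```promql
# Sketch: attach an allow-listed PDB label to a status series
# so it can be used for filtering or Alertmanager routing.
kube_poddisruptionbudget_status_current_healthy
  * on (namespace, poddisruptionbudget) group_left (label_app_kubernetes_io_component)
    kube_poddisruptionbudget_labels
```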
based on Comment 14 and 15, set to VERIFIED
(In reply to Junqi Zhao from comment #16)
> based on Comment 14 and 15, set to VERIFIED

Change to: based on Comment 13 and 15, set to VERIFIED
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:0056