Increasing severity to match the CU ticket on parent bz1854009.
MCO PR https://github.com/openshift/machine-config-operator/pull/2023
Verified on 4.5.0-0.nightly-2020-09-12-063044. Stopped the kubelet service on one node and saw the alerts fire; they cleared once kubelet was started again.

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.5.0-0.nightly-2020-09-12-063044   True        False         144m    Cluster version is 4.5.0-0.nightly-2020-09-12-063044

$ oc get nodes
NAME                                         STATUS   ROLES    AGE    VERSION
ip-10-0-130-237.us-east-2.compute.internal   Ready    worker   154m   v1.18.3+b0068a8
ip-10-0-152-249.us-east-2.compute.internal   Ready    master   165m   v1.18.3+b0068a8
ip-10-0-167-191.us-east-2.compute.internal   Ready    master   166m   v1.18.3+b0068a8
ip-10-0-190-64.us-east-2.compute.internal    Ready    worker   154m   v1.18.3+b0068a8
ip-10-0-200-110.us-east-2.compute.internal   Ready    master   166m   v1.18.3+b0068a8
ip-10-0-220-35.us-east-2.compute.internal    Ready    worker   154m   v1.18.3+b0068a8

$ token=`oc sa get-token prometheus-k8s -n openshift-monitoring`

With kubelet stopped on the node, the query returns one series:

$ oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -k -g -H "Authorization: Bearer $token" 'https://prometheus-k8s.openshift-monitoring.svc:9091/api/v1/query?query=mcd_kubelet_state>2' | jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   334  100   334    0     0   4517      0 --:--:-- --:--:-- --:--:--  4575
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__name__": "mcd_kubelet_state",
          "endpoint": "metrics",
          "instance": "10.0.130.237:9001",
          "job": "machine-config-daemon",
          "namespace": "openshift-machine-config-operator",
          "pod": "machine-config-daemon-xqddl",
          "service": "machine-config-daemon"
        },
        "value": [
          1600077068.81,
          "3"
        ]
      }
    ]
  }
}

After kubelet was started again, the same query returns an empty result:

$ oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -k -g -H "Authorization: Bearer $token" 'https://prometheus-k8s.openshift-monitoring.svc:9091/api/v1/query?query=mcd_kubelet_state>2' | jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    63  100    63    0     0    768      0 --:--:-- --:--:-- --:--:--   777
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": []
  }
}
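For anyone repeating this verification, the check on the two query responses above could be scripted roughly as follows. This is only a sketch: the function name `firing_kubelet_series` and the embedded sample strings are illustrative (the samples are trimmed copies of the responses in this comment), and it assumes the response body has already been fetched with the curl command shown above.

```python
import json

# Trimmed copies of the two responses from this comment (illustrative samples).
FIRING = """{"status": "success", "data": {"resultType": "vector", "result": [
  {"metric": {"__name__": "mcd_kubelet_state",
              "instance": "10.0.130.237:9001",
              "pod": "machine-config-daemon-xqddl"},
   "value": [1600077068.81, "3"]}]}}"""
CLEARED = '{"status": "success", "data": {"resultType": "vector", "result": []}}'


def firing_kubelet_series(response_text):
    """Return (instance, pod, value) for every series in an instant-query response.

    Hypothetical helper: because the query itself is `mcd_kubelet_state>2`,
    any series present in the result is a node currently over the threshold.
    """
    body = json.loads(response_text)
    if body.get("status") != "success":
        raise ValueError("query did not succeed: %r" % body.get("status"))
    series = []
    for item in body["data"]["result"]:
        metric = item["metric"]
        _timestamp, value = item["value"]  # the sample value arrives as a string
        series.append((metric.get("instance"), metric.get("pod"), float(value)))
    return series


if __name__ == "__main__":
    print(firing_kubelet_series(FIRING))   # one series while kubelet is stopped
    print(firing_kubelet_series(CLEARED))  # empty once kubelet is running again
```

An empty list corresponds to the cleared state; a non-empty list names the machine-config-daemon pod and instance reporting the kubelet failure.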
Created attachment 1714740: metrics-fix-confirmation
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.5.11 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:3719