Description of problem:
Our customer is observing that when a pod monitored by the TargetDown alert is deleted and recreated, the TargetDown alert triggers but does not clear in a reasonable amount of time. They have observed this behavior with multiple targets (e.g. ElasticSearch, catalog-operator).

Version-Release number of selected component (if applicable):
OpenShift 4.6.26

How reproducible:
The customer can readily reproduce this issue in their environment, but I was unable to reproduce it independently.

Steps to Reproduce:
1. Delete the catalog-operator pod from the openshift-operator-lifecycle-manager namespace; the catalog-operator pod should be recreated automatically.
2. Run this query in Prometheus to observe the target: up{service="catalog-operator-metrics"}
3. Optionally, observe that the TargetDown alert is generated in the OpenShift web console dashboard.
4. Observe that the alert does not clear in a reasonable amount of time.

Actual results:
The customer observed the TargetDown alert remain active for several days, even though the ElasticSearch pods were deleted, recreated, and back in Ready state within minutes.

Expected results:
The TargetDown alert should clear quickly after a deleted pod is recreated.

Additional info:
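If it helps with collecting data, step 2 could be scripted so the output can be attached here. The following is only a minimal sketch, not something the customer has run: it assumes the standard Prometheus HTTP API instant-query endpoint (/api/v1/query) is reachable, and the base URL and bearer token passed to fetch() are hypothetical placeholders that would need to be replaced with values from the affected cluster. The helper names (build_query_url, down_targets, fetch) are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

# Query taken from step 2 of the reproduction steps.
QUERY = 'up{service="catalog-operator-metrics"}'


def build_query_url(base_url, query):
    """Build an instant-query URL for the Prometheus HTTP API."""
    params = urllib.parse.urlencode({"query": query})
    return base_url.rstrip("/") + "/api/v1/query?" + params


def down_targets(payload):
    """Return the label sets of all series whose current sample is 0."""
    if payload.get("status") != "success":
        raise ValueError("query failed: %r" % (payload.get("error"),))
    down = []
    for series in payload["data"]["result"]:
        # An instant vector sample is a [timestamp, "value"] pair.
        _timestamp, value = series["value"]
        if float(value) == 0.0:
            down.append(series["metric"])
    return down


def fetch(base_url, token, query=QUERY):
    """Run the query; base_url and token are placeholders (assumption)."""
    request = urllib.request.Request(
        build_query_url(base_url, query),
        headers={"Authorization": "Bearer " + token},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling down_targets(fetch(base_url, token)) a few minutes after the pod is recreated would list any series that still report 0, including their pod and instance labels, which may help show which target the alert is still counting as down.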
*** Bug 1943860 has been marked as a duplicate of this bug. ***