https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_ovn-kubernetes/875/pull-ci-openshift-ovn-kubernetes-master-e2e-aws-ovn-local-gateway/1471574928024145920 [sig-arch] events should not repeat pathologically 1 events happened too frequently event happened 21 times, something is wrong: ns/loki pod/loki-promtail-9fmk2 node/ip-10-0-144-145.us-west-2.compute.internal - reason/Unhealthy Readiness probe failed: Get "http://10.131.0.17:3101/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers) Looking at the logs for loki-promtail-9fmk2 we see many instances of ratelimiting continuing through the test, like: level=warn ts=2021-12-16T21:07:49.256075215Z caller=client.go:349 component=client host=logs-prod3.grafana.net msg="error sending batch, will retry" status=429 error="server returned HTTP status 429 Too Many Requests (429): {\"code\":429,\"status\":\"error\",\"message\":\"Ingestion rate limit exceeded for user 18459 (limit: 6291456 bytes/sec) while attempting to ingest '598' lines totaling '102332' bytes, reduce log volume or contact your Loki administrator to see if the limit can be increased\"}" Dec 16 21:07:50.000 W ns/loki pod/loki-promtail-9fmk2 node/ip-10-0-144-145.us-west-2.compute.internal reason/Unhealthy Readiness probe failed: Get "http://10.131.0.17:3101/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers) Is it possible the health checks are timing out because the promtail pod is too busy? Dec 16 21:07:50.000 W ns/loki pod/loki-promtail-9fmk2 node/ip-10-0-144-145.us-west-2.compute.internal reason/Unhealthy Readiness probe failed: Get "http://10.131.0.17:3101/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers) Dec 16 21:12:40.000 W ns/loki pod/loki-promtail-9fmk2 node/ip-10-0-144-145.us-west-2.compute.internal reason/Unhealthy Readiness probe failed: Get "http://10.131.0.17:3101/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers) (2 times) Dec 16 21:12:50.000 W ns/loki pod/loki-promtail-9fmk2 node/ip-10-0-144-145.us-west-2.compute.internal reason/Unhealthy Readiness probe failed: Get "http://10.131.0.17:3101/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers) (3 times) Dec 16 21:13:00.000 W ns/loki pod/loki-promtail-9fmk2 node/ip-10-0-144-145.us-west-2.compute.internal reason/Unhealthy Readiness probe failed: Get "http://10.131.0.17:3101/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers) (4 times) Dec 16 21:13:10.000 W ns/loki pod/loki-promtail-9fmk2 node/ip-10-0-144-145.us-west-2.compute.internal reason/Unhealthy Readiness probe failed: Get "http://10.131.0.17:3101/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers) (5 times) Dec 16 21:13:20.000 W ns/loki pod/loki-promtail-9fmk2 node/ip-10-0-144-145.us-west-2.compute.internal reason/Unhealthy Readiness probe failed: Get "http://10.131.0.17:3101/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers) (6 times) Dec 16 21:13:21.000 W ns/loki pod/loki-promtail-9fmk2 node/ip-10-0-144-145.us-west-2.compute.internal reason/Unhealthy Readiness probe failed: Get "http://10.131.0.17:3101/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers) (7 times) Dec 16 21:13:30.000 W ns/loki pod/loki-promtail-9fmk2 node/ip-10-0-144-145.us-west-2.compute.internal reason/Unhealthy Readiness probe failed: Get "http://10.131.0.17:3101/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers) (8 times) Dec 16 21:13:40.000 W ns/loki pod/loki-promtail-9fmk2 node/ip-10-0-144-145.us-west-2.compute.internal reason/Unhealthy Readiness probe failed: Get "http://10.131.0.17:3101/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers) (9 times) Dec 16 21:14:40.000 W ns/loki pod/loki-promtail-9fmk2 node/ip-10-0-144-145.us-west-2.compute.internal reason/Unhealthy Readiness probe failed: Get "http://10.131.0.17:3101/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers) (10 times) Dec 16 21:16:50.000 W ns/loki pod/loki-promtail-9fmk2 node/ip-10-0-144-145.us-west-2.compute.internal reason/Unhealthy Readiness probe failed: Get "http://10.131.0.17:3101/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers) (11 times) Dec 16 21:19:00.000 W ns/loki pod/loki-promtail-9fmk2 node/ip-10-0-144-145.us-west-2.compute.internal reason/Unhealthy Readiness probe failed: Get "http://10.131.0.17:3101/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers) (12 times) Dec 16 21:19:10.000 W ns/loki pod/loki-promtail-9fmk2 node/ip-10-0-144-145.us-west-2.compute.internal reason/Unhealthy Readiness probe failed: Get "http://10.131.0.17:3101/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers) (13 times) Dec 16 21:19:20.000 W ns/loki pod/loki-promtail-9fmk2 node/ip-10-0-144-145.us-west-2.compute.internal reason/Unhealthy Readiness probe failed: Get "http://10.131.0.17:3101/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers) (14 times) Dec 16 21:19:30.000 W ns/loki pod/loki-promtail-9fmk2 node/ip-10-0-144-145.us-west-2.compute.internal reason/Unhealthy Readiness probe failed: Get "http://10.131.0.17:3101/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers) (15 times) Dec 16 21:19:40.000 W ns/loki pod/loki-promtail-9fmk2 node/ip-10-0-144-145.us-west-2.compute.internal reason/Unhealthy Readiness probe failed: Get "http://10.131.0.17:3101/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers) (16 times) Dec 16 21:19:41.000 W ns/loki pod/loki-promtail-9fmk2 node/ip-10-0-144-145.us-west-2.compute.internal reason/Unhealthy Readiness probe failed: Get "http://10.131.0.17:3101/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers) (17 times) Dec 16 21:19:50.000 W ns/loki pod/loki-promtail-9fmk2 node/ip-10-0-144-145.us-west-2.compute.internal reason/Unhealthy Readiness probe failed: Get "http://10.131.0.17:3101/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers) (18 times) Dec 16 21:20:00.000 W ns/loki pod/loki-promtail-9fmk2 node/ip-10-0-144-145.us-west-2.compute.internal reason/Unhealthy Readiness probe failed: Get "http://10.131.0.17:3101/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers) (19 times) Dec 16 21:20:10.000 W ns/loki pod/loki-promtail-9fmk2 node/ip-10-0-144-145.us-west-2.compute.internal reason/Unhealthy Readiness probe failed: Get "http://10.131.0.17:3101/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers) (20 times) Dec 16 21:20:20.000 W ns/loki pod/loki-promtail-9fmk2 node/ip-10-0-144-145.us-west-2.compute.internal reason/Unhealthy Readiness probe failed: Get "http://10.131.0.17:3101/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers) (21 times) There are 167 instances of: level=warn ts=2021-12-16T20:57:14.991170953Z caller=client.go:349 component=client host=logs-prod3.grafana.net msg="error sending batch, will retry" status=429 error="server returned HTTP status 429 Too Many Requests (429): {\"code\":429,\"status\":\"error\",\"message\":\"Ingestion rate limit exceeded for user 18459 (limit: 6291456 bytes/sec) while attempting to ingest '722' lines totaling '102341' bytes, reduce log volume or contact your Loki administrator to see if the limit can be increased\"}" https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_ovn-kubernetes/875/pull-ci-openshift-ovn-kubernetes-master-e2e-aws-ovn-local-gateway/1471574928024145920/artifacts/e2e-aws-ovn-local-gateway/gather-extra/artifacts/pods/loki_loki-promtail-9fmk2_promtail.log