Created attachment 1714175 [details]
Multi platform trends

Description of problem:

Analyzing a week's worth of CI stats for the e2e-* periodic jobs across AWS/GCP/Azure reveals that the connectivity checker is causing excessive events and possibly etcd db growth on Azure. See attached screenshots.

Here's a representative example:

https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-azure-4.6/1303104261227286528

$ jq -r '.items[] | .metadata.namespace + "/" + .reason' < events.json | sort | uniq -c | sort -bnr | head -5
   2372 openshift-kube-apiserver/ConnectivityRestored
   2140 openshift-apiserver/ConnectivityRestored
    128 openshift-authentication-operator/OperatorStatusChanged
     81 openshift-apiserver/ConnectivityOutageDetected
     78 openshift-monitoring/Pulled

For comparison, from a similar GCP job:

$ jq -r '.items[] | .metadata.namespace + "/" + .reason' < events.json | sort | uniq -c | sort -bnr | head -5
    108 openshift-kube-controller-manager-operator/OperatorStatusChanged
    106 openshift-authentication-operator/OperatorStatusChanged
     89 openshift-kube-apiserver-operator/OperatorStatusChanged
     87 openshift-etcd-operator/OperatorStatusChanged
     81 openshift-kube-apiserver/Pulled

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
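For anyone who wants to repeat the per-namespace event count across several job artifacts at once, here is a minimal Python sketch of the same tally the jq pipeline above produces. It assumes each events.json has the same shape the jq filter relies on (a top-level "items" list whose entries carry "metadata.namespace" and "reason"). The function names count_event_reasons and load_events are illustrative and not part of any OpenShift tooling. Ties may be ordered differently than with sort -bnr.

```python
import json
import sys
from collections import Counter


def count_event_reasons(events, top=5):
    """Return the `top` most frequent (namespace/reason, count) pairs.

    `events` is the parsed content of an events.json file; the layout
    (items[].metadata.namespace, items[].reason) is assumed from the
    jq filter in the report.
    """
    counts = Counter(
        "{}/{}".format(
            (item.get("metadata") or {}).get("namespace") or "",
            item.get("reason") or "",
        )
        for item in events.get("items", [])
    )
    return counts.most_common(top)


def load_events(path):
    """Parse one events.json file from disk."""
    with open(path) as fh:
        return json.load(fh)


if __name__ == "__main__":
    # Usage (hypothetical file names): python count_events.py azure.json gcp.json
    for path in sys.argv[1:]:
        print(path)
        for key, n in count_event_reasons(load_events(path)):
            print("{:7d} {}".format(n, key))
```

Passing the Azure and GCP files together prints both top-5 lists one after the other, which makes a ConnectivityRestored spike like the one above easy to spot.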
Created attachment 1714176 [details] Azure example
Created attachment 1714177 [details] DB size trends
I spent some time trying to verify this but have not finished yet. BTW, the linked PR is a KAS-O PR, so I am setting the component accordingly. Should there be an OAS-O PR too?
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4196