Description of problem: Discovered during the 3.9 upgrade of the ca-central-1 cluster. gRPC connections do not appear to be freed, which causes the API server to get stuck until it eventually has to be killed and restarted. David wrote a monitoring tool that detects this situation and automatically panics the API server, which forces it to restart. We need to determine the root cause. It looks like a race condition in the gRPC library.
Created attachment 1397069 [details] /debug/pprof/goroutine?debug=1
Created attachment 1397085 [details] /metrics snapshot
The linked PR has merged.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:0489