Description of problem:
kube-apiserver on a UPI-installed cluster produces a lot of "http: TLS handshake error from <pod IP and port>: EOF" messages; this problem does not occur on an IPI-installed cluster. Here are snipped logs:

I0630 07:40:57.787360 1 log.go:172] http: TLS handshake error from 10.0.13.169:51432: EOF
I0630 07:40:58.061804 1 log.go:172] http: TLS handshake error from 10.0.62.240:40917: EOF
I0630 07:40:59.080841 1 log.go:172] http: TLS handshake error from 10.0.13.169:58219: EOF
I0630 07:41:00.275757 1 log.go:172] http: TLS handshake error from 10.0.13.169:3724: EOF
I0630 07:41:00.869014 1 log.go:172] http: TLS handshake error from 10.0.13.169:22398: EOF
I0630 07:41:01.124229 1 log.go:172] http: TLS handshake error from 10.0.62.240:34347: EOF
I0630 07:41:01.235998 1 log.go:172] http: TLS handshake error from 10.0.13.169:12987: EOF
...

Version-Release number of selected component (if applicable):
4.5.0-0.nightly-2020-06-26-215024

How reproducible:
always

Actual results:
$ oc logs -n openshift-kube-apiserver kube-apiserver-ip-10-0-52-119.ap-northeast-1.compute.internal > kas.log
$ grep -c 'TLS handshake error from 10.0.' kas.log
99789

Different kube-apiserver pods can show EOFs from two different IPs, see below:

$ grep 'TLS handshake error from 10.0.' kas.log | awk -F: '{print $5}' | sort | uniq
TLS handshake error from 10.0.13.169
TLS handshake error from 10.0.62.240

$ oc logs -n openshift-kube-apiserver kube-apiserver-ip-10-0-63-85.ap-northeast-1.compute.internal > kas1.log
$ grep 'TLS handshake error from 10.0.' kas1.log | awk -F: '{print $5}' | sort | uniq
TLS handshake error from 10.0.13.169
TLS handshake error from 10.0.62.240

$ oc logs -n openshift-kube-apiserver kube-apiserver-ip-10-0-71-186.ap-northeast-1.compute.internal > kas2.log
$ grep 'TLS handshake error from 10.0.' kas2.log | awk -F: '{print $5}' | sort | uniq
TLS handshake error from 10.0.23.192
TLS handshake error from 10.0.66.64

Further checking: these IPs cannot be found among the running pods (the grep below returns nothing).

$ oc get po -A -o wide | grep -E '10.0.13.169|10.0.62.240'

Expected results:
kube-apiserver should not continuously generate a large number of similar TLS handshake errors.

Additional info:
This is probably closely related to an incorrect UPI setup; compare https://github.com/openshift/openshift-docs/pull/21336#issuecomment-651635804. We don't own the load balancer setup in UPI, and the EOFs are very probably caused by it.
The load balancer is misconfigured. With plain TCP probes we get this output; use proper HTTPS probes against /readyz and it goes away. The docs for UPI were wrong for quite some time. I believe they finally got fixed by describing how to set up the probes.
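For illustration, here is a minimal sketch of such a probe in HAProxy, assuming HAProxy is the user-provided load balancer and the API servers listen on port 6443; the backend name and server addresses are hypothetical:

# Hypothetical HAProxy backend for the API servers; server names and
# addresses are placeholders. "option httpchk" turns the health check
# from a bare TCP connect into an HTTP request, "check-ssl" sends it
# over TLS, and "verify none" skips certificate verification for the
# probe itself.
backend openshift-api-server
    balance source
    mode tcp
    option httpchk GET /readyz HTTP/1.0
    http-check expect status 200
    server master0 10.0.13.10:6443 check check-ssl verify none
    server master1 10.0.13.11:6443 check check-ssl verify none
    server master2 10.0.13.12:6443 check check-ssl verify none

With a probe like this, the balancer completes the TLS handshake instead of closing the connection right after the TCP connect, which is what produces the EOF lines above.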
(In reply to Stefan Schimanski from comment #4)
> The load balancer is misconfigured. With plain TCP probes we get this
> output; use proper HTTPS probes against /readyz and it goes away. The docs
> for UPI were wrong for quite some time. I believe they finally got fixed by
> describing how to set up the probes.

Moving this to the docs team so that they can include information on the kube-apiserver health check setup for UPI. The installer doc added by the apiserver team is here: https://github.com/openshift/installer/blob/master/docs/dev/kube-apiserver-health-check.md
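As a quick manual check of what such a probe should see (the host placeholder is ours; -k skips certificate verification, just like the probe above does):

$ curl -k https://<control-plane-node-IP>:6443/readyz
ok

A healthy kube-apiserver answers /readyz with HTTP 200, which is exactly what the load balancer health check should look for instead of a bare TCP connect.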
Looking at https://github.com/openshift/openshift-docs/blob/main/modules/installation-load-balancing-user-infra.adoc, it seems to satisfy the requirements and matches the contents of https://github.com/openshift/installer/blob/master/docs/dev/kube-apiserver-health-check.md in an OpenShift docs page. @sttts, do you agree?
Yes, they look good.
Closing this bug since it is no longer valid.