Description of problem:
The registry pods in online-int (paid integration env) are reporting semi-frequent liveness and readiness health probe failures with the message "http2: no cached connection was available". The pods are not restarting, so the problem may be harmless.

https://github.com/golang/go/issues/16582

Registry pod 1

Events:
  FirstSeen  LastSeen  Count  From                                    SubObjectPath              Type     Reason     Message
  ---------  --------  -----  ----                                    -------------              ----     ------     -------
  12d        1h        158    {kubelet ip-172-31-55-13.ec2.internal}  spec.containers{registry}  Warning  Unhealthy  Liveness probe failed: Get https://10.1.5.120:5000/healthz: http2: no cached connection was available
  12d        50m       149    {kubelet ip-172-31-55-13.ec2.internal}  spec.containers{registry}  Warning  Unhealthy  Readiness probe failed: Get https://10.1.5.120:5000/healthz: http2: no cached connection was available

and Registry pod 2

Events:
  FirstSeen  LastSeen  Count  From                                    SubObjectPath              Type     Reason     Message
  ---------  --------  -----  ----                                    -------------              ----     ------     -------
  6d         55m       35     {kubelet ip-172-31-55-12.ec2.internal}  spec.containers{registry}  Warning  Unhealthy  Liveness probe failed: Get https://10.1.8.182:5000/healthz: http2: no cached connection was available

Version-Release number of selected component (if applicable):
3.5.5.10

How reproducible:
Seems to be happening at least once an hour. Will monitor.
Possibly https://github.com/golang/go/issues/16582?
This should be fixed by the move to Go 1.8 (see https://github.com/golang/go/commit/7a622740655bb5fcbd160eb96887032314842e6e). As a result, it should be resolved once we move to kube 1.7.
I'm seeing this frequently in prod, on starter-us-east-1 too. It happens each time I deploy the router or registry pods.
*** Bug 1466035 has been marked as a duplicate of this bug. ***
Could you help verify the bug? thanks
There is no online environment available with 3.7 on it yet. Moving to POST since it is fixed upstream.
This does not appear to be fixed in Go 1.8. Furthermore, the linked commit is also included in Go 1.7.

$ oc version
oc v3.7.0-0.127.0
kubernetes v1.7.0+80709908fd
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://master.lab.variantweb.net:8443
openshift v3.7.0-0.127.0
kubernetes v1.7.0+80709908fd

$ oc get events
LASTSEEN  FIRSTSEEN  COUNT  NAME                     KIND  SUBOBJECT                  TYPE     REASON     SOURCE                             MESSAGE
1h        17h        31     docker-registry-1-rn2c6  Pod   spec.containers{registry}  Warning  Unhealthy  kubelet, infra.lab.variantweb.net  Liveness probe failed: Get https://10.128.0.4:5000/healthz: http2: no cached connection was available
26m       17h        48     docker-registry-1-rn2c6  Pod   spec.containers{registry}  Warning  Unhealthy  kubelet, infra.lab.variantweb.net  Readiness probe failed: Get https://10.128.0.4:5000/healthz: http2: no cached connection was available
Upstream kube issue: https://github.com/kubernetes/kubernetes/issues/49740
Kube PR: https://github.com/kubernetes/kubernetes/pull/53318 Origin PR: https://github.com/openshift/origin/pull/16633
Verified on 3.7.0-0.188.0. During registry stress testing with 250 and 500 concurrent builds, the message is no longer seen.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2017:3188