Description of problem:
I am doing etcd performance analysis for thousands of builds. After creating 40K cakephp quickstart builds, "oc get images" times out. Trying to get images project by project does not work either.

root@ip-172-31-6-118: ~ # oc get images --all-namespaces -n proj0
Unable to connect to the server: stream error: stream ID 1; INTERNAL_ERROR
root@ip-172-31-6-118: ~ # oc get images -n proj0
Unable to connect to the server: stream error: stream ID 1; INTERNAL_ERROR
root@ip-172-31-6-118: ~ # oc project proj0
Now using project "proj0" on server "https://ip-172-31-6-118.us-west-2.compute.internal:8443".
root@ip-172-31-6-118: ~ # oc get images
Unable to connect to the server: stream error: stream ID 1; INTERNAL_ERROR

Env details:
1 master m4.xlarge
1 etcd m4.2xlarge
1 infra m4.2xlarge
4 nodes m4.xlarge

Version-Release number of selected component (if applicable):
openshift v3.5.0.55
kubernetes v1.5.2+43a9be4
etcd 3.1.0

How reproducible:

Steps to Reproduce:
1. Create 200 projects and cakephp builds.
2. Start 50 concurrent builds at a time.
3. Reach 40K builds and try oc get images.

Actual results:
See the error above.

Expected results:
Should list images.

Additional info:
Can you please provide the output of the same 'get' commands run with the '--loglevel=9' flag?
'oc get images' falls more within the realm of the platform management team. Adding Michal.
root@ip-172-31-6-118: ~ # oc get images --all-namespaces --loglevel=9
I0324 09:04:37.299190 112392 loader.go:354] Config loaded from file /root/.kube/config
I0324 09:04:37.301151 112392 cached_discovery.go:112] returning cached discovery info from /root/.kube/ip_172_31_6_118.us_west_2.compute.internal_8443/servergroups.json
I0324 09:04:37.301269 112392 cached_discovery.go:70] returning cached discovery info from /root/.kube/ip_172_31_6_118.us_west_2.compute.internal_8443/apps/v1beta1/serverresources.json
I0324 09:04:37.301339 112392 cached_discovery.go:70] returning cached discovery info from /root/.kube/ip_172_31_6_118.us_west_2.compute.internal_8443/authentication.k8s.io/v1beta1/serverresources.json
I0324 09:04:37.301405 112392 cached_discovery.go:70] returning cached discovery info from /root/.kube/ip_172_31_6_118.us_west_2.compute.internal_8443/autoscaling/v1/serverresources.json
I0324 09:04:37.301466 112392 cached_discovery.go:70] returning cached discovery info from /root/.kube/ip_172_31_6_118.us_west_2.compute.internal_8443/batch/v1/serverresources.json
I0324 09:04:37.301544 112392 cached_discovery.go:70] returning cached discovery info from /root/.kube/ip_172_31_6_118.us_west_2.compute.internal_8443/batch/v2alpha1/serverresources.json
I0324 09:04:37.301610 112392 cached_discovery.go:70] returning cached discovery info from /root/.kube/ip_172_31_6_118.us_west_2.compute.internal_8443/certificates.k8s.io/v1alpha1/serverresources.json
I0324 09:04:37.301743 112392 cached_discovery.go:70] returning cached discovery info from /root/.kube/ip_172_31_6_118.us_west_2.compute.internal_8443/extensions/v1beta1/serverresources.json
I0324 09:04:37.301797 112392 cached_discovery.go:70] returning cached discovery info from /root/.kube/ip_172_31_6_118.us_west_2.compute.internal_8443/policy/v1beta1/serverresources.json
I0324 09:04:37.301855 112392 cached_discovery.go:70] returning cached discovery info from /root/.kube/ip_172_31_6_118.us_west_2.compute.internal_8443/storage.k8s.io/v1beta1/serverresources.json
I0324 09:04:37.302286 112392 cached_discovery.go:70] returning cached discovery info from /root/.kube/ip_172_31_6_118.us_west_2.compute.internal_8443/v1/serverresources.json
I0324 09:04:37.302722 112392 cached_discovery.go:112] returning cached discovery info from /root/.kube/ip_172_31_6_118.us_west_2.compute.internal_8443/servergroups.json
I0324 09:04:37.302787 112392 cached_discovery.go:70] returning cached discovery info from /root/.kube/ip_172_31_6_118.us_west_2.compute.internal_8443/apps/v1beta1/serverresources.json
I0324 09:04:37.302839 112392 cached_discovery.go:70] returning cached discovery info from /root/.kube/ip_172_31_6_118.us_west_2.compute.internal_8443/authentication.k8s.io/v1beta1/serverresources.json
I0324 09:04:37.304855 112392 cached_discovery.go:70] returning cached discovery info from /root/.kube/ip_172_31_6_118.us_west_2.compute.internal_8443/autoscaling/v1/serverresources.json
I0324 09:04:37.304923 112392 cached_discovery.go:70] returning cached discovery info from /root/.kube/ip_172_31_6_118.us_west_2.compute.internal_8443/batch/v1/serverresources.json
I0324 09:04:37.305013 112392 cached_discovery.go:70] returning cached discovery info from /root/.kube/ip_172_31_6_118.us_west_2.compute.internal_8443/batch/v2alpha1/serverresources.json
I0324 09:04:37.305093 112392 cached_discovery.go:70] returning cached discovery info from /root/.kube/ip_172_31_6_118.us_west_2.compute.internal_8443/certificates.k8s.io/v1alpha1/serverresources.json
I0324 09:04:37.305232 112392 cached_discovery.go:70] returning cached discovery info from /root/.kube/ip_172_31_6_118.us_west_2.compute.internal_8443/extensions/v1beta1/serverresources.json
I0324 09:04:37.305310 112392 cached_discovery.go:70] returning cached discovery info from /root/.kube/ip_172_31_6_118.us_west_2.compute.internal_8443/policy/v1beta1/serverresources.json
I0324 09:04:37.305365 112392 cached_discovery.go:70] returning cached discovery info from /root/.kube/ip_172_31_6_118.us_west_2.compute.internal_8443/storage.k8s.io/v1beta1/serverresources.json
I0324 09:04:37.305828 112392 cached_discovery.go:70] returning cached discovery info from /root/.kube/ip_172_31_6_118.us_west_2.compute.internal_8443/v1/serverresources.json
I0324 09:04:37.308522 112392 cached_discovery.go:112] returning cached discovery info from /root/.kube/ip_172_31_6_118.us_west_2.compute.internal_8443/servergroups.json
I0324 09:04:37.308714 112392 round_trippers.go:299] curl -k -v -XGET -H "Accept: application/json" -H "User-Agent: oc/v1.5.2+43a9be4 (linux/amd64) kubernetes/43a9be4" https://ip-172-31-6-118.us-west-2.compute.internal:8443/oapi/v1/images
I0324 09:05:37.342117 112392 round_trippers.go:318] GET https://ip-172-31-6-118.us-west-2.compute.internal:8443/oapi/v1/images in 60033 milliseconds
I0324 09:05:37.342161 112392 round_trippers.go:324] Response Headers:
I0324 09:05:37.342260 112392 helpers.go:221] Connection error: Get https://ip-172-31-6-118.us-west-2.compute.internal:8443/oapi/v1/images: stream error: stream ID 1; INTERNAL_ERROR
F0324 09:05:37.342283 112392 helpers.go:116] Unable to connect to the server: stream error: stream ID 1; INTERNAL_ERROR
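For anyone triaging a similar log, the useful signal is the gap between the round_trippers "curl" line and the "GET ... in N milliseconds" line. Here it is almost exactly 60 seconds, which would be consistent with a fixed 60-second limit somewhere on the request path, although the log alone does not say where. A minimal Python sketch for pulling that number out of two log lines follows; the helper names (glog_time_of_day, reported_millis) are made up for illustration, and it assumes both lines were logged on the same day.

```python
import re
from datetime import timedelta

# glog-style prefix: severity letter, MMDD, then HH:MM:SS.microseconds
GLOG_RE = re.compile(r"^[IWEF]\d{4} (\d{2}):(\d{2}):(\d{2})\.(\d{6}) ")
MILLIS_RE = re.compile(r" in (\d+) milliseconds")


def glog_time_of_day(line):
    """Return the time of day of a glog line as a timedelta (hypothetical helper)."""
    m = GLOG_RE.match(line)
    if m is None:
        raise ValueError("not a glog-formatted line: %r" % line[:40])
    hours, minutes, seconds, micros = (int(g) for g in m.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds, microseconds=micros)


def reported_millis(line):
    """Return the duration the client itself reported, or None if absent."""
    m = MILLIS_RE.search(line)
    return int(m.group(1)) if m else None


# The two lines of interest, copied from the log in this comment.
request_line = (
    'I0324 09:04:37.308714 112392 round_trippers.go:299] curl -k -v -XGET '
    '-H "Accept: application/json" '
    'https://ip-172-31-6-118.us-west-2.compute.internal:8443/oapi/v1/images'
)
response_line = (
    'I0324 09:05:37.342117 112392 round_trippers.go:318] GET '
    'https://ip-172-31-6-118.us-west-2.compute.internal:8443/oapi/v1/images '
    'in 60033 milliseconds'
)

# Assumes both lines fall on the same day (true here: both are 0324).
elapsed = glog_time_of_day(response_line) - glog_time_of_day(request_line)
client_millis = reported_millis(response_line)

if __name__ == "__main__":
    print("elapsed between log lines:", elapsed.total_seconds(), "s")
    print("duration reported by client:", client_millis, "ms")
```

The two numbers should agree to within a millisecond or so; a round figure like this one usually points at a timeout rather than at a slow but completed response.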
Vikas, while doing this etcd performance analysis, did you try to reach the API directly with curl (instead of through oc)? Something like what is suggested in the last few lines of the log when running with --loglevel=9:

curl -k -v -XGET -H "Accept: application/json" -H "User-Agent: oc/v1.5.2+43a9be4 (linux/amd64) kubernetes/43a9be4" https://ip-172-31-6-118.us-west-2.compute.internal:8443/oapi/v1/images

I'm trying to figure out whether the error happens on the client side or already on the server when going through the API.
Michal, any idea based on the log messages?
Fabiano, I tried using etcdctl2 directly on the etcd node at that time, and it was not working either. I had to set the --total-timeout parameter to 30s.
OK, the --request-timeout option for the 'oc' client landed in 3.7 (and is present in later versions). Moving this to ON_QA to try listing a large number of images using this option (it should not hit the request timeout).
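For QA scripts that drive 'oc' from Python, passing the option could look roughly like the sketch below. The function names and the "5m" default are made up for illustration; the value format (a number with a time unit such as 1s, 2m, 3h) follows the convention the flag's help text describes, and the sketch assumes an 'oc' 3.7 or later binary on PATH that is already logged in.

```python
import subprocess


def build_oc_get(resource, request_timeout="5m"):
    """Build the argv for 'oc get <resource>' with an explicit request timeout.

    Hypothetical helper; the default of "5m" is an arbitrary illustrative value.
    """
    return ["oc", "get", resource, "--request-timeout=" + request_timeout]


def run_oc_get(resource, request_timeout="5m"):
    """Run the command and return the CompletedProcess.

    check=False so the caller can inspect returncode and stderr instead of
    catching an exception when the listing fails.
    """
    return subprocess.run(
        build_oc_get(resource, request_timeout),
        capture_output=True,
        text=True,
        check=False,
    )


if __name__ == "__main__":
    # Only prints the command line; call run_oc_get("images") against a real cluster.
    print(" ".join(build_oc_get("images")))
```

Keeping the argv construction separate from the subprocess call makes it easy to log the exact command that was run alongside its result.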
Verified in the following release. Created 100K images and 40K more objects; 'oc get images' and 'oc get all' both work, without any additional parameters.

openshift v3.7.27
kubernetes v1.7.6+a08f5eeb62
etcd 3.2.8
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:0636