Description of problem:
It takes more than 2 minutes for a resource to be counted by cluster quota in an OCP environment (can't reproduce in Origin). Can the performance be optimized?

Version-Release number of selected component (if applicable):
openshift v3.3.0.14
kubernetes v1.3.0+57fb9ac
etcd 2.3.0+git

How reproducible:
Always

Steps to Reproduce:
1. Create a project
# oc new-project project-a
2. Label the project
# oc label namespace project-a user=dev --config=./admin.kubeconfig
3. Create a clusterquota with label selector "user=dev"
# oc create clusterresourcequota crq --project-label-selector=user=dev --hard=pods=2 --config=./admin.kubeconfig
4. Create a pod and check the clusterquota
# oc run testpod-1 --image=aosqe/hello-openshift --generator=run-pod/v1
# oc describe clusterresourcequota crq --config=./admin.kubeconfig

Actual results:
4. It takes more than 2 minutes for a running pod or other resources to be counted.
[root@dhcp-141-95 qwang]# oc describe clusterresourcequota crq --config=./admin.kubeconfig
Name:               crq
Namespace:          <none>
Created:            18 minutes ago
Labels:             <none>
Annotations:        <none>
Label Selector:     user=dev
AnnotationSelector: map[]
Resource    Used    Hard
--------    ----    ----
pods        1       2

Expected results:
4. Resource changes should be reflected in cluster quota usage ASAP.

Additional info:
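For anyone reproducing this, an illustrative way to time the delay is to bracket the check with `date` and poll until the usage updates (the polling loop below is just a sketch, not part of the original report):

# date; oc run testpod-1 --image=aosqe/hello-openshift --generator=run-pod/v1
# until oc describe clusterresourcequota crq --config=./admin.kubeconfig | grep -qE '^pods[[:space:]]+1'; do sleep 5; done; date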
How big is the cluster and how many clusterresourcequotas are there? Also, can you provide a master log at loglevel=4?
This problem can't be reproduced in a non-HA environment but exists in HA (2 masters + 2 infra nodes + 2 nodes + 3 etcd). Attached master-config.yaml. BTW, I wasn't able to capture any useful messages on the public HA environment; I'm going to set up a private env to get more info if needed.
Created attachment 1188619 [details] master config
> I'm going to setup a private env to get more info if need

I am going to need to see the controller logs (loglevel=4, please) to really have a reasonable starting point for investigation.
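For an HA install the controllers run as their own service; one way to capture those logs at loglevel=4 (a sketch, assuming the native-HA systemd unit name and sysconfig path that OCP 3.3 uses by default, and that an OPTIONS=--loglevel flag is already present; adjust for your environment):

# sed -i 's/--loglevel=[0-9]*/--loglevel=4/' /etc/sysconfig/atomic-openshift-master-controllers
# systemctl restart atomic-openshift-master-controllers
# journalctl -u atomic-openshift-master-controllers --since "-30 min" > controllers.log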
Refer to: https://bugzilla.redhat.com/show_bug.cgi?id=1364403#c7 and attachments.
I've created https://github.com/openshift/origin/pull/10307 to gather metrics for the clusterquota controllers. After it has been running for a while, please collect:

curl -k https://controller-host-X:8444/metrics
curl -k https://each-api-server:8443/metrics

You may have to run `oadm policy add-cluster-role-to-user cluster-admin system:anonymous` or attach cluster-admin certs to the curl requests to get at those endpoints. I'm working on getting a dev-ami.
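If granting cluster-admin to system:anonymous is undesirable, the admin client certificate pair from the master config directory can be attached to the requests instead (paths here assume the OCP 3.3 defaults; adjust if your config dir differs):

# curl -k --cert /etc/origin/master/admin.crt --key /etc/origin/master/admin.key https://controller-host-X:8444/metrics
# curl -k --cert /etc/origin/master/admin.crt --key /etc/origin/master/admin.key https://each-api-server:8443/metrics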
https://github.com/openshift/origin/pull/10307 has merged, but the dev-ami job keeps failing on yum problems. Once you have a build that contains it, please gather the metrics mentioned in comment 6.
Attached metrics from each apiserver and controller. Are these what you want? Hope they help.
Created attachment 1190012 [details] controller_1_metrics
Created attachment 1190013 [details] controller_2_metrics
Created attachment 1190014 [details] apiserver_1_metrics
Created attachment 1190015 [details] apiserver_2_metrics
It's hitting rate limiting. I'm considering my options.
Config problem. Opened https://github.com/openshift/openshift-ansible/pull/2287. To get immediate relief, edit master-config.yaml to change "ops:" to "qps:" — since "ops" is not a recognized key, the intended QPS override was being ignored and the clients fell back to the default (low) rate limit.
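A minimal sketch of that workaround (assuming the default config path and the native-HA service names; back up the file first and restart both master services):

# cp /etc/origin/master/master-config.yaml /etc/origin/master/master-config.yaml.bak
# sed -i 's/\bops:/qps:/g' /etc/origin/master/master-config.yaml
# systemctl restart atomic-openshift-master-api atomic-openshift-master-controllers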
Installer fix merged.
*** Bug 1366740 has been marked as a duplicate of this bug. ***
Tested in an HA environment (2 masters + 2 nodes + 3 etcd + 1 lb/nfs).

Package versions:
openshift-ansible-3.3.10-1.git.0.7060379.el7.noarch.rpm
openshift-ansible-docs-3.3.10-1.git.0.7060379.el7.noarch.rpm
openshift-ansible-filter-plugins-3.3.10-1.git.0.7060379.el7.noarch.rpm
openshift-ansible-lookup-plugins-3.3.10-1.git.0.7060379.el7.noarch.rpm
openshift-ansible-playbooks-3.3.10-1.git.0.7060379.el7.noarch.rpm
openshift-ansible-roles-3.3.10-1.git.0.7060379.el7.noarch.rpm
atomic-openshift-3.3.0.19-1.git.0.93380aa.el7.x86_64
atomic-openshift-clients-3.3.0.19-1.git.0.93380aa.el7.x86_64
atomic-openshift-master-3.3.0.19-1.git.0.93380aa.el7.x86_64
tuned-profiles-atomic-openshift-node-3.3.0.19-1.git.0.93380aa.el7.x86_64
atomic-openshift-node-3.3.0.19-1.git.0.93380aa.el7.x86_64
atomic-openshift-sdn-ovs-3.3.0.19-1.git.0.93380aa.el7.x86_64
atomic-openshift-tests-3.3.0.19-1.git.0.93380aa.el7.x86_64

PR https://github.com/openshift/openshift-ansible/pull/2287 is already contained in openshift-ansible-3.3.10-1. However, this problem persists.
Created attachment 1190833 [details]
08-15-master-config.yaml

masterClients:
  externalKubernetesClientConnectionOverrides:
    acceptContentTypes: application/vnd.kubernetes.protobuf,application/json
    contentType: application/vnd.kubernetes.protobuf
    burst: 400
    qps: 200
  externalKubernetesKubeConfig: ""
  openshiftLoopbackClientConnectionOverrides:
    acceptContentTypes: application/vnd.kubernetes.protobuf,application/json
    contentType: application/vnd.kubernetes.protobuf
    burst: 600
    qps: 300
Created attachment 1190834 [details]
08-16-node-config.yaml

masterClientConnectionOverrides:
  acceptContentTypes: application/vnd.kubernetes.protobuf,application/json
  contentType: application/vnd.kubernetes.protobuf
  burst: 200
  qps: 100
Created attachment 1190836 [details] 08-15-api-metrics
Created attachment 1190837 [details] 08-15-controller-metrics
I'm very sorry; please ignore comments 19~23. I configured master-config.yaml manually with "ClusterResourceQuota" enabled and did not see this problem. Thanks.

Package versions:
openshift-ansible-3.3.10-1.git.0.7060379.el7.noarch.rpm
openshift v3.3.0.19
kubernetes v1.3.0+507d3a7
etcd 2.3.0+git

[root@dhcp-141-95 qwang]# oc describe clusterresourcequota crq; date
Name:               crq
Namespace:          <none>
Created:            About an hour ago
Labels:             <none>
Annotations:        <none>
Label Selector:     user=dev
AnnotationSelector: map[]
Resource    Used    Hard
--------    ----    ----
pods        0       2
secrets     9       10
services    0       2
Mon Aug 15 18:08:17 CST 2016

[root@dhcp-141-95 qwang]# oc create -f multi-portsvc.json; date
service "multi-portsvc-2" created
Mon Aug 15 18:08:30 CST 2016

[root@dhcp-141-95 qwang]# oc describe clusterresourcequota crq; date
Name:               crq
Namespace:          <none>
Created:            About an hour ago
Labels:             <none>
Annotations:        <none>
Label Selector:     user=dev
AnnotationSelector: map[]
Resource    Used    Hard
--------    ----    ----
pods        0       2
secrets     9       10
services    1       2
Mon Aug 15 18:08:35 CST 2016
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2016:1933