Description of problem:
Please re-assign if REST API isn't the correct component.

For horizontal scale testing we have a cluster loader script which creates a large number of projects, populates them with secrets, build configs, DCs, RCs, etc., and runs deployments in each project. The script has been used successfully since 3.2. A recent change (since 3.3.0.11) has changed the behavior of project creation. When projects are created consecutively (not simultaneously), serviceaccounts are no longer created immediately once roughly the 24th project has been created. There is a long delay (minutes) before the serviceaccounts appear. This causes deployments to fail when the script tries to populate the projects.

This is a blocker for horizontal scalability testing in an environment with 300 nodes and 1000 projects. In the past this issue was not encountered: all projects could deploy DCs immediately. It only seems to occur in HA (multi-master) environments.

Version-Release number of selected component (if applicable):
3.3.0.18

How reproducible:
Always

Steps to Reproduce:
0. Install an HA cluster. Mine has 3 master/etcd, 1 master load balancer, 2 registry/router and 5 nodes.
1. for i in {1..50}; do oc new-project project$i; done
2. for i in {1..50}; do echo project$i; oc get sa -n project$i --no-headers | wc -l; done

Actual results:
After the 23rd project (or so), the projects will not have any serviceaccounts. Wait for a while (minutes), run the oc get sa loop again, and more projects will have their serviceaccounts.

Attempts to run a deployment result in events similar to this one appearing for the namespace:

DeploymentConfig Warning FailedRetry {deployments-controller } deploymentconfig0-1: About to stop retrying deploymentconfig0-1: couldn't create deployer pod for cncf13/deploymentconfig0-1: pods "deploymentconfig0-1-deploy" is forbidden: service account cncf13/deployer was not found, retry after the service account is created

Expected results:
Serviceaccounts are created immediately, and deployments of DCs are operational immediately after project creation.
Sleeping 1 second between project creations does not help; I still see projects with no SAs. Sleeping 10 seconds does seem to help. I have not bisected.
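For scripts that hit this delay, polling for the service account may be more robust than a fixed sleep. The following is only a sketch: wait_for_sa is a hypothetical helper name, and the default attempt count and interval are arbitrary assumptions, not values taken from the cluster loader script. It relies only on oc get sa returning non-zero while the account does not exist.

```shell
# Hypothetical helper: poll until a service account exists in a namespace.
# Usage: wait_for_sa NAMESPACE SERVICEACCOUNT [ATTEMPTS] [INTERVAL_SECONDS]
wait_for_sa() {
  ns="$1"
  sa="$2"
  attempts="${3:-60}"   # assumed default: 60 tries
  interval="${4:-5}"    # assumed default: 5 seconds between tries
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # oc get exits non-zero while the service account is missing
    if oc get sa "$sa" -n "$ns" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  return 1
}
```

In the reproducer loop this could be used as: oc new-project project$i && wait_for_sa project$i deployer. It only hides the delay from the script; it does not address why the accounts are created slowly.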
Likely a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1364431 Can you include the contents of the master-config.yaml?
Created attachment 1190534: master-config.yaml
Config attached; let me know if there is a tunable.
Yeah, dupe of https://bugzilla.redhat.com/show_bug.cgi?id=1364431

The client config overrides had a typo in them, so instead of setting "qps" to 200 and 300, they set "ops" to 200/300. The server ignored the unknown field and defaulted to 5.

Until you have an install containing https://github.com/openshift/openshift-ansible/pull/2287, you can work around it by editing the config. Change:
"ops: 200" to "qps: 200"
"ops: 300" to "qps: 300"

Also check the node config; the same typo exists there.

*** This bug has been marked as a duplicate of bug 1364431 ***
Changing ops to qps works around it.
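The manual edit could be scripted roughly as follows. This is a sketch under assumptions: fix_qps_typo is a hypothetical helper name, the config file path is passed in by the caller rather than assumed, and the pattern only rewrites lines whose key is exactly "ops:" after optional indentation. A backup copy is kept next to the original.

```shell
# Hypothetical helper: rename the mistyped "ops:" key to "qps:" in a config file.
# Usage: fix_qps_typo /path/to/config.yaml
fix_qps_typo() {
  config="$1"
  # Keep a backup, then rewrite into the original file so that its
  # ownership and permissions are preserved.
  cp "$config" "$config.bak" || return 1
  sed 's/^\([[:space:]]*\)ops:/\1qps:/' "$config.bak" > "$config"
}
```

It would need to be run against the master config and the node config on each affected host, and the affected services presumably need a restart before the new values take effect.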