Bug 1259531 - quota synchronization timer not documented
Status: CLOSED CURRENTRELEASE
Product: OpenShift Container Platform
Classification: Red Hat
Component: Documentation
Version: 3.0.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
: ---
: ---
Assigned To: brice
Vikram Goyal
Vikram Goyal
: Reopened
Depends On:
Blocks:
Reported: 2015-09-02 19:53 EDT by Erik M Jacobs
Modified: 2015-09-23 19:10 EDT (History)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-09-23 19:10:49 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
decarr: needinfo-


Description Erik M Jacobs 2015-09-02 19:53:21 EDT
openshift-master-3.0.1.0-1.git.525.eddc479.el7ose.x86_64

Using a quota of max 3 pods and a JSON file that creates 4 pods, do the following:

echo "First create"; oc create -f ~/training/content/hello-quota.json; sleep 10; echo "Delete everything"; oc delete pods --all -n demo; echo "Second create"; oc create -f ~/training/content/hello-quota.json

You'll see:

First create
pods/hello-openshift-1
pods/hello-openshift-2
pods/hello-openshift-3
Error from server: Pod "hello-openshift-4" is forbidden: Limited to 3 pods
Delete everything
pods/hello-openshift-1
pods/hello-openshift-2
pods/hello-openshift-3
Second create
Error from server: Pod "hello-openshift-1" is forbidden: Limited to 3 pods
Error from server: Pod "hello-openshift-2" is forbidden: Limited to 3 pods
Error from server: Pod "hello-openshift-3" is forbidden: Limited to 3 pods
Error from server: Pod "hello-openshift-4" is forbidden: Limited to 3 pods

So, even though the pods are gone, the quota usage has not been recalculated immediately. This could cause a problem in automation scenarios.

I used the following to measure roughly how long it takes for the quota to be updated:

oc create -f ~/training/content/hello-quota.json; sleep 10; oc delete pods --all -n demo; time watch oc describe quota test-quota
pods/hello-openshift-1
pods/hello-openshift-2
pods/hello-openshift-3
Error from server: Pod "hello-openshift-4" is forbidden: Limited to 3 pods
pods/hello-openshift-1
pods/hello-openshift-2
pods/hello-openshift-3

real    0m8.012s
user    0m1.180s
sys     0m0.037s

8 seconds is a *long* time.
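
The measurement above can also be scripted rather than watched by hand. The following is a minimal sketch, not part of the original report: `time_until` is a hypothetical helper, and the `predicate` it polls is assumed to be supplied by the caller (for example, a function that inspects the output of `oc describe quota test-quota` and returns True once the used pod count has dropped).

```python
import time


def time_until(predicate, poll_interval=0.5, timeout=60.0,
               clock=time.monotonic, sleep=time.sleep):
    """Poll predicate() until it returns True.

    Returns the elapsed time in seconds, or None if the timeout is
    reached first. clock and sleep are injectable so the helper can
    be exercised without real waiting.
    """
    start = clock()
    while True:
        if predicate():
            return clock() - start
        if clock() - start >= timeout:
            return None
        sleep(poll_interval)
```

The result is only accurate to within one `poll_interval`, so a short interval gives a tighter estimate at the cost of more calls to the API server.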
Comment 2 Paul Weil 2015-09-03 13:41:13 EDT
Quota is synchronized asynchronously in Kube by a controller.  You can control the default synchronization period (10s) by changing the master config's controllerArguments.  For example:

kubernetesMasterConfig:
  apiLevels:
  - v1beta3
  - v1
  apiServerArguments: null
  controllerArguments:
    resource-quota-sync-period:
      - "5s"

Derek, sending this your way for confirmation that this is correct.
Comment 3 Erik M Jacobs 2015-09-03 13:49:57 EDT
Should this be a docs bug?
Comment 4 Derek Carr 2015-09-03 14:22:44 EDT
Paul, your summary is correct.
Erik, the doc makes note of the asynchronous deletion behavior.

Doc here:
https://docs.openshift.org/latest/dev_guide/quota.html

Quota enforcement:
Once a quota is created and usage statistics are up-to-date, the project accepts the creation of new content. When you create resources, your quota usage is incremented immediately upon the request to create or modify the resource. When you delete a resource, your quota use is decremented during the next full recalculation of quota statistics for the project. As a result, it may take a moment for your quota usage statistics to be reduced to their current observed system value when you delete resources.

As for improving the latency, there is a card to look at shortening the interval. In practice, things like replication controllers simply retry, so that style of automation works well. In the latest upstream, graceful deletion of pods went in, so a pod actually hangs around terminating for 30s after you attempt to delete it; the quota sync period is therefore not the longest delay involved.

The reason we cannot capture deletes synchronously is that etcd lacks multi-object transaction support.
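
The retry style of automation mentioned above could look roughly like the following. This is a minimal sketch under stated assumptions, not how replication controllers are actually implemented: `QuotaExceeded` and `create_with_retry` are hypothetical names, and `create` stands in for whatever call submits the resource and raises `QuotaExceeded` when the server answers with a "forbidden: Limited to N pods" style error.

```python
import time


class QuotaExceeded(Exception):
    """Hypothetical error: the server rejected a create because of quota."""


def create_with_retry(create, attempts=5, delay=2.0, sleep=time.sleep):
    """Call create() until it succeeds or attempts are exhausted.

    Only QuotaExceeded is retried, since that is the error expected to
    clear once quota usage has been recalculated. Any other exception
    propagates immediately.
    """
    for attempt in range(1, attempts + 1):
        try:
            return create()
        except QuotaExceeded:
            if attempt == attempts:
                raise  # give up: quota is still reported as exhausted
            sleep(delay)
```

Choosing `attempts * delay` to be comfortably longer than the quota sync period gives the controller time to observe the deletes before the caller gives up.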
Comment 5 Erik M Jacobs 2015-09-03 14:32:40 EDT
Missing config doc, though..?
Comment 6 brice 2015-09-15 02:23:05 EDT
Erik, Derek,

I've submitted a PR for this:

https://github.com/openshift/openshift-docs/pull/963

I've added a new section on changing the quota synchronization period, and linked to it from an earlier paragraph so the reader can get some more context.

Can I get an ack that this is on the right track? I don't think I have any questions.
Comment 7 Derek Carr 2015-09-16 14:24:35 EDT
The updated text looks good if you note that increasing the frequency of quota calculation will increase the load on the openshift-master, and that the setting should be chosen to balance system performance against end-user goals.

As noted in the doc, there is a plan to make deletes affecting cpu, memory, and pods be observed more rapidly by adding a watch in the quota controller for pod-related resources. When that happens, we will most likely increase the value from 30s to something much larger.
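
To make the load trade-off concrete, here is a back-of-the-envelope sketch. It assumes, purely for illustration, that every quota object is fully recalculated once per sync period; `recalculations_per_hour` is a hypothetical helper, and the real cost per recalculation is not modeled.

```python
def recalculations_per_hour(sync_period_seconds, quota_count=1):
    """Estimate how many full quota recalculations run per hour.

    Assumption (not from the thread): each quota object is recalculated
    exactly once per sync period.
    """
    if sync_period_seconds <= 0:
        raise ValueError("sync period must be positive")
    return quota_count * 3600 / sync_period_seconds


# Halving the period from 10s to 5s doubles the estimated work.
print(recalculations_per_hour(10))        # 360.0
print(recalculations_per_hour(5))         # 720.0
print(recalculations_per_hour(5, 100))    # 72000.0
```

Under this assumption the work scales inversely with the period and linearly with the number of quotas, which is why a shorter period matters more on a master serving many projects.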
Comment 8 brice 2015-09-21 00:45:51 EDT
Suggestion incorporated, and the PR has merged.

Putting this to closed.
