Bug 988202 - RFE: REST API rate limiting
Summary: RFE: REST API rate limiting
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Zanata
Classification: Retired
Component: Component-API
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 3.4
Assignee: Patrick Huang
QA Contact: Damian Jansen
URL:
Whiteboard:
Depends On: 986741
Blocks: 1088122
Reported: 2013-07-25 05:49 UTC by Matthew Casperson
Modified: 2014-08-04 22:27 UTC

Fixed In Version: 3.4.0-SNAPSHOT (git-server-3.3.1-197-g24ab6b8)
Doc Type: Bug Fix
Doc Text:
Story Points: 5
Clone Of:
Environment:
Last Closed: 2014-07-17 06:39:35 UTC
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1018633 0 urgent CLOSED Connection leak in production system 2021-02-22 00:41:40 UTC

Internal Links: 1018633

Description Matthew Casperson 2013-07-25 05:49:20 UTC
We're currently using the REST API, and have settled on a limited call rate that seems to work. However, going forward it would be useful to have some documented rate limits that are supported by Zanata and EngOps that we could work against.

Comment 1 Carlos Munoz 2014-02-13 23:43:20 UTC
Assigning to Damian for triage.

Comment 2 Damian Jansen 2014-02-14 00:29:12 UTC
This needs to be a global thing - if at all possible we'd want to see where the client and web GUI are used most, and by whom, in order to schedule improvements to the high-use areas and for debugging purposes.

Comment 3 Sean Flanigan 2014-02-19 04:44:00 UTC
I can't find the previous discussion about this, but I did find the leaky bucket implementation: https://github.com/bbeck/token-bucket

<dependency>
    <groupId>org.isomorphism</groupId>
    <artifactId>token-bucket</artifactId>
    <version>1.1</version>
</dependency>

Comment 4 Carlos Munoz 2014-02-19 06:26:00 UTC
Secondary contact: sflaniga

Comment 5 Sean Flanigan 2014-02-25 04:13:48 UTC
The list of "practical considerations" in the article Patrick found is worth reading:

http://amistrongeryet.blogspot.com.au/2011/01/rate-limiting-in-150-lines-or.html


In particular, we should ensure we implement:

- multiple buckets (ie one per API key, as we already planned)

- sparse buckets (or periodic cleanup of full buckets, perhaps hourly)

- dynamic configuration (eg notify existing buckets when the limits are changed)

- Monitoring and introspection (or at least some *terse* logging when limits are enforced, or a simple REST service for admin which lets us query the bucket sizes)

Also, I think we should be able to change the bucket size via an admin REST API, in case the web UI is slow to load.  (I think we discussed that point somewhere.)
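
The considerations listed above (one bucket per API key, sparse storage, dynamic limits, basic introspection) could be sketched roughly as follows. This is only an illustration: the class BucketRegistry and all of its methods are hypothetical names, not part of Zanata or of the token-bucket library, and refill scheduling is left to the caller.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch, not Zanata code: one token counter per API key.
final class BucketRegistry {
    // Sparse storage: only partially used buckets are kept; a missing key means "full".
    private final ConcurrentHashMap<String, Long> tokensByApiKey = new ConcurrentHashMap<>();
    private volatile long capacity;

    BucketRegistry(long capacity) {
        this.capacity = capacity;
    }

    /** Takes one token from the key's bucket; returns false if the bucket is empty. */
    boolean tryConsume(String apiKey) {
        boolean[] granted = {false};
        tokensByApiKey.compute(apiKey, (key, current) -> {
            long tokens = (current == null) ? capacity : current;
            if (tokens > 0) {
                granted[0] = true;
                tokens--;
            }
            return tokens;
        });
        return granted[0];
    }

    /** Adds tokens to every tracked bucket, then drops buckets that are full again. */
    void refill(long amount) {
        tokensByApiKey.replaceAll((key, tokens) -> Math.min(capacity, tokens + amount));
        tokensByApiKey.values().removeIf(tokens -> tokens >= capacity);
    }

    /** Dynamic configuration: existing buckets are clamped to the new capacity. */
    void setCapacity(long newCapacity) {
        capacity = newCapacity;
        tokensByApiKey.replaceAll((key, tokens) -> Math.min(newCapacity, tokens));
    }

    /** Introspection hook, eg for an admin REST service or terse logging. */
    int trackedBuckets() {
        return tokensByApiKey.size();
    }
}
```

A scheduled task calling refill (for example once per second) would cover both refilling and the periodic cleanup of full buckets.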

Comment 6 Sean Flanigan 2014-03-04 08:17:50 UTC
After a chat with Lee today, we think the performance problems associated with PressGang syncs in the past may have been due to triggering a connection leak in Zanata when under load.  The fact that sync was running for about a month before Zanata became unresponsive is strongly reminiscent of https://bugzilla.redhat.com/show_bug.cgi?id=1018633

Now we were looking at some sort of extra logic to use with the leaky bucket, perhaps based on charging API users a number of tokens calculated from the cost of the *previous* API call (the "post-paid" model, like a phone bill), or perhaps based on charging API users a large number of tokens at the beginning of each request, but then rebating a proportion of them after processing, based on the actual processing costs (the "rental deposit" model).  (The capacity of the bucket divided by the upfront token cost would determine the maximum number of simultaneous API calls.)
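
For illustration only, the "rental deposit" model described above might look something like this minimal Java sketch. The class DepositBucket, its method names and the token amounts are all hypothetical, and bucket refill is omitted to keep the example short.

```java
// Hypothetical sketch of the "rental deposit" model; not Zanata code.
final class DepositBucket {
    private final long capacity;
    private long tokens;

    DepositBucket(long capacity) {
        this.capacity = capacity;
        this.tokens = capacity; // start full
    }

    /** Charges the full deposit up front; returns false if the bucket cannot cover it. */
    synchronized boolean tryReserve(long deposit) {
        if (deposit > tokens) {
            return false;
        }
        tokens -= deposit;
        return true;
    }

    /** After processing, rebates the part of the deposit the call did not actually cost. */
    synchronized void settle(long deposit, long actualCost) {
        long charged = Math.min(Math.max(actualCost, 0), deposit);
        tokens = Math.min(capacity, tokens + (deposit - charged));
    }

    synchronized long available() {
        return tokens;
    }

    /** Capacity divided by the upfront cost bounds the number of simultaneous calls. */
    static long maxSimultaneousCalls(long capacity, long deposit) {
        return capacity / deposit;
    }
}
```

With an assumed capacity of 100 tokens and a deposit of 25 per request, at most four requests can be in flight at once; a request that turns out to cost only 5 tokens gets 20 back when it is settled.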

However, given what we now know (or think we do), perhaps the following solution would be a little more straightforward, whilst better addressing our expected problem areas:

 * First semaphore (eg 6 permits).  Using Semaphore.tryAcquire, if a REST request doesn't obtain a permit immediately, we immediately return a 503.  This semaphore prevents any user from tying up more than 6 REST threads.  Few legitimate users will use more than 6 simultaneous requests, so 503s should be rare.  This may help (a very little) with DoS attacks, at least accidental ones.  (For any chance at real DoS protection, you would need to prevent requests from even reaching the app server, but that's out of scope here.)

 * Second semaphore (eg 3 permits).  Using Semaphore.acquire, if a request doesn't obtain a permit immediately, it will block.  This semaphore prevents any user from actively using more than 3 REST threads for processing or database I/O.  (Note: if someone does submit 6 simultaneous requests, 3 of them will block, leading to 3 idle threads in addition to the 3 active threads.)

 * Token bucket (eg capacity: 100 tokens, refill: 100/sec).  Each request simply consumes one token.  This bucket will be given a generous capacity by default, but it could be drastically reduced if we ever need to mitigate another concurrency problem like "PressGang sync makes Zanata unresponsive".  By reducing the refill rate to 1 or 2 tokens per second, we could force an API user to run as slowly as PressGang sync currently does.  (According to Lee, PressGang's API calls take 200-300ms each, plus a 300ms pause between requests.)

(Obviously, all the suggested permit and token capacities are subject to tuning.)
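
As a rough sketch of the token bucket in the third bullet, assuming one token per request and a continuous refill: the class SimpleTokenBucket below is hypothetical, written from scratch for illustration, and is neither the org.isomorphism token-bucket API nor Zanata code. The clock is injected so that the behaviour can be checked without sleeping.

```java
import java.util.function.LongSupplier;

// Hypothetical token bucket sketch; not the org.isomorphism library and not Zanata code.
final class SimpleTokenBucket {
    private final long capacity;
    private final double refillPerNano;
    private final LongSupplier nanoClock;
    private double tokens;
    private long lastRefill;

    SimpleTokenBucket(long capacity, double refillPerSecond, LongSupplier nanoClock) {
        this.capacity = capacity;
        this.refillPerNano = refillPerSecond / 1_000_000_000.0;
        this.nanoClock = nanoClock;
        this.tokens = capacity; // start full
        this.lastRefill = nanoClock.getAsLong();
    }

    /** Each request consumes one token; returns false when less than one token is left. */
    synchronized boolean tryConsume() {
        refill();
        if (tokens < 1.0) {
            return false;
        }
        tokens -= 1.0;
        return true;
    }

    // Credits tokens for the time elapsed since the last call, never exceeding capacity.
    private void refill() {
        long now = nanoClock.getAsLong();
        tokens = Math.min(capacity, tokens + (now - lastRefill) * refillPerNano);
        lastRefill = now;
    }
}
```

In real use the clock would be System::nanoTime, eg new SimpleTokenBucket(100, 100.0, System::nanoTime) for the generous default suggested above. Lowering the refill rate to 1 or 2 tokens per second would throttle a client once its initial 100 tokens are spent.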

One scenario not addressed by this solution is the problem of expensive API calls, like "export the whole database with TMX" which might take 30 minutes to execute. However, we don't have any evidence that users like to export the whole database twice per hour, or that this can cause Zanata to become unresponsive.  (If Zanata can survive 30 minutes of database activity, would pausing for 30 seconds or 30 minutes afterwards really help?)  Trying to limit TMX exports is more likely to cause problems for users than it is to prevent them.

So it's probably not worth implementing (and testing!) the "post-paid" or "rental deposit" solutions at this stage, but a couple of semaphores might save our bacon at some point.

Comment 7 Patrick Huang 2014-03-17 21:58:34 UTC
script to simulate skynet load (GET all translations for all locales)
https://github.com/zanata/zanata-scripts/blob/master/getTranslationLoadTest.groovy

Comment 8 Patrick Huang 2014-04-03 01:24:49 UTC
https://github.com/zanata/zanata-server/pull/390

Comment 9 Patrick Huang 2014-04-03 01:25:42 UTC
In the end we decided to limit only concurrent requests (two semaphores), not the rate.

Comment 10 Ding-Yi Chen 2014-04-28 01:23:40 UTC
A few questions:

1. What's the default limit?

2. Where can I configure the default limit? Is it in a configuration file, or is it hard-coded?

Comment 11 Sean Flanigan 2014-04-28 01:36:43 UTC
1. Six concurrent requests per API key, but Zanata will only work on two of them at a time (the others will be queued).  

2. The default limits come from ApplicationConfiguration.java, but they can be changed in the database, using the Server Config admin page.
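
For readers who want a picture of the two-semaphore scheme, here is a minimal sketch under stated assumptions. The class ConcurrentRequestLimiter and its method are hypothetical names and this is not the code from the pull request; it only shows the general technique, with one limiter instance assumed per API key.

```java
import java.util.concurrent.Semaphore;

// Hypothetical sketch of a per-API-key concurrency limiter; not the actual Zanata code.
final class ConcurrentRequestLimiter {
    private final Semaphore maxConcurrent; // requests beyond this are rejected outright
    private final Semaphore maxActive;     // requests beyond this wait in a queue

    ConcurrentRequestLimiter(int maxConcurrentPermits, int maxActivePermits) {
        this.maxConcurrent = new Semaphore(maxConcurrentPermits);
        this.maxActive = new Semaphore(maxActivePermits, true); // fair, so queued requests run in order
    }

    /** Runs the task and returns true, or returns false if the request should be rejected. */
    boolean tryRun(Runnable task) {
        if (!maxConcurrent.tryAcquire()) {
            return false; // too many concurrent requests for this API key
        }
        try {
            try {
                maxActive.acquire(); // blocks while the active slots are all busy
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
            try {
                task.run();
                return true;
            } finally {
                maxActive.release();
            }
        } finally {
            maxConcurrent.release();
        }
    }
}
```

With the defaults described above this would be new ConcurrentRequestLimiter(6, 2). When tryRun returns false the REST layer would reject the request (comment 6 suggested a 503 for this case).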

Comment 12 Damian Jansen 2014-04-29 03:34:10 UTC
Verified.

