Companies providing APIs will often cap usage in some way - most often the metric in question is the number of requests (or messages, orders, …) per second. As an end user it’s important to respect those limits, since a breach can get you banned or otherwise penalised.
Coinbase’s public API
Coinbase has a public market data API which, among other things, allows us to query a book for a given pair (TL;DR, this lists the various price bands at which folks are willing to buy and/or sell - it changes all the time depending on supply and demand among other things). This is a public endpoint and as per their published rate limits we may make up to 10 requests per second, with a burst of 15 (this page from Auth0 provides a great explanation of the difference between the rate limit and burst capacity).
Hitting rate limits
Let’s start with a short snippet whose sole objective is to get rate-limited.
Depending on where you’re running this, you’ll need to tweak the number of workers - but 20 should be plenty (note each goroutine continuously requests the endpoint):
~/r/c/golang ❯❯❯ go run ratelimit.go -w 20
2024/07/06 15:03:12 Public rate limit exceeded
2024/07/06 15:03:12 Public rate limit exceeded
2024/07/06 15:03:12 Public rate limit exceeded
2024/07/06 15:03:12 Public rate limit exceeded
Press ctrl-c to exit. Yeah, it’s not pretty; we’ll fix that up later.
We need stats…
But how many requests are we actually making? Let’s add a (crude) monitor! For this we’ll also add a context.Context with a timeout (you wouldn’t want to leave this running forever, right?).
Running this for 10 seconds, we’re already getting rate-limited - at 33 requests/second that’s somewhat expected.
~/r/c/golang ❯❯❯ go run ratelimit3.go -w 40 -t 10
2024/07/13 13:50:27 [worker 38] Public rate limit exceeded
2024/07/13 13:50:27 [worker 0] Public rate limit exceeded
2024/07/13 13:50:27 [worker 7] Public rate limit exceeded
2024/07/13 13:50:27 [worker 4] Public rate limit exceeded
2024/07/13 13:50:27 [worker 6] Public rate limit exceeded
2024/07/13 13:50:28 requests/second: 6, elapsed: 1.000361042s
2024/07/13 13:50:29 requests/second: 33, elapsed: 999.991958ms
2024/07/13 13:50:29 [worker 33] Public rate limit exceeded
2024/07/13 13:50:29 [worker 36] Public rate limit exceeded
2024/07/13 13:50:29 [worker 37] Public rate limit exceeded
2024/07/13 13:50:29 [worker 16] Public rate limit exceeded
Introducing a Limiter
We can create a Limiter (from the golang.org/x/time/rate package) by calling the NewLimiter constructor with our rate r (the number of requests per second) and the burst capacity b - in our case the former will be 10, and we’ll deal with the latter separately.
Each goroutine then waits for a token before making a request.
Running this again for 10s with just as many workers (40), it’s all clean!
~/r/c/golang ❯❯❯ go run ratelimit4.go -w 40 -t 10
2024/07/13 13:53:16 requests/second: 0, tokens available: -28.99, elapsed: 1.0010665s
2024/07/13 13:53:17 requests/second: 2, tokens available: -18.99, elapsed: 1.000019459s
2024/07/13 13:53:18 requests/second: 18, tokens available: -10.00, elapsed: 999.602916ms
2024/07/13 13:53:19 requests/second: 20, tokens available: -6.00, elapsed: 999.669459ms
2024/07/13 13:53:20 requests/second: 9, tokens available: -7.00, elapsed: 1.00023475s
2024/07/13 13:53:21 requests/second: 10, tokens available: -8.99, elapsed: 1.000437625s
2024/07/13 13:53:22 requests/second: 7, tokens available: -7.99, elapsed: 999.992458ms
2024/07/13 13:53:23 requests/second: 11, tokens available: -7.00, elapsed: 999.642333ms
2024/07/13 13:53:24 requests/second: 12, tokens available: -6.00, elapsed: 999.72375ms
2024/07/13 13:53:25 requests/second: 8, tokens available: 1.00, elapsed: 1.000533709s
2024/07/13 13:53:26 requests/second: 2, tokens available: 1.00, elapsed: 999.314833ms
2024/07/13 13:53:27 requests/second: 1, tokens available: 1.00, elapsed: 1.000788167s
2024/07/13 13:53:27 done
But wait up - what are those negative numbers? In a nutshell, this tells us how many tokens need to be replenished before another gets released (that’s the limiter.Wait(ctx) call). A new token gets added to the bucket every 1/10th of a second (0.1 seconds). If 20 goroutines all ask for a token at once, limiter.Wait(ctx) will essentially place a hold on the next 20 tokens - meaning that the 21st goroutine asking for one will have to wait 2.1 seconds (20x0.1+0.1 seconds), which leads to this interesting behaviour.
We can illustrate this by calculating the time to the next token (which is limiter.Tokens()/<rate per second>) and logging how long we were blocked in the limiter.Wait call.
And sure enough, some workers will wait quite a bit:
2024/07/13 14:56:32 [worker 33] waited 2.200255084s for a token
2024/07/13 14:56:32 [worker 37] waited 2.299760416s for a token
2024/07/13 14:56:32 [worker 38] waited 2.400653292s for a token
2024/07/13 14:56:32 [worker 30] waited 2.500609666s for a token
2024/07/13 14:56:32 [worker 6] waited 1.138844833s for a token
Going back to how we created the limiter, we specified the burst b as 1 - which means that we’d have at most 1 token available in the bucket at any point in time (that is, this is our bucket’s capacity - we still replenish at the same rate, but if it’s full we don’t add any more tokens). We can easily see this by kicking off our monitor without the goroutines:
~/r/c/golang ❯❯❯ go run ratelimit4.go -w 40 -t 20 -r 10
2024/07/13 14:31:31 requests/second: 0, tokens available: 1.00, elapsed: 1.000864834s
2024/07/13 14:31:32 requests/second: 0, tokens available: 1.00, elapsed: 999.962166ms
2024/07/13 14:31:33 requests/second: 0, tokens available: 1.00, elapsed: 1.0000055s
2024/07/13 14:31:34 requests/second: 0, tokens available: 1.00, elapsed: 1.000001292s
This mostly works because our goroutines are constantly starved for tokens. If instead we had periods of dormancy, it’d be great for the bucket to have a greater capacity - so when work resumes, we can make a bunch of requests in one go instead of waiting one by one. Let’s do it.
Coinbase gives us a burst capacity of 15 - meaning that when we start, our bucket already has 15 tokens available for us to use. But if we change b to 15, we can see we are still hitting rate limits - what gives?
~/r/c/golang ❯❯❯ go run ratelimit4.go -w 40 -t 20 -r 10
2024/07/13 14:33:38 requests/second: 0, tokens available: -15.00, elapsed: 1.001029541s
2024/07/13 14:33:39 requests/second: 2, tokens available: -6.00, elapsed: 999.997167ms
2024/07/13 14:33:40 [worker 0] Public rate limit exceeded
2024/07/13 14:33:40 [worker 19] Public rate limit exceeded
2024/07/13 14:33:40 [worker 3] Public rate limit exceeded
2024/07/13 14:33:40 [worker 5] Public rate limit exceeded
2024/07/13 14:33:40 [worker 1] Public rate limit exceeded
Our program starts with 15 tokens (which are consumed immediately by our goroutines) - and it replenishes the bucket at a rate of 1 token every 0.1 seconds. Meaning that from t=0s to t=1s we have made up to 25 requests. From Coinbase’s perspective we should only be allowed to make requests at t=1.5s! However, this isn’t something that the rate package provides out of the box. If the aim is to fully avoid being rate-limited, keeping b smaller than our consumption rate is key.
Taking it further
The rate package provides enough functionality out of the box to be useful, but for more elaborate use-cases there are a number of packages that provide more advanced functionality.
Another idea would be to set the rate limit dynamically - some services might throttle requests during peak times and it’d be great if we could feed that back to the limiter to adjust the rate accordingly. And given some sort of cool-off period, it could try gently ramping up again.
The full code for the (latest) version of the example is available here.