Calls to our API are rate limited to give everyone equitable access and to prevent abuse. We expect to adjust these limits as we gather more information, and we encourage your feedback. Any changes to limits will be announced in our Slack community. Our rate limiters use the leaky bucket algorithm, which means your tokens are refilled at a constant rate of LIMIT_AMOUNT / LIMIT_PERIOD.
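To make the refill rate concrete, here is a minimal client-side sketch of a bucket that refills at LIMIT_AMOUNT / LIMIT_PERIOD. It is only an approximation for pacing your own requests: the class name `TokenBucket`, the assumption that the bucket starts full, and the assumption that its capacity equals LIMIT_AMOUNT are ours, not details confirmed by this page.

```python
import time
from typing import Optional


class TokenBucket:
    """Hypothetical client-side model of a bucket refilled at a constant rate."""

    def __init__(self, limit_amount: float, limit_period: float,
                 now: Optional[float] = None) -> None:
        # Assumption: capacity equals LIMIT_AMOUNT and the bucket starts full.
        self.capacity = float(limit_amount)
        self.refill_rate = limit_amount / limit_period  # tokens per second
        self.tokens = float(limit_amount)
        self.updated_at = time.monotonic() if now is None else now

    def _refill(self, now: float) -> None:
        # Add the tokens earned since the last update, capped at capacity.
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.updated_at = now

    def try_consume(self, cost: float = 1.0, now: Optional[float] = None) -> bool:
        """Spend `cost` tokens if available; return False otherwise."""
        now = time.monotonic() if now is None else now
        self._refill(now)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Per-minute request limit from this page: 100 requests per 60 seconds,
# which refills at roughly 1.67 tokens per second.
request_bucket = TokenBucket(limit_amount=100, limit_period=60)
```

Calling `request_bucket.try_consume()` before each request gives a rough local estimate of whether the request is likely to be allowed. The response headers remain the authoritative source for your actual status.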

Limits

We enforce two types of rate limits:

Request Limits

| Interval   | Limit |
|------------|-------|
| Per minute | 100   |
| Per day    | 4000  |

Complexity Limits

Some endpoints consume more resources than others. We track this with “complexity tokens” that are consumed based on the endpoint:

| Interval   | Limit |
|------------|-------|
| Per minute | 200   |
| Per day    | 8000  |

| Endpoint       | Complexity |
|----------------|------------|
| Most endpoints | 1          |
| /v2/mem-it     | 40         |

For example, with a complexity limit of 200 per minute and mem-it costing 40 complexity tokens, you can make up to 5 mem-it calls per minute.
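The same arithmetic can be sketched in code. The example below combines both limit types from the tables above and takes whichever is more restrictive. The function names and the placeholder endpoint `/v2/hypothetical-endpoint` are illustrative only, and the calculation assumes a full budget with no other calls in the same window.

```python
# Limits and costs copied from the tables on this page.
REQUEST_LIMITS = {"minute": 100, "day": 4000}
COMPLEXITY_LIMITS = {"minute": 200, "day": 8000}
COMPLEXITY_COST = {"/v2/mem-it": 40}
DEFAULT_COST = 1  # "Most endpoints"


def complexity_cost(endpoint: str) -> int:
    """Complexity tokens consumed by one call to `endpoint`."""
    return COMPLEXITY_COST.get(endpoint, DEFAULT_COST)


def max_calls(endpoint: str, window: str) -> int:
    """Upper bound on calls to one endpoint in a window, starting from a full budget.

    The bound is the smaller of the request limit and the number of calls
    the complexity budget can pay for.
    """
    by_complexity = COMPLEXITY_LIMITS[window] // complexity_cost(endpoint)
    return min(REQUEST_LIMITS[window], by_complexity)


mem_it_per_minute = max_calls("/v2/mem-it", "minute")             # 200 // 40 = 5
other_per_minute = max_calls("/v2/hypothetical-endpoint", "minute")  # min(100, 200) = 100
```

For endpoints that cost 1 complexity token, the request limit is the tighter constraint; for mem-it, the complexity limit is.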

Response Headers

Every response includes headers indicating your current rate limit status. We show the most constrained bucket (minute or day) for each limit type:

| Header                 | Description                                           |
|------------------------|-------------------------------------------------------|
| X-RateLimit-Bucket     | Which time window is most constrained (minute or day) |
| X-RateLimit-Limit      | Maximum requests allowed in that window               |
| X-RateLimit-Remaining  | Requests remaining in that window                     |
| X-RateLimit-Reset      | Seconds until tokens replenish                        |
| X-Complexity-Bucket    | Which time window is most constrained (minute or day) |
| X-Complexity-Limit     | Maximum complexity tokens allowed in that window      |
| X-Complexity-Remaining | Complexity tokens remaining in that window            |
| X-Complexity-Reset     | Seconds until complexity tokens replenish             |
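One way to consume these headers is to parse them into a small structure per limit type. The sketch below is illustrative: `LimitStatus` and `parse_limit_status` are hypothetical names, and it assumes the Limit, Remaining, and Reset values arrive as plain numeric strings, which this page does not specify.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass
class LimitStatus:
    bucket: str           # "minute" or "day"
    limit: int            # maximum allowed in that window
    remaining: int        # amount left in that window
    reset_seconds: float  # seconds until tokens replenish


def parse_limit_status(headers: Mapping[str, str], prefix: str) -> LimitStatus:
    """Parse one header family, e.g. prefix "X-RateLimit" or "X-Complexity".

    Assumption: numeric header values are plain numbers sent as strings.
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    key = prefix.lower()
    return LimitStatus(
        bucket=lowered[f"{key}-bucket"],
        limit=int(lowered[f"{key}-limit"]),
        remaining=int(lowered[f"{key}-remaining"]),
        reset_seconds=float(lowered[f"{key}-reset"]),
    )


# Made-up header values, for illustration only.
sample_headers = {
    "X-RateLimit-Bucket": "minute",
    "X-RateLimit-Limit": "100",
    "X-RateLimit-Remaining": "97",
    "X-RateLimit-Reset": "2",
    "X-Complexity-Bucket": "day",
    "X-Complexity-Limit": "8000",
    "X-Complexity-Remaining": "120",
    "X-Complexity-Reset": "30",
}
requests_status = parse_limit_status(sample_headers, "X-RateLimit")
complexity_status = parse_limit_status(sample_headers, "X-Complexity")
```

A client could then, for example, skip an expensive mem-it call when `complexity_status.remaining` is below 40.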

Handling Rate Limit Errors

When you exceed a rate limit, the API returns a 429 status code. The response also includes a Retry-After header giving the number of seconds to wait before the next request will be allowed.
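A simple way to handle a 429 is to wait for the Retry-After interval and try again. The sketch below is one possible approach, not an official client: `call_with_retry` and the `send` callable (which performs one request and returns the status code, headers, and body) are hypothetical, as are the `max_attempts` and `default_wait` parameters.

```python
import time
from typing import Callable, Mapping, Tuple

# (status code, response headers, response body)
Response = Tuple[int, Mapping[str, str], bytes]


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    default_wait: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send`, waiting and retrying while the API answers 429.

    `send` is a hypothetical callable that performs one HTTP request.
    The last response is returned even if it is still a 429.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429 or attempt == max_attempts:
            return status, headers, body
        # Header names are case-insensitive, so normalize before lookup.
        lowered = {name.lower(): value for name, value in headers.items()}
        retry_after = lowered.get("retry-after")
        try:
            wait = float(retry_after) if retry_after is not None else default_wait
        except ValueError:
            # Fall back if the value is not a plain number of seconds.
            wait = default_wait
        sleep(max(0.0, wait))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the function easy to test without real delays. In production code you may also want to cap the total wait time.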