# Rate Limits

Calls to our API are rate limited to give everyone equitable access and to prevent abuse. We expect to adjust these limits as we gather more information, and we encourage your feedback. Any changes to limits will be announced in our Slack community.

We use the [leaky bucket algorithm](https://en.wikipedia.org/wiki/Leaky_bucket) for our rate limiters, which means that your tokens are refilled at a constant rate of `LIMIT_AMOUNT / LIMIT_PERIOD`. For example, a limit of 100 per minute refills at 100 / 60 ≈ 1.67 tokens per second.
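The refill rule above can be modeled on the client side to estimate how much headroom is left. The following is a minimal sketch, not the server's implementation: the `Bucket` class and its methods are hypothetical names, and it assumes the bucket starts full and never holds more than `LIMIT_AMOUNT` tokens.

```python
from dataclasses import dataclass, field


@dataclass
class Bucket:
    """Client-side approximation of one rate-limit bucket (hypothetical helper)."""

    limit_amount: float  # LIMIT_AMOUNT, e.g. 100 requests
    limit_period: float  # LIMIT_PERIOD in seconds, e.g. 60
    tokens: float = field(init=False)

    def __post_init__(self) -> None:
        # Assumption: the bucket starts full.
        self.tokens = self.limit_amount

    @property
    def refill_rate(self) -> float:
        """Tokens restored per second: LIMIT_AMOUNT / LIMIT_PERIOD."""
        return self.limit_amount / self.limit_period

    def advance(self, seconds: float) -> None:
        """Refill for the elapsed time, capped at the bucket capacity (assumed)."""
        self.tokens = min(self.limit_amount, self.tokens + seconds * self.refill_rate)

    def try_consume(self, cost: float = 1.0) -> bool:
        """Spend `cost` tokens if available; return False if the call would be limited."""
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

For instance, `Bucket(100, 60)` models a limit of 100 per minute: after draining it, calling `advance(6)` restores roughly 10 tokens.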

## Limits

We enforce two types of rate limits:

### Request Limits

| Interval   | Limit |
| ---------- | ----- |
| Per minute | 100   |
| Per day    | 4000  |

### Complexity Limits

Some endpoints consume more resources than others. We track this with "complexity tokens" that are consumed based on the endpoint:

| Interval   | Limit |
| ---------- | ----- |
| Per minute | 200   |
| Per day    | 8000  |

| Endpoint       | Complexity |
| -------------- | ---------- |
| Most endpoints | 1          |
| `/v2/mem-it`   | 40         |

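For example, with a complexity limit of 200 per minute and `/v2/mem-it` costing 40 complexity tokens, you can make up to 5 `/v2/mem-it` calls per minute (200 / 40). By the same arithmetic, the daily limit of 8000 allows up to 200 `/v2/mem-it` calls per day.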
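The same arithmetic can be written as a small helper that applies both limit types at once. This is an illustrative sketch using the figures from the tables above; the function name `max_calls` is hypothetical, and it assumes each call also counts as one request against the request limits and that no other calls share the budget.

```python
# Figures copied from the tables in this section.
REQUEST_LIMITS = {"minute": 100, "day": 4000}
COMPLEXITY_LIMITS = {"minute": 200, "day": 8000}
ENDPOINT_COST = {"/v2/mem-it": 40}
DEFAULT_COST = 1  # "Most endpoints"


def max_calls(endpoint: str, window: str) -> int:
    """Upper bound on calls to `endpoint` within `window` ("minute" or "day").

    Assumes every call counts as one request and nothing else uses the budget.
    """
    cost = ENDPOINT_COST.get(endpoint, DEFAULT_COST)
    by_complexity = COMPLEXITY_LIMITS[window] // cost
    return min(REQUEST_LIMITS[window], by_complexity)


print(max_calls("/v2/mem-it", "minute"))  # 5
print(max_calls("/v2/mem-it", "day"))     # 200
```

For an endpoint with a complexity of 1, the request limit is the binding one: 100 per minute and 4000 per day.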

## Response Headers

Every response includes headers indicating your current rate limit status. We show the most constrained bucket (minute or day) for each limit type:

| Header                   | Description                                               |
| ------------------------ | --------------------------------------------------------- |
| `X-RateLimit-Bucket`     | Which time window is most constrained (`minute` or `day`) |
| `X-RateLimit-Limit`      | Maximum requests allowed in that window                   |
| `X-RateLimit-Remaining`  | Requests remaining in that window                         |
| `X-RateLimit-Reset`      | Seconds until tokens replenish                            |
| `X-Complexity-Bucket`    | Which time window is most constrained (`minute` or `day`) |
| `X-Complexity-Limit`     | Maximum complexity tokens allowed in that window          |
| `X-Complexity-Remaining` | Complexity tokens remaining in that window                |
| `X-Complexity-Reset`     | Seconds until complexity tokens replenish                 |
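A client can read these headers after each response to decide whether to slow down. The sketch below is one possible way to parse them. `LimitStatus` and `parse_limit_status` are hypothetical names, and it assumes the limit and remaining values are integer strings and the reset value is a number of seconds.

```python
from typing import Mapping, NamedTuple


class LimitStatus(NamedTuple):
    bucket: str           # "minute" or "day"
    limit: int            # maximum allowed in that window
    remaining: int        # amount left in that window
    reset_seconds: float  # seconds until tokens replenish


def parse_limit_status(headers: Mapping[str, str], prefix: str) -> LimitStatus:
    """Parse one header family; `prefix` is "X-RateLimit" or "X-Complexity".

    Assumes integer limit/remaining values and a numeric reset value.
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    p = prefix.lower()
    return LimitStatus(
        bucket=lowered[f"{p}-bucket"],
        limit=int(lowered[f"{p}-limit"]),
        remaining=int(lowered[f"{p}-remaining"]),
        reset_seconds=float(lowered[f"{p}-reset"]),
    )
```

Calling it once with `"X-RateLimit"` and once with `"X-Complexity"` gives both views; a client might pause when either `remaining` value approaches zero.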

## Handling Rate Limit Errors

When rate limits are exceeded, the API returns a `429` status code. It also
returns a `Retry-After` header with the number of seconds to wait before the
next request is allowed.
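One way to honor `Retry-After` is a small retry wrapper. The sketch below is transport-agnostic and makes several assumptions: `send` is a hypothetical callable that performs the request and returns `(status, headers, body)`, and the `max_attempts`, `max_wait`, and `fallback_wait` defaults are illustrative rather than recommended values.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], bytes]  # (status, headers, body)


def retry_after_seconds(headers: Mapping[str, str], fallback: float = 1.0) -> float:
    """Read Retry-After as seconds; use `fallback` if it is missing or malformed."""
    for name, value in headers.items():
        if name.lower() == "retry-after":
            try:
                return max(0.0, float(value))
            except ValueError:
                return fallback
    return fallback


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 3,
    max_wait: float = 60.0,
    fallback_wait: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send`, waiting out 429 responses for up to `max_attempts` tries."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        response = send()
        status, headers, _body = response
        if status != 429 or attempt == max_attempts:
            return response
        wait = retry_after_seconds(headers, fallback_wait)
        if wait > max_wait:
            # Too long to block on (for example a long reset); let the caller decide.
            return response
        sleep(wait)
    return response
```

The `max_wait` cap keeps the wrapper from sleeping through very long waits; in that case the `429` response is returned to the caller unchanged.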

Usage quotas, such as a plan's monthly note creation allowance, also return
`429`. These responses use `error.type: "quota_exceeded"` and include the quota
reset time in `error.details.reset_time`. `Retry-After` reflects that reset.
Upgrading the account may restore access sooner, so clients should check the
account state after an upgrade instead of caching the quota response.
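Because both cases share the `429` status code, a client needs to inspect the response body to tell them apart. The following is a minimal sketch under stated assumptions: the body is JSON shaped like `{"error": {"type": ..., "details": {"reset_time": ...}}}`, which is inferred from the field paths above, and `classify_429` is a hypothetical helper. The format of `reset_time` is not specified here, so the value is returned unparsed.

```python
import json
from typing import Any, Optional, Tuple


def classify_429(body: str) -> Tuple[str, Optional[Any]]:
    """Return ("quota_exceeded", reset_time) or ("rate_limited", None).

    Assumes a JSON body with an `error` object; anything else is treated
    as an ordinary rate-limit response.
    """
    try:
        payload = json.loads(body)
    except ValueError:
        return ("rate_limited", None)

    error = payload.get("error") if isinstance(payload, dict) else None
    if isinstance(error, dict) and error.get("type") == "quota_exceeded":
        details = error.get("details")
        reset_time = details.get("reset_time") if isinstance(details, dict) else None
        return ("quota_exceeded", reset_time)
    return ("rate_limited", None)
```

A client might retry automatically on `"rate_limited"` but surface `"quota_exceeded"` to the user, and re-check the account state after an upgrade rather than waiting for the stored reset time.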