Quick Start
Get your first rate limit check working in two steps.
1. Create a bucket
A bucket defines your rate limit policy. Use your ADMIN_TOKEN to create one. You only need to do this once.
curl -X POST https://api.waitstate.io/v1/buckets \
-H "Authorization: Bearer $WAITSTATE_ADMIN_TOKEN" \
-H "Content-Type: application/json" \
-d '{"name":"my-api","capacity":100,"refill_rate":1}'

Response:
{
"id": "01J8KM2P4XQYZ",
"name": "my-api",
"capacity": 100,
"refill_rate": 1,
"region": null,
"log_webhook": null,
"log_sampling": 1.0,
"created_at": 1739832000,
"updated_at": 1739832000
}

2. Check a limit
Call POST /v1/deduct before each operation you want to rate limit. Use your DEDUCT_TOKEN.
curl -X POST https://api.waitstate.io/v1/deduct \
-H "Authorization: Bearer $WAITSTATE_DEDUCT_TOKEN" \
-H "Content-Type: application/json" \
-d '{"key":"user_123","bucket":"my-api","cost":1}'

Allowed (200):
HTTP/1.1 200 OK
X-WaitState-Limit: 100
X-WaitState-Remaining: 99
X-WaitState-Reset: 1739832160
{"allowed":true}

Denied (429):
HTTP/1.1 429 Too Many Requests
Retry-After: 5
{"error":"wait_state_active","message":"Quota exhausted. Required 1 units.","retry_after":5}

For guidance on choosing the key value, see Keys below.

Core Concepts
Buckets
A bucket defines your rate limit policy for a given operation. It has two parameters:
capacity: the maximum number of tokens a user can accumulate (controls burst size)
refill_rate: tokens added per second (controls sustained throughput)
Example: capacity: 100, refill_rate: 1 allows a burst of up to 100 requests, then sustains 1 request per second. Each user gets their own isolated token bucket; the policy applies per user, not globally.
Create buckets once with your admin token. You can update capacity, refill_rate, and logging config at any time. The region field is set at creation and cannot be changed.
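The interaction between capacity and refill_rate can be sketched as a classic token bucket. The class below is an illustrative model of the documented behavior, not WaitState's actual implementation:

```python
import time

class TokenBucket:
    def __init__(self, capacity: int, refill_rate: int):
        self.capacity = capacity
        self.refill_rate = refill_rate   # tokens per second
        self.tokens = float(capacity)    # buckets start full
        self.last = time.monotonic()

    def deduct(self, cost: int = 1) -> bool:
        now = time.monotonic()
        # Refill continuously since the last check, capped at capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

bucket = TokenBucket(capacity=100, refill_rate=1)
print(all(bucket.deduct() for _ in range(100)))  # True: burst of 100 allowed
print(bucket.deduct())                           # False: 101st immediate request denied
```

With capacity: 1 the same model gives the "no burst" behavior from the patterns table: one request, then a one-second wait.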
Keys
The key field identifies the entity being rate limited, typically a user ID or API key. WaitState creates an isolated token bucket for every unique key + bucket pair. There is no limit to the number of keys per bucket.
Hash before sending:
Shell:
echo -n "user@example.com" | sha256sum | awk '{print $1}'
# prints a 64-character hex digest

Node.js:
import { createHash } from 'node:crypto'
const key = createHash('sha256').update(userId).digest('hex')

Python:
import hashlib
key = hashlib.sha256(user_id.encode()).hexdigest()

Go:
import (
    "crypto/sha256"
    "encoding/hex"
)
h := sha256.Sum256([]byte(userID))
key := hex.EncodeToString(h[:])

Ruby:
require 'digest'
key = Digest::SHA256.hexdigest(user_id)

Cost
The cost field is the number of tokens to deduct for this request. You calculate it client-side; WaitState never sees your request payload or protocol. This makes it work equally well with REST, GraphQL, gRPC, WebSockets, or any other protocol.
| Use case | Cost formula |
|---|---|
| Simple REST API | cost: 1 per request |
| GraphQL (complexity limiting) | cost: queryComplexity |
| File uploads | cost: Math.ceil(bytes / 1024) |
| LLM token budgets | cost: inputTokens + outputTokens |
If cost is omitted it defaults to 1. Cost must be a positive number.
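The cost formulas in the table are ordinary client-side arithmetic. A quick sketch (the helper names are illustrative, not part of any WaitState SDK):

```python
import math

def upload_cost(size_bytes: int) -> int:
    # One token per KiB started, matching ceil(bytes / 1024) from the table.
    return math.ceil(size_bytes / 1024)

def llm_cost(input_tokens: int, output_tokens: int) -> int:
    # Total tokens consumed by the request and the response.
    return input_tokens + output_tokens

print(upload_cost(4096))    # 4
print(upload_cost(4097))    # 5 (a partial KiB still costs a full token)
print(llm_cost(1200, 800))  # 2000
```

Whatever the formula, the result is simply passed as cost in the /v1/deduct body.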
Common rate limit patterns
Set capacity and refill_rate on the bucket, then pass cost: 1 per request. The math: refill_rate is tokens per second, so multiply by 60 to get tokens per minute.
| Target limit | capacity | refill_rate | Notes |
|---|---|---|---|
| 60 req/min with burst | 60 | 1 | Users can burn 60 tokens instantly, then get 1/sec (60/min) sustained. |
| 60 req/min, no burst | 1 | 1 | Capacity = 1 means no accumulation. Strict 1 request per second. |
| 10 req/sec | 10 | 10 | 10 tokens burst, 10 refilled per second. Sustained 600/min. |
| 100 req/sec with headroom | 500 | 100 | Allows a 5-second burst before throttling to 100/sec sustained. |
| GraphQL complexity budget | 5000 | 83 | ~5,000 complexity points per minute. Pass query complexity as cost. |
| LLM token budget | 100000 | 1666 | ~100k tokens/min sustained. Pass inputTokens + outputTokens as cost. |
For a target of N requests per minute, set refill_rate = N / 60 (rounded to the nearest integer) and set capacity to however much burst you want to allow.

Authentication
WaitState uses two separate bearer tokens, intentionally scoped to different operations: if your production environment is compromised, the DEDUCT_TOKEN it holds cannot modify your bucket configuration.
| Token | Endpoints | Where to use it |
|---|---|---|
ADMIN_TOKEN | /v1/buckets/* | CI/CD pipelines, scripts. Never embed in client-facing code. |
DEDUCT_TOKEN | /v1/deduct | Your application servers. Embed as an env var. |
Pass either token as a standard HTTP bearer:
Authorization: Bearer <your-token>

All requests without a valid token return 401 Unauthorized. Tokens sent to the wrong endpoint also return 401. The DEDUCT_TOKEN cannot access bucket CRUD and vice versa.
Integration Examples
These examples show how to call /v1/deduct from common server-side runtimes. The pattern is always the same: hash the user ID, POST to WaitState, branch on the status code.
Shell:
# Hash your user ID before sending -- never send raw PII
KEY=$(echo -n "user_123" | sha256sum | awk '{print $1}')
curl -X POST https://api.waitstate.io/v1/deduct \
-H "Authorization: Bearer $WAITSTATE_DEDUCT_TOKEN" \
-H "Content-Type: application/json" \
-d "{\"key\":\"$KEY\",\"bucket\":\"my-api\",\"cost\":1}"
# 200 OK -> {"allowed":true}
# 429 -> {"error":"wait_state_active","retry_after":5}

Node.js:
import { createHash } from 'node:crypto'
const TOKEN = process.env.WAITSTATE_DEDUCT_TOKEN!
function sha256(s: string) {
return createHash('sha256').update(s).digest('hex')
}
export async function allow(userId: string, cost = 1): Promise<boolean> {
const res = await fetch('https://api.waitstate.io/v1/deduct', {
method: 'POST',
headers: {
Authorization: `Bearer ${TOKEN}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({ key: sha256(userId), bucket: 'my-api', cost }),
})
if (res.ok) return true
if (res.status === 429) {
const { retry_after } = await res.json() as { retry_after: number }
// forward to caller: reply.header('Retry-After', String(retry_after))
return false
}
throw new Error(`WaitState error: ${res.status}`)
}

Python:
import hashlib, os
import httpx
TOKEN = os.environ['WAITSTATE_DEDUCT_TOKEN']
def sha256(s: str) -> str:
return hashlib.sha256(s.encode()).hexdigest()
def allow(user_id: str, cost: int = 1) -> bool:
r = httpx.post(
'https://api.waitstate.io/v1/deduct',
headers={'Authorization': f'Bearer {TOKEN}'},
json={'key': sha256(user_id), 'bucket': 'my-api', 'cost': cost},
)
if r.status_code == 200:
return True
if r.status_code == 429:
retry_after = r.json()['retry_after']
# raise RateLimitError(retry_after=retry_after)
return False
r.raise_for_status()

Go:
package ratelimit
import (
"bytes"
"crypto/sha256"
"encoding/hex"
"encoding/json"
"fmt"
"net/http"
"os"
)
var token = os.Getenv("WAITSTATE_DEDUCT_TOKEN")
type Result struct {
Allowed bool
RetryAfter int
}
func Allow(userID string, cost int) (Result, error) {
h := sha256.Sum256([]byte(userID))
body, _ := json.Marshal(map[string]any{
"key": hex.EncodeToString(h[:]),
"bucket": "my-api",
"cost": cost,
})
req, _ := http.NewRequest(http.MethodPost,
"https://api.waitstate.io/v1/deduct",
bytes.NewReader(body))
req.Header.Set("Authorization", "Bearer "+token)
req.Header.Set("Content-Type", "application/json")
resp, err := http.DefaultClient.Do(req)
if err != nil {
return Result{}, err
}
defer resp.Body.Close()
switch resp.StatusCode {
case 200:
return Result{Allowed: true}, nil
case 429:
var r struct {
RetryAfter int `json:"retry_after"`
}
json.NewDecoder(resp.Body).Decode(&r)
return Result{Allowed: false, RetryAfter: r.RetryAfter}, nil
default:
return Result{}, fmt.Errorf("waitstate: status %d", resp.StatusCode)
}
}

Ruby:
require 'digest'
require 'json'
require 'net/http'
TOKEN = ENV.fetch('WAITSTATE_DEDUCT_TOKEN')
def allow(user_id, cost: 1)
key = Digest::SHA256.hexdigest(user_id)
uri = URI('https://api.waitstate.io/v1/deduct')
Net::HTTP.start(uri.host, uri.port, use_ssl: true) do |http|
req = Net::HTTP::Post.new(uri,
'Authorization' => "Bearer #{TOKEN}",
'Content-Type' => 'application/json')
req.body = { key:, bucket: 'my-api', cost: }.to_json
res = http.request(req)
return true if res.code == '200'
return false if res.code == '429'
raise "WaitState error: #{res.code}"
end
end

The bucket field accepts either the bucket ID or name. Using the name is more readable; using the ID is slightly faster and avoids breaking if you rename a bucket.
Client Libraries
Official client libraries are in development. They will handle SHA-256 hashing, response header parsing, retry logic, and error propagation automatically. All libraries will be MIT-licensed and open source.
| Language | Package | Status |
|---|---|---|
| Node.js / TypeScript | @waitstate/node | Coming soon |
| Python | waitstate | Coming soon |
| Go | github.com/waitstate/waitstate-go | Coming soon |
| Ruby | waitstate | Coming soon |
Deduct Tokens
Atomically deducts tokens from a user's bucket and returns whether the request is allowed.
Request body
| Field | Type | Required | Description |
|---|---|---|---|
key | string | yes | Opaque user identifier. SHA-256 hash PII before sending. |
bucket | string | yes | Bucket ID or name. |
cost | number | no | Tokens to deduct. Defaults to 1. Must be a positive number. |
200 OK
{ "allowed": true }

Response headers are always set. See Response Headers.
429 Too Many Requests
{
"error": "wait_state_active",
"message": "Quota exhausted. Required 10 units.",
"retry_after": 5
}

retry_after is the number of seconds until enough tokens have refilled to allow the same cost. Forward this value to your client as a Retry-After header.
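A common client-side pattern is to sleep for retry_after and try again. A minimal sketch, where check stands in for the actual POST /v1/deduct call (swap in your HTTP client of choice):

```python
import time

def deduct_with_wait(check, cost: int = 1, max_attempts: int = 5) -> bool:
    """Retry a deduct call, honoring retry_after between attempts."""
    for _ in range(max_attempts):
        status, body = check(cost)  # e.g. (200, {"allowed": True}) or (429, {...})
        if status == 200:
            return True
        if status == 429:
            # The server says exactly how long to wait before the same
            # cost can be satisfied -- no guessing at backoff intervals.
            time.sleep(body["retry_after"])
            continue
        raise RuntimeError(f"WaitState error: {status}")
    return False
```

This suits background jobs; for interactive requests, return the 429 to the caller instead of sleeping server-side.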
Response Headers
Every response includes these headers, regardless of whether the request was allowed or denied.
| Header | Description |
|---|---|
X-WaitState-Limit | The bucket's capacity. |
X-WaitState-Remaining | Tokens remaining in this user's bucket after the request. |
X-WaitState-Reset | Unix timestamp when the bucket will refill to full capacity at the current rate. |
Retry-After | Seconds until the same cost can be satisfied. Only present on 429 responses. |
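Since WaitState already computes these values, a simple approach is to copy them onto your own API's responses so clients can self-throttle. A sketch, where set_header stands in for your web framework's response API:

```python
# Headers worth forwarding from the WaitState response to your own clients.
FORWARD = (
    "X-WaitState-Limit",
    "X-WaitState-Remaining",
    "X-WaitState-Reset",
    "Retry-After",  # only present on 429 responses
)

def forward_headers(upstream_headers: dict, set_header) -> None:
    for name in FORWARD:
        if name in upstream_headers:
            set_header(name, upstream_headers[name])
```

For example, with a dict-like response object you might call forward_headers(dict(resp.headers), reply.header).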
List Buckets
Returns all buckets, newest first.
[
{
"id": "01J8KM2P4XQYZ",
"name": "graphql-api",
"capacity": 1000,
"refill_rate": 10,
"region": "wnam",
"log_webhook": null,
"log_sampling": 1.0,
"created_at": 1739832000,
"updated_at": 1739832000
},
{
"id": "01J8KM2P4ABCD",
"name": "rest-api",
"capacity": 100,
"refill_rate": 1,
"region": null,
"log_webhook": "https://ingest.example.com/waitstate",
"log_sampling": 0.1,
"created_at": 1739831000,
"updated_at": 1739831000
}
]

Create Bucket
Creates a new bucket. Returns 201 Created with the full bucket object.
Request body
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
name | string | yes | - | Human-readable name. Used as an alias in /v1/deduct. |
capacity | integer | yes | - | Maximum tokens per user (burst size). |
refill_rate | integer | yes | - | Tokens added per second (sustained rate). |
region | string | no | null | Pin rate limiters to a region. Immutable after creation. See Regions. |
log_webhook | string | no | null | HTTPS URL to POST log events to. See Log Delivery. |
log_sampling | number | no | 1.0 | Fraction of events to log (0.0 to 1.0). See Log Delivery. |
Bucket object
{
"id": "01J8KM2P4XQYZ",
"name": "my-api",
"capacity": 100,
"refill_rate": 1,
"region": "wnam",
"log_webhook": null,
"log_sampling": 1.0,
"created_at": 1739832000,
"updated_at": 1739832000
}

Get Bucket
Returns a single bucket by ID. Returns 404 if not found.
Update Bucket
Partial update. All fields are optional. region is immutable and cannot be changed after creation. Returns the updated bucket object.
| Field | Type | Description |
|---|---|---|
name | string | New display name. |
capacity | integer | New capacity. Takes effect on the next deduct call. |
refill_rate | integer | New refill rate. Takes effect on the next deduct call. |
log_webhook | string or null | Set or clear the webhook URL. |
log_sampling | number | New sampling rate (0.0–1.0). |
Delete Bucket
Deletes the bucket definition. Returns 204 No Content. This does not immediately destroy existing Durable Object state for active users; those will expire naturally.
Regions
By default, WaitState uses Smart Placement to automatically co-locate each rate limiter near your application servers. If your users are concentrated in a specific geography, you can pin a bucket to a region for more predictable latency.
Set region when creating a bucket. It cannot be changed after creation.
| Code | Location |
|---|---|
wnam | Western North America |
enam | Eastern North America |
sam | South America |
weur | Western Europe |
eeur | Eastern Europe |
apac | Asia Pacific |
oc | Oceania |
afr | Africa |
me | Middle East |
If you are unsure, leave region unset. Smart Placement will handle it.

Log Delivery
WaitState can emit a structured log record for every rate limit check. Logs are useful for auditing, debugging, and building dashboards. Configure logging per bucket.
Webhook delivery
Set log_webhook to an HTTPS URL and WaitState will POST each log record there as JSON. Use this to ship logs to Datadog, Splunk, a custom ingest endpoint, or anywhere that accepts HTTP.
curl -X PUT https://api.waitstate.io/v1/buckets/<id> \
-H "Authorization: Bearer $WAITSTATE_ADMIN_TOKEN" \
-H "Content-Type: application/json" \
-d '{"log_webhook":"https://ingest.example.com/waitstate","log_sampling":0.1}'

The request is fire-and-forget. WaitState does not retry failed webhook calls. If your endpoint is unavailable, those log records are dropped. If no webhook is set, logs are written to internal R2 storage.
Log record format
{
"ts": 1739832100,
"bucket_id": "01J8KM2P4XQYZ",
"key_hash": "a665a45920422f9d417e4867efdc4fb8a04a1f3fff1fa07e998e86f7f7a27ae3",
"cost": 10,
"allowed": false,
"remaining": 0,
"retry_after": 5
}

| Field | Description |
|---|---|
ts | Unix timestamp of the check. |
bucket_id | The bucket that was checked. |
key_hash | SHA-256 hex digest of the raw key. The original key is never stored. |
cost | Tokens requested. |
allowed | true if the request was allowed. |
remaining | Tokens remaining after this check. |
retry_after | Seconds until retry. Only present when allowed is false. |
Sampling
For high-volume buckets, reduce log volume with log_sampling. Set it to a value between 0.0 and 1.0 representing the fraction of events to log.
| Value | Behavior |
|---|---|
1.0 | Log every event (default) |
0.1 | Log ~10% of events |
0.01 | Log ~1% of events |
0.0 | Disable logging entirely |
Sampling is applied randomly and independently per request. It is not guaranteed to produce an exact percentage, but converges at volume.
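The behavior described above amounts to an independent coin flip per check. This is a model of the documented semantics, not WaitState's internals:

```python
import random

def should_log(log_sampling: float) -> bool:
    # Each check is logged independently with probability log_sampling.
    # 1.0 always logs; 0.0 never does.
    return random.random() < log_sampling

random.seed(1)
rate = sum(should_log(0.1) for _ in range(100_000)) / 100_000
print(rate)  # approximately 0.1 -- not exact, but converges at volume
```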
Error Reference
| Status | Meaning |
|---|---|
400 | The request body is missing required fields, contains invalid values, or is not valid JSON. The response body includes a human-readable error field. |
401 | Missing or invalid Authorization header, or the correct token was used on the wrong endpoint (e.g. DEDUCT_TOKEN on a bucket endpoint). |
404 | The bucket ID or name in the request does not exist. |
429 | The user's token bucket does not have enough tokens to cover the requested cost (error: "wait_state_active"). Check retry_after in the response body and the Retry-After header. |