Quick Start

Get your first rate limit check working in two steps.

1. Create a bucket

A bucket defines your rate limit policy. Use your ADMIN_TOKEN to create one. You only need to do this once.

curl -X POST https://api.waitstate.io/v1/buckets \
  -H "Authorization: Bearer $WAITSTATE_ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"name":"my-api","capacity":100,"refill_rate":1}'

Response:

{
  "id": "01J8KM2P4XQYZ",
  "name": "my-api",
  "capacity": 100,
  "refill_rate": 1,
  "region": null,
  "log_webhook": null,
  "log_sampling": 1,
  "created_at": 1739832000,
  "updated_at": 1739832000
}

2. Check a limit

Call POST /v1/deduct before each operation you want to rate limit. Use your DEDUCT_TOKEN.

curl -X POST https://api.waitstate.io/v1/deduct \
  -H "Authorization: Bearer $WAITSTATE_DEDUCT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"key":"user_123","bucket":"my-api","cost":1}'

Allowed (200):

HTTP/1.1 200 OK
X-WaitState-Limit: 100
X-WaitState-Remaining: 99
X-WaitState-Reset: 1739832160

{"allowed":true}

Denied (429):

HTTP/1.1 429 Too Many Requests
Retry-After: 5

{"error":"wait_state_active","message":"Quota exhausted. Required 1 units.","retry_after":5}

Note: In production, always SHA-256 hash your user IDs before sending them as the key. See Keys below.

Core Concepts

Buckets

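A bucket defines your rate limit policy for a given operation. It has two parameters:

| Parameter | Description |
| --- | --- |
| capacity | Maximum tokens per user (burst size). |
| refill_rate | Tokens added per second (sustained rate). |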

Example: capacity: 100, refill_rate: 1 allows a burst of up to 100 requests, then sustains 1 request per second. Each user gets their own isolated token bucket; the policy applies per user, not globally.
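The burst-then-sustain behavior can be illustrated with a small local simulation. This is a minimal sketch of a generic token bucket, not WaitState's actual implementation; the TokenBucket class and its method names are hypothetical, and the clock is passed in explicitly so the result is deterministic.

```python
class TokenBucket:
    """Generic token bucket: holds up to `capacity` tokens, refills `refill_rate` per second."""

    def __init__(self, capacity: int, refill_rate: int) -> None:
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.tokens = float(capacity)  # a new bucket starts full
        self.last = 0.0                # time (in seconds) of the last update

    def deduct(self, cost: float, now: float) -> bool:
        # Add the tokens earned since the last call, capped at capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


bucket = TokenBucket(capacity=100, refill_rate=1)
burst = sum(bucket.deduct(1, now=0.0) for _ in range(150))
print(burst)                      # 100: only the first 100 immediate requests pass
print(bucket.deduct(1, now=0.5))  # False: half a token has refilled
print(bucket.deduct(1, now=1.0))  # True: one full token is available again
```

Each unique key would get its own instance of such a bucket, which is why one user exhausting their tokens does not affect anyone else.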

Create buckets once with your admin token. You can update capacity, refill_rate, and logging config at any time. The region field is set at creation and cannot be changed.

Keys

The key field identifies the entity being rate limited, typically a user ID or API key. WaitState creates an isolated token bucket for every unique key + bucket pair. There is no limit to the number of keys per bucket.

Important: Never send raw user IDs, emails, or other PII as the key. Always hash them first. WaitState stores key hashes in logs. The original value is never recorded.

Hash before sending:

Shell:

echo -n "user@example.com" | sha256sum | awk '{print $1}'
# prints the 64-character hex digest to use as the key

Node.js / TypeScript:

import { createHash } from 'node:crypto'
const key = createHash('sha256').update(userId).digest('hex')

Python:

import hashlib
key = hashlib.sha256(user_id.encode()).hexdigest()

Go:

import (
    "crypto/sha256"
    "encoding/hex"
)

h := sha256.Sum256([]byte(userID))
key := hex.EncodeToString(h[:])

Ruby:

require 'digest'
key = Digest::SHA256.hexdigest(user_id)

Cost

The cost field is the number of tokens to deduct for this request. You calculate it client-side; WaitState never sees your request payload or protocol. This makes it work equally well with REST, GraphQL, gRPC, WebSockets, or any other protocol.

| Use case | Cost formula |
| --- | --- |
| Simple REST API | cost: 1 per request |
| GraphQL (complexity limiting) | cost: queryComplexity |
| File uploads | cost: Math.ceil(bytes / 1024) |
| LLM token budgets | cost: inputTokens + outputTokens |

If cost is omitted it defaults to 1. Cost must be a positive number.
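The cost formulas for file uploads and LLM budgets translate directly into small client-side helpers. The sketch below is illustrative only: the function names are hypothetical, and clamping an empty upload to a cost of 1 is an assumption made here so that the result stays positive.

```python
import math


def upload_cost(size_bytes: int) -> int:
    """One token per started KiB, mirroring Math.ceil(bytes / 1024)."""
    # Assumption: an empty upload still costs 1, because cost must be positive.
    return max(1, math.ceil(size_bytes / 1024))


def llm_cost(input_tokens: int, output_tokens: int) -> int:
    """Charge for prompt and completion tokens together."""
    return input_tokens + output_tokens


def validate_cost(cost: float) -> float:
    """Reject values before they are sent: cost must be a positive number."""
    if isinstance(cost, bool) or not isinstance(cost, (int, float)) or cost <= 0:
        raise ValueError(f"cost must be a positive number, got {cost!r}")
    return cost


print(upload_cost(1500))   # 2
print(llm_cost(420, 180))  # 600
```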

Common rate limit patterns

Set capacity and refill_rate on the bucket, then pass cost: 1 per request. The math: refill_rate is tokens per second, so multiply by 60 to get tokens per minute.

| Target limit | capacity | refill_rate | Notes |
| --- | --- | --- | --- |
| 60 req/min with burst | 60 | 1 | Users can burn 60 tokens instantly, then get 1/sec (60/min) sustained. |
| 60 req/min, no burst | 1 | 1 | Capacity = 1 means no accumulation. Strict 1 request per second. |
| 10 req/sec | 10 | 10 | 10 tokens burst, 10 refilled per second. Sustained 600/min. |
| 100 req/sec with headroom | 500 | 100 | Allows a 5-second burst before throttling to 100/sec sustained. |
| GraphQL complexity budget | 5000 | 83 | ~5,000 complexity points per minute. Pass query complexity as cost. |
| LLM token budget | 100000 | 1666 | ~100k tokens/min sustained. Pass inputTokens + outputTokens as cost. |

Tip: To convert any "N requests per minute" limit: set refill_rate = N / 60 (round to nearest integer) and capacity to however much burst you want to allow.
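The conversion in the tip can be written as a short helper. This is a sketch under stated assumptions: bucket_params is a hypothetical name, and defaulting capacity to one second's worth of tokens when no burst is given is a choice made here, not documented WaitState behavior.

```python
import math
from typing import Optional


def bucket_params(per_minute: int, burst: Optional[int] = None) -> dict:
    """Convert an "N per minute" target into capacity and refill_rate."""
    # Round half up explicitly; Python's built-in round() rounds halves to even.
    refill_rate = math.floor(per_minute / 60 + 0.5)
    if refill_rate < 1:
        raise ValueError("targets below 30 per minute round to a refill_rate of 0")
    # Assumption: with no burst requested, allow one second's worth of tokens.
    capacity = burst if burst is not None else refill_rate
    return {"capacity": capacity, "refill_rate": refill_rate}


print(bucket_params(60, burst=60))      # {'capacity': 60, 'refill_rate': 1}
print(bucket_params(600))               # {'capacity': 10, 'refill_rate': 10}
print(bucket_params(5000, burst=5000))  # {'capacity': 5000, 'refill_rate': 83}
```

Because refill_rate is an integer number of tokens per second, targets that are not a multiple of 60 per minute are only approximated, as the "~" in the table rows indicates.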

Authentication

WaitState uses two separate bearer tokens, each scoped to a different set of operations. The split is intentional: if a production application server is compromised and its DEDUCT_TOKEN leaks, the attacker still cannot read or change your bucket configuration.

| Token | Endpoints | Where to use it |
| --- | --- | --- |
| ADMIN_TOKEN | /v1/buckets/* | CI/CD pipelines, scripts. Never embed in client-facing code. |
| DEDUCT_TOKEN | /v1/deduct | Your application servers. Embed as an env var. |

Pass either token as a standard HTTP bearer:

Authorization: Bearer <your-token>

All requests without a valid token return 401 Unauthorized. Tokens sent to the wrong endpoint also return 401. The DEDUCT_TOKEN cannot access bucket CRUD and vice versa.
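One way to keep the two tokens from being mixed up is to select the token from the endpoint path. The helper below is a hypothetical sketch; it assumes the tokens live in the WAITSTATE_ADMIN_TOKEN and WAITSTATE_DEDUCT_TOKEN environment variables used in the Quick Start examples.

```python
import os
from typing import Mapping


def auth_headers(path: str, env: Mapping[str, str] = os.environ) -> dict:
    """Return the Authorization header that matches the endpoint being called."""
    if path == "/v1/deduct":
        var = "WAITSTATE_DEDUCT_TOKEN"
    elif path == "/v1/buckets" or path.startswith("/v1/buckets/"):
        var = "WAITSTATE_ADMIN_TOKEN"
    else:
        raise ValueError(f"unknown endpoint: {path}")
    token = env.get(var)
    if not token:
        # Fails fast on hosts that were deliberately not given this token.
        raise RuntimeError(f"{var} is not set")
    return {"Authorization": f"Bearer {token}"}
```

On an application server that only has the deduct token configured, a call such as auth_headers("/v1/buckets") raises locally instead of producing a 401 from the API.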

Integration Examples

These examples show how to call /v1/deduct from common server-side runtimes. The pattern is always the same: hash the user ID, POST to WaitState, branch on the status code.

Shell:

# Hash your user ID before sending -- never send raw PII
KEY=$(echo -n "user_123" | sha256sum | awk '{print $1}')

curl -X POST https://api.waitstate.io/v1/deduct \
  -H "Authorization: Bearer $WAITSTATE_DEDUCT_TOKEN" \
  -H "Content-Type: application/json" \
  -d "{\"key\":\"$KEY\",\"bucket\":\"my-api\",\"cost\":1}"

# 200 OK  ->  {"allowed":true}
# 429     ->  {"error":"wait_state_active","retry_after":5}

Node.js / TypeScript:

import { createHash } from 'node:crypto'

const TOKEN = process.env.WAITSTATE_DEDUCT_TOKEN!

function sha256(s: string) {
  return createHash('sha256').update(s).digest('hex')
}

export async function allow(userId: string, cost = 1): Promise<boolean> {
  const res = await fetch('https://api.waitstate.io/v1/deduct', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${TOKEN}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ key: sha256(userId), bucket: 'my-api', cost }),
  })

  if (res.ok) return true

  if (res.status === 429) {
    const { retry_after } = await res.json() as { retry_after: number }
    // forward to caller: reply.header('Retry-After', String(retry_after))
    return false
  }

  throw new Error(`WaitState error: ${res.status}`)
}

Python:

import hashlib, os
import httpx

TOKEN = os.environ['WAITSTATE_DEDUCT_TOKEN']

def sha256(s: str) -> str:
    return hashlib.sha256(s.encode()).hexdigest()

def allow(user_id: str, cost: int = 1) -> bool:
    r = httpx.post(
        'https://api.waitstate.io/v1/deduct',
        headers={'Authorization': f'Bearer {TOKEN}'},
        json={'key': sha256(user_id), 'bucket': 'my-api', 'cost': cost},
    )
    if r.status_code == 200:
        return True
    if r.status_code == 429:
        retry_after = r.json()['retry_after']
        # raise RateLimitError(retry_after=retry_after)
        return False
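    r.raise_for_status()
    # Any other status (for example an unexpected 3xx) is an error, not a denial.
    raise RuntimeError(f'WaitState error: {r.status_code}')

Go:

package ratelimit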

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"net/http"
	"os"
)

var token = os.Getenv("WAITSTATE_DEDUCT_TOKEN")

type Result struct {
	Allowed    bool
	RetryAfter int
}

func Allow(userID string, cost int) (Result, error) {
	h := sha256.Sum256([]byte(userID))
	body, _ := json.Marshal(map[string]any{
		"key":    hex.EncodeToString(h[:]),
		"bucket": "my-api",
		"cost":   cost,
	})

	req, _ := http.NewRequest(http.MethodPost,
		"https://api.waitstate.io/v1/deduct",
		bytes.NewReader(body))
	req.Header.Set("Authorization", "Bearer "+token)
	req.Header.Set("Content-Type", "application/json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return Result{}, err
	}
	defer resp.Body.Close()

	switch resp.StatusCode {
	case 200:
		return Result{Allowed: true}, nil
	case 429:
		var r struct {
			RetryAfter int `json:"retry_after"`
		}
		json.NewDecoder(resp.Body).Decode(&r)
		return Result{Allowed: false, RetryAfter: r.RetryAfter}, nil
	default:
		return Result{}, fmt.Errorf("waitstate: status %d", resp.StatusCode)
	}
}

Ruby:

require 'digest'
require 'json'
require 'net/http'

TOKEN = ENV.fetch('WAITSTATE_DEDUCT_TOKEN')

def allow(user_id, cost: 1)
  key = Digest::SHA256.hexdigest(user_id)
  uri = URI('https://api.waitstate.io/v1/deduct')

  Net::HTTP.start(uri.host, uri.port, use_ssl: true) do |http|
    req = Net::HTTP::Post.new(uri,
      'Authorization' => "Bearer #{TOKEN}",
      'Content-Type'  => 'application/json')
    req.body = { key:, bucket: 'my-api', cost: }.to_json

    res = http.request(req)
    return true  if res.code == '200'
    return false if res.code == '429'
    raise "WaitState error: #{res.code}"
  end
end

The bucket field accepts either the bucket ID or its name. Using the name is more readable; using the ID is slightly faster and keeps working if you rename the bucket.

Client Libraries

Official client libraries are in development. They will handle SHA-256 hashing, response header parsing, retry logic, and error propagation automatically. All libraries will be MIT-licensed and open source.

| Language | Package | Status |
| --- | --- | --- |
| Node.js / TypeScript | @waitstate/node | Coming soon |
| Python | waitstate | Coming soon |
| Go | github.com/waitstate/waitstate-go | Coming soon |
| Ruby | waitstate | Coming soon |

Deduct Tokens

POST https://api.waitstate.io/v1/deduct (Auth: DEDUCT_TOKEN)

Atomically deducts tokens from a user's bucket and returns whether the request is allowed.

Request body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| key | string | yes | Opaque user identifier. SHA-256 hash PII before sending. |
| bucket | string | yes | Bucket ID or name. |
| cost | number | no | Tokens to deduct. Defaults to 1. Must be a positive number. |

200 OK

{ "allowed": true }

Response headers are always set. See Response Headers.

429 Too Many Requests

{
  "error": "wait_state_active",
  "message": "Quota exhausted. Required 10 units.",
  "retry_after": 5
}

retry_after is the number of seconds until enough tokens have refilled to allow the same cost. Forward this value to your client as a Retry-After header.
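For background jobs that can afford to wait, retry_after can drive a simple retry loop instead of being forwarded. The sketch below is one possible shape, not an official client: deduct_with_retry and the send callable are hypothetical, and send is assumed to return the status code together with the parsed JSON body.

```python
import time
from typing import Callable, Tuple

# (status_code, parsed_json_body), as produced by whatever HTTP client you use.
Response = Tuple[int, dict]


def deduct_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `send` until it returns 200, waiting `retry_after` seconds after each 429."""
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status == 200:
            return True
        if status != 429:
            raise RuntimeError(f"WaitState error: {status}")
        if attempt < max_attempts:
            sleep(body["retry_after"])
    return False  # still denied after the final attempt
```

For user-facing requests, blocking like this is usually the wrong choice; returning the 429 with a Retry-After header lets the client decide when to try again.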

Response Headers

Every response includes the X-WaitState-* headers below, whether the request was allowed or denied. Retry-After is added only on 429 responses.

| Header | Description |
| --- | --- |
| X-WaitState-Limit | The bucket's capacity. |
| X-WaitState-Remaining | Tokens remaining in this user's bucket after the request. |
| X-WaitState-Reset | Unix timestamp when the bucket will refill to full capacity at the current rate. |
| Retry-After | Seconds until the same cost can be satisfied. Only present on 429 responses. |
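These headers can be collected into a small structure for logging or for forwarding to your own clients. The following is a sketch: RateLimitInfo and parse_headers are hypothetical names, and the header values are assumed to be integers, as in the Quick Start example.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitInfo:
    limit: int
    remaining: int
    reset: int                  # Unix timestamp
    retry_after: Optional[int]  # seconds; None unless the response was a 429


def parse_headers(headers: Mapping[str, str]) -> RateLimitInfo:
    # HTTP header names are case-insensitive, so normalise before the lookup.
    h = {k.lower(): v for k, v in headers.items()}
    retry = h.get("retry-after")
    return RateLimitInfo(
        limit=int(h["x-waitstate-limit"]),
        remaining=int(h["x-waitstate-remaining"]),
        reset=int(h["x-waitstate-reset"]),
        retry_after=int(retry) if retry is not None else None,
    )


info = parse_headers({
    "X-WaitState-Limit": "100",
    "X-WaitState-Remaining": "99",
    "X-WaitState-Reset": "1739832160",
})
print(info.remaining)  # 99
```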

List Buckets

GET https://api.waitstate.io/v1/buckets (Auth: ADMIN_TOKEN)

Returns all buckets, newest first.

[
  {
    "id": "01J8KM2P4XQYZ",
    "name": "graphql-api",
    "capacity": 1000,
    "refill_rate": 10,
    "region": "wnam",
    "log_webhook": null,
    "log_sampling": 1.0,
    "created_at": 1739832000,
    "updated_at": 1739832000
  },
  {
    "id": "01J8KM2P4ABCD",
    "name": "rest-api",
    "capacity": 100,
    "refill_rate": 1,
    "region": null,
    "log_webhook": "https://ingest.example.com/waitstate",
    "log_sampling": 0.1,
    "created_at": 1739831000,
    "updated_at": 1739831000
  }
]

Create Bucket

POST https://api.waitstate.io/v1/buckets (Auth: ADMIN_TOKEN)

Creates a new bucket. Returns 201 Created with the full bucket object.

Request body

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| name | string | yes | - | Human-readable name. Used as an alias in /v1/deduct. |
| capacity | integer | yes | - | Maximum tokens per user (burst size). |
| refill_rate | integer | yes | - | Tokens added per second (sustained rate). |
| region | string | no | null | Pin rate limiters to a region. Immutable after creation. See Regions. |
| log_webhook | string | no | null | HTTPS URL to POST log events to. See Log Delivery. |
| log_sampling | number | no | 1.0 | Fraction of events to log (0.0 to 1.0). See Log Delivery. |
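The field rules above can be checked locally before the request is sent. The sketch below builds, but does not send, the POST request with Python's standard library; create_bucket_request is a hypothetical helper, the region codes are copied from the Regions section, and omitting unset optional fields (rather than sending null) is an assumption.

```python
import json
import urllib.request
from typing import Optional

REGIONS = {"wnam", "enam", "sam", "weur", "eeur", "apac", "oc", "afr", "me"}


def create_bucket_request(
    admin_token: str,
    name: str,
    capacity: int,
    refill_rate: int,
    region: Optional[str] = None,
    log_webhook: Optional[str] = None,
    log_sampling: float = 1.0,
) -> urllib.request.Request:
    """Validate the documented constraints and build the create-bucket request."""
    if region is not None and region not in REGIONS:
        raise ValueError(f"unknown region: {region}")
    if log_webhook is not None and not log_webhook.startswith("https://"):
        raise ValueError("log_webhook must be an HTTPS URL")
    if not 0.0 <= log_sampling <= 1.0:
        raise ValueError("log_sampling must be between 0.0 and 1.0")
    fields = {
        "name": name, "capacity": capacity, "refill_rate": refill_rate,
        "region": region, "log_webhook": log_webhook, "log_sampling": log_sampling,
    }
    # Leave out unset optional fields so the server applies its own defaults.
    payload = {k: v for k, v in fields.items() if v is not None}
    return urllib.request.Request(
        "https://api.waitstate.io/v1/buckets",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {admin_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result to urllib.request.urlopen would send it; a successful call is expected to return 201 Created with the bucket object shown below.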

Bucket object

{
  "id": "01J8KM2P4XQYZ",
  "name": "my-api",
  "capacity": 100,
  "refill_rate": 1,
  "region": "wnam",
  "log_webhook": null,
  "log_sampling": 1.0,
  "created_at": 1739832000,
  "updated_at": 1739832000
}

Get Bucket

GET https://api.waitstate.io/v1/buckets/:id (Auth: ADMIN_TOKEN)

Returns a single bucket by ID. Returns 404 if not found.

Update Bucket

PUT https://api.waitstate.io/v1/buckets/:id (Auth: ADMIN_TOKEN)

Partial update. All fields are optional. region is immutable and cannot be changed after creation. Returns the updated bucket object.

| Field | Type | Description |
| --- | --- | --- |
| name | string | New display name. |
| capacity | integer | New capacity. Takes effect on the next deduct call. |
| refill_rate | integer | New refill rate. Takes effect on the next deduct call. |
| log_webhook | string or null | Set or clear the webhook URL. |
| log_sampling | number | New sampling rate (0.0–1.0). |
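Because the update is partial, the request body should contain only the fields being changed. A minimal sketch of a body builder follows; update_payload is a hypothetical helper, and it rejects region locally since that field is immutable.

```python
from typing import Any

MUTABLE_FIELDS = {"name", "capacity", "refill_rate", "log_webhook", "log_sampling"}


def update_payload(**changes: Any) -> dict:
    """Build the body for PUT /v1/buckets/:id from only the fields being changed."""
    unknown = set(changes) - MUTABLE_FIELDS
    if unknown:
        # `region` ends up here as well: it cannot be changed after creation.
        raise ValueError(f"cannot update: {', '.join(sorted(unknown))}")
    return changes


# Clear the webhook and lower the sampling rate in a single call.
print(update_payload(log_webhook=None, log_sampling=0.1))
```

Passing log_webhook=None is kept in the body on purpose, since an explicit null is how the webhook is cleared.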

Delete Bucket

DELETE https://api.waitstate.io/v1/buckets/:id (Auth: ADMIN_TOKEN)

Deletes the bucket definition. Returns 204 No Content. This does not immediately destroy existing Durable Object state for active users; those will expire naturally.

Regions

By default, WaitState uses Smart Placement to automatically co-locate each rate limiter near your application servers. If your users are concentrated in a specific geography, you can pin a bucket to a region for more predictable latency.

Set region when creating a bucket. It cannot be changed after creation.

| Code | Location |
| --- | --- |
| wnam | Western North America |
| enam | Eastern North America |
| sam | South America |
| weur | Western Europe |
| eeur | Eastern Europe |
| apac | Asia Pacific |
| oc | Oceania |
| afr | Africa |
| me | Middle East |

Tip: If your API is global or you're not sure which region to pick, leave region unset. Smart Placement will handle it.

Log Delivery

WaitState can emit a structured log record for every rate limit check. Logs are useful for auditing, debugging, and building dashboards. Configure logging per bucket.

Webhook delivery

Set log_webhook to an HTTPS URL and WaitState will POST each log record there as JSON. Use this to ship logs to Datadog, Splunk, a custom ingest endpoint, or anywhere that accepts HTTP.

curl -X PUT https://api.waitstate.io/v1/buckets/<id> \
  -H "Authorization: Bearer $WAITSTATE_ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"log_webhook":"https://ingest.example.com/waitstate","log_sampling":0.1}'

The request is fire-and-forget. WaitState does not retry failed webhook calls. If your endpoint is unavailable, those log records are dropped. If no webhook is set, logs are written to internal R2 storage.

Log record format

{
  "ts": 1739832100,
  "bucket_id": "01J8KM2P4XQYZ",
  "key_hash": "a665a45920422f9d417e4867efdc4fb8a04a1f3fff1fa07e998e86f7f7a27ae3",
  "cost": 10,
  "allowed": false,
  "remaining": 0,
  "retry_after": 5
}

| Field | Description |
| --- | --- |
| ts | Unix timestamp of the check. |
| bucket_id | The bucket that was checked. |
| key_hash | SHA-256 hex digest of the raw key. The original key is never stored. |
| cost | Tokens requested. |
| allowed | true if the request was allowed. |
| remaining | Tokens remaining after this check. |
| retry_after | Seconds until retry. Only present when allowed is false. |
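A receiving endpoint for these records can be very small. The sketch below uses only Python's standard library and is an illustration rather than a recommended production setup: parse_log_record, LogHandler, and serve are hypothetical names, the port is arbitrary, and a TLS-terminating proxy is assumed in front of it because log_webhook must be an HTTPS URL.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

REQUIRED = {"ts", "bucket_id", "key_hash", "cost", "allowed", "remaining"}


def parse_log_record(raw: bytes) -> dict:
    """Decode one log record and check that the always-present fields exist."""
    record = json.loads(raw)  # raises ValueError on invalid JSON
    if not isinstance(record, dict):
        raise ValueError("log record must be a JSON object")
    missing = REQUIRED - record.keys()
    if missing:
        raise ValueError(f"log record is missing: {sorted(missing)}")
    return record


class LogHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        try:
            record = parse_log_record(self.rfile.read(length))
        except ValueError:
            self.send_response(400)
            self.end_headers()
            return
        if not record["allowed"]:
            # retry_after is only present on denied checks.
            print(f"denied {record['key_hash'][:8]} retry in {record.get('retry_after')}s")
        self.send_response(204)
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Run the receiver until interrupted."""
    HTTPServer(("0.0.0.0", port), LogHandler).serve_forever()
```

Since delivery is fire-and-forget, the handler should respond quickly and hand any slow processing to a queue.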

Sampling

For high-volume buckets, reduce log volume with log_sampling. Set it to a value between 0.0 and 1.0 representing the fraction of events to log.

| Value | Behavior |
| --- | --- |
| 1.0 | Log every event (default) |
| 0.1 | Log ~10% of events |
| 0.01 | Log ~1% of events |
| 0.0 | Disable logging entirely |

Sampling is applied randomly and independently per request. It is not guaranteed to produce an exact percentage, but converges at volume.
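The convergence can be seen in a quick simulation. This sketch models sampling as an independent random draw per request, which matches the description above but is not taken from WaitState's code; should_log is a hypothetical name.

```python
import random


def should_log(log_sampling: float, rng: random.Random) -> bool:
    """Independent per-request decision: True with probability `log_sampling`."""
    return rng.random() < log_sampling


rng = random.Random(42)  # seeded so repeated runs give the same numbers
for n in (100, 10_000, 100_000):
    logged = sum(should_log(0.1, rng) for _ in range(n))
    print(f"{n:>7} events -> {logged / n:.4f} logged")
```

At small volumes the logged fraction can sit noticeably above or below 0.1; it tightens around the configured value as the event count grows.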

Error Reference

| Status | Meaning |
| --- | --- |
| 400 | The request body is missing required fields, contains invalid values, or is not valid JSON. The response body includes a human-readable error field. |
| 401 | Missing or invalid Authorization header, or the correct token was used on the wrong endpoint (e.g. DEDUCT_TOKEN on a bucket endpoint). |
| 404 | The bucket ID or name in the request does not exist. |
| 429 | The user's token bucket does not have enough tokens to cover the requested cost (error: "wait_state_active"). Check retry_after in the response body and the Retry-After header. |
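Client code often maps these statuses onto distinct exception types so callers can react differently to each. The classes and the raise_for_waitstate function below are hypothetical and only sketch one possible mapping.

```python
class WaitStateError(Exception):
    """Base class for the hypothetical exceptions below."""


class BadRequest(WaitStateError):
    pass


class Unauthorized(WaitStateError):
    pass


class BucketNotFound(WaitStateError):
    pass


class RateLimited(WaitStateError):
    def __init__(self, retry_after: int) -> None:
        super().__init__(f"rate limited, retry after {retry_after}s")
        self.retry_after = retry_after


def raise_for_waitstate(status: int, body: dict) -> None:
    """Translate a response into an exception; return None on success."""
    if 200 <= status < 300:
        return
    if status == 429:
        raise RateLimited(body["retry_after"])
    errors = {400: BadRequest, 401: Unauthorized, 404: BucketNotFound}
    exc = errors.get(status, WaitStateError)
    raise exc(body.get("error", f"unexpected status {status}"))
```

A caller can then catch RateLimited to set a Retry-After header on its own response, while treating Unauthorized and BucketNotFound as configuration problems.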
