Quick Start
Get your first rate limit check working in two steps.
1. Create a bucket
A bucket defines your rate limit policy. Use your ADMIN_TOKEN to create one. You only need to do this once.
curl -X POST https://api.waitstate.io/v1/buckets \
-H "Authorization: Bearer $WAITSTATE_ADMIN_TOKEN" \
-H "Content-Type: application/json" \
-d '{"name":"my-api","capacity":100,"refill_rate":1}'

Response:
{
"id": "01J8KM2P4XQYZ",
"name": "my-api",
"capacity": 100,
"refill_rate": 1,
"region": null,
"log_webhook": null,
"log_sampling": 1.0,
"created_at": 1739832000,
"updated_at": 1739832000
}

2. Check a limit
Call POST /v1/deduct before each operation you want to rate limit. Use your DEDUCT_TOKEN.
curl -X POST https://api.waitstate.io/v1/deduct \
-H "Authorization: Bearer $WAITSTATE_DEDUCT_TOKEN" \
-H "Content-Type: application/json" \
-d '{"key":"user_123","bucket":"my-api","cost":1}'

Allowed (200):
HTTP/1.1 200 OK
X-WaitState-Limit: 100
X-WaitState-Remaining: 99
X-WaitState-Reset: 1739832160
{"allowed":true}

Denied (429):
HTTP/1.1 429 Too Many Requests
Retry-After: 5
{"error":"wait_state_active","message":"Quota exhausted. Required 1 units.","retry_after":5}

For guidance on choosing the key value, see Keys below.

Core Concepts
Buckets
A bucket defines your rate limit policy for a given operation. It has two parameters:
capacity: the maximum number of tokens a user can accumulate (controls burst size)
refill_rate: tokens added per second (controls sustained throughput)
Example: capacity: 100, refill_rate: 1 allows a burst of up to 100 requests, then sustains 1 request per second. Each user gets their own isolated token bucket; the policy applies per user, not globally.
Create buckets once with your admin token. You can update capacity, refill_rate, and logging config at any time. The region field is set at creation and cannot be changed.
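The interaction between capacity and refill_rate can be sketched as a classic token bucket. The class below is an illustrative model of the documented behavior, not WaitState's actual implementation:

```python
import time

class TokenBucket:
    def __init__(self, capacity: int, refill_rate: int):
        self.capacity = capacity
        self.refill_rate = refill_rate   # tokens per second
        self.tokens = float(capacity)    # buckets start full
        self.last = time.monotonic()

    def deduct(self, cost: int = 1) -> bool:
        now = time.monotonic()
        # Refill continuously since the last check, capped at capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

bucket = TokenBucket(capacity=100, refill_rate=1)
print(all(bucket.deduct() for _ in range(100)))  # True: burst of 100 allowed
print(bucket.deduct())                           # False: 101st immediate request denied
```

With capacity: 1 the same model gives the "no burst" behavior from the patterns table: one request, then a one-second wait.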
Keys
The key field identifies the entity being rate limited, typically a user ID or API key. WaitState creates an isolated token bucket for every unique key + bucket pair. There is no limit to the number of keys per bucket.
Hash before sending:
Shell:
echo -n "user@example.com" | sha256sum | awk '{print $1}'
# prints a 64-character hex digest

Node.js:
import { createHash } from 'node:crypto'
const key = createHash('sha256').update(userId).digest('hex')

Python:
import hashlib
key = hashlib.sha256(user_id.encode()).hexdigest()

Go:
import (
    "crypto/sha256"
    "encoding/hex"
)
h := sha256.Sum256([]byte(userID))
key := hex.EncodeToString(h[:])

Ruby:
require 'digest'
key = Digest::SHA256.hexdigest(user_id)

Cost
The cost field is the number of tokens to deduct for this request. You calculate it client-side; WaitState never sees your request payload or protocol. This makes it work equally well with REST, GraphQL, gRPC, WebSockets, or any other protocol.
| Use case | Cost formula |
|---|---|
| Simple REST API | cost: 1 per request |
| GraphQL (complexity limiting) | cost: queryComplexity |
| File uploads | cost: Math.ceil(bytes / 1024) |
| LLM token budgets | cost: inputTokens + outputTokens |
If cost is omitted it defaults to 1. Cost must be a positive number.
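The cost formulas in the table are ordinary client-side arithmetic. A quick sketch (the helper names are illustrative, not part of any WaitState SDK):

```python
import math

def upload_cost(size_bytes: int) -> int:
    # One token per KiB started, matching ceil(bytes / 1024) from the table.
    return math.ceil(size_bytes / 1024)

def llm_cost(input_tokens: int, output_tokens: int) -> int:
    # Total tokens consumed by the request and the response.
    return input_tokens + output_tokens

print(upload_cost(4096))    # 4
print(upload_cost(4097))    # 5 (a partial KiB still costs a full token)
print(llm_cost(1200, 800))  # 2000
```

Whatever the formula, the result is simply passed as cost in the /v1/deduct body.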
Common rate limit patterns
Set capacity and refill_rate on the bucket, then pass cost: 1 per request. The math: refill_rate is tokens per second, so multiply by 60 to get tokens per minute.
| Target limit | capacity | refill_rate | Notes |
|---|---|---|---|
| 60 req/min with burst | 60 | 1 | Users can burn 60 tokens instantly, then get 1/sec (60/min) sustained. |
| 60 req/min, no burst | 1 | 1 | Capacity = 1 means no accumulation. Strict 1 request per second. |
| 10 req/sec | 10 | 10 | 10 tokens burst, 10 refilled per second. Sustained 600/min. |
| 100 req/sec with headroom | 500 | 100 | Allows a 5-second burst before throttling to 100/sec sustained. |
| GraphQL complexity budget | 5000 | 83 | ~5,000 complexity points per minute. Pass query complexity as cost. |
| LLM token budget | 100000 | 1666 | ~100k tokens/min sustained. Pass inputTokens + outputTokens as cost. |
For a target of N requests per minute, set refill_rate = N / 60 (rounded to the nearest integer) and set capacity to however much burst you want to allow.

Authentication
WaitState uses two separate bearer tokens, intentionally scoped to different operations: if your production environment is compromised, the DEDUCT_TOKEN it holds cannot modify your bucket configuration.
| Token | Endpoints | Where to use it |
|---|---|---|
ADMIN_TOKEN | /v1/buckets/* | CI/CD pipelines, scripts. Never embed in client-facing code. |
DEDUCT_TOKEN | /v1/deduct | Your application servers. Embed as an env var. |
Pass either token as a standard HTTP bearer:
Authorization: Bearer <your-token>

All requests without a valid token return 401 Unauthorized. Tokens sent to the wrong endpoint also return 401. The DEDUCT_TOKEN cannot access bucket CRUD and vice versa.
Integration Examples
These examples show how to call /v1/deduct from common server-side runtimes. The pattern is always the same: hash the user ID, POST to WaitState, branch on the status code.
Shell:
# Hash your user ID before sending -- never send raw PII
KEY=$(echo -n "user_123" | sha256sum | awk '{print $1}')
curl -X POST https://api.waitstate.io/v1/deduct \
-H "Authorization: Bearer $WAITSTATE_DEDUCT_TOKEN" \
-H "Content-Type: application/json" \
-d "{\"key\":\"$KEY\",\"bucket\":\"my-api\",\"cost\":1}"
# 200 OK -> {"allowed":true}
# 429 -> {"error":"wait_state_active","retry_after":5}

Node.js:
import { createHash } from 'node:crypto'
const TOKEN = process.env.WAITSTATE_DEDUCT_TOKEN!
function sha256(s: string) {
return createHash('sha256').update(s).digest('hex')
}
export async function allow(userId: string, cost = 1): Promise<boolean> {
const res = await fetch('https://api.waitstate.io/v1/deduct', {
method: 'POST',
headers: {
Authorization: `Bearer ${TOKEN}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({ key: sha256(userId), bucket: 'my-api', cost }),
})
if (res.ok) return true
if (res.status === 429) {
const { retry_after } = await res.json() as { retry_after: number }
// forward to caller: reply.header('Retry-After', String(retry_after))
return false
}
throw new Error(`WaitState error: ${res.status}`)
}

Python:
import hashlib, os
import httpx
TOKEN = os.environ['WAITSTATE_DEDUCT_TOKEN']
def sha256(s: str) -> str:
return hashlib.sha256(s.encode()).hexdigest()
def allow(user_id: str, cost: int = 1) -> bool:
r = httpx.post(
'https://api.waitstate.io/v1/deduct',
headers={'Authorization': f'Bearer {TOKEN}'},
json={'key': sha256(user_id), 'bucket': 'my-api', 'cost': cost},
)
if r.status_code == 200:
return True
if r.status_code == 429:
retry_after = r.json()['retry_after']
# raise RateLimitError(retry_after=retry_after)
return False
r.raise_for_status()

Go:
package ratelimit
import (
"bytes"
"crypto/sha256"
"encoding/hex"
"encoding/json"
"fmt"
"net/http"
"os"
)
var token = os.Getenv("WAITSTATE_DEDUCT_TOKEN")
type Result struct {
Allowed bool
RetryAfter int
}
func Allow(userID string, cost int) (Result, error) {
h := sha256.Sum256([]byte(userID))
body, _ := json.Marshal(map[string]any{
"key": hex.EncodeToString(h[:]),
"bucket": "my-api",
"cost": cost,
})
req, _ := http.NewRequest(http.MethodPost,
"https://api.waitstate.io/v1/deduct",
bytes.NewReader(body))
req.Header.Set("Authorization", "Bearer "+token)
req.Header.Set("Content-Type", "application/json")
resp, err := http.DefaultClient.Do(req)
if err != nil {
return Result{}, err
}
defer resp.Body.Close()
switch resp.StatusCode {
case 200:
return Result{Allowed: true}, nil
case 429:
var r struct {
RetryAfter int `json:"retry_after"`
}
json.NewDecoder(resp.Body).Decode(&r)
return Result{Allowed: false, RetryAfter: r.RetryAfter}, nil
default:
return Result{}, fmt.Errorf("waitstate: status %d", resp.StatusCode)
}
}

Ruby:
require 'digest'
require 'json'
require 'net/http'
TOKEN = ENV.fetch('WAITSTATE_DEDUCT_TOKEN')
def allow(user_id, cost: 1)
key = Digest::SHA256.hexdigest(user_id)
uri = URI('https://api.waitstate.io/v1/deduct')
Net::HTTP.start(uri.host, uri.port, use_ssl: true) do |http|
req = Net::HTTP::Post.new(uri,
'Authorization' => "Bearer #{TOKEN}",
'Content-Type' => 'application/json')
req.body = { key:, bucket: 'my-api', cost: }.to_json
res = http.request(req)
return true if res.code == '200'
return false if res.code == '429'
raise "WaitState error: #{res.code}"
end
end

The bucket field accepts either the bucket ID or name. Using the name is more readable; using the ID is slightly faster and avoids breaking if you rename a bucket.
Client Libraries
Official client libraries are in development. They will handle SHA-256 hashing, response header parsing, retry logic, and error propagation automatically. All libraries will be MIT-licensed and open source.
| Language | Package | Status |
|---|---|---|
| Node.js / TypeScript | @waitstate/node | Coming soon |
| Python | waitstate | Coming soon |
| Go | github.com/waitstate/waitstate-go | Coming soon |
| Ruby | waitstate | Coming soon |
Deduct Tokens
Atomically deducts tokens from a user's bucket and returns whether the request is allowed.
Request body
| Field | Type | Required | Description |
|---|---|---|---|
key | string | yes | Opaque user identifier. SHA-256 hash PII before sending. |
bucket | string | yes | Bucket ID or name. |
cost | number | no | Tokens to deduct. Defaults to 1. Must be a positive number. |
200 OK
{ "allowed": true }

Response headers are always set. See Response Headers.
429 Too Many Requests
{
"error": "wait_state_active",
"message": "Quota exhausted. Required 10 units.",
"retry_after": 5
}

retry_after is the number of seconds until enough tokens have refilled to allow the same cost. Forward this value to your client as a Retry-After header.
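A common client-side pattern is to sleep for retry_after and try again. A minimal sketch, where check stands in for the actual POST /v1/deduct call (swap in your HTTP client of choice):

```python
import time

def deduct_with_wait(check, cost: int = 1, max_attempts: int = 5) -> bool:
    """Retry a deduct call, honoring retry_after between attempts."""
    for _ in range(max_attempts):
        status, body = check(cost)  # e.g. (200, {"allowed": True}) or (429, {...})
        if status == 200:
            return True
        if status == 429:
            # The server says exactly how long to wait before the same
            # cost can be satisfied -- no guessing at backoff intervals.
            time.sleep(body["retry_after"])
            continue
        raise RuntimeError(f"WaitState error: {status}")
    return False
```

This suits background jobs; for interactive requests, return the 429 to the caller instead of sleeping server-side.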
Response Headers
Every response includes these headers, regardless of whether the request was allowed or denied.
| Header | Description |
|---|---|
X-WaitState-Limit | The bucket's capacity. |
X-WaitState-Remaining | Tokens remaining in this user's bucket after the request. |
X-WaitState-Reset | Unix timestamp when the bucket will refill to full capacity at the current rate. |
Retry-After | Seconds until the same cost can be satisfied. Only present on 429 responses. |
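Since WaitState already computes these values, a simple approach is to copy them onto your own API's responses so clients can self-throttle. A sketch, where set_header stands in for your web framework's response API:

```python
# Headers worth forwarding from the WaitState response to your own clients.
FORWARD = (
    "X-WaitState-Limit",
    "X-WaitState-Remaining",
    "X-WaitState-Reset",
    "Retry-After",  # only present on 429 responses
)

def forward_headers(upstream_headers: dict, set_header) -> None:
    for name in FORWARD:
        if name in upstream_headers:
            set_header(name, upstream_headers[name])
```

For example, with a dict-like response object you might call forward_headers(dict(resp.headers), reply.header).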
List Buckets
Returns all buckets, newest first.
[
{
"id": "01J8KM2P4XQYZ",
"name": "graphql-api",
"capacity": 1000,
"refill_rate": 10,
"region": "wnam",
"log_webhook": null,
"log_sampling": 1.0,
"created_at": 1739832000,
"updated_at": 1739832000
},
{
"id": "01J8KM2P4ABCD",
"name": "rest-api",
"capacity": 100,
"refill_rate": 1,
"region": null,
"log_webhook": "https://ingest.example.com/waitstate",
"log_sampling": 0.1,
"created_at": 1739831000,
"updated_at": 1739831000
}
]

Create Bucket
Creates a new bucket. Returns 201 Created with the full bucket object.
Request body
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
name | string | yes | - | Human-readable name. Used as an alias in /v1/deduct. |
capacity | integer | yes | - | Maximum tokens per user (burst size). |
refill_rate | integer | yes | - | Tokens added per second (sustained rate). |
region | string | no | null | Pin rate limiters to a region. Immutable after creation. See Regions. |
log_webhook | string | no | null | HTTPS URL to POST log events to. See Log Delivery. |
log_sampling | number | no | 1.0 | Fraction of events to log (0.0 to 1.0). See Log Delivery. |
Bucket object
{
"id": "01J8KM2P4XQYZ",
"name": "my-api",
"capacity": 100,
"refill_rate": 1,
"region": "wnam",
"log_webhook": null,
"log_sampling": 1.0,
"created_at": 1739832000,
"updated_at": 1739832000
}

Get Bucket
Returns a single bucket by ID. Returns 404 if not found.
Update Bucket
Partial update. All fields are optional. region is immutable and cannot be changed after creation. Returns the updated bucket object.
| Field | Type | Description |
|---|---|---|
name | string | New display name. |
capacity | integer | New capacity. Takes effect on the next deduct call. |
refill_rate | integer | New refill rate. Takes effect on the next deduct call. |
log_webhook | string or null | Set or clear the webhook URL. |
log_sampling | number | New sampling rate (0.0–1.0). |
Delete Bucket
Deletes the bucket definition. Returns 204 No Content. This does not immediately destroy existing Durable Object state for active users; those will expire naturally.
Regions
By default, WaitState uses Smart Placement to automatically co-locate each rate limiter near your application servers. If your users are concentrated in a specific geography, you can pin a bucket to a region for more predictable latency.
Set region when creating a bucket. It cannot be changed after creation.
| Code | Location |
|---|---|
wnam | Western North America |
enam | Eastern North America |
sam | South America |
weur | Western Europe |
eeur | Eastern Europe |
apac | Asia Pacific |
oc | Oceania |
afr | Africa |
me | Middle East |
If you are unsure, leave region unset. Smart Placement will handle it.

Log Delivery
WaitState can emit a structured log record for every rate limit check. Logs are useful for auditing, debugging, and building dashboards. Configure logging per bucket.
Webhook delivery
Set log_webhook to an HTTPS URL and WaitState will POST each log record there as JSON. Use this to ship logs to Datadog, Splunk, a custom ingest endpoint, or anywhere that accepts HTTP.
curl -X PUT https://api.waitstate.io/v1/buckets/<id> \
-H "Authorization: Bearer $WAITSTATE_ADMIN_TOKEN" \
-H "Content-Type: application/json" \
-d '{"log_webhook":"https://ingest.example.com/waitstate","log_sampling":0.1}'

The request is fire-and-forget. WaitState does not retry failed webhook calls. If your endpoint is unavailable, those log records are dropped. If no webhook is set, logs are written to internal R2 storage.
Log record format
{
"ts": 1739832100,
"bucket_id": "01J8KM2P4XQYZ",
"key_hash": "a665a45920422f9d417e4867efdc4fb8a04a1f3fff1fa07e998e86f7f7a27ae3",
"cost": 10,
"allowed": false,
"remaining": 0,
"retry_after": 5
}

| Field | Description |
|---|---|
ts | Unix timestamp of the check. |
bucket_id | The bucket that was checked. |
key_hash | SHA-256 hex digest of the raw key. The original key is never stored. |
cost | Tokens requested. |
allowed | true if the request was allowed. |
remaining | Tokens remaining after this check. |
retry_after | Seconds until retry. Only present when allowed is false. |
Sampling
For high-volume buckets, reduce log volume with log_sampling. Set it to a value between 0.0 and 1.0 representing the fraction of events to log.
| Value | Behavior |
|---|---|
1.0 | Log every event (default) |
0.1 | Log ~10% of events |
0.01 | Log ~1% of events |
0.0 | Disable logging entirely |
Sampling is applied randomly and independently per request. It is not guaranteed to produce an exact percentage, but converges at volume.
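The behavior described above amounts to an independent coin flip per check. This is a model of the documented semantics, not WaitState's internals:

```python
import random

def should_log(log_sampling: float) -> bool:
    # Each check is logged independently with probability log_sampling.
    # 1.0 always logs; 0.0 never does.
    return random.random() < log_sampling

random.seed(1)
rate = sum(should_log(0.1) for _ in range(100_000)) / 100_000
print(rate)  # approximately 0.1 -- not exact, but converges at volume
```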
Error Reference
| Status | Meaning |
|---|---|
400 | The request body is missing required fields, contains invalid values, or is not valid JSON. The response body includes a human-readable error field. |
401 | Missing or invalid Authorization header, or the correct token was used on the wrong endpoint (e.g. DEDUCT_TOKEN on a bucket endpoint). |
404 | The bucket ID or name in the request does not exist. |
429 | The user's token bucket does not have enough tokens to cover the requested cost (error: "wait_state_active"). Check retry_after in the response body and the Retry-After header. |