Quick Start

Working enforcement in under 10 minutes.

This guide runs entirely locally using the open source enforcement engine. You'll define an AI pricing policy, create customers, enforce token limits across multiple entitlements, handle an event, and see exactly what happens when a limit is hit.

By the end, you'll understand what Limitr does well enough to integrate it into your own stack — and what Limitr Cloud adds on top.

Prerequisites
  • Bun installed (curl -fsSL https://bun.sh/install | bash)
  • Node.js also works — replace bun run with npx tsx throughout

Install

mkdir limitr-quickstart && cd limitr-quickstart
bun init -y
bun add @formata/limitr

Define your policy

Create a file called policy.json. This policy models a real AI product: two plans, an AI model with separate input/output token credits, and an exchange table that maps customer-facing credit grants onto the underlying model credits.

policy.json
{
  "policy": {
    "plans": {
      "starter": {
        "label": "Starter Plan",
        "default": true,
        "entitlements": {
          "ai-chat-access": {
            "description": "Access entitlement to the AI chat feature"
          },
          "ai-chat-input": {
            "description": "AI chat input tokens (hard limit)",
            "limit": { "credit": "claude-sonnet-4-input", "value": 500000, "resets": true, "reset_inc": "1day" }
          },
          "ai-chat-output": {
            "description": "AI chat output tokens (hard limit)",
            "limit": { "credit": "claude-sonnet-4-output", "value": 500000, "resets": true, "reset_inc": "1day" }
          }
        }
      },
      "growth": {
        "label": "Growth Plan",
        "entitlements": {
          "ai-chat-access": {
            "description": "Access entitlement to the AI chat feature"
          },
          "ai-chat-input": {
            "description": "AI chat input tokens (soft limit with overage cost)",
            "limit": { "mode": "soft", "credit": "claude-sonnet-4-input", "value": 700000, "resets": true, "reset_inc": "1day" }
          },
          "ai-chat-output": {
            "description": "AI chat output tokens (soft limit with overage cost)",
            "limit": { "mode": "soft", "credit": "claude-sonnet-4-output", "value": 400000, "resets": true, "reset_inc": "1day" }
          }
        },
        "topups": {
          "ai-token-small": {
            "description": "Grant one-time 10 AI tokens to a customer",
            "credit": "ai-token",
            "value": 10
          }
        }
      }
    },
    "credits": {
      "claude-sonnet-4-input": {
        "description": "Input tokens for Claude Sonnet 4",
        "overhead_cost": 0.000003,
        "stof_units": "int",
        "resets": true,
        "pricing_model": "flat",
        "price": { "amount": 0.000004 }
      },
      "claude-sonnet-4-output": {
        "description": "Output tokens for Claude Sonnet 4",
        "overhead_cost": 0.000015,
        "stof_units": "int",
        "resets": true,
        "pricing_model": "flat",
        "price": { "amount": 0.00002 }
      },
      "ai-token": {
        "description": "Abstract user-facing credit for all AI tokens",
        "stof_units": "int"
      }
    },
    "exchange": {
      "rune": { "value": 1, "currency": "usd" },
      "ai-token": { "value": 1.25, "currency": "rune" },
      "claude-sonnet-4-input": { "value": 0.000004, "currency": "ai-token" },
      "claude-sonnet-4-output": { "value": 0.00002, "currency": "ai-token" }
    }
  }
}

What this defines:

  • Two discrete credits — one for input tokens, one for output — each with a real overhead_cost and price attached. This is how Limitr computes margin per customer.
  • One abstract credit (ai-token) — the customer-facing unit, mapped to discrete model credits through the exchange table. Used for grants and top-ups.
  • Two plans with daily resets. Starter limits are hard — requests are blocked at the limit. Growth limits are soft — the included allocation is enforced, then overage is billed per the credit price.
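The hard/soft distinction can be sketched as a pure decision function. This is an illustrative model of the semantics described above, not the engine's actual implementation:

```typescript
// Illustrative model of hard vs. soft limit semantics — not Limitr's
// actual implementation.
type Mode = 'hard' | 'soft';

interface Decision {
  allowed: boolean;
  overage: number; // units consumed beyond the included allocation (soft only)
}

function check(mode: Mode, used: number, requested: number, limit: number): Decision {
  const wouldUse = used + requested;
  if (wouldUse <= limit) return { allowed: true, overage: 0 };
  if (mode === 'hard') return { allowed: false, overage: 0 }; // blocked at the limit
  // Soft: the request goes through; the excess is metered and billed as overage.
  return { allowed: true, overage: wouldUse - limit };
}

console.log(check('hard', 450_000, 90_000, 500_000)); // denied, no overage
console.log(check('soft', 650_000, 90_000, 700_000)); // allowed, 40,000 overage
```

The same request that a starter (hard) plan denies, a growth (soft) plan allows and bills.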

Load the policy and create customers

Create index.ts:

index.ts
import { Limitr } from '@formata/limitr';
import { readFileSync } from 'fs';

const policyDoc = readFileSync('./policy.json', 'utf-8');
const policy = await Limitr.new(policyDoc, 'json');

await policy.createCustomer('user_starter', 'starter');
await policy.createCustomer('user_growth', 'growth');

console.log('Policy loaded. Customers created.');
console.log('Starter limit:', await policy.limit('user_starter', 'ai-chat-input'));
console.log('Growth limit: ', await policy.limit('user_growth', 'ai-chat-input'));

bun run index.ts
Policy loaded. Customers created.
Starter limit: 500000
Growth limit: 700000

Enforce a limit

Replace the contents of index.ts. This simulates a customer consuming tokens across multiple requests until they hit their hard limit.

index.ts
import { Limitr } from '@formata/limitr';
import { readFileSync } from 'fs';

const policy = await Limitr.new(readFileSync('./policy.json', 'utf-8'), 'json');
await policy.createCustomer('user_starter', 'starter');

const requests = [120000, 95000, 180000, 60000, 90000];

for (const tokens of requests) {
  if (await policy.allow('user_starter', 'ai-chat-access')) {
    if (await policy.allow('user_starter', 'ai-chat-input', tokens)) {
      await policy.allow('user_starter', 'ai-chat-output', tokens / 2.2);

      const remaining = await policy.remaining('user_starter', 'ai-chat-input') ?? 0;
      const output = await policy.value('user_starter', 'ai-chat-output') ?? 0;
      console.log(`✓ Allowed ${tokens.toLocaleString()} input tokens — ${remaining.toLocaleString()} remaining, ${output.toLocaleString()} output tokens used`);
    } else {
      const remaining = await policy.remaining('user_starter', 'ai-chat-input') ?? 0;
      console.log(`✗ Denied ${tokens.toLocaleString()} tokens — limit exceeded (${remaining.toLocaleString()} remaining)`);
    }
  } else {
    console.log(`no ai-chat-access on this plan`);
  }
}

✓ Allowed 120,000 input tokens — 380,000 remaining, 54,545 output tokens used
✓ Allowed 95,000 input tokens — 285,000 remaining, 97,726 output tokens used
✓ Allowed 180,000 input tokens — 105,000 remaining, 179,544 output tokens used
✓ Allowed 60,000 input tokens — 45,000 remaining, 206,816 output tokens used
✗ Denied 90,000 tokens — limit exceeded (45,000 remaining)

The fifth request is blocked before it reaches your LLM provider. No cost is incurred for that request.
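In an application, this guard sits directly in front of the provider call. A minimal sketch of the pattern follows; the `PolicyLike` and `LlmClient` interfaces are hypothetical stand-ins for a loaded Limitr instance and your provider SDK, not real APIs:

```typescript
// Check-then-call guard: a denied request never reaches the provider,
// so no cost is incurred. PolicyLike and LlmClient are hypothetical
// stand-ins, not Limitr or provider APIs.
interface PolicyLike {
  allow(customer: string, entitlement: string, amount?: number): Promise<boolean>;
}
interface LlmClient {
  complete(prompt: string): Promise<{ text: string; outputTokens: number }>;
}

async function guardedChat(
  policy: PolicyLike,
  llm: LlmClient,
  customer: string,
  prompt: string,
  estInputTokens: number,
): Promise<{ ok: boolean; text?: string }> {
  // Deny up front — the provider is never called for a blocked request.
  if (!(await policy.allow(customer, 'ai-chat-input', estInputTokens))) {
    return { ok: false };
  }
  const reply = await llm.complete(prompt);
  // Meter the actual output tokens once the call completes.
  await policy.allow(customer, 'ai-chat-output', reply.outputTokens);
  return { ok: true, text: reply.text };
}
```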


Handle an event

Limitr fires events in-process that you can subscribe to and handle however your application needs. The meter-limit event fires whenever a customer hits a hard limit, with the full context available.

index.ts
import { Limitr } from '@formata/limitr';
import { readFileSync } from 'fs';

const policy = await Limitr.new(readFileSync('./policy.json', 'utf-8'), 'json');
await policy.createCustomer('user_starter', 'starter');

policy.addHandler('limit-reached-handler', (key: string, value: unknown) => {
  if (key === 'meter-limit') {
    const record = JSON.parse(value as string);
    const { meter, customer, plan, entitlement, credit } = record;

    console.log(`
Customer ${customer.id} hit their limit for ${entitlement}.
Used: ${meter.value} tokens. Total: ${meter.limit}. Attempted: ${meter.invalid}.
Plan: ${plan}. Credit: ${credit.description}.
`);
  }
});

const requests = [120000, 95000, 180000, 60000, 90000];
for (const tokens of requests) {
  await policy.allow('user_starter', 'ai-chat-input', tokens);
}

Customer user_starter hit their limit for ai-chat-input.
Used: 455000 tokens. Total: 500000. Attempted: 545000.
Plan: starter. Credit: Input tokens for Claude Sonnet 4.
tip

You can define additional event handlers, and embed notification conditions directly in your policy. In Limitr Cloud, these route automatically to Slack or email in real time — no additional code required.


Read customer state

At any point you can read the full state of a customer — useful for driving usage meters, plan badges, upgrade prompts, or any customer-facing UI.

const customer = await policy.customer('user_starter');
console.log(JSON.stringify(customer, null, 2));

// Or read individual values
const pct = await policy.value('user_starter', 'ai-chat-input', true);
console.log(`Usage: ${pct}% of daily token budget`);
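A percentage like this can drive customer-facing UI directly. For example, a small text meter (a hypothetical helper, not part of the Limitr API):

```typescript
// Hypothetical UI helper — not part of the Limitr API. Renders a usage
// percentage as a fixed-width text meter for a CLI, dashboard, or plan badge.
function usageMeter(pct: number, width = 20): string {
  const clamped = Math.min(100, Math.max(0, pct));
  const filled = Math.round((clamped / 100) * width);
  return `[${'#'.repeat(filled)}${'-'.repeat(width - filled)}] ${clamped.toFixed(0)}%`;
}

console.log(usageMeter(91)); // [##################--] 91%
```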

The policy itself is also serializable, which is useful for driving pricing UI components directly:

const obj = policy.doc.record(); // JS Record
const json = policy.doc.stringify('json');
const yaml = policy.doc.stringify('yaml');
const toml = policy.doc.stringify('toml');

Apply a credit grant

Credits can be granted to customers on a one-time or recurring basis. They're applied toward entitlements with soft limits when the included allocation runs out — drawing down from the grant before overage events start.

index.ts
import { Limitr } from '@formata/limitr';
import { readFileSync } from 'fs';

const policy = await Limitr.new(readFileSync('./policy.json', 'utf-8'), 'json');
await policy.createCustomer('user_growth', 'growth');

// Grant 10 abstract AI tokens to this customer
await policy.applyCustomerTopup('user_growth', 'ai-token-small');

// 700,000 included, then the grant covers overage (via exchange table)
const requests = [520000, 98000, 200000, 60000, 900000];
for (const tokens of requests) {
  await policy.allow('user_growth', 'ai-chat-input', tokens);
}

const customer = await policy.customer('user_growth');
console.log(customer!.meters);

for (const gr of Object.values(customer!.grants)) {
  const grant = gr as any;
  if (grant.topup === 'ai-token-small') {
    console.log('Grant starting value:', grant.starting_value);
    console.log('Grant current value: ', Math.round(grant.value * 100) / 100);
  }
}

{
  "ai-chat-input": {
    credit: "claude-sonnet-4-input",
    started: 1776962288909,
    value: 1778000,
  },
  "ai-chat-output": {
    credit: "claude-sonnet-4-output",
    started: 1776962288909,
    value: 0,
  },
}
Grant starting value: 10
Grant current value: 5.69
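The grant's ending value follows directly from the exchange table. Total input usage is 1,778,000 tokens; 700,000 are included, so 1,078,000 are overage, and each input token is worth 0.000004 ai-token. As a quick check of that arithmetic:

```typescript
// Reproduce the grant drawdown using the exchange table values from policy.json.
const totalInput = 520_000 + 98_000 + 200_000 + 60_000 + 900_000; // 1,778,000
const included = 700_000;              // growth plan's soft-limit allocation
const rate = 0.000004;                 // ai-token per claude-sonnet-4-input token
const overage = totalInput - included; // 1,078,000 tokens drawn against the grant
const remaining = 10 - overage * rate; // grant started at 10 ai-tokens
console.log(Math.round(remaining * 100) / 100); // 5.69
```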

What you've built

In about 10 minutes, you've got a working policy that:

  • Models real AI model costs with discrete input/output credits
  • Enforces hard and soft limits per customer, per plan
  • Blocks requests before cost is incurred
  • Meters usage automatically on every check
  • Fires in-process events when limits are hit
  • Supports credit grants that drain before overage starts
  • Exposes full customer state at any time

The enforcement engine runs entirely in your process — no network calls on the hot path, no external dependencies at runtime.


What Limitr Cloud adds

Everything above runs locally with no Limitr account. When you're ready for production, Cloud adds:

Policy management without deploys. Your policy lives in the Cloud dashboard, versioned and auditable. Any authorized team member can update limits, add plans, or adjust overage rules without a PR or a deploy. Changes take effect immediately.

Live per-customer margin. Because your credit costs are defined in the policy, Limitr knows what each enforcement check costs you in real terms. Cloud surfaces this as cost-to-serve versus captured revenue per customer — not at month-end, right now.

Usage-based invoicing. Generated directly from metered consumption. What your customers used is what they owe.

Team alerting. Notification conditions defined in your policy route automatically to Slack, email, or webhooks in real time.

Switching to Cloud is a one-line change:

// Local (open source)
const policy = await Limitr.new(policyDoc, 'json');

// Cloud — everything else stays the same
const policy = await Limitr.cloud({ token });

Get started with Limitr Cloud →