API / MCP Call Limits

Per-call billing and rate limiting.

Per-plan rate limiting with daily and monthly reset windows, soft limits for overage billing on high-volume plans, per-endpoint entitlements for granular control, and a pattern for per-API-key customer resolution.


Policy

policy:
  credits:
    api_call:
      description: API call
      overhead_cost: 0.000008 # infrastructure cost per call
      pricing_model: flat
      price: { amount: 0.00001 } # what you charge per call in overage
      stof_units: int
      resets: true

    # Separate credit for expensive endpoints — different cost/price profile
    export_call:
      description: Export/batch API call
      overhead_cost: 0.0002
      pricing_model: flat
      price: { amount: 0.0004 }
      stof_units: int
      resets: true

  plans:
    free:
      label: Free
      period: monthly
      default: true
      entitlements:
        api_access:
          description: API access

        api_calls_daily:
          description: API calls — hard limit, 100/day
          limit: { credit: api_call, mode: hard, value: 100, resets: true, reset_inc: 1day }

        api_calls_monthly:
          description: API calls — hard limit, 1000/month
          limit: { credit: api_call, mode: hard, value: 1000, resets: true, reset_inc: 30days }

        # No export_calls on free — check() returns false, no entitlement exists

    pro:
      label: Pro
      period: monthly
      entitlements:
        api_access:
          description: API access

        api_calls_daily:
          description: API calls — hard limit, 5000/day
          limit: { credit: api_call, mode: hard, value: 5000, resets: true, reset_inc: 1day }

        api_calls_monthly:
          description: API calls — soft limit, 50000/month, overage billed
          limit: { credit: api_call, mode: soft, value: 50000, resets: true, reset_inc: 30days }

        export_calls:
          description: Export/batch API calls — hard limit, 500/month
          limit: { credit: export_call, mode: hard, value: 500, resets: true, reset_inc: 30days }

    enterprise:
      label: Enterprise
      period: monthly
      entitlements:
        api_access:
          description: API access

        api_calls_daily:
          description: API calls — observe, no daily cap, metered for reporting
          limit: { credit: api_call, mode: observe, resets: true, reset_inc: 1day }

        api_calls_monthly:
          description: API calls — soft limit, 500000/month, overage billed
          limit: { credit: api_call, mode: soft, value: 500000, resets: true, reset_inc: 30days }

        export_calls:
          description: Export/batch API calls — soft limit, overage billed
          limit: { credit: export_call, mode: soft, value: 5000, resets: true, reset_inc: 30days }

Integration

import { Limitr } from '@formata/limitr';
import { readFileSync } from 'fs';

const policy = await Limitr.new(readFileSync('./policy.yaml', 'utf-8'), 'yaml');

// Forward overage events to billing. `billing` is your application's own
// billing client; it is not defined in this snippet.
policy.addHandler('billing', (key: string, value: unknown) => {
  if (key === 'meter-overage') {
    const event = JSON.parse(value as string);
    billing.queueCharge({
      customerId: event.customer.id,
      entitlement: event.entitlement,
      credit: event.credit.description,
      units: event.overage,
    });
  }
});


// ── API key → customer resolution ─────────────────────────────────────────────

// Register each API key as an alt ID on its owner customer at key creation time
async function registerApiKey(customerId: string, apiKey: string) {
  await policy.addAltID(customerId, apiKey);
}

// In middleware: policy.customer() resolves against both primary IDs and alt IDs
async function resolveCustomer(apiKey: string) {
  return await policy.customer(apiKey); // full customer object, or null
}


// ── Standard API call enforcement ─────────────────────────────────────────────

type EndpointType = 'standard' | 'export';

async function enforceApiCall(
  apiKey: string,
  endpointType: EndpointType = 'standard'
): Promise<{ allowed: boolean; error?: string; headers: Record<string, string> }> {
  const customer = await policy.customer(apiKey);
  if (!customer) {
    return { allowed: false, error: 'Invalid API key', headers: {} };
  }
  const customerId = customer.id as string;

  // Feature gate
  if (!await policy.check(customerId, 'api_access')) {
    return { allowed: false, error: 'API access not enabled on this plan', headers: {} };
  }

  if (endpointType === 'export') {
    // Gate first — returns false if export_calls entitlement doesn't exist on this plan
    if (!await policy.check(customerId, 'export_calls')) {
      return {
        allowed: false,
        error: 'Export API requires Pro or Enterprise',
        headers: await buildRateLimitHeaders(customerId),
      };
    }
    const allowed = await policy.increment(customerId, 'export_calls');
    if (!allowed) {
      return {
        allowed: false,
        error: 'Monthly export call limit reached',
        headers: await buildRateLimitHeaders(customerId, 'export_calls'),
      };
    }
    // Report the export window on success too, not the default daily window
    return { allowed: true, headers: await buildRateLimitHeaders(customerId, 'export_calls') };
  }

  // Standard call: check daily first — cheaper to fail fast on the tighter window
  const dailyAllowed = await policy.increment(customerId, 'api_calls_daily');
  if (!dailyAllowed) {
    return {
      allowed: false,
      error: 'Daily API call limit reached',
      headers: await buildRateLimitHeaders(customerId, 'api_calls_daily'),
    };
  }

  // Then meter monthly — soft on Pro/Enterprise, hard on Free
  const monthlyAllowed = await policy.increment(customerId, 'api_calls_monthly');
  if (!monthlyAllowed) {
    return {
      allowed: false,
      error: 'Monthly API call limit reached',
      headers: await buildRateLimitHeaders(customerId, 'api_calls_monthly'),
    };
  }

  return { allowed: true, headers: await buildRateLimitHeaders(customerId) };
}

async function buildRateLimitHeaders(
  customerId: string,
  entitlement = 'api_calls_daily'
): Promise<Record<string, string>> {
  const remaining = (await policy.remaining(customerId, entitlement)) ?? 0;
  const limit = (await policy.limit(customerId, entitlement)) ?? 0;
  return {
    'X-RateLimit-Limit': String(limit),
    'X-RateLimit-Remaining': String(Math.max(0, remaining)),
  };
}


// ── Express/Hono middleware ────────────────────────────────────────────────────

// Written against the Fetch API Request/Response types; adapt the signature to your framework.
async function apiRateLimitMiddleware(req: Request, next: () => Promise<Response>) {
  // Accept the "Bearer" scheme case-insensitively and drop stray whitespace
  const apiKey = req.headers.get('Authorization')?.replace(/^Bearer\s+/i, '').trim();
  if (!apiKey) return new Response('Unauthorized', { status: 401 });

  // Match on the path only, so a query string cannot trigger the export branch
  const { pathname } = new URL(req.url);
  const isExport = pathname.includes('/export') || pathname.includes('/batch');
  const result = await enforceApiCall(apiKey, isExport ? 'export' : 'standard');

  if (!result.allowed) {
    return new Response(JSON.stringify({ error: result.error }), {
      status: 429,
      headers: { 'Content-Type': 'application/json', ...result.headers },
    });
  }

  const response = await next();
  for (const [k, v] of Object.entries(result.headers)) {
    response.headers.set(k, v);
  }
  return response;
}

Notes

Why two entitlements (daily and monthly)? Daily and monthly are different constraints. The daily limit prevents burst abuse. The monthly limit is the billing boundary. Two separate entitlements let you fail fast on the daily limit before touching the monthly meter at all — and let them have different enforcement modes (the daily can be hard while the monthly is soft).

One edge case: when a single request increments both api_calls_daily and api_calls_monthly, the daily meter is incremented first. If the daily increment succeeds but the monthly one is rejected, the request has already consumed one unit of the daily allowance even though it was never served. With the policy above this should only arise on Free, the one plan whose monthly limit is hard, so the impact is small, but it is worth knowing.
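A minimal sketch of one way to avoid that edge case, assuming you are willing to read both windows before consuming either. The Meter class is a hypothetical in-memory stand-in, not Limitr's API; in the integration code, policy.remaining (already used by buildRateLimitHeaders) could play the role of remaining().

```typescript
// Hypothetical in-memory meter; stands in for a real entitlement store.
class Meter {
  private used = 0;
  private readonly limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  remaining(): number {
    return Math.max(0, this.limit - this.used);
  }

  // Consumes one unit if capacity allows; returns whether it did.
  increment(): boolean {
    if (this.used >= this.limit) return false;
    this.used += 1;
    return true;
  }
}

// Same ordering as the integration code: increment daily, then monthly.
function naive(daily: Meter, monthly: Meter): boolean {
  if (!daily.increment()) return false;
  return monthly.increment(); // a rejection here has already consumed a daily unit
}

// Pre-check ordering: confirm both windows have room before consuming either.
function preChecked(daily: Meter, monthly: Meter): boolean {
  if (daily.remaining() === 0 || monthly.remaining() === 0) return false;
  daily.increment();
  monthly.increment();
  return true;
}
```

The pre-check is not atomic: two concurrent requests can both pass it and then race on the increments, so it narrows the window rather than closing it.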

observe mode for Enterprise daily limits: Enterprise customers have monthly commitments, not daily caps. Setting the daily limit to observe means every call is metered for dashboards and anomaly detection, but nothing is ever blocked at the daily window. The monthly soft limit still marks the contractual ceiling and fires overage events once usage passes it.
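To make the difference between the three modes concrete, here is a small self-contained model. The semantics are inferred from the descriptions on this page (hard rejects, soft allows and reports overage, observe only counts) and are an assumption, not Limitr's implementation; MeterState, MeterResult, and meter are hypothetical names.

```typescript
type Mode = 'hard' | 'soft' | 'observe';

interface MeterState {
  mode: Mode;
  limit?: number; // the observe entitlement in the policy carries no value
  used: number;
}

interface MeterResult {
  allowed: boolean;
  overage: number; // units past the limit contributed by this call
}

// Assumed semantics:
//   hard    -> reject once the limit would be exceeded; usage does not grow
//   soft    -> always allow; report units past the limit as overage
//   observe -> always allow; only count
function meter(state: MeterState, units = 1): MeterResult {
  const limit = state.limit ?? Infinity;

  if (state.mode === 'hard' && state.used + units > limit) {
    return { allowed: false, overage: 0 };
  }

  const overBefore = Math.max(0, state.used - limit);
  state.used += units;
  if (state.mode !== 'soft') return { allowed: true, overage: 0 };

  const overAfter = Math.max(0, state.used - limit);
  return { allowed: true, overage: overAfter - overBefore };
}
```

In this model an Enterprise daily meter keeps growing without ever returning allowed: false, which matches the "metered for reporting" intent of the policy.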

API key as alt ID: Limitr resolves customer() lookups against both primary IDs and alt IDs. Register each API key as an alt ID once, at key creation time, and your middleware needs no separate key-to-customer mapping: policy.customer(apiKey) returns the full customer object directly, or null for an unknown key.
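The lookup pattern can be illustrated with a small in-memory directory. This is only a sketch of the idea, not how Limitr stores alt IDs; CustomerDirectory and its methods are hypothetical names.

```typescript
interface Customer {
  id: string;
  plan: string;
}

// Hypothetical in-memory directory; illustrates the lookup pattern only.
class CustomerDirectory {
  private readonly byId = new Map<string, Customer>();
  private readonly altToId = new Map<string, string>();

  add(customer: Customer): void {
    this.byId.set(customer.id, customer);
  }

  // Link an alternate identifier (here, an API key) to an existing customer.
  addAltId(customerId: string, altId: string): void {
    if (!this.byId.has(customerId)) {
      throw new Error(`Unknown customer: ${customerId}`);
    }
    this.altToId.set(altId, customerId);
  }

  // Resolve a primary ID or any alt ID to the same customer record.
  resolve(idOrAlt: string): Customer | null {
    const primaryId = this.byId.has(idOrAlt) ? idOrAlt : this.altToId.get(idOrAlt);
    if (primaryId === undefined) return null;
    return this.byId.get(primaryId) ?? null;
  }
}
```

Several keys can point at one customer, so issuing an additional key is a single addAltId call and usage from every key lands on the same meters.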

Per-endpoint entitlements: export_calls draws on a separate credit (export_call) with a different cost/price profile. This is the pattern for metering expensive endpoints separately from standard calls. It also lets you gate export access entirely by plan: policy.check(customerId, 'export_calls') returns false if the entitlement doesn't exist on the plan at all.
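As a worked example of what the two cost/price profiles mean in practice, the sketch below computes overage for each credit. It assumes overage is billed as units past the limit times the credit's price, and it expresses the policy's amounts in millionths of a currency unit so the arithmetic stays in integers. CreditProfile and overage are hypothetical names.

```typescript
// Amounts in millionths of a currency unit, to keep the arithmetic in integers.
interface CreditProfile {
  overheadCostMicros: number; // overhead_cost from the policy
  priceMicros: number;        // price.amount from the policy
}

const apiCall: CreditProfile = { overheadCostMicros: 8, priceMicros: 10 };       // 0.000008 / 0.00001
const exportCall: CreditProfile = { overheadCostMicros: 200, priceMicros: 400 }; // 0.0002 / 0.0004

// Assumption: overage is billed as (units past the limit) x (price per unit).
function overage(used: number, limit: number, credit: CreditProfile) {
  const units = Math.max(0, used - limit);
  return {
    units,
    chargeMicros: units * credit.priceMicros,
    marginMicros: units * (credit.priceMicros - credit.overheadCostMicros),
  };
}

// Pro plan: 60,000 standard calls against the 50,000 soft limit.
const pro = overage(60_000, 50_000, apiCall);
// -> 10,000 units, charge 100,000 micros (0.10), margin 20,000 micros (0.02)

// Enterprise plan: 5,100 export calls against the 5,000 soft limit.
const ent = overage(5_100, 5_000, exportCall);
// -> 100 units, charge 40,000 micros (0.04), margin 20,000 micros (0.02)
```

Under these numbers, 100 export calls of overage carry the same margin as 10,000 standard calls, which is the reason to meter them on their own credit.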

Rate limit headers: buildRateLimitHeaders reads from the daily entitlement by default, since that window is the most immediately relevant to the caller. For plans nearing their monthly ceiling, you could add a second header set using api_calls_monthly.
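One possible shape for that second header set is sketched below. The -Day and -Month header suffixes are hypothetical names chosen for illustration, not a standard, and the function takes plain numbers so it stays independent of how you read them; in the integration code they would come from policy.limit and policy.remaining for each entitlement.

```typescript
interface WindowUsage {
  limit: number | null;     // null when the entitlement has no cap (e.g. observe mode)
  remaining: number | null;
}

// Header names with the -Day / -Month suffixes are hypothetical, not a standard.
function buildWindowHeaders(
  daily: WindowUsage,
  monthly: WindowUsage
): Record<string, string> {
  const headers: Record<string, string> = {};

  const add = (suffix: string, usage: WindowUsage): void => {
    // Skip uncapped windows rather than advertising a limit of 0
    if (usage.limit === null || usage.remaining === null) return;
    headers[`X-RateLimit-Limit-${suffix}`] = String(usage.limit);
    // Clamp at 0: a soft limit that has been exceeded would otherwise go negative
    headers[`X-RateLimit-Remaining-${suffix}`] = String(Math.max(0, usage.remaining));
  };

  add('Day', daily);
  add('Month', monthly);
  return headers;
}
```

Omitting the headers for an uncapped window is a design choice here; it avoids telling an Enterprise caller that their daily limit is zero.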