Rate Limit Calculator

Check if your usage fits within API rate limits for different LLM providers and tiers.

Your Expected Usage

Requests Per Minute: how many requests you'll make per minute

Tokens Per Request: average tokens per request (input + output)

Total Tokens Per Minute: 100,000

Requests Per Day (24/7): 144,000
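
Both totals follow directly from the two inputs: tokens per minute is requests per minute times tokens per request, and requests per day assumes the same rate sustained for all 1,440 minutes of a day. The figures above are consistent with 100 requests per minute at 1,000 tokens each; those input values are inferred from the totals, not shown on the page. A minimal Python sketch of the arithmetic (the function name is illustrative):

```python
MINUTES_PER_DAY = 24 * 60  # 1,440 minutes of continuous (24/7) traffic


def derive_usage(requests_per_minute: int, tokens_per_request: int) -> dict:
    """Derive the two totals from the two form inputs."""
    return {
        "tokens_per_minute": requests_per_minute * tokens_per_request,
        "requests_per_day": requests_per_minute * MINUTES_PER_DAY,
    }


# Inferred inputs: 100 requests per minute, 1,000 tokens per request.
print(derive_usage(requests_per_minute=100, tokens_per_request=1_000))
# {'tokens_per_minute': 100000, 'requests_per_day': 144000}
```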

Model	Provider	Tier	RPM Limit	TPM Limit	RPD Limit	Status
GPT-5.1	OpenAI	Free	3 (3333% used)	40,000 (250% used)	200	Exceeded
GPT-5.5	OpenAI	Tier 1	500 (20% used)	500,000 (20% used)	10,000	OK
GPT-5.1	OpenAI	Tier 1	500 (20% used)	800,000 (13% used)	10,000	OK
GPT-5 mini	OpenAI	Tier 1	3,500 (3% used)	2,000,000 (5% used)	10,000	OK
GPT-5.1	OpenAI	Tier 2	5,000 (2% used)	2,000,000 (5% used)	100,000	OK
Claude Opus 4.8	Anthropic	Tier 1	50 (200% used)	30,000 (333% used)	1,000	Exceeded
Claude Sonnet 4.6	Anthropic	Tier 1	50 (200% used)	50,000 (200% used)	1,000	Exceeded
Claude Sonnet 4.6	Anthropic	Tier 2	1,000 (10% used)	100,000 (100% used)	10,000	OK
Claude Haiku 4.5	Anthropic	Tier 1	50 (200% used)	50,000 (200% used)	1,000	Exceeded
Gemini 3 Pro	Google	Free	5 (2000% used)	250,000 (40% used)	100	Exceeded
Gemini 3 Flash	Google	Free	15 (667% used)	1,000,000 (10% used)	1,500	Exceeded
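
The percentages and statuses in the table can be reproduced with a few lines of Python. This is a sketch inferred from the rows above, not the calculator's actual source. It assumes that percentages are rounded half up (12.5% displays as 13%), that a row is Exceeded only when usage is strictly greater than the RPM or TPM limit (100% used still shows OK), and that the RPD column does not affect the status (several OK rows list an RPD limit of 10,000 against 144,000 requests per day). All function names are illustrative.

```python
import math


def percent_used(usage: float, limit: float) -> int:
    """Usage as a whole-number percentage of the limit, rounded half up."""
    return math.floor(usage / limit * 100 + 0.5)


def check_row(rpm_usage, tpm_usage, rpm_limit, tpm_limit):
    """Return (RPM % used, TPM % used, status) for one model/tier row.

    Assumption: only RPM and TPM decide the status, and exactly 100% is OK.
    """
    over = rpm_usage > rpm_limit or tpm_usage > tpm_limit
    return (
        percent_used(rpm_usage, rpm_limit),
        percent_used(tpm_usage, tpm_limit),
        "Exceeded" if over else "OK",
    )


# Usage from the page: 100 requests/min, 100,000 tokens/min.
print(check_row(100, 100_000, rpm_limit=3, tpm_limit=40_000))       # (3333, 250, 'Exceeded')
print(check_row(100, 100_000, rpm_limit=500, tpm_limit=800_000))    # (20, 13, 'OK')
print(check_row(100, 100_000, rpm_limit=1_000, tpm_limit=100_000))  # (10, 100, 'OK')
```

If the status does ignore RPD as these rows suggest, compare your requests per day against the RPD column yourself.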

Understanding Rate Limits

RPM: Requests Per Minute - Maximum number of API calls per minute
TPM: Tokens Per Minute - Maximum number of tokens (input + output) per minute
RPD: Requests Per Day - Maximum number of API calls per 24 hours
Rate limits vary by provider and tier - higher tiers have higher limits
Implement exponential backoff and retry logic to handle rate limit errors
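
A minimal sketch of exponential backoff with jitter, in Python. `RateLimitError` is a hypothetical stand-in for whatever exception your client library raises on an HTTP 429 response, and `call_with_backoff` and its parameters are illustrative names, not part of any provider SDK. The delay ceiling doubles after each failed attempt, and the actual wait is drawn at random below that ceiling so that many clients do not retry in lockstep.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical: replace with the 429 exception your client raises."""


def call_with_backoff(fn, max_retries=5, base_delay=1.0, max_delay=60.0,
                      sleep=time.sleep):
    """Call fn(); on RateLimitError, wait and retry up to max_retries times."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries:
                raise  # out of retries: surface the error to the caller
            # Ceiling grows 1s, 2s, 4s, ... and is capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # "Full jitter": wait a random time between 0 and the ceiling.
            sleep(random.uniform(0, ceiling))
```

Wrap each API call as `call_with_backoff(lambda: send_request(...))`, where `send_request` is your own function. If the error response includes a Retry-After header, waiting at least that long is usually a better choice than a computed delay.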

Understanding API rate limits

Every LLM provider caps how fast you can call its API, usually along three axes: requests per minute (RPM), tokens per minute (TPM), and requests per day (RPD). Hitting any one of them returns a 429 error, so a workload can be well within your budget yet still fail in production simply because it sends requests too quickly. Limits rise as you move up usage tiers and build a billing history.

This calculator checks your expected throughput against the published limits for each model and tier, flagging whether RPM or TPM would be the bottleneck. It's the fastest way to find out whether you need to request a tier upgrade, add client-side queuing and backoff, or spread load across keys before launch.
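
Client-side queuing can be sketched as a pair of token buckets, one for requests and one for tokens, that make the caller wait until both have room. This is one common technique, not any provider's own enforcement algorithm, and the class names are illustrative. It assumes you can estimate a request's token count up front, and it is not thread-safe as written.

```python
import time


class TokenBucket:
    """Holds up to `per_minute` units and refills continuously at that rate."""

    def __init__(self, per_minute, clock=time.monotonic):
        self.capacity = float(per_minute)
        self.rate = per_minute / 60.0  # units regained per second
        self.level = float(per_minute)  # start full
        self.clock = clock
        self.updated = clock()

    def _refill(self):
        now = self.clock()
        gained = (now - self.updated) * self.rate
        self.level = min(self.capacity, self.level + gained)
        self.updated = now

    def wait_time(self, amount):
        """Seconds until `amount` units are available (0 if available now)."""
        self._refill()
        return max(0.0, (amount - self.level) / self.rate)

    def take(self, amount):
        self._refill()
        self.level -= amount


class DualLimiter:
    """Blocks until a request fits under both the RPM and the TPM budget."""

    def __init__(self, rpm, tpm, clock=time.monotonic, sleep=time.sleep):
        self.requests = TokenBucket(rpm, clock)
        self.tokens = TokenBucket(tpm, clock)
        self.sleep = sleep

    def acquire(self, estimated_tokens):
        wait = max(self.requests.wait_time(1),
                   self.tokens.wait_time(estimated_tokens))
        if wait > 0:
            self.sleep(wait)
        self.requests.take(1)
        self.tokens.take(estimated_tokens)


# Example: pace calls to a 50 RPM / 50,000 TPM tier.
limiter = DualLimiter(rpm=50, tpm=50_000)
limiter.acquire(estimated_tokens=1_000)  # returns immediately: buckets start full
```

Because the buckets start full, this sketch allows an initial burst of a whole minute's budget. Some providers enforce limits over intervals shorter than a minute, so a lower starting level or budget may be needed in practice.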

How to use this tool

Enter your expected requests per minute and average tokens per request.
Scan the table for any model or tier where your usage exceeds the limit; those rows are marked Exceeded in the Status column.
If you're over, plan for queuing, exponential backoff, or a higher usage tier.

Frequently asked questions

What's the difference between RPM and TPM?

RPM caps how many requests you send per minute; TPM caps the total tokens across those requests. Long prompts hit the TPM limit first; many tiny requests hit the RPM limit first.
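
Which limit binds first depends on the average request size: the crossover is at TPM limit divided by RPM limit tokens per request. Above that size TPM is the constraint, below it RPM is. A small Python sketch (the function name is illustrative), using two rows from the table above:

```python
def binding_limit(rpm_limit, tpm_limit, tokens_per_request):
    """Return (which limit binds first, max sustainable requests per minute)."""
    rpm_allowed_by_tokens = tpm_limit / tokens_per_request
    if rpm_allowed_by_tokens < rpm_limit:
        return "TPM", rpm_allowed_by_tokens
    return "RPM", float(rpm_limit)


# Claude Sonnet 4.6, Tier 2 (1,000 RPM, 100,000 TPM): crossover at 100 tokens.
print(binding_limit(1_000, 100_000, tokens_per_request=1_000))  # ('TPM', 100.0)

# GPT-5 mini, Tier 1 (3,500 RPM, 2,000,000 TPM): crossover near 571 tokens.
print(binding_limit(3_500, 2_000_000, tokens_per_request=100))  # ('RPM', 3500.0)
```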

How do I raise my rate limits?

Limits increase automatically as you move up usage tiers, and tier placement is driven by your cumulative spend and account age. Most providers also let you request a manual increase for production workloads.