Perplexity API Pricing: Full Breakdown of Costs (Jun 2026)
This guide breaks down current Perplexity API pricing across every Sonar model, what actually drives your bill, how to lower it, and how to use Perplexity for free.
How much does the Perplexity API cost?
The Perplexity API (the Sonar API) charges $3 per million input tokens and $15 per million output tokens on its flagship model, Sonar Pro. The cheapest model, Sonar, is $1 per million tokens for both input and output. Every Sonar call also runs a live web search and returns citations, which is the part that separates Perplexity from a plain chat model.
| Model | Input ($/1M) | Output ($/1M) |
|---|---|---|
| Sonar | $1 | $1 |
| Sonar Pro | $3 | $15 |
| Sonar Reasoning Pro | $2 | $8 |
| Sonar Deep Research | $2 | $8 |
Two caveats before you budget from this table. First, token rates are only part of the cost: Perplexity adds a per-request fee that scales with how much web content the call retrieves, and Sonar Deep Research adds three more meters on top. Second, pricing is pay-as-you-go in USD, billed against prepaid credits, with no monthly minimum.
How Perplexity API pricing works
Perplexity bills per token, split into input and output. Input tokens are everything you send (your prompt, system message, and any context); output tokens are everything the model returns. A token is roughly four characters of English text, so 1,000 tokens is about 750 words.
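To make the token arithmetic concrete, here is a minimal sketch that estimates tokens from the four-characters rule of thumb and prices one call at the token rates listed in this guide. The function names and the lowercase model labels are illustrative choices for this example, and the character-based estimate is only an approximation of what a real tokenizer would count.

```javascript
// Token rates in USD per 1M tokens, copied from this guide's pricing table.
// The keys are labels chosen for this sketch.
const TOKEN_RATES = {
  "sonar": { input: 1, output: 1 },
  "sonar-pro": { input: 3, output: 15 },
  "sonar-reasoning-pro": { input: 2, output: 8 },
  "sonar-deep-research": { input: 2, output: 8 },
};

// Rough token estimate using the ~4 characters per token rule of thumb.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Token-only cost in USD for one call. Request fees are NOT included.
function tokenCost(model, inputTokens, outputTokens) {
  const rate = TOKEN_RATES[model];
  if (!rate) throw new Error(`Unknown model: ${model}`);
  return (inputTokens * rate.input + outputTokens * rate.output) / 1e6;
}
```

For example, `tokenCost("sonar-pro", 1000, 300)` comes to about $0.0075: $0.003 for input and $0.0045 for output.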
On top of tokens, Perplexity layers several mechanics that most LLM APIs do not have.
The per-request search fee
Every Sonar, Sonar Pro, and Sonar Reasoning Pro call carries a request fee billed per 1,000 requests, separate from tokens. The fee depends on the search context size you set.
| Model | Low | Medium | High |
|---|---|---|---|
| Sonar | $5 | $8 | $12 |
| Sonar Pro | $6 | $10 | $14 |
| Sonar Reasoning Pro | $6 | $10 | $14 |
These are dollars per 1,000 requests. On short, high-volume queries, this fee can be a larger share of the bill than the tokens themselves.
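The sketch below splits a single call into its token cost and its flat request fee, which shows how large the fee share gets on short queries. It uses only the rates from the tables in this guide; the function name, the model labels, and the returned field names are made up for illustration.

```javascript
// Request fees in USD per 1,000 requests, by search context size.
const REQUEST_FEE_PER_1K = {
  "sonar": { low: 5, medium: 8, high: 12 },
  "sonar-pro": { low: 6, medium: 10, high: 14 },
  "sonar-reasoning-pro": { low: 6, medium: 10, high: 14 },
};

// Token rates in USD per 1M tokens for the same models.
const TOKEN_RATES = {
  "sonar": { input: 1, output: 1 },
  "sonar-pro": { input: 3, output: 15 },
  "sonar-reasoning-pro": { input: 2, output: 8 },
};

// Break one call into token cost, request fee, and the fee's share of the total.
function callCost(model, contextSize, inputTokens, outputTokens) {
  const fees = REQUEST_FEE_PER_1K[model];
  const rates = TOKEN_RATES[model];
  if (!fees || !rates || typeof fees[contextSize] !== "number") {
    throw new Error(`Unknown model or context size: ${model} / ${contextSize}`);
  }
  const tokens = (inputTokens * rates.input + outputTokens * rates.output) / 1e6;
  const fee = fees[contextSize] / 1000; // fee for a single request
  const total = tokens + fee;
  return { tokens, fee, total, feeShare: fee / total };
}
```

For a 200-token prompt with a 100-token answer on Sonar at Low context, `callCost("sonar", "low", 200, 100)` gives a $0.005 fee against $0.0003 in tokens, so the fee is roughly 94% of that call.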
Search context size
Search context size controls how much web content the model retrieves before answering: Low (the default), Medium, or High. Higher settings pull more sources and cost more per request. This is not the same as the context window, which is the maximum number of tokens the model can process in one call. Search context size affects the request fee; the context window affects token limits.
Reasoning, citation, and search-query meters (Deep Research)
Sonar Deep Research bills on five separate meters instead of the standard token-plus-request model:
| Meter | Rate |
|---|---|
| Input tokens | $2 / 1M |
| Output tokens | $8 / 1M |
| Citation tokens | $2 / 1M |
| Reasoning tokens | $3 / 1M |
| Search queries | $5 / 1K |
Citation tokens cover the source links and references the model assembles. Reasoning tokens cover its step-by-step analysis. Search queries are the individual searches it runs, which it decides on its own; you can influence the count with the reasoning_effort parameter but cannot set it directly. Deep Research does not carry the standard per-request search fee, because the per-search-query meter replaces it.
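As a rough illustration of how the five meters add up, the sketch below prices one Deep Research call from its usage counts. The field names on the usage object are hypothetical names chosen for this example, not the API's response schema; only the rates come from the table above.

```javascript
// Sonar Deep Research meters, using the rates from the table above.
// The usage field names are hypothetical and chosen for this sketch.
function deepResearchCost({
  inputTokens = 0,
  outputTokens = 0,
  citationTokens = 0,
  reasoningTokens = 0,
  searchQueries = 0,
} = {}) {
  const breakdown = {
    input: (inputTokens * 2) / 1e6,         // $2 per 1M input tokens
    output: (outputTokens * 8) / 1e6,       // $8 per 1M output tokens
    citation: (citationTokens * 2) / 1e6,   // $2 per 1M citation tokens
    reasoning: (reasoningTokens * 3) / 1e6, // $3 per 1M reasoning tokens
    search: (searchQueries * 5) / 1000,     // $5 per 1K search queries
  };
  const total = Object.values(breakdown).reduce((sum, part) => sum + part, 0);
  return { ...breakdown, total };
}
```

Calling `deepResearchCost({ reasoningTokens: 339594 })` returns about $1.02 for the reasoning meter alone, which is why reasoning tends to dominate the total on this model.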
Pro Search mode for Sonar Pro
Sonar Pro can run in a Pro Search mode that performs multi-step searches and URL fetches for complex queries. It requires streaming and is selected with the search_type parameter.
| Search type | Request fee per 1K (Low / Medium / High) |
|---|---|
| fast (default) | $6 / $10 / $14 |
| pro | $14 / $18 / $22 |
| auto | Varies by classification |
Token rates stay at Sonar Pro's $3 input and $15 output. Only the request fee changes.
What makes your bill higher than expected
Medium and High search context on every call
Search context size defaults to Low. If you raise it globally to Medium or High for better answers, you raise the request fee on every single call. On Sonar Pro that is the difference between $6 and $14 per 1,000 requests. Set it per query based on what each query needs.
Reasoning tokens dominate Deep Research
In Perplexity's own published examples, a single Deep Research call generates tens to hundreds of thousands of reasoning tokens. One example call ran 339,594 reasoning tokens at $3 per million, which is about $1.02 in reasoning alone, pushing the call total to $1.32. The output you read is a small fraction of what you pay for.
The request fee is flat per call
The per-request fee is charged once per request regardless of how short the answer is. A workload of many small queries is dominated by request fees, not tokens. If you can consolidate several questions into one call, you pay the fee once instead of many times.
Pro Search routing to pro
If you use auto search type, simple queries stay at the fast rate but complex ones route to pro, where the request fee runs $14 to $22 per 1,000 instead of $6 to $14. The routing is automatic, so a shift in query mix changes your bill without any code change.
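To see how a shift in query mix moves the bill, here is a small sketch that blends the fast and pro request fees by the share of requests routed to pro. It assumes each auto-routed request is billed at exactly one of the two published rates, and the `proShare` input is a number you would have to measure from your own traffic; neither the function nor that parameter is part of the API.

```javascript
// Sonar Pro request fees in USD per 1,000 requests, by search type and context size.
const SONAR_PRO_FEE_PER_1K = {
  fast: { low: 6, medium: 10, high: 14 },
  pro: { low: 14, medium: 18, high: 22 },
};

// Estimated request fees when a fraction `proShare` (0..1) of requests routes to pro.
// Assumption: every request is billed at either the fast or the pro rate.
function blendedRequestFee(requests, proShare, contextSize = "low") {
  if (proShare < 0 || proShare > 1) {
    throw new RangeError("proShare must be between 0 and 1");
  }
  const fastFee = SONAR_PRO_FEE_PER_1K.fast[contextSize];
  const proFee = SONAR_PRO_FEE_PER_1K.pro[contextSize];
  if (typeof fastFee !== "number" || typeof proFee !== "number") {
    throw new Error(`Unknown context size: ${contextSize}`);
  }
  const perRequest = ((1 - proShare) * fastFee + proShare * proFee) / 1000;
  return requests * perRequest;
}
```

On 10,000 Low-context requests, moving from a 25% to a 50% pro share raises the request fees from about $80 to about $100 with no change to your code.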
Agent API tool calls add up
On the Agent API, each tool invocation is billed on its own: web search at $0.005, URL fetch at $0.0005, people search and finance search at $0.005 each. A single request can fire several tool calls, and each one is added to the model token cost.
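A short sketch of the tool-call arithmetic: it totals the per-invocation prices quoted above for one request. The tool names match the pricing table later in this guide, while the function itself and the shape of its input object are assumptions made for illustration. Model token cost is billed separately and is not included here.

```javascript
// Agent API tool prices in USD per invocation, as listed in this guide.
const TOOL_PRICE_USD = {
  web_search: 0.005,
  fetch_url: 0.0005,
  people_search: 0.005,
  finance_search: 0.005,
};

// Sum tool charges for one request, given a map of tool name -> number of calls.
function toolCallCost(callCounts) {
  let total = 0;
  for (const [tool, count] of Object.entries(callCounts)) {
    const price = TOOL_PRICE_USD[tool];
    if (price === undefined) throw new Error(`Unknown tool: ${tool}`);
    total += price * count;
  }
  return total;
}
```

A request that fires three web searches and four URL fetches, `toolCallCost({ web_search: 3, fetch_url: 4 })`, adds about $0.017 on top of its token cost.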
How to reduce Perplexity API costs
1. Match the model to the task
Model choice moves the bill more than any other setting. Sonar at $1/$1 handles straightforward factual lookups. Sonar Pro at $3/$15 is for queries that need richer sources and longer answers. Sonar Reasoning Pro and Deep Research cost more per call and should be reserved for work that genuinely needs multi-step analysis. Running everything through Deep Research is the most common way to overspend.
2. Keep search context at Low
Low is the default and the cheapest request tier. Raise context to Medium or High only on the specific queries that need deeper retrieval, not as a global default. On Sonar this is a $5 versus $12 difference per 1,000 requests.
3. Use Deep Research sparingly and cap reasoning effort
Deep Research can cost forty cents to over a dollar per call. Use it only for real research tasks, and lower the reasoning_effort parameter when you do, since it influences both reasoning tokens and the number of searches the model runs.
4. Control output length
Output is the most expensive token type on Sonar Pro at $15 per million, five times the input rate. Ask for concise answers, set response limits, and avoid prompting for long restatements of context you already supplied.
5. Use the Search API when you only need links
If your application only needs web results to process yourself, the Search API returns raw results at $5 per 1,000 requests with no token cost. That avoids paying for a synthesized answer you were going to discard.
6. Consolidate queries to pay the request fee once
Because the request fee is flat per call, batching related questions into a single request spreads one fee across more work instead of paying it on every small call.
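The effect of consolidation is easy to estimate. The sketch below compares request fees at Low context when questions are sent one per call versus batched several per call. It is a simplification that assumes token usage stays roughly the same either way and that a batched call still works at Low context; the function and its parameters are illustrative.

```javascript
// Request fees at Low search context, in USD per 1,000 requests.
const LOW_FEE_PER_1K = { "sonar": 5, "sonar-pro": 6, "sonar-reasoning-pro": 6 };

// Request fees for a number of questions, batched `questionsPerCall` at a time.
function requestFeeForQuestions(model, questions, questionsPerCall = 1) {
  const feePer1K = LOW_FEE_PER_1K[model];
  if (feePer1K === undefined) throw new Error(`Unknown model: ${model}`);
  if (!Number.isInteger(questionsPerCall) || questionsPerCall < 1) {
    throw new RangeError("questionsPerCall must be a positive integer");
  }
  // A partially filled final batch still costs one full request fee.
  const calls = Math.ceil(questions / questionsPerCall);
  return { calls, fee: (calls * feePer1K) / 1000 };
}
```

On Sonar, 3,000 questions sent individually cost $15 in request fees, while the same questions batched three per call cost $5.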
Can you use the Perplexity API for free?
Puter.js: the User-Pays model
Puter.js is a JavaScript library that adds Perplexity models to your app with no API key, no backend, and no bill to you as the developer. It runs on the User-Pays model: each user covers their own AI usage through their own Puter account, so your cost stays at zero regardless of how many users you have.
<html>
<body>
<script src="https://js.puter.com/v2/"></script>
<script>
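// Ask Sonar Pro through Puter.js; the page itself holds no API key.
puter.ai.chat("What are the latest developments in quantum computing?", {
  model: "perplexity/sonar-pro"
}).then(response => {
  // Use textContent so the reply is shown as plain text rather than parsed as HTML.
  document.body.textContent = response.message.content;
});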
</script>
</body>
</html>
We ran the workload we use across these pricing guides: 500 monthly users sending 30 messages each, averaging 1,000 input and 300 output tokens per message, for 15M input and 4.5M output tokens a month. On Sonar Pro through the API, that is $45 for input and $67.50 for output, plus about $90 in per-request search fees at the default Low context (15,000 requests at $6 per 1,000), around $202.50 a month, growing linearly with your user base. Through Puter.js the same app costs you $0 at any scale, because each user carries their own usage. Beyond the dollar figure, it also removes API key management and billing setup.
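The direct-API figure above can be reproduced with a few lines of arithmetic. This is a minimal sketch using the Sonar Pro rates from this guide and assuming one API request per message at Low search context; the function and field names are illustrative.

```javascript
// Sonar Pro rates from this guide: USD per 1M tokens, and USD per 1,000 requests at Low context.
const SONAR_PRO = { inputPerM: 3, outputPerM: 15, lowFeePer1K: 6 };

// Monthly cost for a chat workload where every message is one API request.
function monthlySonarProCost({ users, messagesPerUser, inputTokens, outputTokens }) {
  const requests = users * messagesPerUser;
  const input = (requests * inputTokens * SONAR_PRO.inputPerM) / 1e6;
  const output = (requests * outputTokens * SONAR_PRO.outputPerM) / 1e6;
  const requestFees = (requests * SONAR_PRO.lowFeePer1K) / 1000;
  return { requests, input, output, requestFees, total: input + output + requestFees };
}
```

With `{ users: 500, messagesPerUser: 30, inputTokens: 1000, outputTokens: 300 }` this yields 15,000 requests, $45 input, $67.50 output, $90 in request fees, and $202.50 in total. Doubling `users` doubles every line, which is the linear growth described above.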
Pro subscriber API credit (discontinued)
Perplexity Pro used to bundle a recurring $5/month in API credit for subscribers. Perplexity ended this perk in February 2026, so a Pro subscription no longer includes free API usage. If you find older guides citing the $5 credit, it no longer applies.
Startup and trial credits
Perplexity runs a startup program with larger credit grants for qualifying companies, and new API accounts have at times included trial credits. Amounts vary and are not published on the pricing page, so confirm current terms when you apply.
OpenRouter
The Sonar models are also available through OpenRouter, which can be useful for testing across providers, though it adds its own pricing layer. OpenRouter's Perplexity listing shows current availability.
For most developers, none of the credit or routing options above amounts to a standing free tier. Puter.js is the only option here that keeps your cost at $0 as you scale.
Real-world cost examples
We modeled three common workloads at current rates. All figures use Low search context unless noted.
Customer support chatbot. We calculated a bot handling 10,000 conversations a month, averaging 1,000 input and 400 output tokens each, one request per conversation.
| Model | Token cost | Request fee | Monthly total |
|---|---|---|---|
| Sonar | $14 | $50 | $64 |
| Sonar Pro | $90 | $60 | $150 |
| Sonar Reasoning Pro | $52 | $60 | $112 |
The request fee is a large share of the Sonar total here because the queries are short. The Sonar Reasoning Pro figure is a floor: that model bills its visible reasoning as output tokens, so real output volume runs higher than the 400 tokens assumed above.
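To show why the Sonar Reasoning Pro row is a floor, the sketch below recomputes it with an extra allowance for reasoning tokens billed as output. The 1,000 reasoning tokens per conversation used in the example afterwards is a hypothetical figure picked for illustration, not a measured one; the rates are the $2/$8 token rates and the $6 Low request fee from this guide.

```javascript
// Monthly cost on Sonar Reasoning Pro at Low context, one request per conversation.
// `reasoningTokens` is a per-conversation allowance billed at the output rate.
function reasoningProMonthly(conversations, inputTokens, answerTokens, reasoningTokens = 0) {
  const input = (conversations * inputTokens * 2) / 1e6;                       // $2 per 1M input
  const output = (conversations * (answerTokens + reasoningTokens) * 8) / 1e6; // $8 per 1M output
  const fee = (conversations * 6) / 1000;                                      // $6 per 1K requests
  return { input, output, fee, total: input + output + fee };
}
```

`reasoningProMonthly(10000, 1000, 400)` reproduces the $112 in the table. Adding a hypothetical 1,000 reasoning tokens per conversation, `reasoningProMonthly(10000, 1000, 400, 1000)`, raises the total to $192.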
Summarizing 100 PDFs. We modeled 100 documents at about 8,000 input tokens each with 600-token summaries. On Sonar that is roughly $0.80 input, $0.06 output, and $0.50 in request fees, about $1.36. On Sonar Pro it is about $3.90. One honest caveat: Sonar is a search-grounded model, and summarizing text you already supply does not need a web search, but you still pay the per-request search fee. For pure document summarization, a non-search model is usually cheaper.
Daily content generation. We modeled 50 pieces a day at 500 input and 2,000 output tokens each, 30 days a month, which is 1,500 requests. On Sonar Pro that is about $2.25 input, $45 output, and $9 in request fees, around $56 a month for cited, web-grounded drafts. On Sonar the same volume runs about $11 a month if you do not need Sonar Pro's richer sourcing.
Complete Perplexity API pricing table
Sonar token pricing (per 1M tokens)
| Model | Input | Output | Citation | Search queries | Reasoning |
|---|---|---|---|---|---|
| Sonar | $1 | $1 | — | — | — |
| Sonar Pro | $3 | $15 | — | — | — |
| Sonar Reasoning Pro | $2 | $8 | — | — | — |
| Sonar Deep Research | $2 | $8 | $2/1M | $5/1K | $3/1M |
Request fee by search context size (per 1K requests)
| Model | Low | Medium | High |
|---|---|---|---|
| Sonar | $5 | $8 | $12 |
| Sonar Pro | $6 | $10 | $14 |
| Sonar Reasoning Pro | $6 | $10 | $14 |
Pro Search mode for Sonar Pro (request fee per 1K)
| Search type | Low | Medium | High |
|---|---|---|---|
| fast | $6 | $10 | $14 |
| pro | $14 | $18 | $22 |
| auto | Varies | Varies | Varies |
Agent API tools (third-party models billed at provider rates)
| Tool | Price |
|---|---|
| web_search | $0.005 per invocation |
| fetch_url | $0.0005 per invocation |
| people_search | $0.005 per invocation |
| finance_search | $0.005 per invocation |
Search API
| API | Price |
|---|---|
| Search API | $5 per 1K requests (no token cost) |
Embeddings API (per 1M tokens)
| Model | Dimensions | Price |
|---|---|---|
| pplx-embed-v1-0.6b | 1024 | $0.004 |
| pplx-embed-v1-4b | 2560 | $0.03 |
| pplx-embed-context-v1-0.6b | 1024 | $0.008 |
| pplx-embed-context-v1-4b | 2560 | $0.05 |
The Agent API serves models from OpenAI, Anthropic, Google, and xAI at direct provider rates with no markup. For that full model-by-model table, see Perplexity's official pricing page.
Conclusion
The Perplexity API runs $3 input and $15 output per million tokens on Sonar Pro, down to $1/$1 on Sonar, plus a per-request search fee on every call. The main levers on your bill:
- Model choice (Sonar versus Sonar Pro versus Deep Research)
- Search context size, kept at Low unless a query needs more
- Deep Research discipline, including lower reasoning_effort
- Output length
The free paths are limited: Puter.js keeps your cost at $0 through the User-Pays model, while Perplexity's own credits are narrow. The Pro subscriber credit is discontinued, and startup and trial credit amounts are not published. All rates verified on June 19, 2026.