ElevenLabs API Pricing: Full Breakdown of Costs (Jun 2026)

Reynaldi Chernando

Updated: June 23, 2026

This guide breaks down what the ElevenLabs API costs across every product, what drives your bill, how to lower it, and how to use it for free (see also our free ElevenLabs API tutorial).

How much does the ElevenLabs API cost?

ElevenLabs bills text to speech per character: its high-quality Multilingual v2/v3 model costs $0.10 per 1,000 characters, and the faster Flash/Turbo model costs $0.05 per 1,000 characters, which is the cheapest way to generate speech on the platform. The rest of the catalog is priced by audio duration (per minute or per hour) or per generation, because ElevenLabs is a suite of voice products rather than a single model.

Product	Model	Price	Unit
Text to Speech	Flash / Turbo	$0.05	per 1,000 characters
Text to Speech	Multilingual v2/v3	$0.10	per 1,000 characters
Speech to Text	Scribe v1/v2	$0.22	per hour
Speech to Text	Scribe v2 Realtime	$0.39	per hour
Agents	Speech Engine	$0.08	per minute
Music	Eleven Music	$0.30	per minute
Voice Changer	—	$0.12	per minute
Voice Isolator	—	$0.12	per minute
Sound Effects	—	$0.12	per generation
Dubbing	Dubbing v1	$0.33	per minute (watermarked)

Prices exclude tax. The per-unit rate is the same whether you are on the free plan or the Business plan: paid tiers bundle more included usage and add features, but they do not lower the rate.

ElevenCreative, ElevenAPI, and ElevenAgents are different products

ElevenLabs sells three things that share one credit system, and people building apps often pay for the wrong one:

ElevenCreative is the consumer subscription (web app, Studio, voice cloning), measured in monthly credits.
ElevenAPI is the developer interface billed per character, minute, or generation. This is what this article covers.
ElevenAgents (the Speech Engine) is the real-time conversational layer billed per minute.

A Creator or Pro subscription gives you credits you can spend through the API, but you don't need one to call the API: pay-as-you-go works on its own.

How ElevenLabs API pricing works

ElevenLabs does not bill by token. There are no tokens, no input/output split, and no context window. You pay for the audio you produce or process.

Characters, not tokens

Text to speech is billed on the number of characters you send to be synthesized. The text you submit is the meter: a 500-character paragraph costs 500 characters' worth of generation regardless of how long the resulting audio is. In the consumer credit system, one character equals one credit on Multilingual, and Flash costs roughly half a credit per character, which is why Flash comes out at $0.05 against Multilingual's $0.10 per 1,000 characters.
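
To make the character meter concrete, here is a minimal JavaScript sketch that estimates text-to-speech cost from a string. The rates come from the table above. The function names and the `flash` / `multilingual` labels are ours for illustration, not ElevenLabs identifiers, and counting Unicode code points is an assumption about how characters are metered, so check it against your own usage report.

```javascript
// Rates from the pricing table above, in US cents per 1,000 characters.
// The keys are labels for this sketch, not ElevenLabs model IDs.
const TTS_CENTS_PER_1K = { flash: 5, multilingual: 10 };

// Count Unicode code points rather than UTF-16 units.
// Assumption: this approximates the billed character count.
function countCharacters(text) {
  return Array.from(text).length;
}

// Estimated cost in dollars for synthesizing `text` once on `model`.
function estimateTtsCost(text, model) {
  const cents = TTS_CENTS_PER_1K[model];
  if (cents === undefined) throw new Error(`Unknown model label: ${model}`);
  return (countCharacters(text) * cents) / 1000 / 100;
}

const paragraph = "a".repeat(500); // stand-in for a 500-character paragraph
console.log(estimateTtsCost(paragraph, "flash"));        // 0.025
console.log(estimateTtsCost(paragraph, "multilingual")); // 0.05
```

The cost depends only on the length of the text, which is why the same paragraph costs the same whether the voice reads it quickly or slowly.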

Audio duration for everything that isn't text to speech

Speech to text is billed per hour of audio; dubbing, music, voice changer, and voice isolator are billed per minute of audio processed or produced. The agents Speech Engine is billed per minute of conversation. Sound effects are billed per generation. This matters when you plan a budget: a one-minute clip costs the same to transcribe whether it contains ten words or two hundred.
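
The duration-based products can be estimated the same way. The sketch below uses the rates from the pricing table. The product keys and the function name are ours, and it assumes cost scales linearly with audio length; we have not confirmed how ElevenLabs rounds partial minutes or hours, so treat the result as an estimate.

```javascript
// Rates from the pricing table, in US cents per billing unit.
// Keys are labels for this sketch only.
const DURATION_RATES = {
  scribeBatch:    { cents: 22, unitSeconds: 3600 }, // per hour
  scribeRealtime: { cents: 39, unitSeconds: 3600 }, // per hour
  speechEngine:   { cents: 8,  unitSeconds: 60 },   // per minute
  music:          { cents: 30, unitSeconds: 60 },   // per minute
  voiceChanger:   { cents: 12, unitSeconds: 60 },   // per minute
  voiceIsolator:  { cents: 12, unitSeconds: 60 },   // per minute
};

// Estimated cost in dollars for `audioSeconds` of audio.
// Assumption: linear proration with no rounding to whole units.
function estimateDurationCost(product, audioSeconds) {
  const rate = DURATION_RATES[product];
  if (!rate) throw new Error(`Unknown product label: ${product}`);
  return ((audioSeconds / rate.unitSeconds) * rate.cents) / 100;
}

console.log(estimateDurationCost("scribeBatch", 3600)); // 0.22
console.log(estimateDurationCost("music", 600));        // 3
```

A one-minute clip passed to `estimateDurationCost("scribeBatch", 60)` comes out at well under a cent, and the word count never enters the calculation.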

Subscription credits versus pay as you go

You can pay two ways. A subscription (Starter at $6/month through Business at $990/month) gives you a monthly credit allowance at a fixed price, with unused credits rolling over for up to two months while the subscription stays active. Pay-as-you-go charges the per-unit rates above with no monthly commitment. The per-unit price is identical either way, so the subscription is worth it only if you reliably use the bundled allowance.

The model you pick sets the rate

On text to speech, Flash and Turbo cost half what Multilingual costs. Flash and Turbo are functionally equivalent, and ElevenLabs recommends Flash over Turbo in every case. On speech to text, the batch Scribe v2 model at $0.22/hour costs less than Scribe v2 Realtime at $0.39/hour. Picking the cheaper model where it meets your quality bar is where most cost control starts.

What makes your bill higher than expected

Multilingual costs twice as much as Flash

The default text-to-speech model in many examples is Multilingual v2 at $0.10 per 1,000 characters. Flash synthesizes the same text at $0.05 per 1,000 characters. If you default to Multilingual for content that does not need its quality, you are paying double.

Scribe add-ons stack on the base rate

Scribe transcription starts at $0.22/hour, but two features add to that. Keyterm prompting adds $0.05/hour (about a 23% increase), and entity detection, which is API-only, adds $0.07/hour (about a 32% increase). Enable both and your effective rate is about $0.34/hour, roughly 55% above the base.
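
The stacking is simple addition, which a short sketch makes easy to check. The base and add-on rates are the ones quoted above; the add-on keys and the function name are hypothetical labels for this example, not request parameters.

```javascript
// Scribe rates quoted in this guide, in US cents per hour of audio.
const SCRIBE_BASE_CENTS_PER_HOUR = 22;
const SCRIBE_ADDON_CENTS_PER_HOUR = { keytermPrompting: 5, entityDetection: 7 };

// Returns the effective hourly rate and how far it sits above the base rate.
function scribeHourlyRate(addons = []) {
  const addonCents = addons.reduce((sum, name) => {
    const cents = SCRIBE_ADDON_CENTS_PER_HOUR[name];
    if (cents === undefined) throw new Error(`Unknown add-on label: ${name}`);
    return sum + cents;
  }, 0);
  return {
    dollarsPerHour: (SCRIBE_BASE_CENTS_PER_HOUR + addonCents) / 100,
    percentOverBase: Math.round((addonCents / SCRIBE_BASE_CENTS_PER_HOUR) * 100),
  };
}

console.log(scribeHourlyRate(["keytermPrompting", "entityDetection"]));
// { dollarsPerHour: 0.34, percentOverBase: 55 }
```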

Watermark-free dubbing costs more

Dubbing is $0.33 per source minute with a watermark on the output. Removing the watermark raises the rate to $0.50 per minute, about 50% more. If you are dubbing content for commercial release, budget for the clean rate, not the headline one.

Realtime transcription carries a premium

Scribe v2 Realtime at $0.39/hour costs about 77% more than batch Scribe at $0.22/hour. If your workload is recorded audio that does not need live results, batch transcription is the cheaper path.

Concurrency limits push you up a tier

Higher subscription tiers raise how many requests you can run in parallel. A workload that needs to process many files or calls at once can hit a concurrency ceiling on a lower plan even when you have credits left, which forces an upgrade based on throughput rather than volume.

Free output is not licensed for commercial use

The free tier includes 10,000 credits a month but no commercial license. Commercial rights begin on the Starter tier at $6/month. Audio you generate on the free plan cannot be used in monetized or commercial work.

How to reduce ElevenLabs API costs

1. Use Flash or Turbo instead of Multilingual where quality allows

Flash at $0.05 per 1,000 characters halves your text-to-speech cost against Multilingual at $0.10. Use Multilingual for final, high-quality production audio and Flash for drafts, real-time responses, and high-volume content where the quality difference does not matter.

2. Cut the characters you synthesize

You are billed per character, so generate each piece of audio once and store it. Cache static prompts, menu messages, and repeated responses instead of re-synthesizing them on every request. Trimming boilerplate text before synthesis directly lowers the bill.
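
One way to generate each clip only once is to wrap your synthesis call in a cache. The sketch below is a minimal in-memory version: `synthesize` is a placeholder for whatever function your app uses to produce audio (it is not an ElevenLabs SDK call), and the `model` and `voice` option names are assumptions. A production setup would more likely persist the audio to disk or object storage.

```javascript
// Wraps any synthesis function so identical requests are generated once.
// `synthesize(text, options)` is a placeholder supplied by the caller.
function withAudioCache(synthesize) {
  const cache = new Map();
  return async function cachedSynthesize(text, options = {}) {
    // Same text on a different model or voice is a different clip.
    const key = JSON.stringify([options.model ?? "", options.voice ?? "", text]);
    if (!cache.has(key)) {
      // Cache the promise so concurrent identical requests share one call,
      // and forget it again if the request fails.
      const pending = Promise.resolve(synthesize(text, options)).catch((err) => {
        cache.delete(key);
        throw err;
      });
      cache.set(key, pending);
    }
    return cache.get(key);
  };
}

// Usage with a stand-in synthesizer that only counts billable calls.
let billedCalls = 0;
const fakeSynthesize = async (text) => {
  billedCalls += 1;
  return `audio for: ${text}`;
};
const speak = withAudioCache(fakeSynthesize);

Promise.all([speak("Welcome back"), speak("Welcome back")])
  .then(() => console.log(billedCalls)); // 1
```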

3. Match your subscription tier to your real usage

Because the per-unit rate is flat, a subscription only saves money if you use its bundled credits. If your usage is spiky or below a tier's allowance, pay-as-you-go is cheaper. If you consistently exceed a tier, moving up bundles the overage at the same rate while adding concurrency and features.
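
A quick way to sanity-check a tier is to compare your expected monthly credits with each allowance. The sketch below copies the prices and credit allowances from the subscription tier table in this guide; the function names are ours, and it ignores rollover, concurrency, and feature differences.

```javascript
// Monthly price and included credits, from the subscription tier table in this guide.
const TIERS = [
  { name: "Free", dollars: 0, credits: 10_000 },
  { name: "Starter", dollars: 6, credits: 30_000 },
  { name: "Creator", dollars: 22, credits: 121_000 },
  { name: "Pro", dollars: 99, credits: 600_000 },
  { name: "Scale", dollars: 299, credits: 1_800_000 },
  { name: "Business", dollars: 990, credits: 6_000_000 },
];

// Returns the lowest tier whose allowance covers the expected usage,
// or null if even Business falls short (Enterprise pricing is custom).
function smallestCoveringTier(expectedCreditsPerMonth) {
  return TIERS.find((tier) => tier.credits >= expectedCreditsPerMonth) ?? null;
}

// Share of the bundled allowance you would actually use (1 means all of it).
function utilization(tier, expectedCreditsPerMonth) {
  return expectedCreditsPerMonth / tier.credits;
}

const tier = smallestCoveringTier(270_000);
console.log(tier.name);                   // Pro
console.log(utilization(tier, 270_000));  // 0.45
```

A low utilization figure is the signal that pay-as-you-go may be the better fit. Keep in mind that the Free tier carries no commercial license, so it is only a valid answer for prototyping.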

4. Turn off Scribe add-ons you don't need

Entity detection (+$0.07/hour) and keyterm prompting (+$0.05/hour) each raise your transcription rate. Enable them only for the jobs that need them rather than as global defaults.

5. Accept watermarked dubbing when you can

Watermarked dubbing at $0.33/minute costs a third less than the $0.50/minute clean rate. For internal review, previews, or non-commercial use, the watermarked output is enough.

6. Batch generations to stay within concurrency limits

Group work into batches that fit your plan's concurrency rather than upgrading for occasional parallel spikes. For transcription, Scribe processes long files in parallel internally, so consolidating short clips into longer files can use your throughput more efficiently.
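
Batching to a concurrency ceiling can be sketched in a few lines. Here `worker` is a placeholder for your own request function and the limit is whatever your plan allows (this guide does not list per-tier numbers, so the `3` in the usage line is purely illustrative).

```javascript
// Split work into groups no larger than the plan's concurrency limit.
function toBatches(items, limit) {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < items.length; i += limit) {
    batches.push(items.slice(i, i + limit));
  }
  return batches;
}

// Run `worker` over every item with at most `limit` calls in flight.
// Each batch finishes before the next one starts.
async function runInBatches(items, limit, worker) {
  const results = [];
  for (const batch of toBatches(items, limit)) {
    results.push(...(await Promise.all(batch.map((item) => worker(item)))));
  }
  return results;
}

console.log(toBatches(["a", "b", "c", "d", "e", "f", "g"], 3).map((b) => b.length));
// [ 3, 3, 1 ]
```

Waiting for a whole batch before starting the next is the simplest approach. A sliding-window pool would keep the pipeline fuller, at the cost of a little more code.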

Can you use the ElevenLabs API for free?

Puter.js: the User-Pays model

Puter.js is a JavaScript library that lets you add ElevenLabs models to your app with no API key, no backend, and no bill to you as the developer. It works on the User-Pays model: each user of your app covers their own AI usage through their Puter account, so your costs stay at zero no matter how many users you have.

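<html>
<body>
  <script src="https://js.puter.com/v2/"></script>
  <script>
    // Synthesize speech through Puter.js using an ElevenLabs model.
    puter.ai.txt2speech("Explain quantum computing in simple terms", {
      provider: "elevenlabs",
      model: "eleven_multilingual_v2"
    }).then(audio => {
      // Play the returned audio once the request resolves.
      audio.play();
    }).catch(err => {
      console.error("Text to speech failed:", err);
    });
  </script>
</body>
</html>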

We ran the same workload we use across our pricing guides: 500 monthly users sending 30 messages each. Each 300-token response is about 1,300 characters of speech, so that's roughly 19.5M characters a month across 15,000 messages. Text to speech bills only the characters you synthesize, so the messages users send in aren't charged.

Here's what that costs:

Multilingual v2 ($0.10 / 1K chars): about $1,950 a month
Flash ($0.05 / 1K chars): about $975 a month
Puter.js: $0 at any scale, because each user carries their own usage

The ElevenLabs costs grow linearly with your user base; Puter.js doesn't. (If your app also transcribes user audio, Scribe adds $0.22/hour on top.)
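
The arithmetic behind those figures can be reproduced with a short sketch, which is handy for plugging in your own user counts. The workload numbers and rates are the ones stated above; the function and parameter names are ours.

```javascript
// Monthly text-to-speech estimate for a chat-style workload.
// `centsPer1k` is the rate in US cents per 1,000 characters (5 for Flash, 10 for Multilingual).
function monthlyTtsCost({ users, messagesPerUser, charsPerResponse, centsPer1k }) {
  const characters = users * messagesPerUser * charsPerResponse;
  return { characters, dollars: (characters * centsPer1k) / 1000 / 100 };
}

const workload = { users: 500, messagesPerUser: 30, charsPerResponse: 1300 };

console.log(monthlyTtsCost({ ...workload, centsPer1k: 10 }));
// { characters: 19500000, dollars: 1950 }
console.log(monthlyTtsCost({ ...workload, centsPer1k: 5 }));
// { characters: 19500000, dollars: 975 }
```

Doubling `users` doubles both figures, which is the linear growth described above.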

ElevenLabs Startup Grants

The Startup Grants Program gives eligible new companies 12 months of free access and 33,000,000 characters, along with higher concurrency limits and improved support. It is aimed at startups building real-time conversational agents, and you apply through ElevenLabs.

The ElevenLabs free tier

The free plan includes 10,000 credits a month, which covers roughly 10 minutes of Multilingual text to speech, and gives access to most API endpoints. It has no commercial license, so it suits prototyping and testing rather than production. Commercial use starts at the Starter tier ($6/month).

Real-world cost examples

A customer support voice bot

We modeled a support bot that speaks its replies, handling 10,000 conversations a month with about 1,500 characters of synthesized response per conversation, for 15M characters a month. We also priced the same workload as a full conversational agent on the Speech Engine, assuming an average three-minute call.

Approach	Rate	Monthly volume	Monthly cost
Flash text to speech	$0.05 / 1K chars	15M characters	$750
Multilingual text to speech	$0.10 / 1K chars	15M characters	$1,500
Speech Engine (full agent)	$0.08 / minute	30,000 minutes	$2,400

Raw text to speech is cheaper when you only need to voice text you already have. The Speech Engine costs more per minute because it handles the whole real-time loop (transcription, turn-taking, and synthesis) in one pipeline.
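
To rerun this comparison with your own call volume, a sketch like the following works. It uses the rates from the table and the $0.08 per minute Speech Engine rate; the function and parameter names are hypothetical, and it does not model the $0.16 rate the pricing table lists for additional minutes.

```javascript
// Compare voicing replies with raw text to speech against running the
// whole conversation on the Speech Engine. Rates are in US cents.
function compareSupportBotCosts({ conversations, charsPerConversation, minutesPerCall }) {
  const characters = conversations * charsPerConversation;
  const minutes = conversations * minutesPerCall;
  return {
    flashTts: (characters * 5) / 1000 / 100,         // $0.05 per 1,000 characters
    multilingualTts: (characters * 10) / 1000 / 100, // $0.10 per 1,000 characters
    speechEngine: (minutes * 8) / 100,               // $0.08 per minute
  };
}

console.log(compareSupportBotCosts({
  conversations: 10_000,
  charsPerConversation: 1_500,
  minutesPerCall: 3,
}));
// { flashTts: 750, multilingualTts: 1500, speechEngine: 2400 }
```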

Narrating 100 PDFs

ElevenLabs cannot summarize documents: it has no language model, so it converts text to audio rather than condensing it. To narrate a batch of PDFs you would first extract or summarize the text with an LLM, then send that text to ElevenLabs. We calculated the narration step for 100 PDFs averaging 10 pages each, at about 3,000 characters per page, for 3M characters total. On Flash that is $150; on Multilingual, $300. The summarization step, if you want one, is billed by whichever LLM you use, not by ElevenLabs.

Daily content generation

We modeled a daily 10-minute audio brief, about 9,000 characters per episode, published every day for a 30-day month, for 270,000 characters. On Flash that comes to about $13.50 a month; on Multilingual, about $27. Adding a one-minute music bed per episode through the Music API at $0.30/minute adds about $9 a month, and each custom music finetune is a one-off $1.50.

Complete ElevenLabs API pricing table

Text to Speech (per 1,000 characters)

Model	Price
Flash / Turbo	$0.05
Multilingual v2/v3	$0.10

Speech to Text (per hour)

Model	Base	Entity detection	Keyterm prompting
Scribe v1/v2	$0.22	+$0.07	+$0.05
Scribe v2 Realtime	$0.39	—	—

Other products

Product	Price	Unit
Speech Engine (agents)	$0.08 included / $0.16 additional	per minute
Music	$0.30 (+$1.50 per finetune)	per minute
Voice Changer	$0.12	per minute
Voice Isolator	$0.12	per minute
Sound Effects	$0.12	per generation
Dubbing v1	$0.33 watermarked / $0.50 clean	per minute

Subscription tiers (bundled credits, same per-unit rates)

Tier	Price/month	Credits/month
Free	$0	10,000
Starter	$6	30,000
Creator	$22 (first month $11)	121,000
Pro	$99	600,000
Scale	$299	1,800,000
Business	$990	6,000,000
Enterprise	Custom	Custom

For the full per-tier included-usage breakdown across every product, see the official ElevenLabs API pricing page.

Conclusion

ElevenLabs text to speech costs $0.10 per 1,000 characters on Multilingual v2/v3 and $0.05 on Flash, with the rest of the catalog billed by audio duration (per minute or hour) or per generation. The main ways to control your bill:

Use Flash or Turbo instead of Multilingual where quality allows (halves text-to-speech cost).
Generate audio once and cache it instead of re-synthesizing.
Match your subscription tier to your real usage, or stay on pay-as-you-go.
Turn off Scribe add-ons and watermark-free dubbing when you do not need them.
Use batch Scribe instead of Realtime for recorded audio.

Pricing verified against ElevenLabs' official pages.
