Price retainers and fixed-bid AI work without leaking margin client by client. Plug in deliverable mix, prompt context, and revision rounds — get an honest per-account and per-month cost across GPT-5, Claude 4.6, Gemini 3, DeepSeek, and 25+ more models.
An agency's cost stack used to be predictable: salaries, software seats, and pass-through media. Then 2025 happened, the work shifted from manual production to AI-augmented production, and a brand-new line item appeared inside almost every retainer — variable token spend that scales with deliverable volume, prompt context, and how many rounds a brand director sends back. Most agencies are still pricing as if that line doesn't exist, or as if it's small enough to swallow.
The math gets dangerous when you scale. A creative team running thirty social variants a week per account, with research briefs that pull 20,000 tokens of brand context and three revision cycles, can burn anywhere from $80 to $1,800 per month per client depending entirely on model selection. Multiply that across a roster of twelve accounts and the wrong default model can turn a 22% retainer margin into a 9% retainer margin without a single new hire on payroll. That difference is the size of an account director's salary or the difference between hitting and missing your annual bonus pool.
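The per-account math above is simple enough to sketch. This is a minimal illustration, not the calculator's actual implementation; the per-million-token rates are hypothetical placeholders, not any provider's published prices:

```python
# A minimal sketch of the per-account spend math. Rate figures below are
# hypothetical placeholders, not any provider's published prices.

def monthly_token_cost(deliverables, context_tokens, output_tokens,
                       revision_rounds, in_rate_per_m, out_rate_per_m):
    """Monthly spend for one account, assuming every revision round
    resends the full context and regenerates the output."""
    rounds = 1 + revision_rounds            # first draft + revisions
    tokens_in = deliverables * rounds * context_tokens
    tokens_out = deliverables * rounds * output_tokens
    return (tokens_in / 1e6) * in_rate_per_m + (tokens_out / 1e6) * out_rate_per_m

# 30 variants/week ≈ 130/month, 20k-token briefs, 3 revision cycles
flagship = monthly_token_cost(130, 20_000, 1_000, 3, 15.00, 60.00)
budget = monthly_token_cost(130, 20_000, 1_000, 3, 0.10, 0.40)
print(f"flagship ≈ ${flagship:,.2f}/mo, budget ≈ ${budget:,.2f}/mo")
```

Even with made-up rates, the shape of the result is the point: the same workload lands two orders of magnitude apart depending on model tier, which is the spread that quietly eats the retainer margin.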
This calculator exists to put the number on the page before the SOW gets signed. Enter the realistic shape of your service mix — brief length, output volume, revision count, deliverables per month — and it cross-multiplies against published rates for every major hosted model. The output is a sortable cost table that pastes straight into a pitch deck, a retainer worksheet, or a pass-through line on a client invoice.
Before you commit to a monthly retainer ceiling, model the realistic AI bill at the agreed deliverable volume. A "content sprint" retainer that promises 40 social posts, 4 long-form blogs, and 2 email sequences a month sits in a very different cost band depending on whether the underlying model is GPT-5 or Gemini Flash. The calculator's volume slider shows the spread so you set the retainer at a number that survives Q4.
If you bill API usage as a transparent pass-through with markup (the dominant 2026 model), procurement reviewers will compare your number against the public per-token rate. The calculator gives you the at-cost figure plus a clean markup breakdown. Drop it into the MSA and the procurement conversation gets shorter, the renewal conversation gets cleaner, and the receivables aging stays under control.
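The markup breakdown itself is simple arithmetic. A sketch of what a transparent pass-through line might look like, with an example at-cost figure and an assumed 20% markup (both illustrative inputs, not recommendations):

```python
# Sketch of a transparent pass-through line: at-cost API spend plus a
# stated markup percentage. The inputs below are illustrative examples.
def passthrough_line(at_cost, markup_pct):
    markup = round(at_cost * markup_pct / 100, 2)
    return {
        "at_cost": round(at_cost, 2),   # what the provider actually bills
        "markup_pct": markup_pct,       # disclosed in the MSA
        "markup": markup,
        "billed": round(at_cost + markup, 2),
    }

line = passthrough_line(at_cost=31.62, markup_pct=20.0)
# a 20% markup on $31.62 at-cost bills out at $37.94
```

Showing all four fields, rather than a single blended number, is what keeps the procurement comparison against public per-token rates short.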
When you're pitching a new account, the AI line is the one number you can't easily back-of-envelope in a Slack thread. The calculator gives a defensible pitch number in under two minutes, and the same view exports the assumptions sheet your finance lead needs for the win-rate review. No more "we'll figure out tooling after we win it" surprises in week three.
Most agencies discover, sometimes painfully, that paid social management eats more AI token spend than long-form content because of the sheer number of variants. The calculator lets you slice by service line — SEO content, paid social, email, brand — and surfaces which lines are quietly margin-thin so you can re-price or restructure before the next pricing review.
Specialist subs (AI copy shops, automated UGC houses) are pitching agencies hard in 2026. Plug your in-house workload into the calculator, compare against the sub's flat per-deliverable price, and the answer is usually obvious in under a minute. It also exposes when a sub's quote is too cheap to be sustainable, which protects you from the inevitable mid-contract repricing.
Three failure modes show up over and over inside AI-augmented agency work in 2026:

- Underpriced context: the brand brief and reference material get resent with every call, so a 20,000-token context is billed again on every draft and every revision round.
- Underestimated output: long-form deliverables generate at output-token rates, which most providers price well above input rates.
- Uncounted revisions: retainers get scoped off the first-draft cost, while the actual bill scales with the draft plus every round a reviewer sends back.

The calculator surfaces all three by design — it asks for context size, output length, and revision count separately, then recomputes against every supported provider. For the canonical per-token rates, check OpenAI's pricing page and Anthropic's Claude pricing. For broader coverage of how agencies are restructuring retainers around AI tooling, Digiday and the 4A's publish regular pieces on agency billing models.
To make the numbers concrete, here's how a typical mid-size B2B content retainer (62 deliverables a month in this example) lands when run through the calculator:
| Model | Cost / deliverable | Monthly per account |
|---|---|---|
| GPT-5 | $0.74 | $45.88 |
| Claude Sonnet 4.6 | $0.51 | $31.62 |
| Gemini 3 Flash | $0.07 | $4.34 |
| DeepSeek V3.1 | $0.04 | $2.48 |
| Mixed (Flash draft + Sonnet polish) | $0.18 | $11.16 |
Numbers above are illustrative. Plug your real per-account shape into the live tool to get a current comparison against the latest published rates. The "mixed" row is the pattern most profitable agencies settle into — a cheap workhorse for first-pass production, a smarter model for the senior-creative polish layer. Multiply by your roster size to see the annualized impact on agency margin.
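Annualizing that illustrative table across a roster makes the spread concrete. This sketch simply multiplies the example per-account figures by a hypothetical twelve-account roster; the model names and costs come straight from the table above:

```python
# Annualize the illustrative per-account figures from the table above
# across a hypothetical twelve-account roster.
per_account_monthly = {
    "GPT-5": 45.88,
    "Claude Sonnet 4.6": 31.62,
    "Gemini 3 Flash": 4.34,
    "DeepSeek V3.1": 2.48,
    "Mixed (Flash draft + Sonnet polish)": 11.16,
}

ROSTER_SIZE = 12  # hypothetical account count

for model, monthly in per_account_monthly.items():
    annual = monthly * ROSTER_SIZE * 12
    print(f"{model:40s} ${annual:>9,.2f}/yr")
```

On these example figures, the gap between the flagship default and the mixed pattern is roughly $5,000 a year per twelve accounts — visible on a P&L even before roster growth.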
The AI bill is one line on the account P&L. The TinyTools suite already covers most of the other lines that matter without adding another seat or another vendor login.
The pattern is the same across all of them: free, single-purpose, no signup, no extra seat license to expense. For a wider view of how agencies are restructuring fees around AI tooling, the Adweek archive tracks billing-model shifts at independent and holding-company shops, and the IAB publishes guidance on how media markups translate to AI-assisted services.
Yes — the table is plain HTML, so it pastes cleanly into a Notion SOW, a Google Doc pitch, a Keynote deck, or a Figma proposal. Many agencies paste the per-account number directly into the master service agreement as a transparent pass-through estimate with a stated markup.
Yes. GPT-5 mini, Claude Haiku 4.5, Gemini 3 Flash, and DeepSeek's full lineup are all included. Mini tiers run 5-20x cheaper than the flagship and are good enough for caption variants, alt text, hashtag generation, and outline drafts — the majority of an agency's volume.
The calculator reads from a price table that we update whenever a major provider publishes a change. Expect 1-3 day lag on smaller providers, near-real-time on the top five.
The current calculator models one workload at a time, but agencies typically run it three or four times — one per service line — and sum the totals into a roster-wide budget. We are working on a multi-account dashboard for the next release.
Self-hosted GPU pricing is too workload-dependent to model precisely, but we cover hosted serverless rates (Together, Fireworks, Groq, Bedrock) for Llama, Mistral, Qwen, and DeepSeek — those are a reasonable upper bound for what a private deployment saves once ops overhead is factored in. For regulated industries that require it, pair the calculator with the AI Disclosure Generator so the client deliverable carries the correct label.