OpenAI's Free Codex Offer: The 60-Day P&L That Said No

A first-person operator P&L for solo founders weighing OpenAI's May 13 promotion against staying on Anthropic Max with the new +50% weekly Claude Code limit through July 13. Walks the actual switching cost in engineering hours, the…

Thursday, May 14, 2026

Omid Saffari

Tools

OpenAI's Free Codex Offer: The 60-Day P&L That Said No

OpenAI offered me two months of free Codex on May 13 to migrate off Claude Code. I ran the 60-day P&L and stayed. The "free" was $387 in line items I do not pay, against $3,700 in switching cost I would.

The offer, decoded as a P&L line item

Sam Altman announced the Codex migration promotion on May 13: two months free for teams that migrate inside a 30-day window. The face value at the tier I'd actually need – six concurrent agent sessions, a weekly cap generous enough to handle daily publisher routines – works out to roughly $200/month against the closest-comparable Codex SKU. Two months × $200 = $400 of nominal value sitting on the table.

The same day, Anthropic raised Claude Code weekly limits +50% for Pro, Max, Team, and Enterprise customers through July 13. Not coincidentally. The symmetry is the signal.

When two vendors move on the same Tuesday – one offering free months to switch, the other quietly widening the cap to keep – the implied cost of acquiring a switcher has crossed into the low hundreds of dollars. That number tells you something more important than the offer itself: switching cost on the operator side of the table is no longer trivial either. If it were, the vendors wouldn't be paying this much.

I treat any "free for N months" offer the way I treat a procurement decision, not the way I treat a coffee. The right unit of analysis isn't "is the trial free to start?" It's "what does the round trip cost in engineering hours, and what's the probability I make that round trip?" The marketing framing is trial. The accounting framing is capex with an embedded switching tax.

Once you see it that way, the $400 face value stops being the headline number. It becomes one line on a much longer P&L.

My current 60-day baseline on Claude Max + AI Gateway

Before I can score the offer, I need the baseline I'd be displacing. Here it is, line by line.

The publication runs on Claude Max at $200/month, fronted by an AI Gateway with BYOK pass-through for the six daily routines, sitting on Cloudflare Workers with D1 for storage and Vectorize for the sim-check that gates every brief. Total operating cost across the last 60 days, including embeddings, DataForSEO, and cover-image generation: $387/month. I broke that down in detail in the Workday-Anthropic grant piece – same stack, same numbers.

Six daily publisher routines fire on cron: publish-news, publish-dev, publish-design, publish-marketing, publish-founders, publish-business. Each one runs 4–8 minutes of Claude Max time, generates a brief, dispatches it to the PublishWorkflow, and writes the result to D1. The workflow has six discrete steps – validate, ground, generate, clean, persist, index – and each step calls Anthropic with a structured prompt that leans on a specific cost lever: prompt caching.

The cache structure is what makes the unit economics work. Every routine call carries a markdown contract block (~3,500 tokens) and a voice block (~1,500 tokens), both cache_control ephemeral with a 1-hour TTL, followed by the per-format prompt and the brief. The first call in the hour pays 1.25× input rate to write the cache. Every subsequent call in that hour pays 0.1× to read it. At my volume – roughly 40–50 Anthropic calls per day across the six routines and their sub-steps – cache-hit rate sits around 85% of input tokens, which keeps the monthly bill in the $387 zone instead of $900.

The +50% weekly limit Anthropic just shipped means I now have headroom for a seventh routine before the cap matters. Free of charge. No migration required. That's the comparison the Codex offer has to beat.

What switching to Codex actually costs me, line by line

This is the section the SERP doesn't have, because no one running six production routines on the same stack is publishing the math.

Routine rewrite. Each of the six routines is between 600 and 900 lines of skill markdown plus tool-call definitions tuned to Claude's specific tool-use grammar and prompt-cache primitives. Codex's tool-use semantics, cache control, and prompt structure are different enough that this is not a find-and-replace. It's a per-routine re-authoring with a real testing tail. Conservative estimate: 2–3 hours per routine × 6 routines = 14 hours of focused build time.

Opportunity cost on those hours. I bill DVNC.dev client work at $250/hour blended. 14 hours × $250 = $3,500 in opportunity cost, real money I'd otherwise be earning while the migration is in flight.

Prompt-cache re-tuning during testing. Getting cache-hit rates back to the 80%+ band on a new provider takes iteration. Tool definitions get rewritten, prompt structure gets reordered, the ephemeral-vs-persistent cache decision gets remade per call site. Ad-hoc API spend during this phase, based on my prior experience tuning the Anthropic side: ~$200 in real cash.

Publish-workflow rewrite. The six-step workflow uses Anthropic-specific features – cache_control: { type: "ephemeral", ttl: "1h" }, the prompt-shape that maximizes cache hits across step boundaries, the streaming protocol the workflow assumes. Rewriting against Codex's prompt API and its tool-loop semantics: another 4–6 hours, call it $1,250 at the same opportunity-cost rate.

Let me put it in a ledger:

Line item	Cost
Routine rewrite (opportunity cost)	$3,500
Workflow rewrite (opportunity cost)	$1,250
Prompt-cache re-tuning (cash)	$200
"Free" face value	+$400
Net P&L	-$4,550

If I refuse to value my own hours – which is the implicit assumption baked into every "it's free, why not try it" pitch – the cash-only line reads roughly break-even. $400 of value against $200 of cache-tuning API spend, plus two weeks of weekends I no longer have.

The trap is the opportunity cost. Founders who don't bill their hours pay this bill in the wrong currency and walk away convinced they saved $400.

The decision framework I run on every "free for N months" offer

After enough of these decisions, I codified the four questions I run on every vendor offer that requires me to do anything to redeem it.

What does switching cost in engineering hours, not just net cash? The "free" is always net cash. The real bill is hours.
What is my opportunity cost per hour at my next-best use of time? For me, $250/hour against client work. For someone earlier in the curve, it might be $0 – in which case the math changes and the offer might genuinely make sense.
What is the renewal probability? Will I switch back at month 3, paying the migration tax twice? If yes, the offer is worse than it looks. If no, the offer is locking me in deliberately, which is the vendor's actual goal.
What's the platform-lock asymmetry? Does the new tool make returning harder than arriving was? If yes – and it almost always does – the offer's true cost is the asymmetric exit, not the symmetric entry.

For Codex, the renewal probability is high. My six routines are encoded in a vendor-specific grammar; coming back to Anthropic in month 3 means paying the engineering bill twice, once each way. The asymmetric exit is real.

This is where the "free trial" framing trips people up. A trial costs nothing to start and is pure upside if you can return cheaply. A migration costs days of work to start and days of work to return – that's not a trial. That's a procurement decision dressed in trial clothing. I cover the same lock-in logic from a different angle in Cloudflare's 100x-engineer pitch decoded; the structural pattern repeats every time a platform vendor wants your routines.

The framework isn't always "no." When I moved off Vercel to Cloudflare last year, I took a $5K credit on the way in. But that was a deliberate platform consolidation – D1 + Workers + Vectorize all in one place – not a free trial. The four questions cleared it cleanly. The migration was the strategy; the credit was a kicker.

The Codex offer fails the framework. So do the GCP credits, the Linear free year for AI startups, the Notion AI six-month promo that sits in my inbox right now. They all share the same structure: a platform-specific work tax to redeem, and a stickier second-bill cycle once you've paid it.

The general operator rule: vendors are not in the business of paying you to evaluate. They're in the business of paying you to switch. Score every offer as procurement.

What I am doing this week instead

The hours I would have spent evaluating Codex have a higher-value use, and naming that use explicitly is the only honest way to score the decision.

The publish-design-deep routine ships this week – it uses the same workflow scaffold as the existing six, with a longer brief target and a different voice block. Build cost: ~3 hours. Marginal operating cost: ~$8/month, well inside the headroom the +50% cap just handed me.

The prompt-cache audit is the unglamorous compounding work. Each routine has 2–4 call sites where the cache structure could be tightened – usually by reordering the prompt so the volatile bits come last, or by promoting a near-static section into the cached block. At my volume, the audit pays for itself in under three weeks of operating cost.

The publish-marketing cadence change is the boring but correct call. DataForSEO queries are the dominant cost on that routine, not Anthropic tokens, and daily firing was overserving the marketing category anyway. Tuesday/Thursday gives me the same publication rhythm at 40% of the data cost.

Every hour I don't spend evaluating a vendor that wants my routines is an hour available to ship – for clients, or for the publication. That's the opportunity-cost rule applied in reverse, and it's the operator move that doesn't show up on any pricing page.

What I'd do differently

If I were six months earlier in the publication's life – one or two routines instead of six, no production traffic, prompt-cache structure not yet tuned – the math flips.

At low lock-in, the switching cost is real but small. A 60-day free window genuinely lets you A/B two providers on the same workload, and the Codex offer becomes what the marketing says it is: a trial with real signal. For someone with ≤2 routines and an opportunity cost closer to $0/hour, taking the offer is the rational move. I'd take it.

The decision moment is roughly the third routine. That's where prompt-cache structure stabilizes, where tool grammars get dialed in, where the workflow scaffolding stops changing weekly. Past that point, switching cost crosses opportunity cost and the framework starts saying "no" to offers it would have said "yes" to three months earlier.

The general lesson: lock-in is not always bad. Sometimes it's exactly what you paid for when you spent six weekends tuning the stack in the first place. The investment that creates the lock-in is the same investment that drove the cost from $900/month to $387/month. You can't have one without the other.

I'd still test Codex on a single greenfield routine – a 2-hour build with real signal, on a workload I don't yet depend on. That's a different decision from migrating the running publication. The first is reconnaissance. The second is a procurement event with a $4,550 net cost and no strategic reason to incur it.

Is the +50% Claude Code weekly limit permanent?

No. It runs through July 13 per Anthropic's May 13 announcement. The post-July rate hasn't been announced. I'm assuming it reverts and sizing my new routine count against the pre-bump cap, with the bonus as headroom for the next eight weeks.

What does Codex actually cost at the tier comparable to Claude Max?

OpenAI hasn't published a fully aligned SKU. My estimate is ~$200/month for the closest equivalent – multiple concurrent agent sessions, a generous weekly cap, comparable model access. If the real comparable tier is higher, the "free" face value is higher, and the math gets closer to neutral on cash. The opportunity-cost line doesn't move.

Can you run both Codex and Claude Code in parallel for different routines?

Technically yes. Operationally, it doubles your prompt-cache investment and splits your tool-grammar surface area. For solo operators, the carrying cost of maintaining two parallel agent stacks is rarely worth it unless the routines have genuinely different requirements – which mine don't.

What rate did you use for opportunity cost?

$250/hour blended, which is roughly mid-market for a senior AI systems builder on contract. Substitute your own rate and re-run the math. If your rate is $0 – you're pre-revenue, or building on weekends – the cash-only line is the relevant one, and the offer looks better. If your rate is $500/hour, the offer looks even worse than it does for me.

How long did the original prompt-cache tuning take?

About 9–12 hours spread across two weeks to get cache-hit rates above 80%, plus another 4–6 hours of iteration as the routines stabilized. That's the work I'd be redoing on Codex. It's also the work I'd be redoing again if I switched back. The asymmetric-exit cost is real.

Last Updated

Jun 2, 2026

CategoryGrowth