I Run Six AI Publisher Routines on One Static Cloudflare Workflow. Dynamic Workflows and Workflows V2 Just Made That a Choice, and Here's the Per-Lane Migration Math

A migration playbook for operators running agent-triggered durable pipelines on Cloudflare. It walks the real architecture behind six AI publisher routines hitting one `WorkflowEntrypoint`, the exact trade when you split that into per-lane Dynamic Workflows loaded at runtime, what Workflows V2's 50,000-concurrent / 2,000,000-queued ceiling changes for machine-speed triggers, and the single idempotency invariant (`publish-{brief_id}`) that survives the migration untouched. Real version numbers, real cost framing, the part that breaks first.

Sunday, May 17, 2026

Omid Saffari

Six Anthropic-side publisher routines fire at me every day, and every one of them POSTs into a single static PublishWorkflow keyed publish-{brief_id}. On May 1 Cloudflare shipped @cloudflare/dynamic-workflows, and a few days later Workflows V2 with a control plane rebuilt for agent-triggered load. The interesting part is not the launch. It is that the one-static-workflow decision I made for idempotency reasons is now a decision I have to defend instead of a constraint I inherited.

The thing that surprised me

The launch I expected was "Workflows scales bigger." Bigger ceilings, more queue depth, the usual annual bump. That was the smaller half of the announcement.

The launch that mattered is that workflow code can now differ per tenant at runtime. @cloudflare/dynamic-workflows shipped May 1, MIT-licensed, built on Dynamic Workers. It lets you register a single WorkflowEntrypoint with a body resolved from a Dynamic Worker at instance-create time, so the step graph itself becomes tenant-shaped rather than baked into the deployment. Cloudflare's own framing is "durable execution that follows the tenant," which is a precise sentence if you've ever tried to bolt per-customer logic onto a single workflow with feature flags.

My six publisher routines – news, dev, build, design, marketing, founders, business – are functionally six tenants sharing one PublishWorkflow. They are not customers. They are AI routines on the Anthropic side, each producing a different lane of content. But the multi-tenant shoe fits perfectly: same orchestration spine, slightly different shape per lane, all dispatched into the same Cloudflare account.

Then Workflows V2 landed a few days later, and the framing in that post matters more than the numbers. V2 is explicitly rearchitected around the idea that workflow instances are created by agents at machine speed, not by humans clicking buttons. That is exactly the trigger pattern I run. A publisher routine deciding to ship an article is a non-human event that creates a durable execution instance, and V1's control plane was built for a different shape of load.

Source: "Dynamic Workflows: durable execution that follows the tenant" (blog.cloudflare.com), Cloudflare's launch post for @cloudflare/dynamic-workflows, MIT-licensed, built on Dynamic Workers.

What this means for non-technical founders

"Durable execution" is a phrase engineered to scare you off. The plain-language version: it is a multi-step job that survives a crash and resumes from the last finished step instead of restarting from scratch. If step 5 of 8 fails because an API was down, the system retries step 5 – it does not re-run steps 1 through 4, and it does not re-bill you for them.

This is the part a founder needs to care about. The per-step retry boundary is also the per-step cost boundary. If your pipeline calls a paid AI model in step 3 and a paid model in step 6, and step 7 crashes, you want step 7 to retry – not the whole job. A non-idempotent re-run on a paid pipeline is not a duplicate row in a database. It is a duplicate invoice.

The build-vs-defer call you can repeat to a contractor in one sentence: split workflows per product line only when the lines' logic actually diverges, not because the platform now lets you. Cloudflare just shipped a powerful new abstraction, and the temptation when you see one of those is to redesign around it. The cost of N workflow definitions is N times the maintenance, and the benefit only shows up if the lanes are genuinely doing different work. A founder who hears "we should split each product line into its own workflow now that we can" should ask what changed about the work, not what changed about the platform.

The architecture I actually run

The real shape is small enough to keep in your head. Each publisher routine on the Anthropic side completes its research, drafts a brief, and POSTs to /api/admin/publish on the site. That endpoint validates the payload, mints an instance ID, and calls instances.create on a single WorkflowEntrypoint called PublishWorkflow.

The workflow itself is eight idempotent step.do stages:

validate (brief schema, slug uniqueness)
ground (citation fetch, link resolution)
generate (Claude call for body)
clean (markdown directive validation)
persist (Postgres insert, versioning)
index (embeddings, search refresh)
cover (image generation + upload to R2)
publish (status flip, sitemap ping)

TypeScript

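import { WorkflowEntrypoint, WorkflowEvent, WorkflowStep } from "cloudflare:workers";

// Env, PublishParams, and the helper functions are defined elsewhere in the project.
export class PublishWorkflow extends WorkflowEntrypoint<Env, PublishParams> {
  async run(event: WorkflowEvent<PublishParams>, step: WorkflowStep) {
    // Each step.do result is persisted, so a retry resumes after the last finished step.
    const brief = await step.do("validate", async () => validateBrief(event.payload));
    const grounded = await step.do("ground", async () => groundCitations(brief));
    // The paid model call lives in its own step so a later failure never re-runs it.
    const draft = await step.do("generate", async () => generateBody(grounded));
    const cleaned = await step.do("clean", async () => validateDirectives(draft));
    const row = await step.do("persist", async () => persistArticle(cleaned));
    await step.do("index", async () => reindex(row.id));
    await step.do("cover", async () => generateCover(row.id));
    await step.do("publish", async () => flipStatus(row.id));
  }
}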

The dispatch boundary looks like this:

TypeScript

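// The instance ID is derived only from the brief, so a re-fired routine maps to the same ID.
const id = `publish-${brief.brief_id}`;
try {
  await env.PUBLISH.create({ id, params: brief });
} catch (e) {
  // A duplicate ID means this brief is already queued or running: acknowledge it and stop.
  if (isDuplicateIdError(e)) return new Response("already queued", { status: 200 });
  throw e;
}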

instances.create throws on a duplicate ID. That single line does a load-bearing amount of work: it is the only thing preventing a re-fired Anthropic routine from publishing the same article twice and double-charging me for the AI calls inside.
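
A minimal in-memory sketch of that duplicate-ID behaviour, assuming a toy registry in place of the real Workflows binding. `FakeWorkflowBinding`, `DuplicateIdError`, and `dispatch` are hypothetical names used only for illustration, and the model is synchronous where the real create call is asynchronous.

```typescript
// Toy model of the dispatch boundary. Nothing here is Cloudflare's API:
// FakeWorkflowBinding only imitates "create throws on a duplicate ID".
class DuplicateIdError extends Error {
  constructor(message: string) {
    super(message);
    // Keeps instanceof working when compiling to older JavaScript targets.
    Object.setPrototypeOf(this, DuplicateIdError.prototype);
  }
}

class FakeWorkflowBinding {
  private readonly ids = new Set<string>();
  created = 0;

  // A second create with the same ID throws instead of starting a new instance.
  create(options: { id: string }): void {
    if (this.ids.has(options.id)) {
      throw new DuplicateIdError(`instance ${options.id} already exists`);
    }
    this.ids.add(options.id);
    this.created += 1;
  }
}

// Returns "queued" for a new brief and "already queued" for a re-fired one.
function dispatch(binding: FakeWorkflowBinding, briefId: string): string {
  const id = `publish-${briefId}`;
  try {
    binding.create({ id });
    return "queued";
  } catch (e) {
    if (e instanceof DuplicateIdError) return "already queued";
    throw e;
  }
}

const fakeBinding = new FakeWorkflowBinding();
const firstDispatch = dispatch(fakeBinding, "brief-123");
const retriedDispatch = dispatch(fakeBinding, "brief-123");
console.log(firstDispatch, retriedDispatch, fakeBinding.created);
```

The retry is acknowledged without creating a second instance, which is the whole point: the paid steps inside the workflow body run once per brief.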

Why one definition for six lanes? Because the eight stages are identical across all six. Only the editorial bundle differs – the voice spec, the audience spec, the anti-patterns list, the brief hints. Those ship as runtime data inside the dispatch envelope, not as code. The lane name selects which bundle gets passed to step 3 (generate); every other step is lane-agnostic.
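
A rough sketch of "bundle as data, not code", assuming a simple lookup keyed by lane name. The `EditorialBundle` fields mirror the four items named above, but the field names, the `DispatchEnvelope` shape, and the sample values are placeholders, not the production schema.

```typescript
// Hypothetical shapes: field names and sample values are illustrative placeholders.
interface EditorialBundle {
  voice: string;
  audience: string;
  antiPatterns: string[];
  briefHints: string[];
}

interface DispatchEnvelope {
  brief_id: string;
  lane: string;
  bundle: EditorialBundle;
}

const BUNDLES: Record<string, EditorialBundle> = {
  news: { voice: "terse", audience: "operators", antiPatterns: ["hype"], briefHints: ["lead with the change"] },
  dev: { voice: "hands-on", audience: "engineers", antiPatterns: ["vague claims"], briefHints: ["show the code"] },
};

// The lane name only selects data; the workflow code path is the same for every lane.
function buildEnvelope(briefId: string, lane: string): DispatchEnvelope {
  const bundle = BUNDLES[lane];
  if (bundle === undefined) throw new Error(`unknown lane: ${lane}`);
  return { brief_id: briefId, lane, bundle };
}

const newsEnvelope = buildEnvelope("brief-123", "news");
const devEnvelope = buildEnvelope("brief-456", "dev");
console.log(newsEnvelope.bundle.voice, devEnvelope.bundle.voice);
```

Adding a seventh lane under this model is a new entry in the lookup, not a new workflow definition.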

The pressure point – and the only honest reason to even open the Dynamic Workflows question – is that one lane is starting to want a different step graph, not just a different bundle. The research-heavy lane wants an extra grounding pass before generation, and possibly a fact-check step after. That is a structural difference, not a data difference. Right now it ships as a conditional branch inside ground, which works but is the kind of thing that rots if you let three more lanes add their own special-case branches.

The Dynamic Workflows migration, costed

This is what createDynamicWorkflowEntrypoint buys you. Per-lane workflow code is loaded at runtime from a Dynamic Worker, which means lanes that did not fire today cost close to nothing to keep around. You stop paying the maintenance tax of "all lanes deployed in one bundle, all the time," and you can evolve lane code independently without redeploying the shared spine.

Figure: the /publish HTTP boundary owns the publish-{brief_id} idempotency key as a gate before fan-out to per-lane workflows. The idempotency key lives at the dispatch boundary, never inside lane code.

The honest cost is the part the launch post does not dwell on. You trade one type-checked WorkflowEntrypoint – where the compiler tells you when a step's input contract drifts – for N runtime-loaded definitions where it does not. Cross-lane step contract drift becomes a runtime failure instead of a build failure. If you have six lanes and the persist step expects a slightly different row shape in two of them because someone refactored half the lanes and forgot the rest, you find out when a real article tries to publish, not when CI runs.
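
One way to pull that failure earlier, sketched under assumptions: a hand-written runtime guard on the input to the persist step, so a drifted lane fails loudly at the step boundary with a clear message. The `PersistInput` fields are hypothetical, and this is a possible mitigation rather than something the library provides.

```typescript
// Hypothetical contract for the persist step; the field names are assumptions.
interface PersistInput {
  slug: string;
  body: string;
  lane: string;
}

// Runtime check standing in for the compile-time check that runtime-loaded lanes lose.
function isPersistInput(value: unknown): value is PersistInput {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v["slug"] === "string" &&
    typeof v["body"] === "string" &&
    typeof v["lane"] === "string"
  );
}

// Throws a descriptive error when a lane hands over a row in the wrong shape.
function assertPersistInput(value: unknown): PersistInput {
  if (!isPersistInput(value)) {
    throw new Error("persist: input does not match the shared row contract");
  }
  return value;
}

const validRow = assertPersistInput({ slug: "hello", body: "text", lane: "news" });
console.log(validRow.slug);
```

It does not restore the build-time guarantee, but it turns silent drift into an explicit error at a known step.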

The migration is not all-or-nothing, and pretending otherwise is how you over-engineer. The move I am planning:

Keep the shared 8-stage spine as a static WorkflowEntrypoint. This is the hot path. Five of the six lanes use it unchanged.
Build the research-heavy lane as a Dynamic Workflow with its own step graph (extra grounding pass, fact-check step).
Dispatch by lane at the /publish boundary – a one-line switch on brief.lane that picks which binding to call create on.
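
The three moves above can be sketched as a small routing function. This is an illustration under assumptions: `CreateBinding` is a pared-down, synchronous stand-in for a Workflows binding, and the lane name `"research"` is hypothetical.

```typescript
// Pared-down stand-in for a Workflows binding; the real create call is asynchronous.
interface CreateBinding {
  create(options: { id: string; params: unknown }): void;
}

interface Brief {
  brief_id: string;
  lane: string;
}

// Hypothetical name for the research-heavy lane.
const RESEARCH_LANE = "research";

// Only the divergent lane goes to the dynamic workflow; everything else stays on the spine.
function pickBinding(brief: Brief, spine: CreateBinding, research: CreateBinding): CreateBinding {
  return brief.lane === RESEARCH_LANE ? research : spine;
}

function dispatchByLane(brief: Brief, spine: CreateBinding, research: CreateBinding): void {
  // The ID is derived here, once, no matter which workflow handles the brief.
  const id = `publish-${brief.brief_id}`;
  pickBinding(brief, spine, research).create({ id, params: brief });
}

// Recording fakes to show which binding each brief reaches.
const spineIds: string[] = [];
const researchIds: string[] = [];
const spineFake: CreateBinding = { create: (o) => { spineIds.push(o.id); } };
const researchFake: CreateBinding = { create: (o) => { researchIds.push(o.id); } };

dispatchByLane({ brief_id: "a1", lane: "news" }, spineFake, researchFake);
dispatchByLane({ brief_id: "b2", lane: RESEARCH_LANE }, spineFake, researchFake);
console.log(spineIds, researchIds);
```

Lane code never sees how the ID was built, which keeps the idempotency key out of reach of a per-lane refactor.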

The decision rule, stated with a number attached: migrate a lane to its own Dynamic Workflow only when its step graph differs from the spine by more than one stage. Below that threshold, a conditional branch inside the existing step is cheaper to maintain than a second workflow definition. Above it, the branch-in-step starts to obscure what the lane is doing, and a separate definition pays for itself.
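
The threshold can be written down as a check. A minimal sketch, assuming divergence is counted as stages added plus stages removed relative to the spine (reordering is ignored); the stage names `ground-deep` and `fact-check` are hypothetical labels for the extra grounding pass and fact-check step.

```typescript
const SPINE_STAGES = ["validate", "ground", "generate", "clean", "persist", "index", "cover", "publish"];

// Counts stages present in one graph but missing from the other.
function stageDivergence(spine: string[], lane: string[]): number {
  const inSpine = new Set(spine);
  const inLane = new Set(lane);
  const added = lane.filter((stage) => !inSpine.has(stage)).length;
  const removed = spine.filter((stage) => !inLane.has(stage)).length;
  return added + removed;
}

// Split only when the lane differs from the spine by more than one stage.
function shouldSplit(lane: string[]): boolean {
  return stageDivergence(SPINE_STAGES, lane) > 1;
}

// Hypothetical research-heavy lane: one extra grounding pass plus a fact-check step.
const researchStages = [
  "validate", "ground", "ground-deep", "generate", "fact-check",
  "clean", "persist", "index", "cover", "publish",
];
// Hypothetical lane with a single extra stage: stays as a branch inside the spine.
const oneExtraStage = [
  "validate", "ground", "generate", "fact-check",
  "clean", "persist", "index", "cover", "publish",
];

console.log(shouldSplit(researchStages), shouldSplit(oneExtraStage), shouldSplit(SPINE_STAGES));
```

The count is a blunt instrument, but it forces the conversation onto what the lane does differently rather than what the platform allows.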

One lane over the threshold. Five lanes under it. That is the migration plan in two sentences. If a contractor proposed splitting all six lanes into Dynamic Workflows on day one, I would not approve it. The build cost is real and the benefit is concentrated in exactly one lane.

What Workflows V2 changes for machine-speed triggers

The headline numbers from V2: 50,000 concurrent instances and 2,000,000 queued instances per workflow, up from 1,000,000 queued in V1. Those are not vanity figures, but they are also nowhere near where I live. Six routines firing once or twice a day is at most about a dozen instances a day, which puts me roughly four to five orders of magnitude below the new ceilings.

Source: "Cloudflare Workflows V2: deterministic replay, 50k concurrent instances" (infoq.com), InfoQ's coverage of the V2 control plane rearchitecture and new concurrency limits.

The part that matters for my stack is not the numbers. It is that V2's control plane is rebuilt around agent-triggered instance creation as the primary case. V1 was optimized for the human-triggered shape – a user clicks a button, an instance starts, the system handles bursty-but-bounded load. V2 assumes the trigger is a process, not a person, and that the rate is governed by upstream agent throughput rather than UI clicks.

Deterministic replay is the V2 property worth restating in plain language. Each step is isolated, replayable, idempotent, and the workflow resumes from the last successful step on retry. That is exactly the property my publish-{brief_id} instance ID was manually defending at the dispatch boundary. V1 gave me the per-step retry guarantee; V2 hardens the replay semantics so the structural property I was protecting is now also enforced by the platform underneath.
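
A toy model of "resume from the last successful step", to make the property concrete. This is not how Cloudflare implements it: `ToyRun` keeps results in memory and runs synchronously, whereas real Workflows persist step results durably and run steps asynchronously. All names here are hypothetical.

```typescript
// Toy step runner: caches each finished step's result by name.
class ToyRun {
  private readonly results = new Map<string, unknown>();
  // Records every time a step body actually starts executing.
  readonly executions: string[] = [];

  step<T>(stepName: string, fn: () => T): T {
    if (this.results.has(stepName)) return this.results.get(stepName) as T;
    this.executions.push(stepName);
    const value = fn(); // if fn throws, no result is recorded for this step
    this.results.set(stepName, value);
    return value;
  }
}

let coverShouldFail = true;

function pipeline(run: ToyRun): string {
  const draft = run.step("generate", () => "draft body"); // stands in for the paid model call
  run.step("cover", () => {
    if (coverShouldFail) throw new Error("image API down");
    return "cover.png";
  });
  return run.step("publish", () => `published: ${draft}`);
}

const toyRun = new ToyRun();
let firstAttemptFailed = false;
try {
  pipeline(toyRun);
} catch {
  firstAttemptFailed = true;
}

coverShouldFail = false; // the upstream API recovers
const replayOutcome = pipeline(toyRun); // replay: generate is served from the cache
console.log(toyRun.executions, replayOutcome);
```

On the replay, `generate` is not executed again; only the failed `cover` step and the remaining `publish` step run.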

The practical move this week: nothing in the hot path. The V2 model is opt-in by re-deploy, and the worst thing you can do with a working idempotent pipeline is rush it onto a new control plane to chase semantics it already had. I will move the research-heavy lane onto V2 when I build it as a Dynamic Workflow, because new code on the new model is cheap. The five static lanes stay where they are until I have a reason to touch them that is not "the platform shipped a new version."

The one invariant I will not give up

publish-{brief_id} as the instance ID is load-bearing. Drop it and you have a pipeline that re-fires when an Anthropic routine retries on its end, publishes the same article twice, and double-charges every paid API call inside the workflow body.

Dynamic Workflows does not threaten this invariant directly. The ID contract on instances.create is unchanged. What could threaten it is a sloppy per-lane refactor where lane code starts re-deriving the ID locally – for instance, a lane that decides it wants to include a timestamp in the ID to "be safe," which silently breaks dedup because every retry now generates a unique ID.
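
The timestamp failure is easy to reproduce in isolation. A small sketch, with hypothetical helper names and an in-memory set standing in for the platform's duplicate-ID check; timestamps are passed in explicitly so the example is deterministic.

```typescript
// Stable: the same brief always maps to the same instance ID.
function stableId(briefId: string): string {
  return `publish-${briefId}`;
}

// Anti-pattern: the timestamp changes on every retry, so every retry gets a fresh ID.
function timestampedId(briefId: string, nowMs: number): string {
  return `publish-${briefId}-${nowMs}`;
}

// In-memory stand-in for the duplicate-ID check at instance creation.
const seenIds = new Set<string>();
function isDuplicate(id: string): boolean {
  if (seenIds.has(id)) return true;
  seenIds.add(id);
  return false;
}

const stableFirst = isDuplicate(stableId("brief-123"));
const stableRetry = isDuplicate(stableId("brief-123"));
const stampedFirst = isDuplicate(timestampedId("brief-456", 1000));
const stampedRetry = isDuplicate(timestampedId("brief-456", 1001));
console.log(stableFirst, stableRetry, stampedFirst, stampedRetry);
```

The stable ID catches the retry; the timestamped ID lets it through as if it were a brand-new brief.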

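The rule that survives every version bump, and the one I would write on the wall before letting anyone else touch this code:

The publish-{brief_id} idempotency key is owned by the HTTP dispatch boundary, never by per-lane workflow code.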

This is the kind of invariant that does not show up in a launch post or a migration guide, because it only exists if you have shipped the failure it prevents. The cost-chokepoint side of this – making sure no single workflow run can spiral into a runaway bill even if dedup fails – is a separate layer, and it lives at the model-call level rather than the instance level. Both layers exist for the same reason: in a pipeline where every step costs money, structural correctness is the cheapest insurance you will ever buy.

What I'd do differently starting from scratch in May 2026

If I were starting this stack today instead of inheriting decisions from a year ago, three things would change.

First, I would start with the shared static spine plus one Dynamic Workflow for any genuinely divergent lane, not six separate definitions. The temptation when a new abstraction lands is to use it everywhere, and the maintenance tax of N workflow definitions arrives before the scale benefit does. Six definitions is six places to fix a bug, six places to upgrade a dependency, six places where a step contract can drift. One spine plus one outlier is the minimum-viable split.

Second, I would put the idempotency key at the HTTP boundary on day one. Retrofitting publish-{brief_id} after a double-publish incident is the expensive path – you have to reconcile the double-published rows, refund the affected costs, and instrument the dispatch layer while the system is already in production. Adding the try { create({id}) } catch (dup) {} pattern before you have ever shipped a pipeline costs ten minutes and prevents an entire category of failure.

Third, I would treat Workflows V2's determinism as the floor of the design, not a feature to opt into later. Design every step to be replay-safe even if you never hit the 50k concurrency ceiling. Replay-safety is not a scale property; it is a correctness property. A step that is not replay-safe is a step that breaks under retry, and retry happens at any scale.
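
What "replay-safe" means for a single step can be shown with a persist-style write. A minimal sketch with in-memory tables; `ArticleRow`, `blindInsert`, and `upsert` are hypothetical names, and a real persist step would do the equivalent against the database.

```typescript
interface ArticleRow {
  briefId: string;
  body: string;
}

// Not replay-safe: running the step twice appends a second row.
function blindInsert(table: ArticleRow[], row: ArticleRow): void {
  table.push(row);
}

// Replay-safe: the write is keyed on briefId, so running it twice leaves one row.
function upsert(table: Map<string, ArticleRow>, row: ArticleRow): void {
  table.set(row.briefId, row);
}

const insertTable: ArticleRow[] = [];
const upsertTable = new Map<string, ArticleRow>();
const sampleRow: ArticleRow = { briefId: "brief-123", body: "hello" };

// Simulate the step running, then being retried after a failure further down the pipeline.
blindInsert(insertTable, sampleRow);
blindInsert(insertTable, sampleRow);
upsert(upsertTable, sampleRow);
upsert(upsertTable, sampleRow);

console.log(insertTable.length, upsertTable.size);
```

Two attempts, one row: that holds at six instances a day exactly as it does at fifty thousand.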

That last one is the senior move dressed up as a technical decision. The trap is treating new platform capabilities as features to adopt. The discipline is treating them as constraints to design against, whether or not you ever use the headroom.

Do I need Dynamic Workflows if all my tenants run identical logic?

No. If the step graph is identical and only data differs, ship the data in the dispatch payload and keep one static workflow. Dynamic Workflows earns its keep when the code itself diverges per tenant – when the step graph differs, not just the parameters inside it.

Does Workflows V2 break existing V1 workflows?

V2 is a rearchitected control plane focused on deterministic, agent-triggered execution. Treat the migration as opt-in, validate idempotency before moving a hot path, and do not migrate working code just because a new version exists.

What is the actual concurrency ceiling now?

50,000 concurrent instances and 2,000,000 queued instances per workflow, up from 1,000,000 queued. For most operators this is far above where you live, and the more important V2 property is deterministic replay, not the new ceiling.

Is @cloudflare/dynamic-workflows production-ready or a preview?

It shipped May 1, 2026 as an MIT-licensed library built on Dynamic Workers. Treat library maturity and your idempotency contract as the gating risk, not the license. If your dispatch layer is sound, the library is ready before your migration plan is.

When should a founder pay an engineer to do this migration?

When a product line's automation logic genuinely diverges from the others – different step graph, not different parameters. Splitting for scale alone is premature; splitting for divergent logic is the real trigger. If this is the kind of migration math your team needs help running, this is exactly the sort of architectural review I do through DVNC.dev.

Key Takeaways

One static PublishWorkflow serves six AI publisher routines because the step graph is identical and only the editorial bundle differs – code stays shared, data ships in the payload.
Migrate a lane to @cloudflare/dynamic-workflows only when its step graph diverges from the spine by more than one stage. Below that, branch-in-step is cheaper to maintain than a second definition.
Workflows V2's 50k concurrent / 2M queued ceiling is not the headline for most operators. Deterministic replay semantics are – design every step to be replay-safe whether or not you ever scale.
The publish-{brief_id} idempotency key is owned by the HTTP dispatch boundary, never by per-lane workflow code. This is the non-negotiable that survives every version bump.
Do not migrate a working hot path just because a new version shipped. New code on the new model is cheap; refactoring working code is not.

Last Updated

May 17, 2026

Category: Build