Ray Amjad · YouTube · 15:10

Anthropic Just Dropped the Update Everyone's Been Waiting For

A 15-minute first-look at Claude Code's new Workflow tool — the code-as-orchestrator pattern that kills the token tax on multi-agent pipelines.

Posted: May 22nd 2026
Duration: 15:10
Format: Tutorial (educational)
Channel: Ray Amjad

§ 01 · The Hook

The bait, then the rug-pull.

Claude Code 2.1.147 shipped a feature Anthropic hadn't announced yet: a Workflow tool that replaces the model-as-orchestrator pattern with a JavaScript file and, with it, removes the token tax that compounds with every sub-agent handoff.

§ · Chapters

Where the time goes.

00:00 – 01:21

01 · Introduction

Hook — announces the Workflow feature, shows /workflows slash command, previews a previous Sentry triage run.

01:21 – 03:18

02 · The Big Picture

Explains the old model-as-orchestrator pattern, the token tax at each sub-agent join, and why context fills and causes sloppy orchestration.

03:18 – 04:52

03 · How Workflows Look

Introduces the code-as-orchestrator concept, shows slide diagrams contrasting the two patterns, introduces agent() and loops in code.

04:52 – 06:21

04 · Making a Workflow

Walks through triage-sentry.js manually — meta, schemas, args, phase definitions, plain JS filtering.

06:21 – 07:42

05 · Workflow Demo

Shows the triage-sentry workflow running live — phase log, parallel fix agents, retrying, background execution.

07:42 – 09:31

06 · Running the Workflow

Continues live demo — navigating /workflows UI, pausing and resuming, watching the verifying stage.

09:31 – 10:14

07 · Workflow 2: Dead Code Sweep

While loop up to 8 rounds, parallel removal with test-and-revert safety check, early exit when no dead code remains.

10:14 – 11:10

08 · Workflow 3: Personalized Outreach

CSV leads, Haiku research stage, Opus writing stage via pipeline(), model-switching per stage.

11:10 – 12:24

09 · Results

Triage-sentry completes — 7 sub-agents, 400K tokens, three fixes verified. Personalized outreach output folder shown.

12:24 – 12:46

10 · Workflow Creator

Mentions a GitHub skill that teaches Claude Code to generate workflow files; expects Anthropic to ship an official one.

12:46 – 13:31

11 · The Toolkit

Rapid summary of all workflow primitives: agent, parallel, pipeline, schema, phase log, args.

13:31 – 13:49

12 · Budgets

Shows budget.remaining() in a while loop — token-aware self-stopping loops to prevent runaway spend.

13:49 – 14:24

13 · My Suggestion

Recommends asking Claude Code to audit previous sessions and identify workflow opportunities.

14:24 – 15:04

14 · When to Workflow

The three-condition decision rule: repeatable, fans out, worth resuming. Everything else: just let Claude do it.

15:04 – 15:10

15 · Conclusion

CTA to newsletter and masterclass.

§ · Storyboard

Visual structure at a glance.

00:00 · hook · changelog
01:21 · promise · old pattern diagram
02:05 · value · token tax diagram
03:18 · value · code wrapper diagram
04:52 · value · triage-sentry.js walkthrough
07:42 · value · live workflow run
12:46 · value · toolkit summary
14:24 · cta · when to workflow

§ · Frameworks

Named ideas worth stealing.

14:24 list

When to Workflow (3 conditions)

Repeatable — you will run it over and over
Fans out — conditionals, loops, or parallel agents based on data
Worth resuming — long enough that a mid-run failure is costly

A portable decision rule for deciding when to invest in a workflow file vs. just running Claude manually.

Steal for: any decision framework for AI automation; use as an audience filter or CTA hook

03:18 model

Code-as-Orchestrator Pattern

Replace the model orchestrator with a JS file — agent(), parallel(), pipeline(), schema, phase(), args, budget — so context stays flat and joins are free.

Steal for: explaining deterministic multi-agent architecture to a non-technical audience

§ · Quotables

Lines you could clip.

02:05

"Four sub-agents is not one tax, it is a tax at every join. The more you fan out, the more you pay."

Visceral one-liner that reframes the cost model for anyone who has run multi-agent workflows. → TikTok hook

03:25

"Same five steps — one run pays a tax at every arrow; the other pays nothing."

The before/after contrast is complete in one sentence, no setup needed. → IG reel cold open

06:10

"Reviewing is no longer the model's decision, it is the file's."

Tight philosophical punchline about determinism vs. model drift. → newsletter pull-quote

14:38

"Repeatable, fans out, worth resuming — that is the sweet spot."

Portable decision rule, stands alone. → TikTok hook

§ · Resources Mentioned

Things they pointed at.

12:33 · tool · Workflow Creator skill (GitHub)

15:04 · link · Master Claude Code masterclass

§ · CTA Breakdown

How they asked for the click.

15:04 newsletter

"If you want to get free insights delivered from me on a regular basis, then you may want to join my newsletter as well linked below."

Soft close at the very end after the decision-rule summary; non-intrusive, no mid-video pitch.

§ 04 · The Script

Word for word.

HOOK: opening / re-engagement · CTA: the pitch · metaphor

00:00 [HOOK] Okay. So a few hours ago, Anthropic added a brand new feature that may fundamentally change the way that you use Claude Code going forwards, and that is by adding the workflow tool for deterministic multi-agent orchestration. Now they haven't yet made an announcement about this feature, so by watching this video, you're gonna be ahead of the curve.

00:17 So I'll be going through exactly what the feature is and how you can use it to be a Claude Code power user. So they do say that it is off by default, and we have to set this environment variable to enable it. So I will copy this over, go to the terminal, paste this in, add a space, and then run claude.

00:32 And after doing that, you will see we have a brand new slash command, which is /workflows, and this will allow you to browse your workflow history, running and completed. So going over here, I have no workflows I have run in this session.

00:44 So we basically gotta go ahead and make ourselves a workflow. But, essentially, I can go back to a previous session where I was testing out the workflows feature. And if I do /workflows inside of here, then I can see that these are all the workflows that I previously ran.

00:57 So I ran this, like, triage workflow for Sentry, then I can see that it looks like this. So we have different stages of a workflow that I predefined, where each workflow is running as an individual sub-agent. Then I can press enter on any of these workflows, and then I can see the run that happened, all the tool calls that were made, the prompt, and so forth.

01:14 And then I can go over to any stage of the workflow and see all the different agents that have run inside of the workflow, how many tools they use, tokens, and so forth. Okay. So before actually going ahead and making a workflow,

01:25 I wanna talk about what people have been doing until now before the update and why this new approach is better and solves a lot of the problems that people have been previously facing. So until now, you may have had some kind of skill that described a set of steps, a workflow, whereby Claude, your main session, would be the orchestrator,

01:42 and then tell a sub-agent to do something like implement a feature. It would then pass that information back into main session, and that would then go into the next sub-agent. That would then do a review, pass it back to main session, go to next sub-agent, and so forth.

01:55 So, essentially, the orchestrator is deciding what to run next and holding every intermediate result. And this is the old, less effective way of making workflows that people have been doing until now. Now this can have some drawbacks.

02:07 The first of all being a token tax. So every time Claude Code spins up a sub-agent and then it gets a result back, that goes into the main context window of the orchestrator,

02:17 and then it's passed again into the next sub-agent, goes back, and so forth. And this can consume quite a lot of tokens in your main session, especially if we have something like ten, fifteen sub-agents that run.

02:27 And your main orchestrator would do worse over time as its context window fills up with all these results, and these tokens are going back and forth unnecessarily between the main orchestrator and the sub-agents. So ideally, results should be passed directly from one to the next without ever entering the context window of the orchestrator.
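A back-of-the-envelope sketch can make the token-tax argument concrete. The figures (15 sub-agents, 5,000 tokens per result) are invented for illustration and do not come from the video, and the cost model is an assumption: every result is appended to the orchestrator's context, and the orchestrator re-reads its whole context on each subsequent turn.

```javascript
// Illustrative cost model only; the numbers and the re-read assumption are hypothetical.
function orchestratorTokens(agentCount, tokensPerResult) {
  let contextSize = 0;    // tokens sitting in the orchestrator's context
  let totalProcessed = 0; // tokens the orchestrator re-reads across all turns
  for (let i = 0; i < agentCount; i++) {
    contextSize += tokensPerResult; // a sub-agent result lands in the main context
    totalProcessed += contextSize;  // the next turn re-reads everything so far
  }
  return { contextSize, totalProcessed };
}

// Model-as-orchestrator: 15 sub-agents, 5,000 tokens per result.
console.log(orchestratorTokens(15, 5000));
// → { contextSize: 75000, totalProcessed: 600000 }

// Code-as-orchestrator: intermediate results never enter the main context,
// so under the same model both figures stay at 0 for the main session.
```

Under these assumptions the re-read cost grows quadratically with the number of joins, which is the sense in which the tax is paid "at every join" rather than once.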

02:44 We also have some other problems whereby you may trigger a workflow and you have no visibility into what is going on. You basically see some kind of, like, a wall of text scroll by as a workflow is triggering over the next twenty, thirty minutes. And finally, because the orchestrator's context window is filling up by passing tokens back and forth between different sub-agents, eventually, it just starts forgetting and acts more sloppy and lazy, which is not what you want from an orchestrator.

03:07 And this forgetting in the orchestrator can be even worse if you have conditionals. So you may have a conditional like, if we get this particular result, then spawn another sub-agent to verify it. If not, don't do that.

03:18 So then the Claude Code team were like, okay. So what if instead of having a nondeterministic model being a wrapper of all these sub-agents, we had a code wrapper instead?

03:28 So instead of a model passing information back and forth between sub-agents and incurring a token tax every single time and also getting sloppier the longer it runs, what if we just had a workflow file, some kind of code that could pass the result directly without ever entering the main conversation, which means that you can have, like, twenty, thirty, 100 agents run one after another and the context of the main orchestrator never actually filling because the orchestrator is now code instead.

03:54 So the way that the workflow tool works is we have a workflow.js file, which is a JavaScript file, and this basically defines how our workflow is gonna look. So we have phases that we can define. So for example, we can have a review phase, and then we can define an agent that runs, like does a basic review.

04:11 This is a prompt. And then if the review passes, it would just end as it is by doing a return. If it doesn't, then it will fix the issues.

04:19 And because this is code, we can also implement loops as well. So you can see in this example, we would have, like, implement the feature. And then over here, it would do up to three reviews.

04:28 And if the review passes, then it would just break. If it doesn't pass, then it would fix it with a different agent. So every time you see agent over here, that is a different sub-agent running.

04:37 And what's interesting about this as well is that we can define schemas as well. So we want this review sub-agent to return a schema of, like, either passed or issues that we can then reference in the next agent. Okay.
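The implement, review, fix loop described above might look roughly like the sketch below. The real agent() signature is not shown in this transcript, so `makeStubAgent`, the `{ schema }` option, and the `passed` / `issues` field names are all hypothetical stand-ins; the stub simply passes review on the Nth attempt so the loop has something to do.

```javascript
// Hypothetical stand-in for a sub-agent call. The real agent() signature is
// not confirmed; this stub "passes" review on the Nth review it receives.
function makeStubAgent(passOnReview) {
  let reviews = 0;
  return async function stubAgent(prompt, options = {}) {
    if (!prompt.startsWith("Review")) return { done: true }; // implement / fix calls
    reviews += 1;
    return reviews >= passOnReview
      ? { passed: true, issues: [] }
      : { passed: false, issues: [`issue from review ${reviews}`] };
  };
}

// Assumed schema shape, based on the "passed or issues" description.
const reviewSchema = { passed: "boolean", issues: "string[]" };

async function implementAndReview(agent, feature, maxReviews = 3) {
  await agent(`Implement the feature: ${feature}`);
  for (let round = 1; round <= maxReviews; round++) {
    const review = await agent(`Review the implementation of: ${feature}`, {
      schema: reviewSchema,
    });
    if (review.passed) return { passed: true, rounds: round };
    // The structured issues go straight into the next prompt, in code.
    await agent(`Fix these issues: ${review.issues.join("; ")}`);
  }
  return { passed: false, rounds: maxReviews };
}

implementAndReview(makeStubAgent(2), "dark mode toggle").then((result) =>
  console.log(result)
);
// → { passed: true, rounds: 2 }
```

The point of the sketch is that the loop bound and the pass/fail branch live in ordinary code, so they behave the same on the first round and the last.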

04:49 Now let's go through the process of actually making a workflow. So I'll be manually going through the file so you can see what it's like behind the scenes. But, ideally, you will be getting Claude Code to make you workflows instead.

04:59 So you wanna go to your .claude folder in your project and make a brand new folder, which is workflows. And then inside of this, we can define a workflow. So I'm gonna call this

05:08 triage-sentry.js. So this will be a JavaScript file. And, essentially, at the very top, we have to define a meta for the workflow.

05:16 So the workflow meta looks kind of like this. We have the name, the description that will appear inside of Claude Code, and then we have the different phases as well. So then we can define some schemas.

05:26 So this will be one of our schemas, which is an issues schema. So you can see over here, it has the issue ID, the title, and the user count of how many users are affected.

05:36 Then we will define another schema here, verdict, which is whether or not it was fixed and any notes about this. And then because we can pass in arguments into our workflows, for example, you can see an argument would be passed in kinda like this, minimum number of users affected,

05:50 20 users. We have to then parse the arguments so they are loaded in properly. So it looks kinda like this.

05:56 So what happens here is that 20 will be the default. So any issue that affects more than 20 users on Sentry will be loaded in unless we have specified an argument instead. So firstly, we'll define our first phase that will load in the issues.
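The default-of-20 behaviour can be sketched in plain JavaScript. How the Workflow tool actually delivers arguments to the file is not shown in this transcript, so the input shape here (a plain object of strings) and the names `parseArgs` and `minUsers` are assumptions for illustration.

```javascript
// Hypothetical: assumes arguments arrive as a plain object of strings.
const DEFAULT_MIN_USERS = 20;

function parseArgs(rawArgs = {}) {
  const parsed = Number.parseInt(rawArgs.minUsers, 10);
  // Fall back to the default when the argument is missing or not a number.
  return { minUsers: Number.isNaN(parsed) ? DEFAULT_MIN_USERS : parsed };
}

console.log(parseArgs());                   // → { minUsers: 20 }
console.log(parseArgs({ minUsers: "50" })); // → { minUsers: 50 }
```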

06:10 So phase over here will pull in issues. So we define an agent. So you can see that it says, use a Sentry MCP to list unresolved issues; for each, return its ID, title, and affected user count.

06:21 And the schema that we define up here for issues is also passed into agent, so it knows what kind of schema to return back. Now the interesting thing is that we can now define some plain JavaScript. So if we were to write something like this, then we can basically filter down the issues that are returned to the ones that are above the threshold that we defined earlier.

06:39 So this will then log inside of the workflow for us to see. And if no issues are present that are big enough, then it will just end the workflow as it is right over here. So say fixed, no issues affecting more than the threshold number of users.
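The plain-JavaScript step between agents, filtering by threshold and ending early when nothing qualifies, could be sketched as follows. The sample issues are invented, and the field names (`id`, `title`, `userCount`) and the shape of the returned summary are assumptions based on the schema described in the walkthrough, not the author's actual file.

```javascript
// Invented sample data; field names follow the issue schema as described.
const issues = [
  { id: "ISSUE-1", title: "Checkout crash", userCount: 140 },
  { id: "ISSUE-2", title: "Slow dashboard", userCount: 12 },
  { id: "ISSUE-3", title: "Login loop", userCount: 35 },
];

function triage(allIssues, minUsers = 20) {
  // "More than" the threshold, matching the wording in the walkthrough.
  const big = allIssues.filter((issue) => issue.userCount > minUsers);
  console.log(`${big.length} of ${allIssues.length} issues affect more than ${minUsers} users`);
  if (big.length === 0) {
    // Early exit: no fix agents are ever spawned.
    return { fixed: 0, note: `No issues affecting more than ${minUsers} users` };
  }
  return { toFix: big.map((issue) => issue.id) };
}

console.log(triage(issues));      // → { toFix: [ 'ISSUE-1', 'ISSUE-3' ] }
console.log(triage(issues, 500)); // → { fixed: 0, note: 'No issues affecting more than 500 users' }
```

Because the filter is ordinary code, the number of fix agents follows directly from the data, which is consistent with the live run spinning up only three sub-agents out of 25 issues.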

06:53 And now next up, we can define a pipeline. So this is our pipeline. And essentially, all the big issues that we have will be passed into a two-stage process.

07:02 The first stage, what it will do is that for each of the issues that are returned so let's say we have five issues that are returned, it will then investigate and fix the issue, and then it will go to next stage of verifying that the fix is actually real and working. And then finally, at the very bottom, we can say this of how many issues were actually fixed, how many, uh, issues we found, and the final results.

07:23 So the interesting thing that you will notice is that we're basically mixing in plain JavaScript with the sub-agent that will be running inside of Claude Code. So the issue ID from earlier, the title, and the user count are being passed directly into the prompt.

07:37 So they're not going back in through the main orchestrator. They're going in through the prompt instead. So after that, triage sentry, then you will see that it appears over here with the workflow tag next to it.

07:47 And if I press enter, then you will see that it begins to trigger a workflow. So it's triggering the triage sentry workflow, and it's running with zero out of one agents right now.

07:57 It has one out of three phases, and then I can see right over here as a background workflow. So if I go over here, then I think I have to zoom out. I can see right now it's pulling the issues.

08:07 If I press enter, then I can see that, uh, it's running here. Press enter again.

08:12 I can see the different tools that it's calling and the prompt and stuff like that, and then I can go back, or I can pause this workflow as well by pressing the p command. So let me go back to workflows and then press p again to resume it. And you can see the workflow has now returned a result of 25 unresolved Sentry issues.

08:29 So if I go back and then go over to next stage here, which is a fixing stage, then I can see that it only decided to spin up three sub-agents because only three of them were affecting more than 20 users. So all of these sub-agents are running in the background right now, and I can skip any of the sub-agents by pressing x or retrying them.

08:46 And one nice thing about this is that we have automatic retrying. So if one of the sub-agents were to fail for any reason, like the MCP server stopped working, then Claude Code would then retry that particular sub-agent. And then if I press down, I can see that the verifying has not been started yet.

09:00 So whilst this workflow is running in the background, I can just continue to send messages to Claude and work with Claude, or I can trigger another workflow as well. So I can have multiple different workflows running at the same time. So by default, it's using the Opus model of the main session, but I could define a different model.

09:16 So whilst we're waiting on this, let's go through a different workflow. So here is another workflow that I made, what's called dead code sweep. So find and remove unused code round by round.

09:27 So we don't have any arguments. We have the meta defined over here, so the name and description. Then we have some types defined over here, and then we've defined our own variables.

09:36 So how many rounds do we want in this loop? So then we have a while loop right over here whereby one agent will find unused code in the code base. It will then list it out according to schema that we defined earlier.

09:49 If no dead code has been found, then it would basically end. But if dead code has been found, then for each of those issues, each of that dead code, it would then remove it one by one. And then once this is over, it will then run another time up to eight times, and it will exit early if it finds out there's no more dead code available.
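The round-by-round sweep can be sketched with the sub-agents replaced by stubs. Everything named here (`makeFakeRepo`, `findDeadCode`, `removeOne`) is hypothetical; the stub finder reports at most two unused symbols per round so that several rounds are needed, and removal follows the "one by one" wording of the transcript.

```javascript
// Hypothetical stand-ins for the "find" and "remove" sub-agents.
function makeFakeRepo(deadSymbols) {
  const remaining = [...deadSymbols];
  return {
    // Finder stub: reports at most two unused symbols per round.
    async findDeadCode() {
      return remaining.slice(0, 2);
    },
    // Remover stub: deletes one symbol and reports success.
    async removeOne(symbol) {
      remaining.splice(remaining.indexOf(symbol), 1);
      return { symbol, removed: true };
    },
  };
}

async function deadCodeSweep(repo, maxRounds = 8) {
  let round = 0;
  const removed = [];
  while (round < maxRounds) {
    round += 1;
    const found = await repo.findDeadCode();
    if (found.length === 0) break; // early exit: nothing left to remove
    for (const symbol of found) {
      const result = await repo.removeOne(symbol); // one by one, as described
      if (result.removed) removed.push(symbol);
    }
  }
  return { rounds: round, removed };
}

deadCodeSweep(makeFakeRepo(["a", "b", "c", "d", "e"])).then((r) => console.log(r));
// → { rounds: 4, removed: [ 'a', 'b', 'c', 'd', 'e' ] }
```

With five dead symbols the sweep stops in round four, when the finder comes back empty, well short of the eight-round cap.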

10:07 So that's a pretty good example of combining loops inside of these workflows and adding some conditional logic as well. Another example is for personalizing outreach. So you may wanna load a set of leads from a file, research them inside of a sub-agent, and then draft and save a personalized message.

10:23 So we have our leads object up here, then it loads in any arguments such as the leads file, where we want our emails to be stored after we have written it, and then we have the different phases. So first phase will load in the leads from the file. We could also load it in programmatically if we have structured data like a CSV.

10:41 So the first pipeline will research the lead, and you can imagine this, like, does research with a cheap API or cheap MCP server. If it can't find anything for that lead, then it would switch over to a more expensive one in a conditional. And you can see this first model is running with Haiku over here, and then any information will be passed into the next stage where it will basically write that copy.

11:03 Now, of course, this is a simple example. You may wanna make it more complicated for your own workflows that you have defined. And now the workflow has completed.

11:11 So it used seven sub-agents, 400,000 tokens, and this was a stage that I went through. So we verified each of those fixes successfully,

11:19 and, yeah, this looks pretty good. Now let's actually do our personalized outreach workflow. So I have a list of leads over here that I'm gonna be contacting,

11:27 and this is completely for demo purposes. And now our workflow is underway. So going over to workflow, we can see that it's loading the leads right now.

11:36 And now we have eight research agents, all Haiku ones spawned in parallel with one agent per lead, and that's because we have eight leads inside of our CSV. And now after it researches all of them, so it seems to be going pretty fast, then it will move on to next stage where it will write eight different messages ideally for each of the leads separately.

11:55 And you can kind of imagine a workflow whereby, like, it couldn't really find the contact details using a lighter model, and then it switches over to a heavier model, and it uses more expensive APIs to get those details for the leads. So I finished writing the personalized outreach with the Opus model. So we can then see we have a brand new folder with the output, and then we have all the personalized outreach for Bill Gates or whatever, blah blah blah.

12:19 Like, this is outreach. And, of course, I don't think this would get a reply, but, like, this is for demo purposes. Now to make making a workflow easier for you, I have a workflow creator skill that you can download from below from my GitHub.

12:30 And this basically teaches Claude Code how to make a workflow file, all the functionality that is available, and stuff like that. But I think once the feature is officially announced, then maybe Anthropic will add their own official workflow creator skill, in which case you should use that once it is released. Anyways, so to summarize, we have a toolkit over here.

12:48 We have an agent where we spawn one fresh sub-agent every time. We can run them in parallel, so we can batch, like, 10 agents in parallel, wait for all of them to complete.

12:57 We can use pipeline, which streams items through stages instead. So for example, with the lead outreach workflow, if you are paying close attention, you may have noticed that as soon as one research agent was done, then the next writing a message agent would start immediately rather than waiting for all eight research agents to be done, and that is because of the pipeline here.
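The streaming behaviour described here, where a lead moves on to writing as soon as its own research finishes, can be approximated in plain JavaScript with promises. This is not the real pipeline() primitive, whose signature is not shown in the transcript; `pipelineSketch`, the stage functions, and the lead data are hypothetical, and `sleep` stands in for agent latency.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Plain-JS approximation of a streaming pipeline; NOT the real pipeline() primitive.
async function pipelineSketch(items, stages) {
  return Promise.all(
    items.map(async (item) => {
      let value = item;
      // Each item advances through the stages on its own schedule.
      for (const stage of stages) value = await stage(value);
      return value;
    })
  );
}

async function demo() {
  const log = [];
  const research = async (lead) => {
    await sleep(lead.researchMs); // stand-in for a cheap-model research agent
    log.push(`researched ${lead.name}`);
    return lead;
  };
  const write = async (lead) => {
    log.push(`writing for ${lead.name}`); // stand-in for a stronger-model writing agent
    await sleep(5);
    return `Draft for ${lead.name}`;
  };
  const leads = [
    { name: "Ada", researchMs: 10 },
    { name: "Grace", researchMs: 60 },
  ];
  const drafts = await pipelineSketch(leads, [research, write]);
  return { drafts, log };
}

demo().then(({ log }) => console.log(log));
// → [ 'researched Ada', 'writing for Ada', 'researched Grace', 'writing for Grace' ]
```

Writing for Ada starts before Grace's research has finished. A parallel-then-batch version would instead wait on the slowest research call before any writing began.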

13:16 So we can combine these both together, parallel and pipeline. Then we have a schema, so we get a structured answer back. We have the phase log, which basically gives us a live view of what is happening.

13:27 And then finally, we have the arguments that would be passed in as well. Now we also have budgets when it comes to our workflow too. So there is a budget parameter.

13:35 So for example, in this case, we can have a while loop. And whilst the budget remaining is more than 50,000 tokens, we can try and find end of a bug, for example. So this can kinda keep your workflow structured and prevent them from growing out of control.
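A budget-guarded loop of this kind might be sketched as below. Only the idea of checking the remaining budget against a 50,000-token floor comes from the video; the tracker `makeBudget`, its `spend()` method, and the `findBug` stub are hypothetical, and in the real tool the runtime presumably does the accounting itself.

```javascript
// Hypothetical budget tracker; only the remaining() idea comes from the video.
function makeBudget(totalTokens) {
  let used = 0;
  return {
    spend(tokens) { used += tokens; },
    remaining() { return totalTokens - used; },
  };
}

async function huntBugs(budget, findBug, reserve = 50000) {
  const bugs = [];
  // Keep spawning "find a bug" agents only while enough budget is left.
  while (budget.remaining() > reserve) {
    const { bug, tokensUsed } = await findBug();
    budget.spend(tokensUsed);
    if (!bug) break; // nothing more to find
    bugs.push(bug);
  }
  return bugs;
}

// Stub agent: always finds a bug and costs 40,000 tokens per call.
const fakeFindBug = async () => ({ bug: "off-by-one", tokensUsed: 40000 });
const demoBudget = makeBudget(200000);

huntBugs(demoBudget, fakeFindBug).then((bugs) =>
  console.log(bugs.length, demoBudget.remaining())
);
// → 4 40000
```

The loop stops after four calls because the fifth check sees 40,000 tokens remaining, which is below the 50,000 reserve. The run length is set by the budget rather than by a fixed agent count.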

13:49 Now I would suggest basically getting Claude Code to look through your previous sessions, identify any opportunities for making workflows, and make workflows around them. So for example, if you're a big fan of Ralph Loops and you like using Ralph Loops to quickly

14:02 go through a backlog of issues, for example, then you can make a workflow where it loads in the issues from GitHub. It will then go through a loop whereby for each issue, it will then make a fix. It will do verification.

14:15 [CTA] It may also do an adversarial review as well, and then it will move on to the next issue. So I've made a whole bunch over here that I am testing myself, like implement and review. So when should you be reaching for a workflow?

14:26 [CTA] Firstly, anytime you wanna do something repeatable, so you will be doing it over and over again, probably every single day, anytime you want to fan out agents, for example, based on conditionals or loops or getting some data in some way, and anything that may seem long enough to fail halfway. So you can split that down into workflow, which is automatically resumable because Claude Code will retry each sub-agent up to three times if it does fail.
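The retry behaviour can be illustrated with a generic wrapper. How Claude Code implements its retries is not described in the video, so `withRetry` is a hypothetical helper, and it reads "up to three times" as three attempts in total, which is an assumption.

```javascript
// Generic retry wrapper (hypothetical); treats "up to three times" as three attempts total.
async function withRetry(task, maxAttempts = 3) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return { value: await task(attempt), attempts: attempt };
    } catch (error) {
      lastError = error; // remember the failure and try again
    }
  }
  throw lastError; // every attempt failed
}

// Stub sub-agent: fails twice (say, an MCP server hiccup), then succeeds.
const flaky = async (attempt) => {
  if (attempt < 3) throw new Error(`MCP server unavailable (attempt ${attempt})`);
  return "fixed";
};

withRetry(flaky).then((result) => console.log(result));
// → { value: 'fixed', attempts: 3 }
```

Because each sub-agent is retried in isolation, a transient failure costs one agent's work rather than the whole run.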

14:50 [CTA] But for any one-off task, you probably should just let Claude do it manually. There's no point making a workflow because we get to take advantage of the fact that the results aren't being passed back and forth from the main session and the sub-agents. It just goes directly because of the code.

15:04 And if you want to get free insights delivered from me on a regular basis, then you may want to join my newsletter as well linked


§ 05 · For Joe

Replace the model with code and pay nothing at every join.

WHAT TO LEARN

The token tax in multi-agent pipelines is not a fixed cost — it multiplies at every sub-agent join, and moving orchestration into a JavaScript file eliminates it entirely.

Every time a sub-agent result returns to the main orchestrator context, you pay a token tax; with 15 agents that tax compounds until the orchestrator degrades and starts making sloppy decisions.
A workflow.js file keeps the main context flat by passing results directly between agents through code, not through the model — the orchestrator never reads intermediate outputs.
The pipeline() primitive starts downstream processing on the first completed item, not after all items finish, which means a 10-lead outreach job finishes faster than a naive parallel-then-batch approach.
Schemas on agent() calls give the next stage typed, structured data — instead of prompting a model to parse free text, the next agent receives named fields it can reference directly.
A budget.remaining() guard in a while loop is one way to prevent runaway token spend on iterative jobs; it ties the length of the run to the remaining token budget rather than to a fixed agent count.
Conditional branching is only reliable when it lives in code; inside a filling model context it degrades as the context grows, which is why complex if-then logic belongs in the workflow file.
The three-condition decision rule — repeatable, fans out, worth resuming — is a portable filter: if a job fails all three, there is no benefit to wrapping it in a workflow.
Auto-retry (up to 3 times per sub-agent) makes long workflows self-healing on transient failures like MCP server restarts, removing the need to babysit runs.
Mixing a cheap model (Haiku) for research stages and an expensive model (Opus) for writing stages within the same pipeline is a cost pattern the workflow file makes trivial to express.
