WEBVTT

00:00:00.000 --> 00:00:12.000
Okay. So a few hours ago, Anthropic added a brand new feature that may fundamentally change the way that you use Claude Code going forward, and that is the workflow tool for deterministic multi-agent orchestration.

00:00:12.000 --> 00:00:36.780
Now they haven't yet made an announcement about this feature, so by watching this video, you're gonna be ahead of the curve. So I'll be going through exactly what the feature is and how you can use it to be a Claude Code power user. So they do say that it is off by default, and we have to set this environment variable to enable it. So I will copy this over, go to the terminal, paste this in, add a space, and then run claude. And after doing that, you will see we have a brand new slash command, which is slash workflows,

00:00:36.940 --> 00:00:51.795
and this will allow you to browse your workflow history, both running and completed. So going over here, I have no workflows that I have run in this session. So we basically gotta go ahead and make ourselves a workflow. But, essentially, I can go back to a previous session where I was testing out the workflows feature.

00:00:51.955 --> 00:01:25.050
And if I do slash workflows inside of here, then I can see that these are all the workflows that I previously ran. So I ran this, like, triage workflow for Sentry, then I can see that it looks like this. So we have different stages of a workflow that I predefined, where each workflow is running as an individual subagent. Then I can press enter on any of these workflows, and then I can see the run that happened, all the tool calls that were made, the prompt, and so forth. And then I can go over to any stage of the workflow and see all the different agents that have run inside of the workflow, how many tools they used, tokens, and so forth. Okay. So before actually going ahead and making a workflow,

00:01:25.210 --> 00:01:42.185
I wanna talk about what people have been doing until now, before the update, and why this new approach is better and solves a lot of the problems that people have been facing. So until now, you may have had some kind of skill that described a set of steps, a workflow, whereby Claude, your main session, would be the orchestrator,

00:01:42.345 --> 00:02:07.805
and then tell a subagent to do something like implement a feature. It would then pass that information back into the main session, and that would then go into the next subagent. That would then do a review, pass it back to the main session, go to the next subagent, and so forth. So, essentially, the orchestrator is deciding what to run next and holding every intermediate result. And this is the old, less effective way of making workflows that people have been doing until now. Now this can have some drawbacks.

00:02:07.885 --> 00:02:10.205
The first one is a token tax.

00:02:10.365 --> 00:02:12.525
So every time Claude Code spins up a subagent

00:02:12.790 --> 00:02:17.510
and then it gets a result back, that goes into the main context window of the orchestrator,

00:02:17.590 --> 00:02:24.070
and then it's passed again into the next subagent, goes back, and so forth. And this can consume quite a lot of tokens in your main session,

00:02:24.310 --> 00:02:35.195
especially if we have something like ten, fifteen subagents that run. And your main orchestrator would do worse over time as its context window fills up with all these results, and these tokens are going back and forth unnecessarily

00:02:35.195 --> 00:03:11.670
between the main orchestrator and the subagents. So ideally, results should be passed directly from one subagent to the next without ever entering the context window of the orchestrator. We also have some other problems whereby you may trigger a workflow and you have no visibility into what is going on. You basically see some kind of wall of text scroll by as a workflow is running over the next twenty, thirty minutes. And finally, because the orchestrator's context window is filling up by passing tokens back and forth between different subagents, eventually, it just starts forgetting and acts more sloppy and lazy, which is not what you want from an orchestrator. And this forgetting in the orchestrator can be even worse if you have conditionals.
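
NOTE
A rough arithmetic sketch of the token tax described in this cue, not a measurement. It assumes each subagent result is relayed through the main session and stays in its context; the function names and the numbers (15 subagents, 4,000 tokens per result, a 10,000-token starting context) are made up for illustration.
```javascript
// Illustrative numbers only; real result sizes vary by task.
// Relay style: every subagent result lands in the orchestrator's context.
function relayContextTokens(baseTokens, subagentCount, tokensPerResult) {
  return baseTokens + subagentCount * tokensPerResult;
}
// Code-orchestrated style: results are handed from one subagent to the next in code,
// so the main session's context does not grow with the number of subagents.
function codeOrchestratedContextTokens(baseTokens) {
  return baseTokens;
}
console.log(relayContextTokens(10000, 15, 4000)); // 70000
console.log(codeOrchestratedContextTokens(10000)); // 10000
```
Under these assumptions the relayed results account for 60,000 of the 70,000 tokens in the main session.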

00:03:11.830 --> 00:03:23.675
So you may have a conditional like, if we get this particular result, then spawn another subagent to verify it. If not, don't do that. So then the Claude Code team were like, okay. So what if instead of having a nondeterministic

00:03:23.675 --> 00:03:41.480
model being a wrapper around all these subagents, we had a code wrapper instead? So instead of a model passing information back and forth between subagents and incurring a token tax every single time and also getting sloppier the longer it runs, what if we just had a workflow file, some kind of code that could pass the result directly

00:03:41.720 --> 00:05:04.280
without ever entering the main conversation, which means that you can have, like, twenty, thirty, 100 agents run one after another and the context of the main orchestrator never actually fills up, because the orchestrator is now code instead. So the way that the workflow tool works is we have a workflow.js file, which is a JavaScript file, and this basically defines how our workflow is gonna look. So we have phases that we can define. So for example, we can have a review phase, and then we can define an agent that runs, like, a basic review. This is a prompt. And then if the review passes, it would just end as it is by doing a return. If it doesn't, then it will fix the issues. And because this is code, we can implement loops as well. So you can see in this example, we would have, like, implement the feature. And then over here, it would do up to three reviews. And if the review passes, then it would just break. If it doesn't pass, then it would fix it with a different agent. So every time you see agent over here, that is a different subagent running. And what's interesting about this is that we can define schemas as well. So we want this review subagent to return a schema of, like, either passed or issues that we can then reference in the next agent. Okay. Now let's go through the process of actually making a workflow. So I'll be manually going through the file so you can see what it's like behind the scenes. But, ideally, you will be getting Claude Code to make workflows for you instead. So you wanna go to your .claude folder in your project and make a brand new folder, which is workflows.
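
NOTE
A minimal sketch of the implement-then-review loop described in this cue. It is plain JavaScript with stand-in functions, not the workflow tool's actual API: makeStubReviewer, implementAndReview, and the { passed, issues } result shape are assumptions made for illustration.
```javascript
// Hypothetical stand-in for a review subagent: fails a set number of times, then passes.
function makeStubReviewer(failuresBeforePass) {
  let calls = 0;
  return async function reviewAgent(prompt) {
    calls += 1;
    return calls > failuresBeforePass
      ? { passed: true, issues: [] }
      : { passed: false, issues: [`issue found in review ${calls}`] };
  };
}
// Review up to maxReviews times, running a fix step after each failed review.
async function implementAndReview(reviewAgent, fixAgent, maxReviews = 3) {
  const fixesApplied = [];
  for (let round = 1; round <= maxReviews; round++) {
    const review = await reviewAgent("Review the implementation");
    if (review.passed) {
      return { passed: true, rounds: round, fixesApplied };
    }
    // The issues go straight into the next prompt; no model relays them.
    fixesApplied.push(await fixAgent(`Fix these issues: ${review.issues.join("; ")}`));
  }
  return { passed: false, rounds: maxReviews, fixesApplied };
}
```
The loop bound lives in code, so the number of review rounds does not depend on a model remembering to stop.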

00:05:04.360 --> 00:05:08.760
And then inside of this, we can define a workflow. So I'm gonna call this triage

00:05:08.760 --> 00:05:26.520
sentry.js. So this will be a JavaScript file. And, essentially, at the very top, we have to define a meta for the workflow. So the workflow meta looks kind of like this. We have the name, the description that will appear inside of Claude Code, and then we have the different phases as well. So then we can define some schemas.

00:05:26.600 --> 00:05:32.920
So this will be one of our schemas, which is an issues schema. So you can see over here, it has the issue ID,

00:05:33.080 --> 00:05:45.315
the title, and the user count of how many users are affected. Then we will define another schema here, verdict, which is whether or not it was fixed and any notes about this. And then because we can pass arguments into our workflows,
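
NOTE
A sketch of what the two schemas described here could look like as plain JavaScript objects, with a tiny checker. The field names (issueId, title, userCount, fixed, notes), the JSON-Schema-like layout, and matchesSchema are assumptions for illustration; the video does not spell out the exact format the workflow tool expects.
```javascript
// Assumed field names; the video mentions an issue ID, a title, and a user count.
const issueSchema = {
  type: "object",
  required: ["issueId", "title", "userCount"],
  properties: {
    issueId: { type: "string" },
    title: { type: "string" },
    userCount: { type: "number" },
  },
};
// Verdict: whether the issue was fixed, plus free-form notes.
const verdictSchema = {
  type: "object",
  required: ["fixed", "notes"],
  properties: { fixed: { type: "boolean" }, notes: { type: "string" } },
};
// Minimal checker for flat objects: each required key must have the declared primitive type.
function matchesSchema(schema, value) {
  if (typeof value !== "object" || value === null) return false;
  return schema.required.every(
    (key) => typeof value[key] === schema.properties[key].type
  );
}
```
A structured result like this is what lets later plain-JavaScript steps filter and count without parsing free text.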

00:05:45.395 --> 00:05:50.590
for example, you can see an argument would be passed in kinda like this: minimum number of users affected,

00:05:50.670 --> 00:06:31.140
20 users. We then have to parse the arguments so they are loaded in properly. So it looks kinda like this. So what happens here is that 20 will be the default. So any issue that affects more than 20 users on Sentry will be loaded in unless we have specified an argument instead. So firstly, we'll define our first phase that will load in the issues. So the phase over here will pull in issues. So we define an agent. So you can see that it says, use the Sentry MCP to list unresolved issues; for each, return its ID, title, and affected user count. And the schema that we defined up here for issues is also passed into the agent, so it knows what kind of schema to return. Now the interesting thing is that we can now define some plain JavaScript.
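
NOTE
A small sketch of parsing a workflow argument with a default of 20, as described in this cue. The argument format ("min-users=50") and the name parseMinUsers are hypothetical; the video does not show how the workflow tool actually hands arguments to the file.
```javascript
// Hypothetical argument format: a string such as "min-users=50".
// Falls back to the default when the argument is missing or not present in the string.
function parseMinUsers(rawArgs, defaultMinUsers = 20) {
  const match = /min-users=(\d+)/.exec(rawArgs ?? "");
  return match ? Number(match[1]) : defaultMinUsers;
}
console.log(parseMinUsers("")); // 20
console.log(parseMinUsers("min-users=50")); // 50
```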

00:06:31.140 --> 00:07:28.230
So if we were to write something like this, then we can basically filter the issues that are returned down to the ones that are above the threshold that we defined earlier. So this will then log inside of the workflow for us to see. And if no issues are present that are big enough, then it will just end the workflow as it is right over here. So it would say fixed, no issues affecting more than the threshold number of users. And now next up, we can define a pipeline. So this is our pipeline. And essentially, all the big issues that we have will be passed into a two-stage process. In the first stage, for each of the issues that are returned, so let's say we have five issues that are returned, it will investigate and fix the issue, and then it will go to the next stage of verifying that the fix is actually real and working. And then finally, at the very bottom, we can return a summary of how many issues were actually fixed, how many issues we found, and the final results. So the interesting thing that you will notice is that we're basically mixing in plain JavaScript
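
NOTE
A sketch of the filter, early exit, two-stage fix-and-verify flow, and final summary described in this cue. selectBigIssues and triage are illustrative names, and fixAgent and verifyAgent are stand-ins for subagent calls rather than the workflow tool's real functions. Promise.all is used here only to express that each issue runs through both stages independently.
```javascript
// Keep only issues that affect more users than the threshold.
function selectBigIssues(issues, minUsers) {
  return issues.filter((issue) => issue.userCount > minUsers);
}
// fixAgent and verifyAgent are hypothetical stand-ins for subagent calls.
async function triage(issues, minUsers, fixAgent, verifyAgent) {
  const big = selectBigIssues(issues, minUsers);
  if (big.length === 0) {
    // Early exit: nothing is above the threshold, so no subagents are spawned.
    return { found: issues.length, fixed: 0, results: [] };
  }
  // Each issue goes through fix, then verify; issue fields are interpolated into the prompts.
  const results = await Promise.all(
    big.map(async (issue) => {
      const fix = await fixAgent(
        `Investigate and fix ${issue.issueId}: ${issue.title} (${issue.userCount} users)`
      );
      const verdict = await verifyAgent(`Verify this fix for ${issue.issueId}: ${fix}`);
      return { issueId: issue.issueId, ...verdict };
    })
  );
  const fixed = results.filter((result) => result.fixed).length;
  return { found: issues.length, fixed, results };
}
```
The filtering and counting are ordinary code, so they cost no tokens and behave the same way on every run.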

00:07:28.095 --> 00:07:54.370
with the subagents that will be running inside of Claude Code. So the issue ID from earlier, the title, and the user count are being passed directly into the prompt. So they're not going back in through the main orchestrator. They're going in through the prompt instead. So after that, if I type triage sentry, then you will see that it appears over here with the workflow tag next to it. And if I press enter, then you will see that it begins to trigger a workflow. So it's triggering the triage sentry workflow,

00:07:54.795 --> 00:08:08.650
and it's running with zero out of one agents right now. It has one out of three phases, and then I can see it right over here as a background workflow. So if I go over here, then I think I have to zoom out. I can see right now it's pulling the issues. If I press enter,

00:08:08.970 --> 00:09:31.915
then I can see that it's running here. Press enter again. I can see the different tools that it's calling and the prompt and stuff like that, and then I can go back, or I can pause this workflow as well by pressing p. So let me go back to workflows and then press p again to resume it. And you can see the workflow has now returned a result of 25 unresolved Sentry issues. So if I go back and then go over to the next stage here, which is the fixing stage, then I can see that it only decided to spin up three subagents because only three of the issues were affecting more than 20 users. So all of these subagents are running in the background right now, and I can skip any of the subagents by pressing x, or retry them. And one nice thing about this is that we have automatic retrying. So if one of the subagents were to fail for any reason, like the MCP server stopped working, then Claude Code would retry that particular subagent. And then if I press down, I can see that the verifying has not been started yet. So whilst this workflow is running in the background, I can just continue to send messages to Claude and work with Claude, or I can trigger another workflow as well. So I can have multiple different workflows running at the same time. So by default, it's using the Opus model of the main session, but I could define a different model. So whilst we're waiting on this, let's go through a different workflow. So here is another workflow that I made, which is called dead code sweep. So find and remove unused code round by round. So we don't have any arguments. We have the meta defined over here, so the name and description.

00:09:32.440 --> 00:09:36.600
Then we have some types defined over here, and then we've defined our own variables.

00:09:36.920 --> 00:10:15.950
So how many rounds do we want in this loop? So then we have a while loop right over here whereby one agent will find unused code in the code base. It will then list it out according to the schema that we defined earlier. If no dead code has been found, then it would basically end. But if dead code has been found, then for each piece of dead code, it would then remove it one by one. And then once this is over, it will run again, up to eight times, and it will exit early if it finds out there's no more dead code available. So that's a pretty good example of combining loops inside of these workflows and adding some conditional logic as well. Another example is for personalizing outreach.
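
NOTE
A sketch of the round-by-round dead code sweep loop described in this cue: up to eight rounds, removing items one by one, with an early exit when a round finds nothing. deadCodeSweep, findAgent, and removeAgent are illustrative stand-ins, not the file shown in the video.
```javascript
// findAgent and removeAgent are hypothetical stand-ins for subagent calls.
// findAgent resolves to a list of unused-code descriptions (empty when none are left).
async function deadCodeSweep(findAgent, removeAgent, maxRounds = 8) {
  let round = 0;
  let removed = 0;
  while (round < maxRounds) {
    round += 1;
    const found = await findAgent(`Round ${round}: list unused code`);
    if (found.length === 0) break; // nothing left, exit early
    for (const item of found) {
      await removeAgent(`Remove unused code: ${item}`); // removed one by one
      removed += 1;
    }
  }
  return { rounds: round, removed };
}
```
The round that finds nothing still counts toward rounds, since a find step ran before the loop exited.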

00:10:16.365 --> 00:10:34.410
So you may wanna load a set of leads from a file, research them inside of a subagent, and then draft and save a personalized message. So we have our leads object up here, then it loads in any arguments such as the leads file and where we want our emails to be stored after we have written them, and then we have the different phases.

00:10:34.570 --> 00:10:49.225
So the first phase will load in the leads from the file. We could also load them in programmatically if we have structured data like a CSV. So the first pipeline stage will research the lead, and you can imagine this, like, does research with a cheap API or cheap MCP server.

00:10:49.385 --> 00:11:14.485
If it can't find anything for that lead, then it would switch over to a more expensive one in a conditional. And you can see this first model is running with Haiku over here, and then any information will be passed into the next stage where it will basically write that copy. Now, of course, this is a simple example. You may wanna make it more complicated for your own workflows that you have defined. And now the workflow has completed. So it used seven subagents, 400,000 tokens,
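
NOTE
A sketch of the cheap-first, escalate-on-miss conditional described in this cue. researchLead and researchWith are hypothetical, and the strings "haiku" and "opus" are used only as labels here; how a workflow file actually selects a model is not shown in enough detail to reproduce.
```javascript
// researchWith is a hypothetical stand-in: (modelLabel, lead) resolves to findings or null.
async function researchLead(lead, researchWith) {
  const cheap = await researchWith("haiku", lead);
  if (cheap !== null) {
    return { model: "haiku", findings: cheap };
  }
  // The cheap pass found nothing, so escalate to the more expensive one.
  const thorough = await researchWith("opus", lead);
  return { model: "opus", findings: thorough };
}
```
Because the condition is an ordinary if statement, the expensive pass only runs for leads where the cheap pass came back empty.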

00:11:14.725 --> 00:11:19.285
and these were the stages that it went through. So we verified each of those fixes successfully,

00:11:19.765 --> 00:11:23.525
and, yeah, this looks pretty good. Now let's actually do our personalized outreach workflow.

00:11:24.020 --> 00:11:27.700
So I have a list of leads over here that I'm gonna be contacting,

00:11:27.940 --> 00:11:32.100
and this is completely for demo purposes. And now our workflow is underway.

00:11:32.260 --> 00:11:45.155
So going over to workflows, we can see that it's loading the leads right now. And now we have eight research agents, all Haiku ones, spawned in parallel with one agent per lead, and that's because we have eight leads inside of our CSV.

00:11:45.235 --> 00:11:52.755
And now after it researches all of them, and it seems to be going pretty fast, it will move on to the next stage where it will write eight different messages,

00:11:53.330 --> 00:12:53.220
ideally for each of the leads separately. And you can kind of imagine a workflow whereby, like, it couldn't really find the contact details using a lighter model, and then it switches over to a heavier model, and it uses more expensive APIs to get those details for the leads. So it finished writing the personalized outreach with the Opus model. So we can then see we have a brand new folder with the output, and then we have all the personalized outreach for Bill Gates or whatever, blah blah blah. Like, this is outreach. And, of course, I don't think this would get a reply, but, like, this is for demo purposes. Now to make making a workflow easier for you, I have a workflow creator skill that you can download from below from my GitHub. And this basically teaches Claude Code how to make a workflow file, all the functionality that is available, and stuff like that. But I think once the feature is officially announced, then maybe Anthropic will add their own official workflow creator skill, in which case you should use that once it is released. Anyways, so to summarize, we have a toolkit over here. We have agent, where we spawn one fresh subagent every time. We can run them in parallel,

00:12:53.380 --> 00:13:04.145
so we can batch, like, 10 agents in parallel and wait for all of them to complete. We can use pipeline, which streams items through stages instead. So for example, with the lead outreach workflow,

00:13:04.225 --> 00:13:35.575
if you were paying close attention, you may have noticed that as soon as one research agent was done, then the next message-writing agent would start immediately rather than waiting for all eight research agents to be done, and that is because of the pipeline here. So we can combine both of these together, parallel and pipeline. Then we have a schema, so we get a structured answer back. We have the phase log, which basically gives us a live view of what is happening. And then finally, we have the arguments that would be passed in as well. Now we also have budgets when it comes to our workflow too. So there is a budget parameter.
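
NOTE
A sketch of the difference between batching in parallel and streaming through a pipeline, as described in this cue. runBatched and runPipelined are illustrative names built on the standard Promise.all; they are not the workflow tool's parallel and pipeline helpers, whose signatures the video does not fully show.
```javascript
// Batch style: every item finishes stage one before any item starts stage two.
async function runBatched(items, stageOne, stageTwo) {
  const firstResults = await Promise.all(items.map((item) => stageOne(item)));
  return Promise.all(firstResults.map((result) => stageTwo(result)));
}
// Pipeline style: each item starts stage two as soon as its own stage one is done,
// without waiting for slower items.
async function runPipelined(items, stageOne, stageTwo) {
  return Promise.all(
    items.map(async (item) => stageTwo(await stageOne(item)))
  );
}
```
With one fast lead and one slow lead, the pipelined version starts writing the fast lead's message while the slow lead is still being researched; the batched version waits for both.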

00:13:35.735 --> 00:13:56.130
So for example, in this case, we can have a while loop. And whilst the budget remaining is more than 50,000 tokens, we can try and find another bug, for example. So this can kinda keep your workflows structured and prevent them from growing out of control. Now I would suggest basically getting Claude Code to look through your previous sessions, identify any opportunities for making workflows,
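
NOTE
A sketch of a budget-guarded while loop like the one described in this cue. makeBudget, huntBugs, and the { bug, tokensUsed } result shape are hypothetical; the real budget parameter may expose remaining tokens differently.
```javascript
// Hypothetical budget tracker; the real budget parameter may have a different shape.
function makeBudget(totalTokens) {
  let remaining = totalTokens;
  return {
    remaining: () => remaining,
    spend: (tokens) => {
      remaining -= tokens;
    },
  };
}
// Keep hunting while more than minRemaining tokens are left.
// huntAgent is a stand-in that resolves to { bug, tokensUsed }, with bug null when none is found.
async function huntBugs(budget, huntAgent, minRemaining = 50000) {
  const bugs = [];
  while (budget.remaining() > minRemaining) {
    const { bug, tokensUsed } = await huntAgent(`Find a bug not in: ${bugs.join(", ")}`);
    budget.spend(tokensUsed);
    if (bug === null) break;
    bugs.push(bug);
  }
  return bugs;
}
```
The check happens before each round, so a round that starts just above the threshold can still finish below it.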

00:13:56.210 --> 00:14:02.050
and make workflows around them. So for example, if you're a big fan of Ralph Loops and you like using Ralph Loops to quickly

00:14:02.465 --> 00:14:15.025
go through a backlog of issues, for example, then you can make a workflow where it loads in the issues from GitHub. It will then go through a loop whereby for each issue, it will make a fix. It will do verification.

00:14:15.185 --> 00:14:42.015
It may also do an adversarial review as well, and then it will move on to the next issue. So I've made a whole bunch over here that I am testing myself, like implement and review. So when should you be reaching for a workflow? Firstly, anytime you wanna do something repeatable, so you will be doing it over and over again, probably every single day, anytime you want to fan out agents, for example, based on conditionals or loops or getting some data in some way, and anything that may seem long enough to fail halfway.

00:14:42.175 --> 00:15:10.195
So you can split that down into a workflow, which is automatically resumable because Claude Code will retry each subagent up to three times if it does fail. But for any one-off task, you probably should just prompt Claude Code manually. There's no point making a workflow for that. With a workflow, we get to take advantage of the fact that the results aren't being passed back and forth between the main session and the subagents. It just goes directly because of the code. And if you want to get free insights delivered from me on a regular basis, then you may want to join my newsletter as well linked
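
NOTE
A sketch of the retry behavior mentioned in this cue, written as a generic helper. withRetries is an illustrative name, and treating "up to three times" as three total attempts is an assumption; the video does not say whether the first attempt counts. Claude Code handles this retrying itself, so a helper like this only illustrates the idea.
```javascript
// Run a task, retrying on failure, for at most maxAttempts total attempts.
// Resolves with the value and the attempt number that succeeded; rethrows the last error otherwise.
async function withRetries(task, maxAttempts = 3) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return { value: await task(attempt), attempts: attempt };
    } catch (error) {
      lastError = error; // remember the failure and try again
    }
  }
  throw lastError;
}
```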