WEBVTT

00:00:00.080 --> 00:00:18.235
Claude workflows dropped about a week ago, and they're not really getting enough airtime. And I think the reason for that is that they can feel a little bit abstract, like, hey, you can run something called a workflow now, and it'll burn millions of tokens. But there are some real gems that, dare I say, are

00:00:18.795 --> 00:00:24.475
game changers. So I'm gonna break down what workflows really are, how they work, and the six patterns

00:00:24.635 --> 00:00:42.400
that you can use to make them with real examples. And if you want any of the ones that I show you, there will be a link in the description below. And so credit where credit is due, the best description of what these things are and how you can use them actually comes from Anthropic itself. So a lot of the things I'm gonna go through in this video are based on this documentation.

00:00:42.905 --> 00:00:44.665
So at the most abstract

00:00:44.665 --> 00:00:45.305
level,

00:00:45.545 --> 00:00:49.305
Claude Code itself is an agent harness that is mostly

00:00:49.305 --> 00:00:56.265
built for coding, even though there are other use cases and types of tasks. But inevitably, there are these certain types of tasks

00:00:56.345 --> 00:01:04.560
that you might wanna do really well that Claude Code itself out of the box isn't particularly great at. So things like researching,

00:01:04.800 --> 00:01:06.080
security analysis,

00:01:06.320 --> 00:01:09.040
using agent teams, running code reviews,

00:01:09.120 --> 00:01:11.360
those are the types of things where Claude Code

00:01:11.360 --> 00:01:12.160
itself

00:01:12.355 --> 00:01:15.955
isn't super well suited to actually do those things in-depth.

00:01:15.955 --> 00:01:19.555
And so what workflows do is they allow you to dynamically

00:01:19.635 --> 00:01:27.490
create these, like, different harnesses on the fly. So that, like, in a pretty easy way, you can create these reusable

00:01:27.890 --> 00:01:28.690
harnesses

00:01:28.770 --> 00:01:35.250
that can solve pretty complicated problems in a much easier way. So a few plugins out there, like Compound Engineering,

00:01:35.330 --> 00:01:54.530
for example, they tried to build these types of workflows into their skill files in the way that they would use sub agents, but now that stuff is just supported directly inside of Claude Code. So one example just to, like, ground everything we're about to talk about, this new deep research that rolled out with Opus 4.8 and these workflow releases

00:01:54.610 --> 00:01:57.490
is an example of a dynamic workflow.

00:01:57.490 --> 00:01:59.810
So what this thing specifically does

00:01:59.970 --> 00:02:03.730
is it can fan out and do a bunch of different web searches.

00:02:03.810 --> 00:02:05.330
It can fetch all of the sources.

00:02:05.925 --> 00:02:07.605
Then it adversarially

00:02:07.605 --> 00:02:10.485
verifies the claims with separate sub agents.

00:02:10.645 --> 00:02:18.005
And then finally, it synthesizes all of that back to you in a cited report. So this is an example of a dynamic workflow.

00:02:18.005 --> 00:02:26.370
And so this is something that you could have tried to force together in the past with different, like, skills and commands and trying to, like, manually call them and string them together.

00:02:26.530 --> 00:02:43.865
But now it's better, it's more seamless, it's supported natively, and it's easy for you to create these things with natural language. So how do they work? Uh, first things first is that you need to have the agent, obviously, that's powering all of this stuff. But then there are two things that make this, like, different and what constitute it being a workflow.

00:02:43.945 --> 00:02:52.130
Number one is that it can run things in parallel. And so in this case, what happens is that you can pass it, like, all of these different things that it needs to accomplish,

00:02:52.290 --> 00:02:57.570
and it can run all of them at the same time, and then it waits for all of them to finish.

00:02:57.650 --> 00:03:11.925
So for example, if you were gonna run, like, a deep research report, it would go out, it would run all of those like search queries and exploring your code base and all that stuff at the same time, and then would wait for all of them to come back together. The second piece is a pipeline.

00:03:11.925 --> 00:03:25.480
And so in the cases where things don't have to wait for all of them to be done and they can be run in order, we are running a pipeline. Now the thing is when we're working with workflows, it doesn't need to have both of these. Either of them can be used individually,

00:03:25.480 --> 00:03:35.165
or depending on the workflow, they can be combined. So here's, like, very concretely what this might look like in practice. So if you were to ask, for example, should we migrate our checkout service to a new provider?
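
NOTE
A minimal Python sketch of the two building blocks described above, running in parallel and running as a pipeline. It is not Claude Code's workflow format; run_agent and the task names are hypothetical stand-ins for real sub agent calls.
```python
import asyncio
# Hypothetical stand-in for a sub agent call; a real one would invoke a model.
async def run_agent(task: str) -> str:
    await asyncio.sleep(0)  # yield control so tasks can interleave
    return f"result({task})"
# Parallel: start every task at once, then wait for all of them to finish.
async def parallel(tasks: list[str]) -> list[str]:
    return list(await asyncio.gather(*(run_agent(t) for t in tasks)))
# Pipeline: run stages in order, feeding each output into the next stage.
async def pipeline(stages: list[str], seed: str) -> str:
    value = seed
    for stage in stages:
        value = await run_agent(f"{stage}:{value}")
    return value
# Combined: a parallel research step feeding a sequential verify/summarize step.
async def main() -> str:
    findings = await parallel(["search docs", "read billing code", "price volume"])
    return await pipeline(["verify", "summarize"], " | ".join(findings))
report = asyncio.run(main())
```
Either helper can be used on its own, or they can be chained as in main, which mirrors the point that a workflow does not need both.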

00:03:35.325 --> 00:03:36.845
Inside of, like, the static

00:03:36.925 --> 00:03:41.405
traditional, like, Claude Code harness, this might turn into, like, five different web searches.

00:03:41.870 --> 00:03:44.910
It fetches the top results from those web searches.

00:03:45.150 --> 00:03:46.510
It verifies

00:03:46.590 --> 00:03:58.015
what it found, and then it summarizes it back to you and gives you a generic research report. And so, for example, like, you can have a skill that instructed it, like, hey, you need to go through these stages. So what makes a workflow

00:03:58.095 --> 00:04:14.800
different? Well, in a dynamic workflow, it might first go out and read our actual, like, current, like, Stripe integration or billing code. So maybe it identifies that there's three different, like, primary functions inside of the app that handle these things. It then takes all of that as context

00:04:14.800 --> 00:04:20.640
and checks each of our features against, like, the documentation of the new provider that we're considering.

00:04:20.880 --> 00:04:26.745
It also has a separate agent that's trying to price out, like, based on the volume that we're doing.

00:04:27.065 --> 00:04:37.065
What is the cost consideration within this? And then maybe you have an adversarial review stage that kicks in, and it takes all of, like, the individual context that it's been getting from these different sections.

00:04:37.560 --> 00:04:40.360
And maybe it tries to make a case against

00:04:40.360 --> 00:04:41.320
migrating.

00:04:41.320 --> 00:04:45.320
And so the result of this is that you get a much more specific

00:04:45.560 --> 00:04:53.240
set of recommendations out the other side. So this is, like, fundamentally what the difference is between, like, a static harness

00:04:52.805 --> 00:04:56.325
versus these, like, kind of, like, customized harnesses

00:04:56.405 --> 00:05:09.380
via a dynamic workflow. So in a bit, we're gonna go through, like, some very concrete examples of these things. But first, I wanna walk you guys through these six patterns that can be used to construct any number of workflows.

00:05:09.460 --> 00:05:27.885
So again, the one thing to keep in mind as I go through all of these things is that they can and probably should be combined together. So we're gonna go through again some of those cool use cases after we talk about what each of these things actually are. The first pattern up is called classify and act. And this is where you would give it some sort of task.

00:05:27.965 --> 00:05:29.325
It can then classify

00:05:29.325 --> 00:05:48.750
that task and like what it really means. And then it can route that task to specific agents, depending on, like, the logic that you set of how it chooses to actually route something. So a really simple example of something like this might be, like, handling, like, bugs and edge cases inside of your app. So you could have a workflow that classifies

00:05:48.750 --> 00:05:52.195
those bugs based on, like, the severity or the complexity.

00:05:52.435 --> 00:06:16.720
And then depending on exactly where it falls in that spectrum, maybe you're handing certain things off to, like, Haiku to solve, like, very basic things. Maybe you pass, like, medium bugs off to, like, Sonnet with a systematic debugging type of skill. And then maybe your most complex issues get passed into, like, Opus, for example. So this would be an example of the classify and act pattern. The next one up is called fan out and synthesize.
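
NOTE
A rough Python sketch of the classify and act pattern for the bug example above. The scoring rule, the ROUTES table and the skill name are illustrative assumptions; in a real workflow an agent would do the classification.
```python
# Hypothetical routing table: severity label mapped to a model tier and an optional skill.
ROUTES = {
    "basic": {"model": "haiku", "skill": None},
    "medium": {"model": "sonnet", "skill": "systematic-debugging"},
    "complex": {"model": "opus", "skill": None},
}
# Toy classifier; in a real workflow an agent would judge severity and complexity.
def classify(bug: dict) -> str:
    score = bug["files_touched"] + (3 if bug["crashes"] else 0)
    if score <= 1:
        return "basic"
    return "medium" if score <= 4 else "complex"
# Act: look up the route for the label and hand the bug to that tier.
def route(bug: dict) -> dict:
    label = classify(bug)
    return {"bug": bug["title"], "label": label, **ROUTES[label]}
decision = route({"title": "typo in footer", "files_touched": 1, "crashes": False})
```
Keeping the routing logic in a plain table makes it easy to see, and change, which tier handles which class of bug.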

00:06:16.995 --> 00:06:30.115
So this is where you take a task, you split it up into a sequence of smaller steps. Each of these steps runs individually, like, as sub agents. And then the results of all of them get passed back to a synthesizer

00:06:30.000 --> 00:06:37.840
that returns you the end result that you were looking for. So again, a really well known example of this is going to be like a research style workflow.
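
NOTE
A small Python sketch of fan out and synthesize, assuming each sub agent can be modeled as a plain function. The research and synthesize helpers are hypothetical placeholders, not part of any real tool.
```python
from concurrent.futures import ThreadPoolExecutor
# Hypothetical sub agent: each one researches a single slice of the question.
def research(subtask: str) -> dict:
    return {"subtask": subtask, "finding": f"notes on {subtask}"}
# Synthesizer: merges every sub agent result into one report.
def synthesize(question: str, results: list[dict]) -> str:
    lines = [f"- {r['subtask']}: {r['finding']}" for r in results]
    return f"Report for: {question}\n" + "\n".join(lines)
def fan_out_and_synthesize(question: str, subtasks: list[str]) -> str:
    # map() runs the sub agents concurrently and returns results in input order.
    with ThreadPoolExecutor(max_workers=len(subtasks)) as pool:
        results = list(pool.map(research, subtasks))
    return synthesize(question, results)
report = fan_out_and_synthesize(
    "How should we add caching?",
    ["explore the code base", "read provider docs", "survey common use cases"],
)
```
The synthesizer only runs once every sub agent has returned, which is the waiting step the pattern relies on.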

00:06:37.920 --> 00:06:43.695
So maybe it's going out first and it's exploring your code base, then it's pulling in, like, actual documentation

00:06:43.695 --> 00:07:00.340
from providers, maybe using something like Context7. Then it's looking for, like, common use cases that are circulating online. And then after it has all of these things, it's gonna synthesize all of that into a final result and report back to you. So number three and one of my favorites of these is the adversarial,

00:07:00.340 --> 00:07:02.180
like, review and verification

00:07:02.260 --> 00:07:12.485
type. And so one of the big downsides of language models is that they really tend to lock in on narratives. Meaning, if something comes out of the context that you've been generating,

00:07:12.565 --> 00:07:16.405
it might find one thread there, turn that into some sort of story,

00:07:16.565 --> 00:07:22.165
and then start jamming that story down your throat, when in reality, like, the claim isn't that verified,

00:07:22.165 --> 00:07:23.845
and it just locked into some direction.

00:07:24.400 --> 00:07:36.480
And now it has, like, blinders on, and it won't consider anything outside of that. And so the thing that's really cool about this adversarial approach is that we can challenge all of the claims and premises with a fresh verification

00:07:36.480 --> 00:07:38.185
or, like, adversary.

00:07:38.265 --> 00:07:44.905
That is really, like, a forcing function. Like, it's forcing the model to debate back and forth with itself with fresh context

00:07:45.065 --> 00:08:14.215
so that any of that, like, narrative lock that was taking place kind of fades away, and you can spin up a bunch of these at the same time. So for example, each of these different, like, adversarial verifications that are taking place can all have, like, different definitions and scopes of what they're actually looking for when they're evaluating this claim. So the fourth is called generate and filter. This is one that I think compound engineering does really well, where we have different agents that are solely responsible for generating, like, ideas, for example,

00:08:14.455 --> 00:08:17.415
with different prompts about, like, how they specifically

00:08:17.495 --> 00:08:19.255
think about generating ideas.

00:08:19.415 --> 00:08:27.650
Then you get a big list of ideas out the other side, and you can actually filter them down based on a, like, a rubric, a set of criteria,

00:08:27.890 --> 00:08:46.685
overlap between them. And then finally, from that, you get, like, the best ideas out the other side, and the ones that didn't fit your criteria get discarded. And so you can imagine you wanna, like, brainstorm different approaches to some feature that you have in your mind. You could kick off three different sub agents that are thinking about how to solve that problem from different perspectives.

00:08:46.845 --> 00:09:16.350
And then you could filter all of the ideas that pop out the other side based on, like, the context of your app, the dos and don'ts of your app, maybe, like, your business or marketing standards, depending on, like, the use case that you're using this for, because this isn't just for coding use cases. And then again, you're returning just the best ideas, like where maybe you wanna take that that feature, maybe what the UX approach for that feature should be. It's really sky's the limit if you get creative with how you wanna use these things. So again, in a few minutes, we're gonna get into how you can combine all of these together.
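
NOTE
A minimal Python sketch of generate and filter for the brainstorming example above. The generators, the rubric and the banned word are made-up assumptions that stand in for real agents and real project criteria.
```python
# Hypothetical generators: each one proposes ideas from a different perspective.
def generate(perspective: str, feature: str) -> list[str]:
    return [f"{perspective} idea {n} for {feature}" for n in (1, 2)]
# Hypothetical rubric: score an idea against project-specific criteria.
def score(idea: str, banned_words: set[str]) -> int:
    if any(word in idea for word in banned_words):
        return 0
    return 2 if "ux" in idea else 1
def generate_and_filter(feature: str, perspectives: list[str], keep: int) -> list[str]:
    banned = {"growth"}  # stand-in for the "don'ts" of the app
    ideas = [idea for p in perspectives for idea in generate(p, feature)]
    unique = list(dict.fromkeys(ideas))  # drop exact overlaps, keep order
    passing = [i for i in unique if score(i, banned) > 0]
    ranked = sorted(passing, key=lambda i: score(i, banned), reverse=True)
    return ranked[:keep]
best = generate_and_filter("onboarding", ["ux", "performance", "growth"], keep=3)
```
Ideas that fail the rubric are discarded outright, and only the top few of the rest come back.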

00:09:16.590 --> 00:09:40.845
But first, let's finish up with the last two. So the fifth one is fun. It's called the tournament, and this is where you're spawning agents to compete on the same task. So each agent here is attempting to solve the same task using different approaches. And then we have a judge model that is judging the outputs of all of these things and trying to get down to, like, what is the actual winner. So this one almost feels like a combination of the generate and filter and the adversarial

00:09:40.845 --> 00:09:41.485
verification.

00:09:41.790 --> 00:10:28.225
But the difference is that we are going through rounds. So for example, we have these two different competing ideas that result in one winner, but then we had the secondary round that had its own test it was running that resulted in its own set of winners. And now we're gonna pair these two winners against each other in battle until we get to our final result. So maybe you're trying to brainstorm, like, the best UX approaches for your app, and each of these different attempts that you are launching is a different approach to how you could think about the user experience. And then we're taking those winners, pitting them against each other, and finally landing on, like, what is actually the best approach to the UX for this feature or for the app in general. Or maybe you're trying to brainstorm viral hooks for your marketing campaign based on the hook library that you lifted off of your favorite YouTuber.
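
NOTE
A rough Python sketch of the tournament rounds described above, written as a single-elimination bracket. The judge here just compares a made-up score field; a real judge would be a separate model call.
```python
# Hypothetical judge: compares two attempts and returns the stronger one.
# A real judge would be a model call; here the higher score simply wins.
def judge(a: dict, b: dict) -> dict:
    return a if a["score"] >= b["score"] else b
# Single-elimination bracket: pair attempts, keep winners, repeat until one is left.
def tournament(attempts: list[dict]) -> dict:
    if not attempts:
        raise ValueError("tournament needs at least one attempt")
    current = list(attempts)
    while len(current) > 1:
        winners = [judge(current[i], current[i + 1]) for i in range(0, len(current) - 1, 2)]
        if len(current) % 2 == 1:
            winners.append(current[-1])  # odd one out gets a bye this round
        current = winners
    return current[0]
attempts = [
    {"name": "wizard flow", "score": 6},
    {"name": "single page", "score": 8},
    {"name": "inline hints", "score": 7},
    {"name": "video tour", "score": 5},
]
champion = tournament(attempts)
```
With four attempts this runs two rounds: two pairwise matches, then the two winners against each other.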

00:10:28.305 --> 00:10:39.825
That type of system would work equally well here. So the last one that we have here is called loop until done. And this one out of all of them is probably the one that gets used the most with others.

00:10:40.370 --> 00:10:43.570
So it's pretty straightforward. What this thing does is it's gonna continuously

00:10:43.650 --> 00:10:45.810
loop through a problem

00:10:45.890 --> 00:11:03.055
until a set, like, acceptance criteria is met, and it knows that it's complete. So maybe, for example, you realize that you just built this entire app and you weren't following, like, a test driven development approach, and now you have, like, no testing across your entire app. You could kick off an agent that's meant to discover

00:11:03.055 --> 00:11:05.775
all of the areas of your app that are untested.

00:11:05.775 --> 00:11:07.615
And then for all of those new findings,

00:11:08.060 --> 00:11:14.940
you could kick off a sub agent whose sole responsibility is to find the riskiest areas of your app that are untested,

00:11:15.020 --> 00:11:42.360
and it just continues to loop through this until, for example, it has, like, a certain percentage of test coverage across all the different, like, functionality inside of your app. So this one, like, feels kind of similar to a Ralph loop, but it's a little bit more flexible, I think, in how you can actually invoke it and the logic it uses to get things done. So like I said earlier, there's a lot of use cases where you can combine these things together. So let's go through a few of those and talk about which ones they are combining.
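
NOTE
A minimal Python sketch of loop until done for the test coverage example above. The fixed 15 point gain per round and the max_rounds cap are assumptions added for illustration; a real step would run sub agents and a coverage tool.
```python
# Hypothetical step: each pass writes tests for the riskiest untested area.
# Returns the new coverage percentage, capped at 100.
def add_tests_for_riskiest_area(coverage: int) -> int:
    return min(100, coverage + 15)
# Loop until the acceptance criterion is met, with a hard cap as a safety net.
def loop_until_done(coverage: int, target: int, max_rounds: int) -> tuple[int, int]:
    rounds = 0
    while coverage < target and rounds < max_rounds:
        coverage = add_tests_for_riskiest_area(coverage)
        rounds += 1
    return coverage, rounds
final_coverage, rounds_used = loop_until_done(coverage=30, target=80, max_rounds=10)
```
The round cap is there so the loop still stops if the acceptance criterion is never reached.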

00:11:42.440 --> 00:11:53.640
So one good example of this is the built in deep research workflow. So let's say, for example, that, like, you have an app that uses a language model, and you realize that, like, you might not be efficient

00:11:53.855 --> 00:11:58.255
with the amount of turns that are being taken, how it's, like, summarizing information,

00:11:58.255 --> 00:12:00.015
how it's caching tokens,

00:12:00.255 --> 00:12:10.070
all of that type of stuff. And you know that you need to put something like that inside of your project so that your costs don't, like, get blown out of the water. Well, you could run a deep research workflow

00:12:10.070 --> 00:12:13.670
that first goes out and it fans out and synthesizes.

00:12:13.750 --> 00:12:27.225
So it's gonna break down what that query is, like prompt optimization techniques for an app that does x, y, and z. And maybe it goes out and it determines that it needs six different sub agents that are gonna go out and they're gonna look at, like, different components

00:12:27.305 --> 00:12:32.025
of what you asked. After it takes into account all of that research,

00:12:32.185 --> 00:12:34.260
maybe it then runs it through an adversarial

00:12:34.260 --> 00:13:00.115
review. And what that review is looking at is it's looking at the context of your code base, the conventions that you already have in place, the problems that your app is actually meant to solve, the tech stack that you are actually using, and it's going to fact check each of these claims that came through from the fan out stage to make sure that they are actually accurate and make sense given the context of your project specifically.

00:13:00.590 --> 00:13:06.190
And then after all of that is done, it will synthesize it down into a set of concrete recommendations

00:13:06.350 --> 00:13:12.270
for you. Now one of the things that makes this, I think, a little bit stronger than something like a

00:13:12.350 --> 00:13:18.045
skill is that as findings come out of the research or the adversarial

00:13:18.045 --> 00:13:22.445
review, it can continue to iterate and improve

00:13:22.685 --> 00:13:49.895
the quality of what is kind of going on. And we'll look at an example of that in a second where I show you, I think it's the third example that I have, where it starts spawning a lot of different sub agents to resolve things that are coming up in real time. So another use case of these things being combined together might be fact checking your meta prompts. I think one thing that people love to do is to tell their tool to go out and, you know, research something, and then, like, build a skill that replaces

00:13:50.215 --> 00:13:55.095
twenty years of experience in some field. And again, these models are prone to narrative

00:13:55.095 --> 00:14:03.300
lock in. So if you're not, like, really fact checking those things and verifying everything that gets encoded into those prompts or those skills,

00:14:03.540 --> 00:14:10.100
you could be doing things that, like, just simply aren't correct, because the model has convinced itself that combining those things together actually makes sense, when in

00:14:10.965 --> 00:14:26.645
reality, it may not really make sense. So for example, you could have a classify and act pattern that actually checks through that, like, meta prompt or skill, and looks for all of the things that are being claimed to be a fact. And then based on all of those facts that it uncovers,

00:14:27.040 --> 00:14:29.840
it can act toward an adversarial

00:14:29.840 --> 00:14:34.880
review that then actually loops through and looks at all of those claims,

00:14:35.120 --> 00:14:39.440
researches them, and makes sure they actually make sense and integrate

00:14:39.440 --> 00:14:52.305
with the skill that you had in the first place. So now we're gonna go through two real concrete examples of these that I have made for myself that combine about three patterns, I think, each one. So the first one is mining from your session history

00:14:52.545 --> 00:15:00.390
updates that should be made to your Claude Markdown file. So a lot of people, like, create a Claude Markdown file once and then forget about it for eternity.

00:15:00.470 --> 00:15:08.550
And the Claude Markdown file is one of, like, the most important things for you to keep up to date if you want to vibe code, like, a total Chad

00:15:08.630 --> 00:15:17.445
and not become a meme that someone talks about on Twitter with having AI slop and such. And the reason for that is that you really wanna document

00:15:17.605 --> 00:15:21.765
patterns and conventions inside of your project that are not obvious.

00:15:22.180 --> 00:15:26.580
So this applies to Claude Code and really any tool, just as much to something like Codex.

00:15:26.740 --> 00:15:30.020
They're really good at understanding things that are very clear patterns.

00:15:30.180 --> 00:15:35.860
But anytime they would have to read like a lot of stuff to understand something that could just be explained

00:15:35.860 --> 00:15:41.795
in a sentence or two, or if there are things about your project that would not be immediately

00:15:41.875 --> 00:15:42.595
obvious

00:15:42.755 --> 00:15:53.330
from a language model just reading your project, those are things you wanna have inside of your Claude Markdown file. And so here's what this workflow does. Step number one is a discover and digest.

00:15:53.490 --> 00:15:59.250
So we can think of this as, like, a fan out pattern, where it is going to take the last 20 conversations

00:15:59.250 --> 00:16:08.255
within this project, like sessions I've had within this project. And then it's going to loop through all of those sessions with parallel agents

00:16:08.335 --> 00:16:25.110
that read chunks of the session, and then propose anything that came out of that session that is a potential candidate for being included inside of the Claude markdown file based on, like, the rules that I gave it, which in this case were, like, things that are not easily inferable

00:16:25.110 --> 00:16:32.550
from, like, what is already inside of the project. So we're grabbing all of our chats. We're mining them for, like, opportunities

00:16:33.105 --> 00:16:37.905
for potentially updating the Claude Markdown file. And then we're running through an adversarial

00:16:37.905 --> 00:16:38.865
verification,

00:16:38.865 --> 00:16:40.945
where it's gonna take all of those candidates,

00:16:41.105 --> 00:16:44.145
and then it has two different, like, skeptic lenses

00:16:44.225 --> 00:16:55.310
that it's gonna look through to decide if it needs to reject this. So the first one is the structure lens, which is going to try to see if what we're saying should go in that Claude Markdown file is easily inferred

00:16:55.390 --> 00:17:00.910
from what we already have in the project. And then it has another one that looks at, like, the novelty or truth

00:17:01.235 --> 00:17:08.995
of what actually came out in this mining loop. So is this already in the Claude Markdown file? Is it not a real pattern that needs to be documented?

00:17:09.235 --> 00:17:29.230
Or is it simply not really supported by the evidence? So then it continues that process for two rounds, seeing if, like, okay, well, I loop through this again, am I finding anything new? And again, these are all, like, fresh agents, so fresh set of eyes. And then once that's all done, it synthesizes it down into a report for us. So this is a good example of where we're using, like, at least three different patterns,

00:17:29.735 --> 00:17:36.775
and we're doing something that would be, like, very convoluted and difficult to replicate inside of something like a like a skill.

00:17:36.935 --> 00:17:44.615
And I will get into, at the end of this video, the best practices so that these things don't run away and blow through millions of tokens. Because full disclosure,

00:17:45.040 --> 00:17:58.480
this one did use millions of tokens to do this. And the reason that happened is because I didn't follow my own best practices that I'm about to discuss with you. So if we wanna look at, like, well, what does the output actually look like? This thing ran, I think, for about twenty, yeah, twenty six minutes.

00:17:58.975 --> 00:18:04.975
It read the 20 sessions that I loaded it up with. It looped through six different times, which is definitely overkill.

00:18:05.055 --> 00:18:07.295
And it ended up finding 10 candidates

00:18:07.375 --> 00:18:10.495
that survived two different adversarial passes.

00:18:10.655 --> 00:18:30.535
So in this case, like, the project that we're using here, this is like one of my, like, paid plugins that people get inside of my community. And I have certain, like, conventions that I like to follow when I am improving this and, like, releasing updates and doing all that type of stuff. And so what this was looking for in this case was my own Claude Markdown file for, like, maintaining

00:18:30.855 --> 00:18:37.415
this thing that I build for other people. And so it found a bunch of different things, tooling and runtime issues inside of the project,

00:18:37.655 --> 00:18:42.215
invariants across files, like things that are different and, like, break from patterns and conventions,

00:18:42.730 --> 00:18:55.690
details of, like, how different plugins and skills actually work together that, like, aren't obvious without actually going in and reading those skill files. And then it actually found some things that are now out of date. So I didn't actually update my Claude Markdown

00:18:55.690 --> 00:19:03.215
file, and now what I specify in there is not actually correct. So for example, I have like this visual verification loop skill,

00:19:03.295 --> 00:19:15.950
which was originally stubbed as I was building this thing out, but now I actually have the full version of it. And so the Claude Markdown file saying that this is a stub is, like, no longer accurate. So these are all things that I would wanna move through then and update.

00:19:15.950 --> 00:19:16.430
And,

00:19:16.830 --> 00:19:26.405
we're gonna get into, like, some of the best practices of this in a second. But before I do that, I wanna walk you through another workflow that I think is pretty cool. So this one is called the React Refactor

00:19:26.485 --> 00:19:30.565
Tournament. And so what this one is gonna do is it's going to invoke Vercel's

00:19:30.565 --> 00:19:32.325
React best practices

00:19:32.405 --> 00:19:35.765
skill. So what this one is gonna do is it's going to identify

00:19:35.765 --> 00:19:38.080
the candidates of optimization

00:19:38.240 --> 00:19:56.135
using that React best practices skill. It's gonna have judges identify which of these are actually the bigger issues inside of the project based on the context of the project and the severity of the thing. It's gonna have one winner from that, and then it's going to continue back through that loop of identifying the issues,

00:19:56.375 --> 00:20:15.700
running them through this tournament, and finding the winner for as many cycles as we tell it to. In this case, it's defaulting to three cycles so that it doesn't go crazy. And so if we wanted to pop in and, like, see what this looks like as it's running in real time, they actually have this, like, phased system where we can see, like, what are the phases of this workflow that are actually

00:20:15.860 --> 00:20:49.775
running. So in this case, it's first defining, like, the actual scope. It's then moving through a discovery phase and finding, like, all of the opportunities for where we would even, like, want to apply this thing. Uh, part of the skill that I defined is that anything it finds, it writes to a backlog so that we can always go back and reference those things and don't have to rerun the skill. And then it's gonna actually run it through the tournament and make the fixes. So we'll let this thing run for a little bit, and then we'll pop back in and look at the outputs. And so we can see in this case that all of these different tournament rounds are running at, uh, pretty much the same time. Like, they don't need to be run necessarily,

00:20:49.775 --> 00:20:56.015
like, in a strict pipeline where, like, one needs to wait or, like, anything like that. These are all running in parallel.

00:20:56.095 --> 00:20:59.215
So when we talk about, like, why this stuff can consume tokens,

00:20:59.375 --> 00:21:42.225
the number of, like, tasks that are gonna be done can balloon quickly if you're not, like, really aware of what type of workflow you're kicking this thing off on. So if the idea of running, like, 19 different sub agents at the same time is something that's, like, scary to you, again, we'll get into some of the best practices in terms of, like, how you can control these things so that you don't end up with a session that is just, like, blowing through tokens like crazy. One thing that I will say, though, is, like, in this case, for example, the React best practices gives you, like, really good detailed guidance on what to do, and they give examples of the types of fixes. So honestly, if we had run this on Sonnet, it probably would have gotten, like, the same quality of work done, I think, without having to blow through this many Opus tokens.

00:21:42.305 --> 00:21:47.585
So those are the types of things, though, that, like, you can explain in natural language when you create these workflows,

00:21:47.825 --> 00:21:50.305
what models you want it to use for specific,

00:21:50.385 --> 00:22:05.130
like, rounds of the workflow, like, all of those types of things. You can control those and configure those things. So the last thing that we'll look at while that is running is like, what does this file actually look like? So these workflows are created as JavaScript files.

00:22:05.290 --> 00:22:07.450
So the meta of the workflow,

00:22:07.530 --> 00:22:09.370
you have like the name, the description,

00:22:09.610 --> 00:22:17.195
uh, when to use it. If this isn't something that, like, you want to specifically trigger, and that is something to be aware of, these workflows

00:22:17.355 --> 00:22:19.275
will run by themselves.

00:22:19.595 --> 00:22:28.130
And so you wanna be, like, very clear about the when to use kind of, like, meta field so that it doesn't go off and do crazy stuff. But then we have, like, what are the different phases

00:22:28.290 --> 00:22:32.930
that are gonna actually run inside of this thing? And this is how we're getting that structure inside of that workflow,

00:22:33.090 --> 00:22:52.515
where we have, like, the scope, the discovery, the backlog, the tournament, the fix. All of this was created in natural language by me describing what I wanted to happen. And then the rest of this file is just giving it the logic of how it's going to actually work. It's very strict on, like, the schema of the thing, so that it's gonna, like, actually run, like, a little bit more, like, deterministically,

00:22:52.515 --> 00:22:58.550
and it's not gonna go off the rails and do random stuff. So it is very strict. But, again, all of this stuff gets created

00:22:58.630 --> 00:23:02.230
for you. So for example, in, like, that phase one, like, the scoping,

00:23:02.390 --> 00:23:22.730
there are very clear instructions on how things are being done, what the tasks are. So all of these things and the configurations, like how many rounds it's running in loops and, like, all of those things are specified inside of this file. Now if this is intimidating, you don't need to necessarily go in and read it, but you should at least be aware that this lives inside of your project in your directory

00:23:22.730 --> 00:24:01.705
in a workflows folder. And so now what we have happening is that the fixes are actually being done inside of separate worktrees, and they are being committed as they are done. And so there's really endless opportunities for how you can combine all of these different types of, like, workflows to compose, like, really great ones that help you solve involved problems. But like I said a few times, you need to be aware of best practices and how you can control these things so that they don't go off the rails and end up burning through tokens. Because I promise you it will do that if you're not aware of what you can control. So these best practices are again coming from Anthropic directly. So number one is prompting.

00:24:01.865 --> 00:24:06.985
You should be very detailed about telling it how to set things up. I actually use the exact language

00:24:07.225 --> 00:24:30.775
from these when I'm describing what I want it to do. So for example, with the tournament workflow that we just looked at, I told it, I want you to use a tournament with pairwise judges that do x, y, and z. So be very specific about how you prompt things. You can also tell it that you want this to be a quick workflow that doesn't go crazy. So I think this is part of where the whole "it's gonna burn tokens" argument goes out the window.

00:24:31.015 --> 00:24:46.820
You can force it to do things in whatever way you want it to do. If you give it an open-ended task, yeah, it might go off and accidentally spin up 150 agents. But you should see that happening if you're paying attention, and then just go in and modify the workflow so that it fixes

00:24:46.820 --> 00:24:51.780
that. Number two, using goals and loops. So on the loop side, if you create a workflow,

00:24:51.940 --> 00:25:04.795
maybe you created some workflow that can look at failing pull requests, and it runs through some sort of workflow to resolve that in whatever way you want to. You can run that on a loop, for example, that's going to look for failing pull requests on some sort of interval

00:25:04.875 --> 00:25:14.950
and then run it through your workflow. Another thing you can do is use the /goal command inside of Claude Code to set an actual hard, measurable completion requirement

00:25:14.950 --> 00:25:17.190
so that the workflow has context

00:25:17.270 --> 00:25:41.450
around when to stop. Number three, and I think this is the biggest takeaway, is you can tell it that it has a token budget. So for example, if you were to go through and say, you can only use 10,000 tokens, or you can only use 100,000 tokens for this entire run, it is going to actually adhere to that. So this is a really powerful thing if you're afraid of the number of tokens that are gonna be used.

00:25:41.610 --> 00:25:46.250
Just tell it what your threshold is for this thing being run properly,

00:25:46.490 --> 00:25:54.135
and then you won't have to deal with that. And then last but not least, you can save workflows. So you can save workflows, you can share workflows,

00:25:54.135 --> 00:26:00.135
and it's really easy to do that. So for example, say we pop back down inside of this specific

00:26:00.215 --> 00:26:03.175
skill. If I were to come through and hit save,

00:26:03.175 --> 00:26:27.505
I can give it a name, and now this skill is going to be saved into our project, and then we can reuse it. We can share it. We can put it in a repo and share it with others. Whatever you want to do, sky's the limit. So like they say, it's a whole new world. It is, I think, borrowing patterns from things that other plugins were attempting to do, but now it's an extension of Claude Code, and it actually is going to work that way by default.

00:26:27.665 --> 00:26:45.280
And there is a lot of control that you have over the system. Again, if you want the two workflows that I went through specifically and you wanna customize them, which I recommend you do, you can find links for those in the description below. If you're someone trying to upskill from basic vibe coding into more intermediate vibe engineering,

00:26:45.280 --> 00:26:51.894
as I like to call it, you should Hulk smash the subscribe button. But that's it for this video. I will see you in the next one.
