WEBVTT

00:00:00.320 --> 00:00:19.105
Welcome back and a happy New Year. Now I'm convinced that long-running agents are the future of agentic coding and vibe coding. And in this video, I'll prove that to you. I'm going to show you two projects that have three things in common. They were both built using Claude Code. I sent them each the exact same prompt,

00:00:19.265 --> 00:00:24.865
and finally, I didn't interfere at any point. I simply let the agent run on autopilot.

00:00:24.865 --> 00:00:33.260
And as you'll see in a second, the results are drastically different. And later on in this video, I'll show you how you can use this long-running agent yourself,

00:00:33.500 --> 00:00:38.860
and it's really easy. First, let's have a look at the prompt that I passed to each of these agents.

00:00:38.860 --> 00:00:48.605
And this is a really detailed prompt. It gives a basic overview of the app that we're trying to build. And by the way, this is an application that can assist content creators

00:00:48.685 --> 00:00:52.525
with coming up with hooks and titles and thumbnail ideas.

00:00:52.765 --> 00:00:55.565
I also referenced some technical documentation

00:00:55.260 --> 00:01:02.620
that it can use to understand the different frameworks involved. Then we've got some prerequisites like we're starting off with the Next.js project.

00:01:02.780 --> 00:01:08.540
We've got our problem statement and a lot more. One thing to note is that with the thumbnail generation,

00:01:09.125 --> 00:01:11.925
I actually want to use the Nano Banana model

00:01:11.925 --> 00:01:14.245
to generate different thumbnail ideas.

00:01:14.325 --> 00:01:17.445
And then we also have this built-in AI assistant

00:01:17.445 --> 00:01:20.805
that's able to manipulate the results, generate more results,

00:01:21.390 --> 00:01:23.550
and come up with additional ideas.

00:01:23.710 --> 00:01:31.950
So this really is a massive project. So for the first project, all I did was go through the normal workflow using the Claude Code CLI.

00:01:32.030 --> 00:01:36.750
So I went into planning mode, I then passed that entire prompt into Claude Code,

00:01:37.035 --> 00:01:39.915
and Claude then created an implementation plan.

00:01:40.155 --> 00:01:47.835
And afterwards, I just told Claude to go ahead and implement it, and this was the result. So this is a really simple user interface.

00:01:48.075 --> 00:01:56.690
There's no light or dark mode. Although, I did specify light and dark mode in the actual prompt. And for reference, I did use Opus 4.5

00:01:56.690 --> 00:02:06.665
in thinking mode for this project. So I've already created a few sample projects here, but something to note is there's actually no option to delete any of these projects.

00:02:06.745 --> 00:02:08.505
Let's create a new project.

00:02:08.825 --> 00:02:11.465
Let's create n8n beginner tutorial,

00:02:11.545 --> 00:02:23.630
and now we can provide the topic that this video is about. So we can say this is a comprehensive beginner tutorial on getting started with n8n from basic installation to building agentic workflows.

00:02:23.710 --> 00:02:41.325
Now what we can do is generate the different hooks and this simply takes us to the hook page where we have to click on generate. Alright. So the agent generated these different hook ideas. We can see these cards on the screen and we also have this chat interface where we can have a conversation with our AI assistant.

00:02:41.325 --> 00:02:43.645
But I'm really not a fan of this response.

00:02:43.885 --> 00:02:49.240
Either way, one of the requirements in the prompt file was that the assistant should have the ability

00:02:49.560 --> 00:02:51.880
to edit any of these cards.

00:02:52.040 --> 00:03:01.560
So it should be able to add cards, remove them, or change the cards. As an example, let's change this one. So instead of saying stop paying for Zapier,

00:03:01.865 --> 00:03:03.705
let's change it to make.com.

00:03:03.785 --> 00:03:07.305
Okay. So it's responded saying that the hook was changed,

00:03:07.385 --> 00:03:11.305
and in this array, we can see that it's now saying make.com.

00:03:11.465 --> 00:03:14.665
But if I have a look at the card, it's still showing Zapier.

00:03:14.665 --> 00:03:18.720
Let's try to refresh this page, and this still wasn't changed.

00:03:18.960 --> 00:03:26.240
Look, it's impressive that some of the stuff just works as a one-shot prompt, but it's clear that some of the more advanced features were missed,

00:03:26.320 --> 00:03:34.395
like the ability for this agent to actually manipulate the cards on the screen. Another thing that's missing is the ability to edit the system prompts

00:03:34.475 --> 00:03:36.395
for any of these agents.

00:03:36.635 --> 00:03:43.675
Like, what if I don't like the style of these hooks and I actually wanted to train the agent to generate hooks that meet my preferences?

00:03:43.835 --> 00:03:47.490
I don't see any way to edit the system prompts.

00:03:47.650 --> 00:03:50.210
And to be clear, this was a requirement

00:03:50.210 --> 00:03:56.290
in the prompt file as well. Either way, let's select one of these hooks and let's continue to intros.

00:03:56.290 --> 00:03:59.970
Let's generate it. Alright. So now we have our different intros.

00:04:00.525 --> 00:04:05.165
Again, we can see the array in the conversation window.

00:04:05.325 --> 00:04:07.805
Let's try to remove a card. Please

00:04:07.965 --> 00:04:08.925
remove

00:04:09.085 --> 00:04:09.965
item

00:04:09.965 --> 00:04:10.765
five.

00:04:10.765 --> 00:04:19.480
Right. So in the chat window, it says the item was removed, but looking at the UI, card five is still there even after refreshing the UI.

00:04:19.640 --> 00:04:24.040
Either way, let's just select any of these intros and let's continue to the titles.

00:04:24.200 --> 00:04:25.960
Let's generate titles

00:04:26.040 --> 00:04:37.055
and we get our title ideas. Let's actually test this regenerate function to see if it works. And that does seem to work. So let's select this one. Let's continue to thumbnails.

00:04:37.295 --> 00:04:39.135
Let's generate the thumbnails,

00:04:39.215 --> 00:04:41.535
and this simply gives us a description

00:04:41.535 --> 00:04:44.015
of the thumbnail and not the image itself.

00:04:44.470 --> 00:04:51.670
And as a reminder, we specifically said in the prompt that we want to use Gemini 3 Pro, so Nano Banana,

00:04:51.830 --> 00:04:53.430
to generate thumbnails.

00:04:53.590 --> 00:04:57.670
And I don't see any option on the screen to generate those thumbnails.

00:04:58.085 --> 00:05:02.325
So let's simply go with this one, and let's complete the project.

00:05:02.485 --> 00:05:17.020
And that actually takes us back to the home screen without any summary or anything else. And unfortunately, this is the reality of trying to build a massive project like this in a single context window. As the agent was trying to implement all of these features,

00:05:17.340 --> 00:05:19.740
it exceeded the context window size.

00:05:19.980 --> 00:05:22.300
So it kept compacting the conversation,

00:05:22.380 --> 00:05:25.580
which actually lost a lot of important context.

00:05:25.580 --> 00:05:55.010
And then this is the final result. Now there are frameworks out there that try to solve this issue by splitting this massive requirement up into smaller features and then implementing these features one by one. You might be familiar with SpecKit or BMAD, or if you've ever used my agentic coding starter kit, that is exactly what the SpecK feature does. It looks at this massive prompt and then it actually creates a separate implementation plan that's split up into phases and actionable tasks.

00:05:55.405 --> 00:06:00.045
Now the limitation with that approach is that you still have to coordinate everything.

00:06:00.045 --> 00:06:03.725
So you need to tell the agent to go and implement phase one,

00:06:03.885 --> 00:06:13.700
and once that's done, you have to intervene by clearing the context window and asking the agent to start with the next phase. And this does introduce other issues.

00:06:13.940 --> 00:06:16.660
This is like having developers work in shifts,

00:06:16.820 --> 00:06:46.010
where one developer would do a piece of work and they would then leave the office. And the next developer comes in having no context on what the previous developer did. So they just kind of take on the next piece of work, implement it, and they could introduce duplicate code. They could even introduce bugs that break whatever the previous developer did. And besides, for large projects like this, I don't think any of us actually want to sit around and babysit the agent whilst going through 20 or 50 phases.

00:06:46.090 --> 00:06:50.410
So the solution to this is to implement a long-running agent harness.

00:06:50.490 --> 00:06:56.585
This harness will allow the agent to run for a very long time. This could be hours or even days

00:06:56.745 --> 00:06:58.585
to implement everything,

00:06:58.825 --> 00:07:06.425
and this also includes regression testing. So let's have a look at the second project. Again, I used the exact same prompt.

00:07:06.665 --> 00:07:10.105
I also used Opus 4.5

00:07:09.080 --> 00:07:18.120
in thinking mode. A lot of this will look familiar. It's the same project that we're trying to build after all, but you'll already see a lot of small differences.

00:07:18.200 --> 00:07:24.040
The first is that we've got access to light and dark mode. We also have this settings button

00:07:24.355 --> 00:07:31.875
where we can edit the different prompts. So indeed, we can edit the system prompt for the Hooks agent, the Intros agent, titles,

00:07:31.955 --> 00:07:33.315
and thumbnails.

00:07:33.395 --> 00:07:37.475
Then from the main screen, we can actually filter projects by different statuses.

00:07:37.790 --> 00:07:41.790
This isn't even something I asked for, but the long-running agent decided

00:07:41.870 --> 00:07:55.745
that these features would make sense. When I hover over these cards, we have the option to delete the project, edit it, we can even duplicate the project, and of course, open it. Let's try to delete this project and that works.

00:07:55.825 --> 00:08:00.145
Let's then try to create a new project. So let's give it a name.

00:08:00.385 --> 00:08:04.305
Let's provide the topic. We can also provide the target audience.

00:08:04.545 --> 00:08:07.825
So I'll just say no-coders and automation experts.

00:08:08.300 --> 00:08:11.980
And for the content style, let's do educational.

00:08:12.140 --> 00:08:17.260
Then let's create this project. Then let's open it. And this already looks way

00:08:17.260 --> 00:08:20.300
better. We can see exactly where we are in the process.

00:08:20.895 --> 00:08:26.735
We have a summary of the topic, and on the right hand side, we've got access to our AI assistant.

00:08:26.735 --> 00:08:30.015
Let's just say hello to see if this actually works,

00:08:30.255 --> 00:08:31.295
and it does.

00:08:31.535 --> 00:09:12.360
Keep in mind, this was all done through a single prompt. The same as with the first project, but I think this just feels way more polished. Let's go to generate hooks, and now we've got our different hooks. We can actually copy the text, we can even edit the hooks, which is really helpful. I didn't see that in the first project. And of course, we can select the hook that we want. We also have the option to regenerate these hooks or we can add more hooks. But let's see if our AI assistant can actually edit any of these. So let's have a look at this one. Most people spend ten hours a week on repetitive tasks. I spent ten minutes. Let's actually change this text. So let's say, please

00:09:12.600 --> 00:09:14.200
change card

00:09:14.200 --> 00:09:15.720
three from

00:09:15.960 --> 00:09:17.160
this to

00:09:17.400 --> 00:09:18.120
I

00:09:18.520 --> 00:09:19.640
spent

00:09:19.375 --> 00:09:20.495
five minutes.

00:09:20.895 --> 00:09:31.135
Let's see what this does. We'll just let the agent cook and that's it. It actually changed this card. And we can see this little edited badge as well. That's awesome.

00:09:31.630 --> 00:09:40.270
Another thing that just popped up is this history button. So if I click on this, I can actually see the previous version of this card. This is insane.

00:09:40.430 --> 00:09:43.550
Let's try removing a card. Please remove

00:09:43.550 --> 00:09:51.685
card five. So this guy over here. And done. That card is now gone. This is insane. This is really,

00:09:51.925 --> 00:09:58.910
really cool. All with a single prompt. Let's continue to intros. I'm just going to generate whatever.

00:09:59.070 --> 00:10:01.790
This is probably a small thing, but I do appreciate

00:10:01.870 --> 00:10:07.310
that the AI assistant gives you these little buttons to tell you what you can use this assistant for.

00:10:07.390 --> 00:10:13.335
This isn't something I asked for, but it's really cool that it's there. Okay. So we've got our different intros.

00:10:13.575 --> 00:10:17.175
Let's select this one. Let's continue to titles,

00:10:17.495 --> 00:10:19.895
and we've got our different title ideas.

00:10:20.055 --> 00:10:24.215
Let's simply grab this one. Let's move on to thumbnails.

00:10:24.680 --> 00:10:39.960
So during the planning phase, the agent asked me if I would like to have the ability to send reference images to Nano Banana. So if you're unfamiliar with how Nano Banana works, it's really good at generating new images, but also really good at editing images,

00:10:40.575 --> 00:10:45.455
which also means you can provide a reference image, so maybe a photo of myself

00:10:45.535 --> 00:10:50.095
along with maybe any important logos that could be used in the design.

00:10:50.335 --> 00:10:55.215
So that agent actually did that research. It looked at the capabilities of Nano Banana

00:10:55.690 --> 00:10:58.890
and said, hey, would you like the ability to provide

00:10:59.050 --> 00:11:00.410
reference images?

00:11:00.650 --> 00:11:05.050
So one thing I did was in the project files, if I go to

00:11:05.370 --> 00:11:07.130
public thumbnails

00:11:07.290 --> 00:11:08.250
and template,

00:11:08.475 --> 00:11:18.235
I simply uploaded this thumbnail template that I use in a lot of my videos. So in the planning phase, I just said that's a great idea. I would love to use reference images.

00:11:18.555 --> 00:11:22.475
So I just uploaded this template of myself into the public folder.

00:11:22.990 --> 00:11:25.150
You can use that as a reference.

00:11:25.230 --> 00:11:27.390
But now I can see in the implementation,

00:11:27.390 --> 00:11:32.510
it's actually taking it a step further. So it also allows me to upload things like logos.

00:11:32.670 --> 00:11:35.710
So for this, let's actually select the n8n logo

00:11:35.905 --> 00:11:39.665
and that should be it. Let's go ahead and create these thumbnails.

00:11:39.825 --> 00:11:41.985
And just to be clear, for the first project,

00:11:42.145 --> 00:11:48.625
the planning agent never asked me if I wanted to use reference images with Nano Banana. It just kind of skipped that altogether.

00:11:48.865 --> 00:11:51.265
Now I do realize this is nondeterministic,

00:11:51.630 --> 00:12:01.390
and chances are if I had to run that process again, it might have considered this to be an option. And here we go. We actually have our thumbnails coming in from Nano Banana.

00:12:01.630 --> 00:12:03.230
This is awesome.

00:12:03.310 --> 00:12:28.090
I actually like this first one. So what we can do is view this in full screen, which looks great. We can also view the prompt that was sent to Nano Banana. We can also upscale the image. So I guess if this is the one that we want to use, we can simply upscale it to 4K. We can also refine the image. So let's click on that and let's change the text. Let's say, please replace

00:12:28.490 --> 00:12:29.370
game

00:12:29.370 --> 00:12:30.730
changer with

00:12:30.890 --> 00:12:32.570
beginner tutorial.

00:12:32.650 --> 00:12:41.155
Alright. Let's click on regenerate, and let's see if this actually works. Alright. It seems to be done. And, dude, look at that.

00:12:41.395 --> 00:12:44.675
We have our thumbnail with the text beginner tutorial.

00:12:44.835 --> 00:12:48.675
And when we click on history, we can see the different revisions of this thumbnail.

00:12:49.030 --> 00:12:51.910
This is just stupidly impressive.

00:12:51.990 --> 00:12:56.150
And keep in mind, this was a single prompt that simply ran on autopilot,

00:12:56.310 --> 00:12:58.710
and this is pretty much a usable application.

00:12:58.870 --> 00:13:01.270
I'm actually going to make one more refinement.

00:13:01.270 --> 00:13:12.215
Let's say, please change the yellow shirt and glow into a color that complements the n8n pink on the color wheel. Let's just see what we get back. And there we go. This looks way

00:13:12.215 --> 00:13:13.335
way better.

00:13:13.495 --> 00:13:18.520
And finally, I guess we can simply upscale this image to use it on the actual video.

00:13:18.680 --> 00:13:26.760
And cool. The image was upscaled to 4K and we're able to download it as well. Man, this is just so impressive.

00:13:26.760 --> 00:13:33.315
Just to think this was a single prompt. And finally, on the complete screen, we actually get a summary.

00:13:33.475 --> 00:13:48.610
Now I just wanted to see if we can actually go back to the different steps. So if I click on Hooks, it says that, yes, you can go back to Hooks. Your progress in later stages will be preserved. Oh, cool. So let's go back to hooks. Here, I can see the hook that I selected.

00:13:48.770 --> 00:13:50.690
And if we go back to thumbnails,

00:13:50.850 --> 00:13:55.090
we can see the thumbnail that was selected. So all of this is persisted

00:13:55.090 --> 00:14:15.460
even if I refresh this page. Now let me show you how I built this and how you can use this long-running agent yourself. A few weeks ago, I created this video that shows you how you can let Claude Code run for hours or days to implement complex solutions. That solution was based on this article by Anthropic on effective harnesses for long-running agents.

00:14:15.540 --> 00:14:37.525
This is a brilliant article, so I definitely recommend giving it a read. They also give you this quick start guide, which is an example project that you can play with. Now, this is a really cool project and it's actually quite simple. The entire process starts with you providing an app spec file. This is a really simple document that contains all of the prerequisites

00:14:37.525 --> 00:14:41.040
and tech stack and core features of your application.

00:14:41.280 --> 00:14:47.840
You can simply use Claude or ChatGPT to populate the file for you. Now when you execute this application,

00:14:47.840 --> 00:14:50.080
so the long-running agent harness,

00:14:50.320 --> 00:14:52.800
it will start off by running this initializer

00:14:52.800 --> 00:15:02.295
agent. This agent will have a look at your app spec file and it will actually do a couple of things. It will set up the basic project structure,

00:15:02.455 --> 00:15:03.975
but more importantly,

00:15:04.375 --> 00:15:06.855
it will also produce this feature list.

00:15:07.180 --> 00:15:16.300
Now this is really the magic of having these long-running agents. This feature list could have tens to hundreds to even thousands of features.

00:15:16.460 --> 00:15:22.700
Each and every aspect of your application becomes a feature. For example, having a light and dark toggle,

00:15:23.235 --> 00:15:24.995
that becomes a feature

00:15:25.155 --> 00:15:41.200
and that feature needs to be implemented by a coding agent and tested thoroughly. And once the initializer agent completes its work, the framework kicks off coding agents. So this coding agent will have a look at the feature list to find the next feature that needs to be implemented.

00:15:41.280 --> 00:15:49.840
This coding agent also has a second task and that is to perform regression testing. So it needs to retrieve the next feature that should be implemented,

00:15:50.315 --> 00:15:53.515
but it should also pick three features at random

00:15:53.515 --> 00:15:58.635
that have already been implemented and then do regression testing on those features.

00:15:58.875 --> 00:16:03.115
So this agent will perform its regression testing and implement the next feature.

00:16:03.610 --> 00:16:09.050
And once it exceeds its context window or gets close to filling up its context window,

00:16:09.210 --> 00:16:10.810
it will close the session,

00:16:11.050 --> 00:16:13.210
update the statuses of the features,

00:16:13.610 --> 00:16:19.595
and then move on to the next coding agent. And each of these coding agents has its own context window,

00:16:19.675 --> 00:16:21.995
ensuring that the context stays lean

00:16:22.075 --> 00:16:27.755
and focused on very specific tasks. So that is basically how this Anthropic framework works.

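NOTE
Editor's sketch of the loop described above: an initializer turns the app spec into a feature list, then fresh coding sessions each regression-test up to three finished features and implement the next pending ones until their budget runs out. This is a simplified stand-in under stated assumptions, not Anthropic's harness code. Every name here (Feature, initializer_agent, coding_session, run_harness, budget) is hypothetical, and "budget" merely imitates a session ending when its context window fills up.
```python
# Hypothetical sketch of the harness loop; no real agent or API is called.
import random
from dataclasses import dataclass
#
@dataclass
class Feature:
    name: str
    priority: int
    passes: bool = False
#
def initializer_agent(app_spec: list[str]) -> list[Feature]:
    """Stand-in for the initializer: each spec line becomes one feature."""
    return [Feature(name=line, priority=i) for i, line in enumerate(app_spec)]
#
def coding_session(features: list[Feature], budget: int, rng: random.Random) -> list[str]:
    """One fresh-context session; returns the names it picked for regression testing."""
    done = [f for f in features if f.passes]
    retested = [f.name for f in rng.sample(done, k=min(3, len(done)))]
    pending = sorted((f for f in features if not f.passes), key=lambda f: f.priority)
    for feature in pending[:budget]:
        feature.passes = True  # a real agent would implement and test the feature here
    return retested
#
def run_harness(app_spec: list[str], budget: int = 2, seed: int = 0) -> tuple[list[Feature], int]:
    """Start new sessions until every feature passes; return the features and the session count."""
    if budget < 1:
        raise ValueError("budget must be at least 1")
    rng = random.Random(seed)
    features = initializer_agent(app_spec)
    sessions = 0
    while any(not f.passes for f in features):
        coding_session(features, budget, rng)
        sessions += 1
    return features, sessions
```
The point of the design is that no session carries the previous session's conversation; the only shared state is the feature list and its pass/fail statuses.
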
00:16:27.915 --> 00:16:34.000
But, of course, this all runs in the terminal. It's very technical to set up, and it's not really meant for everyone.

00:16:34.080 --> 00:16:37.680
So there's been this drive to build UIs and applications

00:16:37.920 --> 00:16:49.955
that use this theory behind the scenes, but it's way easier for pretty much anyone to use, whether you are an experienced developer or a vibe coder. An example of such a project is Automaker

00:16:49.955 --> 00:16:53.075
from another YouTuber called WebDevCody.

00:16:53.155 --> 00:16:57.875
Cody and his Discord community have been putting a lot of work into this application.

00:16:58.115 --> 00:17:18.715
It uses all the theory that we just discussed. We start off with an app spec and the initializer agent will then create a feature list based on your requirements. And now you can visually see the coding agents implementing all of these features using a Kanban board. Now this is really a complex and fully featured application,

00:17:19.195 --> 00:17:28.155
and I do want to create a dedicated video on this. Now tools like Automaker and even Auto Claude do a lot more than just this Anthropic harness.

00:17:28.510 --> 00:17:30.990
They're trying to be fully featured applications

00:17:31.150 --> 00:17:37.230
that pretty much replace your coding editor to an extent. You can do stuff like branching and work trees

00:17:37.390 --> 00:17:40.350
and run terminal sessions within those applications.

00:17:40.430 --> 00:17:46.045
So for this video, I decided to really simplify things. This takes the Anthropic harness

00:17:46.205 --> 00:17:55.005
and simply slaps a UI on top of it. And it's really easy to use and free by the way. So all you have to do is go to this repository,

00:17:55.005 --> 00:18:00.110
which I'll link to in the description of this video. Then what you have to do is click on code

00:18:00.190 --> 00:18:09.950
and click on download ZIP. Then you can simply extract the contents of that file anywhere on your machine. It really doesn't matter. Then depending on your operating system,

00:18:10.190 --> 00:18:13.790
you can double click on start UI for Windows

00:18:13.765 --> 00:18:14.565
or

00:18:14.645 --> 00:18:21.685
start UI dot SH for Mac or Linux. So because I'm using Windows, I'll just click on start UI,

00:18:21.925 --> 00:18:23.365
then let's run this,

00:18:23.525 --> 00:18:28.245
and this will set up the Python environment for you and install all of the dependencies.

00:18:28.710 --> 00:18:30.950
And afterwards, you'll get this URL

00:18:30.950 --> 00:18:35.030
and if you open this, you will have access to this UI.

00:18:35.190 --> 00:18:45.255
You just won't have any projects set up yet. Now this is using your Claude Code account behind the scenes, so you do have to have Claude Code set up. So it means you have to install the Claude Code

00:18:45.255 --> 00:18:48.935
CLI tool and authenticate yourself in Claude Code.

00:18:48.935 --> 00:18:57.990
Now if you're watching this video, I do assume that you already have Claude Code set up, but if you have no idea what I'm talking about, you can simply go to my GitHub repository

00:18:58.230 --> 00:19:00.470
and simply follow these prerequisites.

00:19:00.470 --> 00:19:09.405
So you just have to install Claude Code and then authenticate yourself by running this command in the terminal. You need to either have a Claude subscription

00:19:09.645 --> 00:19:15.965
or you can use an Anthropic API key. So this uses the exact same harness from Anthropic

00:19:15.965 --> 00:19:18.525
with a few very important changes.

00:19:18.685 --> 00:19:30.160
And I simply made these changes for performance sake, but everything else remains the same. To create a new project, we can simply click on this drop down. Let's go to new project,

00:19:30.240 --> 00:19:33.120
and let's just give this a name like YouTube

00:19:33.280 --> 00:19:37.635
demo. Let's click on next. Then let's click on select this folder.

00:19:37.875 --> 00:19:39.555
And now you have two options.

00:19:40.035 --> 00:19:43.555
With the Anthropic example, so the vanilla project,

00:19:43.875 --> 00:19:46.755
you have to change the app spec file yourself.

00:19:46.835 --> 00:19:49.830
So you have to use Claude Code or ChatGPT

00:19:49.830 --> 00:19:52.150
or just manually manipulate

00:19:52.150 --> 00:20:02.870
that file. But what I added to this application is the ability to use Claude to generate that file for you. So let's actually test that feature. Claude will now ask you a few basic questions,

00:20:03.525 --> 00:20:10.645
like, what do you want to call this project? And in your own words, what are we building? Who will use it? So it's just really,

00:20:10.805 --> 00:20:33.995
really simple. I'm actually just going to copy this prompt and paste it into the chat. Now this is really meant for everyone. So if you have no idea how to choose a tech stack or how to define what you want, simply have a natural conversation with the agent and it will guide you through everything. In fact, even after passing in all of this context, the agent is giving us two routes.

00:20:34.155 --> 00:20:43.515
We can select quick mode, and this is where the agent will make decisions on your behalf. For instance, the tech stack. Or if you really want to get involved in the architecture,

00:20:43.930 --> 00:20:50.810
you can go into detailed mode. And this is where you can tell the agent exactly what you want to use in terms of the database,

00:20:51.050 --> 00:20:53.930
conventions related to database schemas,

00:20:54.010 --> 00:20:54.650
etcetera.

00:20:54.810 --> 00:21:01.805
For this, let's simply go with quick mode. Right. It's asking us a few more questions. I'm just going to say you choose.

00:21:01.965 --> 00:21:09.725
And the agent is asking us a few more questions. Now this is something I really like, and this is where the agent asked me if I wanted to use reference images

00:21:09.805 --> 00:21:14.990
for generating thumbnails, something Claude Code didn't do in the vanilla planning phase.

00:21:15.150 --> 00:21:17.230
I'll just say, you choose again.

00:21:17.470 --> 00:21:19.870
And then let's just say, sounds good.

00:21:20.110 --> 00:21:32.845
Obviously, when you're using this, you don't want to rush through this. Really go through all the questions and just have this conversation with the agent. And now the agent is giving us a breakdown of all of the features in our application.

00:21:33.005 --> 00:21:37.325
So if you want to add anything or remove anything or change anything,

00:21:37.485 --> 00:21:41.690
this is the time to do it. In total, we're looking at 174

00:21:41.690 --> 00:21:44.250
unique features for this application.

00:21:44.330 --> 00:21:56.595
I'm just going to say it looks good. Another agent is creating two files. It's creating that app spec file and it's also updating the initializer prompt with the number of features that need to be implemented.

00:21:56.755 --> 00:22:09.840
And I do want to warn you, this step can take a few minutes to complete. And one thing I forgot to mention is you can attach files to this conversation as well. So maybe you've got an image of a design that you really like,

00:22:10.000 --> 00:22:22.815
you can definitely upload that image as well. And by the way, if you were curious, this is what the project currently looks like. So the agent wrote this app spec file. This gives us the project name and an overview.

00:22:23.055 --> 00:22:25.855
It contains the tech stack, the prerequisites,

00:22:26.095 --> 00:22:28.415
the amount of features that need to be implemented,

00:22:28.735 --> 00:22:30.415
and all the core features.

00:22:30.575 --> 00:22:48.350
And then the initializer prompt was simply updated to contain the amount of features as well. And now we can see the agent completed its work, and we now have this button to continue to the project. So now that we have our spec file, the initializer agent will run to set up the entire project structure,

00:22:48.670 --> 00:22:58.825
and it will also produce this feature list. So if we have a look at this app, the initializer agent is currently running, and you can also see exactly what the agent is currently doing.

00:22:58.905 --> 00:23:03.065
If you want more information, what you can do is simply click on debug.

00:23:03.065 --> 00:23:04.985
You can also press d on the keyboard

00:23:05.390 --> 00:23:14.670
to bring up or hide this debug window. And in here, you can see exactly what the agent is busy with. Alright. So we can see the agent already created the first 20 features

00:23:14.830 --> 00:23:17.310
and they're showing up in the pending column.

00:23:17.470 --> 00:23:20.270
So these features are sorted by priority.

00:23:20.535 --> 00:23:32.135
So the ones at the top will be implemented first. I also want to mention that you can manually add new features as well. So for existing projects, really all you have to do is go to add feature,

00:23:32.215 --> 00:23:37.900
and here you can tell the agent exactly what feature you want to implement. And you can simply run the workflow

00:23:38.060 --> 00:23:53.995
and the agent will automatically pick up that feature and implement it. Another small quality of life feature is the chat window. So when you click on this, you can ask questions about your project or the code base in this chat window. The initializer agent created 190

00:23:53.995 --> 00:23:58.555
features. That is more than the amount of features that we agreed on in the conversation.

00:23:58.875 --> 00:24:01.995
So sometimes it might create more features than you expected,

00:24:02.155 --> 00:24:13.020
but that is just because it identified certain gaps and it's trying to create complete test coverage. It's also setting up the project structure and installing all of the dependencies.

00:24:13.020 --> 00:24:16.540
Let me show you some of the improvements I made to the original harness.

00:24:16.700 --> 00:24:25.525
The Anthropic harness writes all of the features to a massive feature list file. Now, if you have hundreds of features, this file can get massive.

00:24:25.605 --> 00:24:41.180
That feature list file by itself can exceed the context window of the agent. So instead of having this massive JSON file, I instructed the agent to add features to a SQLite database instead. So in this database, we can see all of the features with their priorities,

00:24:41.420 --> 00:24:42.620
with their steps,

00:24:42.780 --> 00:24:49.765
and whether this feature passed or not. Now in order for the agent to add features and update features in this database,

00:24:50.005 --> 00:24:53.445
I created a dedicated features MCP server

00:24:53.605 --> 00:25:04.580
with tools for updating and working with this database. So there's a tool for retrieving the next feature to implement, there's a separate tool for retrieving three features to regression test,

00:25:04.660 --> 00:25:05.300
etcetera.

00:25:05.540 --> 00:25:10.020
I found that this approach reduces token usage and it actually improves performance.
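
NOTE
A minimal sketch of the feature store described above, assuming a single SQLite table.
The table layout, the function names, and the rule that a lower number means higher priority are hypothetical; the actual features MCP server may differ.
```python
import json
import sqlite3
# Hypothetical schema and helper names; the harness's real tables and MCP tools may differ.
def create_store(path=":memory:"):
    """Open the database and create the features table if it is missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS features ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, priority INTEGER NOT NULL, "
        "steps TEXT NOT NULL, passes INTEGER NOT NULL DEFAULT 0)"
    )
    return conn
def add_feature(conn, name, priority, steps):
    """Insert one pending feature; steps are stored as a JSON list."""
    cur = conn.execute(
        "INSERT INTO features (name, priority, steps) VALUES (?, ?, ?)",
        (name, priority, json.dumps(steps)),
    )
    conn.commit()
    return cur.lastrowid
def get_next_feature(conn):
    """Return (id, name) of the highest-priority pending feature, or None."""
    return conn.execute(
        "SELECT id, name FROM features WHERE passes = 0 "
        "ORDER BY priority ASC, id ASC LIMIT 1"
    ).fetchone()
def mark_passing(conn, feature_id):
    """Flag a feature as implemented and verified."""
    conn.execute("UPDATE features SET passes = 1 WHERE id = ?", (feature_id,))
    conn.commit()
def get_regression_features(conn, limit=3):
    """Return up to `limit` random features that already pass."""
    return conn.execute(
        "SELECT id, name FROM features WHERE passes = 1 ORDER BY RANDOM() LIMIT ?",
        (limit,),
    ).fetchall()
def get_stats(conn):
    """Return (passing, total) feature counts."""
    return conn.execute(
        "SELECT COALESCE(SUM(passes), 0), COUNT(*) FROM features"
    ).fetchone()
```
Each call returns only the rows the agent needs, which is one plausible reason this uses fewer tokens than loading a whole feature list file.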

00:25:10.575 --> 00:25:13.215
Perfect. The initializer agent is now done.

00:25:13.375 --> 00:25:18.575
The coding agents will now start implementing these features. So if I have a look at this debug window,

00:25:18.735 --> 00:25:21.375
you can see the agent is using these MCP tools.

00:25:21.455 --> 00:25:23.935
So there's a tool for retrieving the stats.

00:25:24.320 --> 00:25:27.600
This will tell the agent how many features have been implemented.

00:25:27.760 --> 00:25:30.640
There's a tool for retrieving the next feature to implement.

00:25:30.800 --> 00:25:33.280
And if there were any completed features,

00:25:33.360 --> 00:25:50.535
the agent would use the MCP tool to retrieve three random features to regression test. Great. So the coding agent just pulled in a feature, and it's busy implementing this feature over here. And it's already implemented that feature. Well, that's cool. So it's now retrieving the next feature to implement,

00:25:51.570 --> 00:26:00.290
which is this guy over here. And in my opinion, this really is the secret sauce. This agent will actually open up a browser window

00:26:00.370 --> 00:26:09.165
and test the application in real time. So as you can see, it's actually opened up this browser window. This allows the agent to actually see the application,

00:26:09.165 --> 00:26:17.245
so it can identify any UI related bugs or styling issues as well. This is really relying on test driven development.

00:26:17.325 --> 00:26:24.460
So it will test a feature, realize, oh, it actually doesn't work because there's a bug or the feature is totally missing.

00:26:24.540 --> 00:26:30.540
It will then implement the feature and try again. So as I've been speaking, you can actually see the page changing

00:26:30.785 --> 00:26:33.185
as the agent is testing this functionality.

00:26:33.345 --> 00:26:40.945
And it's this workflow of implementing a feature and testing it using the UI that really makes a massive difference.
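
NOTE
A rough sketch of the test, implement, and re-test loop described above.
The function name, the `verify` and `implement` callables, and the attempt limit are hypothetical stand-ins; in the video the verification step is the agent driving a real browser.
```python
from typing import Callable
# Hypothetical loop: `verify` stands in for the browser test, `implement` for the coding step.
def build_feature(
    feature: str,
    verify: Callable[[str], bool],
    implement: Callable[[str], None],
    max_attempts: int = 3,
) -> bool:
    """Test first; implement and re-test until the feature passes or attempts run out."""
    if verify(feature):
        return True  # already works, nothing to do
    for _ in range(max_attempts):
        implement(feature)
        if verify(feature):
            return True
    return False
```
A feature only counts as done when the verification step succeeds, not when the code has merely been written.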

00:26:41.105 --> 00:26:43.585
The agent is not simply implementing this blindly.

00:26:44.000 --> 00:26:52.880
It's testing the application to make sure everything works and everything is there. Yes. This takes a long time to complete, and it uses a lot of tokens.

00:26:53.040 --> 00:27:00.795
But if it's quality you're after, this is really the approach you need to take. And if you're on a smaller plan with Anthropic, like the $20 plan,

00:27:00.955 --> 00:27:03.595
you will reach your usage limit relatively quickly.

00:27:03.755 --> 00:27:22.760
But honestly, it doesn't matter. Your usage resets after a certain period of time, and this agent will simply auto continue the moment your usage has been reset. So you can run this while you're sleeping or over weekends or when you're away. Now if you want to just run through the stuff as quickly as possible without testing,

00:27:22.840 --> 00:27:25.735
you can do that as well. Simply stop the server,

00:27:25.895 --> 00:27:35.015
then select YOLO mode, this yellow button over here, and click on start agent in YOLO mode. So this will simply grab the next feature,

00:27:35.255 --> 00:27:36.935
implement it blindly.

00:27:37.095 --> 00:27:55.845
It will do lint and type checks though, so it will look for any syntax errors, but it's not going to test these features at all. It's just going to assume that everything works and move on to the next feature. So if you really just want to go with speed, then just use YOLO mode. But if you want to build something that's actually robust and production ready,

00:27:55.925 --> 00:28:00.565
then let the agent test. If you've got any questions, please leave a comment below.
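
NOTE
A small sketch of how the two modes differ, assuming checks are selected by name.
The check names and both functions are hypothetical; they only illustrate that YOLO mode keeps the lint and type checks and skips the browser testing.
```python
from typing import Callable, Dict, List
# Hypothetical check names; the actual harness may organise its checks differently.
def checks_for_mode(yolo: bool) -> List[str]:
    """Static checks always run; browser testing is skipped in YOLO mode."""
    checks = ["lint", "typecheck"]
    if not yolo:
        checks.append("browser_test")
    return checks
def run_checks(yolo: bool, runners: Dict[str, Callable[[], bool]]) -> bool:
    """Run the checks for the chosen mode and report whether all of them passed."""
    return all(runners[name]() for name in checks_for_mode(yolo))
```
In YOLO mode a feature with a broken UI can still pass, because nothing exercises it in the browser.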

00:28:00.805 --> 00:28:03.365
Or if you get stuck or have any issues,

00:28:03.685 --> 00:28:05.680
you can join my Skool community

00:28:05.680 --> 00:28:15.840
and either myself or one of the community members will assist you. And also remember to hit the like button and subscribe to my channel for more Claude Code and agentic coding tutorials.

00:28:15.840 --> 00:28:18.000
I'll see you in the next one. Bye bye.