Leon van Zyl · YouTube · 28:18

Stop Using Claude Code the Normal Way

Same prompt, same model, two apps: one broken, one shippable. The difference is a long-running agent harness.

Posted: January 5th 2026

Duration: 28:18

Format: Demo, educational

Channel: Leon van Zyl

§ 01 · The Hook

The bait, then the rug-pull.

Same prompt. Same model. Zero human interference. Leon van Zyl ran an identical spec through vanilla Claude Code and through his open-source long-running agent harness — and the gap is embarrassing. One app shipped with broken card editing, no light/dark mode, and a thumbnail generator that never fired. The other delivered all of it, plus features nobody asked for.

§ · Stated Promise

What the video promised.

stated at 00:05 "I am convinced that long running agents are the future of agentic coding and vibe coding. And in this video, I will prove that to you." delivered at 07:05

§ · Chapters

Where the time goes.

00:00 – 01:23

01 · The Premise

Long-running agents are the future. Same prompt, same model (Opus 4.5 thinking), two drastically different outcomes promised.

01:23 – 05:02

02 · Project 1 — Vanilla (broken)

No dark mode, AI chat cannot edit cards in real time, no system prompt editor, thumbnail generation produces text descriptions. Context compaction killed the implementation.

05:02 – 07:05

03 · Why Vanilla Fails

Context window compaction loses critical context mid-build. SpecKit/BMAD help but still require babysitting. The shift-handoff developer analogy introduced.

07:05 – 14:08

04 · Project 2 — Polished

Same prompt, polished result: light/dark mode, editable system prompts, delete/duplicate/filter, AI that actually edits cards, card history, Gemini thumbnail generation with reference images, 4K upscaling.

14:08 – 16:33

05 · The Architecture

Anthropic long-running agent harness: initializer creates feature list, fresh coding agents each implement next feature plus regression-test 3 random completed features, then close context window.

16:33 – 19:00

06 · The Tools Landscape

Automaker (WebDevCody) and AutoClaw are full-featured replacements. Leon's repo is a simplified version: harness plus UI, free, download ZIP and run.

19:00 – 22:48

07 · Live Setup Demo

New project creation, Claude generates app spec via conversational Q&A (quick mode vs detailed), agent proactively asks about reference images for thumbnail generation.

22:48 – 27:10

08 · Autonomous Coder Running Live

Initializer creates 190 features stored in SQLite not JSON. Dedicated MCP server with get_next_feature, get_regression_features tools. Debug window. Agent opens real browser to test each feature.

27:10 – 28:18

09 · YOLO Mode + CTA

YOLO mode skips browser testing for raw speed (lint/type checks only). Join Agentic Labs Skool community. Subscribe.

§ · Storyboard

Visual structure at a glance.

hook split-screen comparison 00:01

problem vanilla app broken 01:23

context prompt file in VS Code 05:02

value polished app thumbnails 07:05

framework architecture walkthrough 14:08

resource GitHub repo 16:33

demo spec generation chat 19:00

value SQLite features.db 22:48

value live browser testing 26:01

cta Agentic Labs Skool 28:00

§ · Frameworks

Named ideas worth stealing.

14:08 model

Anthropic Long-Running Agent Harness

Initializer agent parses app spec, creates feature list
Fresh coding agents pick up next feature + 3 regression tests
Agent closes session when near context limit, updates statuses
New agent starts fresh with lean focused context

Solves context-compaction by design: no single agent ever needs the full project history.

Steal for Any multi-session Claude Code project — JoeFlow, MCN, new MVPs
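
The loop described above can be sketched in a few lines. This is a toy simulation under stated assumptions, not the Anthropic harness or Leon's repo: the names Feature, initializer, coding_session and run_harness are hypothetical, the "initializer" just treats each spec line as a feature, and "implementing" a feature only flips a flag. What it does show is the control flow: every session starts fresh, regression-tests up to three completed features, takes the next pending one, and exits.

```python
import random
from dataclasses import dataclass


@dataclass
class Feature:
    name: str
    passing: bool = False


def initializer(app_spec: str) -> list[Feature]:
    """Hypothetical stand-in for the initializer agent: one feature per spec line."""
    return [Feature(line.strip()) for line in app_spec.splitlines() if line.strip()]


def coding_session(features: list[Feature], rng: random.Random) -> list[str]:
    """One fresh-context session: regression-test up to 3 done features, then build the next."""
    done = [f for f in features if f.passing]
    regression = rng.sample(done, k=min(3, len(done)))
    log = [f"regression: {f.name}" for f in regression]
    pending = next((f for f in features if not f.passing), None)
    if pending is not None:
        pending.passing = True  # a real agent would implement and verify the feature here
        log.append(f"implemented: {pending.name}")
    return log


def run_harness(app_spec: str, seed: int = 0) -> list[list[str]]:
    """Run sessions until every feature passes; each call models a new context window."""
    rng = random.Random(seed)
    features = initializer(app_spec)
    sessions = []
    while any(not f.passing for f in features):
        sessions.append(coding_session(features, rng))
    return sessions


if __name__ == "__main__":
    for number, log in enumerate(run_harness("light/dark toggle\nhooks page\nthumbnails"), 1):
        print(number, log)
```

The only state shared between sessions is the feature list itself, which is why no single session needs the full project history.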

24:08 concept

SQLite Feature Store over JSON

A massive JSON feature-list file can itself exceed the agent context window. SQLite + MCP tools let agents query only what they need.

Steal for Any batch agent workflow with 50+ tasks
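
A rough illustration of why the store matters, using Python's built-in sqlite3 module. The table layout is an assumption modeled on the columns mentioned in the video (priority, steps, pass/fail); the real schema in the repo may differ. The sketch compares how much text an agent would have to read if it were handed every feature at once versus only the next pending one.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE features (
           id INTEGER PRIMARY KEY,
           priority INTEGER NOT NULL,
           name TEXT NOT NULL,
           steps TEXT NOT NULL,              -- JSON-encoded list of test steps
           passes INTEGER NOT NULL DEFAULT 0
       )"""
)

# 190 placeholder features, matching the count the initializer produced in the demo.
rows = [
    (i, f"Feature {i}", json.dumps([f"step {s}" for s in range(5)]))
    for i in range(1, 191)
]
conn.executemany("INSERT INTO features (priority, name, steps) VALUES (?, ?, ?)", rows)

# File-style access: the agent reads every feature in one go.
all_rows = conn.execute("SELECT id, priority, name, steps, passes FROM features").fetchall()
full_dump = json.dumps(all_rows)

# Query-style access: the agent reads only the next pending feature.
next_row = conn.execute(
    "SELECT id, priority, name, steps, passes FROM features "
    "WHERE passes = 0 ORDER BY priority LIMIT 1"
).fetchone()
single = json.dumps(next_row)

print(f"full list: {len(full_dump)} chars, next feature only: {len(single)} chars")
```

Character counts are only a proxy for tokens, but the ratio grows with the number of features while the single-row read stays flat.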

24:53 concept

Dedicated Features MCP Server

get_next_feature
get_regression_features (3 random)
update_feature_status

Purpose-built MCP tools reduce token usage and improve reliability vs having the agent read/write files directly.

Steal for Agent orchestration layers in JoeFlow or MCN Chef workflows

27:15 concept

YOLO Mode vs Test Mode

Explicit speed/quality toggle: Test Mode opens a real browser and verifies each feature; YOLO Mode runs lint/type checks only.

Steal for Any product with a batch processing pipeline
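
One way to express the toggle is a single verification function that always runs the cheap static checks and only reaches the browser test outside YOLO mode. The names Mode and verify_feature are hypothetical, and the checks are passed in as plain callables so the sketch stays independent of any particular linter, type checker, or browser driver.

```python
from enum import Enum
from typing import Callable

Check = Callable[[], bool]


class Mode(Enum):
    TEST = "test"  # static checks plus a real browser run
    YOLO = "yolo"  # static checks only


def verify_feature(mode: Mode, lint: Check, typecheck: Check, browser_test: Check) -> bool:
    """Return True when the feature passes the checks required by the chosen mode."""
    if not (lint() and typecheck()):
        return False  # static checks run in both modes
    if mode is Mode.YOLO:
        return True  # skip the slow, token-heavy browser verification
    return browser_test()


if __name__ == "__main__":
    always_ok = lambda: True
    broken_ui = lambda: False
    print(verify_feature(Mode.YOLO, always_ok, always_ok, broken_ui))
    print(verify_feature(Mode.TEST, always_ok, always_ok, broken_ui))
```

The same feature can pass in YOLO mode and fail in Test mode, which is the trade-off the toggle makes explicit.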

§ · Quotables

Lines you could clip.

06:13

"This is like having developers work in shifts, where one developer does a piece of work and then leaves the office. The next developer comes in having no context on what the previous developer did."

Instantly relatable analogy that makes the technical problem visceral → TikTok hook

26:01

"This really is the secret sauce. This agent will actually open up a browser window and test the application in real time."

Reveal moment — visual proof of autonomous quality assurance → TikTok hook

08:30

"Keep in mind, this was all done through a single prompt. The same with the first project, but I just think this just feels way more polished."

Single-prompt claim plus visible quality gap is high-engagement clip setup → IG reel cold open

§ · Pacing

How they spent the runtime.

Hook length: 83s

Info density: high

Filler: 8%

§ · Resources Mentioned

Things they pointed at.

14:08 link Effective harnesses for long-running agents (Anthropic)

16:33 link autonomous-coding-ui by leonvanzyl

16:48 product Automaker by WebDevCody

16:57 product AutoClaw

28:00 product Agentic Labs (Skool community)

§ · CTA Breakdown

How they asked for the click.

28:00 product

"You can join my Skool community and either myself or one of the community members will assist you."

Soft sell after YOLO mode demo. Agentic Labs Skool at $5/month. Paired with subscribe ask.

§ 04 · The Script

Word for word.

HOOK opening / re-engagement · CTA the pitch · analogy · story

00:00HOOKWelcome back and a happy New Year. Now I'm convinced that long running agents are the future of agentic coding and vibe coding. And in this video, I'll prove that to you. I'm going to show you two projects that have three things in common. They were both built using Claude Code. I sent them each the exact same prompt, and finally, I didn't interfere at any point. I simply let the agent run on autopilot.

00:24HOOKAnd as you'll see in a second, the results are drastically different. And later on in this video, I'll show you how you can use this long running agent yourself, and it's really easy. First, let's have a look at the prompt that I passed to each of these agents. And this is a really detailed prompt. It gives a basic overview of the app that we're trying to build. And by the way, this is an application that can assist content creators

00:48HOOKwith coming up with hooks and titles and thumbnail ideas. I also referenced some technical documentation that it can use to understand the different frameworks involved. Then we've got some prerequisites like we're starting off with the Next.js project. We've got our problem statement and a lot more. So a few things to note is with the thumbnail generation,

01:09HOOKI actually want to use the Nano Banana model to generate different thumbnail ideas. And then we also have this built in AI assistant that's able to manipulate the results, generate more results,

01:21HOOKand come up with additional ideas. So this really is a massive project. So for the first project, all I did was go through the normal workflow using Claude Code CLI. So I went into planning mode, I then passed in that entire prompt into Claude Code, and Claude then created an implementation plan.

01:40And afterwards, I just told Claude to go ahead and implement it, and this was the result. So this is a really simple user interface. There's no light or dark mode. Although, I did specify light and dark mode in the actual prompt. And for reference, I did use Opus 4.5 in thinking mode for this project. So I've already created a few sample projects here, but something to note is there's actually no option to delete any of these projects.

02:06Let's create a new project. Let's create n8n beginner tutorial, and now we can provide the topic that this video is about. So we can say this is a comprehensive beginner tutorial on getting started with n8n from basic installation to building agentic workflows. Now what we can do is generate the different hooks and this simply takes us to the hook page where we have to click on generate. Alright. So the agent generated these different hook ideas. We can see these cards on the screen and we also have this chat interface where we can have a conversation with our AI assistant.

02:41But I'm really not a fan of this response. Either way, one of the requirements in the prompt file was that the assistant should have the ability to edit any of these cards. So it should be able to add cards, remove them, or change the cards. As an example, let's change this one. So instead of saying stop paying for Zapier,

03:01let's change it to make.com. Okay. So it's responded saying that the hook was changed, and in this array, we can see that it's now saying make.com. But if I have a look at the card, it's still showing Zapier.

03:14Let's try to refresh this page, and this still wasn't changed. Look, it's impressive that some of the stuff just works as a one shot prompt, but it's clear that some of the more advanced features were missed, like the ability for this agent to actually manipulate the cards on the screen. Another thing that's missing is the ability to edit the system prompts

03:34for any of these agents. Like, what if I don't like the style of these hooks and I actually wanted to train the agent to generate hooks that meet my preferences? I don't see any way to edit the system prompts. And to be clear, this was a requirement

03:50in the prompt file as well. Either way, let's select one of these hooks and let's continue to intros. Let's generate it. Alright. So now we have our different intros. Again, we can see the array in the conversation window. Let's try to remove a card. Please

04:07remove item five. Right. So in the chat window, it says the item was removed, but looking at the UI, card five is still there even after refreshing the UI.

04:19By the way, let's just select any of these intros and let's continue to the titles. It's generated titles and we get our title ideas. Let's actually test this regenerate function to see if it works. And that does seem to work. So let's select this one. Let's continue to thumbnails. Let's generate the thumbnails,

04:39and this simply gives us a description of the thumbnail and not the image itself. And as a reminder, we specifically said in the prompt that we want to use Gemini 3 Pro, so Nano Banana, to generate thumbnails.

04:53And I don't see any option on the screen to generate those thumbnails. So let's simply go with this one, and let's complete the project. And that actually takes us back to the home screen without any summary or anything else. And unfortunately, this is the reality of trying to build a massive project like this in a single context window. As the agent was trying to implement all of these features,

05:17it exceeded the context window size. So it kept compacting the conversation, which actually lost a lot of important context. And then this is the final result. Now there are frameworks out there that try to solve this issue by splitting this massive requirement up into smaller features and then implementing these features one by one. You might be familiar with SpecKit or BMAD or if you've ever used my Agentic Coding Starter Kit, that is exactly what the spec feature does. It looks at this massive prompt and then it actually creates a separate implementation plan that's split up into phases and actionable tasks.

05:55Now the limitation with that approach is that you still have to coordinate everything. So you need to tell the agent to go and implement phase one, and once that's done, you have to intervene by clearing the context window and asking the agent to start with the next phase. And this does introduce other issues. This is like having developers work in shifts,

06:16where one developer would do a piece of work and they would then leave the office. And the next developer comes in having no context on what the previous developer did. So they just kind of take on the next piece of work, implement it, and they could introduce duplicate code. They could even introduce bugs that break whatever the previous developer did. And besides, for large projects like this, I don't think any of us actually want to sit around and babysit the agent whilst going through 20 or 50 phases.

06:46So the solution to this is to implement a long running agent harness. This harness will allow the agent to run for a very long time. This could be hours or even days to implement everything, and this also includes regression testing. So let's have a look at the second project. Again, I used the exact same prompt.

07:06I also used Opus 4.5 in thinking mode. A lot of this will look familiar. It's the same project that we're trying to build after all, but you'll already see a lot of small differences. The first being is we've got access to light and dark mode. We also have this settings button where we can edit the different prompts. So indeed, we can edit the system prompt for the Hooks agent, the Intros agent, titles,

07:31and thumbnails. Then from the main screen, we can actually filter projects by different statuses. This isn't even something I asked for, but the long running agent decided that these features would make sense. When I hover over these cards, we have the option to delete the project, edit it, we can even duplicate the project, and of course, open it. Let's try to delete this project and that works.

07:55Let's then try to create a new project. So let's give it a name. Let's provide the topic. We can also provide the target audience. So I'll just say no coders and automation experts. And for the content style, let's do educational.

08:12Then let's create this project. Then let's open it. And this already looks way better. We can see exactly where we are in the process. We have a summary of the topic, and on the right hand side, we've got access to our AI assistant. Let's just say hello to see if this actually works,

08:30and it does. Keep in mind, this was all done through a single prompt. The same with the first project, but I just think this just feels way more polished. Let's go to generate hooks, and now we've got our different hooks. We can actually copy the text, we can even edit the hooks, which is really helpful. I didn't see that in the first project. And of course, we can select the hook that we want. We also have the option to regenerate these hooks or we can add more hooks. But let's see if our AI assistant can actually edit any of these. So let's have a look at this one. Most people spend ten hours a week on repetitive tasks. I spent ten minutes. Let's actually change this text. So let's say, please

09:12change card three from this to I

09:18spent five minutes. Let's see what this does. We'll just let the agent cook and that's it. It actually changed this card. And we can see this little edited badge as well. That's awesome. Another thing that just popped up is this history button. So if I click on this, I can actually see the previous version of this card. This is insane.

09:40Let's try removing a card. Please remove card five. So this guy over here. And done. That card is now gone. This is insane. This is really, really cool. All with a single prompt. Let's continue to intros. I'm just going to generate whatever. This is probably a small thing, but I do appreciate

10:01that the AI assistant gives you these little buttons to tell you what you can use this assistant for. This isn't something I asked for, but it's really cool that it's there. Okay. So we've got our different intros. Let's select this one. Let's continue to titles, and we've got our different title ideas.

10:20Let's simply grab this one. Let's move on to thumbnails. So during the planning phase, the agent asked me if I would like to have the ability to send reference images to Nano Banana. So if you're unfamiliar with how Nano Banana works, it's really good at generating new images, but also really good at editing images, which also means you can provide a reference image, so maybe a photo of myself

10:45along with maybe any important logos that could be used in the design. So that agent actually did that research. It looked at the capabilities of Nano Banana and said, hey, would you like the ability to provide reference images?

11:00So one thing I did was in the project files, if I go to public thumbnails and template, I simply uploaded this thumbnail template that I use in a lot of my videos. So in the planning phase, I just said that's a great idea. I would love to use reference images.

11:18So I just uploaded this template of myself into the public folder. You can use that as a reference. But now I can see in the implementation, it's actually taking it a step further. So it also allows me to upload things like logos.

11:32So for this, let's actually select the n8n logo and that should be it. Let's go ahead and create these thumbnails. And just to be clear, for the first project, the planning agent never asked me if I wanted to use reference images with Nano Banana. It just kind of skipped that altogether.

11:48Now I do realize this is nondeterministic, and chances are if I had to run that process again, it might have considered this to be an option. And here we go. We actually have our thumbnails coming in from Nano Banana. This is awesome. I actually like this first one. So what we can do is view this in full screen, which looks great. We can also view the prompt that was sent to Nano Banana. We can also upscale the image. So I guess if this is the one that we want to use, we can simply upscale it to 4K. We can also refine the image. So let's click on that and let's change the text. Let's say, please replace

12:28game changer with beginner tutorial. Alright. Let's click on regenerate, and let's see if this actually works. Alright. It seems to be done. And, dude, look at that.

12:41We have our thumbnail with the text beginner tutorial. And when we click on history, we can see the different revisions of this thumbnail. This is just stupidly impressive. And keep in mind, this was a single prompt that simply ran on autopilot,

12:56and this is pretty much a usable application. I'm actually going to make one more refinement. Let's say, please change the yellow shirt and glow into a color that complements the n8n pink on the color wheel. Let's just see what we get back. And there we go. This looks way way better.

13:13And finally, I guess we can simply upscale this image to use it on the actual video. And cool. The image was upscaled to 4K and we're able to download it as well. Man, this is just so impressive. Just to think this was a single prompt. And finally, on the complete screen, we actually get a summary. Now I just wanted to see if we can actually go back to the different steps. So if I click on Hooks, it says that, yes, you can go back to Hooks. Your progress in later stages will be preserved. Oh, cool. So let's go back to hooks. Here, I can see the hook that I selected.

13:48And if we go back to thumbnails, we can see the thumbnail that was selected. So all of this is persisted even if I refresh this page. Now let me show you how I built this and how you can use this long running agent yourself. A few weeks ago, I created this video that shows you how you can let Claude Code run for hours or days to implement complex solutions. That solution was based on this "Effective harnesses for long-running agents" by Anthropic.

14:15This is a brilliant article, so I definitely recommend giving it a read. They also give you this quick start guide, which is an example project that you can play with. Now, this is a really cool project and it's actually quite simple. The entire process starts with you providing an app spec file. This is a really simple document that contains all of the prerequisites

14:37and tech stack and core features of your application. You can simply use Claude or ChatGPT to populate the file for you. Now when you execute this application, so the long running agent harness, it will start off by running this initializer

14:52agent. This agent will have a look at your app spec file and it will actually do a couple of things. It will set up the basic project structure, but more importantly, it will also produce this feature list. Now this is really the magic of having these long running agents. This feature list could have tens to hundreds to even thousands of features.

15:16Each and every aspect of your application becomes a feature. For example, having a light and dark toggle, that becomes a feature and that feature needs to be implemented by a coding agent and tested thoroughly. And once the initializer agent completes its work, the framework kicks off coding agents. So this coding agent will have a look at the feature list to find the next feature that needs to be implemented.

15:41This coding agent also has a second task and that is to perform regression testing. So it needs to retrieve the next feature that should be implemented, but it should also pick three features at random that have already been implemented and then do regression testing on those features. So this agent will perform its regression testing and implement the next feature.

16:03And once it exceeds its context window or gets close to filling up its context window, it will close the session, update the statuses of the features, and then move on to the next coding agent. And each of these coding agents have their own context windows,

16:19ensuring that the context stays lean and focused on very specific tasks. So that is basically how this Anthropic framework works. But, of course, this all runs in the terminal. It's very technical to set up, and it's not really meant for everyone. So there's been this drive to build UIs and applications

16:37that use this theory behind the scenes, but it's way easier for pretty much anyone to use, whether you are an experienced developer or a vibe coder. An example of such a project is Automaker from another YouTuber called WebDevCody. Cody and his Discord community have been putting a lot of work into this application. It uses all the theory that we just discussed. We start off with an app spec and the initializer agent will then create a feature list based on your requirements. And now you can visually see the coding agents implementing all of these features using a Kanban board. Now this is really a complex and fully featured application,

17:19and I do want to create a dedicated video on this. Now tools like Automaker and even AutoClaw do a lot more than just this Anthropic harness. They're trying to be fully featured applications that pretty much replace your coding editor to an extent. You can do stuff like branching and worktrees and run terminal sessions within those applications.

17:40So for this video, I decided to really simplify things. This takes the Anthropic harness and simply slaps a UI on top of it. And it's really easy to use and free by the way. So all you have to do is go to this repository, which I'll link to in the description of this video. Then what you have to do is click on code and click on download ZIP. Then you can simply extract the contents of that file anywhere on your machine. It really doesn't matter. Then depending on your operating system,

18:10you can double click on start UI for Windows or start UI dot SH for Mac or Linux. So because I'm using Windows, I'll just click on start UI, then hit Run,

18:23and this will set up the Python environment for you and install all of the dependencies. And afterwards, you'll get this URL and if you open this, you will have access to this UI. You just won't have any projects set up yet. Now this is using your Claude Code account behind the scenes, so you do have to have Claude Code set up. So it means you have to install the Claude Code

18:45CLI tool and authenticate yourself in Claude Code. Now if you're watching this video, I do assume that you already have Claude Code set up, but if you have no idea what I'm talking about, you can simply go to my GitHub repository and simply follow these prerequisites. So you just have to install Claude Code and then authenticate yourself by running this command in the terminal. You need to either have a Claude subscription

19:09or you can use an Anthropic API key. So this uses the exact same harness from Anthropic with a few very important changes. And I simply made these changes for performance sake, but everything else remains the same. To create a new project, we can simply click on this drop down. Let's go to new project, and let's just give this a name like YouTube

19:33demo. Let's click on next. Then let's click on select this folder. And now you have two options. With the Anthropic example, so the vanilla project, you have to change the app spec file yourself.

19:46So you have to use Claude Code or ChatGPT or just manually manipulate that file. But what I added to this application is the ability to use Claude to generate that file for you. So let's actually test that feature. Claude will now ask you a few basic questions, like, what do you want to call this project? And in your own words, what are we building? Who will use it? So it's just really,

20:10really simple. I'm actually just going to copy this prompt and paste it into the chat. Now this is really meant for everyone. So if you have no idea how to use a tech stack or how to define what you want, simply have a natural conversation with the agent and it will guide you through everything. In fact, even passing in all of this context, the agent is giving us two routes.

20:34We can select quick mode, and this is where the agent will make decisions on your behalf. For instance, the tech stack. Or if you really want to get involved in the architecture, you can go into detailed mode. And this is where you can tell the agent exactly what you want to use in terms of the database, conventions related to database schemas, etcetera.

20:54For this, let's simply go with quick mode. Right. It's asking us a few more questions. I'm just going to say you choose. And the agent is asking us a few more questions. Now this is something I really like, and this is where the agent asked me if I wanted to use reference images for generating thumbnails, something Claude Code didn't do in the vanilla planning phase.

21:15I'll just say, you choose again. And then let's just say, sounds good. Obviously, when you're using this, you don't want to rush through this. Really go through all the questions and just have this conversation with the agent. And now the agent is giving us a breakdown of all of the features in our application. So if you want to add anything or remove anything or change anything,

21:37this is the time to do it. In total, we're looking at 174 unique features for this application. I'm just going to say it looks good. Another agent is creating two files. It's creating that app spec file and it's also updating the initializer prompt with the number of features that need to be implemented. And I do want to warn you, this step can take a few minutes to complete. And one thing I forgot to mention is you can attach files to this conversation as well. So maybe you've got an image of a design that you really like,

22:10you can definitely upload that image as well. And by the way, if you were curious, this is what the project currently looks like. So the agent wrote this app spec file. This gives us the project name and an overview. It contains the tech stack, the prerequisites, the amount of features that need to be implemented, and all the core features.

22:30And then the initializer prompt was simply updated to contain the amount of features as well. And now we can see the agent completed its work, and we now have this button to continue to the project. So now that we have our spec file, the initializer agent will run to set up the entire project structure, and it will also produce this feature list. So if we have a look at this app, the initializer agent is currently running, and you can also see exactly what the agent is currently doing.

22:58If you want more information, what you can do is simply click on debug. You can also press d on the keyboard to bring up or hide this debug window. And in here, you can see exactly what the agent is busy with. Alright. So we can see the agent already created the first 20 features and they're showing up in the pending column.

23:17So these features are sorted by priority. So the ones at the top will be implemented first. I also want to mention that you can manually add new features as well. So for existing projects, really all you have to do is go to add feature, and here you can tell the agent exactly what feature you want to implement. And you can simply run the workflow and the agent will automatically pick up that feature and implement it. Another small quality of life feature is the chat window. So when you click on this, you can ask questions about your project or the code base in this chat window. The initializer agent created 190

23:53features. That is more than the amount of features that we agreed on in the conversation. So sometimes it might create more features than you expected, but that is just because it identified certain gaps and it's trying to create complete test coverage. It's also setting up the project structure and installing all of the dependencies. Let me show you some of the improvements I made to the original harness.

24:16The Anthropic harness writes all of the features to a massive feature list file. Now, if you have hundreds of features, this file can get massive. That feature list file by itself can exceed the context window of the agent. So instead of having this massive JSON file, I instructed the agent to add features to a SQLite database instead. So in this database, we can see all of the features with their priorities,

24:41with their steps, and whether this feature passed or not. Now in order for the agent to add features and update features in this database, I created a dedicated features MCP server with tools for updating and working with this database. So there's a tool for retrieving the next feature to implement, there's a separate tool for retrieving three features to regression test,

25:04etcetera. I found that this approach reduces token usage and it actually improves performance. Perfect. The initializer agent is now done. The coding agents will now start implementing these features. So if I have a look at this debug window,

25:18you can see the agent is using these MCP tools. So there's a tool for retrieving the stats. This will tell the agent how many features have been implemented. There's a tool for retrieving the next feature to implement.

25:30And if there were any completed features, the agent would use the MCP tool to retrieve three random features to regression test. Great. So the coding agent just pulled in a feature, and it's busy implementing this feature over here. And it's already implemented that feature. Well, that's cool. So it's now retrieving the next feature to implement, which is this guy over here. And in my opinion, this really is the secret sauce. This agent will actually open up a browser window

26:00 and test the application in real time. So as you can see, it's actually opened up this browser window. This allows the agent to actually see the application, so it can identify any UI-related bugs or styling issues as well. This is really relying on test-driven development. So it will test a feature, realize, oh, it actually doesn't work because there's a bug or the feature is totally missing.
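
The session loop described here (regression-test up to three random completed features, then test the next feature, implement it if the test fails, and test again) can be sketched roughly as follows. This is an illustration under stated assumptions, not the harness's code: `verify` and `implement` are hypothetical callables standing in for the browser checks and the agent's code changes, features are plain dictionaries, and re-queuing a feature that fails its regression check is an assumed policy.

```python
import random


def pick_regression_features(features, k=3, rng=random):
    """Pick up to k random features that already pass, to be re-tested."""
    passing = [f for f in features if f["passes"]]
    return rng.sample(passing, min(k, len(passing)))


def run_session(features, verify, implement, max_attempts=3):
    """Run one coding session and return the feature worked on, or None.

    verify(feature) -> bool and implement(feature) are hypothetical
    stand-ins for the UI test and the agent's implementation step.
    """
    # Step 1: regression-test a random sample of completed features.
    for feature in pick_regression_features(features):
        if not verify(feature):
            # Assumed policy: a regressed feature goes back on the to-do list.
            feature["passes"] = False

    # Step 2: take the next feature that does not pass yet.
    todo = [f for f in features if not f["passes"]]
    if not todo:
        return None
    feature = todo[0]

    # Step 3: test first, implement only if the test fails, then re-test.
    passed = verify(feature)
    attempts = 0
    while not passed and attempts < max_attempts:
        implement(feature)
        attempts += 1
        passed = verify(feature)

    feature["passes"] = passed
    return feature
```

Testing before implementing is what makes the loop test-driven: a feature is only marked as passing after a check against the running application succeeds.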

26:24 It will then implement the feature and try again. So as I've been speaking, you can actually see the page changing as the agent is testing this functionality. And it's this workflow of implementing a feature and testing it using the UI that really makes a massive difference. The agent is not simply implementing this blindly.

26:44 · CTA · It's testing the application to make sure everything works and everything is there. Yes. This takes a long time to complete, and it uses a lot of tokens. And if it's quality you're after, this is really the approach you need to take. And if you're on a smaller plan with Anthropic, like the $20 plan, you will reach your usage limit relatively quickly. But honestly, it doesn't matter. Your usage resets after a certain period of time, and this agent will simply auto continue the moment your usage has been reset. So you can run this while you're sleeping or over weekends or when you're away. Now if you want to just run through the stuff as quickly as possible without testing,

27:22 · CTA · you can do that as well. Simply stop the server, then select YOLO mode, this yellow button over here, and click on start agent in YOLO mode. So this will simply grab the next feature and implement it blindly. It will do lint and type checks though, so it will look for any syntax errors, but it's not going to test these features at all. It's just going to assume that everything works and move on to the next feature. So if you really just want to go with speed, then just use YOLO mode. But if you want to build something that's actually robust and production ready,

27:55 · CTA · then let the agent test. If you've got any questions, please leave a comment below. Or if you get stuck or have any issues, you can join my Skool community and either myself or one of the community members will assist you. And also remember to hit the like button and subscribe to my channel for more Claude Code and agentic coding tutorials.

28:15 · CTA · I'll see you in the next one. Bye bye.


§ 05 · For Joe

The shift-handoff analogy is the hook. The harness is the product.

Builder playbook

One prompt, two apps, visible gap — the before/after demo format Leon uses here is exactly how Joe sells JoeFlow and any tool with a quality story.

Run leonvanzyl/autonomous-coding-ui on the next big MCN or JoeFlow feature sprint — free, open source, just download ZIP.
Steal the shift-handoff developer analogy for any Claude Code content: it lands instantly with non-technical audiences.
The SQLite + MCP server pattern for feature tracking applies directly to Chef orchestration in JoeFlow Sessions.
Surface YOLO mode as a product concept — explicit speed/quality toggle is a UI pattern worth borrowing for batch jobs.
Frame long-running agents as stop babysitting — dovetails with own your stack, stop renting positioning.

§ 05 · For You

What this means if you build with Claude Code.

For the builder watching

If your Claude Code sessions keep producing half-finished apps, the problem is the context window — and there is a free, ready-to-run fix.

Download leonvanzyl/autonomous-coding-ui from GitHub — ZIP, extract, double-click start_ui on Windows.
Start with a detailed app spec: list your tech stack, core features, and requirements. Claude will help you write it.
Use Test Mode for anything you plan to ship. YOLO Mode for throwaway prototypes only.
The initializer agent will generate more features than you asked for — that is a feature, not a bug.
If you hit the Anthropic usage limit mid-run, the harness auto-resumes when your quota resets.
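
The Test/YOLO toggle and the auto-resume behaviour in the list above could be modelled roughly like this. It is a sketch only: `Mode`, `checks_for`, `UsageLimitReached`, and `run_with_auto_resume` are hypothetical names, and how the harness actually detects a usage limit or learns the reset time is not described in the video.

```python
import time
from enum import Enum


class Mode(Enum):
    TEST = "test"   # lint, type checks, and browser testing
    YOLO = "yolo"   # lint and type checks only


def checks_for(mode):
    """Return the checks run after each feature in the given mode."""
    checks = ["lint", "typecheck"]
    if mode is Mode.TEST:
        checks.append("browser_test")
    return checks


class UsageLimitReached(Exception):
    """Hypothetical error raised when the plan's usage quota is exhausted."""

    def __init__(self, reset_at):
        super().__init__(f"usage limit reached, resets at {reset_at}")
        self.reset_at = reset_at  # epoch seconds when the quota resets


def run_with_auto_resume(session, now=time.time, sleep=time.sleep):
    """Run session(); on a usage limit, wait for the reset and retry."""
    while True:
        try:
            return session()
        except UsageLimitReached as exc:
            sleep(max(0.0, exc.reset_at - now()))
```

Passing `now` and `sleep` as parameters keeps the waiting logic testable without actually sleeping.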

§ 06 · Frame Gallery

Visual moments.

06:53

14:09

16:05

19:12

23:02

27:24