WEBVTT

00:00:00.080 --> 00:00:01.040
Recently,

00:00:01.040 --> 00:00:04.960
I called Ralph Wiggum the peak evolution of vibe coding,

00:00:05.120 --> 00:00:20.525
and I did not mean that as a compliment. If you've been on my channel before, you know that I'm not a fan of vibe coding. We need to be in the driver's seat when we're working with our coding agents. So, yeah, I was in the camp of Ralph Wiggum is overhyped, let's just move on.

00:00:20.845 --> 00:01:03.420
But after learning about it more recently, I'm honestly not so sure about that stance anymore. Now, don't get me wrong. It is still vibe coding to an extent, and so a lot of what I covered in the last video still stands true. But one of the biggest gripes that I had with Ralph Wiggum is that there's no planning process really built into it, or at least so I thought, because I was really focused on the Anthropic plugin specifically for Ralph Wiggum. But this, my friend, is not the official Ralph Wiggum. And actually, after I put out this video, Geoffrey Huntley, the creator of Ralph Wiggum, commented and shared some resources, some videos on his channel, which I'll link to in the description, where he talks about how he builds extensive planning

00:01:03.500 --> 00:01:06.940
into Ralph Wiggum. Treating Ralph Wiggum as a philosophy

00:01:07.020 --> 00:01:57.470
versus this really shiny object framework like Anthropic has presented to us. And so there's a lot more to Ralph Wiggum than you would think. That's what I wanna cover with you in this video. Now, I am absolutely not saying that with proper planning, Ralph Wiggum is the best thing in the world. In fact, even in my last video, I said that when we add planning upfront to guide our coding agent, it is still not enough. And so if you really want to see what my process looks like starting complex projects from the ground up, I got a video on my channel coming soon for that, so definitely stay tuned. But there's still a lot more use cases for Ralph Wiggum than I outlined in that other video because proof of concepts are a very powerful thing. Even when I was working in an enterprise setting, a lot of times I would be building proof of concepts to prove out ideas to upper management or even myself as I'm in the exploratory phase.

00:01:57.630 --> 00:01:59.230
And it is just fascinating

00:01:59.230 --> 00:02:37.055
that with AI coding assistants and philosophies slash tools like Ralph Wiggum, our exploratory phase can literally go so far as to build out multiple entire versions of our application as a proof of concept to validate things like our architecture and our tech stack. That is, as long as you have a structured plan. These harnesses do not work well if you don't have very clearly defined goals, and that's one of the things I'll show you in a little bit. So if there is a single takeaway for you to have from this video, it's that, at least in my humble opinion, the real use of Ralph Wiggum and any agent harness for long running tasks is validation

00:02:37.215 --> 00:02:40.895
of your own ideas. Because now with these tools and frameworks,

00:02:41.055 --> 00:02:44.255
it is so incredibly easy to build even multiple

00:02:44.610 --> 00:02:46.370
entire proof of concepts

00:02:46.450 --> 00:02:50.930
to make sure you're heading in the right direction before you go build the full production ready application

00:02:51.090 --> 00:03:22.970
applying your more rigorous AI coding workflow. And that's what I'm gonna show you how to do right now. So I have this template that I'll link to in the description. You can use this to create a plan for the Ralph Wiggum loop in literally just five to ten minutes. And as long as it is a clearly defined scope in your plan, you can send this off to the Ralph loop and it'll build a beautiful proof of concept for you. So I'll cover how this works and then I'll do a live demonstration with you. And so we'll build something that's not even that trivial. We'll be using Neon for the database, Clerk for authentication,

00:03:22.970 --> 00:03:33.685
the Vercel agent browser CLI, so we have the validation loop within Ralph Wiggum. That's also super important. And so you'll see by the end of this that we can use this to build proof of concepts very,

00:03:33.765 --> 00:03:34.565
very quickly.

00:03:34.885 --> 00:04:28.665
And you can apply this to anything you'd want to make. Alright. So let's get into this now and build something out together. And credit where credit is due, what I have for you here is very inspired by Jared Blue's Ralph Wiggum guide. So this is one of the resources that I looked into when I wanted to really see how to use the Ralph Wiggum loop, not following the official Anthropic plugin, which is not the real Ralph Wiggum. And again, I focused on that way too much in my last video. And so I took a lot of inspiration from this, but I built something that I customized a lot for you to make it as easy to start as possible. And so scrolling past the introduction here, where I also talk about why we're not using the Anthropic plugin, because it doesn't even reset the context window. Like, it's just a bad idea to use it. Here we get to the prerequisites. So you can use Ralph Wiggum with any coding agent. I just have it set up to use Claude Code in the actual Ralph loop. So this is the bash script that drives everything

00:04:28.825 --> 00:04:48.600
with the Ralph Wiggum loop. And then going back to the README here, I'm also using the Vercel agent browser CLI for browser automation, because a big part of the loop is we want the agent to be able to check its own work at the end of each loop before we go to the next one. So you could use something like the Playwright MCP as well. This is just my preferred tool. I also made a video on it recently.

00:04:49.080 --> 00:05:11.760
And then we have all of the project files already set up, ready for you to go. So I'll talk about this more in a little bit. But the process for using Ralph Wiggum with structured planning here is super simple. And so we start with this create PRD slash command that I've built for you. And so again, using Claude Code here, you can invoke create PRD and it'll ask you questions,

00:05:11.840 --> 00:05:23.285
basically doing the discovery for your application to figure out all the context that it needs to create the structured plan for you. It'll even do research if you want it to, and then it'll generate this PRD.

00:05:23.685 --> 00:05:45.290
This is the list of features that you have to knock out to build your entire proof of concept. So this is the main context that goes into the Ralph Wiggum loop. And then there are some other markdown files that we have here. So I wanna explain this really quickly and then I'll talk more about how the actual loop works. And so first we have the prompt.md. So this is the context that we send into

00:05:45.615 --> 00:06:20.575
Ralph every single time we do a loop. And so as a part of it, we're also telling it to read the PRD so it has the full scope of work, what we are building for this proof of concept, and then also the activity.md. This is the markdown file where we keep track of what has happened in previous loops. And so every time we end a loop, we're going to update this file as one of our last steps. So we have this basically long term memory for the loop. So every single iteration we know what was implemented already. We can see how many tasks from the PRD have been completed, the last time it was updated, things like that.

00:06:20.895 --> 00:07:07.090
And so again going back to the README here, after you create your PRD, you're going to have a prompt.md that is customized to what you're gonna build. Obviously, the scope of work is customized too, and then you'll have your settings.json. So we have security built into Claude Code as well. It's running in sandbox mode. So we're going to be doing a lot of trusting here running the Ralph Wiggum loop, but at least we can limit the specific commands it can run. So we can let this go for a while and it's not just gonna go haywire and delete everything from our computer. So we have the settings.json here to restrict the things that it can do, and it only allows certain actions. So very important to have that as well. That's another one of those things that's not built into the Anthropic plugin when it really should be included in the guide, and that's why I want to give it as a part of the guide for you here.
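
NOTE A minimal sketch of the kind of settings.json described above, built in Python so the structure is explicit. The permissions allow/deny layout and the Bash(...) and Read(...) rule syntax follow Claude Code's settings format; the specific commands listed are illustrative assumptions, not the contents of the template's actual file.
```python
import json
# Hypothetical rules: the specific commands are assumptions for illustration.
settings = {
    "permissions": {
        # Command patterns the agent may run without being prompted.
        "allow": [
            "Bash(npm run build)",
            "Bash(npm run lint)",
            "Bash(git add:*)",
            "Bash(git commit:*)",
        ],
        # Explicitly blocked actions, such as reading secrets or bulk deletes.
        "deny": [
            "Read(./.env)",
            "Bash(rm -rf:*)",
        ],
    }
}
# Serialize to the JSON text that would be saved as the settings file.
settings_json = json.dumps(settings, indent=2)
print(settings_json)
```
Keeping the allow list short is the point here: the loop runs unattended, so anything not listed should stay out of reach.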

00:07:07.410 --> 00:07:09.890
And so once we have all this set up and we have our PRD

00:07:10.235 --> 00:07:27.360
that is when we can run the Ralph loop. And so this is the classic bash script, because all Ralph is is just running Claude Code in a bash loop. And so there's a bit more here than I really need to explain. The main thing though is we check to make sure we have all the files that we need, our prompt, PRD, and activity markdown documents,

00:07:27.520 --> 00:07:46.305
and then we start the loop. So we can limit the number of iterations as well. We can specify the max iterations, but otherwise what we're going to do is run Claude, just piping in this prompt. So it knows, like, here's your PRD, here's the activity log, now go and knock out the next feature. And the really cool part about our PRD,

00:07:46.305 --> 00:08:05.790
I'll show you this here, is that the feature list is structured in exactly the same way as it is in the Anthropic harness for long running agents. So I covered this on my channel as well. There's a lot of really great takeaways from this, and one of the big things to be learned from the Anthropic harness is how they list out the features

00:08:05.950 --> 00:08:10.590
that need to be knocked out for us to be able to say the application is actually complete.

00:08:11.185 --> 00:08:14.705
And so it's this combination of we have the category, the feature description,

00:08:14.785 --> 00:08:25.600
the steps for validation, where in our case we're using the Vercel agent browser CLI. And then also this flag, is the feature working or not? And so the only time that we tell Ralph,

00:08:25.760 --> 00:08:34.240
you are allowed to output that completion token. So we have this, you know, when you're done, that is when you can say promise complete.

00:08:34.240 --> 00:08:54.460
The only time that we're allowed to output this to exit the Ralph loop is when passes is true for every single feature that we have in our PRD. Right? Like this it's important to key in on this here for a sec because this is our only exit condition. Otherwise, this Ralph loop is going to run forever or until we reach the max iterations.
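
NOTE A minimal sketch of the control flow just described, written in Python rather than bash. It assumes the claude CLI is installed and that its -p flag runs a single non-interactive session; the exact spelling of the completion token is also an assumption, since the video only says "promise complete". The names ralph_loop and run_claude are hypothetical.
```python
import subprocess
from typing import Callable
# Assumed spelling of the completion token.
COMPLETION_TOKEN = "<promise>COMPLETE</promise>"
def run_claude(prompt: str) -> str:
    """Run one fresh agent session with the prompt and return its output."""
    result = subprocess.run(["claude", "-p", prompt], capture_output=True, text=True)
    return result.stdout
def ralph_loop(prompt: str, max_iterations: int,
               run_agent: Callable[[str], str] = run_claude) -> int:
    """Re-send the same prompt until the agent emits the completion token.
    Returns the number of iterations used, or -1 if the cap was hit first."""
    for iteration in range(1, max_iterations + 1):
        # Each call starts a new session, so no context carries over in memory.
        output = run_agent(prompt)
        if COMPLETION_TOKEN in output:
            return iteration
    return -1
```
Passing a stub for run_agent lets the exit condition and the iteration cap be exercised without spending any tokens.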

00:08:54.860 --> 00:09:32.905
And then like I said earlier, at the end of every single loop, we're also going to update the activity log here. So that's essentially how we take the output of one agent and we feed it in as the input for the next loop of Ralph. And that's pretty much everything to cover in the loop here and also everything that I have in the README. So I just wanted you to understand exactly how Ralph works, because it's more of a philosophy than it is a framework. It's how can we keep track of state over time, having many different agents knocking out these different features in different context windows so we avoid context bloat. That's also the main thing that I had as a takeaway from Anthropic's harness as well.
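
NOTE A minimal sketch of the state that carries over between loops: the PRD feature list and the activity log. The field names (category, description, steps, passes) mirror what is described above; the sample features, the helper names, and the log line format are invented for illustration.
```python
from datetime import date
# Sample PRD data: entries are invented, only the field names come from the video.
prd = {
    "features": [
        {"category": "auth", "description": "User can sign in with Clerk",
         "steps": ["Open /sign-in", "Submit test credentials", "See the dashboard"],
         "passes": True},
        {"category": "habits", "description": "User can create a habit",
         "steps": ["Open /habits", "Fill in the form", "See the new habit listed"],
         "passes": False},
    ]
}
def next_feature(prd: dict):
    """Return the first feature that is not passing yet, or None if all pass."""
    return next((f for f in prd["features"] if not f["passes"]), None)
def all_pass(prd: dict) -> bool:
    """The only exit condition: every feature has passes set to true."""
    return all(f["passes"] for f in prd["features"])
def activity_entry(prd: dict, summary: str, today: date) -> str:
    """Format one activity log line recording progress at the end of a loop."""
    done = sum(f["passes"] for f in prd["features"])
    total = len(prd["features"])
    return f"- {today.isoformat()}: {summary} ({done}/{total} tasks complete)"
```
Because each loop starts in a fresh context window, these two files are the only memory the next agent has, which is why the log is updated as one of the last steps of every loop.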

00:09:33.385 --> 00:09:35.625
And so with that, let's go ahead and start

00:09:36.185 --> 00:09:54.370
from the beginning together. I have a completely empty code base here besides the template. I'll start by creating the PRD, we'll build an entire application together and I'll show you the end results as well. Alright. So over in Claude now, this is the only time that I'll run the Claude command myself because once we have our PRD,

00:09:54.775 --> 00:10:23.490
then I just run the Ralph Bash loop and that's it. And then at the end, we'll have our full application. Hopefully, things are working perfectly. Fingers crossed. So little bit of vibe coding, but this is the driver's seat part right here creating our plan. And so we're having the coding agent walk us through a bunch of questions so it can create the plan, but it's still us determining things like the tech stack and architecture. So we'll see this in just a second here. So for each of the questions that Claude asks you to help create the PRD,

00:10:23.945 --> 00:10:50.710
it's multiple choice and you can also type out your own answers. So this is using one of the new features in Claude where it can ask questions natively in the CLI. It's super cool. And so I added multiple choice just in case you wanna rip through things really quickly and get to the Ralph Wiggum loop. Generally though, the more specific the better. So I will type out all the answers myself. And for the application, I have a pretty awesome idea to throw at Ralph Wiggum here. I wanna create an agent driven habit tracker application.

00:10:51.135 --> 00:11:00.015
Because keeping track of our habits, it's pretty important. So being able to check them off each day, but using an agent to give suggestions for us and helping keep us accountable.

00:11:00.175 --> 00:11:02.895
And so I'm gonna be using Clerk for authentication,

00:11:03.135 --> 00:11:24.215
Neon for the database to keep track of conversation history, and OpenRouter for the LLMs for the agent. And so it's not a super trivial application overall, and I wanna be quite specific about my tech stack here. So I'm gonna answer this question and all the other ones off camera just to not have to bore you with going through the details there. And I'll come back after and show you what I planned for the PRD.

00:11:24.455 --> 00:11:27.255
Okay. So our PRD is now generated.

00:11:27.255 --> 00:11:34.010
I wanna show you my answers to the questions really quickly if you wanna take inspiration from this to go through this whole process yourself.

00:11:34.330 --> 00:11:40.090
And so first of all, I said I wanna build an agent driven habit tracker application just like I described to you.

00:11:40.650 --> 00:12:01.660
And then all the text in gray here, this is the question it asked me and the answer that I gave as well. So it asked me who the audience is. I just said general personal users. I listed some of the features that it suggested to me actually. And so one of the things I love doing with coding agents in the planning phase is I keep myself in the driver's seat by still having it make recommendations

00:12:01.660 --> 00:12:08.220
to me, but in the end, I'm the final one to say, yes, this is our tech stack or these are the features that I want to implement.

00:12:08.860 --> 00:12:15.635
And so it asks about the tech stack, speaking of that. And so this is where I specify that I wanna use Clerk for my authentication,

00:12:15.875 --> 00:12:46.175
Neon for my serverless Postgres, and then I have an API key set up in OpenRouter as well. And so this API key is going to be slowly exhausted as Ralph Wiggum runs the validation with the Vercel agent browser CLI. So as a protection, I have a $5 limit here. So I will be spending real money in order to go through the loop, but I'm already spending real money for the coding agent anyway. And so as long as you have limits to make sure it doesn't go into an infinite loop and exhaust all of your API credits,

00:12:46.335 --> 00:12:49.215
then you're good. So protection's in place. I've got that covered.

00:12:49.810 --> 00:12:55.730
And then it asked me for some other ideas for the tech stack as well. So I'm gonna be using Next.js and Tailwind.

00:12:55.730 --> 00:13:35.125
I'm going to be using Drizzle for the ORM. So a couple of different things just to finish off the tech stack, shadcn as well. That's the main thing. It also asked about deploying to Vercel, which I don't really care about currently, but it's, I guess, good that it asked that too. And then the last question, understanding the success criteria. So when can we actually exit from the Ralph Wiggum loop? And I just said once we have all of the features marked as done in the PRD, so that it knows, like, okay, we're using Ralph Wiggum. We have to knock out everything in the PRD. Only then can we exit the loop. And now finally, with all those questions answered, I know it takes a bit of time, but it's worth doing that to create the structured plan

00:13:35.365 --> 00:13:42.805
because now it edits all of the placeholders that we have here. It crafts a prompt that is specific to how we wanna start our application.

00:13:42.965 --> 00:13:49.900
It gets the activity log kicked off for us here. We have zero out of 19 tasks complete. And then we have our PRD.

00:13:49.900 --> 00:13:55.100
This is the single source of truth for what we are building here with Ralph Wiggum.

00:13:55.260 --> 00:14:09.425
And so we have things that we decided together with our agent in planning, the target audience, the core features. The most important thing though is the task list that we have here. And so step by step here are all the things that you're gonna knock out. So generally,

00:14:09.665 --> 00:15:01.005
each loop of Ralph Wiggum is going to focus on one of these tasks. And so it's not always a one to one, but we'll run something around 19 loops to implement all 19 of these tasks. And then one other custom thing I wanna show you quickly before we run the loop is there's some things that I added to my global rules, my CLAUDE.md. And depending on what you're building, you might have to do something like this as well. Because think about it, we have our agent doing self validation with browser automation tools. So it's gonna create this habit tracker front end and it's going to visit it and navigate it as a user would. But it's gonna have to sign into Clerk. It's gonna have to set things up in our database. It needs environment variables. And so I'm telling it, here's this test account that I already made in Clerk. You can use this to sign in with username and password authentication.

00:15:01.325 --> 00:15:03.405
And then I am specifically

00:15:03.405 --> 00:15:17.710
prohibiting it from reading the .env file. This is blocked in the settings.json. But I'm telling it here that you can't read the .env, but within it I already have set the credentials you need for Clerk for authentication,

00:15:17.870 --> 00:15:53.780
for Postgres so you can run the migrations and create the tables, and the OpenRouter API key and OpenRouter model so you can set up everything to interact with the LLM as well. So everything that it needs to test the complete end to end application I have set up for it here. I don't want it to mock things, I don't want it to just create unit tests, I want it to be able to verify the entire application just like I would when I do manual validation, because that way once the control is passed to me after all these loops have run, things are actually working well and verified. And so it can leverage what's in the .env without reading the file itself. So we finished creating our PRD,

00:15:53.940 --> 00:15:57.060
it made all these changes to these files and created our PRD.

00:15:57.375 --> 00:16:27.815
After it does that, I would recommend validating these files to make sure everything's aligned with the goals that you have for the project. Then we go into running the loop, and it is as simple as running this Ralph bash script. This is going to kick off Claude and then just work in cycles until we have that completion promise. And so I'm gonna go into my terminal here. I'm going to exit out of Claude because we're not running Claude ourselves, we're just running the loop and the loop itself calls Claude. And so I'll open up a new window here, paste in ralph.sh

00:16:27.815 --> 00:17:10.890
and then 20 for 20 max iterations. You can do this however you want. Whatever you wanna do to prevent burning hundreds of thousands of tokens, you could do 10 if it's something simpler, but I would recommend something between twenty and twenty-five. I'll just do 50 because I just wanna be absolutely sure that it gets through to the end here. And I have the Claude Max subscription, so I know that I'm not gonna run out of tokens here. So I'll send this in, and there we go. We are starting in three seconds. And so what I'm gonna do here is I'm going to pause, and I'm gonna come back once we have this promise, once the application is fully complete, and we will see the results together. Alright. I am back the next day and the application is completely finished. I ran it overnight. I didn't actually need that much time. I think it took about five hours to do everything,

00:17:11.130 --> 00:17:17.210
including all of the validation. And so take a look at all the artifacts that we have here. So we have the screenshots directory,

00:17:17.210 --> 00:17:20.570
all the different images that it took as it was building the application.

00:17:20.650 --> 00:17:42.250
I had it switched to a dark theme towards the end here, so we'll see that in just a second. There we go. So things start to, like, really shape up and actually look fantastic. We have all of the AI coach features built in as well. We have our activity logs. We can see essentially the long term memory for the Ralph Wiggum loop. Every single loop, it updated this file with its progress and the things it did for validation.

00:17:42.490 --> 00:17:48.170
We have our PRD and scrolling down to the big part right here. Every single one of the features

00:17:48.250 --> 00:18:06.065
passes is set to true because we are done done. And going to our README here, we have instructions for how to spin this up ourselves. So generally, after you build out a proof of concept, you still wanna do the manual validation even though we have the browser automation tools for the agent to do it itself. But ideally,

00:18:06.065 --> 00:18:09.505
since it did all the checks itself and even all the regression testing,

00:18:09.940 --> 00:19:16.400
what we see now should work really, really well. And so I followed all these instructions to get it set up, and I've got the site running right here. So we got our sign in page with our Clerk integration, so I'll go ahead and sign in with the, uh, just test user that I created here. So my password, this is the same one that I told it to use to sign into the account. So I'll go ahead and continue. Very good. Take a look at that. So we can go to our dashboard here. And man, this is looking really beautiful. And take a look at this. It even generates this little like AI encouragement message every single time I visit. I can refresh the page as well. Don't know if I wanna do that every single time. It's a little bit expensive, but this is super cool. And so I can go to my habits here and I can see all my habits. I can check them off or uncheck them. I can add new habits like let's say, uh, drink coffee. I wanna drink coffee every morning. Uh, do it daily. Description optional. Very good. Create habit. Boom. Take a look at that. Drink coffee. This is amazing. We have our goals here so we can set longer term goals. We have the coach tab so we can see insights. So the agent is running I don't know exactly how often it does this, but it's generating

00:19:16.400 --> 00:19:24.275
insights based on the goals and habits that we set and how we check them off over time. User struggles with maintaining exercise motivation

00:19:24.275 --> 00:19:35.475
after initial strong start. So I guess as it was going through the testing here, it kind of like phased out checking things off. It's probably why it says this. This is super cool. I can ask the agent what habits should I focus on to improve my productivity,

00:19:35.840 --> 00:19:56.615
and I can have a conversation with my Pydantic AI agent running under the hood. It's storing all the information, the conversations, in my Neon database. Like, this is actually fantastic. Maybe I should do some better markdown rendering here. So, like, certainly there are some things, especially in the UI, that should be touched up, like the rendering and the spacing of the components here. But overall,

00:19:56.695 --> 00:19:58.375
like, this is fantastic.

00:19:58.375 --> 00:20:03.735
It built this entire thing for me end to end. I didn't do any human in the loop until the very end here.

00:20:04.430 --> 00:20:06.510
So I wouldn't trust this entirely

00:20:06.510 --> 00:20:08.270
to be the perfect application.

00:20:08.590 --> 00:20:13.390
But for a proof of concept, this is exactly what I need. If I'm just trying to validate an idea,

00:20:13.630 --> 00:20:33.430
just see it in action, maybe validate my architecture or tech stack, like I said, this is the real purpose of the Ralph Wiggum loop. And it wasn't that expensive overall. I mean, yeah, it used a lot of tokens. I don't know exactly how many tokens with Claude Code. But as far as all the testing that it did using the LLM for the agent coach, it only used about 7¢.

00:20:33.430 --> 00:21:20.675
I was using Claude Haiku 4.5 as my model. And so it did dozens and dozens of interactions with the coach and it was still really cheap. So overall, it was relatively fast, pretty reliable. Like, what I got back here is enough for me to validate any kind of proof of concept I would want. So that, my friend, is the real purpose of the Ralph Wiggum loop, and I would encourage you to try this out now. I'll have the template linked in the description. You can go through the exact same process I did, and you only have to have your hands on the keyboard for about ten minutes. Otherwise, you just let it rip and build something pretty awesome like this. So if you appreciate this video and you're looking forward to more things on AI coding, I would really appreciate a like and a subscribe. I got a lot more content coming soon around my techniques for agentic engineering. And with that, I will see you in the next video.
