WEBVTT

00:00:00.000 --> 00:01:04.510
So I went to all the major skill libraries, Garry Tan, Superpowers, Matt Pocock, BMAD, Everything Claude Code, GSD, OpenSpec, and more. Essentially, what I wanted to do was distill all the key ideas down into a couple of prompts, skills, and patterns that you can reuse on every project over and over, and these are the ones I've settled on after almost two years now of working with agents. We're also gonna talk about how the most powerful skill system is built and why you're not gonna find it in someone else's repo. Also this week, our sponsor CodeRabbit just released a wicked new tool for code review. More on that later. So skills are just part of the entire agent harness that we use to build our software. Skills are just natural language prompts. What differs is you put them in a special file called SKILL.md. You've got some front matter in there, which is some extra information like the name of the skill and the description. And this is always loaded by the model, so it has an idea of what skills it is capable of using. You can have scripts, reference material, assets all bundled into your skills folder.
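
NOTE
Editor's sketch, not from the video: a minimal illustration of the skill file layout described above, assuming the front matter is plain "key: value" lines between two "---" markers. Real harnesses typically use a full YAML parser. The skill name "design-mode", its description, and its body are hypothetical.
```python
# Hypothetical skill file contents; the name, description, and body are made up.
SKILL_TEXT = "\n".join([
    "---",
    "name: design-mode",
    "description: Prototype the front end only, using dummy JSON for data.",
    "---",
    "We are in prototyping mode. Develop the front end only.",
])
def parse_front_matter(text: str) -> tuple[dict[str, str], str]:
    """Split a skill file into its front matter fields and its prompt body."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no front matter: the whole file is the prompt
    try:
        end = lines.index("---", 1)  # closing marker of the front matter
    except ValueError:
        return {}, text  # unterminated front matter: treat as plain prompt
    meta: dict[str, str] = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body
meta, body = parse_front_matter(SKILL_TEXT)
print(meta["name"])
print(body)
```
Only the name and description need to be visible to the model up front; the body is the natural language prompt itself.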

00:01:04.590 --> 00:01:08.190
So recently, what we're starting to see trend is skill libraries.

00:01:08.190 --> 00:01:30.890
So we've seen some really popular ones from Garry Tan, Affaan Mustafa, Matt Pocock, Addy Osmani. All these guys have been working in the industry for many years and have been pooling their approaches and thought processes into collections of skills, which they're sharing with the rest of the community. If you look at GitHub for any of these skill systems, you'll see they're seeing huge growth. If you look at something like Everything Claude Code, that's up at 183,000

00:01:30.890 --> 00:01:31.610
stars.

00:01:31.770 --> 00:01:45.945
For context, there have been some huge open source projects that have been on the go for twenty years that don't have anywhere near that many stars. So having studied, used, and reviewed all these different skill libraries, you can really see that there are a lot of common denominators.

00:01:45.945 --> 00:02:00.750
Let's take a quick look at some of the major repositories and see what the commonalities are. So Addy Osmani is from Google. He's got a great blog that's worth following. What he advocates for is a simple set of skills based around creating a spec or specification

00:02:00.750 --> 00:02:25.630
for what you want to build. So you have a chat with an agent and decide exactly what's going to be built. Then you create a plan so the agent thinks about how it's going to build out the specification or idea that you have. It builds it and builds incrementally one slice at a time. Once the build is finished, it gets tested to make sure it adheres to the plan and the spec. Optionally then, you can review the code before it gets merged. You can run a code simplification

00:02:25.790 --> 00:02:27.630
prompt to ship to production.

00:02:27.790 --> 00:02:31.870
So this is basically just one iteration of development all encapsulated

00:02:31.870 --> 00:02:32.590
into

00:02:32.750 --> 00:02:34.110
six or seven prompts.

00:02:34.270 --> 00:02:48.835
Matt Pocock has been on the go for years, a fantastic TypeScript developer and AI engineer. He is advocating for real simplicity. In Matt's skills, we see a lot of similarities to what Addy has developed. We have a diagnose, which is very similar to a discuss or specify.

00:02:48.915 --> 00:03:05.940
I do like the grill with docs skill. Matt is a big believer in having a grounding and understanding in the domain model that you're working in and not having to repeat that every single time, because, remember, our agent has a certain amount of context window, and you wanna load some of that context and understanding of the project. So grill with docs is one that's worth checking out.

00:03:06.555 --> 00:04:39.475
Again, we see some commonalities around test driven development. All of the frameworks advocate for some level of test driven development. It's the idea that you've got tests in place so that if your agent goes off and writes new code or does something else, you can run a suite of tests to make sure that nothing that you had previously built ends up being broken. It can be hit and miss. With a lot of AI models, they actually cheat their way through testing, but it is one extra step to make sure that we're not breaking existing code as we move forward. Same as Addy Osmani, we have the to issues, which is basically, again, just breaking down a bigger project into a smaller vertical slice that the agent can handle. You don't want the agent taking on too much in the one go without actually verifying it to make sure that it built exactly what needed to be built. And then another one that I see that Matt has added is prototype. So it's the idea that you actually flesh out a design before you build it out, and I'm a big believer in this. Anyone who has taken my course knows that I'm a big fan of a design mode or a prototype step. The idea here is that you just tell the model that you're going to implement the front end of the application, not the back end. Because once we get into the back end, it starts to slow the agent down. And then actually passing that prototype to the agent gives it a much better sense of what it needs to build in the back end. So doing this as an upfront step really saves some time. And again, this isn't a new idea. We've been doing wireframing and prototyping for the last thirty plus years. This is just the agentic approach. So Garry Tan is the CEO of Y Combinator. So Garry has a much more opinionated skill stack, and there is a lot to learn from this. Personally, it's a bit too overengineered

00:04:39.730 --> 00:04:57.365
for my liking. But if you look at any one of the individual skills, there's a lot of interesting things to learn from, particularly his skill around office hours. Office hours is really unique. It's like the idea that we sit down with the Y Combinator CEO, and he asks us six forcing questions about our startup idea.

00:04:57.445 --> 00:05:15.580
I actually think a big differentiator now when anyone can build quickly is actually knowing what to build and what to focus on. And then we have Everything Claude Code by Affaan. Now don't be worried. This relaxed skill system will work for any different kind of agent. So of all the libraries that I've shown, this one is probably the biggest. It's at about 182,000

00:05:15.580 --> 00:05:18.060
stars and 170 contributions.

00:05:18.380 --> 00:05:26.835
It is pretty huge in terms of what it offers. It looks after memory, continuous learning, how to run verification loops, sub agent orchestration.

00:05:26.835 --> 00:05:29.235
It has a heavy focus on security.

00:05:29.395 --> 00:05:48.070
If you want to deep dive into what an extensive harness might look like, it's worth taking a look at all the different skills that it's got in there. And just a quick note on all the various different spec libraries. There's a lot of overlap between the skill systems that I showed you and, of course, these spec based libraries, and I've covered a lot of them on the channel previously.

00:05:48.310 --> 00:05:53.885
BMAD is very enterprise level. It follows the path of a business analyst, a product manager,

00:05:54.125 --> 00:06:06.790
a senior architect, all working in party mode to figure out what you need to build and make sure it follows a strong software development life cycle process. So if you're at an enterprise level, it's worth checking out BMAD. Superpowers

00:06:06.790 --> 00:06:42.170
is a little bit more lightweight. Of all of them, I think I actually like Superpowers the best. I've had some good results with it, but I'll be honest, I don't rely heavily on spec based systems. I just go with something a little bit lightweight and bespoke myself, and we'll talk about that in a second. Working with enterprise development teams, one of the biggest complaints I've gotten in the last year is the amount of code review that needs to be done. It can be thousands of lines if you let it get out of hand, so I highly recommend you make your commits a lot smaller to be a lot nicer to your team. But we still have to review a lot of this code, and that's become a bottleneck. So if you've been watching the channel for a while, you know I'm a big fan of CodeRabbit.

00:06:42.170 --> 00:07:25.645
So what I'm most excited about is just in the last couple of days, they've released Atlas, and this really solves a big pain point for me. I've got a project here that I'm working on, and we have a pull request. If you don't know what a pull request is, I've got a great video on Git, a couple of videos back you can check out, and I'll link to that in the description. But essentially, either an agent or somebody else in my team has gone and created a load of code, and they've submitted it for me to review before I actually add it into the main code base. So normally, what I have to do is go through the commits and go through all the merged code. And generally, this is quite a painful step because you're just looking at these side by side diff views that we're all familiar with. Basically, a side by side diff, here's the original, and here's what changed. But when I'm dealing with so many pull requests across so many different repos,

00:07:25.805 --> 00:07:45.575
this really starts to fry my brain a little bit. So CodeRabbit was really paying attention to this and came up with something that I think is really, really useful. So essentially, when I click that little review button, here's what I get instead. Instead of a big long alphabetical file list, it breaks it down into layers. So to me, the layers are basically, okay, well, what's the new data shape? And we can see that here.

00:07:45.895 --> 00:08:40.265
Then what kind of business logic is gonna consume that data shape? And then, you know, what's the testing around it? I wanna look at this change in patterns. So on the left here, I get that structured navigation and the layers of the diff, then I get the traditional diff in the middle. And then on the right, I get these little AI summaries of exactly what's going on. And as I move through the code, they actually change and update based on what I'm looking at. So here, it gives me an explanation of exactly what's happening at each stage. And to be honest, even if you're a beginner developer, this is really useful for helping you learn the code and the patterns that AI might be generating so your skills don't atrophy and you actually improve as a developer. And then what's even cooler is I can go ahead and chat with any one of these different comments or the pull request in general. So it handles everything native GitHub review can do as well. So I can add my comments here, and when I'm finished, I can just go and submit my full review: comment, approve, or request changes.

00:08:40.505 --> 00:09:11.925
So Atlas just came out in the last couple of days. I highly recommend you check it out, and it's free for a limited time, so get stuck in. Check out the link in the description down below. Okay. So we've taken a look at all the various different popular skill systems and stacks. Let's compile them together to show you the best skill pattern you can use to build out your projects. And this is going to be super simple. I'm not introducing you to another repository you have to follow. It's just a set of patterns and a way of thinking that, if you can imprint it, you're gonna be getting away with 90%

00:09:11.925 --> 00:09:46.625
of what all of these systems offer. So the first thing you're gonna do is go into ask mode. You can get ask mode in Cursor or in Claude Code, or you can set it up as a little skill yourself. So this doesn't need to be a skill. All you need to do is basically ask your agent to interview you about the project that you're going to create. Next up is you wanna create some kind of a prototype. You can ask the agent to produce an ASCII diagram. They do it really quick, and it'll give you a quick wireframe, fastest way to do it. Or you can just chat to the agent and say, hey. Let's go into design mode. I only want you to work in the front end, not in the back end. I want to design the wireframe of my application.

00:09:46.865 --> 00:09:55.180
So here's a quick example of design mode. So you basically just wanna let the agent know we're in prototyping mode, develop front end only, we're mocking up the interface,

00:09:55.340 --> 00:10:40.475
use dummy JSON to represent any back end data, link all components for navigation, make buttons responsive, don't connect to back end logic. So essentially, you're stopping the agent from creating a complicated back end that it has to support. We just wanna figure out the front end. We're gonna do the back end later. So sometimes if it's a basic feature or we're just doing any kind of back end logic, we can actually skip prototyping. We sometimes have a strong idea of what we need to build, so you don't need to do it in every single step. So Claude Code, Codex, and Cursor all have their own vanilla plan mode, which is really quite powerful, and I don't think you need any kind of skill system around this. In the case of Cursor here, and it's the same if you're using Claude Code or Codex, it's basically gonna create a document as a markdown,

00:10:40.795 --> 00:11:30.080
breaks it down into various different phases so it knows what it should work on first, which is important. It's very similar to how those other frameworks would shard or split the project into multiple different parts. And then it creates a list of to dos to work through. And it sets out all these different verification steps naturally without you having to build any kind of skill around this. This is something we needed to do for ourselves via skills and processes before, but now a lot of the modern agents are doing this for us. Once the build phase is completed, we're going to want to test and verify that exactly what we wanted to get built was built. And again, this is something that's baked into the plan mode of many agents now. It's going to run a lint and build test to make sure that everything compiles and works okay. Then what you can do is get the agents to use Playwright or the browser mode, which is in Cursor or in Claude Code

00:11:30.485 --> 00:12:32.065
to basically test the application to make sure that it's all working fine. Now this is kind of hit and miss. The testing that agents do doesn't really catch what you want it to. Inevitably, the test phase is still what we call a human in the loop phase where, via some kind of a smoke test, you're gonna click through the process yourself to make sure that everything works okay. Now one thing a lot of the other frameworks talked about was test driven development. If you want that to be part of your workflow, you just include that in the planning stage. Just say, I want to include test driven development, and it will start to implement and build tests as it's writing the code. So that's pretty much it. Research, prototype, plan, build, and test. That's your kind of iteration loop. Essentially, this is all you need to get your software built, but there's an extra step, and that is the polish step. Addy Osmani has a skill which is called simplify code. So essentially, that means you're getting an extra agent to run through the code. In this case, you might have used Opus or an Anthropic model to build the software. You might use GPT 5.5 or something like that to basically work through the code and give its opinions.
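
NOTE
Editor's sketch, not from the video: one way to write down the research, prototype, plan, build, test loop summarized above as a plain list of prompts. The function name, the prompt wording, and the "backend_only" and "tdd" switches are hypothetical. The design-mode rules are paraphrased from the transcript.
```python
# The five steps named in the video, in order.
LOOP = ["research", "prototype", "plan", "build", "test"]
# Design-mode rules paraphrased from the video, attached to the prototype step.
DESIGN_MODE_RULES = [
    "Develop the front end only.",
    "Use dummy JSON to represent any back end data.",
    "Link all components for navigation.",
    "Do not connect to back end logic.",
]
def steps_for(feature: str, backend_only: bool = False, tdd: bool = False) -> list[str]:
    """Return the prompts to work through for one iteration on a feature."""
    prompts = []
    for step in LOOP:
        if step == "prototype" and backend_only:
            continue  # back-end-only work can skip prototyping
        prompt = f"{step}: {feature}"
        if step == "prototype":
            prompt += " | " + " ".join(DESIGN_MODE_RULES)
        if step == "plan" and tdd:
            prompt += " | Include test-driven development."
        prompts.append(prompt)
    return prompts
for line in steps_for("export report as CSV", backend_only=True, tdd=True):
    print(line)
```
The point of the sketch is that the whole loop fits in a handful of plain prompts, with test driven development added at the planning step rather than as a separate skill.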

00:12:35.825 --> 00:13:07.595
I think the best system is going to be the one that you end up building yourself. That's for a couple of different reasons, and I'll explain why. So first, I'd start with just natural language prompting, the kind of framework that I just showed you. You're just using your own agent patterns to work with the agent. Use the built in ask mode, the build mode, the plan mode as much as you can, and just rely on the native capability of your mind and also the harness. When you do create a skill, it's only because the agent has messed up in some way or you want to give it very bespoke information

00:13:07.675 --> 00:13:14.240
about your code base or your process that you want to carry out and when you want that to be a very repeatable process.

00:13:14.480 --> 00:13:40.645
Also, you want to be rewriting your skills to be as simple as possible, maybe even just a few short paragraphs. There's no need for tons of extra information, which is going to bloat your context window and confuse the model. Next up, you're just gonna work with that very simple skill alongside the agent harness that you're using. Understand where it works well and where it doesn't, then manage for those edge cases you feel like the skill is missing. In traditional software, we have something that's called the software development life cycle, and that encompasses

00:13:40.100 --> 00:13:42.180
multiple different disciplines.

00:13:42.260 --> 00:13:49.940
Now we're looking at something that's like an agentic development life cycle, and a big part of that is managing your harness and also skill management,

00:13:50.100 --> 00:13:53.940
particularly when you work across multiple different developers and large organizations.

00:13:54.275 --> 00:14:00.675
Let's say you develop a skill that you feel is useful. How do you share and organize that skill with other people in your organization?

00:14:00.755 --> 00:14:01.795
Skills.sh

00:14:01.795 --> 00:14:13.130
from Vercel does a good job of allowing you to store your skills in private repositories, being able to update them, store them, and share them as you see fit. Skills in some ways are essentially just documentation,

00:14:13.130 --> 00:14:25.210
and we've all experienced the scenario where comments go out of date, documents go out of date. You're going to have to spend as much time updating these skills and keeping them current, so you need to be very careful about what ones you introduce.

00:14:25.535 --> 00:14:42.640
Are they absolutely necessary, or are they just going to cause confusion to the model at a later point? This is a whole new skill set, if you will. So inevitably, the best skill system and the best harness is going to be the one you develop over time for you. I think this is going to be how you really differentiate as a software developer.

00:14:42.800 --> 00:15:31.885
Your agent harness, your set of skills that has been built up over time working with a particular code base. If there's one skill that you absolutely can't live without, please share it in the comments so I can learn and everybody else in the community can learn as well at the same time. So hopefully, that was helpful. Instead of introducing you to yet another skills library that you have to test and try, I kinda wanted to bring things back down to earth, give you the feeling that you weren't losing your mind in keeping up with all of these systems, and help you understand that it's really just some simple development loops that we've been practicing and using for the last thirty or forty years of software development that you can rely on. A lot of these libraries, a lot of these skills are replicating that same approach. That really starts to compound for you and for the organization that you're working within. If you found this useful, hit the subscribe button. It means you'll get more videos like this in your feed, and I look forward to seeing you next week.