WEBVTT

00:00:00.000 --> 00:00:07.200
A really talented AI educator and engineer named Matt Pocock has a skill library that has soared to over 70,000

00:00:07.200 --> 00:00:12.480
stars pretty quickly, but the description says that it's not for vibe coders.

00:00:13.515 --> 00:00:30.040
So it really makes me wonder, am I actually a vibe coder? Are you? I don't use tools like Replit and ask it to build me horse Tinder, but I also don't have a decade's experience as a software engineer. So I think vibe engineering is a better term for that middle ground, exit-level beginner into intermediate,

00:00:30.040 --> 00:00:39.400
where we're trying to actually build awesome stuff, but do it in a very systematic way. Because this skill library is pretty amazing, and I use it alongside tools like OpenSpec,

00:00:39.560 --> 00:00:40.360
OBRA,

00:00:40.440 --> 00:00:54.615
and Compound Engineering every single day. So I'm gonna show you five of the skills from it that I use on the daily, how they work, and where in your projects you can use them. Starting with one of my favorites, improving code base architecture.

00:00:54.695 --> 00:01:01.230
One of the big downsides of vibe coding is that you can go along your merry way making a bunch of changes that contradict

00:01:01.230 --> 00:01:23.705
what you've already done. And so your project gets very complicated and convoluted over time, which makes it really difficult to build things on top of it. So this skill aims to solve that by improving the underlying structure of your project. So let's take a look at how to use it. So here we're looking at a project that I've been working on inside of my paid community, shameless plug, which is a Twitter intelligence tool that was built using

00:01:23.785 --> 00:01:32.160
BMAD. Now it's pretty simple how a user interacts with it, but behind the scenes, there's a bit going on for identifying trends, clustering topics together,

00:01:32.320 --> 00:01:38.480
ranking them. Again, as a Twitter research tool that's meant to identify trends and surface them to me proactively.

00:01:38.640 --> 00:02:03.800
So let's see what happens when we run this improve code base architecture command in this project. So one of the things that I really like about how this skill works is it makes sure that it's actually using the language of your app to explain things to you. And this is important because when we're talking with the language model and when the language model's responding to us, we need to make sure that we're using the same vocabulary to describe things so that we don't go off and make changes inadvertently.

00:02:03.800 --> 00:02:12.440
So now what this is doing is it's going out, it's dispatching an exploration agent to actually go through the code base and find any architectural

00:02:12.015 --> 00:02:26.440
friction. Now if you're interested in exactly what it's doing, you can go check out the repo. But basically, what it's looking for are moments of friction inside of your app. And so the premise behind this is that it's gonna propose to you deepening opportunities,

00:02:26.440 --> 00:02:36.360
where you can take shallow modules and turn them into deeper modules. And the intent behind that is that it's meant to make your code base more testable

00:02:36.360 --> 00:02:43.545
and easier for an AI to navigate. And so now what we get out the other side are the top five deepening opportunities

00:02:43.625 --> 00:02:48.185
in priority order. Now, obviously, you wanna use your own judgment to

00:02:48.345 --> 00:02:51.465
see which of these actually makes sense for you to implement.

00:02:51.545 --> 00:02:54.825
But, again, as we move through, we can see that it's recommending

00:02:55.010 --> 00:03:01.890
a change. It's talking about the files related to this change, the fundamental problem that is surfacing

00:03:01.890 --> 00:03:06.130
based on what you've built in these files and how they interact.

00:03:06.210 --> 00:03:12.485
It'll then provide you a concrete solution and the benefits of that solution. And so this process continues now through,

00:03:12.725 --> 00:03:30.750
in this case, the top five issues that it found. So having something like this that can kind of surface the high level changes you should make is really valuable. But what happens when some of these objections that it is bringing up need a deeper dive? So in this case, as I read through these, the one that actually stuck out to me as the biggest concern

00:03:30.910 --> 00:03:36.910
is this quality scoring system that I have inside of the app. They're calling it a god orchestrator

00:03:37.095 --> 00:03:55.950
tangled with a bunch of other stuff. And so fixing this might be something that I want to explore. Well, the next skill solves that problem, and it is called grill me. So we can run the command and then talk about what it's doing while it goes. So one of the things that's a real pain about a lot of these tools is they don't really push you to have an understanding

00:03:56.190 --> 00:04:09.845
of what it is exactly that you're about to change and how that might impact things. So they might ask, like, two, three, four questions trying to clarify a few, like, really big things. But often, it lets a lot of hidden assumptions just slide through,

00:04:10.165 --> 00:04:43.205
and they make their way into your project. And so this skill helps that by really stress testing exactly what you want to change and why and the ramifications of that change. So after about seven or eight questions and maybe ten minutes of back and forth, this has gone from, like, that one fundamental thing that we wanted to address all the way down to an actual design that we can implement. Now the thing that I think is really valuable about this skill is the way that it actually works. So, again, a lot of other tools are gonna ask you five questions maybe that get your directional insight on which way you wanna take the implementation,

00:04:43.365 --> 00:04:47.845
and then they're gonna go off and address all of these underlying assumptions

00:04:48.010 --> 00:05:06.685
at game time when they go to write the stuff. But what the grill me command does is that once you pick a direction, it's gonna go deep down the rabbit hole to resolve all of the other issues that crop up because of that decision. So for example, in question one, we addressed, like, what is the actual problem that we are solving specifically.

00:05:06.765 --> 00:05:09.485
And then based on that specific problem,

00:05:09.565 --> 00:05:13.805
we had to define, well, what's the right shape of this solution?

00:05:13.885 --> 00:05:17.565
And then based on that shape, how is that going to impact the other functions

00:05:17.660 --> 00:05:18.780
that interact

00:05:18.860 --> 00:05:24.780
with this service? And then based on our response to that, well, then what needs to go inside of that? Right? And it continues

00:05:25.020 --> 00:05:57.050
spiraling down until every branch of this design tree has actually been resolved. So just to draw this out real quick, with a lot of other tools, we're gonna get just our five primary questions that address something like at a high level. But then let's say that based on what we respond to here, there's technically new decisions now. So maybe now we have a whole new subset of four different directions that we could take based on what we responded to there. Well, now based on what we choose to do here, there's again another branch of decisions.

00:05:57.050 --> 00:05:59.130
And then from here, another branch of decisions,

00:05:59.485 --> 00:06:01.565
And it continues down the tree,

00:06:01.885 --> 00:06:12.285
making sure that we, as the humans in the loop, are actually resolving these decisions in a way that makes sense for us. Now, again, the thing that's really nice about this is that it is open source.
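
The branching walk described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how the grill me skill is implemented: the `Decision` class, the `resolve` function, and the example questions and answers are all hypothetical names made up for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Decision:
    """One question in the design tree (hypothetical structure)."""
    question: str
    # Maps an answer to the follow-up decisions that answer opens up.
    follow_ups: dict[str, list["Decision"]] = field(default_factory=dict)


def resolve(decision, choose, resolved=None):
    """Walk the tree depth-first, asking the human at every node."""
    if resolved is None:
        resolved = []
    answer = choose(decision)  # the human in the loop picks a direction
    resolved.append((decision.question, answer))
    # Only the branches opened by the chosen answer are explored further.
    for child in decision.follow_ups.get(answer, []):
        resolve(child, choose, resolved)
    return resolved


# Hypothetical tree loosely following the questions in the walkthrough.
tree = Decision(
    "What problem are we solving?",
    {"scoring is tangled": [
        Decision(
            "What shape should the solution take?",
            {"separate service": [
                Decision("Which callers are affected?"),
                Decision("What goes inside the service?"),
            ]},
        ),
    ]},
)

# Canned answers stand in for the back and forth with the model.
answers = {
    "What problem are we solving?": "scoring is tangled",
    "What shape should the solution take?": "separate service",
    "Which callers are affected?": "ranker only",
    "What goes inside the service?": "threshold logic",
}

resolved = resolve(tree, lambda d: answers[d.question])
for question, answer in resolved:
    print(f"{question} {answer}")
```

Each answer decides which follow-up questions exist at all, which is why a fixed list of five up-front questions cannot cover the same ground.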

00:06:12.365 --> 00:06:33.945
So if you wanna come in here and customize this to make it something that maybe explains things in more detail to you or generally makes things clearer for you, you have the capacity to very easily do that. Now one objection I can already hear people shouting from the void is, how many tokens does this cost, bro? Well, the next skill I'm gonna show you can help reduce your token costs allegedly by up to 75%,

00:06:34.025 --> 00:06:36.505
and that skill is called caveman.

00:06:36.505 --> 00:06:44.980
So as it implies, this skill forces your language model to reply to you like a smart caveman. And so it claims to cut token usage by 75%

00:06:44.980 --> 00:06:46.500
by dropping fillers,

00:06:46.580 --> 00:06:47.460
articles,

00:06:47.700 --> 00:06:48.980
and pleasantries

00:06:49.060 --> 00:07:09.425
while keeping full technical accuracy. And that's a really big thing because there's been a lot of implementations of this caveman mode. But the problem with some of those things is that they dropped the important technical language. Now one of the things that I really like is that it will automatically kick you out of caveman mode when there's really important stuff. So if there's security warnings,

00:07:09.505 --> 00:07:10.625
irreversible

00:07:10.625 --> 00:07:11.105
actions,

00:07:11.570 --> 00:07:13.410
or multi step sequences

00:07:13.650 --> 00:07:57.485
where talking like a caveman might risk you misreading it, or if the user asks you to, hey, stop talking like a caveman for a second. All of those things will automatically kick it out of this mode and explain to you in regular mode. And so in order to show you the difference, I'm gonna run this now without the caveman skill with the same exact prompt so that we can see the difference in the outputs that we get. So let's look through, like, the first few lines just to compare. It says the plan asserts the threshold read bootstrap fallback logic is duplicated between the ranker when it applies thresholds per post and the recompute stage when it logs old values for the tuning log per enforcement guideline number 10. I read the code. The recompute stage doesn't read threshold with the bootstrap fallback.

00:07:57.565 --> 00:08:12.040
It reads post created at count. Right? And then it goes on to explain all of this stuff. So we can come through here actually, and let's just pop this into a token counter. So now this isn't perfect because we're using Opus and they tokenize things differently,

00:08:12.200 --> 00:08:14.280
but roughly 768

00:08:14.280 --> 00:08:15.160
tokens

00:08:15.240 --> 00:08:32.255
for that response without the caveman approach. Now we can hop in and check that same exact output from the caveman mode. So plan claim, ranker recompute threshold and divergence makes tuning log a lie. That is significantly shorter. Right? Read code carefully.

00:08:32.335 --> 00:08:45.630
Ranker does this. Recompute does this. Question one, is the premise wrong? So this is significantly shorter than the last one, and let's actually just go test that. If we pop back into our tokenizer, the first one was 768.

00:08:45.630 --> 00:08:50.350
This one is 502. So in that case, that's, I think, like, roughly, like, a 30%

00:08:50.175 --> 00:09:16.450
reduction in the number of tokens that were used just to basically check in with us and ask us a question. And that is pretty significant, especially when you consider how these conversations are going to compound over time. So now if we had used that caveman skill with this original design that we got from the previous step, we would have had significantly less token usage. So in a second, I'm gonna show you how he carries forward a decision like this to use in future sessions.
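
The arithmetic behind that comparison is easy to check. The sketch below uses the two counts read off the token counter in the video (768 without caveman, 502 with it); the `token_reduction` helper is a name made up for this example, and the counts are approximate because the counter's tokenizer is not the one Opus uses.

```python
def token_reduction(baseline: int, reduced: int) -> float:
    """Return the fractional reduction going from baseline to reduced."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - reduced) / baseline


# Counts taken from the token counter shown in the video.
saving = token_reduction(768, 502)
print(f"{saving:.1%}")  # prints 34.6%
```

So this single response came in about 35% shorter, a bit better than the rough 30% estimate and well short of the advertised ceiling of 75%, which is a claim about the best case rather than every reply.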

00:09:16.905 --> 00:09:24.985
But first, what if these first three skills are generating questions for you that are hitting a little bit higher than your understanding

00:09:24.985 --> 00:09:53.085
of things? So the next skill that I really like is simply called zoom out. And what this does is it tells the agent to zoom out and give you broader context or a higher level perspective. So if you're ever unfamiliar with a section of code or need to understand how it fits into the bigger picture, this is a really helpful skill to call. So we will come down and we will run the zoom out command. And so one of the things when you're working in a domain that you're not really comfortable with, you will very often defer to the recommendation

00:09:53.085 --> 00:10:02.840
of the model. But if you really care about what you're building, you really need to look at those things and make sure that they are something that you're at least understanding

00:10:02.840 --> 00:10:10.200
and on board with. So let's say that, you know, the first code base architecture improver recommended that we look at this thing specifically.

00:10:10.360 --> 00:10:28.700
And now that we've run it, the caveman skill or the grill me skill is telling us that, hey. This isn't actually an issue. And let's say we wanna really understand if it is or not, so we ask it to zoom out and explain where we are. And so the first thing that it's gonna look at is the domain vocabulary so that what it's about to explain to you is grounded

00:10:28.700 --> 00:10:40.015
in the actual language your project uses. So in this case, it's breaking down some of the domain vocabulary about how our ranking algorithm works. It's telling us these are the actual modules

00:10:40.015 --> 00:10:40.895
involved

00:10:40.975 --> 00:11:09.850
in this thing. It's showing us where certain files are read, where certain files are written. And so now it's gonna actually break down, like, does the claim that we were making in this case actually sit in the context of all of this stuff that it just broke down for us. And so in this case, it actually gives us, like, a really clear mental model breakdown. Says the ranker is looking at what threshold should it apply right now, which does need to have a fallback value, but the logging is just showing what the prior row actually held.

00:11:09.930 --> 00:11:13.775
And so the initial premise that we found in that code base architecture improvement

00:11:13.855 --> 00:11:15.295
was kind of unfounded,

00:11:15.295 --> 00:12:10.515
and we don't really need to worry about it. Now in this case, this is still using that caveman skill, which is why it's being kind of terse with its outputs to us. So if you ran this without caveman mode on, you would obviously get, like, a little bit more explanation and story behind what it's doing and why. And again, this is an open source skill. So if you wanna make modifications to this to explain things in different ways or do something else, that's something that you can very easily do by just downloading the repo and modifying the skill files. So I love the zoom out skill. I use it on a weekly basis. But like I said earlier, how do we now pull all of this different context that we've been generating together and take it forward into a new session? So like I said earlier in the video, I really love to use this skill library with other tools. I personally like to use spec driven development tools, but I find that processes like this to improve the code base, grill you about those changes,

00:12:10.835 --> 00:12:22.640
and some of these other commands, they're really valuable in helping you get to a decision on the plan. And then those spec driven tools are really great at taking that plan and actually implementing it systematically.

00:12:22.800 --> 00:12:38.915
So in this case, what we're gonna do is we're gonna loop back through and run that code base architecture again. We're gonna run through a grill me exercise, and then I'm gonna show you how you can take that output and make a lot of really great use of it. So our last skill is a really simple but high utility skill called handoff.

00:12:38.995 --> 00:12:48.310
And so this solves the problem of needing to continue your train of thought, but you really need a fresh context window. And so this is kind of like an alternative to compacting

00:12:48.310 --> 00:12:51.190
because we're still gonna get all of that information,

00:12:51.430 --> 00:13:02.245
but we can then just use that document as the context for our next session. So there's a lot of really solid use cases for this, but two that come to mind. Number one, if you wanna switch from planning to implementing,

00:13:02.405 --> 00:13:20.400
the handoff command will distill what you've talked about down into a very concrete brief. Number two, maybe you're mid session and you need to go down a tangent in the middle of a task in a separate window. But everything you've accumulated so far is really valuable context that you don't wanna pollute with a side conversation.

00:13:20.560 --> 00:13:36.955
So let's say we go back down to our code base here for the Twitter tool. And so we just went through, like, another very extensive round of the code base review with a grill me, and we've reached a conclusion about what we wanna do. So what we can do is we can come through and run this handoff command, and then we can give instructions

00:13:37.755 --> 00:13:46.220
about what the next session is gonna actually use this for. So in this case, we're gonna say, hey. This is getting passed to a spec driven development tool for implementation.

00:13:46.460 --> 00:13:52.380
It should have adequate problem framing, the solution we came to, the key decisions we made, and any other specifics

00:13:52.380 --> 00:14:00.575
that got resolved. And so what we get out the other side of this is a markdown file that is basically a version of compacting.
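
To make the shape of that brief concrete, here is a minimal sketch that assembles a markdown document from the four things requested in the instructions above: problem framing, the solution, key decisions, and other resolved specifics. The `build_handoff` function, the headings, and the sample content are assumptions for illustration; the actual file the handoff skill writes may be laid out differently.

```python
def build_handoff(purpose, problem, solution, decisions, specifics):
    """Assemble a markdown brief for the next session (hypothetical layout)."""
    lines = [
        "# Handoff",
        "",
        f"Purpose: {purpose}",
        "",
        "## Problem framing",
        problem,
        "",
        "## Solution",
        solution,
        "",
        "## Key decisions",
    ]
    lines += [f"- {item}" for item in decisions]
    lines += ["", "## Other resolved specifics"]
    lines += [f"- {item}" for item in specifics]
    return "\n".join(lines) + "\n"


# Sample content is invented for the example.
doc = build_handoff(
    purpose="Input for a spec driven development tool",
    problem="Quality scoring is tangled with unrelated concerns.",
    solution="Pull scoring into its own module with a small interface.",
    decisions=["Ranker keeps the fallback threshold"],
    specifics=["Tuning log records the prior row as stored"],
)
print(doc)
```

The point of keeping it this small is that the next session starts from a short, explicit brief rather than the full chat history.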

00:14:00.735 --> 00:14:16.750
So everything that we just discussed in that chat has been properly ported over into this file. So now what we could do is come down and just clear the window out, and we could run our command for a spec driven tool to take this over. And now it's gonna move through, and it's gonna start building all of the planning artifacts

00:14:16.750 --> 00:14:43.560
with this context in mind. So the thing that I really like about all of his skills is that they are very flexible and to the point. You can integrate them with pretty much any process you already use, and just simply make those things better, and make, like, your daily quality of life as you move through and do these things just a little bit easier. So if you like this video, I will link you to a playlist where I have a bunch of other awesome skill libraries and vibe engineering plugins

00:14:43.560 --> 00:14:48.840
that I use on a daily or weekly basis. But that is it for this video. I will see you in the next one.