WEBVTT

00:00:00.000 --> 00:00:59.970
Claude skills are making waves, but most people have no idea what they actually are. They think it's just another prompting trick, but it's really not. Skills are actually portable workflow packages that teach Claude your processes so you can write them once and run them forever. In this video, I'll break down the complete Anthropic guide on how to write the best skills, when to use them, and how to test them properly. So let's get into it. So the first and most obvious question is what are skills? You can think of skills like an onboarding guide for Claude. Now Claude is really smart already, but it doesn't know your specific workflows. It doesn't know your tools, your standards, your business context, and things like that. So a skill teaches Claude something once, and then it can use it over and over again for the exact same thing. And that's really powerful because if you can build a really good skill up front, that means that your repeatable process is gonna be good every single time it runs. But we need to dive much further into this because it actually solves a much larger problem. So if we look over here at what skills are, they are persistent Markdown files with instructions. And for those of you who don't know, Markdown is just a plain-text file format. It's easily readable in plain English, so nothing overly complex here.

00:01:00.290 --> 00:01:06.450
These files load on demand, not every single session, which is really important because it's part of why this is such a smart solution.

00:01:06.690 --> 00:01:42.185
It's portable across Claude.ai, Claude Code, APIs, and things like that, and it's part of the Agent Skills open standard. So that's where all the providers are getting together, and they're all gonna be able to use these skills interchangeably so that they aren't locked into each other's ecosystems, which should obviously make things a lot less difficult. So the TL;DR is that if you are doing the exact same process over and over again, you should probably turn that into a skill. But let's dive deeper into that first. So this brings us neatly onto progressive disclosure, which is something that you need to understand because it is the core innovation behind why we're doing all of this. So it's a three-tier loading system that revolutionizes how Claude actually accesses information.

00:01:42.345 --> 00:01:46.425
So instead of reading everything up front, it loads it in specific pieces.

00:01:46.665 --> 00:01:52.585
If we look over here, the first thing that it does is load the front matter. So this is the name and the description of the skill only.

00:01:53.060 --> 00:02:01.860
This is used every single time; it loads into every session. The token cost here is only around 100, and this is really important because you can think of this like a catalog.

00:02:02.340 --> 00:02:10.015
Say you ask Claude to do something for you. It's going to go and search the front matter of your skills to see which one suits your request.

00:02:10.575 --> 00:02:13.135
We'll get into a practical demonstration of this very soon.

00:02:13.375 --> 00:02:18.335
The second part of progressive disclosure is the SKILL.md body, and this is where it loads the full instructions.

00:02:18.750 --> 00:03:05.845
So after it's decided, hey, this front matter over here seems to match what the user needs, we're then gonna go and load the full body, and it uses this when it matches that request. And the token cost here is usually in the hundreds if you've written something quite well. It can obviously be higher if you've written something terrible, but the point is that you can see here, through this progressive nature, we're no longer just loading everything upfront. We first just load the little bit into every session. Then if it matches something, we load the full SKILL.md body, which has the full instructions of what the skill is gonna do. And then the third tier is any linked files that come along with this: references, scripts, templates, things like that. And these are only loaded if it needs them as part of a step in the workflow that it's doing. So the token cost here is zero until it's needed. Alrighty. So now that we understand one of the problems that it solves, the second thing that we need to look at is the skill structure and the organization.
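To put rough numbers on those three tiers, here's a minimal sketch in Python. It's only an illustration: the skill names, the file name and every token count other than the "around 100" for front matter are made-up assumptions, and `context_cost` is a hypothetical helper, not something Claude exposes.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    name: str
    frontmatter_tokens: int   # tier 1: always in context
    body_tokens: int          # tier 2: loaded only when the skill matches
    linked_file_tokens: dict[str, int] = field(default_factory=dict)  # tier 3


def context_cost(skills, matched=None, files_used=()):
    """Estimate tokens in context under progressive disclosure."""
    cost = sum(s.frontmatter_tokens for s in skills)  # tier 1 for every skill
    for s in skills:
        if s.name == matched:
            cost += s.body_tokens  # tier 2 for the matched skill only
            cost += sum(s.linked_file_tokens[f] for f in files_used)  # tier 3 on demand
    return cost


# Hypothetical skills and token counts, for illustration only.
skills = [
    Skill("research-lead", 100, 800, {"references/dm_examples.md": 400}),
    Skill("email-digest", 100, 600),
]
print(context_cost(skills))                                    # 200
print(context_cost(skills, matched="research-lead"))           # 1000
print(context_cost(skills, "research-lead", ["references/dm_examples.md"]))  # 1400
```

With nothing matched, you only pay for the two front matter entries. The body and the linked file only show up in the total once they're actually needed.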

00:03:06.085 --> 00:03:13.845
So every single skill is a directory with SKILL.md as the entry point. So understanding what goes where is critical because you need to understand how to build this efficiently.

00:03:14.330 --> 00:03:25.290
So if we take a look at an example from my environment, it lives in your skills/ folder, under whatever the skill name is. In this case, we are looking at something called research lead, which is actually one of my workflows.

00:03:25.690 --> 00:03:30.010
So the main file over here is my SKILL.md, and this is the SOP.

00:03:30.250 --> 00:03:48.390
You always wanna try and keep this under 500 lines just for efficiency. The second thing that we have here are any scripts that you want. So as part of my workflow, I have some scripts that run, such as scrape_linkedin.py, research with Perplexity, analyze with OpenAI, whatever. I've got some scripts as part of this research lead workflow here.

00:03:48.710 --> 00:04:18.730
And the important thing to note here is that these are executed. They are not loaded into the context window. So again, you can see we have that separation over here, which doesn't bloat the context window of the AI. That is the whole purpose of having a structure like this because we want the probabilistic nature of AI, but we want the deterministic function of tools, which is scripts in this case. More importantly, we don't want to waste context on whatever it is that we're doing because then it leads to those problems that I mentioned earlier. So running these as scripts helps solve that problem because these are executed

00:04:18.810 --> 00:05:26.645
locally as Python scripts (Python is just a programming language), and the results are then sent back up to Claude when it needs them. So instead of 500 lines of trash, we now just have a 20-word result, whatever it is that these scripts do. Then as part of this, you also have references, and you can think of this as a good place to stash your examples. So again, if we take my research lead as an example here, as part of that workflow, I would want to go out and DM a bunch of people. So the references section is where I would put in a few examples of what a good DM looks like. So as part of the skill, it knows to check the references folder and then say, hey, this is what good looks like. I'm gonna match this exactly. Because remember, AI is a pattern-matching machine. So giving it examples is a really amazing thing to do. And then you can also have assets, and these can be templates or other assets that the AI might need for the skill in order to achieve its goal. And again, the important thing to remember here is that everything is loaded only when it's needed, so we protect that precious context window. Then we need to take a look at writing the perfect description. Now remember, the description is what goes in the front matter that gets loaded into every single session. The reason this is so important, like I mentioned earlier, is that if Claude cannot see the skill, it's never gonna know how to use it or fire it, in which case it won't get used.
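Before moving on, the folder layout just described (SKILL.md, scripts, references, assets) can be sketched as a small scaffolding script. This is a rough illustration under assumptions: `scaffold_skill` is a hypothetical helper, the hyphenated name "research-lead" is my guess at how the folder would be spelled, and it writes into a temporary directory rather than a real skills folder.

```python
import tempfile
from pathlib import Path


def scaffold_skill(root: Path, name: str) -> Path:
    """Create an empty skill folder with the four parts described above."""
    skill_dir = root / "skills" / name
    for sub in ("scripts", "references", "assets"):
        (skill_dir / sub).mkdir(parents=True, exist_ok=True)
    # SKILL.md is the entry point; the front matter sits between the --- lines.
    (skill_dir / "SKILL.md").write_text(
        "---\nname: " + name + "\ndescription: TODO\n---\n\n# Steps\n",
        encoding="utf-8",
    )
    return skill_dir


with tempfile.TemporaryDirectory() as tmp:
    skill = scaffold_skill(Path(tmp), "research-lead")  # hypothetical skill name
    layout = sorted(p.relative_to(skill).as_posix() for p in skill.iterdir())
    print(layout)  # ['SKILL.md', 'assets', 'references', 'scripts']
```

Not every skill needs all four parts, so in practice you'd only create the folders you actually use.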

00:05:27.045 --> 00:05:30.965
So one of the first things you need to understand is use a max of 1,024

00:05:30.965 --> 00:05:36.580
characters. You wanna keep it concise but comprehensive enough for accurate matching. Very important.

00:05:36.980 --> 00:05:43.940
Third person only. So "Processes emails," not "I process emails." First person is the wrong POV, and that will totally break the discovery of the skill that you're writing.

00:05:44.615 --> 00:06:11.160
You need to make sure that you answer the what and the when. So what does it do and when should Claude use it? And then finally on this, just include three to five trigger phrases. So specific words or phrases that should activate the skill. So let's take a look at some bad examples over here. Helps with projects. That's far too vague, and with AI, you wanna be as specific as possible. The way I like to think about it is if you are speaking to another human, if you are giving them this little minimal instruction here, would they understand what it is that you're talking about?

00:06:11.640 --> 00:06:21.775
If not, you probably need to revisit this thing. If you don't know how to write this, it means you don't understand the problem well enough to describe it, in which case you need to go and address that first. But there are other types of bad examples.

00:06:21.935 --> 00:06:28.815
So "I can help you process these emails," and you see here we've written it in the wrong POV, so it breaks the discovery because it's not in third person.

00:06:29.400 --> 00:06:54.605
And then finally, "Implements sophisticated data pipelines," and that's just consultant word salad. There's absolutely no reason to use overly complex language if it doesn't need it. Keep it simple. Always. As simple as possible, as specific as possible, so that anyone or anything that reads it would know exactly what it's for. Let's see what good examples look like. "Manages Linear sprint planning, including task creation and status tracking. Use when the user mentions sprint, backlog, or tickets." Pretty straightforward.

00:06:55.210 --> 00:07:02.250
Processes Gmail inbox to identify high risk emails and deliver an executive briefing to Slack. Run with forward slash email digest.

00:07:02.410 --> 00:07:28.920
So you can see these are really good examples because they tell the system exactly what the skill does and when to use it. So it can pick that up as part of the front matter that we mentioned earlier. Because you have to remember, Claude is using semantic matching here, meaning that it's matching based on meaning. It's not using regex or something like that. So "create ticket" matches "log some tasks," but exact trigger words still give it the highest hit rate. So always keep that in mind. Why don't we take a closer look at what SKILL.md looks like in the real world?
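The description rules from a moment ago (max 1,024 characters, third person, the what and the when) can be turned into a rough lint check. This is just a sketch: `lint_description` is a hypothetical helper, and the pronoun, trigger and word-count checks are crude heuristics I'm assuming for illustration. They are not how Claude does its matching.

```python
import re

MAX_DESCRIPTION_CHARS = 1024  # limit quoted in the video


def lint_description(description: str) -> list[str]:
    """Return a list of problems found in a skill description (heuristic)."""
    problems = []
    text = description.strip()
    if len(text) > MAX_DESCRIPTION_CHARS:
        problems.append("longer than 1,024 characters")
    # Crude point-of-view check: flag first- or second-person pronouns.
    if re.search(r"\b(I|me|my|we|you|your)\b", text, flags=re.IGNORECASE):
        problems.append("not written in third person")
    # Crude "when" check: look for an explicit trigger clause.
    if not re.search(r"\b(use|used|run) (when|with)\b", text, flags=re.IGNORECASE):
        problems.append("does not say when to use the skill")
    # Arbitrary threshold: very short descriptions are usually too vague.
    if len(text.split()) < 6:
        problems.append("too vague to match reliably")
    return problems


print(lint_description("Helps with projects."))
print(lint_description("I can help you process these emails."))
print(lint_description(
    "Manages Linear sprint planning, including task creation and status "
    "tracking. Use when the user mentions sprint, backlog, or tickets."
))
```

The two bad examples each come back with problems and the good one comes back clean. A check like this only catches the obvious stuff, so you still need to test real triggering in a fresh session.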

00:07:29.240 --> 00:07:45.385
So we have our structure over here. We have .claude/skills/, and in this case, research lead. That's one of my skills. So our SKILL.md file, you can think of it like the SOP, and these are the step-by-step instructions that Claude is gonna follow in order to achieve its goal. So if we come on over here

00:07:45.785 --> 00:08:00.490
to my research lead skill and click on SKILL.md, we can see some of the things we've been speaking about already. So up at the top over here, we have our front matter, the name. We have the description, "Transform a LinkedIn URL into a complete research package with personalized outreach,"

00:08:00.810 --> 00:08:06.570
and a little bit more info there. Then underneath that, we have the model, and this is something that's really important to understand.

00:08:07.125 --> 00:08:22.100
Because what you can do with this, instead of getting Opus down here to run everything in your environment or always using Opus for your skills, is specify the model. So I could start the process by chatting in my little box at the bottom here, but then it can spawn it off as a subprocess

00:08:22.180 --> 00:08:26.340
with Sonnet so you save a lot of money because not all of your tasks need Opus.

00:08:27.060 --> 00:08:36.355
So that's something important to keep in mind. Then we also have allowed tools, and this is an important concept as well because you don't need to load everything into every single skill that you have running.

00:08:36.675 --> 00:08:45.315
You probably have a ton of MCP servers. You might have some scripts, stuff like that. The point is that locking this down to only use the things that it needs is again far more efficient.

00:08:45.635 --> 00:08:53.960
In this case, for researching a lead, we have a bunch of scripts that live in this folder over here, and it uses these scripts in order to achieve our goal.

00:08:54.360 --> 00:08:57.880
So first, let's have a look a little bit more at what a goal looks like.

00:08:58.520 --> 00:09:05.815
Now one of the things that people get wrong is that they think skills just replace prompts, and that's kind of half true, but also not really because

00:09:06.055 --> 00:09:15.575
a skill is more like an instruction manual or SOP as I've mentioned earlier. And Anthropic says that you can write these in two ways. One, you can write it imperatively,

00:09:15.575 --> 00:09:44.065
which is where we're telling this thing what to do step by step. The other way is to take more of a prompt structured approach like we used to do. Both of those will work, but this one is the preferred method. And then they have a caveat that they say at the end of all of that, and it pretty much just reads that as long as the skill does exactly what you want it to do every single time, that's what you should be aiming for. So the structure is important, especially this front matter part, but what you do after that is mostly up to you to choose how it gets you to where you wanna go.

00:09:44.770 --> 00:09:54.130
For me, the goal format always makes the most sense because I'm coming off of a framework that I made myself as part of some of my other videos you can check out on the channel now if you haven't seen them already.

00:09:54.450 --> 00:10:03.925
The point is that, to me, having it laid out in this way is not only clear for me, but it's also very clear for the AI to understand exactly what we're trying to achieve with the skill.

00:10:04.325 --> 00:10:13.125
We're just trying to achieve a goal here. In this case, transform a LinkedIn URL into a complete research package with personalized outreach content constrained to relevant personalization

00:10:13.125 --> 00:10:13.445
only.

00:10:14.060 --> 00:10:18.940
So relevant relates to a problem they're likely facing that we can solve, and I just give a little bit more information.

00:10:19.180 --> 00:10:31.035
And then over here, we have the inputs required. So I just give the AI a little bit of knowledge about what inputs might be required for this, and the orchestrator will then run through everything. So I list the execution steps as part of this pipeline.

00:10:31.275 --> 00:10:36.315
Step one, scrape LinkedIn profile. Step two, research company and person via Perplexity.

00:10:36.555 --> 00:10:47.040
Step three, run AI analysis, so on and so forth. So it's literally a list of how we achieve this goal, and you can think of it like nodes in an n8n workflow, to be honest. It's pretty much the same thing.

00:10:47.520 --> 00:10:51.440
But as part of these steps, I'm also giving it very specific instructions

00:10:51.520 --> 00:10:56.935
on what tools to use, and they relate to these over here. So my scripts are just these tools.
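To show what "scripts as tools" means in practice, here's a minimal sketch of running a script and handing back only its output. It's an illustration under assumptions: `run_script` and the tiny `summarize.py` stand-in are hypothetical, and this is not how Claude launches scripts internally. The only point is that the caller sees the result, not the source.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_script(script: Path, *args: str) -> str:
    """Execute a script and return only its stdout, never its source code."""
    result = subprocess.run(
        [sys.executable, str(script), *args],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical stand-in for a long script: lots of work, a short result.
    script = Path(tmp) / "summarize.py"
    script.write_text(
        "import sys\n"
        "words = sys.argv[1].split()\n"
        "print(f'{len(words)} words')\n",
        encoding="utf-8",
    )
    summary = run_script(script, "only the result reaches the model")
    print(summary)  # 6 words
```

However long the script is, the only thing that comes back is that one short line of output.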

00:10:58.215 --> 00:11:00.375
And these will then go out and run locally,

00:11:00.455 --> 00:11:15.410
which is a very important distinction because we don't need to bloat the context. We don't need AI to do any of this work when we can use a programming language to do it for free, and then just deliver us the results. So if we take a look at these scripts, this is just the execution layer. All of this gets executed locally,

00:11:15.810 --> 00:11:19.730
and it's not loaded, which is really important. So 3,363

00:11:19.730 --> 00:11:20.530
lines of Python

00:11:21.045 --> 00:11:27.605
just returns 105 lines of output with context that the AI actually needs. There is no bloat,

00:11:27.925 --> 00:11:52.685
just answers to each of the steps in this process in order to get its goal achieved. So that's a 97% reduction, which is really handy. And again, this is just part of separating concerns. There's tons of frameworks behind this already. I have one that I'm busy turning into a version two along with using this, because this is way more efficient with some of the other tips and tricks that I've been using in my other videos. But then we also have references, and if we come over here, we can see one of the references I've got in here is output structures.

00:11:52.845 --> 00:12:00.365
So it's just JSON structures for each analysis type in the research lead pipeline. So this just gives the AI a little bit of structure around how I want things done.

00:12:00.765 --> 00:12:55.675
But then equally as part of this, I could also have the definition of what a good direct message looks like. Stuff like that. I would put it in there so that this can just go and match patterns when it needs to, to deliver the same thing repeatedly and accurately every time. Then finally, we have assets, and these are just templates or images and config files that might be part of the system. In this case, I don't really have any as part of my research lead system yet, but that's because it achieves it with everything else that I've already got in it. The point is though that not everything needs all of these. You don't need to fill them out. It's interchangeable. You might not use any scripts if there is MCP available and MCP makes sense to use. It would then use that as a tool. I use scripts because for this specific workflow, it makes complete sense to do everything locally. I don't need to use any tokens or anything like that for this specific workflow. But now that you have an understanding of what the structure looks like, we need to look at a few other things behind skills. Next up, we need to look at invocation control, and this is important to understand because it basically decides who should be able to trigger this specific skill.

00:12:56.075 --> 00:13:54.525
So first up, we have default behavior. So you can invoke it? Yes. Claude can invoke it? Yes. You use this for most of your skills. Let Claude find them and use them easily. That's the default behavior, and for the most part, most people are gonna wanna be using that kind of thing. Next up, we have disable-model-invocation: true, and this means that you can invoke it, but Claude cannot invoke it. You'd wanna use this for destructive operations or something where you want a human in the loop. Because remember, you don't just want the system going out there and YOLO-ing every single thing, especially if it needs your approval to get in front of other people or send something to someone, or deal with sensitive information. Those are a few use cases. But then there's also user-invocable: false, and that means that you can't trigger it, but Claude can. There are a few use cases when you'd want to have this. Me, personally, I would never have that, but it's mostly useful for background knowledge that Claude should know, but that users can't directly trigger. So if we have a look at what that might look like in our front matter, we would have name: deploy; description: deploy the application to production; disable-model-invocation:

00:13:54.525 --> 00:14:07.620
true. And in that sense, Claude wouldn't be able to do a single thing because you don't want Claude just YOLO-ing things into production, but the user can say yes. So as part of a workflow, that's probably something I would want to set, especially if I have things that are starting to travel into production.
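Those three invocation settings boil down to a small truth table, which can be sketched like this. Treat it as an illustration only: `who_can_invoke` is a hypothetical helper, the front matter is modelled as a plain Python dict, and the hyphenated key spellings are my assumption about how the two flags are written.

```python
def who_can_invoke(frontmatter: dict) -> dict:
    """Work out who may trigger a skill from its front matter flags."""
    return {
        # Users can invoke unless the skill opts out (assumed key spelling).
        "user": frontmatter.get("user-invocable", True),
        # Claude can invoke unless model invocation is disabled (assumed key spelling).
        "claude": not frontmatter.get("disable-model-invocation", False),
    }


deploy = {
    "name": "deploy",
    "description": "Deploy the application to production",
    "disable-model-invocation": True,
}

print(who_can_invoke({"name": "anything"}))       # {'user': True, 'claude': True}
print(who_can_invoke(deploy))                     # {'user': True, 'claude': False}
print(who_can_invoke({"user-invocable": False}))  # {'user': False, 'claude': True}
```

So the deploy skill stays human-triggered only, which is exactly what you want for anything going into production.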

00:14:07.940 --> 00:14:22.275
We need that human review before we go in there. And then I also wanna look at model routing and sub agents. If you remember earlier when I touched on the skills, I told you about being able to run Sonnet. So for my research lead, we kick it off with whatever I'm talking to in that chat box. Usually, that's Opus.

00:14:22.435 --> 00:14:26.035
Then it will spawn our Sonnet model, which is much cheaper to run a specific workflow.

00:14:26.630 --> 00:14:30.630
The context here is set to fork, which is an isolated subprocess,

00:14:30.710 --> 00:14:32.550
and then we just give it our allowed tools.

00:14:32.790 --> 00:15:01.730
So we have the user asking, it reads the front matter, it matches the skill, it then spawns Sonnet, and it executes and returns whatever I need from my research lead process. Pretty straightforward and obviously it's up to you to decide or work with the AI in order to decide which model makes sense to use for what. You can use Haiku for the things that are far less complex. Maybe you wanna keep using Opus for the things that need a lot of judgment and a lot of reasoning behind them. Point is you can toggle with all of those and you can play with them as part of your workflows.
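The "which model for what" decision can be written down as a tiny lookup, just to make the idea concrete. This is a sketch under assumptions: the three complexity labels, the `pick_model` helper and the fallback to Opus are all mine, and in a real skill you'd simply set the model in the front matter.

```python
# Hypothetical mapping from task complexity to model tier.
MODEL_BY_COMPLEXITY = {
    "low": "haiku",      # simple, mechanical steps
    "medium": "sonnet",  # a typical fixed workflow
    "high": "opus",      # heavy judgment and reasoning
}


def pick_model(complexity: str) -> str:
    """Pick a model tier, falling back to the strongest one when unsure."""
    return MODEL_BY_COMPLEXITY.get(complexity, "opus")


print(pick_model("medium"))   # sonnet
print(pick_model("unknown"))  # opus
```

Falling back to the strongest model is the cautious choice here. If cost matters more to you than quality, flip that default.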

00:15:01.890 --> 00:15:09.625
Then I just wanna touch on the SKILL.md body again. So we had a look at how I structure mine already using the imperative form as an SOP,

00:15:09.945 --> 00:15:14.665
but you need to understand what actually makes this good. So exact commands with full paths,

00:15:14.905 --> 00:15:19.145
expected output format for each step, dependencies listed explicitly,

00:15:19.145 --> 00:15:40.035
things like that. Claude knows exactly what to run and what to expect back. It's really helpful when we give it examples or when we're being very specific because it leaves no room for ambiguity. So like I've got over here, step one, do this. Use this thing in order to achieve that. This is what the output would look like. Step two, exact same thing over and over again for the specific workflow that I'm doing here.

00:15:40.355 --> 00:15:47.075
But what makes an instruction bad is if you say something like research the lead and find out about their company, then write some DMs.

00:15:47.235 --> 00:16:47.415
That is insanely vague and you could get 100 different results based on that. So doing something like this is not gonna get you where you wanna go, especially as a repeated process. This wouldn't work if you gave it to a human. If you gave these instructions to a human, they would go and do something, but it's probably not gonna be the thing that you actually wanted them to do. So again, when you are writing these things or getting AI to write them for you, be very specific and think, if I handed this to someone, would they know what to do with this? If we look at this over here, if I handed this thing to someone, they would understand exactly what is happening here and exactly what I would want them to do. So that's the way that you wanna look at writing your skill bodies. Then we have degrees of freedom, and this is essentially just telling Claude the level of freedom that it has. So from loose guidance to the exact commands that we have. So high freedom means that multiple approaches are valid. Context determines the best one. An example of this would be review the code, check for bugs, edge cases, readability, things like that. So it has high freedom to do that. For medium freedom, that's where a preferred pattern exists, but some variation is okay.

00:16:47.735 --> 00:16:52.990
So an example of this would be generate a report using this template, customize sections as needed.

00:16:53.550 --> 00:16:58.030
And then low freedom, these are operations that are fragile where consistency is critical.

00:16:58.190 --> 00:17:10.995
So a specific sequence must be followed every single time in the exact same way. And that for me is mostly where I use my Python scripts. So I say to the AI as part of the skill body, run python3 scripts/migrate.py, whatever.

00:17:11.155 --> 00:17:20.915
Do not modify the command. I'm giving it very specific instructions about what to run here. It doesn't really have a lot of freedom or choice in the matter of how it does this task because I'm being very specific.

00:17:21.450 --> 00:17:59.100
And that's really important for business logic and building systems. You don't want to give this thing high freedom if it needs to run a very specific command. That makes absolutely no sense. So you need to factor that into your skills as well. How much freedom do you wanna give this thing as part of what it's trying to do? One thing I like to do is to also have feedback loops. So generate, validate, fix, repeat. AI does that automatically, but you can bake these into your skills and you can give them an extra nudge just to put any of your own custom rules in there as part of this generate, validate, fix iteration loop that it runs through. So if we look at an example from my research lead, we have validate the output. So we run python3 scripts/validate.py output.json.

00:17:59.340 --> 00:18:07.805
If validation fails, read the error message carefully, fix the specific issues, and run validation again. Only proceed when validation returns valid equals true.

00:18:08.125 --> 00:18:18.285
Maximum three rounds; if still failing, report to user. So, again, you can customize this to whatever the hell it is that you want, but feedback loops are very important because it learns every single time that it runs through this iteration,

00:18:18.840 --> 00:18:38.215
and you can also get it to update some of your other SOPs or your CLAUDE.md for anything specific that it might learn. The point here is that your skills are evolving. The goals mostly stay the same, but when you're testing and you're iterating through your initial builds, that is when you are learning and evolving the system so you can get it to rewrite its own skill while it's learning.
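The generate, validate, fix loop with its cap of three rounds can be sketched in a few lines of Python. This is a minimal illustration under assumptions: `generate_validate_fix`, the `REQUIRED` keys and the toy `validate` and `fix` functions are all hypothetical stand-ins for whatever your skill and your validation script really do.

```python
MAX_ROUNDS = 3  # "maximum three rounds" from the skill text


def generate_validate_fix(generate, validate, fix, max_rounds=MAX_ROUNDS):
    """Validate up to max_rounds times, fixing between failed rounds."""
    output = generate()
    errors = []
    for round_number in range(1, max_rounds + 1):
        errors = validate(output)
        if not errors:
            return {"valid": True, "output": output, "rounds": round_number}
        if round_number < max_rounds:
            output = fix(output, errors)
    # Still failing after the last round: report back instead of looping on.
    return {"valid": False, "output": output, "rounds": max_rounds, "errors": errors}


REQUIRED = ("name", "company", "dm")  # hypothetical output contract


def validate(record):
    return [f"missing {key}" for key in REQUIRED if key not in record]


def fix(record, errors):
    # Fix only the first reported issue, then let validation run again.
    key = errors[0].removeprefix("missing ")
    return {**record, key: "TODO"}


result = generate_validate_fix(lambda: {"name": "Ada"}, validate, fix)
print(result["valid"], result["rounds"])  # True 3
```

If the starting output is missing all three fields, the same loop gives up after round three and hands the remaining error back, which is the "report to user" branch.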

00:18:38.535 --> 00:19:05.425
The point here is that when we go through this little loop over here of generate output, validate with script, fix issues, repeat until valid, we're trying to get to a very good definition of done, something that matches the quality that we need for the system we are building. So then next up, we obviously need to talk about MCP because that forms part of it. In your environment, there are generally gonna be two types of tools. One of them being MCP, which then has a whole world of tools. But then we also already mentioned scripts, which in this case are also tools.

00:19:05.745 --> 00:19:10.305
The point is that you don't want to use MCP for every single thing out there just because it exists.

00:19:10.880 --> 00:19:18.080
If we need to do something with Supabase or Vercel or something like that, Notion, whatever, there are tons of times where using MCP makes sense.

00:19:19.120 --> 00:19:39.925
But there are also times where using it is just gonna add latency or context bloat and things like that into your system, and that's where using something like a standard Python script makes much more sense. So I would say use scripts specifically if you have a fixed pipeline, the same steps every single time, something repeatable that you need to do over and over again in the exact same way that doesn't require any form of AI judgment.

00:19:40.165 --> 00:20:04.655
You're gonna wanna use a script for that because then there is absolutely no token overhead like I mentioned earlier. It's literally just the script running locally, doing its thing, whatever it needs to do. And then I would use MCP when I want some judgment behind it or if the AI needs to interact with something specifically on my behalf, that's when you would wanna use it. When you have those external services like Vercel and Supabase, when it needs to go out into the wild and reach into some of these fancy tools and do work on your behalf,

00:20:04.975 --> 00:20:14.310
MCP makes sense for that. For me, I'm always gonna default to using scripts for deterministic stuff, and I will get MCP involved when I need it to do work on my behalf with those external systems.

00:20:14.550 --> 00:20:18.150
Then we need to take a look at a few design patterns that Anthropic put forward.

00:20:18.470 --> 00:20:35.675
So the first one is sequential, and this is where steps run in a strict order. This is like that n8n node flow that you can think of. And that's what I've done with my research lead because everything can be done locally. I don't really need MCP or anything like that. It's mostly just using Python scripts here along with a couple of APIs into other systems.

00:20:35.915 --> 00:20:38.315
But the point is everything runs sequentially.

00:20:38.475 --> 00:20:48.810
So we research the lead, we scrape a profile, we research that, we analyze everything that we've done, and then we write the DMs, we store them, and then send them using HeyReach.
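That strict ordering can be sketched as a list of steps where each one consumes what the previous one produced. It's only an illustration: `run_pipeline` and the four stub steps are hypothetical, the URL is a placeholder, and the real workflow calls actual scripts and APIs instead of building strings.

```python
def run_pipeline(steps, state):
    """Run steps in strict order; each step sees the previous step's output."""
    for step in steps:
        state = step(state)
    return state


# Hypothetical stub steps standing in for the real scripts.
def scrape_profile(state):
    return {**state, "profile": f"profile for {state['linkedin_url']}"}


def research(state):
    return {**state, "research": f"notes on {state['profile']}"}


def analyze(state):
    return {**state, "analysis": f"analysis of {state['research']}"}


def write_dms(state):
    return {**state, "dms": [f"DM based on {state['analysis']}"]}


STEPS = [scrape_profile, research, analyze, write_dms]
final = run_pipeline(STEPS, {"linkedin_url": "https://example.com/in/someone"})
print(list(final))  # ['linkedin_url', 'profile', 'research', 'analysis', 'dms']
```

If you swap two steps around, the later one fails with a KeyError because its input doesn't exist yet. That's the whole reason the order has to be fixed.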

00:20:48.890 --> 00:21:03.715
So that is a very sequential thing, and most business operations function that way. We want the steps to run in this specific order. Then the other one is iterative refinement, where the output quality needs validate, fix, repeat loops. So an example here would be for building a website.

00:21:03.955 --> 00:21:05.395
So generate HTML,

00:21:05.395 --> 00:21:09.475
validate, fix, revalidate until we get to whatever our definition of done looks like.

00:21:10.170 --> 00:21:16.170
The third one is multi MCP coordination, and that's where the workflow spans multiple external services.

00:21:16.570 --> 00:21:47.120
So an example here, if we were building an app, we might use Supabase for the database. We would use Vercel for deploying it. We might use Slack to notify a bunch of people. We might use Firecrawl to do web scraping as part of that. Whole bunch of different things, but they're all running via MCP and the AI is coordinating all of that stuff for you. If you haven't seen my ChatGPT versus Claude video, it's on the screen now. And I use this exact sort of thing in there. We use MCP for everything. Then the fourth one is context-aware branching. And this is where you have the same input with different execution paths.

00:21:47.200 --> 00:22:18.960
So an example of this would be to process a file. If it's a CSV, take this data path. If it's a PDF, go down that path. Very straightforward. It's just context aware in the way that the process will run based on its initial point. Then pattern five, we have domain specific intelligence. And that's where we have embedded business rules or compliance, audit trails, things like that. So an example as part of my research lead workflow that I've got, I have a relevance filter and it discards anything that looks like theater. So if it finds someone's LinkedIn profile that talks a bunch of stuff that isn't really relevant,

00:22:19.120 --> 00:22:24.875
it filters all of that out. It only focuses on keeping things that will help write its DM effectively

00:22:24.955 --> 00:22:28.555
because I've baked in some of my own domain specific intelligence

00:22:28.555 --> 00:22:45.530
when it comes to sales and AI and things like that. So I've told the AI what to look for and it is now intelligent in the way that it does that. So these five design patterns, they're probably the most common things. And realistically, for most of the work that you're gonna be doing for clients, it could include many of these different things, if not all of them, as part of their system.
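Of the five patterns, context-aware branching is probably the easiest one to show in code. Here's a minimal sketch, assuming the branch is chosen purely from the file extension. `process_file` and the two handlers are hypothetical stubs, not part of any real skill.

```python
from pathlib import Path


def handle_csv(path: Path) -> str:
    return f"parse rows from {path.name}"


def handle_pdf(path: Path) -> str:
    return f"extract text from {path.name}"


# Same entry point, different execution path per file type.
HANDLERS = {".csv": handle_csv, ".pdf": handle_pdf}


def process_file(filename: str) -> str:
    path = Path(filename)
    handler = HANDLERS.get(path.suffix.lower())
    if handler is None:
        raise ValueError(f"no path defined for {path.suffix or 'files without an extension'}")
    return handler(path)


print(process_file("leads.csv"))   # parse rows from leads.csv
print(process_file("report.PDF"))  # extract text from report.PDF
```

Failing loudly on an unknown file type is a deliberate choice in this sketch. In a skill you'd probably tell Claude to stop and ask the user instead.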

00:22:45.770 --> 00:22:54.865
Just keep in mind something on MCP again. Only use it when you absolutely need to use it for your client systems, because even though they kinda solved the token bloat problem with lazy loading,

00:22:55.025 --> 00:23:14.580
it still exists to some extent, especially if you're building a massive system. But then, obviously, we need to test the skills that we're writing. We can't just go out there and do a bunch of stuff and expect it to work without running any tests. They need to be performed in this order because each of them reveals a different aspect of the quality or their effectiveness behind them. So test one is the trigger test. You open a brand new session,

00:23:14.740 --> 00:23:38.030
not an existing one because there's context bias. Obviously, if you've been building the skill with the agent inside there and you then ask it a question, it already has some context awareness. So the best thing to do is to use natural language inside a new window. So if I flip back here, say I was using this one to build me my initial skill, I would just come over here, open up a new window and I would say, hey, let's go and run lead research.

00:23:38.350 --> 00:23:43.630
Now as part of that, we're testing whether this thing can semantically match what I've written in my front matter,

00:23:43.790 --> 00:23:55.485
which is that little sentence up top that Claude uses to identify a skill. And we're doing that in a new window because this has no context awareness of anything that I've spoken about with regards to research lead up until now.

00:23:55.725 --> 00:24:21.905
Whereas if I did it in the window in which I built that skill, it obviously has all the context as part of that conversation up to the point where it might have cut something off. But the point being that context is still gonna be in there, which is why we want to use a new window to see firstly if it can actually trigger the workflow which validates our front matter is working as it should. So if we have under triggering, that means that the skill never fires. So we need to broaden the description in our front matter describing what this thing is or does.

00:24:22.225 --> 00:24:26.545
But then there's also over triggering where it fires for everything, and you need to narrow the description.
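
NOTE One way to keep track of the trigger test is to log each prompt you try in a fresh session and whether the skill fired. The sketch below assumes you record the results by hand; TriggerCase and diagnose are hypothetical names, and the "mixed" outcome is an added assumption rather than something from the guide.

```python
from dataclasses import dataclass


@dataclass
class TriggerCase:
    prompt: str        # what you typed into the fresh session
    should_fire: bool  # whether the skill ought to trigger
    did_fire: bool     # what you observed, recorded by hand


def diagnose(cases: list[TriggerCase]) -> str:
    """Map observed misfires to the description fix discussed above."""
    missed = sum(1 for c in cases if c.should_fire and not c.did_fire)
    extra = sum(1 for c in cases if c.did_fire and not c.should_fire)
    if missed and extra:
        # Assumption: both problems at once means the wording needs a rethink.
        return "mixed: review description"
    if missed:
        return "under-triggering: broaden description"
    if extra:
        return "over-triggering: narrow description"
    return "ok"
```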

00:24:26.785 --> 00:24:48.145
So, again, it's about finding the right balance when you're writing your front matter. Test two is the functional test, and that's pretty straightforward. We're just running the skill four to five times with different inputs. Does behavior stay consistent when we're doing that? Try it with sub agents as well. That's very important to do. Again, if we're not just using our main Opus agent and we switch the model to Sonnet, get it to do that and see how it responds.
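
NOTE A rough sketch of the functional test, assuming you paste the output of each run into a list and that "consistent" means every run contains the same expected sections. The section names in REQUIRED_SECTIONS are hypothetical; swap in whatever your skill is supposed to produce.

```python
# Hypothetical section names a lead-research skill might be expected to emit.
REQUIRED_SECTIONS = ("Summary", "Leads", "Next steps")


def missing_sections(output: str) -> list[str]:
    """List the expected sections that do not appear in one run's output."""
    lowered = output.lower()
    return [s for s in REQUIRED_SECTIONS if s.lower() not in lowered]


def consistency_report(outputs: list[str]) -> dict[int, list[str]]:
    """Return run index mapped to missing sections, only for runs that drifted."""
    report = {}
    for index, output in enumerate(outputs):
        missing = missing_sections(output)
        if missing:
            report[index] = missing
    return report
```

An empty report across four or five runs, including runs on a different model or through a sub agent, suggests the behavior is staying consistent.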

00:24:48.305 --> 00:24:54.225
And then the third part of testing is the value benchmark. The hardest question here is, is this skill actually helping?

00:24:54.545 --> 00:25:21.645
So compare Claude's output with the skill versus without it. If the skill doesn't improve the consistency, the quality, or the speed of whatever you're doing, it probably doesn't need to be a skill. Again, it should be something that is repeatable, but it needs to be adding that value to you. So if it's not making your life faster or more efficient, does it really need to be a skill or do you need to readdress whatever it is that you're looking at? And then finally on this, we need to look at what Anthropic recommends for iterating with Claude. This is part of their A/B building and testing.
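
NOTE The value benchmark can be made a little less subjective by scoring outputs with and without the skill. This is only a sketch: the 1 to 5 ratings are assumed to be assigned by you, and the min_gain threshold is an arbitrary illustrative choice.

```python
from statistics import mean


def skill_adds_value(
    with_skill: list[float],
    without_skill: list[float],
    min_gain: float = 0.5,
) -> bool:
    """Compare average hand-assigned ratings with and without the skill."""
    return mean(with_skill) - mean(without_skill) >= min_gain
```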

00:25:21.965 --> 00:25:33.800
So they recommend that you use two instances working in tandem, and it makes complete sense, as you'll soon find out. So Claude A is the designer. It helps you write and refine the SKILL.md. It understands the agent instructions,

00:25:34.040 --> 00:25:37.640
and it works with your domain expertise to craft the effective skills.

00:25:38.120 --> 00:25:53.045
Claude B is the tester. So you use the skill in a fresh session, like we just discussed, on real tasks, and this reveals where the instructions might be too vague or incomplete and things like that. So the workflow for this might be: complete a task with Claude A. Use normal prompting to accomplish a workflow.

00:25:53.525 --> 00:25:57.365
Notice repeated context. So you identify what instructions you provide repeatedly,

00:25:57.770 --> 00:26:16.905
and that's probably gonna be your skill. If you have a workflow that you do over and over again, that can also be a skill. Things like that. Then for step three, just ask Claude A to create a skill. So again, if we're using my research lead example over here, this was something that I obviously did over and over again to find leads. So initially, it started as an n8n workflow,

00:26:16.905 --> 00:26:21.225
then I moved it into an agentic format using my own gotcha framework.

00:26:21.545 --> 00:26:36.580
And now I'm bringing it into skills because, as you've seen, they're far more efficient, specifically for the context loading. But the point is that I went through my process step by step. How do I research leads? How do I go out there and find them? And then I've built an automation around that.

00:26:36.980 --> 00:26:40.660
And in that same sense, we're now just doing that in a skill format.

00:26:40.805 --> 00:26:50.725
So I would come in here and I would say, this is my typical working model. I go into this system. I take this action. I do a bunch of these steps. And I would talk to Claude about it,

00:26:51.125 --> 00:26:53.125
describing my entire workflow.

00:26:53.205 --> 00:27:12.565
And then at the end, just say, hey. Let's turn this into a skill. And that's what we're doing over here. If you had walked through something with it or you had been working together, you can just turn that into a skill as well. Tons of ways that you can approach this. Then obviously, we need to review it. We wanna make sure there's no bloat. We wanna make sure that as a first part of the process that it is actually functioning and running at a basic level.

00:27:13.045 --> 00:27:17.925
And we do that by running tests with Claude B. So fresh session, similar tasks,

00:27:18.005 --> 00:27:30.080
observe the behavior, see what it does. Then step six is to bring failures back to Claude A. For example, Claude B forgot to filter test accounts. So we're feeding Claude A more information about where it went wrong when it initially built our skill.

00:27:30.240 --> 00:27:41.325
You just go back and forward like that until you get the system built. It's refining with two agents, and it's probably the most efficient thing you can do right now, especially if you already have the context that you can feed agent a upfront.
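
NOTE The back-and-forth between the two sessions can be pictured as a simple revision log. This sketch does not call Claude at all; SkillDraft and apply_feedback are hypothetical names, and the fix text is something you would agree with Claude A after seeing the failure in Claude B.

```python
from dataclasses import dataclass, field


@dataclass
class SkillDraft:
    instructions: list[str]
    revision: int = 1
    history: list[str] = field(default_factory=list)  # failures seen so far


def apply_feedback(draft: SkillDraft, failure: str, fix: str) -> SkillDraft:
    """One round trip: a failure observed in the tester session becomes
    a new instruction written with the designer session."""
    return SkillDraft(
        instructions=draft.instructions + [fix],
        revision=draft.revision + 1,
        history=draft.history + [failure],
    )
```

Keeping the failure history alongside the instructions makes it easier to tell, a few rounds in, which lines of the skill exist because of which observed problem.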

00:27:41.645 --> 00:28:25.040
And at this point, I should probably give you some pointers on where you can get the skill creator and embed it in your own environments so that you can make this entire process a lot easier, because Anthropic already thought about this for you. Alrighty. So let's have a very quick overview of how all of these pieces fit together from a practical perspective and how we can use Claude to actually help us arrange and organize all of these things. Now for the most part, I'm gonna be using an IDE, which is just this interface you see here. In order to set all of this up, you can do it in the CLI, and again, this is an open framework, so it will probably work with most of the applications that you're using nowadays as well. So I'm using VS Code as my IDE, but again, any IDE will do just fine. So one thing that you could do to get everything set up for you is literally just come down here to Claude, ask it to research skills, and then get it to set up your environment

00:28:25.360 --> 00:28:58.600
according to what Anthropic says is best practice. That's one way, and then this thing would just go ahead and do it. I'm not gonna take that approach because I use this system along with my CLAUDE.md that I'm busy migrating to a version two to attach with my framework. So I use this a little bit differently in that sense because I've got other moving parts that form part of my skills. But for the most part, if you wanted a shortcut, just ask Claude to set this up for you. If you wanted to do it manually, everything lives in this .claude folder over here. This is currently at a project level. And you see we've got our SKILL.md, which is our skill creator, its scripts, its references.

00:28:59.425 --> 00:29:03.905
So if you wanna get this meta skill creator that creates all of the skills for you,

00:29:04.305 --> 00:29:08.385
there are multiple ways you can do that. One, you could come on over to your web browser

00:29:08.705 --> 00:29:30.965
and you can have a look at this main GitHub repo over here. This is from Anthropic and you can connect this directly to the IDE that we're currently working in or you could just reference this and again give this to your agent down here and say, hey, can you pull out the skill creator from here and add it as a skill to our environment? And it would do that. But probably one of the easiest things for you to do is to come down here, do forward slash plugins,

00:29:31.285 --> 00:29:33.445
go over to the marketplace tab

00:29:33.685 --> 00:29:45.770
and that repo link, just copy and paste it into here. And you'll see you'll then have it added down here. In which case under plugins, your skills become browsable from whichever repo you paste in there.

00:29:46.090 --> 00:29:56.745
So you can also search for them if you don't feel like scrolling through this entire thing. But if you scroll down, you will see a few skills in here including the skill creator. So this is the official one from Anthropic.

00:29:56.905 --> 00:30:13.130
And then once that's installed, it will be installed over here and you'll be able to use it and just come to this thing and say, hey, build me a skill based on this workflow, whatever it is you're working on. And it would then go ahead and build it based on the best practices outlined in this SKILL.md over here. So this is the meta skill creator.

00:30:13.130 --> 00:30:14.090
Very handy.

00:30:14.330 --> 00:30:27.505
But then there's already tons and tons of existing skills out there from people who have done these workflows for you already over and over again, and they're already tested. We can find those in the marketplace over here. So if you check out skillsmp.com,

00:30:27.665 --> 00:30:38.210
you'll be able to find an insane amount of agent skills. So you can always come down here and try and find whatever it is that you're looking for. They give you a few examples. So skills about trading, data analysis related skills.

00:30:38.370 --> 00:30:40.690
Let's have a look for sales and marketing.

00:30:41.010 --> 00:30:42.370
See what they've got for that.

00:30:42.850 --> 00:30:55.545
So you can see here there are a bunch of skills for this already. And then all we would have to do is click here and this is a little markdown structure that you can see. So again, there's our name, there's the front matter description that we spoke about in the video,

00:30:55.945 --> 00:31:20.345
lead research assistant, this skill helps you identify and qualify potential leads for a business by analyzing product and service. So there are tons of things you could do from here. You could just copy and paste this back into your chat window. Say, hey, turn this into a skill and it will go and do it for you. Or there are other options on the right hand side here depending on what you're using. You can see it's got wget, which means it would just download it. You can also use npx, which is pretty handy if you didn't want to rebuild this thing yourself.

00:31:20.665 --> 00:31:44.825
Point is, this is where you come to get existing skills. There are a ton of other places where you can get skills as well. But for me, I prefer to build my own because, like I said, I work in a very specific way. So now that I've got the very basic stuff out of the way, I'm gonna jump into my environment and I'm gonna show you how I would build skills and how it works within my environment. Okay. And we're in my environment now. So let's pretend that you had a slide generator

00:31:44.825 --> 00:31:54.105
and you wanted to turn it into a repeatable workflow because you've nailed exactly how you want your slides done. Now, I've already got something like that in here. I've got under my goals folder, I've got Gamma slides.

00:31:54.700 --> 00:32:00.140
You don't need to understand what goals is. It's just my framework that I was using historically before I moved across to skills.

00:32:00.300 --> 00:32:33.530
So I'm just using this because it already, as you can see, is a repeatable workflow. I was already doing the exact same thing. Just inefficient because now that skills are so granular, you can actually do this type of thing a lot more efficiently, which is why I'm moving my framework across to half skills and then half of some of the other stuff that I've been doing in my videos if you've been watching me for a while now. So anyway, what we're gonna do is we're going to get our agents to turn these gamma slides into a skill. Shouldn't be that hard because we've already got the format nailed over here. We were already doing that kind of thing. But for the most part, it's going to use our skill creator

00:32:33.930 --> 00:32:35.850
to put it into the right format.

00:32:36.170 --> 00:32:39.530
So all I'm gonna do is I'm gonna come over here and I'm gonna tell this thing to do it.

00:32:40.395 --> 00:32:41.675
I want to create

00:32:42.155 --> 00:32:46.235
a new skill based on one of our goals for Gamma slides.

00:32:46.315 --> 00:32:48.235
So can you take Gamma slides

00:32:48.315 --> 00:32:51.195
and turn them into the new SKILL.md format?

00:32:51.355 --> 00:32:53.355
Place it in the gotcha v two folder.

00:32:53.910 --> 00:32:59.270
So this thing's now going to try and figure out what it is that I just said to it. It's probably gonna read our goals folder.

00:32:59.750 --> 00:33:24.355
In your case, if you were writing a new skill, you might not already have something as nailed down as this. But again, you would understand the workflow that you've been building or the idea that you want to have. And then you would start to articulate that to the AI and work back and forth with it in order to build some kind of repeatable workflow or an application, a system, whatever it is that you're building. And ultimately, you would get to a point where you would have something structured like a goal like this.

00:33:25.980 --> 00:33:27.820
I'm just gonna bypass permissions.

00:33:28.060 --> 00:33:31.260
Specifically, what I do with my framework is that my CLAUDE.md

00:33:31.260 --> 00:33:52.475
is tailored all around this. So it understands exactly how to build these goals and how to separate them with tools and things like that. I don't wanna go into that in this video. I have deep dives in other videos on my channel. You can check one on the screen now. Point is though, I'm gonna be keeping that exact same structure, just migrating it across to skills, and I'll put out a new video and deep dive into how all of that will go together. So I'm not gonna go deep into that in this video.

00:33:53.140 --> 00:33:57.620
Okay. So this thing is allegedly done. Here's what I created, the Gamma Slide skill.

00:33:57.780 --> 00:34:06.180
So if we go on over to Gotcha v two, now yours would probably be living up here. Mine is just like this because I'm using this as a test environment while I migrate frameworks.

00:34:06.340 --> 00:34:10.615
But if we come down here, you can see it used the skill creator to create our Gamma slides.

00:34:11.015 --> 00:34:13.495
So if we have a look at SKILL.md,

00:34:13.975 --> 00:34:20.455
the name is Gamma slides. The description is to generate presentation slide decks from markdown content using the Gamma API.

00:34:20.930 --> 00:34:35.570
Used when the user says create slides, make presentation, build a deck, generate a slide deck, so on and so forth. It's given it the Sonnet model to do that and it wants to use it as a sub process, which is very cool. And then it's limited its tools as well because we only need one tool in order to create that

00:34:35.890 --> 00:34:42.295
and that is probably gonna be a script that connects to the Gamma API itself and then pushes in some of the parameters that we want.
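
NOTE To make the front matter concrete, here is a small sketch that reads the name and description from the top of a SKILL.md. The example text is reconstructed from what is read out above, so the exact wording and the gamma-slides identifier are assumptions. The parser is a deliberately simple key/value reader, not a full YAML implementation, and it ignores the model and tool settings mentioned in the video.

```python
def parse_front_matter(text: str) -> dict[str, str]:
    """Read simple 'key: value' pairs between the two '---' lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing opening --- line")
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    raise ValueError("missing closing --- line")


# Assumed example, paraphrasing the description read out in the video.
EXAMPLE = """---
name: gamma-slides
description: Generate presentation slide decks from markdown content using the Gamma API. Use when the user says create slides, make presentation, build a deck.
---
# Gamma slides
"""

fields = parse_front_matter(EXAMPLE)
```

The description is the part the trigger test exercises, so it is worth checking that it names both what the skill does and the phrases that should activate it.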

00:34:42.855 --> 00:34:54.060
So we can have a look over here. The objective is to generate a presentation deck blah blah blah. The inputs required, so we need to have content or a content idea, a title, the slide count,

00:34:54.620 --> 00:34:59.660
and then it lists that we need the Gamma API key which lives in my dot e n v file down here.

00:35:00.140 --> 00:35:04.220
And then all it does is create the execution steps. So prepare markdown content

00:35:04.300 --> 00:35:39.065
which creates the clean file that we create locally here. Now we create it locally so I can do editing here before I push it up to Gamma and then use Gamma AI to create it. Otherwise, we would burn a ton of credits between two systems and that's just silly. So we get things working really nicely in this environment, which is cheaper, and then push it up to Gamma, and Gamma's AI will make it for us. So I'm not gonna go through this whole file, but you get the point. It is a step by step framework on exactly how to do it. We tell it exactly what it needs, the text amount, whether we're using images or no images and things like that. It all gets baked into this skill or standard operating procedure.

00:35:39.465 --> 00:35:41.785
And then in order to help us achieve this goal,

00:35:42.105 --> 00:36:03.795
we have our scripts which is how we connect to the environment and this is just the tool that it uses. It's the Gamma API. You can see down here, this is just a Python script. It costs nothing to run. It doesn't need any context. It does all of the work. Don't need to use MCP for this because the AI inside Gamma will already take care of all of the building for us. So we don't need our subscription to do any of that lifting.

00:36:03.955 --> 00:36:05.235
No need for MCP.
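
NOTE The script itself is not shown in full, so the following is only a sketch of the general shape such a tool might take: read the key from the environment and assemble a request from the skill's inputs. It deliberately stops short of sending anything. The GAMMA_API_KEY variable name and every payload field name are placeholders, not Gamma's documented schema, so check the provider's API reference before relying on them.

```python
import json
import os


def build_gamma_request(markdown: str, title: str, slide_count: int) -> dict:
    """Assemble (but do not send) a request for a slide-generation API."""
    api_key = os.environ.get("GAMMA_API_KEY")  # assumed variable name
    if not api_key:
        raise RuntimeError("GAMMA_API_KEY is not set")
    if slide_count < 1:
        raise ValueError("slide_count must be at least 1")
    # Field names below are placeholders, not a documented schema.
    payload = {"title": title, "content": markdown, "slide_count": slide_count}
    return {"api_key": api_key, "body": json.dumps(payload)}
```

Because a plain script like this runs outside the model, it adds nothing to the context window, which is the point being made about not needing MCP here.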

00:36:05.555 --> 00:36:19.830
And it really is that easy, guys. Now, obviously, you would be working with the AI if you wanted to. I imagine if you started with a fresh environment, you wouldn't have any of this stuff. But the point is you could just come down here and articulate your thoughts or your problems with the AI and work through whatever it is that you're trying to turn into a skill.

00:36:20.150 --> 00:36:24.310
It's basically just understanding your problem well enough and then having solved it manually

00:36:24.470 --> 00:36:36.775
and then coming in here and trying to refine that into a system. And Claude will go ahead and build that for you, especially if you're using my framework or someone else's framework. It can help you do that even better and then you attach it to your skills like we did over here.

00:36:37.175 --> 00:36:46.615
But then obviously, all we've done here is create it. So we wouldn't want to test it with the exact same agent. So we would open up another one and I would just come in here and I would say to this thing,

00:36:47.660 --> 00:36:48.860
use the

00:36:49.260 --> 00:36:54.140
Gamma skill to create me a slide deck on the importance of

00:36:54.460 --> 00:36:55.180
salt.

00:36:56.140 --> 00:36:58.940
And then it's gonna go ahead and it's gonna do that. I'm gonna bypass permissions.

00:37:00.375 --> 00:37:05.895
Let me understand how the Gamma skill works by reading the relevant files. So you can see there, it's reading our skills,

00:37:06.135 --> 00:37:08.615
Gamma slides, figuring out how things work.

00:37:09.015 --> 00:37:18.410
Now I'll create the slide content and generate the deck. Let me first ensure the temp directory exists which it probably does. After it does that, it's gonna go and create our local markdown files.

00:37:18.570 --> 00:37:26.010
After that, it's then just gonna use the Gamma API, send it up to Gamma, and Gamma is gonna generate our slide deck, and we'll see how that thing turns out in just a few seconds.

00:37:26.330 --> 00:37:52.310
But before we do that, we should understand that, obviously, if we were building something more complex, there would be back and forth between this one, which is Claude A, and this one, which is Claude B, and if you remember, that's what we were speaking about when we're building and testing these things. We want to have two agents talking. Now, this thing immediately found out what I was talking about. So that means that we nailed our front matter. That means that that little section at the top was very clear and that a brand new agent can pick that thing up, which is the most important part.

00:37:52.630 --> 00:38:00.735
And then it was just able to go and do exactly what it needs to do. It wrote our Gamma content. Content is ready. Now let me generate the presentation via the Gamma

00:38:01.055 --> 00:38:01.775
API.

00:38:01.935 --> 00:38:34.035
And so you can see how it's using the skill and, in this case, the tool, which is a script, in order to achieve its goal. We could extend this out even further. So if you wanted to follow better practice, I would provide it a definition of what good looks like, and that would live in the references folder, which it didn't create here because I didn't care enough to add it for this example. But the point is that I would add references in here, and then I would put in two or three MD files showing it exactly what a definition of good looks like for my slide structure so that I would always get that structure every single time. We're just matching patterns here.
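
NOTE The folder layout described here (a SKILL.md plus scripts and references folders) can be scaffolded in a few lines. This is a sketch under assumptions: scaffold_skill is a hypothetical helper, the starter front matter is minimal, and where the skill folder should live depends on your own setup.

```python
from pathlib import Path


def scaffold_skill(root: Path, name: str, description: str) -> Path:
    """Create <root>/<name>/ with SKILL.md, scripts/ and references/."""
    skill_dir = root / name
    for sub in ("scripts", "references"):
        (skill_dir / sub).mkdir(parents=True, exist_ok=True)
    skill_file = skill_dir / "SKILL.md"
    if not skill_file.exists():  # never overwrite an existing skill
        skill_file.write_text(
            f"---\nname: {name}\ndescription: {description}\n---\n",
            encoding="utf-8",
        )
    return skill_dir
```

The two or three example files showing what good looks like would then be dropped into the references folder by hand.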

00:38:34.435 --> 00:38:40.630
But I think it still did a pretty good job. Let's have a look at what it turned out. Look at this, the importance of salt.

00:38:40.870 --> 00:38:47.030
From ancient civilizations to modern industry, salt has shaped human history, biology, and culture in profound ways.

00:38:47.190 --> 00:38:57.485
This essential mineral touches every aspect of our lives. You get the point. This is the exact thing that you saw earlier. This is the same color theme and all of that that I use for every single one of my slides.

00:38:57.725 --> 00:39:00.365
So all of this is nailed from a functional perspective.

00:39:01.005 --> 00:39:03.325
So we have officially created a repeatable

00:39:03.325 --> 00:39:04.125
system.

00:39:04.125 --> 00:39:35.205
And anytime I wanna come into my agent, all I need to do, as you've now seen, is come down here and speak to it. And that's the whole point of skills: to make all of these repeatable systems much easier for us to use. So hopefully, I've demystified it just a little bit. I will have another deep dive into this when I launch v2 of my framework that uses skills as part of it, and we'll cover more things in there as well, including plugins. So if you have any comments, leave them below. I do have a community that just launched. If you guys wanna check that out, it is open now. Otherwise, check out the videos on the screen. They're definitely gonna help you on your journey. Thanks very much for watching. See you on the next one.
