Mansel Scheffel · YouTube · 39:35

How to Actually Use Claude Code Skills (Full Breakdown)

A 39-minute field guide to Claude skills: structure, description writing, model routing, testing, and a live demo that ships a real workflow.

Posted

February 21st 2026

3 months ago

Duration

39:35

Format

Tutorial

educational

Channel

Mansel Scheffel

§ 01 · The Hook

The bait, then the rug-pull.

Most people discover Claude skills and immediately misfile them as a fancier prompt. They are not. A skill is persistent infrastructure: a workflow written once that executes consistently forever, without spending tokens on re-explanation. This breakdown starts where most tutorials skip: the architecture that makes skills efficient, not just possible.

§ · Chapters

Where the time goes.

00:00 – 01:42

01 · What Are Skills?

Skills as onboarding guides for Claude; persistent markdown files that load on demand; if you repeat a process, make it a skill.

01:42 – 03:05

02 · Progressive Disclosure

Three-tier loading: frontmatter triggers SKILL.md body on match which triggers linked files on demand.

03:05 – 05:26

03 · Skill Structure and Organization

Directory layout: SKILL.md SOP, scripts executed not loaded, references with examples and schemas, assets. The 97% context reduction example.

05:26 – 07:28

04 · Writing the Perfect Description

Max 1024 chars; third person only; answer WHAT and WHEN; include 3-5 trigger phrases. Bad vs good examples contrasted.

07:28 – 09:05

05 · The SKILL.md Body

Real-world research-lead skill walkthrough in VS Code; imperative writing style; frontmatter, model spec, allowed tools.

09:05 – 10:31

06 · Goal Format

Objective, inputs required, execution steps as numbered pipeline nodes. Specific tool calls with expected output per step.

10:31 – 12:55

07 · Model Routing and Subagents

model: sonnet, context: fork; Opus orchestrates, Sonnet executes; cost implications; allowed-tools scoping per skill.

12:55 – 14:07

08 · Invocation Control

Default both can invoke, disable-model-invocation:true for human-in-loop on destructive ops, user-invocable:false for background knowledge.

14:07 – 15:59

09 · Writing the SKILL.md Body - Instructions

Exact commands with full paths; expected output format per step; dependencies explicit. Vague vs specific imperative steps.

15:59 – 17:59

10 · Degrees of Freedom

High freedom for multiple valid approaches, medium for preferred pattern with variation OK, low freedom for exact commands do not modify.

17:59 – 19:05

11 · Feedback Loops

Bake validate-fix-repeat into the skill; max 3 rounds; report to user on persistent failure; skills evolve through iteration.

19:05 – 20:14

12 · MCP Tool References

Scripts for fixed deterministic pipelines with zero token overhead; MCP for judgment and external service interaction.

20:14 – 22:45

13 · Design Patterns Overview

Five patterns from Anthropic: Sequential, Iterative Refinement, Multi-MCP Coordination, Context-Aware Branching, Domain-Specific Intelligence.

22:45 – 25:21

14 · Testing Your Skills

Three tests in order: Trigger Test fresh session, Functional Test 4-5 runs subagent variants, Value Benchmark with vs without skill.

25:21 – 28:25

15 · Iterating with Claude A/B

Claude A designs, Claude B tests in fresh session. 7-step loop: complete, notice repeated context, create skill, review, test, bring failures back, refine.

28:25 – 31:20

16 · Practical Setup and Marketplace

VS Code plus Claude Code environment; dot-claude folder; Anthropic GitHub repo for skill creator; skillsmp.com marketplace.

31:20 – 36:05

17 · Live Demo: Converting a Goal to a Skill

Live conversion of Gamma slides goal file into SKILL.md using the skill creator; correct frontmatter, model routing, scripts output.

36:05 – 39:35

18 · Testing the Gamma Skill

Fresh session test; skill triggers correctly on natural language prompt; live generation of The Importance of Salt 11 slides via Gamma API.

§ · Storyboard

Visual structure at a glance.

hook open 00:01

promise progressive disclosure 01:42

value skill structure 03:05

value writing descriptions 05:26

insight 97% context reduction 11:11

value invocation control 12:55

value degrees of freedom 16:58

value design patterns 20:14

value A/B iteration 25:21

demo live demo skill created 36:05

payoff Gamma slide output 38:38

cta CTA 39:35

§ · Frameworks

Named ideas worth stealing.

01:42 model

Progressive Disclosure 3-tier loading

Frontmatter: about 100 tokens, loaded every session
SKILL.md body: loaded on match
Linked files: loaded on demand, per step

Each tier loads only when needed. The frontmatter is the catalog; the body is the SOP; the linked files are the execution layer.

Steal for Any system where you want Claude to self-discover and execute the right workflow without pre-loading everything
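The three tiers can be sketched in a few lines of Python. This is an illustration only, not how Claude Code is implemented: the `Skill` class, the function names, and the keyword-overlap matcher are all assumptions made for the example (the video says the real matching is semantic).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Skill:
    name: str
    description: str                      # tier 1: always visible in the catalog
    body: str                             # tier 2: loaded only on a match
    linked_files: Dict[str, str] = field(default_factory=dict)  # tier 3


def catalog(skills: List[Skill]) -> List[Tuple[str, str]]:
    """Tier 1: expose only name and description in every session."""
    return [(s.name, s.description) for s in skills]


def match(skills: List[Skill], request: str) -> Optional[Skill]:
    """Assumed stand-in for semantic matching: naive keyword overlap."""
    words = set(request.lower().split())
    best, best_score = None, 0
    for s in skills:
        score = len(words & set(s.description.lower().split()))
        if score > best_score:
            best, best_score = s, score
    return best


def load_linked(skill: Skill, filename: str) -> str:
    """Tier 3: read a linked file only when a step asks for it."""
    return skill.linked_files[filename]


skills = [
    Skill("research-lead", "Transforms a LinkedIn URL into a research package",
          "FULL SOP TEXT", {"references/good_dm.md": "example DM"}),
    Skill("email-digest", "Processes Gmail inbox to identify high risk emails",
          "FULL SOP TEXT"),
]
chosen = match(skills, "research this linkedin url")
```

Only `catalog(skills)` is paid for up front; `chosen.body` and `load_linked(...)` are touched after a match, which is the point of the pattern.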

15:59 model

Degrees of Freedom Spectrum

High freedom: multiple valid approaches, context determines best
Medium freedom: preferred pattern, variation OK
Low freedom: exact commands, do not modify

Match the constraint level to the fragility of the task.

Steal for Deciding how much latitude to give Claude in any repeatable workflow

17:59 model

Feedback Loop Pattern

Generate output
Validate with script
Fix issues
Repeat until valid (max 3 rounds)
Report to user on persistent failure

Baking validation into the skill itself turns probabilistic AI output into deterministic quality gates.

Steal for Any skill where output quality can be validated programmatically
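The loop above can be sketched as a small Python driver. The `generate`, `validate`, and `fix` functions here are hypothetical stand-ins; in a real skill the validation step would be a script and the fix step would be Claude editing its own output.

```python
from typing import Callable, List

MAX_ROUNDS = 3  # "max 3 rounds" from the pattern above


def run_with_validation(
    generate: Callable[[], dict],
    validate: Callable[[dict], List[str]],
    fix: Callable[[dict, List[str]], dict],
    max_rounds: int = MAX_ROUNDS,
) -> dict:
    """Generate, validate, fix, repeat; stop after max_rounds validations."""
    output = generate()
    errors: List[str] = []
    for round_number in range(1, max_rounds + 1):
        errors = validate(output)
        if not errors:
            return {"valid": True, "output": output, "rounds": round_number}
        if round_number < max_rounds:
            output = fix(output, errors)
    # Persistent failure: hand the remaining errors back to the user.
    return {"valid": False, "output": output, "rounds": max_rounds, "errors": errors}


# Hypothetical stand-ins for the real steps.
def generate() -> dict:
    return {"company": ""}


def validate(output: dict) -> List[str]:
    return [] if output["company"] else ["company is empty"]


def fix(output: dict, errors: List[str]) -> dict:
    return {**output, "company": "Acme"}


result = run_with_validation(generate, validate, fix)
```

Capping the rounds matters as much as the loop itself: without the cap, a fix step that never converges would keep spending tokens instead of reporting back.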

25:21 model

Claude A/B Iteration Method

Complete a task with Claude A
Notice repeated context patterns
Ask Claude A to create a skill
Review for bloat
Test with Claude B fresh session
Bring failures back to Claude A
Refine and repeat

Two-instance development loop that surfaces description and instruction gaps invisible when building and testing in the same session.

Steal for Building any skill where you cannot objectively evaluate your own output from inside the build context

12:55 model

Invocation Control Triad

Default: both user and model can invoke
disable-model-invocation: true (user only; for destructive ops)
user-invocable: false (model only; for background knowledge)

Fine-grained control over who can trigger a skill. Destructive operations always require human approval.

Steal for Any agentic system where some actions require human review before execution
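As a rough sketch, the triad reduces to two boolean flags with permissive defaults. The flag names come from the video; the `who_can_invoke` helper and the dict representation of frontmatter are assumptions for illustration.

```python
def who_can_invoke(frontmatter: dict) -> dict:
    """Derive invocation rights from the two frontmatter flags.

    Defaults follow the video: with neither flag set, both the user
    and the model can invoke the skill.
    """
    return {
        "user": frontmatter.get("user-invocable", True),
        "model": not frontmatter.get("disable-model-invocation", False),
    }


default_skill = {"name": "research-lead"}
deploy_skill = {
    "name": "deploy",
    "description": "Deploy the application to production",
    "disable-model-invocation": True,
}
background_skill = {"name": "house-style", "user-invocable": False}  # hypothetical name
```

Here `who_can_invoke(deploy_skill)` leaves only the user able to trigger the deploy, which is the human-in-the-loop case described above.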

§ · Quotables

Lines you could clip.

11:11

"3,363 lines of Python just returns us 105 lines of output. There is no bloat, just answers to each of the steps in this process."

Concrete number that makes the efficiency case visceral → TikTok hook

16:00

"If I handed this to someone, would they know what to do with this?"

Simple heuristic that applies to any workflow documentation → IG reel cold open

24:50

"A bad skill is worse than no skill."

Contrarian, quotable, forces the value benchmark mindset → newsletter pull-quote

18:25

"You can get it to rewrite its own skill while it is learning. Your skills are evolving."

Unexpected framing: skills as living documents not static configs → Twitter/X thread opener

§ · Resources Mentioned

Things they pointed at.

28:25 · link · Anthropic skills GitHub repo

30:20 · link · skillsmp.com

35:00 · tool · Gamma API

20:50 · tool · HeyReach

10:30 · tool · Perplexity

10:30 · tool · Airtable

§ · CTA Breakdown

How they asked for the click.

39:04 next-video

"Check out the videos on the screen. I do have a community that just launched if you wanna check that out."

Soft, no hard sell. Community mention plus suggested videos. Minimal friction.

§ 04 · The Script

Word for word.

HOOK opening / re-engagementCTA the pitch analogy story

00:00HOOKClaude skills are making waves, but most people have no idea what they actually are. They think it's just another prompting trick, but it's really not. Skills are actually portable workflow packages that teach Claude your processes so you can write them once and run them forever.

00:12HOOKIn this video, I'll break down the complete Anthropic guide on how to write the best skills, when to use them, and how to test them properly. So let's get into it. So the first and most obvious question is: what are skills?

00:23HOOKYou can think of skills like an onboarding guide for Claude. Now Claude is really smart already, but it doesn't know your specific workflows. It doesn't know your tools, your standards, your business context, and things like that.

00:33HOOKSo a skill teaches Claude something once, and then it can use it over and over again for the exact same thing. And that's really powerful because if you can build a really good skill up front, that means that your repeatable process is gonna be good every single time it runs. But we need to dive much further into this because it actually solves a much larger problem.

00:50HOOKSo if we look over here at what skills are, they are persistent markdown files with instructions. And for those of you who don't know, markdown is just a file type. It's easily readable in plain English, so nothing overly complex here.

01:00These files load on demand, not every single session, which is really important because it's part of why this is such a smart solution. It's portable across Claude.ai, Claude Code, APIs, and things like that, and it's part of the Agent Skills open standard. So that's where all the providers are getting together, and they're all gonna be able to use these skills interchangeably so that they aren't locked into each other's ecosystems, which should obviously make things a lot less difficult.

01:22So the TLDR is that if you are doing the exact same process over and over again, you should probably turn that into a skill. But let's dive deeper into that first. So this brings us neatly onto progressive disclosure, which is something that you need to understand because it is the core innovation behind why we're doing all of this.

01:37So it's a three tier loading system that revolutionizes how Claude actually accesses information. So instead of reading everything up front, it loads it in specific pieces. If we look over here, the first thing that it does is load the front matter.

01:49So this is the name and the description of the skill only. We would use this every single time it loads into a session. The token cost here is only around 100, and this is really important because you can think of this like a catalog.

02:02Say you ask Claude to do something for you. It's going to go and search the front matter of your skills to see which one suits your request. We'll get into a practical demonstration of this very soon.

02:13The second part of progressive disclosure is the SKILL.md body, and this is where it loads the full instructions. So after it's decided, hey, this front matter over here seems to match what the user needs, we're then gonna go and load the full body, and it uses this when it matches that request.

02:27And the token cost here is usually in the hundreds if you've written something quite well. It can obviously be higher if you've written something terrible, but the point is that you can see here, through this progressive nature, we're no longer just loading everything up front. We first just load that little bit into every session.

02:40Then if it matches something, we load the full SKILL.md body, which has the full instructions of what the skill is gonna do. And then it also has any linked files that come along with this, where it loads any references, scripts, templates, things like that. And these, again, are only loaded if it needs them as part of a step in this workflow that it's doing.

02:57So the token cost here is zero until it's needed. Alrighty. So now that we understand one of the problems that it solves, the second thing that we need to look at is the skill structure and the organization.

03:06So every single skill has a directory with the SKILL.md as the entry point. So understanding what goes where is critical because you need to understand how to build this efficiently. So if we take a look at an example from my environment, it lives in your skills/ folder and then whatever the skill name is.

03:20In this case, we are looking at something called research lead, which is actually one of my workflows. So the main file over here is my SKILL.md, and this is the SOP. You always wanna try and keep this under 500 lines just for efficiency.

03:33The second thing that we have here are any scripts that you want. So as part of my workflow, I have some scripts that run, such as scrape_linkedin.py, research with Perplexity, analyze with OpenAI, whatever. I've got some scripts as part of this research lead workflow here.

03:48And the important thing to note here is that these are executed. They are not loaded into the context window. So again, you can see we have that separation over here, which doesn't bloat the context window of the AI.

04:00That is the whole purpose of having a structure like this because we want the probabilistic nature of AI, but we want the deterministic function of tools, which is scripts in this case. More importantly, we don't want to waste context on whatever it is that we're doing because then it leads to those problems that I mentioned earlier.

04:14So running these as scripts helps solve that problem because these are executed locally as Python scripts (Python is just a programming language), and the results are then sent back up to Claude when it needs them. So instead of 500 lines of trash, we now just have a 20-word result, whatever it is that these scripts do.
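A minimal sketch of "executed, not loaded": the script runs in a separate process and only its short output comes back. The one-line `SCRIPT` below is a hypothetical stand-in for a long file such as scrape_linkedin.py, and `run_step` is an illustrative helper, not part of any Claude tooling.

```python
import subprocess
import sys

# Hypothetical stand-in for a long script: it does its work locally
# and prints only a short result.
SCRIPT = "print('company: Acme; role: CTO')"


def run_step(script_source: str) -> str:
    """Execute a script in a subprocess and return only its stdout."""
    completed = subprocess.run(
        [sys.executable, "-c", script_source],
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout.strip()


result = run_step(SCRIPT)
```

However long the script source is, the caller only ever sees `result`, which is what keeps the context window small.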

04:31Then as part of this, you also have references, and you can think of this as a good place to stash your examples. So again, if we take my research lead as an example here, as part of that workflow, I would want to go out and DM a bunch of people. So the references section is where I would put in a few examples of what a good DM looks like.

04:47So as part of the skill, it knows to check the references folder and then say, hey, this is what good looks like. I'm gonna match this exactly. Because remember, AI is a pattern matching machine.

04:56So giving it examples is a really useful thing to do. And then you can also have assets, and these can be templates or other assets that the AI might need for the skill in order to achieve its goal. And again, the important thing to remember here is that everything is loaded only when it's needed, so we protect that precious context window.

05:12Then we need to take a look at writing the perfect description. Now remember, the description is what goes in the front matter that gets loaded into every single session. The reason this is so important, like I mentioned earlier, is that if Claude cannot see the skill, it's never gonna know how to use it or fire it, in which case it won't get used.

05:27So one of the first things you need to understand is use a max of 1,024 characters. You wanna keep it concise but comprehensive enough for accurate matching.

05:35Very important. Third person only. So process emails, not I process emails.

05:40It's the wrong POV, and that will totally break the discovery of the skill that you're writing. You need to make sure that you answer the what and the when. So what does it do and when should Claude use it?

05:49And then finally on this, just include three to five trigger phrases. So specific words or phrases that should activate the skill. So let's take a look at some bad examples over here.

05:58Helps with projects. That's far too vague, and with AI, you wanna be as specific as possible. The way I like to think about it is if you are speaking to another human, if you are giving them this little minimal instruction here, would they understand what it is that you're talking about?

06:11If not, you probably need to revisit this thing. If you don't know how to write this, it means you don't understand the problem well enough to describe it, in which case you need to go and address that first. But there are other types of bad examples.

06:21So I can help you process these emails, and you see here we've written it in the wrong POV, so it breaks the discovery because it's not in third person. And then finally, implements sophisticated data pipelines, and that's just consultant word salad. There's absolutely no reason to use overly complex language if it doesn't need it.

06:38Keep it simple. Always. As simple as possible, as specific as possible, so that anyone or anything that reads it would know exactly what it's for.

06:45Let's see what good examples look like. Manages Linear sprint planning, including task creation and status tracking. Use when the user mentions sprint, backlog, or tickets.

06:53Pretty straightforward. Processes Gmail inbox to identify high risk emails and deliver an executive briefing to Slack. Run with /email-digest.

07:02So you can see these are really good examples because they tell the system exactly what to do or when to use something. So it can pick that up as part of the front matter that we mentioned earlier. Because you have to remember, Claude is using semantic matching here, meaning that it's matching based on meaning.

07:15It's not using regex or something like that. So create ticket matches log some tasks, but exact trigger words still give it the highest hit rate. So always keep that in mind.
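The description rules from this section can be turned into a rough pre-flight check. The 1,024-character limit is from the video; the other checks (the first-person openers, the six-word minimum, looking for "when" or "run with") are crude assumptions chosen for this sketch and say nothing about how Claude actually matches skills.

```python
from typing import List

MAX_DESCRIPTION_CHARS = 1024
# Assumed list of openers that signal the wrong point of view.
FIRST_PERSON_OPENERS = ("i ", "i'm ", "i'll ", "we ", "my ")


def check_description(description: str) -> List[str]:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    text = description.strip()
    lowered = text.lower()
    if len(text) > MAX_DESCRIPTION_CHARS:
        problems.append("longer than 1024 characters")
    if lowered.startswith(FIRST_PERSON_OPENERS):
        problems.append("not written in third person")
    if len(text.split()) < 6:  # assumed threshold for "too vague"
        problems.append("too vague to say what the skill does")
    if "when" not in lowered and "run with" not in lowered:
        problems.append("does not say when to use the skill")
    return problems


bad_vague = check_description("Helps with projects.")
bad_pov = check_description("I can help you process these emails.")
good = check_description(
    "Manages Linear sprint planning, including task creation and status "
    "tracking. Use when the user mentions sprint, backlog, or tickets."
)
```

The two bad examples from the video fail for different reasons, and the good one returns an empty list.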

07:24Why don't we take a closer look at what SKILL.md looks like in the real world? So we have our structure over here. We have .claude and then /skills, and in this case, research lead.

07:34That's one of my skills. So our SKILL.md file, you can think of it like the SOP, and these are the step-by-step instructions that Claude is gonna follow in order to achieve its goal. So if we come on over here

07:45to my research lead skill and click on the SKILL.md, we can see some of the things we've been speaking about already. So up at the top over here, we have our front matter, the name. We have the description, transform a LinkedIn URL into a complete research package with personalized outreach,

08:00and a little bit more info there. Then underneath that, we have the model, and this is something that's really important to understand. Because what you can do with this instead of getting Opus down here to run everything in your environment or always use Opus for your skills, you can specify the model.

08:15So I could start the process by chatting in my little box at the bottom here, but then it can spawn it off as a subprocess with Sonnet so you save a lot of money because not all of your tasks need Opus. So that's something important to keep in mind.

08:29Then we also have allowed tools, and this is an important concept as well because you don't need to load everything into every single skill that you have running. You probably have a ton of MCP servers. You might have some scripts, stuff like that.

08:40The point is that locking this down to only use the thing that it needs is again far more efficient. In this case, for researching a lead, we have a bunch of scripts that live in this folder over here, and it uses these scripts in order to achieve our goal. So first, let's have a look a little bit more at what a goal looks like.

08:58Now one of the things that people get wrong is that they think skills just replace prompts, and that's kind of half true, but also not really because a skill is more like an instruction manual or SOP as I've mentioned earlier. And Anthropic says that you can write these in two ways.

09:13One, you can write it imperatively, which is where we're telling this thing what to do step by step. The other way is to take more of a prompt structured approach like we used to do.

09:22Both of those will work, but this one is the preferred method. And then they have a caveat that they say at the end of all of that, and it pretty much just reads that as long as the skill does exactly what you want it to do every single time, that's what you should be aiming for. So the structure is important, especially this front matter part, but what you do after that is mostly up to you to choose how it gets you to where you wanna go.

09:44For me, the goal format always makes the most sense because I'm coming off of a framework that I made myself as part of some of my other videos, which you can check out on the channel if you haven't seen them already. The point is that, to me, having it laid out in this way is not only clear for me, but also very clear for the AI to understand exactly what we're trying to achieve with the skill.

10:04We're just trying to achieve a goal here. In this case, transform a LinkedIn URL into a complete research package with personalized outreach content constrained to relevant personalization only.

10:14So "relevant" means it relates to a problem they're likely facing that we can solve, and I just give a little bit more information. And then over here, we have the inputs required. So I just give the AI a little bit of knowledge about what inputs might be required for this, and the orchestrator will then run through everything.

10:27So I list the execution steps as part of this pipeline. Step one, scrape LinkedIn profile. Step two, research company and person via Perplexity.

10:36Step three, run AI analysis, so on and so forth. So it's literally a list of how we achieve this goal, and you can think of it like nodes in an n8n workflow, to be honest. It's pretty much the same thing.

10:47But as part of these steps, I'm also giving it very specific instructions on what tools to use, and they relate to these over here. So my scripts are just these tools.

10:58And these will then go out and run locally, which is a very important distinction because we don't need to bloat the context. We don't need AI to do any of this work when we can use programming language to do it for free, and then just deliver us the results.

11:09So if we take a look at this, these scripts, this is just the execution layer. All of this gets executed locally, and it's not loaded, which is really important.

11:17So 3,363 lines of Python just returns us 105 lines of output with context that the AI actually needs.

11:26There is no bloat, just answers to each of the steps in this process in order to get its goal achieved. So that's a 97% reduction, which is really handy.
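The 97% figure follows directly from the two numbers quoted. Note that it compares lines of script with lines of output, which is a proxy for context saved rather than a token count.

```python
script_lines = 3363   # lines of Python executed locally (from the video)
output_lines = 105    # lines of output returned into Claude's context

reduction = 1 - output_lines / script_lines
print(f"{reduction:.1%}")  # prints 96.9%, which rounds to the quoted 97%
```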

11:34And again, this is just part of separating concerns. There are tons of frameworks behind this already. I have one that I'm busy turning into a version two alongside this, because this is way more efficient when combined with some of the other tips and tricks that I've been using in my other videos.

11:46But then we also have references and if we come over here, we can see one of the references I've got in here are output structures. So it's just JSON structures for each analysis type in the research lead pipeline. So this just gives the AI a little bit of structure around how I want things done.

12:00But then equally as part of this, I could also have the definition of what a good direct message looks like. Stuff like that. I would put it in there so that this can just go and match patterns when it needs to to deliver the same thing repeatedly and accurately every time.

12:14Then finally, we have assets and these are just templates or images and config files that might be part of the system. In this case, I don't really have any as part of my research lead system yet, but that's because it achieves it with everything else that I've already got in it. The point is though that not everything needs all of these.

12:27You don't need to fill them out. It's interchangeable. You might not use any scripts if there is MCP available and MCP makes sense to use.

12:34It would then use that as a tool. I use scripts because for this specific workflow, it makes complete sense to do everything locally. I don't need to use any tokens or anything like that for this specific workflow.

12:44But now that you have an understanding of what the structure looks like, we need to look at a few other things behind skills. Next up, we need to look at invocation control, and this is important to understand because it basically decides who should be able to trigger this specific skill. So first up, we have default behavior.

12:58So you can invoke it? Yes. Claude can invoke it?

13:01Yes. You use this for most of your skills. Let Claude find them and use them easily.

13:05That's the default behavior and for the most part, most people are gonna wanna be using that kind of thing. Next up, we have disable-model-invocation: true, and this means that you can invoke it, but Claude cannot invoke it. You'd wanna use this for destructive operations or something where you want a human in the loop.

13:20Because remember, you don't just want the system going out there and YOLOing every single thing, especially if it needs your approval to get in front of other people, send something to someone, or deal with sensitive information. Those are a few use cases. But then there's also user-invocable: false, and that means that you can't invoke it, but Claude can.

13:37There are a few use cases when you'd want to have this. Me, personally, I would never have that, but it's mostly useful for background knowledge that Claude should know but that users can't directly trigger. So if we have a look at what that might look like in our front matter, we would have name: deploy, description: Deploy the application to production, disable-model-invocation:

13:54true. And in that sense, Claude wouldn't be able to invoke it on its own, because you don't want Claude just YOLOing things into production, but the user can say yes. So as part of a workflow, that's probably something I would want to set especially if I have things that are starting to travel into production.

14:07We need that human review before we go in there. And then I also wanna look at model routing and sub agents. If you remember earlier when I touched on the skills, I told you about being able to run Sonnet.

14:17So for my research lead, we kick it off with whatever I'm talking to in that chat box. Usually, that's Opus. Then it will spawn our Sonnet model, which is much cheaper to run a specific workflow.

14:26The context here is set to fork, which is an isolated subprocess, and then we just give it our allowed tools. So we have the user asking, it reads the front matter, it matches the skill, it then spawns Sonnet, and it executes and returns whatever I need from my research lead process.
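To make the routing fields concrete, here is a small sketch that renders a frontmatter block from a dict. The keys `model`, `context`, and `allowed-tools` are the ones named in the video; the `render_frontmatter` helper is illustrative, and the specific tool list is an assumption because the video does not spell it out.

```python
def render_frontmatter(fields: dict) -> str:
    """Render simple key/value pairs as a frontmatter block."""
    lines = ["---"]
    for key, value in fields.items():
        if isinstance(value, bool):
            value = "true" if value else "false"
        lines.append(f"{key}: {value}")
    lines.append("---")
    return "\n".join(lines)


research_lead = {
    "name": "research-lead",
    "description": (
        "Transform a LinkedIn URL into a complete research package "
        "with personalized outreach"
    ),
    "model": "sonnet",              # cheaper model executes the workflow
    "context": "fork",              # run as an isolated subprocess
    "allowed-tools": "Bash, Read",  # assumed tool list, for illustration only
}

frontmatter = render_frontmatter(research_lead)
```

This helper only handles plain strings and booleans; a value containing a colon would need quoting in real YAML.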

14:43Pretty straightforward and obviously it's up to you to decide or work with the AI in order to decide which model makes sense to use for what. You can use Haiku for the things that are far less complex. Maybe you wanna keep using Opus for the things that need a lot of judgment and a lot of reasoning behind them.

14:58Point is you can toggle between all of those and you can play with them as part of your workflows. Then I just wanna touch on the SKILL.md body again. So we had a look at how I structure mine already using the imperative form as an SOP,

15:09but you need to understand what actually makes this good. So exact commands with full paths, expected output format for each step, dependencies listed explicitly,

15:19things like that. Claude knows exactly what to run and what to expect back. It's really helpful when we give it examples or when we're being very specific because that leaves no room for ambiguity.

15:28So like I've got over here, step one, do this. Use this thing in order to achieve that. This is what the output would look like.

15:35Step two, exact same thing over and over again for the specific workflow that I'm doing here. But what makes an instruction bad is if you say something like research the lead and find out about their company, then write some DMs. That is insanely vague and you could get 100 different results based on that.

15:50So doing something like this is not gonna get you where you wanna go, especially as a repeated process. This wouldn't work if you gave it to a human. If you gave these instructions to a human, they would go and do something, but it's probably not gonna be the thing that you actually wanted them to do.

16:03So again, when you are writing these things or getting AI to write them for you, be very specific and think, if I handed this to someone, would they know what to do with this? If we look at this over here, if I handed this thing to someone, they would understand exactly what is happening here and exactly what I would want them to do.

16:19So that's the way that you wanna look at writing your skill bodies. Then we have degrees of freedom, and this is essentially just telling Claude the level of freedom that it has. So from loose guidance to the exact commands that we have.

16:30So high freedom means that multiple approaches are valid. Context determines the best one. An example of this would be review the code, check for bugs, edge cases, readability, things like that.

16:40So it has high freedom to do that. For medium freedom, that's where a preferred pattern exists, but some variation is okay. So an example of this would be generate a report using this template, customize sections as needed.

16:53And then low freedom, these are operations that are fragile where consistency is critical. So a specific sequence must be followed every single time in the exact same way. And that for me is where mostly I use my Python scripts.

17:05So I say to the AI as part of the skill body, run python3 scripts/migrate.py, whatever. Do not modify the command. I'm giving it very specific instructions about what to run here.

17:15It doesn't really have a lot of freedom or choice in the matter of how it does this task because I'm being very specific. And that's really important for business logic and building systems. You don't want to give this thing high freedom if it needs to achieve a very specific command.

17:28That makes absolutely no sense. So you need to factor that into your skills as well. How much freedom do you wanna give this thing as part of what it's trying to do.

17:35One thing I like to do is to also have feedback loops. So generate, validate, fix, repeat. AI does that automatically, but you can bake these into your skills and you can give them an extra nudge just to put any of your own custom rules in there as part of this generate, validate, fix iteration loop that it runs through.

17:50So if we look at an example from my research lead, we have validate the output. So we run python3 scripts/validate.py output.json. If validation fails, read the error message carefully, fix the specific issues, and run validation again.

18:04Only proceed when validation returns valid: true. Maximum three rounds; if still failing, report to user. So, again, you can customize this to whatever the hell it is that you want, but feedback loops are very important because it learns every single time that it runs through this iteration,

18:18and you can also get it to update some of your other SOPs or your CLAUDE.md for anything specific that it might learn. The point here is that your skills are evolving.

18:27The goals mostly stay the same, but when you're testing and you're iterating through your initial builds, that is when you are learning and evolving the system so you can get it to rewrite its own skill while it's learning. The point here is that when we go through this little loop over here of generate output, validate with script, fix issues, repeat until valid, we're trying to get to a very good definition of done, something that matches the quality that we need for the system we are building.

18:51So then next up, we obviously need to talk about MCP because that forms part of it. In your environment, they're generally gonna be like two types of tools. One of them being MCP, which then has a whole world of tools.

19:01But then we also already mentioned scripts, which in this case are also tools. Point is that you don't want to use MCP for every single thing out there just because it exists. If we need to do something with Supabase or Vercel or something like that, Notion, whatever, there are tons of times where using MCP makes sense.

19:19But there are also times where using it is just gonna add latency or context bloat and things like that into your system, and that's where using something like a standard Python script makes much more sense. So I would say use scripts specifically if you have a fixed pipeline, the same steps every single time, something repeatable that you need to do over and over again in the exact same way that doesn't require any form of AI judgment.

19:40You're gonna wanna use a script for that because then there is absolutely no token overhead like I mentioned earlier. It's literally just the script running locally, doing its thing, whatever it needs to do. And then I would use MCP when I want some judgment behind it or if the AI needs to interact with something specifically on my behalf, that's when you would wanna use it.

19:57When you have those extended services like Vercel and Supabase, when it needs to go out into the wild and reach into some of these fancy tools and do work on your behalf, MCP makes sense for that. For me, I'm always gonna default to using scripts for deterministic stuff, and I will get MCP involved when I need it to do work on my behalf with those external systems.

20:14Then we need to take a look at a few design patterns that Anthropic put forward. So the first one is sequential, and this is where steps run in a strict order. This is like that n8n node flow that you can think of.

20:25And that's what I've done with my research lead because everything can be done locally. I don't really need MCP or anything like that. It's mostly just using Python scripts here along with a couple of APIs into other systems.

20:35But the point is everything runs sequentially. So we research the lead, we scrape a profile, we research that, we analyze everything that we've done, and then we write the DMs, we store them, and then send them using HeyReach. So that is a very sequential thing, and most business operations function that way.
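
The sequential pattern can be sketched as a list of step functions run in a fixed order, each one receiving the state produced by the step before it. This is an illustrative sketch only: the step names loosely mirror the research-lead flow described above, but the function bodies are hypothetical placeholders and no real scraping or HeyReach call is made.

```python
from typing import Callable

Step = Callable[[dict], dict]


def research_lead(state: dict) -> dict:
    # Placeholder: a real step would call a research script or an API.
    return {**state, "research": f"notes on {state['lead']}"}


def analyze(state: dict) -> dict:
    # Placeholder: depends on the output of the previous step.
    return {**state, "analysis": f"summary of {state['research']}"}


def write_dm(state: dict) -> dict:
    # Placeholder: a real step would draft the message from the analysis.
    return {**state, "dm": f"Hi {state['lead']}, ..."}


PIPELINE: list[Step] = [research_lead, analyze, write_dm]


def run_pipeline(state: dict, steps: list[Step]) -> dict:
    """Run each step in strict order and record which steps completed."""
    for step in steps:
        state = step(state)
        state.setdefault("completed", []).append(step.__name__)
    return state


if __name__ == "__main__":
    result = run_pipeline({"lead": "Ada"}, PIPELINE)
    print(result["completed"])
```

Because each step reads keys written by the previous one, reordering the list fails loudly with a `KeyError`, which is the behaviour you want from a strict-order pipeline.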

20:53We want the steps to run in this specific order. Then the other one is iterative refinement where the output quality needs validate-fix-repeat loops. So an example here would be for building a website.

21:03So generate HTML, validate, fix, revalidate until we get to whatever our definition of done looks like. The third one is multi MCP coordination, and that's where the workflow spans multiple external services.

21:16So an example here, if we were building an app, we might use Supabase for the database. We would use Vercel for deploying it. We might use Slack to notify a bunch of people.

21:24We might use Firecrawl to do web scraping as part of that. Whole bunch of different things, but they're all running via MCP and the AI is coordinating all of that stuff for you. If you haven't seen my ChatGPT versus Claude video, it's on the screen now.

21:37And I use this exact sort of thing in there. We use MCP for everything. Then the fourth one is context aware branching.

21:43And this is where you have the same input with different execution paths. So an example of this would be to process a file. If it's a CSV, take this data path.

21:52If it's a PDF, go down that path. Very straightforward. It's just context aware in the way that the process will run based on its initial point.
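
Context-aware branching comes down to picking an execution path from a property of the input. A minimal sketch, assuming the branch is chosen by file extension; the handler names and return strings are hypothetical and stand in for whatever each path would really do.

```python
from pathlib import Path


def process_csv(path: Path) -> str:
    # Placeholder for the CSV data path.
    return f"csv path taken for {path.name}"


def process_pdf(path: Path) -> str:
    # Placeholder for the PDF data path.
    return f"pdf path taken for {path.name}"


# Same entry point, different execution path per file type.
HANDLERS = {".csv": process_csv, ".pdf": process_pdf}


def process_file(filename: str) -> str:
    path = Path(filename)
    handler = HANDLERS.get(path.suffix.lower())
    if handler is None:
        raise ValueError(f"no execution path for file type {path.suffix!r}")
    return handler(path)


if __name__ == "__main__":
    print(process_file("leads.csv"))
    print(process_file("report.pdf"))
```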

21:59Then pattern five, we have domain specific intelligence. And that's where we have embedded business rules or compliance, audit trails, things like that. So an example as part of my research lead workflow that I've got, I have a relevance filter and it discards anything that looks like theater.

22:14So if it finds someone's LinkedIn profile that talks a bunch of stuff that isn't really relevant, it filters all of that out. It only focuses on keeping things that will help write its DM effectively

22:24because I've baked in some of my own domain specific intelligence when it comes to sales and AI and things like that. So I've told the AI what to look for and it is now intelligent in the way that it does that.
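
A relevance filter like the one described can be as simple as a rule set the skill applies before writing anything. The sketch below is an assumption about how such a filter might look: the keyword list is hypothetical and is not the rule set used in the video.

```python
# Hypothetical rule set standing in for the speaker's own domain knowledge.
RELEVANT_TERMS = {"sales", "ai", "automation", "outreach"}


def is_relevant(snippet: str) -> bool:
    """Keep a snippet only if it mentions at least one relevant term."""
    words = {word.strip(".,!?").lower() for word in snippet.split()}
    return bool(words & RELEVANT_TERMS)


def filter_profile(snippets: list[str]) -> list[str]:
    """Discard profile snippets that would not help write the DM."""
    return [snippet for snippet in snippets if is_relevant(snippet)]


if __name__ == "__main__":
    profile = ["Leads AI automation at Acme.", "Loves hiking on weekends."]
    print(filter_profile(profile))
```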

22:35So these five design patterns, they're probably the most common things. And realistically, for most of the work that you're gonna be doing for clients, it could include many of these different things, if not all of them, as part of their system. Just keep in mind something on MCP again.

22:48Only use it when you absolutely need to use it for your client systems, because even though they kinda solved the token bloat problem with lazy loading, it still exists to some extent, especially if you're building a massive system. But then, obviously, we need to test the skills that we're writing.

23:00We can't just go out there and do a bunch of stuff and expect it to work without running any tests. They need to be performed in this order because each of them reveals a different aspect of the quality or their effectiveness behind them. So test one is the trigger test.

23:13You open a brand new session, not an existing one because there's context bias. Obviously, if you've been building the skill with the agent inside there and you had to ask it a question, it already has some context awareness to it.

23:24So the best thing to do is to use natural language inside a new window. So if I flip back here, say I was using this one to build me my initial skill, I would just come over here, open up a new window and I would say, hey, let's go and run lead research. Now as part of that, we're testing whether this thing can semantically match what I've written in my front matter,

23:43which is that little sentence up top that Claude uses to identify a skill. And we're doing that in a new window because this has no context awareness of anything that I've spoken about with regards to research lead up until now. Whereas if I did it in the window in which I built that skill, it obviously has all the context as part of that conversation up to the point where it might have cut something off.

24:02But the point being that context is still gonna be in there, which is why we want to use a new window to see firstly if it can actually trigger the workflow which validates our front matter is working as it should. So if we have under triggering, that means that the skill never fires. So we need to broaden the description in our front matter describing what this thing is or does.

24:22But then there's also over triggering where it fires for everything, and you need to narrow the description. So, again, it's about finding the right balance when you're writing your front matter. Test two is the functional test, and that's pretty straightforward.

24:33We're just running the skill four to five times with different inputs. Does behavior stay consistent when we're doing that? Try it with sub agents as well.

24:40That's very important to do. Again, if we're not just using our main Opus agent and we switch the model to Sonnet, get it to do that and see how it responds. And then the third part of testing is the value benchmark.

24:51The hardest question here is, is this skill actually helping? So compare Claude's output with the skill versus without it. If the skill doesn't improve the consistency, the quality, or the speed of whatever you're doing, it probably doesn't need to be a skill.

25:04Again, it should be something that is repeatable, but it needs to be adding that value to you. So if it's not making your life faster or more efficient, does it really need to be a skill or do you need to readdress whatever it is that you're looking at? And then finally on this, we need to look at what Anthropic recommends for iterating with Claude. This is part of their A/B building and testing.

25:21So they recommend that you use two instances working in tandem, and it makes complete sense as you'll soon find out. So Claude A is the designer. It helps you write and refine the SKILL.md.

25:31It understands the agent instructions, and it works with your domain expertise to craft the effective skills. Claude B is the tester.

25:39So you use the skill in a fresh session like we just discussed on real tasks, and this reveals where the instructions might be too vague or incomplete and things like that. So the workflow for this might be complete a task with Claude A. Use normal prompting to accomplish a workflow.

25:53Notice repeated context. So you identify what instructions you provide repeatedly, and that's probably gonna be your skill.

25:59If you have a workflow that you do over and over again, that can also be a skill. Things like that. Then for step three, just ask Claude A to create a skill.

26:07So again, if we're using my research lead analogy over here, this was something that I obviously did over and over again to find leads. So initially, it started as an n8n workflow, then I moved it into an agentic format using my own gotcha framework.

26:21And now I'm bringing it into skills because as you've seen, they're far more efficient specifically for the context loading. But the point is that I went through my process step by step. How do I research leads?

26:32How do I go out there and find them? And then I've built an automation around that. And in that same sense, we're now just doing that in a skill format.

26:40So I would come in here and I would say, this is my typical working model. I go into this system. I take this action.

26:47I do a bunch of these steps. And I would talk to Claude about it, describing my entire workflow.

26:53And then at the end, just say, hey. Let's turn this into a skill. And that's what we're doing over here.

26:57If you had walked through something with it or you had been working together, you can just turn that into a skill as well. Tons of ways that you can approach this. Then obviously, we need to review it.

27:06We wanna make sure there's no bloat. We wanna make sure that as a first part of the process that it is actually functioning and running at a basic level. And we do that by running tests with Claude B.

27:15So fresh session, similar tasks, observe the behavior, see what it does. Then step six is to bring failures back to Claude A.

27:22For example, Claude B forgot to filter test accounts. So we're feeding Claude A more information about where it went wrong when it initially built our skill. You just go back and forward like that until you get the system built.

27:34It's refining with two agents, and it's probably the most efficient thing you can do right now, especially if you already have the context that you can feed agent A upfront. And at this point, I should probably give you some pointers on where you can get the skill creator and embed it in your own environments so that you can make this entire process a lot easier because Anthropic already thought about this for you.

27:51Alrighty. So let's have a very quick overview of how all of these pieces fit together from a practical perspective and how we can use Claude to actually help us arrange and organize all of these things. Now for the most part, I'm gonna be using an IDE, which is just this interface you see here.

28:04In order to set all of this up, you can do it in the CLI and again, this is an open framework, so it will probably work with most of the applications that you're using nowadays as well. So I'm using VS Code as my IDE, but again, any IDE will do just fine. So one thing that you could do to get everything set up for you is literally just come down here to Claude, ask it to research skills, and then get it to set up your environment

28:25according to what Anthropic says is best practice. That's one way, and then this thing would just go ahead and do it. I'm not gonna take that approach because I use this system along with my CLAUDE.md that I'm busy migrating to a version two to attach with my framework.

28:38So I use this a little bit differently in that sense because I've got other moving parts that form part of my skills. But for the most part, if you wanted a shortcut, just ask Claude to set this up for you. If you wanted to do it manually, everything lives in this .claude folder over here.

28:51This is currently at a project level. And you see we've got our SKILL.md, which is our skill creator, its scripts, its references. So if you wanna get this meta skill creator that creates all of the skills for you,

29:04there are multiple ways you can do that. One, you could come on over to your web browser and you can have a look at this main GitHub repo over here.

29:11This is from Anthropic and you can connect this directly to the IDE that we're currently working in or you could just reference this and again give this to your agent down here and say, hey, can you pull out the skill creator from here and add it as a skill to our environment? And it would do that. But probably one of the easiest things for you to do is to come down here, do forward slash plugins,

29:31go over to the marketplace tab and that repo link, just copy and paste it into here. And you'll see you'll then have it added down here.

29:40In which case under plugins, your skills become browsable from whichever repo you paste in there. So you can also search for them if you don't feel like scrolling through this entire thing. But if you scroll down, you will see a few skills in here including the skill creator.

29:54So this is the official one from Anthropic. And then once that's installed, it will be installed over here and you'll be able to use it and just come to this thing and say, hey, build me a skill based on this workflow, whatever it is you're working on. And it would then go ahead and build it based on the best practices outlined in this SKILL.md over here.

30:10So this is the meta skill creator. Very handy. But then there's already tons and tons of existing skills out there from people who have done these workloads for you already over and over again and they're already tested.

30:21We can find those in the marketplace over here. So if you check out skillsmp.com, you'll be able to find an insane amount of agent skills.

30:30You saw you can always come down here and try and find whatever it is that you're looking for. They give you a few examples. So skills about trading, data analysis related skills.

30:38Let's have a look for sales and marketing. See what they've got for that. So you can see here there are a bunch of skills for this already.

30:45And then all we would have to do is click here and this is a little markdown structure that you can see. So again, there's our name, there's the front matter description that we spoke about in the video, lead research assistant, this skill helps you identify and qualify potential leads for a business by analyzing product and service.

31:02So there are tons of things you could do from here. You could just copy and paste this back into your chat window. Say, hey, turn this into a skill and it will go and do it for you.

31:09Or there are other options on the right hand side here depending on what you're using. You can see it's got wget which means it would just download it. You can also use npx which is pretty handy if you didn't want to rebuild this thing yourself.

31:20Point is, this is where you come to get existing skills. There are a ton of other places where you can get skills as well. But for me, I prefer to build my own because like I said, I work in a very specific way.

31:31So now that I've got the very basic stuff out of the way, I'm gonna jump into my environment and I'm gonna show you how I would build skills and how it works within my environment. Okay. And we're in my environment now.

31:41So let's pretend that you had a slide generator and you wanted to turn it into a repeatable workflow because you've nailed exactly how you want your slides done. Now, I've already got something like that in here.

31:51I've got under my goals folder, I've got Gamma slides. You don't need to understand what goals is. It's just my framework that I was using historically before I moved across to skills.

32:00So I'm just using this because it already, as you can see, is a repeatable workflow. I was already doing the exact same thing. Just inefficient because now that skills are so granular, you can actually do this type of thing a lot more efficiently, which is why I'm moving my framework across to half skills and then half of some of the other stuff that I've been doing in my videos if you've been watching me for a while now.

32:19So anyway, what we're gonna do is we're going to get our agents to turn these gamma slides into a skill. Shouldn't be that hard because we've already got the format nailed over here. We were already doing that kind of thing.

32:30But for the most part, it's going to use our skill creator to put it into the right format. So all I'm gonna do is I'm gonna come over here and I'm gonna tell this thing to do it.

32:40I want to create a new skill based on one of our goals for Gamma slides. So can you take Gamma slides

32:48and turn them into the new SKILL.md format? Place it in the gotcha v two folder. So this thing's now going to try and figure out what it is that I just said to it.

32:57It's probably gonna read our goals folder. In your case, if you were writing a new skill, you might not already have something as nailed down as this. But again, you would understand the workflow that you've been building or the idea that you want to have.

33:09And then you would start to articulate that to the AI and work back and forth with it in order to build some kind of repeatable workflow or an application, a system, whatever it is that you're building. And ultimately, you would get to a point where you would have something structured like a goal like this. I'm just gonna bypass permissions.

33:28Specifically, what I do with my framework is my CLAUDE.md is tailored all around this. So it understands exactly how to build these goals and how to separate them with tools and things like that.

33:37I don't wanna go into that in this video. I have deep dives in other videos on my channel. You can check one on the screen now.

33:42Point is though, I'm gonna be keeping that exact same structure just migrating it across to skills and I'll put out a new video and deep dive into how all of that will go together. So I'm not gonna go deep into that in this video. Okay.

33:53So this thing is allegedly done. Here's what I created, the Gamma Slide skill. So if we go on over to Gotcha v two, now yours would probably be living up here.

34:02Mine is just like this because I'm using this as a test environment while I migrate frameworks. But if we come down here, you can see it used the skill creator to create our Gamma slides. So if we have a look at SKILL.md,

34:13the name is Gamma slides. The description is to generate presentation slide decks from markdown content using the Gamma API. Used when the user says create slides, make presentation, build a deck, generate a slide deck, so on and so forth.

34:26It's given it the Sonnet model to do that and it wants to use it as a sub process which is very cool. And then it's limited its tools as well because we only need one tool in order to create that and that is probably gonna be a script that connects to the Gamma API itself and then pushes in some of the parameters that we want.
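
To make the frontmatter just described concrete, here is a small Python sketch that assembles it and enforces the 1024-character description limit. The field values paraphrase what is read out on screen. The hyphenated name `gamma-slides`, the `context: fork` line (taken from the earlier model-routing chapter, where this section only says "sub process") and the `allowed-tools: Bash` value are assumptions, since the video does not show them verbatim here.

```python
MAX_DESCRIPTION_CHARS = 1024  # description limit quoted in the video

# Values paraphrase the walkthrough; see the lead-in for which are assumptions.
GAMMA_SLIDES = {
    "name": "gamma-slides",
    "description": (
        "Generates presentation slide decks from markdown content using the "
        "Gamma API. Use when the user says create slides, make presentation, "
        "build a deck, or generate a slide deck."
    ),
    "model": "sonnet",
    "context": "fork",
    "allowed-tools": "Bash",  # assumption: the video only says "one tool"
}


def render_frontmatter(fields: dict) -> str:
    """Render simple key/value pairs as a frontmatter block.

    Plain string formatting only: values containing ': ' or other YAML
    syntax would need quoting, which this sketch does not attempt.
    """
    if len(fields["description"]) > MAX_DESCRIPTION_CHARS:
        raise ValueError("description exceeds 1024 characters")
    body = [f"{key}: {value}" for key, value in fields.items()]
    return "\n".join(["---", *body, "---"])


if __name__ == "__main__":
    print(render_frontmatter(GAMMA_SLIDES))
```

The description carries both the what (generates decks via the Gamma API) and the when (the trigger phrases), which is what a fresh session has to match against.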

34:42So we can have a look over here. The objective is to generate a presentation deck blah blah blah. The inputs required, so we need to have content or a content idea, a title, the slide count,

34:54and then it lists that we need the Gamma API key which lives in my .env file down here. And then all it does is create the execution steps. So prepare markdown content

35:04which creates the clean file that we create locally here. Now we create it locally so I can do editing here before I push it up to Gamma and then use Gamma AI to create it. Otherwise, we would burn a ton of credits between two systems and that's just silly.

35:16So we get things working really nicely in this environment, which is cheaper, and then push it up to Gamma, and Gamma's AI will make it for us. So I'm not gonna go through this whole file, but you get the point. It is a step by step framework on exactly how to do it.

35:29We tell it exactly what it needs, the text amount, whether we're using images or no images and things like that. It all gets baked into this skill or standard operating procedure. And then in order to help us achieve this goal,

35:42we have our scripts which is how we connect to the environment and this is just the tool that it uses. It's the Gamma API. You can see down here, this is just a Python script.

35:51It costs nothing to run. It doesn't need any context. It does all of the work.

35:55Don't need to use MCP for this because the AI inside Gamma will already take care of all of the building for us. So we don't need our subscription to do any of that lifting. No need for MCP.

36:05And it really is that easy, guys. Now, obviously, you would be working with the AI if you wanted to. I imagine if you started with a fresh environment, you wouldn't have any of this stuff.

36:13But the point is you could just come down here and articulate your thoughts or your problems with the AI and work through whatever it is that you're trying to turn into a skill. It's basically just understanding your problem well enough and then having solved it manually and then coming in here and trying to refine that into a system.

36:27And Claude will go ahead and build that for you, especially if you're using my framework or someone else's framework. It can help you do that even better and then you attach it to your skills like we did over here. But then obviously, all we've done here is create it.

36:39So we wouldn't want to test it with the exact same agent. So we would open up another one and I would just come in here and I would say to this thing, use the

36:49Gamma skill to create me a slide deck on the importance of salt. And then it's gonna go ahead and it's gonna do that. I'm gonna bypass permissions.

37:00Let me understand how the Gamma skill works by reading the relevant files. So you can see there, it's reading our skills, Gamma slides, figuring out how things work.

37:09Now I'll create the slide content and generate the deck. Let me first ensure the temp directory exists which it probably does. After it does that, it's gonna go and create our local markdown files.

37:18After that, it's then just gonna use the Gamma API, send it up to Gamma, and Gamma is gonna generate our slide deck, and we'll see how that thing turns out in just a few seconds. But before we do that, we should understand that obviously, if we were building something more complex, there would be back and forth between this one which is Claude A and this one which is Claude B, and if you remember that's what we were speaking about when we're building and testing these things.

37:39We want to have two agents talking. Now, this thing immediately found out what I was talking about. So that means that we nailed our front matter.

37:45That means that that little section at the top was very clear and that a brand new agent can pick that thing up, which is the most important part. And then it was just able to go and do exactly what it needs to do. It wrote our Gamma content.

37:57Content is ready. Now let me generate the presentation via the Gamma API.

38:01And so you can see how it's using the skill and in this case, the tool, which is a script in order to achieve its goal. We could extend this out even further. So if you want it to be better practice, I would provide it a definition of what good looks like, and that would live in the references folder, which it didn't create here because I didn't care enough to add it for this example.

38:20But the point is that I would add references in here, and then I would put in two or three MD files showing it exactly what a definition of good looks like for my slide structure so that I would always get that structure every single time. We're just matching patterns here. But I think it still did a pretty good job.

38:36Let's have a look at what it turned out. Look at this, the importance of salt. From ancient civilizations to modern industry, salt has shaped human history, biology, and culture in profound ways.

38:47This essential mineral touches every aspect of our lives. You get the point. This is the exact thing that you saw earlier.

38:53This is the same color theme and all of that that I use for every single one of my slides. So all of this is nailed from a functional perspective. So we have officially created a repeatable

39:03system. And anytime I wanna come into my agent, all I need to do as you've now seen is come down here and speak to it. And that's the whole point of skills to make all of these repeatable systems much easier to use for us.

39:13So hopefully, I've demystified it just a little bit. I will have another deep dive into this when I launch v two of my framework that uses skills as part of it, and we'll cover more things in there as well including plugins. So if you have any comments, leave them below.

39:26I do have a community that just launched, if you guys wanna check that out, it is open now. Otherwise, check out the videos on the screen. They're definitely gonna help you on your journey.

39:33Thanks very much for watching. See you on the next one.

— full transcript

§ 05 · For Joe

The system behind repeatable Claude workflows.

WHAT TO LEARN

A skill is not a better prompt. It is a workflow contract that executes consistently by keeping execution outside the context window.

Progressive disclosure means the model loads only what it needs at each step: the frontmatter every session, the full SOP on match, and scripts only during execution, so context stays lean by design.
The frontmatter description is the single most important thing to write: it must answer what the skill does and when to use it, in third person, in under 1024 characters, with three to five trigger phrases.
Scripts and external tools should be executed, not loaded into context. A 3363-line Python execution layer can return 105 lines of useful output, a 97% reduction in what the model actually has to process.
Match the degree of freedom to the fragility of the task: judgment work gets high freedom with multiple valid approaches; business logic that must run identically every time gets exact commands with no variation allowed.
Baking a validate-fix-repeat loop directly into the skill body turns probabilistic output into a quality gate. The skill catches its own errors before surfacing results, with a maximum of three rounds before escalating to the user.
The trigger test must always use a brand-new session: testing in the same window where you built the skill produces false positives because the context already contains everything the frontmatter was supposed to surface on its own.
A skill that does not demonstrably improve consistency, quality, or speed over raw model output should not ship. Comparing with-skill and without-skill output is a required step before treating any skill as production-ready.
The A/B iteration method, where one instance designs and a separate fresh instance tests, is presented as probably the most efficient way to find gaps in skill instructions that are invisible from inside the build context.
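
As a quick check on the 97% figure in the list above, the reduction follows directly from the two line counts quoted there. The variable names are illustrative only.

```python
script_lines = 3363   # size of the Python execution layer quoted above
output_lines = 105    # lines of output the model actually has to read

# Share of lines the model never has to load into context.
reduction = 1 - output_lines / script_lines

print(f"{reduction:.1%}")
```

Rounded to the nearest whole percent, that is the 97% quoted.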

§ 06 · Frame Gallery

Visual moments.

01:45

06:10

15:17

22:38

24:07

34:58