WEBVTT

00:00:00.160 --> 00:00:10.320
Andrej Karpathy, the former head of AI at Tesla, just went viral for this post titled LLM Knowledge Bases. And that's because he shared the secret to 10x-ing your output with Claude Code. But unfortunately,

00:00:10.320 --> 00:00:12.720
a lot of what he says sounds complicated

00:00:12.455 --> 00:00:47.135
when in reality, it's actually pretty simple. So I'm gonna break down and simplify the three key strategies Karpathy uses, review how each one works, and give you actionable advice you can apply today to 10x your Claude Code projects. Strategy number one is LLM knowledge bases. Right now, most people use AI like a search engine. You ask a question, get an answer, close the window. Tomorrow, you start from scratch. Nothing compounds. Karpathy nailed the problem in one line: "The LLM is rediscovering knowledge from scratch on every question. There's no accumulation." His fix: have Claude build and maintain a knowledge base for you. He calls it a wiki. Think of this like a personal encyclopedia

00:00:47.135 --> 00:00:51.775
except Claude writes every page, keeps it organized, and updates it automatically

00:00:51.775 --> 00:01:09.015
when you add new stuff. There's no database, there's no infrastructure, just folders on your computer. Something that my mom could set up. And his system has three layers. In the demo that I go through later in this video, I'll share a prompt that you can copy and paste to set this all up. But it's important you understand the concepts first. Layer one is your raw resources.

00:01:09.095 --> 00:01:28.170
This is a folder where you drop in articles, transcripts, notes, PDFs, whatever training data could be helpful for your project. Think of this like a data dump. Claude can read from it, but never changes it because this serves as the source of truth. Layer two is the knowledge base, the wiki. This is where Claude organizes everything for you. Summaries, concepts, breakdowns, comparisons,

00:01:28.330 --> 00:01:30.090
profiles on people or tools,

00:01:30.410 --> 00:01:56.250
all cross-referenced to the raw knowledge. And layer three is the schema. This is an instruction file that tells Claude how the knowledge base should be structured, what conventions to follow, and what to do when you add a new source. You can also tell Claude to do a health check, basically auditing the whole thing for contradictions, stale info, and gaps. Think of this like the librarian of the whole system. Sounds complex, but let me break down a simple example. Let's say you have raw transcripts from five podcasts of Karpathy talking about AI best practices.

00:01:56.330 --> 00:02:02.915
What you would do is upload that into the raw data folder, then a wiki page would be created about Andrej Karpathy

00:02:02.915 --> 00:02:05.715
that would clearly reference these five transcripts

00:02:05.715 --> 00:02:16.115
as well as the topics that are covered there. So Claude would then look at the wiki and know where to look for specific information in the raw folder. That way it doesn't have to look through all five of these raw transcripts.

00:02:16.390 --> 00:02:21.670
Instead, it can be more precise. You're creating a web of information that makes Claude's life easier,

00:02:21.830 --> 00:03:22.370
in turn making the output that much better. And the reason that this can work long term, directly from Karpathy: "Humans abandon wikis because the maintenance burden grows faster than the value. LLMs don't get bored." The best part with this setup is it creates a foundation you can build on top of. So that's strategy one. You build the knowledge base, it compounds over time. But what if Claude could improve things without you thinking about it? That's strategy two, which is auto research. Karpathy open sourced a project called autoresearch. What he did was he had a small AI model he was training. Instead of manually tuning the code to make it better, he pointed an AI agent at it and said, find ways to improve this. No other guidance, just a goal and a way to measure it. And in this project, he was able to create an auto research loop. This is a loop where you propose a solution, you test a solution, evaluate it, keep or discard it, and then repeat. So the agent does exactly this. It proposes the change, it runs the training, it measures if it improved, keeps it or throws it out, and then proposes the next change over and over again. When Karpathy used this tool to test performance, it found 20 improvements that stacked up to about an 11%
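
NOTE
A minimal sketch of the propose, test, evaluate, keep-or-discard loop described in the cue above. This is not Karpathy's actual code. The names auto_research_loop, toy_measure, and toy_propose are hypothetical, and the "experiment" is a toy: one tunable number and a score that peaks at 3.0. In a real setup, propose would be an agent editing code and measure would be a training run that returns a metric.
```python
import random
from typing import Callable, Tuple
def auto_research_loop(start: float, propose: Callable[[float], float], measure: Callable[[float], float], rounds: int) -> Tuple[float, float, int]:
    """Propose a change, measure it, keep it only if the score improves, repeat."""
    best, best_score, kept = start, measure(start), 0
    for _ in range(rounds):
        candidate = propose(best)  # propose a change to the current best
        score = measure(candidate)  # run the experiment and measure the result
        if score > best_score:  # keep improvements, discard everything else
            best, best_score, kept = candidate, score, kept + 1
    return best, best_score, kept
# Toy stand-ins (hypothetical): one tunable number and a score that peaks at 3.0.
rng = random.Random(0)
def toy_measure(x: float) -> float:
    return -(x - 3.0) ** 2
def toy_propose(x: float) -> float:
    return x + rng.uniform(-0.5, 0.5)
best, best_score, kept = auto_research_loop(0.0, toy_propose, toy_measure, rounds=200)
print(f"kept {kept} of 200 proposals, final score {best_score:.4f}")
```
The loop only works because measure returns a single number that can be compared, which is the limitation the speaker raises next.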

00:03:22.370 --> 00:04:34.905
performance gain. And actually, the Shopify CEO saw this experiment and ran his own auto research loop on his own data: 37 experiments and a 19% improvement, all while he was sleeping. This is the mindset behind this whole concept. To get the most out of the tools that have become available now, you have to remove yourself as the bottleneck. You can't be there to prompt the next thing. You need to take yourself out of the loop. Okay. All this is really great, but there are some limitations here. So Karpathy's version works because he's measuring something that's just extremely measurable to computers. Right? Code runs faster or it doesn't. There's a clear number associated with whether it's doing better or worse. But the reality is most of what you and I are building doesn't really look like this. How do you quantify if an app looks good? How do you quantify if a script resonates? Is this email draft good? You can't really put numbers to this. So, yes, you can use auto research straight up, but I wouldn't take it at face value. Instead, think about what the concept is and apply it to how you actually work. Ultimately, auto research is all about creating a system that gets better every time you use it because you're feeding results back into it. And there are different ways to do this in whatever you're working on. Let's say you have a landing page and you wanna improve your conversion rate. Within the project itself, you can tell Claude, review my landing page headline, write five variations of the headline, split test them simultaneously,

00:04:35.145 --> 00:05:19.480
and track results using PostHog to determine which is the best. You've created a system to track performance, and at the end of the week, you can come back to Claude, have it pull the numbers and tell you which is better, make the decision, and then move on to the next experiment. The results aren't instant like Karpathy outlined, but it follows the same improvement principle. But this is still measurable. What about extending this concept to non-measurable things? Well, this is actually what excites me the most. So for example, let's say I use AI to create a report for a client. It'll generate the report, and then I'll go back and forth with AI until it's exactly what I want. I then can run a Claude skill. I use something called build partner colon improve system, where I'll look at the back and forth I already had with Claude and enhance my knowledge base so the next output is better. Essentially, I'm using my chat history as a proxy

00:05:19.825 --> 00:05:57.445
for whether the output was good or bad. In my experience, this is a phenomenal way to improve your systems over time. I personally run this skill manually, but if you want to automate this process so it's a little bit closer to auto research, you can use something called a loop or a schedule, which are two features that the creator of Claude Code actually calls the most powerful features in Claude Code. A loop lets you set Claude to run a specific command every so often in your session, and a schedule lets you set Claude to run it at a specific time and day, and it runs entirely in the cloud. Personally, I don't use the loop or schedule features too much. Instead, I actually use something called hooks to help me. A hook essentially automates specific commands

00:05:57.525 --> 00:06:12.510
based on things that happen as you use Claude Code. So I set up a hook so that every time I start a new Claude Code session, if I haven't run build partner slash improve system in a while, it will remind me to run it. I then manually run it and it looks at my historical conversations

00:06:12.510 --> 00:08:41.545
to do the improvement for me. This is how I've created my version of an auto research loop for these less measurable things. And I know we are going through concepts quickly, so don't worry. If you do wanna go deeper on strategies like this where you can follow at your own pace, I put together a free five-day email series where I walk through the concepts I'm covering here. And based on thousands of people that have gone through it, I am highly confident you're gonna love it. But if you don't, you can just unsubscribe anytime. Now, up to this point, we've gone through setting up your LLM knowledge base, and we now understand what auto research is and how you can apply it to whatever you're working on. And before we get to a demo where I'll give you a single prompt that sets up your whole machine, there's one more strategy that ties it all together. Strategy three is context engineering. From Karpathy: "Context engineering is the delicate art and science of filling the context window with just the right information for the next step." And usually, context engineering is the difference between people getting good and bad results. Here's a clip of him talking about when people complain about AI not working. "Like, so many things, even if they don't work, I think to a large extent, you feel like it's a skill issue. It's not that the capability is not there. It's that you just haven't found a way to string it together of what's available. Like, I just don't... I didn't give good enough instructions in..." He says it's a skill issue. Karpathy is not holding back, but he's frankly correct. So how do you properly context engineer? Well, there are two things. First, your CLAUDE.md file. This is the instruction file that Claude reads at the start of every session. Most people either don't have one or it's three lines. This is key because it tells Claude what your project is, how it's structured, what conventions to follow, and what it tends to get wrong.
We touched on this briefly in strategy one, but it is super critical. Here's a prompt you can paste right into Claude Code: "Create a CLAUDE.md for this project. Include what this project is, the folder structure, what I'm currently building, and common mistakes to avoid. Keep it under 50 lines." The key is that last part. "Keep it under 50 lines" is there because we don't want it to have too much bloat. That's an arbitrary number. It can extend past that, but you get the concept. The second is scoping what Claude sees. If you're writing a script, Claude doesn't need your entire code base. It needs your script frameworks, your voice patterns, maybe a few examples of finished scripts. But the more irrelevant stuff you load into it, the worse the output will get. And this is why the LLM knowledge base we covered earlier is so important: it creates a web of knowledge so that the LLM can effectively navigate. I also use skills to help simplify this too. I have a skill called build partner colon expert advice, where when someone asks a business question, the skill automatically loads the right expert framework. Let's say it's a pricing question, it references Hormozi.

00:08:41.545 --> 00:08:48.825
Let's say it's social media, it references MrBeast and Gary Vee. Let's say you're starting a business, it references Elon Musk. All contextual information

00:08:49.065 --> 00:09:01.320
that is only important based on the specific topic I have a question about. The person asking doesn't have to know what context to provide. The skill handles it. If you are interested in some of these skills that I'm referencing, you can get them for free on buildpartner.ai.
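
NOTE
A rough sketch of the topic-scoped context idea described above: load only the frameworks that match the question and nothing else. This is not the speaker's actual skill. The FRAMEWORKS mapping and the select_context function are hypothetical, built only from the three examples given in the transcript, and the substring match is a deliberately naive stand-in for however a real skill decides the topic.
```python
from typing import Dict, List
# Hypothetical mapping built from the examples mentioned in the transcript.
FRAMEWORKS: Dict[str, List[str]] = {"pricing": ["Hormozi"], "social media": ["MrBeast", "Gary Vee"], "starting a business": ["Elon Musk"]}
def select_context(question: str) -> List[str]:
    """Return only the frameworks whose topic appears in the question."""
    q = question.lower()
    return [name for topic, names in FRAMEWORKS.items() if topic in q for name in names]
print(select_context("How should I think about pricing my course?"))
```
A question that matches no topic returns an empty list, so nothing irrelevant is loaded into the context window.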

00:09:01.320 --> 00:09:46.195
It's a plugin I created, so you can go check that out. It's entirely free if you want. Now, we've covered a lot here. Right? LLM knowledge bases, auto research, context engineering. Let me show you how to actually set this up. You don't need my exact system. You just need Claude Code and one prompt, and it'll get it all started. Open Claude Code and paste this prompt in, which is also in the description of this video. There's a lot here, and this one prompt sets up all three strategies, and Claude will just build it based on the project you're working on. But I do wanna call out some key things happening in this prompt, so based on your back and forth with Claude Code, you can make sure that it applies them. The first is it creates the folders for you. This is the general structure, but you may have subfolders if you have a bunch of resources. Here you can see mine, where in raw, I have different partitions. In wiki, there are different partitions.
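
NOTE
A minimal sketch of the folder setup described above, assuming the three layers map to a raw folder, a wiki folder, and one instruction file. The raw and wiki names come from the transcript. Using CLAUDE.md as the schema file, the my-project folder name, and the starter rules written into the file are assumptions for illustration, not the contents of the speaker's prompt. The example writes into a temporary directory so it is safe to run anywhere.
```python
from pathlib import Path
import tempfile
def scaffold_knowledge_base(root: Path) -> dict:
    """Create the three layers: raw sources, the wiki, and the schema file."""
    raw = root / "raw"  # layer one: source material, read but never edited
    wiki = root / "wiki"  # layer two: pages Claude writes and maintains
    schema = root / "CLAUDE.md"  # layer three: instructions (file name is an assumption)
    raw.mkdir(parents=True, exist_ok=True)
    wiki.mkdir(parents=True, exist_ok=True)
    if not schema.exists():  # never overwrite rules that already exist
        schema.write_text("# Knowledge base rules\n- Treat raw/ as read-only.\n- Keep wiki/ pages cross-referenced to their raw/ sources.\n", encoding="utf-8")
    return {"raw": raw, "wiki": wiki, "schema": schema}
layout = scaffold_knowledge_base(Path(tempfile.mkdtemp()) / "my-project")
print(sorted(p.name for p in layout["schema"].parent.iterdir()))
```
Subfolders inside raw and wiki can be added later with the same mkdir call as the collection grows.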

00:09:46.355 --> 00:10:18.975
So as you build it out, you may have subfolders that are needed. The second part of this is a hook so that when you drop in resources, it'll bring them into the raw folder, and then Claude will automatically process them, update the wikis, and create the necessary linkages. You may wanna consider making a Claude skill that the hook calls to make this more consistent. I have one called ingest source. Once you get that all set up, if you're using Obsidian to view your files, which I personally highly recommend, hit Command+G to see the graph view. You'll then be able to see all of your files and all of your folders and the information

00:10:19.220 --> 00:10:43.015
and the web and how they're linked together. Here, you can see mine. It's pretty cool. It's productivity porn. Let's call it what it is. But it does help you make sure that you're properly linking files within your wiki and setting up a proper LLM knowledge base. Now, if you got this far, you are an absolute legend, and I'm confident that you'll love this video where I walk through how Anthropic's team, the creators of Claude Code, actually use Claude Code. Go check that out and I'll see you over