WEBVTT

00:00:00.160 --> 00:00:05.360
Andrej Karpathy posted a long thread on X about how his coding workflow

00:00:05.360 --> 00:00:06.720
completely flipped.

00:00:06.880 --> 00:00:20.255
He went from 80% manual coding to 80% agent coding in just a few weeks, and he was pretty honest about it. He said he is now mostly programming in English, telling the model what to write in plain words.

00:00:20.495 --> 00:00:25.215
But the interesting part was not the productivity gains. It was the failures.

00:00:25.295 --> 00:00:29.935
He laid out clearly how coding agents keep messing up in ways that are not obvious.

00:00:30.470 --> 00:00:33.430
They are not writing broken syntax anymore.

00:00:33.590 --> 00:00:38.950
The mistakes are deeper. They make assumptions about what you meant and keep going without checking.

00:00:39.110 --> 00:00:43.350
They pick one interpretation out of three possible ones and commit to it silently.

00:00:43.765 --> 00:00:45.845
They do not ask you to clarify.

00:00:46.005 --> 00:00:48.485
They just act confident and move forward.

00:00:48.805 --> 00:00:54.165
And then you end up reviewing a giant pull request that solves a problem you never actually had.

00:00:54.565 --> 00:01:20.705
So that thread got a lot of attention, and a developer named Forrest Chang took the core ideas from that post and turned them into a single file called CLAUDE.md. It is a set of behavioral guidelines for Claude Code. The repo is called Andrej Karpathy Skills, and it is on GitHub right now. It has more than 26 thousand stars, and honestly, it is one of the most practical things I have seen in this space recently.

00:01:20.945 --> 00:01:22.465
Not because it does anything fancy,

00:01:22.930 --> 00:01:30.530
but because it targets the exact problems that waste the most time when you are working with AI coding agents every day.

00:01:30.850 --> 00:01:35.650
Okay. So before I show you how to install it and use it, I want to explain why this matters.

00:01:36.185 --> 00:01:40.505
Because if you have not run into these problems yet, you probably will soon.

00:01:40.905 --> 00:01:47.545
And if you already have, then you know exactly what I am about to describe. The first problem is silent assumptions.

00:01:47.545 --> 00:01:50.185
You ask your agent to add user authentication.

00:01:50.600 --> 00:01:55.960
There are 10 different ways to interpret that. Session-based? Token-based? OAuth?

00:01:55.960 --> 00:02:05.065
The agent does not know. But instead of asking you which direction to go, it just picks one and starts building. And it picks the most complex version.

00:02:05.225 --> 00:02:11.145
So twenty minutes later, you have a 400-line auth system with OAuth, refresh tokens,

00:02:11.225 --> 00:02:13.145
and role-based access control.

00:02:13.465 --> 00:02:21.390
All you needed was basic email and password for a prototype. The model guessed instead of asking. The second problem is overengineering.

00:02:21.630 --> 00:02:41.715
You ask for a simple function that formats a date string. You get back a configurable date formatting utility class with six methods, a builder pattern, and error handling for edge cases that will never happen. The model writes 200 lines when 30 would do. It is trained on massive code bases where abstraction is valued,

00:02:41.875 --> 00:02:56.020
so it defaults to that style even when the task is small. The third problem is scope creep in edits. You ask the agent to fix a bug in one function, and it fixes the bug. But it also reformats the file, renames variables,

00:02:56.020 --> 00:03:01.585
cleans up comments that were not part of the task, and refactors an adjacent function that was fine.
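
NOTE
Editor's sketch, not shown in the video: roughly what the "simple function that formats a date string" from the overengineering example could look like when kept minimal. The function name and the input and output formats are assumptions chosen only for illustration.
```python
from datetime import datetime
# Hypothetical helper: turn a YYYY-MM-DD string into DD/MM/YYYY.
# One function, no utility class, no builder pattern, no configuration.
def format_date(iso_date: str) -> str:
    return datetime.strptime(iso_date, "%Y-%m-%d").strftime("%d/%m/%Y")
print(format_date("2024-03-09"))  # prints 09/03/2024
```
If a second format is ever needed, that is the point at which a parameter could be added, not before.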

00:03:01.905 --> 00:03:10.705
Now your diff is 40 lines instead of four, and you have to review every change to make sure nothing broke. The fourth problem is the lack of verification.

00:03:11.200 --> 00:03:14.080
You tell the agent to add form validation.

00:03:14.080 --> 00:03:22.240
It adds the validation code and says done, but did it actually test it? Did it check if it handles empty strings, special characters,

00:03:22.240 --> 00:03:31.185
values that are too long? Usually no. It just writes the code and moves on. There is no verification step. There are no success criteria.

00:03:31.265 --> 00:03:35.185
It does what you asked in the most literal way and calls it finished.
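
NOTE
Editor's sketch, not shown in the video: one way the missing verification step could look, with the success criteria written down as checks for the three cases mentioned above (empty strings, special characters, values that are too long). The field, the 50-character limit, and the allowed-character rule are all assumptions made up for illustration.
```python
import re
MAX_LEN = 50  # hypothetical length limit
ALLOWED = re.compile(r"[A-Za-z0-9 _-]+")  # hypothetical allowed characters
# Hypothetical validator: returns a list of error labels, empty if valid.
def validate_username(value: str) -> list[str]:
    errors = []
    if not value.strip():
        errors.append("required")
    if len(value) > MAX_LEN:
        errors.append("too long")
    if value.strip() and not ALLOWED.fullmatch(value):
        errors.append("invalid characters")
    return errors
# Success criteria, stated as checks that must pass before calling it done.
assert validate_username("") == ["required"]
assert validate_username("ab<script>") == ["invalid characters"]
assert validate_username("a" * 51) == ["too long"]
assert validate_username("alice_01") == []
```
The checks are the point here, more than the validator itself: they give the agent a concrete definition of finished.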

00:03:35.425 --> 00:03:42.010
So those are the four core problems, and the Karpathy Skills repo addresses each one with a matching principle.

00:03:42.170 --> 00:03:45.210
The first principle is called think before coding.

00:03:45.370 --> 00:04:02.805
It basically tells the agent to stop and surface any ambiguity before writing a single line. If there are multiple ways to interpret a request, list them. If something is unclear, ask. Do not guess. The second principle is simplicity first. Write the minimum code needed to solve the problem.

00:04:03.470 --> 00:04:05.150
No speculative features.

00:04:05.310 --> 00:04:11.870
No abstractions for things that are only used once. Alright. So now let me show you how to actually set this up.

00:04:12.270 --> 00:04:34.620
I am going to go through the installation, and then we are going to build a small project with it so you can see the difference. So I am on my screen right now. There are two ways to install this. The first way is the recommended one, and that is the Claude Code plugin route. You open Claude Code and run slash plugin marketplace add Karpathy skills. That adds the marketplace.

00:04:34.940 --> 00:04:44.380
Then you run slash plugin install skills. And that is it. Now the guidelines are installed as a plugin, which means they apply across all your projects automatically.

00:04:44.620 --> 00:04:49.855
You do not have to copy any file into each project folder. It just works everywhere.

00:04:50.175 --> 00:04:52.815
The second way is the per project route.

00:04:53.055 --> 00:04:57.295
This is for when you only want the guidelines inside one specific project.

00:04:57.455 --> 00:05:12.520
You open your terminal, go to your project folder, and run this curl command that downloads the file straight into your project root. Now if you already have a CLAUDE.md file and you do not want to lose your existing rules, you can append instead of overwrite.

00:05:12.520 --> 00:05:21.825
The repo shows the command for that. You run echo to add a blank line and then curl the file content and pipe it into your existing CLAUDE.md.

00:05:21.985 --> 00:05:26.625
Your project rules stay on top and the Karpathy guidelines get added at the bottom.

00:05:27.240 --> 00:05:31.400
Either way takes about ten seconds. Okay. So now that it is installed,

00:05:31.560 --> 00:05:39.880
let me actually build something with it so you can see how the agent behaves differently. I am going to ask Claude Code to build a simple ecommerce dashboard.

00:05:40.285 --> 00:06:10.355
Nothing crazy. Just a front-end page that shows total revenue, number of orders, top-selling products, and maybe a recent orders table. The kind of thing you would want to glance at to see how your store is doing. So I type my prompt: "Build an ecommerce dashboard page that shows total revenue, order count, top products, and a recent orders table. Use React and Tailwind. Keep it simple." Now watch what happens. With the Karpathy guidelines active, the first thing the agent does is ask me questions.

00:06:10.595 --> 00:06:16.355
It wants to know if the data should come from a real API or if hardcoded sample data is fine for now.

00:06:16.755 --> 00:06:44.545
It asks if I want the dashboard to be responsive or desktop only. It asks if I need any filtering or a date range selector, or if this is just a static snapshot view. This is exactly the behavior we want. It is not guessing. It is asking. I tell it sample data is fine, desktop only for now, and no filters needed. Just the basic overview. And then it builds exactly that, nothing more. The output is clean. One file, maybe 120

00:06:44.545 --> 00:06:51.025
lines of code. Four stat cards at the top, a simple table for recent orders, and a short list of top products.

00:06:51.480 --> 00:06:54.680
No router setup, no state management library,

00:06:54.760 --> 00:07:07.625
no API service layer, no authentication wrapper, no dark mode toggle. Just the dashboard I asked for. And if you have ever asked an agent to build a dashboard without these guidelines, you know what usually happens.

00:07:08.025 --> 00:07:13.705
You get six to eight files, maybe 500 lines, with a full component tree, context providers,

00:07:13.785 --> 00:07:22.500
a mock API with fetch hooks, loading skeletons, pagination, and a sidebar navigation for pages that do not even exist yet. All for a prototype,

00:07:22.580 --> 00:07:25.300
for something you just needed to check a layout idea.

00:07:25.540 --> 00:07:32.180
That is the difference these guidelines make. And here is the other thing I noticed. The diff is exactly what I expected.

00:07:32.420 --> 00:07:43.305
Every line that changed traces back to what I asked for. There are no surprise edits, no renamed variables in other files, no reformatted comments, no drive by refactors.

00:07:43.465 --> 00:08:07.285
Just the code I asked for, and nothing else. That is the surgical changes principle doing its work. You spend way less time reviewing because there is less noise in the output. Now, there is a trade-off the repo is honest about. These guidelines bias toward caution over speed. For simple stuff like fixing a typo, you do not need the full rigor. The guidelines are meant for nontrivial work.

00:08:07.525 --> 00:08:13.525
The kind where wrong assumptions cost you hours and overengineering means throwing away the output.

00:08:14.005 --> 00:08:23.950
And I think that is what makes this repo different from a lot of other stuff in the AI coding space right now. It is not trying to impress you. It is solving a very specific,

00:08:23.950 --> 00:08:25.390
very real problem.

00:08:25.470 --> 00:08:28.270
Coding agents are capable, but they behave badly.

00:08:28.430 --> 00:08:31.470
They make silent mistakes that look correct on the surface.

00:08:31.630 --> 00:08:38.265
They build too much when you need too little. They touch things they should not touch, and they do not verify their own work.

00:08:38.585 --> 00:08:44.745
This file corrects those patterns in about 50 lines of Markdown. If you are using Claude Code regularly,

00:08:44.825 --> 00:09:03.426
this is worth trying. It takes ten seconds to install, and the worst case is you delete the file. But once you see the difference in your diffs and code reviews, you will want to keep it. Alright, so that's it from the video, and I hope you enjoyed it. If you did, please like this video and subscribe to the channel, and I'll see you in the next video.