AI Stack Engineer · YouTube · 09:04

This Free File Makes Claude Code 10x Cleaner (Karpathy Skills)

A 9-minute breakdown of the CLAUDE.md file that fixes the four most expensive AI coding agent failure modes.

Posted

April 14th 2026

1 month ago

Duration

09:04

Format

Tutorial

educational

Channel


AI Stack Engineer

§ 01 · The Hook

The bait, then the rug-pull.

Andrej Karpathy posted a thread saying he now programs mostly in English — and then catalogued exactly how that breaks. Not broken syntax. Something worse: agents that guess silently, build too much, touch things they should not, and call it done without checking. One developer turned those observations into a single file. This is a breakdown of that file.

§ · Stated Promise

What the video promised.

stated at 01:33 "I am going to go through the installation, and then we are going to build a small project with it so you can see the difference." delivered at 05:31

§ · Chapters

Where the time goes.

00:00 – 01:15

01 · Karpathy thread + the four problems

Opens with Karpathy workflow flip (80% manual to 80% agent). Frames story around failures. Names four agent failure modes: silent assumptions, overengineering, scope creep, no verification.

01:15 – 03:35

02 · Excalidraw: problems mapped to cost

Three-column diagram: Your Request / What the Agent Does / What You Get. Walks each failure mode with concrete cost examples (400-line OAuth, 200-line date formatter, 40-line diff, untested validation).

03:35 – 04:11

03 · The four principles (GitHub README)

Introduces the Karpathy Skills repo. Maps each principle to the problem it solves: Think Before Coding, Simplicity First, Surgical Changes, Goal-Driven Execution.

04:11 – 05:31

04 · Installation walkthrough

Two paths: Claude Code plugin (global, recommended) via /plugin marketplace add + /plugin install; and per-project curl with append support for existing CLAUDE.md.

05:31 – 07:32

05 · Live demo: ecommerce dashboard

Builds dashboard with guidelines active. Agent asks 3 clarifying questions. Output: 1 file, 120 lines. Without guidelines: 6-8 files, 500+ lines, unasked-for features.

07:32 – 08:07

06 · VS Code: clean diff

Shows actual code output. Every changed line traces to what was asked. No renamed variables, no reformatted comments, no drive-by refactors.

08:07 – 09:04

07 · Tradeoff + close

Guidelines bias toward caution, not speed. For trivial tasks they are overkill. For nontrivial work where wrong assumptions cost hours, they are the fix.

§ · Storyboard

Visual structure at a glance.

hook · Karpathy thread · 00:01

promise · GitHub repo intro · 01:04

value · Excalidraw diagram · 02:05

value · four principles · 03:35

value · install demo · 04:25

value · agent asks questions · 06:10

value · clean dashboard output · 07:04

value · VS Code clean diff · 07:32

cta · tradeoff + CTA · 08:27

§ · Frameworks

Named ideas worth stealing.

03:35 list

The Four Karpathy Principles

Think Before Coding — surface ambiguity first, ask not guess
Simplicity First — minimum code to solve the problem, no speculative features
Surgical Changes — touch only what the request requires
Goal-Driven Execution — define success criteria, loop until met

Four behavioral rules for AI coding agents, derived from Karpathy's X thread and packaged as a CLAUDE.md file by Forrest Chang.

Steal for: Directly applicable to any Claude Code workflow. Also a useful framing for a JoeFlow or CLAUDE.md content piece.
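The Goal-Driven Execution principle can be illustrated with the form-validation case from the video. This is a minimal sketch, not taken from the repo: `validate_username`, `MAX_LEN`, and the `CRITERIA` table are hypothetical names and rules chosen for illustration. The point is only that the success criteria (empty strings, special characters, values that are too long) are written down up front, and the work is not called done while any of them is unmet.

```python
import re

MAX_LEN = 32  # hypothetical length limit, chosen for illustration


def validate_username(value: str) -> bool:
    """Accept 1..MAX_LEN characters drawn from letters, digits, and underscore."""
    return re.fullmatch(r"[A-Za-z0-9_]{1,%d}" % MAX_LEN, value) is not None


# Success criteria, stated before the implementation is declared finished.
# Each entry is (input, expected result).
CRITERIA = [
    ("alice_01", True),            # ordinary value
    ("", False),                   # empty string
    ("bob!", False),               # special character
    ("x" * (MAX_LEN + 1), False),  # value that is too long
]


def unmet_criteria(validator):
    """Return the criteria the validator currently fails; an empty list means done."""
    return [(value, expected) for value, expected in CRITERIA
            if validator(value) is not expected]


print(unmet_criteria(validate_username))  # → []
```

An agent (or a reviewer) would rerun `unmet_criteria` after each change and keep iterating until it returns an empty list.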

01:40 list

The Four Agent Failure Modes

Silent assumptions — picks one interpretation and commits without asking
Overengineering — writes 200 lines when 30 would do
Scope creep in edits — reformats, renames, refactors beyond the task
No verification — says done without checking edge cases

Problem taxonomy from Karpathy thread, named and made concrete with cost estimates.

Steal for: Perfect for a short-form breakdown or a things-Claude-does-wrong hook video.
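For a sense of scale on the overengineering example, here is roughly what the "simple function that formats a date string" could look like under Simplicity First. The video does not say which formats were involved, so the ISO input and day/month/year output below are assumptions, and `format_date` is a hypothetical name.

```python
from datetime import datetime


def format_date(iso_date: str) -> str:
    """Convert 'YYYY-MM-DD' to 'DD/MM/YYYY' (both formats are assumed for illustration).

    Raises ValueError if the input is not a valid date in the expected format.
    """
    parsed = datetime.strptime(iso_date, "%Y-%m-%d")
    return parsed.strftime("%d/%m/%Y")


print(format_date("2026-04-14"))  # → 14/04/2026
```

There is no utility class, no builder, and no configuration. Invalid input surfaces as the standard `ValueError` from `strptime` rather than through custom error handling.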

§ · Quotables

Lines you could clip.

08:28

"Coding agents are capable, but they behave badly. They make silent mistakes that look correct on the surface. They build too much when you need too little. They touch things they should not touch, and they do not verify their own work. This file corrects those patterns in about 50 lines of markdown."

Self-contained thesis, no setup needed, lands hard → TikTok hook or IG reel cold open

08:08

"Every line that changed traces back to what I asked for. There are no surprise edits, no renamed variables in other files, no reformatted comments, no drive-by refactors."

Concrete payoff after the demo — shows rather than tells → Newsletter pull-quote

00:48

"And then you end up reviewing a giant pull request that solves a problem you never actually had."

Tight punchline, universal pain point for any Claude Code user → TikTok hook

§ · Pacing

How they spent the runtime.

Hook length: 25s

Info density: high

Filler: 5%

§ · Resources Mentioned

Things they pointed at.

00:57 · tool · andrej-karpathy-skills (GitHub)

00:00 · link · Karpathy X thread on coding workflow flip

§ · CTA Breakdown

How they asked for the click.

09:03 subscribe

"If you did, please like this video and subscribe to the channel, and I will see you in the next video."

Standard end-screen CTA, no mid-roll asks. Clean close after the argument lands.

§ 04 · The Script

Word for word.

Tags: HOOK (opening / re-engagement) · CTA (the pitch) · metaphor · analogy · story

00:00 · HOOK · Andrej Karpathy posted a long thread on X about how his coding workflow completely flipped. He went from 80% manual coding to 80% agent coding in just a few weeks, and he was pretty honest about it.

00:14 · HOOK · He said he is now mostly programming in English, telling the model what to write in plain words. But the interesting part was not the productivity gains. It was the failures.

00:25 · He laid out clearly how coding agents keep messing up in ways that are not obvious. They are not writing broken syntax anymore. The mistakes are deeper.

00:35 · They make assumptions about what you meant and keep going without checking. They pick one interpretation out of three possible ones and commit to it silently. They do not ask you to clarify.

00:46 · They just act confident and move forward. And then you end up reviewing a giant pull request that solves a problem you never actually had. So that thread got a lot of attention, and a developer named Forrest Chang took the core ideas from that post and turned them into a single file, a file called CLAUDE.md.

01:05 · It is a set of behavioral guidelines for Claude Code. The repo is called Andrej Karpathy Skills, and it is on GitHub right now. It has more than 26 thousand stars, and honestly, it is one of the most practical things I have seen in this space recently.

01:20 · Not because it does anything fancy, but because it targets the exact problems that waste the most time when you are working with AI coding agents every day. Okay.

01:31 · So before I show you how to install it and use it, I want to explain why this matters. Because if you have not run into these problems yet, you probably will soon. And if you already have, then you know exactly what I am about to describe.

01:44 · The first problem is silent assumptions. You ask your agent to add user authentication. There are 10 different ways to interpret that.

01:53 · Session-based? Token-based? OAuth?

01:55 · The agent does not know. But instead of asking you which direction to go, it just picks one and starts building. And it picks the most complex version.

02:05 · So twenty minutes later, you have a 400-line auth system with OAuth, refresh tokens, and role-based access control. All you needed was basic email and password for a prototype.

02:16 · The model guessed instead of asking. The second problem is overengineering. You ask for a simple function that formats a date string.

02:25 · You get back a configurable date formatting utility class with six methods, a builder pattern, and error handling for edge cases that will never happen. The model writes 200 lines when 30 would do. It is trained on massive code bases where abstraction is valued,

02:41 · so it defaults to that style even when the task is small. The third problem is scope creep in edits. You ask the agent to fix a bug in one function, it fixes the bug.

02:52 · But it also reformats the file, renames variables, cleans up comments that were not part of the task, and refactors an adjacent function that was fine. Now your diff is 40 lines instead of four, and you have to review every change to make sure nothing broke.

03:08 · The fourth problem is the lack of verification. You tell the agent to add form validation. It adds the validation code and says done, but did it actually test it?

03:18 · Did it check if it handles empty strings, special characters, values that are too long? Usually no.

03:24 · It just writes the code and moves on. There is no verification step. There is no success criteria.

03:31 · It does what you asked in the most literal way and calls it finished. So those are the four core problems, and the Karpathy Skills repo addresses each one with a matching principle. The first principle is called think before coding.

03:45 · It basically tells the agent to stop and surface any ambiguity before writing a single line. If there are multiple ways to interpret a request, list them. If something is unclear, ask.

03:56 · Do not guess. The second principle is simplicity first. Write the minimum code needed to solve the problem.

04:03 · No speculative features. No abstractions for things that are only used once. Alright.

04:08 · So now let me show you how to actually set this up. I am going to go through the installation, and then we are going to build a small project with it so you can see the difference. So I am on my screen right now.

04:20 · There are two ways to install this. The first way is the recommended one, and that is the Claude Code plugin route. You open Claude Code and run slash plugin marketplace add Karpathy Skills. That adds the marketplace.

04:34 · Then you run slash plugin install skills. And that is it. Now the guidelines are installed as a plugin, which means they apply across all your projects automatically.

04:44 · You do not have to copy any file into each project folder. It just works everywhere. The second way is the per-project route.

04:53 · This is for when you only want the guidelines inside one specific project. You open your terminal, go to your project folder, and run this curl command that downloads the file straight into your project root. Now if you already have a CLAUDE.md

05:06 · file and you do not want to lose your existing rules, you can append instead of overwrite. The repo shows the command for that. You run echo to add a blank line and then curl the file content and pipe it into your existing CLAUDE.md.

05:21 · Your project rules stay on top and the Karpathy guidelines get added at the bottom. Either way takes about ten seconds.

05:29 · Okay. So now that it is installed, let me actually build something with it so you can see how the agent behaves differently.

05:35 · I am going to ask Claude Code to build a simple ecommerce dashboard. Nothing crazy. Just a front end page that shows total revenue, number of orders, top selling products, and maybe a recent orders table.

05:49 · The kind of thing you would want to glance at to see how your store is doing. So I type my prompt, build an ecommerce dashboard page that shows total revenue, order count, top products, and a recent orders table, use React and Tailwind, keep it simple, and now watch what happens. With the Karpathy guidelines active, the first thing the agent does is ask me questions.

06:10 · It wants to know if the data should come from a real API or if hard-coded sample data is fine for now. It asks if I want the dashboard to be responsive or desktop only. It asks if I need any filtering or date range selector or if this is just a static snapshot view.

06:26 · This is exactly the behavior we want. It is not guessing. It is asking.

06:31 · I tell it sample data is fine, desktop only for now, and no filters needed. Just the basic overview. And then it builds exactly that, nothing more.

06:41 · The output is clean. One file, maybe 120 lines of code.

06:45 · Four stat cards at the top, a simple table for recent orders, and a short list of top products. No router setup, no state management library, no API service layer, no authentication wrapper, no dark mode toggle.

06:59 · Just the dashboard I asked for. And if you have ever asked an agent to build a dashboard without these guidelines, you know what usually happens. You get six to eight files, maybe 500 lines, with a full component tree, context providers,

07:13 · a mock API with fetch hooks, loading skeletons, pagination, and a sidebar navigation for pages that do not even exist yet. For a prototype, for something you just needed to check a layout idea,

07:25 · that is the difference these guidelines make. And here is the other thing I noticed. The diff is exactly what I expected.

07:32 · Every line that changed traces back to what I asked for. There are no surprise edits, no renamed variables in other files, no reformatted comments, no drive-by refactors. Just the code I asked for, and nothing else.

07:46 · That is the surgical changes principle doing its work. You spend way less time reviewing because there is less noise in the output. And now there is a tradeoff the repo is honest about.

07:56 · These guidelines bias toward caution over speed. For simple stuff like fixing a typo, you do not need the full rigor. The guidelines are meant for nontrivial work.

08:07 · CTA · The kind where wrong assumptions cost you hours and overengineering means throwing away the output. And I think that is what makes this repo different from a lot of other stuff in the AI coding space right now. It is not trying to impress you, it is solving a very specific,

08:23 · CTA · very real problem. Coding agents are capable, but they behave badly. They make silent mistakes that look correct on the surface.

08:31 · CTA · They build too much when you need too little. They touch things they should not touch, and they do not verify their own work. This file corrects those patterns in about 50 lines of markdown.

08:42 · CTA · If you are using Claude Code regularly, this is worth trying. It takes ten seconds to install, and the worst case is you delete the file.

08:50 · CTA · But once you see the difference in your diffs and code reviews, you will want to keep it. Alright, so that's it from the video, and I hope you enjoyed it. If you did, please like this video and subscribe to the channel, and I'll see you in the next video.


§ 05 · For Joe

Your CLAUDE.md is a behavioral contract, not a hint.

Steal this for JoeFlow / MCN+ / any Claude Code project

The agent does not need better prompts — it needs explicit rules about what NOT to do.

Install the plugin route (global, 10 seconds): /plugin marketplace add forrestchang/andrej-karpathy-skills then /plugin install andrej-karpathy-skills@karpathy-skills
Or append to your existing CLAUDE.md with the curl append command — your rules stay on top, Karpathy principles go at the bottom
The four failure modes (silent assumptions, overengineering, scope creep, no verification) are the ones that kill review time — name them in your own CLAUDE.md
The Excalidraw diagram format (Your Request / What Agent Does / What You Get) is a reusable content frame for any AI tool breakdown video
The tradeoff is honest and worth saying out loud: these rules slow down trivial tasks, so scope them to nontrivial work
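The append route above (your rules on top, a blank line, then the guidelines at the bottom) can be sketched in a few lines. This is an illustration of the file operation only, not the repo's actual command: `append_guidelines` is a hypothetical helper, and the guidelines text is passed in as a string because the download URL is not given here.

```python
from pathlib import Path


def append_guidelines(claude_md: Path, guidelines: str) -> None:
    """Append guidelines below any existing rules, separated by one blank line.

    If the file does not exist yet, it is created with the guidelines only.
    """
    existing = claude_md.read_text(encoding="utf-8") if claude_md.exists() else ""
    if existing and not existing.endswith("\n"):
        existing += "\n"  # make sure the last existing rule ends its line
    separator = "\n" if existing else ""  # blank line between the two parts
    claude_md.write_text(existing + separator + guidelines, encoding="utf-8")
```

Calling `append_guidelines(Path("CLAUDE.md"), text)` in a project root leaves the existing project rules untouched at the top of the file.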

§ 05 · For You

How to stop your AI coding tool from making expensive silent decisions.

If you use Claude Code or any AI coding agent

The agent is not broken — it is filling in blanks you did not know you left open.

When you ask for something vague, the agent tends to pick the most complex interpretation: make requests specific or expect bloat
A CLAUDE.md file in your project root (or installed as a plugin) gives the agent a standing ruleset, which makes silent guessing less likely
The key rule is Think Before Coding: if the agent does not ask clarifying questions on an ambiguous request, check that the file is actually being loaded
Check your diffs: if changed lines do not trace back to what you asked, the Surgical Changes principle is not working
Install takes 10 seconds; worst case is you delete the file
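The diff check in the list above can be partly automated. Here is a rough sketch, assuming git-style unified diffs with `+++ b/<path>` headers; `files_touched`, `out_of_scope`, and the sample diff are hypothetical and exist only to show the idea. It flags files the agent changed that you did not expect it to touch. It does not judge individual lines inside an expected file.

```python
def files_touched(unified_diff: str) -> set[str]:
    """Collect target paths from the '+++ b/<path>' headers of a git-style diff.

    Limitation: an added content line that itself begins with '++ b/' would be
    misread as a header; this is a sketch, not a full diff parser.
    """
    prefix = "+++ b/"
    return {line[len(prefix):] for line in unified_diff.splitlines()
            if line.startswith(prefix)}


def out_of_scope(unified_diff: str, expected: set[str]) -> set[str]:
    """Return files changed in the diff that were not part of the request."""
    return files_touched(unified_diff) - expected


# Hypothetical diff: the request was a bug fix in src/cart.py only.
SAMPLE_DIFF = """\
--- a/src/cart.py
+++ b/src/cart.py
@@ -10,1 +10,1 @@
-    total = price
+    total = price * qty
--- a/src/utils.py
+++ b/src/utils.py
@@ -1,1 +1,1 @@
-def fmt(x):
+def format_value(x):
"""

print(sorted(out_of_scope(SAMPLE_DIFF, {"src/cart.py"})))  # → ['src/utils.py']
```

A non-empty result is the signal described above: a change that does not trace back to what was asked.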
