WEBVTT

00:00:00.240 --> 00:00:24.475
So Andrej Karpathy, he's the guy who literally helped build modern AI, who was at OpenAI, who got Tesla Autopilot working, and who coined the term vibe coding. He just told us if you're still building apps the way you were last year, he's got bad news for us. So he just gave a brilliant talk. I've spent hours breaking it down to help you understand how it may affect what you build as a developer, founder, or software company.

00:00:24.715 --> 00:00:31.800
Thanks to monday.com for sponsoring this video. More on them later. In the next few minutes, I'm gonna walk you through what he actually said about 2026,

00:00:31.960 --> 00:00:53.655
what it means for what you should be building and how you should be building, and the four frameworks that I think every AI builder needs to have in their head right now. I'm Rob from Switch Dimension, and if you're trying to build with AI in 2026 and beyond, this might be the most important video you watch this month, so stick with me. This kind of a brain is the single most powerful thing I've built in the last couple of months. Highly recommend you do something similar yourself.

00:00:57.630 --> 00:01:00.990
So one of the first things that stood out to me was Karpathy's

00:01:00.990 --> 00:01:03.550
point about the December inflection.

00:01:03.630 --> 00:01:23.085
So what Karpathy is saying is round about December, the model's output started to actually just work. He stopped correcting it. He started just trusting the model, and he basically went on this vibe coding bender. He said he has a ton of side projects that he's built up, and we all have that problem. Multiple different projects we're working on at the same time because now we can build anything.

00:01:23.325 --> 00:01:37.710
So essentially, if you haven't sat down in the last sixty days and seriously tried to build something end to end with Claude Code, Codex, or Cursor in agent mode, you are really flying blind according to Karpathy. So go do that this weekend. Seriously,

00:01:38.110 --> 00:01:51.325
build something. So I think Karpathy's biggest takeaway in this talk was the idea of Software 3.0, and we've heard that so many different times before. But this is a slightly different angle I don't think many have seen. So Karpathy's

00:01:51.325 --> 00:02:06.780
breakdown of the software evolution is: Software 1.0 is handwritten rules. So basically, writing out code. Software 2.0 was training neural networks via large data sets, and Software 3.0 is where the large language model itself becomes the programmable computer,

00:02:06.860 --> 00:02:16.865
the interpreter, and your code basically is the prompt and the context window is your lever. So that's Karpathy's high-level take on it. How does this affect us practically as builders?

00:02:16.945 --> 00:02:25.200
So in traditional SaaS, this is Levercast, an app I built. It uses AI to help you post to all your social media platforms. So I would just click create,

00:02:25.360 --> 00:02:29.600
drop in my thoughts, generate a post with an existing style guide,

00:02:29.920 --> 00:02:50.000
and then it takes the pain out of the first draft for my social media posts across the various different channels that I work on. So that's a traditional SaaS app with a little bit of AI sprinkled in on top. So here's the big shift and here's what's changing. We're going to be increasingly using our agents to do more work first. We're gonna be living in Claude Code, Codex, Cursor, whatever is your agent of choice.

00:02:50.320 --> 00:03:15.670
In this new world, what people will increasingly be doing is basically just telling their AI agents to carry out a task. So instead of me going and logging in to Levercast and going through all the workflow, I might just say, hey. I've got an idea for a post. I could drop it in and say, hey. Use the Levercast MCP to carry this out. Or not even. I could actually use a skill that's prebuilt in that will actually do all of this for me. I go off to do something else or I have this working on a schedule and then bang,

00:03:15.990 --> 00:03:27.825
all my posts are published. So besides that being a cheap plug for my own software, really what we're getting at here is a change in how we need to think about building things. As a builder, you really need to think about the current capabilities

00:03:27.825 --> 00:03:29.025
of multimodal

00:03:29.025 --> 00:03:34.465
models and even think ahead a few months to what's coming next and how that might affect the software

00:03:34.465 --> 00:04:04.050
you're building or you've already built. Do you need to have an MCP, an API, or some kind of agentic workflow in front of it so that an agent can discover it quickly? So Karpathy's example of this was back last year, he built this quick app that basically allowed you to turn your menus into magic. The idea is that you could take any menu, and looking at this menu here, you might not know exactly what the food is or what it might look like. When you dumped this menu into his app, it basically was able to create AI representations

00:04:04.050 --> 00:04:06.450
of what the food was and what it looked like.

00:04:06.850 --> 00:04:25.045
So last year, this was an entire application that he built that somebody could use. This year, he could solve the same problem with just ChatGPT or Nano Banana. So you can see the entire MenuGen workflow replaced by me just dropping in that original menu, and here you can see an overlay of the dishes on the starters.

00:04:25.285 --> 00:04:34.570
So his big point here is the MenuGen app that he built a year ago doesn't really need to exist anymore, and here's why that's really uncomfortable.

00:04:34.730 --> 00:04:36.330
A huge percentage

00:04:36.490 --> 00:04:45.450
of the apps people are building right now shouldn't exist either. They're basically orchestrating things the model can already do natively

00:04:45.825 --> 00:04:57.585
that's only just appeared in the last couple of months. They're Software 1.0 plumbing wrapped around what should just be a single Software 3.0 prompt. Anyway, you get the point. So here's the test.

00:04:58.190 --> 00:05:29.810
Take what you're building and ask, could I do this with a single multimodal prompt and the right tool calls or an MCP or two? If the answer is yes, you're building plumbing that's about to get eaten by the next model release. Stop or pivot. For you Americans or Canadians who love hockey, we don't even have ice in Ireland. This is where I insert a quote about skating to where the puck is going to be, not where it is right now. So if you watch this channel at all, you're going to be no stranger to vibe coding apps. And I wanted to talk today about monday.com's

00:05:29.890 --> 00:05:36.610
vibe because basically I think it's different. It's a natural language app builder baked directly into the monday.com

00:05:36.610 --> 00:05:37.010
platform.

00:05:37.565 --> 00:05:46.605
So, yeah, of course, there's tons of different app builders out there, but I think what's really different about this, you're actually building your application on top of your monday.com

00:05:46.605 --> 00:05:47.645
infrastructure

00:05:47.805 --> 00:05:54.650
and context. And as we talk about in this video, the context and the understanding is really important. You've got your workflows,

00:05:54.650 --> 00:05:55.610
your intakes,

00:05:55.610 --> 00:05:59.530
your OKRs, everything that's important to your company is managed in there.

00:06:00.170 --> 00:06:10.825
But often there's this last-mile problem where you want to solve something like, you know, I wish I had the perfect form for pulling that in, or we need a clean page that represents our OKRs.

00:06:10.985 --> 00:06:12.585
So with monday.com's

00:06:12.585 --> 00:06:40.485
new vibe, you can use natural language to create bespoke interfaces with inputs and outputs that write back to your monday.com data, like a sales forecasting app, a campaign health tracker, a time tracker app. So it's pretty much living in Monday where you and your staff work. This really is a super tool. Viewers can get started for free. Check out the link in the description down below. So already, Karpathy has given us some real gold. Let's move on to his next point. And this is all about verifiability

00:06:40.485 --> 00:06:54.510
and building a moat as we go forward. So Karpathy went on to explain why a lot of frontier models fall down and how there's an advantage in that. The reason models are so good at code is because it's deterministic and it's verifiable,

00:06:54.590 --> 00:07:08.805
and that's great feedback for the model. Unfortunately, a lot of the world isn't as verifiable as code. So Karpathy asks, what are the domains that the large frontier labs are not focusing on right now that have some level of verifiability?

00:07:08.885 --> 00:07:12.245
Financial trading, supply chain, and routing optimization,

00:07:12.485 --> 00:07:16.820
continuous integration and migration agents, data cleaning, labeling.

00:07:17.140 --> 00:07:23.140
You all have domain expertise in some niche part of any workflow or company.

00:07:23.300 --> 00:07:29.780
What are the things that you can build in that space that take advantage of the things Karpathy is talking about in this conversation?

00:07:30.365 --> 00:07:43.245
So thankfully, in this talk, he basically retires the term vibe coding and replaces it. So vibe coding is great. It raised the floor. Pretty much anyone can build now. We all needed that. It pretty much democratizes building.

00:07:43.560 --> 00:07:49.800
But he says what professionals are doing now is agentic engineering, and that's the kind of stuff I teach a lot in my channel and on my course.

00:07:50.120 --> 00:08:30.575
This means things like using specs, plans, managing your context window, doing proper review, making sure that we have unit-level tests, smoke tests, end-to-end tests, and blockers in place in continuous integration so that we're not pushing bad code. Karpathy says he's seeing people who are getting good at this go 10x faster. If you're on X at all, you'll see people running 10 to 20 agents at the same time. Personally, I kinda think that's nonsense. I'm really good at this. I've been doing it for two years, and the most I can keep in my head at any time is maybe three to four agents running at the same time. I give them a little bit of work. I check what they're working on. I have to review the code. These are production databases,

00:08:30.575 --> 00:08:52.630
production systems I'm working on. I'm not just gonna push up whatever the agent was working on. So if you're feeling behind and you're working with agents, but you don't feel like you're working with 10 or building entire empires in a single prompt, then don't worry. But the thing is we really need to think about how we get to that point because what Karpathy is talking about is we need to build again for where that puck is going.

00:08:52.870 --> 00:09:15.570
So we get to that point where, not unlike in December when things just worked, I think we'll get to a point where our agentic harnesses just work, and we can run 10 to 20 agents at the same time if we choose. We're not there yet, but we're on that path. By the way, if you're interested in building agentic harnesses and production systems like that so you can build apps really quickly,

00:09:15.810 --> 00:09:19.170
you can join my course and community at switchdimension.com.

00:09:19.170 --> 00:09:37.985
It's currently closed to new members, but I have a new course dropping in the coming weeks. If you want to be part of that cohort, just drop your name and email into the wait list. Okay. So let's wrap this all up into the four things you should build. So the first thing he suggests you build is tools that enhance your understanding,

00:09:38.305 --> 00:09:52.850
not just your speed. So here's what I do myself, and here's a really easy way for you to get started with this. Here, I've got Claude Cowork, but it doesn't matter what tool you use. I could do this in Claude Code. You could do this in Cursor. You can do it wherever you want. Essentially, what you need is a folder

00:09:53.075 --> 00:10:04.915
and an initial prompt. That prompt to your agent is where you tell it about your company, your app, your domain space, your life, whatever else it is, and ask it to create a set of strategy documents

00:10:05.075 --> 00:10:24.855
around that application, business, or whatever the domain space is. So in my case, I have a company called Switch Dimension that helps people learn how to build with AI, and I got it to produce a whole lot of strategy documents around that through various conversations. And they're just simply stored as markdown in a folder. So my big problem is I want to build everything and take every opportunity.

00:10:25.015 --> 00:10:38.180
I just have a quick conversation with my strategy agent, which now has an understanding of where I'm going, and it keeps me on track. It gives me focus. It says you're going in the wrong direction if you work on that. Here's what you should be focusing on. Here are the opportunities.

00:10:38.180 --> 00:10:45.220
And every time I ask it to generate a doc or create new content, it does that with real context and understanding

00:10:45.220 --> 00:10:56.955
of my Switch Dimension world. This kind of a brain is the single most powerful thing I've built in the last couple of months. Highly recommend you do something similar yourself. So the next thing you should build is agent first infrastructure.

00:10:57.035 --> 00:11:18.120
Everything we've built is pretty much built for humans. That's documents, dashboards, install flows, DNS settings. The entire Internet is built really for humans. So the real win in this next generation is stripping away all of the human UI. Would an agent know how to use this directly without any kind of human translation in between? We're seeing this on websites

00:11:18.435 --> 00:11:39.660
in ecommerce, where llms.txt files are there so that when agents arrive on a website, they can quickly figure out how to use it rather than reading through all the emotional marketing that's directed at humans. An agent just wants to know, as fast as possible, how the API works, how it can work with your product, and whether it's trustworthy. Build for that. Number three, we talked about verifiable domain capabilities.

00:11:39.660 --> 00:11:51.075
So a lot of the big labs are covering the large domain areas, but they're not going to have the time or be specifically focused on reinforcement learning for all these little sub niches.

00:11:51.235 --> 00:12:00.970
There are millions of them in every part of our life and business. Can you build a reinforcement learning environment around this, fine tune it, and own that capability?

00:12:00.970 --> 00:12:19.305
So don't dismiss this as an opportunity for you. We can now build anything. That's exciting. Take your handbrake off and go and do it. And number four, the big one. Let's build apps that only exist now because of Software 3.0. So that is not a faster spreadsheet,

00:12:19.305 --> 00:12:20.185
a faster

00:12:20.905 --> 00:12:23.065
UI interface on top of a workflow.

00:12:23.540 --> 00:13:31.850
We need completely new things like that large language model knowledge base Karpathy was talking about. That literally couldn't exist before because there was no code that could actually do it. There are so many new things we can build now, yet we're still thinking about apps and the old way of doing things. There's a whole new approach now with these reasoning models that we can apply. Let's be honest, AI will change the jobs market. It will change how we work. But think positively about what we can build now: the reframing of data, the compilation across all these modalities. We've got large language models now that actually push and progress other areas of science and medicine. We have a really exciting time ahead of us. On a personal note, I went to my own side project folder this week, and I killed at least three projects after watching Karpathy's talk. All of them failed the MenuGen test and the Software 3.0 test. They were basically Software 1.0 plumbing for things the next model release is probably gonna do natively. This, of course, is painful, but it's better just for me to kill them now than to ship them and watch them die in three months' time. But saying that, there are one or two apps that I am doubling down on that hit all four of the criteria up above. I'll be sharing more on that soon.

00:13:32.090 --> 00:13:37.095
So hit me up in the comments, and I'm asking this seriously. What is the app you've built that probably

00:13:37.095 --> 00:14:06.550
shouldn't exist anymore? I want to see how many of us are in the same boat. No judgment. I'm in it too. If this was useful, subscribe. I'm doing a follow-up on building agent-first infrastructure that goes way deep into the practical side. And if you want the actual playbooks I use building with agents, go and check out the Switch Dimension newsletter and community. They are linked below. I've also linked to Karpathy's full talk down below. I'm supposed to link one of my own videos, so you go there and watch one of those next, but I highly recommend checking it out. And I've also got a ton of other videos on this topic on the channel.
