STARTUP HAKK · YouTube · 17:47

Stop Renting Your AI. Here's How To Own It.

Spencer reverse-engineers Claude Code's source to prove the gap between cloud AI and local AI is engineering — then ships a free, open-source terminal agent to close it.

Posted
May 3rd 2026
19 days ago
Duration
17:47
Format
Tutorial
educational
Channel
STARTUP HAKK
§ 01 · The Hook

The bait, then the rug-pull.

One number changes the math: 1.6%. That's the share of Claude Code's codebase that's actual AI decision logic. Spencer from STARTUP HAKK leads with this reverse-engineered stat to force a question — if the intelligence is 1.6% of the equation, why are developers paying cloud-subscription prices for all of it? The answer, he argues, is that the gap isn't magic. It's engineering. And engineering gaps close when developers decide to close them.

§ · Stated Promise

What the video promised.

stated at 00:25 "Make sure you stay to the end because we're gonna give you two free things in this, and I mean free as in free as in free." delivered at 07:09
§ · Chapters

Where the time goes.

00:00 – 00:41

01 · The 1.6% Hook

Reverse-engineered stat: only 1.6% of Claude Code is AI logic. The rest is infrastructure. If that's true, why pay subscription prices? Cloud models also get quietly nerfed over time.

00:47 – 01:53

02 · Spencer + The Engineering Gap

Channel intro. Fractional CTO background. The gap between cloud AI and local AI is an engineering gap, not a magic gap. StarterPack has built something to close it.

01:53 – 02:23

03 · Model Drift Problem

Cloud model behavior shifts over time. Guardrails tighten. Output quality drifts. Prompts that worked last quarter produce different results now. Rate limits stifle innovation.

02:23 – 04:19

04 · OpenMonoAgent Launch

The product: a free, local-first terminal coding agent. Single install command. Run on gaming PCs ($1K with RTX 3090) or mini-PC bricks (~20 tok/s, ~25W). No metering, unlimited tokens.

04:19 – 05:23

05 · Feature Walkthrough

Embedded inference with zero setup. Docker-sandboxed by default. 20+ MCP tools built in. Built in C#/.NET. Blazing fast LSP for C# and TypeScript.

05:23 – 07:09

06 · Playbooks vs. Skills

Skills are prompts — the model can drift, skip, or misinterpret. Playbook gates are code — the executor calls them, the LLM is not in the loop and cannot hallucinate past them. Typed, composable, stateful workflow automation.

07:09 – 08:01

07 · Giveaway CTA

Free Ryzen mini-PC inference box giveaway. Sign up at openmonoagent.ai. Manifesto restatement: AI shouldn't be a subscription.

08:01 – 09:21

08 · Zero-Cost Architecture

C# choice explained — infrastructure-grade, not a weekend project. Model-agnostic (swap the engine without buying a new car). No telemetry, no tracking. Install command on the landing page.

09:21 – 10:14

09 · Privacy Argument

Every cloud AI prompt leaves your machine. For client code or NDA work, that's real exposure. OpenMonoAgent has no server to exfiltrate data to — everything runs on your hardware.

10:14 – 11:34

10 · Why C#/.NET

Production-grade, cross-platform, type-safe, long-term maintainability. Python is for experiments; C# is for things meant to run for years. Onboarding is a first-class concern — single command because bad DX kills open source projects.

11:34 – 13:38

11 · Linux/Git Historical Precedent

Linux was called a toy. Git was called a toy. The pattern repeats: incumbents dismiss → developers adopt → it becomes the default. Local AI agents are next. Spencer is taking that bet.

13:38 – 14:42

12 · Democratization

Real democratization: a developer in Nairobi has the same AI coding tools as Google engineers. No credit card. Free permanently — because free is the only price that's truly universal.

14:42 – 17:02

13 · Live Demo

Spencer fires up OpenMonoAgent on a snake game project. 41 tok/s on RTX 3090. Reviews the project, spots the missing .gitignore, fixes a code-quality issue, initializes a git repo. Comparable to Claude Code in real usage.

17:02 – 17:47

14 · Outro CTA

openmonoagent.ai install command. Star the GitHub repo. Like and subscribe. If you need custom software, starterpack.com.

§ · Storyboard

Visual structure at a glance.

hook
intro
manifesto
playbooks
linux/git
live demo
CTA
§ · Frameworks

Named ideas worth stealing.

00:00 concept

The 1.6% Reframe

Open with a counterintuitive precision stat about what the competitor actually delivers vs. what you pay for. Forces the audience to question the value prop before the product is even named.

Steal for JoeFlow equivalent: cost of cloud transcription subscriptions vs. local Whisper over 12 months, broken into a per-hour price that sounds absurd
01:33 concept

Engineering Gap vs. Magic Gap

Any perceived gap between cloud and local tools is engineering, not magic. Engineering gaps close when developers decide to close them. Removes the mystique from the incumbent.

Steal for The $6 Stack positioning — the complexity of self-hosting is an engineering gap, not a complexity that only enterprises can afford to solve
07:10 model

Playbooks vs. Skills (Typed Gates)

Skills = prompts (model can ignore, drift, misinterpret). Playbooks = code (executor calls them, LLM not in loop, cannot hallucinate past a gate). The distinction between suggestion and guarantee.

Steal for Any agent reliability argument — Claude Code skills vs. structured workflow gates
11:34 model

Linux/Git Toy Pattern

  1. Incumbents call it a toy
  2. Developers adopt it anyway
  3. It becomes the default
  4. The pattern repeats

Every foundational infrastructure tool was dismissed by incumbents as a toy. Local AI is next in the sequence.

Steal for Self-host revolution content arc — every piece of the $6 Stack was a toy before it was default infrastructure
12:40 model

The Agent/Model Layer Separation

The agent is the layer. The model is the engine. Changing engines should not require buying a new car. Model-agnosticism as a core design principle.

Steal for MCN platform architecture content — separating the orchestration layer from any single model vendor
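
One way to picture the layer/engine split, as a hedged Python sketch. The `Engine` protocol, the two stand-in engines, and the `Agent` class are all illustrative assumptions, not OpenMonoAgent's actual design; they only show that agent state can outlive a model swap when the agent depends on an interface rather than a vendor.

```python
from typing import Protocol


class Engine(Protocol):
    """Anything that turns a prompt into text counts as an engine."""

    def complete(self, prompt: str) -> str: ...


class EchoEngine:
    """Stand-in for one local model."""

    def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"


class UpperEngine:
    """Stand-in for a newer model swapped in later."""

    def complete(self, prompt: str) -> str:
        return prompt.upper()


class Agent:
    """The 'car': tools, permissions, and sessions would live here."""

    def __init__(self, engine: Engine) -> None:
        self.engine = engine
        self.history: list[str] = []

    def ask(self, prompt: str) -> str:
        self.history.append(prompt)  # agent state survives engine swaps
        return self.engine.complete(prompt)


agent = Agent(EchoEngine())
first = agent.ask("review the project")
agent.engine = UpperEngine()         # swap the engine, keep the agent
second = agent.ask("review the project")
```

After the swap, `agent.history` still holds both prompts, which is the "no new car" property the metaphor describes.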
§ · Quotables

Lines you could clip.

01:33
"The gap between Cloud and Local AI is not a magic gap. It's an engineering gap, and we've helped close that gap."
Clean thesis, no setup needed, provocative framing → TikTok hook
07:15
"A skill is a prompt. The model can drift, skip, or misinterpret. A playbook gate is code. The executor calls this, and the LLM is not in the loop. It cannot skip it, hallucinate past it, or decide it knows better."
Technically precise, quotable for developer audience, no setup needed → IG reel cold open
03:55
"AI shouldn't be a subscription that you rent. It should be infrastructure that you own sitting on your desk, serving your code, answering only to you."
The manifesto line — read off a slide on-screen, perfectly paced → newsletter pull-quote
12:40
"The agent is the layer. The model is the engine. Changing engine should not require you to buy a new car."
Tight metaphor, self-contained, no context needed → TikTok hook
13:20
"Free is the only price that actually is universal."
Nine words. Clean thesis. → IG reel cold open
§ · Pacing

How they spent the runtime.

Hook length: 41s
Info density: medium
Filler: 15%
§ · Resources Mentioned

Things they pointed at.

02:35 · product · OpenMonoAgent
02:35 · tool · Qwen 3.6 model
02:35 · tool · DeepSeek
02:35 · tool · Gemma 4
§ · CTA Breakdown

How they asked for the click.

17:02 link
"Go check it out. Do us a big favor. Leave a star there. And as always, make sure you like and subscribe."

Soft and multi-part: star the repo, like, subscribe, visit openmonoagent.ai, starterpack.com for custom dev. No hard sell. The giveaway CTA (mid-video, 07:09) was sharper and earlier.

§ 04 · The Script

Word for word.

HOOK opening / re-engagement · CTA the pitch · metaphor analogy
00:00HOOKSo researchers reverse engineered Claude Code's entire source and found something that should make every developer stop and think. Only about 1.6% of the code base is actually AI decision logic.
00:09HOOK1.6%. The other 98% is infrastructure, context pipelines, memory systems, permission layers, safety scaffolding. So here's the question.
00:17HOOKIf the intelligence is 1.6% of the equation, why are we paying cloud subscription prices? That's like a 100%.
00:23HOOKAnd here's the part nobody's talking about. Cloud models get quietly nerfed over time, and you adapt to those changes without even realizing it. Today, I wanna tell you about a new release of something we've released here at Starter Pack.
00:32HOOKMake sure you stay to the end because we're gonna give you two free things in this, and I mean free as in free as in free. And I'm gonna show you what we built to prove this. Let's dive into it today.
00:47Welcome to Starter Pack. I'm Spencer, and here at Starter Pack, we love to build custom software solutions for companies. With a decade of executive leadership as a fractional CTO on twenty five years in software development, I helped transform tech teams and products, including building out custom AI solutions.
00:59Now look, the AI tooling market wants you to believe that renting access to a model like Claude Code or Codex is just the cost of doing business now, but it's not. The gap between Cloud AI and Local AI is way closer than you think. I've got a tool here today that's gonna help you, and it's absolutely free.
01:14I'm not selling you anything. Before we get into it, one of the biggest things you can do is drop a comment, and as always, make sure you follow what I'm gonna give you here. So researchers who reverse engineered Claude Code found that the actual AI decision logic is around 2% of it.
01:26The remaining 98% is context pipelines and a lot of other pieces. This matters because it reframes the entire conversation. The gap between Cloud and Local AI is not a magic gap.
01:35It's an engineering gap, and we've helped close that gap with a free service we're gonna give you guys today. Engineering gaps close when developers decide to close them. Most people tried a local model once, compared it to one of the others and said this wasn't quite ready, but what they were missing was they were actually just missing the harness.
01:50The models are moving fast, and this is Qwen. This is DeepSeek. There's a lot of the other models, and there's a lot you can do.
01:56Gemma four was released. And here's something that doesn't get talked about enough. The cloud model behavior shifts over time, and most users absorb those changes.
02:03Guardrails tightened. Output quality drifts. We've seen Claude Code dramatically drop over the last few months.
02:08Prompts that worked reliably last quarter start producing different results this quarter. Now you adapt and you rewrite your prompts and you try to modify things with skills and you do this thinking you can beat it, But the biggest problem here is rate limits are absolutely stifling innovation. Price restructures,
02:23capability rollbacks. When a vendor controls your model, they control your whole workflow. You've gotta take that control back.
02:29That's why we launched Open Mono Agent. It's an AI that you don't have to meter here. Right?
02:33Unlimited tokens forever. Now you're saying, what are you selling me, Spencer? Hear me out here because this is your machine, your agent, and you use it from anywhere.
02:41Now it's a kicker. So I know a lot of times you've probably tried to set up a local model in the past and thought, man, is way too complex, way too difficult. Look, guys, it's one copy.
02:49You copy this, paste it in. It's gonna give you three options. You can either run the whole stack on your machine, which I'm gonna show you here in a minute, or you can install the inference on one machine.
02:57Now you think, oh, I've gotta have a really expensive H100 that's $30,000. Absolutely not. We are doing this on standard hardware.
03:04See this stuff back here behind me? These are normal gaming machines. Very low end by today's standards in a lot of cases.
03:10This one right behind me just has a 3090 in it, guys. This is about a thousand dollar gaming machine. You can find these on your local Craigslist, Facebook Marketplace, like wherever you go pick up your stuff.
03:18But even more importantly, we've built them out on these little nook and bricks here that actually work, then give you about 20 tokens per second. That's very comparable to what you get with Claude Code or with Codex. So for about 20 tokens per second, you can own the whole hardware, and this is very reasonably priced.
03:32And I'm gonna show you something here at the end that you're definitely gonna wanna make sure you stay at the end. Now how do you get started on this? It's really easy to start.
03:38Copy this, paste it, run it. I wanna go through some of the features here with you because one of our manifesto here is that AI shouldn't be a subscription that you rent. It should be infrastructure that you own sitting on your desk, serving your code, answering only to you.
03:50Now this is local first always. That means you own everything. The model run, everything to the top, to the bottom of the stack.
03:56Nothing goes across the cloud. Unlimited tokens. You want this thing to run for four days?
04:00All you. Like I said, these things here run on about 25 watts. K?
04:04The other thing is we've built this so that they're sandboxed by default. You get a Docker native so your agent mounts your project in, and it doesn't escape. Permission gates are right inside that Docker, and it's fully 100% open source.
04:16Don't believe me? Well, here you go. Here's the whole project,
04:19open source, right here on GitHub, all for you. We have a massive amount of documentation that we've worked on, and the whole project is ready to go. This is not just proof of concept, guys.
04:28I'm gonna show you a working demo here in a minute. But this is the full thing. You can go in and read the documents.
04:32Each of these go into the different parts. And I'm gonna go through some of those parts with you here, but wanna talk through some of these. So first of all, it's embedded inference, zero setup.
04:39You literally run the script. If you decide to run it on two machines with the inference and your agent, then you can run the agent on your dev laptop and run your inference on the machine back home, and we connect them with a relay server. It's trivially simple.
04:50TUI is our interface that we use, and so it's built for long sessions, you can continue to run it indefinitely. It's Docker sandboxed. We have over 20 different MCP tools built in.
05:00It's built for .NET, focused on .NET. So we actually built it with C# .NET, and you'll see all the code, and it is blazing fast.
05:07LSP for C# and TypeScript. Playbooks, this is our version of skills. Playbooks is dominant over skills.
05:14There's so much more you can do with a playbook. These are typed composable stateful workflow automation, step sequencing, gates, and templates, not just markdown recipes. This is not just one flat text file, folks, and it's very easy.
05:26The agent itself will actually help you write these. Now we also have our dual box mode that I talked about here, and one of the best parts is we're hosting a free Relay server where you can actually go and sign up and set up your Relay between the two boxes. Absolutely free, totally encrypted, 100% secure.
05:40We're not getting any data from you. We'll get your email so that that's your ID. Other than that, like and we're not doing anything those.
05:46Next is you have persistent sessions. We actually are saving your sessions in JSON. They stay on your machine.
05:51We're not saving them. I'm sorry. They're saving on your machine on the agent machine.
05:54This thing runs we've probably installed it about a thousand times, and I'm not exaggerating here. You wanna see some of these servers behind me? These are about half of the dev servers that we have.
06:01See the ones underneath the desk over here, the ones over. We have about 20 to 30 different type of workstations varying from boxes these size up to 5090s. We've not done anything larger than 5090 on this.
06:12And with that, we have this incredible set here. So you can do these little bricks, right, which are Ryzen 9 7940HS, get about 20 tokens per second. 3090 is about 50.
06:22The 4090 is we got a little typo here we need to get fixed up. It's about 60 tokens per second, and the 5090s are running closer to a 100 tokens per second. We actually have tested with five different developers all running against one 5090 at the same time using the dual box setup.
06:35So you can go through and compare how it stacks up against other things, but really, you're truly up and running in one single command, like two commands because you're install, and then you're up and running with MonoAgent. This is yours, guys. It's open.
06:46It's local. It's yours. It's forever.
06:48Incredibly fast. Right? We have absolutely worked to optimize this.
06:51 100% open source here, folks. Biggest favor that I ask is you just leave us a star on this, because as you know, one of the best ways you can get it is to help us to get the stars in there. Now we can go through more of these features, and I wanna dive into a couple more of these features with you here before I go on a little bit more.
07:04CTASo with this, uh, I wanna talk a little bit about playbooks. Your agent needs guarantees, not just suggestions.
07:09CTASo a skill is like a suggestion. It's like, hey, if you kinda wanna do this thing, go over here and act like this. A skill actually tells it exactly how to run.
07:16CTAIt's not just a prompt. A skill is a prompt. The model can drift, skip, or misinterpret.
07:20CTAA playbook gate is code. The executor calls this, and the LLM is not in the loop. It cannot skip it, hallucinate past it, or decide it knows better.
07:28CTANow I'm telling you guys, this is way better. We have tons of documentation around this. We have really worked hard to make this work really, really well, and we're really incredibly proud of this.
07:37CTASo if you're doing something with OpenClaw, this is gonna run circles around that. Now, last but not least here, one of the things that I've been talking about.
07:44CTAGo and make sure you sign up. We're doing a free giveaway. Sign up here because we're gonna give away one of these Ryzen boxes where you can run your own inference box at less than, like, I think they're about 25 watts.
07:55CTASo it's pretty incredible. These are amazing. You know, believe me, I'm not trying to sell these.
07:59CTAThis is just a link to Amazon. Right? You can go get one of these boxes yourself.
08:03But you can see that this is a great opportunity, and we are giving it away. I just want my goal here, as the manifesto states, because you can see the manifesto. My manifesto is that it shouldn't be a subscription, and this is what we're trying to do.
08:14I wanna give the opportunity for people to be able to set up and learn how to use AI locally. See, I have a lot of beefs with the big frontier models, and at this point, we all do. We have a lot of beefs with them.
08:24We have a lot of complaints. Open MonoAgent is a terminal native coding agent that runs entirely on your machine powered by local LLMs at zero cost. It's written in C# .NET,
08:34which was a a deliberate choice, not a limitation, because AI tooling should be built like infrastructure, not like a weekend side project. It installs with a single command, and guess what? It runs on every platform.
08:44The agents will run on Mac OS, Linux, and on Windows. Right? So you can install the agent, run it locally, and then run the inference on some other box like one of these and run it and at a very low affordable cost, and you own the whole stack.
08:58Did I mention that you own the whole stack? It installs with a single command, and it's model agnostic. You can change out the model, but we have the models already all tweaked for you.
09:06So if you use the CPU, we have a very specific 3.6 model. We have another Qwen 3.6 model. These have been tested and work
09:13fantastically. There's no telemetry, no tracking, no free tier, like, the whole thing free. I want you guys to be able to use this because go to monoagent.ai
09:21right now and install the command. It's right there on the landing page, and it could be running in minutes. Now every prompt you send to cloud AI tool leaves your machine, and that's not paranoia.
09:30That's just how it works. Right? For personal projects, that's probably fine.
09:33But for client code, proprietary algorithms, anything under an NDA, that's real exposure. Open Mono Agent has no server to exfiltrate your data to because everything runs locally on your hardware. There's no, well, we may use interactions to improve our models.
09:46There's no terms and services buried there. You go pull down the code, you can see what we're doing. You can even modify it.
09:51Do what you want with it. Help us improve it. Do a pull request.
09:53Right? We're gonna continue to build on it. We have a lot of huge plans.
09:56Next week, we're rolling out our mobile apps, which will allow the mobile phone to then be able to be in control of it. We then are also rolling out a VS Code extension that's gonna continue to improve upon this. So when your AI stack is local, the compliance conversations simplifies dramatically because there's nothing leaving the building.
10:12Now we built OpenMonoAgent in C# .NET. This was by choice.
10:16.Net is cross platform, it's production grade, and has one of the most mature ecosystems for system level tooling in software development. After a long time in the industry, I've watched developers use NPM packages, use Python, and I've seen just a lot of soiled projects. C sharp gives us type safe long term maintainability
10:33and performance characteristics that matters when you're building something that's meant to run for a long time. Python's fantastic if you're doing experimentations, but C sharp is what you reach for when you need a real production thing to stand up.
10:43Now if you wanna contribute, fork it, extend it, open a pull request, help us build it out. We're continuing to add more things to it.
10:50You can even go see the road map in there, and I have a team that's dedicated to this. I have multiple senior developers. I have a full time PhD AI engineer.
10:58I have multiple other junior developers. I have a large team that's working to continue to build this faster. Why?
11:04Because we love to build custom software solutions for people, especially stuff that's built that you own. The fastest way to kill an open source project is to make it a three day configuration exercise before you get into anything in front of it. Open Mono Agent installs with a single command, not because it's simple under the hood, but because we engineered the setup to be invisible.
11:20This took a lot of time. Developer time is expensive, but if the tool costs more than the configuration and saves you in the first week, you've already lost in the argument.
11:27So we made onboarding a first class concern because that's where most developer tools go to die. So get it running, connect your local LLM, and go to openmonoagent.ai
11:36because the command's right there on the front page. Now every piece of foundational infrastructure in modern software starts as something that somebody gives away. Linux was dismissed by enterprise vendors as toy that would never handle real workload.
11:47Now you know how that ended. Right? Git replaced Subversion and ClearCase, not because it had a better sales team, because developers adopted it and was genuinely better.
11:55The pattern repeats across every technology generation. Incumbents call it a toy, developers use it anyways, and eventually becomes the default. The companies charging you for AI coding tools today are going to call local agents a toy, and they're gonna be wrong.
12:07They're betting on a pattern one more time, and I'm gonna take that bet every single time. Now, Open Mono Agent doesn't care which LLM you run, but we've picked out some really good ones that we've tuned for specific setups. Vendor lock in on AI models is not just a version of a problem the software industry has been solving for decades with varying success.
12:24When a better model ships next month, and one will, you swap it out with one rebuild. You don't even rebuild. You just literally swap the model out.
12:30The inference then continues to run on the new model, and you're off and running. The agent is the layer. The model is the engine.
12:36Changing engine should not require you to buy a new car. So that flexibility compounds over time in a way that single model benchmark scores can't match. So this is what democratizing AI really looks like.
12:47And it shows up in a lot of marketing copy in other places, but real democratization means a developer in Nairobi has the same AI coding tools that developers have at Google. No credit card required.
12:57It means students building their first serious project has access to the same category of tools as a funded startup. It means developers in countries with weaker purchasing power aren't priced out of the tools that they need to be able to compete in the market. Open Mono Agent is free because free is the only price that actually is universal.
13:12There's no purchasing power here. So that's a thesis. Not free trial, not free tier, free, permanently, because the mission requires it.
13:19Now we may work into some larger things in the future, but mostly this is an opportunity for us to be able to work with folks to be able to show and demonstrate our understanding of how AI works. So Open Mono Agent isn't just free to use. It's free to study, modify, fork, redistribute.
13:32And again, I'm putting my money so much where my mouth is that I'm even going to give one away for free. Right? We're doing this.
13:40We're gonna announce it on May 15. So go and get signed up because this is free as in free as in free. Now if you're a dot net developer or a C sharp practitioner or someone who wants to help us better the local AI tooling, dive in and do a pull request.
13:52We welcome it. But the biggest thing I can ask is star the repo. Open an issue.
13:55Tell us what's broken, what you wish it would do. The project's gonna grow as fast as the community decides that it should, And that's always the best open source project out there. And I'm willing to commit some resources to this.
14:06The companies charging you for AI coding tools are going to call Open Mono Agent a toy. And I'm just gonna plan on that. But again, remember Linux was a toy.
14:14Git was a toy. The entire foundation was a toy. These tools got called toys by incumbents who had nothing, who had something to lose from developers owning their own stack.
14:21Does that sound familiar? Now, I wanna give you guys just a little quick demo. This is gonna be super fast because we've already gotten really long on this video, but I wanna be able to show you guys how well this works here, okay?
14:31So I'm firing this up here. So we can see here that I've got a local project, and this is actually a little small snake game project that that the tool actually wrote itself, but I'm not gonna write that one for you here now. So all we do is type open
14:42mono agent. Okay? So kinda like writing Claude.
14:45Boom. There we go. Coder review found, but no tool graph.
14:48So we'll talk about Coder review graph. This is a great powerful tool. We'll talk about that another time.
14:52So let's say, review the project. Give me feedback
14:58on what we need to improve. Okay, so firing this up, you can see it's already firing through tokens. Looks like we're burning about 41 tokens per second.
15:07This is on this machine that was behind me here, so if you look back behind me, this is on this machine here that's running the 3090. Alright, now let's check the build state. So again, I'm running this and so I'm gonna say yes, I want this to, you know, to run.
15:19Okay, we're gonna tell it to give it access. Okay, so it's a minimal ASP.NET Core static file server, give this kind of the outline. It says, hey, this should be you know, so it's telling me to do some canvas wrappers.
15:30K? Game state uses all let's, you know, let's do global, so it's gonna give us some suggestions there.
15:36Oh, hey, there's no gitignore bin, right? So we've got some problems here, right? So it's telling us it's already doing the you know, going to do some modifications on this.
15:43So it's saying, for code quality, game over doesn't return anything, but it's called return game over. Right? So it's giving us examples.
15:49So I can say please now, because you know Sam told us not to say please because, like, that burns out the tokens. Guess what? We don't care about tokens.
15:54I already burned 38 tokens, 38,000 tokens, nobody cares. Please fix number one and create a git repo for this project.
16:04Okay. Let's let it go to town here. It's probably gonna ask me to do a couple it's probably gonna, you know, ask me for some prompts, because things like creating the git, was probably, see, it's gonna take a file permissions.
16:13So we can also set these two. My head's in the way, but down here in the other corner. In fact, let me get my head out of the way.
16:18We already got a prompt here. You can actually change the different modes on it, just like you would expect from any of these models. So we can see that we've, you know, it's prompting us for some of these, but we have some of our different slash commands that we can change out.
16:29It's still asking for various different permissions, and I just clicked off of one of them. So let's say yes.
16:35Oh, let's do a instead. I keep saying yes. So the git ignore isn't taking effect because this, right, and so no commit yet, so git reset head doesn't work.
16:43Let's use this. So you can see it's working through this. This would be what you would expect from like a Claude Code, right?
16:48And this is an example of, you know, what you would expect from that. So there it goes. There's our Git repo, right, initiated the Git branch, did all this, committed it, boom.
16:56And all of this is running inside. So let's Control C out of this. Hey look, we have our Git, right?
17:01CTAGenerated the Git, added all this. So you can see that all of this is that it's working great.
17:06CTASo I can go through and I can demo this for you a long time, but really the biggest thing is pull this down, try it out, sign up for the free giveaway. Go for this. This is one of the big things we're very excited about, openmonoagent.ai.
17:17CTARight? This is a great opportunity for you to be able to dive in, learn how AI works, look at things under the cover, and make sure you are running your own local AI instead of giving away all of your data to these large data providers regardless of what their terms and services say. So go check it out.
17:31CTADo us a big favor. Leave a star there. And as always, make sure you like and subscribe.
17:34CTAI'm gonna be teaching about this over the next week and teaching about some of the different features of it. So make sure you follow along because we're gonna be building a lot of this. And as always, if we can help you with custom software solutions, go check out starterpack.com.
17:45CTAAnd otherwise, we will catch you tomorrow.
— full transcript
§ 05 · For Joe

The stat-hook + manifesto format.

Self-host revolution playbook

One precise, counterintuitive number does more work than five minutes of explanation — find yours and open with it.

  • Find JoeFlow's 1.6% equivalent: cost-per-hour of Whisper API vs. local Whisper over 12 months, or what percentage of a SaaS tool's code is actually the AI vs. the scaffolding around it.
  • Pair the stat with the manifesto line in the same breath — the number creates the opening, the manifesto closes it.
  • Use the Linux/Git toy pattern for the self-host revolution arc: every tool that's now default infrastructure was called a toy. Self-hosted Supabase, Nginx, PM2 were all toys. The $6 Stack is next in the sequence.
  • The playbooks-vs-skills framing (code vs. suggestion) is the right way to talk about agent reliability for JoeFlow sessions — JoeFlow skills could be described the same way.
  • Spencer's demo ran 41 tok/s on a $1K gaming PC. If JoeFlow ever does a local Whisper benchmark, lead with tokens-per-dollar or minutes-per-dollar vs. cloud.
§ 05 · For You

How to actually own your AI stack.

For developers tired of metered cloud tools

Local AI is not a compromised fallback: the video's claim is that, for coding tasks, a roughly $1,000 gaming PC running open-source models delivers throughput comparable to cloud agents, with no per-token fees.

  • OpenMonoAgent installs in one command: visit openmonoagent.ai and paste the command on the landing page.
  • You can run the inference engine on a separate machine (even a low-power mini-PC brick drawing about 25 W) and the agent on your dev laptop; they connect via a free encrypted relay.
  • Your code never leaves your machine — important for anything under NDA or client-confidential.
  • The model is swappable: if a better open-source model ships next month, swap the engine without touching the agent.
  • It's free and open-source, and there is a giveaway of a Ryzen inference box with an announcement set for May 15. Sign up at openmonoagent.ai.
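
The own-versus-rent argument comes down to simple arithmetic, sketched below in Python. The hardware price (about $1,000), the mini-PC power draw (about 25 W), and the 20 tok/s rate come from the video; the subscription price and electricity rate are placeholder assumptions you would replace with your own numbers. The function names are illustrative only.

```python
def monthly_energy_cost(watts: float, hours_per_day: float,
                        usd_per_kwh: float, days: int = 30) -> float:
    """Electricity cost of running a box for one month."""
    kwh = watts / 1000 * hours_per_day * days
    return kwh * usd_per_kwh


def break_even_months(hardware_usd: float, subscription_usd_per_month: float,
                      energy_usd_per_month: float) -> float:
    """Months until owned hardware is cheaper than a subscription."""
    saving = subscription_usd_per_month - energy_usd_per_month
    if saving <= 0:
        return float("inf")  # never breaks even under these inputs
    return hardware_usd / saving


def tokens_per_day(tok_per_s: float, hours_per_day: float) -> int:
    """Upper bound on tokens generated if the box runs flat out."""
    return int(tok_per_s * 3600 * hours_per_day)


# Assumed inputs: $0.15/kWh electricity and a $100/month plan (placeholders).
energy = monthly_energy_cost(watts=25, hours_per_day=24, usd_per_kwh=0.15)
months = break_even_months(1000, 100, energy)
daily_tokens = tokens_per_day(20, 24)
```

At the video's 20 tok/s, a box running around the clock tops out at 1,728,000 tokens per day. The break-even figure depends entirely on the assumed subscription price, so treat `months` as a template for your own comparison and not as a result from the video.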
§ 06 · Frame Gallery

Visual moments.