WEBVTT

00:00:00.000 --> 00:00:22.415
Hermes went from zero to 40,000 GitHub stars in forty-six days, and to compare, OpenClaw did it in sixty-one. So for agentic systems, this is the fastest adoption ever seen on GitHub, and when you look at what they do, the memory systems, the identity layers, and the self-learning loops, you can understand why. But before I installed it, I did something most people don't do. I went and read through the issues

00:00:22.575 --> 00:00:31.330
and pretty quickly I realized something. The off-the-shelf systems are fast to start. But you inherit somebody else's architecture,

00:00:31.570 --> 00:00:32.450
their assumptions,

00:00:32.450 --> 00:00:37.010
and therefore their problems too. You can't fix what you don't understand underneath.

00:00:37.010 --> 00:00:46.305
So instead of replacing Claude Code, I rebuilt the parts I actually wanted inside my own setup. And honestly, it turned out ridiculously good. Not because it's better than Hermes,

00:00:46.545 --> 00:01:05.570
but because I actually understand every single layer now. And I built it in a modular way so I can swap pieces in and out, reuse workflows across projects, and evolve the system as the space changes. So in this video, I'm gonna show you the exact Hermes features I rebuilt inside Claude Code, the parts I deliberately skipped, and why understanding the architecture underneath

00:01:05.735 --> 00:01:24.900
gives you way more leverage long term than just installing something like Hermes blindly. So let's get into it. But before I show you what I built, let me show you the three hidden costs of installing something like Hermes off the shelf to save you some time and pain later. So cost number one is that you inherit assumptions that you didn't even know existed in the first place. So as an example,

00:01:25.140 --> 00:01:34.395
the infamous self-learning loop on Hermes, the bit that everyone celebrates, has no external guardrails. So you're effectively telling it to build its own skills automatically

00:01:34.475 --> 00:01:50.050
and then grade its own homework. So we've got the self-validation problem. The same model that writes the skill is also the sole judge of its correctness. So without that external validation step, it basically can't see its own blind spots. It thinks everything is good. And what that means in practice is it can quietly overwrite

00:01:50.050 --> 00:02:03.035
the changes that you've made to make your skills better with worse versions, and it has no version control or audit log. So you can say goodbye to your hard work. So cost number two is that you can't fix what you don't understand.
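
NOTE
A minimal sketch, not from the video, of what an external guardrail on a self-learning loop could look like: a check that lives outside the authoring model, plus version history and an audit log. Every name here (SkillStore, propose, has_steps) is a hypothetical illustration and not part of Hermes or any real tool.
```python
import hashlib
import time
from dataclasses import dataclass, field
from typing import Callable
# All names below are hypothetical; this is an illustration, not Hermes code.
@dataclass
class SkillStore:
    versions: dict[str, list[str]] = field(default_factory=dict)  # every accepted body per skill
    audit_log: list[dict] = field(default_factory=list)
    def propose(self, name: str, body: str, external_check: Callable[[str], bool]) -> bool:
        """Accept a new skill version only if a check outside the authoring model passes."""
        accepted = external_check(body)
        # Record every attempt, accepted or not, so changes can be audited later.
        self.audit_log.append({
            "skill": name,
            "sha": hashlib.sha256(body.encode()).hexdigest()[:12],
            "accepted": accepted,
            "time": time.time(),
        })
        if accepted:
            # Append rather than overwrite, so earlier versions stay recoverable.
            self.versions.setdefault(name, []).append(body)
        return accepted
    def current(self, name: str) -> str:
        return self.versions[name][-1]
    def rollback(self, name: str) -> str:
        """Drop the latest version and return the one before it."""
        history = self.versions[name]
        if len(history) < 2:
            raise ValueError("no earlier version to roll back to")
        history.pop()
        return history[-1]
# Stand-in for a human review or a test suite: require a "## Steps" section.
def has_steps(body: str) -> bool:
    return "## Steps" in body
store = SkillStore()
store.propose("linkedin-post", "## Steps\n1. Load brand voice\n2. Draft", has_steps)
store.propose("linkedin-post", "Just write something punchy", has_steps)  # rejected by the check
```
Because versions are appended rather than overwritten, a worse rewrite can be rejected or rolled back instead of silently replacing the good one.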

00:02:03.355 --> 00:02:13.100
So OpenClaw is one cycle ahead of Hermes. Its first version came out in November, and the first version of Hermes came out in February, but it's the same category of product. But when you look at OpenClaw,

00:02:13.180 --> 00:02:14.860
we've got over 200

00:02:14.860 --> 00:02:21.260
vulnerabilities identified and filed since February. You can see that we've got a ton of critical and high vulnerabilities

00:02:21.260 --> 00:02:24.940
that exist for OpenClaw. And a security researcher even found 386

00:02:24.940 --> 00:02:37.295
malicious packages on the skills marketplace from a single threat actor. So when something breaks at this scale, when something is critical to security, you're left debugging somebody else's code because you don't understand the assumptions underneath

00:02:37.375 --> 00:02:50.310
or the choices they made when they were building it. So cost number three then is it doesn't scale across your business. So we've got Paul here who's a nontechnical CEO. He spent over a hundred hours and over a thousand dollars testing OpenClaw

00:02:50.310 --> 00:03:08.315
over two months. He wanted to understand if the hype was real, if it could do the things that personal AI systems promised they could do, but he later found that the bugs and security gaps he identified disqualified it from being usable in any way. He's now moved on to Claude and has replicated a bunch of the functionality, 30% of OpenClaw's features,

00:03:08.830 --> 00:03:38.225
in the last couple of months. So Hermes may be faster to start, but your own setup is actually gonna be faster to scale. And the hidden costs of off-the-shelf software like OpenClaw or Hermes only show up once you're already committed and in the process of building with them. So let's get into what I actually built and what parts I lifted from Hermes. So the first thing that Hermes actually nails, and the first thing I therefore rebuilt, is the identity layer. So the agent needs to know who you are, who your business is, and what you stand for. Otherwise, every AI output

00:03:38.650 --> 00:03:59.285
is gonna sound like an AI output. So in Hermes, this shows up as a memory.md file and a user.md file. It's a super simple setup and designed for one individual client or a single business. But that's also where its limitations come in, because it's assuming that you're one person working on one set of stuff, and there's no concept of switching brand contexts, client contexts,

00:03:59.365 --> 00:04:09.470
or business contexts inside a single setup. So if you wanted to run Hermes for multiple clients, you'd effectively have to set up a separate Hermes installation for each individual client

00:04:09.550 --> 00:04:18.750
with its own memory.md and user.md files. So if you run an agency or multiple clients or even just two distinct brands of your own, you either bake it into one identity and one system

00:04:18.910 --> 00:04:45.740
in one install and live with that, or you spin up entirely separate Hermes installs, and each one of those has its own memory, its own skills, and its own learning loop. So I'm sure you can see how that creates a maintenance problem, because the skills aren't shared between the clients even though some of the procedures might be repeatable. And it's not a direct knock on Hermes, it's just what they built it for, but it's not fit for purpose for a business owner running multiple clients or multiple brands. So the way that we've built this is to effectively

00:04:45.900 --> 00:04:56.355
inject context in the same way. So we have our own identity inside a user.md file and memories inside a memory.md file, but we also inject shared brand context

00:04:56.355 --> 00:04:57.635
like voice

00:04:58.355 --> 00:04:59.315
ICP

00:05:06.310 --> 00:05:11.270
So each individual client has its own set of shared context, their brand voice,

00:05:11.350 --> 00:05:26.565
their ICP, their positioning, and their visual identity. But they're still able to actually access and share the procedures or the skills across those client folders. So we've effectively built the folder structure so you can handle multiple clients or multiple brands but still share the relevant shared context

00:05:26.645 --> 00:05:42.890
so you don't have to maintain it in multiple places. It's just one single install, whereas Hermes for multiple clients would be individual installs that each have their own memory and learnings. Now what Hermes actually does is inject the memory.md and user.md into the start of every single conversation, which drastically improves

00:05:43.050 --> 00:05:57.975
the short-term recall of important information. So let's go on now to talk about memory, which is probably the most important feature after this shared brand context for getting better results. And I've got to give it to Hermes. They've actually really thought through the way you store, inject, and recall

00:05:58.240 --> 00:06:23.565
information at various points in the life cycle. Now before we move on to that if you're enjoying the content so far then drop down below, hit the subscribe button, hit the like on the video, it's massively helpful to me. So let's get back into the memory system that Hermes uses that's actually very very powerful. So when you consider memory, we've basically got three levels here. We've got storage of context, then we've got how does that context actually get injected into every conversation, and then more long term, how do we recall memories

00:06:23.645 --> 00:06:32.400
that aren't recent but are still important? The ones that we have to go back and search for. So simply put, Hermes auto-saves and summarizes conversations

00:06:32.400 --> 00:06:43.595
on every single conversation turn. It then injects important memories back into every conversation through the memory.md, the user.md, and soul.md files. And that is capped

00:06:43.675 --> 00:06:45.595
at, I think, 1,300

00:06:45.595 --> 00:06:49.195
tokens, which means we're only loading in a limited snapshot

00:06:49.195 --> 00:07:00.160
of recent important information for every session. But its biggest limitation is when you go back to actually recall the information that has not been injected into that recent memory and that's because it's searching by keyword

00:07:00.240 --> 00:07:10.275
and not meaning. So we might be able to recall exact long term memories if we remember the words we used when we were talking to Claude but it's much harder if we can't exactly remember

00:07:10.275 --> 00:07:18.515
what words we used when we talked to Claude about it, which is pretty likely. Right? And that kind of renders long-term recall in this case a bit useless.
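
NOTE
A minimal sketch, not from the video, of the two behaviours described above: a capped snapshot of recent memories, and keyword recall that finds an exact phrase but misses a paraphrase. The helper names are hypothetical, and the cap is counted in characters here purely to keep the example simple.
```python
# Hypothetical helpers for illustration; not Hermes code.
def build_snapshot(memories: list[str], cap: int) -> str:
    """Keep the most recent memories that fit under the cap, newest first."""
    kept, used = [], 0
    for entry in reversed(memories):
        if used + len(entry) > cap:
            break
        kept.append(entry)
        used += len(entry)
    return "\n".join(kept)
def keyword_recall(archive: list[str], query: str) -> list[str]:
    """Return archived entries that contain every word in the query."""
    words = query.lower().split()
    return [m for m in archive if all(w in m.lower() for w in words)]
archive = [
    "Client Acme wants a warmer tone in newsletters",
    "Acme pricing page launches in March",
    "Prefer short LinkedIn hooks",
]
snapshot = build_snapshot(archive, cap=60)  # only the newest entry fits
exact = keyword_recall(archive, "warmer tone")  # same words as the stored memory
paraphrase = keyword_recall(archive, "friendlier voice")  # same meaning, different words
```
The paraphrased query comes back empty even though the memory exists, which is the gap that searching by meaning is meant to close.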

00:07:18.595 --> 00:07:35.150
Who remembers the exact words they used with a client six months ago in that conversation they were having with Claude? And this is where it gets really powerful when you're building a custom setup, because we can take the stuff that we like about Hermes, the stuff in green, like the fact we are capping a memory.md file at 2,500

00:07:35.150 --> 00:07:42.585
characters or 1,300 characters and injecting that as recent memory into the conversation through the memory.md file. Then where there were limitations,

00:07:42.745 --> 00:08:09.585
like in the recall where we only had keyword search, we can take other memory systems, like MemSearch in this example, and make recall much more powerful, and that's exactly what we've done with our own agentic operating system. So we're still using some of the recall patterns from Hermes, where we effectively check that injected context first, but then when the information is not found in that local memory, we go deeper and actually search by meaning and not by keywords. And that's part of the MemSearch architecture,

00:08:09.585 --> 00:08:25.210
not the Hermes architecture. So you can plug and play the bits that you like when you build your own custom system and make it bespoke for your context. Say you needed verbatim recall, you might implement MemPalace instead of MemSearch, for example. Now here's the bit where Hermes gets controversial,

00:08:25.370 --> 00:09:33.165
which is that self-learning loop we talked about earlier. So one of Hermes' biggest selling points is the self-learning loop. When an agent finishes a task, it's gonna effectively write itself a new skill every time and use it the next time, which sounds brilliant in theory. And the first time it happens, it's probably pretty special. But what happens by the tenth skill or the twentieth skill, when you've made tiny iterations on effectively the same process? So effectively what we're doing is we are starting on day one. We are telling it to do a specific task. And then a couple of weeks later, when we come back to do a similar task, it's gonna create a second skill, so we have two skills that are fairly similar, have a similar description, but are kept and maintained as separate skills, because it's not gonna capture the nuance in our process. And we also have poor visibility of all the skills that already exist, so it's just gonna continue to create more skills. And each one is gonna capture that approach at that moment in time, with that context, for that specific situation. So over time you risk ending up with 15 skills that all do roughly the same thing, like LinkedIn post v1, v2, LinkedIn post for this client, for that client, post writer one and two, all with slightly different context

00:09:33.245 --> 00:09:42.765
and slightly different bits of logic baked in. They've all got similar descriptions, so it doesn't know which one to use at any given time. Then when your brand voice shifts or when a client's positioning changes,

00:09:43.180 --> 00:10:39.005
you've got like 15 places to go and update and maintain it. So yes, it's absolutely faster to build this way initially, but it's a hell of a commitment to actually maintain properly, and therefore basically impossible to scale across multiple clients without the whole thing turning into a bit of a mess. Now in our own agentic OS, we've created a whole logic in house around how to tackle this, and we call this skill systems. So a skill shouldn't be just a one-off task. A skill is a modular component that feeds into a skill system. So each one does one job. It lives in one place. It has a consistent naming format and gets updated in one place, and all the updates propagate to the rest of the system. So when you want to do something complex, like write a LinkedIn post in your brand voice for a specific audience in a specific format, you don't create a "write a LinkedIn post" skill that bakes in all of these things. You actually have the voice, the ICP, and the formatting already maintained as separate skills, and then the LinkedIn post system just grabs the correct context,

00:10:39.430 --> 00:10:58.895
the up to date context from one single file for the voice, for the ICP, and the formatting. And then this skill or skill system prompt is effectively chaining those together in the right order. So when your brand voice does shift, you just have one file to update and then every skill system that uses that is gonna pull from that single file. So it's infinitely

00:10:58.895 --> 00:11:20.440
maintainable and scalable. So Hermes is faster to build the first skill, but building your own approach is gonna be faster for the tenth or the hundredth skill system that depends on those skills, and infinitely easier to maintain. So it raises the question, should you build this for yourself or grab something off the shelf? Well, if you install someone else's stack, you've basically inherited their assumptions about identity,

00:11:20.915 --> 00:12:09.560
memory, about how their learning loop should work, about whether you'll need multi-client context. And some of those assumptions might work for you. And Hermes is great as an off-the-shelf option compared to something like OpenClaw, which was a lot more buggy. But some of those assumptions might not work for you, and then you're left trying to maintain or fix the broken parts, versus just building it more slowly for yourself, understanding the assumptions, and making it more scalable. So if you are building it for yourself, you're making those choices on purpose. Yes, you will move more slowly. You'll get some of it wrong, but every layer is something you can see, edit, and actually reuse. You can build it in that modular way. And when something does break, you'll have better knowledge of how to find the part that's broken and fix it, so it's maintainable in the future. So that's effectively the trade-off. It's gonna be faster to start with Hermes but faster to scale with your own built setup.
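
NOTE
A minimal sketch, not from the video, of the skill-system idea discussed earlier: each client keeps its own context, shared pieces live in one place, and a skill system only names what it chains together. The client names, keys, and the build_prompt helper are all hypothetical and are not the author's actual implementation.
```python
# Hypothetical layout: one context store per client, one shared set of skill systems.
CONTEXT = {
    "acme": {"voice": "Plainspoken and warm", "icp": "Ops leads at logistics firms"},
    "globex": {"voice": "Bold and technical", "icp": "CTOs at fintech startups"},
}
SHARED = {"linkedin_format": "Hook, three short paragraphs, one question"}
# A skill system lists the pieces it chains, in order, instead of baking them in.
SKILL_SYSTEMS = {"linkedin_post": ["voice", "icp", "linkedin_format"]}
def build_prompt(system: str, client: str, task: str) -> str:
    """Assemble a prompt from the client's context plus shared pieces."""
    parts = []
    for key in SKILL_SYSTEMS[system]:
        # Client-specific context wins; otherwise fall back to the shared piece.
        value = CONTEXT[client].get(key, SHARED.get(key))
        if value is None:
            raise KeyError(f"no context named {key!r} for client {client!r}")
        parts.append(f"{key}: {value}")
    parts.append(f"task: {task}")
    return "\n".join(parts)
before = build_prompt("linkedin_post", "acme", "Announce the new dashboard")
CONTEXT["acme"]["voice"] = "Direct and data-led"  # one update, in one place
after = build_prompt("linkedin_post", "acme", "Announce the new dashboard")
```
After the single voice update, every prompt built for that client picks up the new wording, while the other client's context is untouched.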

00:12:09.640 --> 00:12:55.680
And neither is gonna be the right answer for everyone. Right? It's just a personal choice. Now I'm definitely not saying my version of the agentic operating system, or every custom version, is better than Hermes in every way. Absolutely not. But I understand exactly what assumptions have been made under the hood, and I can build on it in a modular way, in a slower way, that's gonna end up being completely custom to my own setup. So if you want my exact Agentic OS, it's inside the Agentic Academy in the description below. It basically installs in one line, so you can get it up and running today. And we run through exactly what's inside the OS and all the logic, so you're not just left installing something again without understanding the assumptions. You can plug and play the parts you like and leave out the stuff that doesn't work for you. Now if you want to see more of what we've got inside our agentic operating system, watch the next video. Thanks for watching.
