WEBVTT

00:00:00.000 --> 00:00:07.200
When you combine Hermes Agent with tools like DeepSeek, you unlock capabilities that 99% of people don't even realize exist.

00:00:07.200 --> 00:00:14.560
And I'm gonna show you exactly how to connect Hermes with the world's most powerful models, which means that you can use Hermes from $0,

00:00:14.655 --> 00:00:23.855
build as much as you want with no limitations or rate limits, and my new system for using Hermes' most powerful feature that will make you 10 times more productive,

00:00:24.095 --> 00:00:36.770
even if you're a complete beginner. And if you're new, I'm Jack. I built this on my life tech startup with the Gizling customers. Now I'm building my own AI businesses, and I just share the stuff that actually works. So if you haven't already, grab that beautiful coffee,

00:00:36.930 --> 00:01:16.765
and let's dive straight in. Beautiful. So let's talk about Hermes, then, and exactly how we leverage this with DeepSeek. As I'm sure you know, if this Hermes thing is new to you and you wanna know how to set it up, I'll put a link on screen somewhere so you can get it all rocking and rolling. And now, officially, we can actually jump in and have a look at Hermes plus DeepSeek. And I'm not just talking about DeepSeek, although DeepSeek is gonna be doing some heavy lifting for us, but you've gotta see why, and why I haven't seen many people talk about the strategy I'm gonna cover here. So the first thing we have to understand, though, before we even take any step, is that software now costs less than minimum wage. So essentially, the balance has shifted such that we can employ AIs that are as good as, if not better than, junior developers,

00:01:16.845 --> 00:01:26.260
to do things for us over time. So the question simply becomes for us, how many minutes of the day do we have basically AI systems working for us, building,

00:01:26.500 --> 00:01:49.895
improving things whilst we're going about our day-to-day things using Hermes. Now, we talk a lot about building no-code software with Claude Code. We talk about Hermes. Remember, they're two very different things. Claude Code lives inside repos. It's got a tight tool loop, it's session-bound, and it's built for codebases. Hermes lives across our entire life. It's persistent. So it learns from every task that we give it. It is self-evolving

00:01:49.895 --> 00:01:58.890
in that sense. The more stuff we say, the better it gets. It schedules background jobs, and essentially the idea of Hermes is that it builds a deep model, an understanding

00:01:58.890 --> 00:02:11.755
of who you are. And the better it knows you, the better it can help you with your life. And the whole point is that systems like this connect Claude Code to Hermes quite nicely. So let's talk about the reflection of pair. Okay? So here's a learning loop. Every task teaches Hermes

00:02:11.755 --> 00:02:12.555
essentially

00:02:12.555 --> 00:02:21.560
who you are, and that's how it actually gets better and understands more things like this. Hermes itself works whilst you sleep, so we're gonna give it something in this video that shows you how you can leverage

00:02:21.880 --> 00:02:31.320
DeepSeek overnight to do some pretty exceptional stuff for when it's valuable. So the idea with this is that effectively we pick our brain. We pick a frontier model, the most intelligent

00:02:31.555 --> 00:02:35.555
and effective model to be what we'd call the conductor, the organizer.

00:02:35.635 --> 00:02:52.610
I have personally found that Claude Opus 4.7 is the best model for that. And the idea here, then, is that Hermes can hot-swap between every single frontier model and use the right model for the right job. Specifically, I'm gonna show why DeepSeek V4 is so powerful

00:02:52.850 --> 00:03:32.235
in this setup. I've seen a lot of interesting ideas about running it on your computer, and that is really fun. Sometimes you'll find that your literal, like, MacBook will melt through the table because of, like, the amount of effort it requires. So there are some different trade-offs to be aware of here. And we can even see here just how powerful DeepSeek V4 is. Now, the message from this graph, because we don't wanna date a benchmark. Like I always say, we've seen it in the profile; we've gotta take her on a date first, see what's up. It's not trying to say that DeepSeek V4 is better than Opus 4.7. Of course it isn't. But the question is, would you pay 1% of the price for 95% of the value? And if we can truly get, like, a 100x output and we don't need that full max-level, redlining

00:03:32.235 --> 00:03:33.115
brainpower,

00:03:33.115 --> 00:03:34.475
what could we realistically

00:03:34.475 --> 00:03:35.275
accomplish?

00:03:35.275 --> 00:03:52.380
That's kind of the question here. And you can see just how comparable they are to each other, and how powerful DeepSeek V4 is and how well it has actually done here. And this is the benchmark, just for your reference. And so the idea is it's around 100 times cheaper, and we get the same job done overnight, which is freaking fantastic. And you can see here, $75

00:03:52.380 --> 00:03:55.455
per million output tokens versus just 87¢.
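
NOTE
Editor's sketch of the price arithmetic quoted in this cue. The two prices are the figures read out in the video ($75 and 87 cents per million output tokens) and are not checked against any price list. The function names and the 10-million-token workload are hypothetical, made up for illustration. On those figures the gap works out to roughly 86x, a little under the "100 times" quoted.
```python
# Prices as quoted in the video, in US dollars per million output tokens.
FRONTIER_PRICE_PER_M = 75.00  # the frontier-model figure read out on screen
DEEPSEEK_PRICE_PER_M = 0.87   # the DeepSeek figure read out on screen
#
def cost_usd(output_tokens: int, price_per_million: float) -> float:
    """Cost of generating `output_tokens` tokens at a per-million-token price."""
    return output_tokens / 1_000_000 * price_per_million
#
def price_ratio(expensive: float, cheap: float) -> float:
    """How many times more expensive the first price is than the second."""
    return expensive / cheap
#
if __name__ == "__main__":
    tokens = 10_000_000  # hypothetical overnight workload
    print(f"frontier: ${cost_usd(tokens, FRONTIER_PRICE_PER_M):.2f}")
    print(f"deepseek: ${cost_usd(tokens, DEEPSEEK_PRICE_PER_M):.2f}")
    print(f"ratio:    {price_ratio(FRONTIER_PRICE_PER_M, DEEPSEEK_PRICE_PER_M):.1f}x")
```
Swap in current prices before relying on the numbers; only the arithmetic is the point here.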

00:03:55.455 --> 00:04:02.495
I know what you can buy with 87¢. You can't even buy a hamburger these days with that amount of money. And so the idea here is that we're gonna tag in OpenRouter.

00:04:02.495 --> 00:04:11.310
OpenRouter is overpowered because once we give it access to OpenRouter, we can see our usage, we can track our usage in a beautiful dashboard. And on top of that, we can access

00:04:11.470 --> 00:04:27.215
all of the models and switch between them dynamically whenever we want to, without needing to hold and manage a thousand different keys and track usage over here and track usage over there. In other words, very powerful. It's one key. It unlocks everything. And this brings us nicely onto the idea of the multi-brain model system,

00:04:27.295 --> 00:04:43.290
being that essentially every model has its own strengths and weaknesses. And we can build systems that bring in the best model for the particular job, meaning it runs 24/7, even overnight. And so first of all, two models that you absolutely should have in your Hermes agent. Number one is going to be OpenAI's

00:04:43.290 --> 00:04:48.555
ChatGPT because for your $20 subscription, you effectively get to use your ChatGPT,

00:04:48.555 --> 00:04:53.035
which is gonna be insane amounts of value and gives you access to ChatGPT 5.5.

00:04:53.195 --> 00:04:56.075
The second is the Gemini CLI,

00:04:56.075 --> 00:05:11.090
and with simply an email address and a Google account, we can use Gemini. So this is an example of leveraging a model for a specific thing. Then I'm gonna show you how DeepSeek comes into this and makes it incredible. So check this out. I can say, for example, hey there, I would like you to use the Gemini

00:05:11.090 --> 00:05:45.720
CLI to go ahead and look at Jack Roberts's last video and give me a breakdown analysis, visually, of what you see in the first ten seconds, and send that one off. And CLI just stands for command line interface. It's just a very quick and easy way to connect to any service. If you don't have the Gemini CLI installed, all I'm going to do is come over to this GitHub repo right here, click on Code, click copy on that, and then head over to your language model of choice, which could be Claude Code, Codex, or Antigravity. And so for example, if I'm here, I'm just gonna give it this command, which is: hey there, I'd like you to install the Gemini

00:05:46.095 --> 00:05:56.575
CLI onto my computer. Okay? And all it's gonna do is go ahead and grab that GitHub repo. And as you can see, I've already got mine installed. And a CLI works exactly the same way with GitHub, with OpenAI,

00:05:56.575 --> 00:06:02.260
with Vercel. It's incredible. Now let's come back over and see how it's gotten on. So check this out. It's used the CLI. Fantastic.

00:06:02.420 --> 00:06:33.340
It's got my YouTube channel, and it's literally broken down visually all the stuff it can see, because Gemini is so powerful at breaking down video. It is the multimodal model, and we can now basically tag it in. And this is one cool thing about Hermes: it can bring in any model it wants to, and this is running on my local computer. Right? And so essentially, I have the Gemini CLI on that. So if I say, hey, use it, it can literally use it for us right there, directly. And if it was hosted somewhere, we'd need to basically install it there. But it's fantastic. And look, it's breaking down the way that I move in my first ten seconds.

00:06:33.500 --> 00:07:01.510
But this is just the beginning. It actually gets way crazier than this. Now, I've been playing with loads of different models. One of the techniques I found is called the council, or the triad. And the idea is we have this super-intelligent model, and we bring in DeepSeek V4 to do an insane amount of heavy lifting, then we have a super-intelligent model that reviews and delegates. But we need to build it into a proper system. Now, there's a couple of things that you need to know about OpenRouter to get the most out of this model. Some might actually surprise you. These here are expressions

00:07:01.510 --> 00:07:14.150
that we can add to the end of any model name, and they effectively do some really interesting and useful things that are gonna help us with Hermes. First of all is Nitro. So this can be appended to any model, and it auto-routes to the fastest provider

00:07:14.695 --> 00:07:21.335
at that moment. For example, anthropic/claude-opus:nitro. Fantastic. You've got Exacto,

00:07:21.335 --> 00:07:24.375
which sounds a little Italian, but it's very fantastic, or esatto.

00:07:24.455 --> 00:07:26.135
Only providers rigorously

00:07:26.135 --> 00:07:27.975
certified for tool calling accuracy.

00:07:28.460 --> 00:07:37.020
Freaking really awesome. Right? Because if we're having these like systems and models doing agentic things for us, in other words, they need to tool call. They need to check databases.
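
NOTE
Editor's sketch of the suffix idea from the last few cues: a variant such as nitro or exacto is attached to the end of a model name after a colon. The slug "anthropic/claude-opus" is simply the name as spoken in the video, not a verified model identifier, and with_variant and KNOWN_VARIANTS are hypothetical helpers, not part of Hermes or OpenRouter. Check the provider's model list for real slugs.
```python
# Variant suffixes mentioned in the video, written after a colon at the end of a model slug.
KNOWN_VARIANTS = {"nitro", "exacto"}
#
def with_variant(model_slug: str, variant: str) -> str:
    """Return `model_slug` with `:variant` appended, replacing any variant already present."""
    if variant not in KNOWN_VARIANTS:
        raise ValueError(f"unknown variant: {variant!r}")
    base = model_slug.split(":", 1)[0]  # drop an existing ":suffix", if there is one
    return f"{base}:{variant}"
#
if __name__ == "__main__":
    # Placeholder slug as spoken in the video; substitute a real one.
    print(with_variant("anthropic/claude-opus", "nitro"))
    print(with_variant("anthropic/claude-opus:nitro", "exacto"))
```
Keeping the suffix logic in one helper means an agent config can switch between "fastest provider" and "best at tool calling" without retyping model names.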

00:07:37.340 --> 00:08:03.970
You know, not every model is great at tool calling. So we only wanna grab the ones that are great at doing that thing. Another one we have on the smart-routing side is openrouter/auto. So this picks the best model for your prompt. It's powered by Not Diamond, no extra fee. That's pretty cool. Right? That's pretty fantastic that it can do that. Then you've got, basically, bring your own keys, which is fantastic. So for example, if we're using something that's getting rate limited, like DeepSeek V4, we can actually bring our DeepSeek key into OpenRouter

00:08:03.970 --> 00:08:14.355
just to save us that time and make the whole process easier. We've got fallbacks. And the last one is zero completion. So you're never charged for blank or error responses. Across their customer base, that saves almost $20,000

00:08:14.355 --> 00:08:22.355
a week, which is pretty handy, we might say. So let's talk about the triad and how DeepSeek actually physically fits into this. The idea is three different models. One verdict,

00:08:22.675 --> 00:08:25.635
no single brain. No brain isolation.

00:08:25.635 --> 00:08:29.020
Okay. So if you think of it like this, this is the general strategy.

00:08:29.020 --> 00:08:37.900
This is very well reflected in research. I think the triad sounds a lot cooler. I think it needs a little bit of a rebrand. Triad sounds cool to me. The idea here is we have plan, we have execute,

00:08:38.305 --> 00:08:55.470
and critique. And I've often found, genuinely speaking, that I will never ship anything unless it is severely and brutally critiqued. I actually use the words brutally critique because I want it to be as critical as possible. Criticizing is a skill set in and of itself. The idea here is that we have Claude Opus 4.7,

00:08:55.470 --> 00:08:58.190
which at the moment is the king, ruthlessly,

00:08:58.190 --> 00:09:06.590
okay, planning. Now, once it's created a plan, we have DeepSeek, the giant whale, okay, at like one one-hundredth of the cost for 95% of that performance,

00:09:06.945 --> 00:10:19.930
doing all the heavy work. Some say it's DeepSeek labor. I don't know. Call it what you want to, but it's working hard for us day and night. And this is the DeepSeek that can churn in the background for twenty-four hours while we're sipping our lattes and enjoying our beautiful days and spending time with our families. And then we have a critic model, which is just going to essentially pick that apart and make sure that it's correct. And then, again, we have the planning model, and it works in a beautiful circle like that. So for example, we have Claude Opus, which can decompose a task, write the brief, and execute the workflow. DeepSeek V4 is gonna grind through the plan overnight. It's cheap enough to retry often if it doesn't get it right. And then we can bring in a different model for the critic. It doesn't have to be Gemini 3. It could be any model that you want. I typically like it not to be Opus; I like to just get a slightly different flavor, a different scoop of ice cream, if you will, from a different, very capable model. Most likely ChatGPT 5.5, but you can tag in Gemini if you want to. And to do this, I'm gonna be using what I call the Pantheon. So last video, I showed you how to pull up this entire beautiful Hermes operating system, which effectively is a beautiful dashboard that allows you to connect Hermes to your Claude Code operating system, because we do coding on our computer. Right? This gives you an overview of your spend, your costs. This dreams for you overnight. So based on your entire chat history with Claude, your usage,

00:10:20.010 --> 00:10:25.930
how you're using ChatGPT and Claude and every model on your computer, this will give you dynamic feedback and suggestions.

00:10:26.010 --> 00:11:08.665
It is auto-dreaming. You can mark this off as done, go through these things, and it's incredible. What's really powerful here is we can connect this to Hermes, which basically means that Hermes has access to all the data and everything you're doing with coding. It shows you all your skills and all of your fantastic memory systems. Now, in Hermes itself, one of the really interesting things that we can do with the Hermes agent here is actually connect it to our memory systems, and we can connect it to everything we're doing. I'll put a link on screen for that full guide breakdown. If you wanna check that one out, you might find that one super helpful. So there will be a link in the description. I referenced that video earlier, so you get it. Obviously, you can chat to Hermes in the chat if you want to. It's just helpful to have it all in the one dashboard. But what I wanna look at here is the Pantheon. Now, you can do this just in Hermes chat. I like to do this because I like to have a visual

00:11:08.905 --> 00:11:11.465
look at everything that I'm particularly designing.

00:11:11.545 --> 00:11:19.820
I just find this way easier to do. So I'm gonna add in a persona. Let's call this one something like Orpheus. That sounds fantastic. I'm gonna give it a job. Right? Deeply reasons

00:11:19.900 --> 00:12:05.435
on any topic. Okay? So this is gonna be a very powerful one. And I pulled together for you a template for this triad system that effectively breaks down the flow so that Hermes actually understands how it works. And it breaks down into three separate prompts. We have Opus the conductor, who's the conductor of the Hermes triad. We have DeepSeek the worker: you're the worker, here's the loop, you read the brief, you identify three to five angles, listed by the conductor. And then we have GPT 5.5, who is the critic that looks it over and kind of assesses it dynamically. All you need to do is literally come down here, grab the flow like so, copy all the stuff. Obviously, if you're using the Hermes dashboard, awesome. If not, you could just paste this into Hermes. And so I've just pasted mine in here. And here, I'm gonna add a little bit of description, which basically explains in a few lines what this persona is for and when to summon them. This is for when I want to go very deep on a topic.
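
NOTE
Editor's sketch of the conductor, worker, critic loop described in this cue. This is not the template from the video and not how Hermes implements personas. Every name here (run_triad, TriadResult, the prompt wording, the APPROVED convention, max_rounds) is an assumption for illustration. Each "model" is just a function from prompt text to response text, standing in for whatever API call you would really make.
```python
from dataclasses import dataclass
from typing import Callable
# A "model" is any function from prompt text to response text (a stand-in for a real API call).
Model = Callable[[str], str]
#
@dataclass
class TriadResult:
    brief: str      # the conductor's plan
    draft: str      # the worker's latest draft
    rounds: int     # how many critic rounds were run
    approved: bool  # whether the critic signed off
#
def run_triad(goal: str, conductor: Model, worker: Model, critic: Model, max_rounds: int = 3) -> TriadResult:
    """Plan with the conductor, execute with the worker, and loop on the critic's feedback."""
    brief = conductor(f"Write a one-page brief for this goal: {goal}")
    draft = worker(f"Brief:\n{brief}\nProduce a first draft.")
    for round_number in range(1, max_rounds + 1):
        verdict = critic(f"Brief:\n{brief}\nDraft:\n{draft}\nReply APPROVED or list the problems.")
        if verdict.strip().upper().startswith("APPROVED"):
            return TriadResult(brief, draft, round_number, True)
        # Not approved: hand the critique back to the cheap worker and try again.
        draft = worker(f"Brief:\n{brief}\nDraft:\n{draft}\nFix these problems:\n{verdict}")
    # Out of rounds: return the last revision, flagged as not approved.
    return TriadResult(brief, draft, max_rounds, False)
```
The round cap matters for overnight runs: without it, a critic that never approves would keep the worker spending tokens forever.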

00:12:05.595 --> 00:12:06.635
We're gonna leverage

00:12:06.800 --> 00:12:10.000
Opus as a conductor. DeepSeek is an extremely

00:12:10.240 --> 00:12:35.545
deep workhorse that can work for hours on the topic, and then we're gonna have ChatGPT review everything. This is for extremely powerful deep work. Awesome. And then we're happy with that. Again, you can just give whatever information you want to Hermes. I personally find the system really helpful. I like visually seeing everything. I think it's fantastic. Then pick the model that you want orchestrating it. So for us, it's Opus 4.7, and then we just click on create Orpheus. And then when that's done, your dashboard will reflect this. Or, of course, you can just give it to your guy in Hermes, and I can click on Orpheus.

00:12:35.625 --> 00:13:55.835
And let's have a look at what they're doing, and we can see we've got the flow. So if at any point I just wanna come in and amend it, I can do that, which is really cool and just makes things a lot easier, I think, which is fantastic. But we're good. So now we just wanna sync these. So I'm just gonna come down here. I'm gonna copy this syncing prompt and shoot over to Hermes. And I'm just gonna come up to the top of the screen and drop this bad boy in here so we can have a little bit of a conversation. Now, obviously, it's great. Again, I just like to visually see this. And once we've done that, what we need to do then is connect OpenRouter. And to do that, we've got to use the terminal. So for example, we can do that on our computer, or we can use Antigravity. If I do Control+Space and I just type in terminal, we have this guy pop up. Now, how do we actually solve this? Many different ways. Easiest way to do this, if you haven't already: I'm gonna grab the Hermes setup button real quick. Actually, the model setup button. Then you're gonna do Command+Space, type in terminal, and it will appear. And we just enter hermes setup model. And then from here, you can basically select all the ones that you've got. And what you're gonna find on here is OpenRouter. So you're gonna press the space bar to select it, like so. And then I've already got mine in, so I'm just gonna keep that one. But here, you'd be prompted for an API key, and we just grab that from OpenRouter, which is this website here. And again, it shows you all the different performance, which is really interesting, to actually see where the center of gravity goes. But then you can move to OpenRouter here, click on get an API key, then you simply come over to new key, and then you can create it. You also have this BYOK on the left. If I click on, basically, bring your own keys, I can search all the providers. So I can come down here, and I'm gonna type in DeepSeek.

00:13:55.835 --> 00:14:12.715
I can configure this real quick. Beautiful. And I can add in any API keys I want to. So we're gonna add a priority API key. And I recommend this for DeepSeek just so you never hit any rate limits. Then you just type DeepSeek API key into Google or Brave, and you're looking for the DeepSeek platform. Again, you just sign in, drop like $5, $10,

00:14:12.715 --> 00:14:42.515
as much as you want to into DeepSeek. You are gonna be using it a lot, so I'd throw some good money in. You don't have to get crazy, obviously, but put some money in there, and then you literally copy and paste that key here, and then you have essentially your own keys for DeepSeek if you want to, so you never get rate limited, just in case. And then essentially, at this point, we can do anything we want to. Again, we could chat in Telegram, but I could come up to Hermes and be like, hey there, dude, I'd like to use Orpheus to solve a problem for me. Could you just confirm you know what Orpheus is and let me know if there's anything that you need from me? Send that one off like that. Obviously, what I'm using for speech-to-text is Glidel.

00:14:42.595 --> 00:14:54.675
It enables me to yap and get a lot more yapping done in the time, which is fantastic. And just like that, we have a response. So: I know Orpheus. Orpheus is your deep work triad persona. Opus acts as the coordinator, first interrogating the problem with five to ten clarifying questions,

00:14:55.210 --> 00:15:56.940
then it writes one tight, one-page brief for the worker/critic loop. The intended flow is: you give it a goal, Orpheus asks clarifying questions, then writes a brief, the worker attacks it from multiple angles, the critic tears it apart until it's shippable, and then Orpheus validates the final artifact. And what I need from you is the problem you want solved, your desired output format, how deep you want me to go, deep dive or just work-for-hours mode, and any constraints: budget, tools, deadline, audience, tone, things to avoid, and what success looks like. Little hack for you as well, guys. You'll know, for example, that Hermes itself saves information in its agent memory and its soul. Here's what I would love you to do for your soul, and I'll publish it for you as well, and I'll put a link for it down below so you can grab it. And so this is the soul.md. Effectively, come down and just replace it. This is exactly how to run it and how to use it. It's stuff on your identity, your mission, and goals. What's your goal this year? I wanna hit $20,000 a month in my SaaS. I wanna double my investment portfolio. I want to acquire seven companies. Whatever it is, maybe you're a billionaire and you wanna be a trillionaire. Whatever the thing is, make sure Hermes knows what it is. Explain your business, your revenue, your runway, bank payments,

00:15:57.100 --> 00:16:21.135
any key information you want it to know, key metrics that Hermes needs to check and be aware of, voice and communication details, how exactly you would like to have it speak to you, short by default, one question at a time, how it needs to write to you, the rhythm, all this detail. Take a look at this. If there's anything else you want, you can add it in there. But honestly, these are just some really great questions. You just feed it this information about yourself, then Hermes is gonna have that fantastic context, and then that will appear in your soul.md.

00:16:21.320 --> 00:16:24.840
And you can even say, hey there, I want you to add this to my soul.md.

00:16:24.840 --> 00:17:00.320
Go hands-free mode, and then just yap to your heart's content. And literally, that will cover everything. Hey there, I'd like to use my Orpheus skill. Could you just confirm to me anything that you need to know about that persona before I give you a task, please? This is cool. Orpheus persona, what I need to ask for a task. Opus is the conductor. DeepSeek is the worker. GPT 5.5 is the critic. It wants to know the desired outcome and success criteria, so let's just tell it. Hey there. I would love to sell websites and AI services to businesses. My desired outcome is to know which niche I should personally choose. Success criteria is a list of the top three niches that have effectively high margin,

00:17:00.560 --> 00:17:02.160
maybe typically unsexy

00:17:02.115 --> 00:17:06.115
and untargeted, but have a high need for AI and automation services.

00:17:06.755 --> 00:17:19.100
Could you do a quick one for me? Maybe like ten minutes. I think that's pretty sharpish on this. The audience is me. The constraint is just keep it nice and snappy and short. Not too short, obviously, like a good amount of detail. I want emojis per thing.

00:17:19.340 --> 00:17:21.980
And then the known assumptions, what do I already believe?

00:17:22.300 --> 00:17:57.230
Well, I have some beliefs that businesses like roofers and pool cleaning companies would be an excellent start, based on my experience. And output format, yeah, just give it to me in emojis and breakdowns. I know you're set to interrogate me first, but yeah, you can ask me a couple of clarifying questions if you'd like to. So you can already see this process and how valuable this is. I send this off. Now imagine actually sending any important decision through this. Like, if you just ask Claude directly, I've actually found, myself, it just agrees with you. It sometimes just agrees with you for no reason. Like, dude, you're just agreeing with everything I'm saying. This isn't good. You have to build in interrogators.

00:17:57.390 --> 00:18:29.430
And the beauty of this triad strategy here, guys, is the fact that we've got different models. We have critique. So think of the number of loops. The idea of progress. Right? WD-40 effectively solves, like, 90% of household ailments. Right? The only reason it got there is they had a very quick improvement loop. Like, WD-40 was the fortieth version, the one that actually worked, hence the name. And think about how many incremental improvements you get when you have a critic and review agent running around like this. And then we have, obviously, Opus 4.7 that's setting the strategy. It's just gonna make you unbelievably

00:18:29.430 --> 00:18:33.995
effective. Like, it's insane. So when I scroll down, I'm effectively hearing what it's understanding.

00:18:34.075 --> 00:18:52.390
It's picking the top three niches, which is great. Success criteria. Quick clarifying questions. Cool. So geography is gonna be worldwide. Actually, let's do Texas, please. Offer shape, I'll let you tell me what you think is gonna be most effective for that. Price point, I'm gonna guess again, that could be, you know, 1 to 15K. That's absolutely fine.

00:18:53.110 --> 00:19:00.555
Sales motion, I'll let you lead on what you think the best one is gonna be. And to basically guide my decisions, guide my thinking on it, I think that'd be fantastic.

00:19:00.715 --> 00:19:52.830
And then, yeah, go ahead and let me know what the output is. Now, I've just used this as a random example, but you get the idea of how the system could actually work. One little tip that might save you as well is to add in a fallback. So for example, if tokens ever run out, default to x, y, z instead. So it can physically do that for you. Beautiful. So now I come down and look at this response here. I've got fire, water, mold restoration. It's saying why it wins. Emergency leads can be insanely time-sensitive. One job can be 3 to 50,000. Guys, if you're not working with these companies, after this video, we need to go ahead and do that right now. Emergency lead capture is really cool. It's giving you pricing ideas. It's giving you, you know, foundation repair, drainage, waterproofing, and all the reasons why. And what I could do is basically come down and say, this is awesome. Could you just explain to me your thinking behind this? What each model did and how you arrived at this conclusion, please, so I can best understand that. And so let's see what it did here. And I just do this to show it's working. I feel like a math tutor now, and Hermes is my little student telling me everything it's done.
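
NOTE
Editor's sketch of the fallback tip from this cue ("if tokens ever run out, default to x, y, z instead"). This is a generic try-in-order pattern, not a Hermes or OpenRouter feature as implemented. ask_with_fallback and the (name, model) pairs are hypothetical, and each model is again just a function standing in for a real API call.
```python
from typing import Callable, Sequence
# A "model" is any function from prompt text to response text (a stand-in for a real API call).
Model = Callable[[str], str]
#
def ask_with_fallback(prompt: str, models: Sequence[tuple[str, Model]]) -> tuple[str, str]:
    """Try each (name, model) pair in order; return (name, response) from the first that succeeds."""
    errors: list[str] = []
    for name, model in models:
        try:
            return name, model(prompt)
        except Exception as exc:  # broad on purpose: out of credit, rate limited, provider down
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))
```
Returning the name alongside the response lets you see afterwards which model actually answered, which is handy when an overnight run quietly dropped to the backup.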

00:19:53.615 --> 00:20:00.255
We've got Orpheus, and Opus was the conductor and set the frame, DeepSeek was the worker, GPT was the critic, and then this gave the final synthesis.

00:20:00.575 --> 00:20:13.910
So the point is to avoid one model just vibing the answers. So, what is a good niche? Instead of asking that, you ask which Texas local service niche is most likely to buy a high-margin website, blah blah blah. So judging on this, the DeepSeek worker, basically,

00:20:14.150 --> 00:20:15.510
scored the market brutally,

00:20:15.510 --> 00:20:48.370
went through everything, and it just kind of explains its thought process, which is cool. But you can set it up to do these loops overnight whilst you're sleeping. You can even use free models if you want to, but I would generally just advise against that, just because free is not free, as we say in the USA. The point here is that, for the fraction of the cost that you pay with DeepSeek, your quality differential is insane. At the end of the day, if it runs for a million years and the answer is still garbage, what use is that to you? So I like to go for that ratio of what I personally use, Opus 4.7 or whatever. But for massive stuff, using DeepSeek is insane because you get, like, I'd say 95% of

00:20:48.370 --> 00:20:56.625
the value for like 1% of the cost. And so then we're building out this pantheon of special skills with this particular triad skill. Now the idea here is that Hermes plus DeepSeek

00:20:57.141 --> 00:21:07.701
with the whole infrastructure we've got is an agent that grows with you. But it does lead us on to one final question. And that's how to get Hermes to its maximum potential, which we're gonna learn in this video right here.
