WEBVTT

00:00:00.080 --> 00:00:01.760
Okay. This is crazy.

00:00:02.160 --> 00:00:06.080
Come a bit closer because we need to talk about AI clones.

00:00:06.240 --> 00:00:09.600
What you're seeing right now is my very own clone.

00:00:10.000 --> 00:00:12.400
This is not real. This is all AI.

00:00:12.720 --> 00:00:15.345
Just look how realistic the movements are.

00:00:16.785 --> 00:00:27.025
Yeah. That's scary. Right? Like, this technology has come a long way because I used to have an AI clone and everyone called me out on it. They all noticed it was fake. But if you can't tell,

00:00:27.505 --> 00:00:34.520
this is AI too. Now this is what my clone used to look like when I tried to do this with Veo 3, like, nine months ago.

00:00:34.920 --> 00:00:40.120
And this is what we got right now. What this means is that you don't have to be in the studio anymore

00:00:40.280 --> 00:00:43.640
because all I have to do is upload an audio clip of me talking

00:00:44.175 --> 00:00:45.615
and a reference image,

00:00:45.935 --> 00:00:47.135
and this is what you get.

00:00:47.775 --> 00:00:54.895
Okay. So now it's the real then. And to prove this, I'm going to drink this cup of water because we all know AI cannot do that.

00:01:04.000 --> 00:01:13.505
Yeah. All jokes aside, this is actually crazy. Right? Like, did I fool you or not? Let me know in the comments down below. In this video, I will be breaking down exactly

00:01:13.505 --> 00:01:40.905
how you can make your own AI clone that looks exactly like you to make fun videos like I've just shown you. I'll also break down a few different use cases of how you can make something useful out of this. Now for this video, we're using Seedance through Higgsfield. It has all the tools that we need to make these kinds of videos. If you wanna follow along, click the link in the description down below. I'll also add in a link to my Skool community where you can find all of the prompts I used and some other files that might be useful.

00:01:42.345 --> 00:01:46.585
Okay. So the big question is, what is so good about these AI clones?

00:01:46.665 --> 00:01:57.340
Well, there are three reasons why I think this looks good. First, we have the amount of detail. If you look at these shots, you can see everything from my skin texture to my imperfections,

00:01:57.420 --> 00:02:58.540
pimples, freckles, like, all of that. You see it on my skin. If we actually take a look at this other example right here, then let me walk you through why I think this looks amazing. We got a few things going on and I wanna discuss the detail. My face looks exactly like the image input. If I show you the image input that I used here, it's this image and it used that quite well. We can even see the bags underneath my eyes because there has just been too much work to do. I also like how this character is walking to the screen. She's taking a video and it looks like, of course, if you really zoom into it, you will notice, but I do think it looks realistic. The other thing that I wanna point out is this guy right here. That's exactly the type of, like, look or exactly the type of face I would have at Influencers in the Wild. Like, that's why I'm in this studio, locked away from everyone else. I don't wanna film in public like this because I will have people like this dude looking at me, and that's why I'm just being in public with my AI. The scooter right here, if we take a look at the light that is reflecting on some of these things. So we see the vending machine right here. We have the light.

00:02:58.860 --> 00:03:06.165
Then also on this pole right there, we have the reflection of the light too. Overall, this is becoming so good that people are getting fooled by it.

00:03:16.180 --> 00:03:26.580
The other reason why I think these AI clones are so good is the natural dialogue. Just have a listen to this. One of the easiest ways to tell something is AI generated is the dialogue,

00:03:27.315 --> 00:03:31.235
specifically the flow of it. Like, we're talking about pauses,

00:03:31.235 --> 00:03:32.835
ums, and ahs.

00:03:33.235 --> 00:03:34.595
If you don't have them,

00:03:34.915 --> 00:04:02.715
you have a big chance that it's AI generated. That's crazy. Like, tell me that doesn't sound like me. Tell me that doesn't look like me. Like, the amount of detail and even, like, how I look away when I'm taking a break, where I'm pausing, like, that's something I do quite often. Like, you don't see me looking into the camera enough. Like, as soon as I start to think about something, I'm, like, looking up or looking to the side. I'm doing that here too. We just take a look at this. Something is AI generated is the dialogue.

00:04:02.955 --> 00:04:04.875
There you have it. Just looked to the side.

00:04:05.195 --> 00:04:51.980
Like, we're talking about. The hand movements, it all looks and sounds natural. I use a lot of pauses in my speech. Even here, it is using that. But the ability for Seedance to understand that there are pauses, the ability for Seedance to not cut out that pause, to also have my character take a little break when I'm taking a pause and look away maybe. That is just incredible. If we compare that to what HeyGen made, for example, in this example, not even that old a video that I did about HeyGen. I'm going to show you how I generate avatars that look and sound exactly like me. No matter the background, the clothing, or the camera angles, you will remain completely unshakable. Like the dialogue, the movement, all of it just feels more thick. That also brings me to the last reason why I think this looks so good, which is the natural movement.

00:04:57.335 --> 00:05:07.410
That comes to, like, natural movements. This AI understands so well how good the natural movement is. Here's a little vlog that I made, more of a travel thing, but look at the natural movement here.

00:05:08.050 --> 00:05:08.850
Most

00:05:21.250 --> 00:05:59.635
people think of this as scary and inauthentic and all of that, but I'm here to say anyone can now be a creator. You don't have any limitations anymore. The only limitation you have is credits. And that is, like, that is a big thing, but I will show you how to save on some credits later. Now that you understand what it is that makes this look so good, it is time for you to learn, like, how to actually make your own clone. And it is way simpler than you might think. I will walk you through all the different steps to create your own clone. Because first, you have to make your reference. Now for making a reference, I literally just use my phone and I took a selfie of me. So this is the selfie that I made. This already is good enough for 99%

00:05:59.635 --> 00:06:00.755
of the generations.

00:06:00.755 --> 00:07:43.290
The only tip I have is I wish I had taken, like, a top-down selfie to show my outfit, like a selfie like this. The reason why is that in this video right here, I had some outfit issues. As you see, in the first shot, I'm wearing black pants, and in the second shot, I'm wearing beige pants. Even though I prompted it to be black pants, um, because it doesn't have a reference of me wearing black pants, it can sometimes fuck up. So that's the tip that I wanna give you. If you are doing this as a reference and you wanna include more of your outfit, make sure you maybe also include another reference of your full outfit. But all in all, the main thing you need is a selfie of you. And I hear you thinking, why are we not using a character sheet? Now let me explain. This is the exact character sheet I made with GPT Image 2 that looks just like me. The reason why I don't use these character sheets for my vlog type videos is because of the level of detail. Right now with this character sheet, we have, like, my full image and we have, like, the side angles, all of that. But if we zoom in, it's not as detailed as an actual selfie of me. To show you a comparison, I've used the same prompt, but one video had the selfie as a reference and the other video had the character sheet as a reference. If we take a look at them side by side and if we zoom in on my face, then we can see that there's a lot more detail and a lot less smoothness, which also can be good if you have some imperfections. But on my selfie image, it looked a lot better in my opinion. So that's why I went with the selfie approach. So that is how you make your clone or how you clone yourself. Okay. The next challenge we have is making the AI sound like you. Now there are three different ways of how you can approach it. I will first show you the best method. If you have a good mic, then the best thing you can do is record your own audio. I've done this two different ways. I've used Audacity.

00:07:43.370 --> 00:07:54.585
This is literally a free tool where you can just plug in your microphone and then literally say anything you want. Then you take that audio and you upload it as a reference inside of Seedance.

00:07:54.585 --> 00:08:24.700
Now that's the beauty of Seedance. It can take in audio references if you didn't know that. Make sure that it is less than thirteen seconds. For some reason, you can only do thirteen seconds and not fifteen seconds, so keep that in mind when you're generating or recording your voice. The only issue we have is it still alters and changes your voice. Here's a side by side comparison of my real recording and what Seedance made out of it. Okay. So this is crazy. Come a bit closer because we need to talk about AI clones. Okay. This is crazy.

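NOTE
Editor's sketch, not spoken in the video: one way to check the clip-length tip before uploading a voice reference. It assumes the recording was exported as an uncompressed WAV file (for example from Audacity) and takes the thirteen-second figure quoted in the video as the limit; the real limit may differ, and the helper names are hypothetical.
```python
import wave
# Assumed limit, taken from the video; the tool's actual limit may differ.
MAX_REFERENCE_SECONDS = 13.0
def wav_duration_seconds(path: str) -> float:
    """Return the duration of a WAV file in seconds (frames / sample rate)."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()
def is_short_enough(path: str, limit: float = MAX_REFERENCE_SECONDS) -> bool:
    """True if the clip is strictly shorter than the limit."""
    return wav_duration_seconds(path) < limit
```
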
00:08:25.260 --> 00:08:55.760
Come a bit closer because we need to talk about AI clones. Now, as you hear, the Seedance version is different from my original audio input, so expect it to change your voice a little bit. If you want it to exaggerate a bit more, then make sure that your input audio is also, like, way more expressive. The other thing it does, it can add in some background noise. Now this works best for talking head videos that are, like, in a studio environment, like this example. Yeah. That's scary. Right? Like, this technology has come a long way because

00:08:55.840 --> 00:09:20.515
I used to have an AI clone and everyone called me out on it. They all noticed it was fake. If you're not in a position to record each and every clip, or it just takes too long, then you can use this different method, which is cloning your voice. The easiest and cheapest way to do it if you already have Higgsfield is to use audio and then voice over. Here, you can create your own voice clone by submitting an audio file that is max two minutes long of you speaking.

00:09:20.595 --> 00:12:32.105
I did that right here. If you want to have the best possible clone, I would recommend using ElevenLabs, making a professional voice clone right there, and using that to make your voice overs. For example, this is what I sound like with an ElevenLabs voice. This is my voice clone on ElevenLabs. What do you think? Now that we got the two input references, so we got the image of us, and then we also have the voice segment, we can now start generating. And let me explain how I prompt for this. So the first thing I do is I go to video, then I go to Seedance, and then over here, I'm uploading both of these things. The next thing that you wanna do is you wanna add in your prompt. Now there are so many different ways to prompt it. There's no right or wrong, but the method I like to use for this is called timeline prompting. And the reason why is because it's very specific. It also allows you and any AI tool that you're using to think about, like, what you're actually putting in there. If I go over and break down the prompt, it's like this. I add in the format. So I'm basically explaining it's a nine-second single continuous shot. I know you got the setting inside of Higgsfield where you can select, like, how many seconds you have. Make sure you match that with what you have in your prompt. So then we go over the subject. That's where I add and tag image one. So in Higgsfield, make sure you add in and tag that image one. Then for the wardrobe, I'm going over, like, same. It's the exact same as image one. Environment, exact same because it's one continuous shot. It's not that, like, difficult for this one. Then the style anchor. So I'm going in with a locked studio talking head, podcast YouTuber creator aesthetic. Here, you can change this up to what you have in mind. If it's not a continuous shot, then you can change it to something like, oh, outside in the park, we have a handheld camera shot, like all of that.
You can basically add in all the different details of how you want it to look. This shot was quite easy, to be honest, because a talking head shot is a static shot. It's all set on a tripod, so we don't have anything difficult going on. The delivery, it's gonna be conversational, reflective, mid tempo with natural micro pauses, lip sync driven, and it's captured by a podcast mic. Clean direct mic tone, no reverb. Here, you don't wanna simply copy what I did. If you're outside, if you have any type of background noise, you can add that in here. Now for the logic rule, it's basically to prevent the AI from doing anything weird. So I'm just repeating myself with a single continuous shot. No cuts, no jumps, no zooms. I don't wanna see that in this shot. If you do wanna do that, then prompt what you wanna have happen. Negative prompt, no music, no captions. Sometimes it tends to add in captions and music. Then for the action, this is where the timeline starts. It's not really a timeline prompt because we have just one shot. That's the only shot in the timeline. So for zero to nine seconds, it is a talking head shot where I am in the studio and I've tagged myself again here, and I am prompting, like, how I deliver these words. So I'm saying, like, it's relaxed conversational energy. But while he's saying that, his right hand lifts up and he casually points toward the right side, not sure which side that is, gesturing into open space. Then I repeat what I have said, and then I turn back to the lens and I continue, and this is what we got right now. That's just to give the AI a bit more of a breakdown of what we want to see. So this is what it generated for me. Now this is what my clone used to look like when I tried to do this with Veo 3, like, nine months ago.

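NOTE
Editor's sketch, not spoken in the video: the timeline prompt walked through above, assembled as plain text. The section labels follow the spoken breakdown, but the exact wording, the bracketed time format, and the function name are assumptions for illustration, not the author's actual prompt file or any tool's required syntax.
```python
def build_timeline_prompt(sections, beats):
    """Join labeled sections and timed action beats into one prompt string."""
    lines = [f"{label}: {text}" for label, text in sections.items()]
    lines.append("Action:")
    for start, end, action in beats:
        lines.append(f"[{start}s-{end}s] {action}")
    return "\n".join(lines)
# Illustrative values paraphrased from the walkthrough; adjust to your own shot.
sections = {
    "Format": "9-second single continuous shot",
    "Subject": "the man from image 1",
    "Wardrobe": "same as image 1",
    "Environment": "same as image 1",
    "Style anchor": "locked studio talking head, podcast YouTuber creator aesthetic",
    "Delivery": "conversational, reflective, mid tempo, natural micro pauses, lip sync driven, clean direct mic tone, no reverb",
    "Logic rule": "single continuous shot, no cuts, no jumps, no zooms",
    "Negative prompt": "no music, no captions",
}
# One beat covering the whole clip, since this example is a single shot.
beats = [(0, 9, "talking head shot in the studio, relaxed conversational energy; right hand lifts and casually points to the side, then he turns back to the lens")]
prompt = build_timeline_prompt(sections, beats)
print(prompt)
```
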
00:12:32.790 --> 00:12:35.030
And this is what we got right now.

00:12:35.750 --> 00:13:09.640
Honestly, this is quite impressive. The first time I saw this, I showed it to my friends, and they all thought it was the real me. The main thing I did here is I did use 1080p quality, and that is because I'm using this for YouTube videos, which are upscaled to 4K. Ideally, you might even wanna upscale this video, although sometimes when you upscale, it tends to look a bit fake. Um, but, yeah, this is very usable already. If you do wanna save some credits though, then you might wanna switch over to 720p because as you see, it's a lot cheaper, um, than 1080p. So that is one tip for you there. Now you don't have to reinvent the wheel and prompt all of this yourself.

00:13:09.800 --> 00:13:57.180
Um, you don't get as polished of a result as I just had. So to save you some time, I've made this MD file, which you can find in the Skool community. Then you upload this MD file into your Claude or into your ChatGPT. Then you also upload a reference image of you, and you can also even add in your audio file. Then you just start prompting it. It can be a super simple prompt. Like, I just did a very short prompt where I said, using the MD file and my reference images and my audio file, prompt me a handheld camera scene of this man giving a tour in a boutique small but nice hotel room. It has to be, like, thirteen seconds long, one continuous take. Then I also give a breakdown of what the transcript is saying, like, what the audio file is saying, so you don't leave that guesswork up to the AI. Then it will spit out a prompt. If you copy the prompt, put it into Higgsfield, also adding your references there,

00:13:57.500 --> 00:14:01.420
and then you hit generate, it will make you something like this. So

00:14:01.820 --> 00:14:04.140
this is where I'll be staying for the time being.

00:14:04.620 --> 00:14:07.100
Here's the bathroom, pretty fancy,

00:14:07.500 --> 00:14:10.220
and here's the bed, and then there's the huge TV.

00:14:10.825 --> 00:14:11.465
And,

00:14:12.105 --> 00:15:57.130
yeah, I'd say it's pretty nice. Yeah. Everything was AI generated, even the voice. I used ElevenLabs for that. Now I hear you thinking, how can I use this in real life? Like, what is actually the benefit of having this AI clone? I've already shown you a number of different examples, but now let's get a bit specific. The main use case of using this right now would be to generate short clips. Unfortunately, you can only do thirteen second videos, but you can stitch them together to make something longer, but it will cost you a lot of credits. So how are people using this in real life? Let me show you a few examples. So, for example, the first one that I wanna share with you is this page right here. This guy literally went viral for making AI videos of some random guy. Like, it doesn't even have to be you. Like, I use me as an example right now, but it can be anyone. You use this guy, put him in a suit, and it's, like, completely fake. They make money through this, like, WOB course, I reckon, and they already got, like, 130 k followers. Now here's another one. I don't know why they're picking old people, I think, for credibility, but this person has almost 400 k followers, even has their own website. And it's probably some dudes just making these clips. Now, there are also people making more entertainment style videos like here. Again, they didn't use an image of themselves. They have built this imaginary character. These already have, like, 2,000,000 followers and they're making real money with clips like that. One last example I wanna give you is people using this for podcasts. So, uh, I think the AI creator space is very much a, um, show, don't tell. Yeah. So what I'm trying to tell you here is that you can make entertainment videos of yourself. You can put yourself on a podcast. You can make any type of video you can imagine. To get a bit more practical, I've made a few examples of what I think this could be used for. So here, for example, I use it as AI VFX.

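NOTE
Editor's sketch, not spoken in the video: rough arithmetic for stitching short clips into a longer video. It assumes the thirteen-second per-clip limit quoted in the video; the credits-per-clip and takes-per-clip values are hypothetical placeholders, so check your own plan's pricing before relying on any number.
```python
import math
# Per-clip limit quoted in the video; the tool's actual limit may differ.
MAX_CLIP_SECONDS = 13
def clips_needed(total_seconds: float, max_clip_seconds: float = MAX_CLIP_SECONDS) -> int:
    """Number of clips needed to cover the target length, rounding up."""
    return math.ceil(total_seconds / max_clip_seconds)
def estimated_credits(total_seconds: float, credits_per_clip: int, takes_per_clip: int = 1) -> int:
    """Clips times credits per clip times attempts per clip (all values assumed)."""
    return clips_needed(total_seconds) * credits_per_clip * takes_per_clip
# Example: a 60-second video, a hypothetical 40 credits per clip, two takes each.
print(clips_needed(60), estimated_credits(60, credits_per_clip=40, takes_per_clip=2))
```
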
00:15:57.130 --> 00:16:00.170
If you have played Minecraft, then you'd know exactly what this is.

00:16:01.865 --> 00:16:33.730
Man, I'd never need to walk ever again. Now this can be a fun intro opener for my reel or for my YouTube videos, and you can do this too. The other one that I wanna share with you is more of an AI ad. Now here, I made a Uniqlo ad. So

00:16:29.745 --> 00:16:48.550
there you go, Uniqlo. That was some free promotion for you. I will send the invoice later. I think this is, like, four shots. I generated everything, like, two or three times. In total, this will cost you, like, 500 credits max on Higgsfield, and what that comes down to depends on which plan you have. Max, that shot would cost you $25,

00:16:48.550 --> 00:17:06.395
which still feels pretty expensive. But the way I look at it, like, I spend a lot of credits and I burn a lot of money on it, I look at this as, like, a tool. Like, instead of you buying an expensive camera or instead of you buying, like, a literal set and paying, like, all the people that can help you with that, renting out all the equipment,

00:17:06.635 --> 00:18:00.985
you can make something looking half decent with AI, and big brands are already doing it. My main point is that there are just so many different ways to make cool stuff with AI right now. I used to not even want to be on camera, and now that's even more justified because I can just use a clone of me. I've seen so many people, so many characters that already are doing this. They are benefiting from it. They are using this in their business, in their social media, in every type of aspect. The main question I have for you is, are you gonna be one of these early adopters or are you just gonna be mad about this? I can see both ways, but still, I would give it a try. Again, the link to Higgsfield and the link to my prompts are in the description down below. Check it out. If you wanna see more implications of how I use Seedance, then click the video that's on the screen right now. I have a few cool videos that literally explain all kinds of different use cases of how you can master Seedance, but I also have one about AI filmmaking.
