The bait, then the rug-pull.
In the first nine seconds, the promise is already on the table: long-form in, a full week of short-form out, no editing software touched. What follows is not a concept pitch but a working system — five stages, one repo, two prompts, twenty-eight scheduled posts.
Where the time goes.
01 · Hook
Cold open showing live clips on YouTube and TikTok. Full pipeline promise stated in 9 seconds.
02 · Who I Am + What We're Building
Presenter intro. Agentic systems builder for 1.5 years. Three-part structure outlined.
03 · The 5 Pipeline Stages Explained
Miro board architecture walkthrough: Smart Selection, Auto-Reframe (MediaPipe, three source layouts), Auto-Transcribe, Auto-Edit (Remotion/SVG/SFX), Publish (Late API, 50+ posts from one video).
04 · Get the Free System
Repo access via texting AI agent Finn on WhatsApp or SMS. References separate social media manager video.
05 · Setting Up Claude Code in VS Code
Install Claude Code extension, enable bypass-permissions mode, connect Anthropic account.
06 · Cloning the Repo + Guided Setup
Clone repo, paste URL into Claude chat, ask Claude to run full setup. Claude reads README and installs all dependencies autonomously.
07 · Project Structure Walkthrough
Folder tour: .claude/ (clip skills), output/, public/ (logos/SFX/source videos), remotion/ (compositions), tools/ (MediaPipe, EMA smooth, FFmpeg), CLAUDE.md, .env.
08 · Live Demo: Full Pipeline in Action
Two prompts submitted; the first points at a local video file. Pipeline runs ~30-35 min. Seven clips produced, edited with captions and compositions, then scheduled to 4 platforms (7 × 4 = 28 posts). Late dashboard confirms scheduling.
09 · Demo 2: Podcast Clip Extraction
Lex Fridman YouTube URL submitted. Claude downloads first hour, runs batch face-tracking, produces 7 podcast clips in output subfolder.
10 · Final Thoughts + Next Steps
LinkedIn plug for behind-the-scenes content. Subscribe + Finn CTA repeated. Teases next system video.
Named ideas worth stealing.
IX Clip Extractor Pipeline
- 1. Smart Selection
- 2. Auto-Reframe
- 3. Auto-Transcribe
- 4. Auto-Edit
- 5. Publish
Five-stage agentic workflow that converts any long-form video into edited short-form clips scheduled across multiple platforms.
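To make the shape of the five stages concrete, here is a minimal Python sketch that chains them as plain functions and counts the posts that fall out at the end. It is not the repo's code: `Clip`, `run_pipeline`, the stage functions, the file name, and the platform names are all hypothetical placeholders, and each stage body is a stub. The 7 clips and 4 platforms are the figures from the live demo.

```python
from dataclasses import dataclass, field


@dataclass
class Clip:
    index: int
    steps: list[str] = field(default_factory=list)


def smart_selection(source: str, count: int) -> list[Clip]:
    # Stub: the real stage picks the moments; here we only create placeholders.
    return [Clip(i, ["Smart Selection"]) for i in range(1, count + 1)]


def per_clip_stage(name: str, clips: list[Clip]) -> list[Clip]:
    # Stub for Auto-Reframe, Auto-Transcribe and Auto-Edit: records the stage name only.
    for clip in clips:
        clip.steps.append(name)
    return clips


def publish(clips: list[Clip], platforms: list[str]) -> list[tuple[int, str]]:
    # One scheduled post per (clip, platform) pair.
    return [(clip.index, platform) for clip in clips for platform in platforms]


def run_pipeline(source: str, count: int, platforms: list[str]) -> list[tuple[int, str]]:
    clips = smart_selection(source, count)
    for stage in ("Auto-Reframe", "Auto-Transcribe", "Auto-Edit"):
        clips = per_clip_stage(stage, clips)
    return publish(clips, platforms)


# Hypothetical inputs: placeholder file name and platform names.
posts = run_pipeline("episode.mp4", 7, ["platform_a", "platform_b", "platform_c", "platform_d"])
print(len(posts))  # 28
```

The only real logic is the fan-out in `publish`: every clip is paired with every platform, which is where seven clips become twenty-eight posts.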
Source Video Layouts
- Screen-share (top: screen, bottom: face)
- Speaker side-by-side (two-host podcast split)
- Close-up (single speaker)
Three reframing templates that handle the most common long-form video formats for vertical output.
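For the close-up layout, the reframe comes down to picking a full-height 9:16 window around the speaker. The sketch below shows that geometry for a 1920×1080 source. The function names and the idea of passing in a single face-centre x coordinate are assumptions for illustration, not the repo's implementation. The only real syntax relied on is FFmpeg's `crop=w:h:x:y` and `scale=w:h` filters.

```python
def vertical_crop(src_w: int, src_h: int, face_cx: float,
                  aspect: float = 9 / 16) -> tuple[int, int, int, int]:
    """Return (x, y, w, h) for a full-height crop centred on face_cx."""
    # Round the width to the nearest even number; yuv420p encodes need even dimensions.
    crop_w = 2 * round(src_h * aspect / 2)
    x = int(face_cx - crop_w / 2)
    # Clamp so the window never leaves the frame.
    x = max(0, min(x, src_w - crop_w))
    return x, 0, crop_w, src_h


def ffmpeg_filter(crop: tuple[int, int, int, int],
                  out_w: int = 1080, out_h: int = 1920) -> str:
    """Build an FFmpeg -vf string: crop the window, then scale to the vertical output size."""
    x, y, w, h = crop
    return f"crop={w}:{h}:{x}:{y},scale={out_w}:{out_h}"


crop = vertical_crop(1920, 1080, face_cx=960)
print(crop)                 # (656, 0, 608, 1080)
print(ffmpeg_filter(crop))  # crop=608:1080:656:0,scale=1080:1920
```

The screen-share and side-by-side layouts would need two such windows stacked vertically, which this sketch does not attempt.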
Lines you could clip.
"It picks the best moments, reframes everything vertically, and it edits all of it. Plus, it schedules it across every single platform."
"This is like a five-month work that I am giving away for free in this video."
"This now replaces the total need of external tools out there because you can host them on your own project."
How they asked for the click.
"Just have a conversation with Finn. I am gonna leave her WhatsApp and her number if you wanna use SMS."
Unusual CTA — repo access is gated behind texting an AI agent named Finn. No direct GitHub link shown on screen, which adds friction but filters for serious viewers.
Five stages, one repo, zero editors — the ownership math.
The clip extractor is not a shortcut — it is a skills-based architecture where Claude Code holds the pipeline logic and you hold the creative decisions.
- Claude Code bypass-permissions mode is what makes autonomous multi-step pipelines practical — without it, every file operation pauses for manual approval.
- The clip selection skill file is the most valuable piece of the system — the framework that defines what makes a moment worth posting determines output quality more than any downstream tool.
- Remotion lets you define video compositions in code rather than timelines, which means Claude can write and execute edits programmatically without an editor open.
- A two-prompt session can produce 28 scheduled posts because editing and publishing run fully autonomously; clip selection is the one step left to human approval.
- MediaPipe plus EMA smooth plus FFmpeg is a self-hosted face-tracking stack that reframes talking-head video vertically at no subscription cost.
- The Late API abstracts multi-platform posting to a single integration — switching platforms means changing a config value, not rebuilding the pipeline.
- Iteration on the system happens in conversation with Claude: fixing a missing thumbnail CTA or adding a new visual composition is a follow-up prompt, not a code refactor.
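The "EMA smooth" step in the face-tracking stack is worth a closer look, since raw per-frame face positions jitter and would make the vertical crop shake. The sketch below is the textbook exponential moving average applied to a list of face-centre x coordinates. The repo's own smoothing tool is not shown in the video, so the function name, the `alpha` default, and the sample values here are assumptions.

```python
def ema_smooth(values: list[float], alpha: float = 0.2) -> list[float]:
    """Exponential moving average: each output blends the new value with the previous output."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed: list[float] = []
    prev: float | None = None
    for value in values:
        # The first sample passes through unchanged; later ones are blended.
        prev = value if prev is None else alpha * value + (1 - alpha) * prev
        smoothed.append(prev)
    return smoothed


# Hypothetical face-centre x positions with a sudden jump between frames.
print(ema_smooth([100.0, 100.0, 200.0, 200.0], alpha=0.5))  # [100.0, 100.0, 150.0, 175.0]
```

A lower `alpha` gives a steadier crop that lags behind fast movement, while a higher one follows the face more tightly and lets more jitter through.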