All skills
Long-Form Video Clip Pipeline
Video / ShortTurns long episodes into publish-ready highlight clips, automatically
Live output preview
Input Format: Input FormatOutputWatch the Output: Watch the Output
A plan is required to view this content
Choose a plan to access input format, sample outputs, and live previews.
View Plans →About the skill
What it does
A five-stage automation that turns a long-form YouTube episode (podcast, interview, talk) into standalone, watchable highlight clips: Download → Transcribe → AI Segment → Verify → Cut.
- Download —
yt-dlppulls the video plus auto-subtitles (VTT) at best quality. - Transcribe —
openai-whisperproduces a local, word-level timestamped transcript. Whisper emits segment-level timestamps natively; word boundaries are interpolated, so themediummodel (~98% accuracy) is recommended for production,largefor noisy audio. - AI Segment — Claude finds the 3-5 strongest standalone segments, scoring each for hook strength 1-10 (minimum 6), ensuring a complete narrative arc and clean cut boundaries.
- Verify — Claude re-checks cut boundaries for sentence integrity, dropping clips that start or end mid-sentence.
- Cut —
FFmpegcuts 16:9 landscape clips;longform_pipeline.pyre-encodes for frame accuracy, whileclip_cutter.pyoffers fast stream-copy (-c copy).
scored_pipeline.py adds a 10-expert LLM quality panel: it only cuts clips scoring 90+, with a dry-run mode that scores without cutting.
When to use it
- Converting long YouTube content (podcasts/interviews/talks) into highlight clips
- Processing a YouTube back catalog into a clips channel
- Finding the best standalone segments from transcripts
- Cutting clips with verified sentence boundaries
- Running a high-volume clip operation ($0.50-1.00 per episode)
Method / frameworks
- Hook-strength scoring (1-10, min 6) — aligned with short-form retention research: intro retention (past first 3s) ideally >70%, completion >60%. Weak-hook segments are dropped.
- Narrative-arc integrity — clips must contain a complete arc watchable out of context.
- Word-level timestamp verification — for clean boundaries (the WhisperX/forced-alignment principle: <100 ms alignment).
- 10-expert LLM panel scoring — a multi-judge quality gate (scored_pipeline).
- Stream-copy vs re-encode trade-off — speed (keyframe ±1-2s) or frame-accurate cuts.
How do I use this skill?
You don't "run" a skill — after installing it you just tell the agent your task (e.g. ask for the relevant job), and the skill kicks in by itself when its description matches.
Upload the video-clip-pipeline.zip you downloaded as-is — no packaging needed, the format is already correct (folder at root).
- Open Settings → Customize → Skills
- Upload → select the
video-clip-pipeline.zipyou downloaded - Claude reads
SKILL.md; the name + description appear. Ready ✅
Scripts run in Anthropic's code-execution environment (sandbox) — not on your machine.