Multi-Source Transcript Harvester

Scans founder-story YouTube channels and podcasts with a single command, filters by duration, and produces normalized Markdown transcript files plus an audit log.

About the skill

A raw-data layer for transcripts that collects founder and entrepreneur story content with a single command. It takes three inputs: a sources-manifest.yml file, a list of source slugs, and a per-source top-N parameter, and it routes each source according to its type. For a YouTube channel it resolves the channel ID, lists the uploaded videos, filters them by duration (discarding short clips), fetches the subtitles of the selected videos, and converts them into normalized Markdown transcripts. Podcast RSS sources are marked as Phase 4 and silently skipped.
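The routing and duration filter described above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual code: the `type` values (`youtube_channel`, `podcast_rss`), the `Video` fields, the 600-second threshold, and the choice to apply top-N after the duration filter are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Video:
    # Hypothetical record shape; the skill's real fields are not documented here.
    video_id: str
    title: str
    duration_s: int


def route_source(source: dict) -> str:
    """Decide what to do with one manifest entry, based on its type."""
    kind = source.get("type")
    if kind == "youtube_channel":   # assumed type name
        return "harvest"
    if kind == "podcast_rss":       # assumed type name; skipped without error
        return "skip"
    raise ValueError(f"unknown source type: {kind!r}")


def select_videos(videos: list[Video], min_duration_s: int, top_n: int) -> list[Video]:
    """Drop short clips, then keep the first top_n videos in listing order."""
    long_enough = [v for v in videos if v.duration_s >= min_duration_s]
    return long_enough[:top_n]


uploads = [
    Video("a", "Teaser clip", 45),
    Video("b", "Founder interview", 1800),
    Video("c", "Deep dive", 3600),
    Video("d", "Conference talk", 900),
]
selected = select_videos(uploads, min_duration_s=600, top_n=2)
print([v.video_id for v in selected])
```

Filtering before truncating means a channel with many short clips still yields up to top-N full-length videos, which seems to match the stated intent of discarding short clips.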

When to use: to feed a founder-pattern library in the style of Starter Story, Indie Hackers, or My First Million; to build a multi-channel transcript pool for a research agent; or to batch-convert competitor or sector content into text. It is not a standalone agent; it is invoked by upstream agents such as founder-pattern-roadmap-advisor or by batch wrappers.

Output: for each video, a normalized .md transcript file with frontmatter; for each source, a _harvest-log.jsonl audit record; and a summary envelope (sources processed, number of transcripts fetched and skipped, estimated cost). Runs are idempotent and resumable: a second invocation fetches only new videos. With dry_run you can review and approve the plan and cost estimate before anything is fetched. Both English and Turkish sources are supported.
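One way the resume and dry-run behavior could work is sketched below. It is an assumption-laden illustration: the log record shape (`video_id`, `status`), the frontmatter keys, the one-file-per-video naming, and the `fetch` callback are hypothetical, and only the `_harvest-log.jsonl` filename and the `dry_run` option come from the description above.

```python
import json
from pathlib import Path
from typing import Callable


def already_harvested(log_path: Path) -> set[str]:
    """Collect video IDs that an earlier run recorded as fetched."""
    if not log_path.exists():
        return set()
    seen = set()
    with log_path.open(encoding="utf-8") as fh:
        for line in fh:
            if line.strip():
                record = json.loads(line)
                if record.get("status") == "fetched":  # assumed log field
                    seen.add(record["video_id"])       # assumed log field
    return seen


def render_transcript(meta: dict, body: str) -> str:
    """Build a Markdown file: frontmatter block, blank line, transcript text."""
    # JSON-encoded strings are also valid YAML scalars, so this quotes safely.
    front = [f"{key}: {json.dumps(value, ensure_ascii=False)}" for key, value in meta.items()]
    return "---\n" + "\n".join(front) + "\n---\n\n" + body.strip() + "\n"


def harvest(source_dir: Path, videos: list[dict],
            fetch: Callable[[str], str], dry_run: bool = False) -> dict:
    """Fetch only videos missing from the log; with dry_run, report the plan only."""
    log_path = source_dir / "_harvest-log.jsonl"
    seen = already_harvested(log_path)
    pending = [v for v in videos if v["video_id"] not in seen]
    summary = {"planned": len(pending), "fetched": 0, "skipped": len(videos) - len(pending)}
    if dry_run:
        return summary  # nothing is written or fetched
    source_dir.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as log:
        for video in pending:
            text = fetch(video["video_id"])
            out = source_dir / f"{video['video_id']}.md"
            out.write_text(render_transcript(video, text), encoding="utf-8")
            log.write(json.dumps({"video_id": video["video_id"], "status": "fetched"}) + "\n")
            summary["fetched"] += 1
    return summary
```

Because the log is append-only and is consulted before any fetch, running the same command twice leaves existing transcripts untouched and only picks up videos that were not present in the log.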

How do I use this skill?

You don't "run" a skill. After installing it, you simply describe your task to the agent, and the skill activates on its own when the task matches its description.

Upload the multi-source-transcript-harvester.zip you downloaded as-is. No repackaging is needed; the format is already correct (folder at the root).

Open Settings → Customize → Skills
Upload → select the multi-source-transcript-harvester.zip you downloaded
Claude reads SKILL.md; the name + description appear. Ready ✅

Scripts run in Anthropic's code-execution environment (sandbox) — not on your machine.