Autoresearch — Conversion Copy Optimization Engine
Evolve your copy to 85+ with a 5-expert simulated panel, no traffic needed.
About the skill
What it does
Autoresearch adapts Andrej Karpathy's "autoresearch" optimization loop to conversion copy: instead of live traffic, a simulated expert panel finds the winning version in minutes. The flow has three layers.
(1) Intake & Parse: read the source, auto-detect the content type (landing page / email / ad / form), and list the optimizable elements (hero headline, subheadline, CTA, problem section, social proof, etc.).
(2) Round Structure, per element: Round 1 generates 10 variants, batch-scores all of them with the 5-judge panel in a single API call, ranks them by average score, and keeps the top 3; Round 2 analyzes the winning patterns and pushes them further with 10 new variants; Round 3 (only if the score is below threshold) targets the weakest dimension.
(3) Cross-breed: assemble each element's winner, generate 5 holistic combinations, and score them as complete units; the combination with the highest holistic score is the output.
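The per-element round structure can be sketched roughly as below. This is a minimal illustration, not the skill's actual code: `rank_top`, `optimize_element`, the toy generator and scorer, and the default threshold of 85 are all assumptions made for the example. In a real run both callables would wrap LLM calls, and the text does not say whether earlier winners are re-scored against new variants.

```python
from statistics import mean
from typing import Callable, List, Tuple

Scored = Tuple[str, float]
# Hypothetical signatures: both callables stand in for LLM calls.
Generator = Callable[[List[str], int], List[str]]       # (parents, n) -> n variants
BatchScorer = Callable[[List[str]], List[List[float]]]  # variants -> judge scores per variant


def rank_top(variants: List[str], score_batch: BatchScorer, keep: int = 3) -> List[Scored]:
    """Score all variants in ONE batch call, average the judges, keep the best."""
    panel_scores = score_batch(variants)
    ranked = sorted(
        ((v, mean(s)) for v, s in zip(variants, panel_scores)),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:keep]


def optimize_element(seed: str, generate: Generator, score_batch: BatchScorer,
                     threshold: float = 85.0, max_rounds: int = 3) -> Scored:
    """Always run two rounds; later rounds run only while the best score is below threshold."""
    parents, best = [seed], (seed, 0.0)
    for round_no in range(1, max_rounds + 1):
        top = rank_top(generate(parents, 10), score_batch)
        best = max(best, top[0], key=lambda pair: pair[1])
        parents = [text for text, _ in top]  # the top 3 seed the next round
        if round_no >= 2 and best[1] >= threshold:
            break
    return best


def toy_generate(parents: List[str], n: int) -> List[str]:
    # Toy mutation: append a growing run of "!" to the parents in rotation.
    return [parents[i % len(parents)] + "!" * (i + 1) for i in range(n)]


def toy_score(variants: List[str]) -> List[List[float]]:
    # Five fake judges that all prefer copy close to 30 characters.
    return [[max(0.0, 100.0 - 2 * abs(30 - len(v)) - j) for j in range(5)] for v in variants]


winner, score = optimize_element("Ship faster", toy_generate, toy_score)
print(winner, score)
```

The scorer receives the whole list of variants at once, which mirrors the single-batch scoring described above.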
5 simulated judges: a mid-market B2B CMO ("would I stop?"), a skeptical founder ("do I believe this?"), a Conversion Rate Optimizer ("clear and action-driving?"), a senior copywriter ("compelling and differentiated?"), and your own CEO/founder voice ("ROI-obsessed, no BS"). Each scores 0-100; the final score is the average across all five.
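As a small illustration of the averaging rule, the sketch below computes a panel score from five judge scores. The judge keys and the sample numbers are hypothetical placeholders chosen for the example, not values from the skill.

```python
from statistics import mean

# Hypothetical identifiers for the five personas described above.
JUDGES = ("cmo", "skeptical_founder", "cro", "senior_copywriter", "ceo_voice")


def panel_score(scores: dict) -> float:
    """Average the five judge scores after checking each is present and within 0-100."""
    missing = set(JUDGES) - set(scores)
    if missing:
        raise ValueError(f"missing judge scores: {sorted(missing)}")
    for judge in JUDGES:
        if not 0 <= scores[judge] <= 100:
            raise ValueError(f"{judge} score out of range: {scores[judge]}")
    return mean([scores[judge] for judge in JUDGES])


# Made-up scores for one headline variant.
example = {"cmo": 88, "skeptical_founder": 79, "cro": 91,
           "senior_copywriter": 84, "ceo_voice": 83}
final = panel_score(example)
```

Because the final score is a plain average, one skeptical judge can pull an otherwise strong variant below a gate.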
When to use it
Use it for pre-launch content optimization, when there is no traffic yet. The correct sequence: reach an 85+ simulated score with autoresearch first, then deploy, then validate with real traffic. Industry benchmarks show why simulation comes first: the global average landing-page conversion rate is ~2.35% and headline optimization alone can drive up to a 34% lift, yet a real A/B test needs ~100,000 visitors per variant. Autoresearch compresses that pre-launch iteration into minutes.
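To make the benchmark arithmetic concrete, here is a rough calculation using only the figures quoted above. It assumes the 34% lift is relative to the baseline rate and that every variant in a live test needs its own traffic; the function names are illustrative.

```python
BASELINE_RATE = 0.0235           # ~2.35% average landing-page conversion (from the text)
MAX_HEADLINE_LIFT = 0.34         # up to 34% lift, assumed to be a relative lift
VISITORS_PER_VARIANT = 100_000   # approximate traffic per variant (from the text)


def lifted_rate(baseline: float, relative_lift: float) -> float:
    """Conversion rate after applying a relative lift to the baseline."""
    return baseline * (1 + relative_lift)


def visitors_needed(n_variants: int, per_variant: int = VISITORS_PER_VARIANT) -> int:
    """Total traffic a live test would need if every variant gets its own sample."""
    return n_variants * per_variant


best_case = lifted_rate(BASELINE_RATE, MAX_HEADLINE_LIFT)
traffic_for_round = visitors_needed(10)
print(f"best-case rate: {best_case:.2%}, visitors for 10 variants: {traffic_for_round:,}")
```

Under these assumptions, testing one round of 10 variants with live traffic would take about a million visitors, which is the cost the simulated panel avoids before launch.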
When NOT to use it: post-launch real-traffic A/B testing — that requires real analytics, not simulated scoring.
Method / frameworks
Karpathy-style autoresearch loop · LLM-as-Judge multi-juror panel (5 personas) · evolutionary variant generation (generate → batch-score → rank → keep top-3 → evolve) · element-specific score dimensions · multi-element cross-breeding · quality gates (<70 don't ship, 80-84 shippable, 85-89 ship with confidence, 90+ rare). Anti-pattern discipline: never call the API per variant (always batch), don't over-optimize a single dimension, don't run more than 5 rounds, don't cross-breed before each element has its own winner.
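The quality gates can be expressed as a simple lookup, sketched below. Two things are assumptions: the text gives no label for scores from 70 to 79, so the function reports that band as unspecified, and fractional scores such as 84.6 are placed in the band below the next cutoff.

```python
def quality_gate(score: float) -> str:
    """Map an average panel score (0-100) to the gate labels listed above."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    if score < 70:
        return "don't ship"
    if score < 80:
        # Assumption: the source does not label the 70-79 band.
        return "unspecified (70-79)"
    if score < 85:
        return "shippable"
    if score < 90:
        return "ship with confidence"
    return "rare"


for sample in (62, 75, 82, 87.5, 93):
    print(sample, "->", quality_gate(sample))
```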
How do I use this skill?
Upload the autoresearch.zip you downloaded as-is — no packaging needed, the format is already correct (folder at root).
- Open Settings → Customize → Skills
- Upload → select the autoresearch.zip you downloaded
- Claude reads SKILL.md; the name + description appear. Ready ✅
Scripts run in Anthropic's code-execution environment (sandbox) — not on your machine.
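If you want to sanity-check an archive before uploading, a short script like the one below can confirm the layout. It assumes "folder at root" means a single top-level folder that directly contains SKILL.md; the helper name and the in-memory demo archive are illustrative, not part of the skill.

```python
import io
import zipfile
from pathlib import PurePosixPath


def skill_root_folders(archive) -> set:
    """Return the top-level folders that directly contain a SKILL.md file."""
    with zipfile.ZipFile(archive) as zf:
        paths = [PurePosixPath(name) for name in zf.namelist()]
    return {p.parts[0] for p in paths if len(p.parts) == 2 and p.name == "SKILL.md"}


# Demo with an in-memory archive; pass a real path such as "autoresearch.zip" instead.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("autoresearch/SKILL.md", "placeholder")
buf.seek(0)
roots = skill_root_folders(buf)
print(roots)
```

An empty result would suggest the archive was repackaged with a different layout.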