Autoresearch — Conversion Copy Optimization Engine

Evolve your copy to 85+ with a 5-expert simulated panel — no traffic needed.

About the skill

What it does

Autoresearch adapts Andrej Karpathy's "autoresearch" optimization loop to conversion copy: instead of waiting on live traffic, a simulated expert panel finds the winning version in minutes. The flow has three layers.

(1) Intake & Parse: read the source, auto-detect the content type (landing page / email / ad / form), and list the optimizable elements (hero headline, subheadline, CTA, problem section, social proof, etc.).

(2) Round Structure, per element: Round 1 generates 10 variants, batch-scores all of them with the 5-judge panel in a single API call, ranks them by average score, and keeps the top 3. Round 2 analyzes the winning patterns and pushes them further with 10 new variants. Round 3 runs only if the score is still below threshold, and targets the weakest dimension.

(3) Cross-breed: assemble each element's winner, generate 5 holistic combinations, and score them as complete units. The combination with the highest holistic score is the output.
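The per-element round structure can be sketched roughly as below. This is a minimal illustration, not the skill's actual code: `generate`, `batch_score`, `run_round`, `optimize_element`, and `make_stubs` are hypothetical names, the stub scorer returns made-up numbers instead of calling an LLM, and the weakest-dimension targeting of Round 3 and the cross-breed layer are left out.

```python
from statistics import mean
from typing import Callable

# Hypothetical callables standing in for the LLM-backed steps of the real skill.
Generate = Callable[[str, list[str], int], list[str]]    # (element, seeds, n) -> variants
BatchScore = Callable[[list[str]], list[list[float]]]    # variants -> five judge scores each
Scored = tuple[str, float]


def run_round(element: str, seeds: list[str], generate: Generate,
              batch_score: BatchScore, n: int = 10, keep: int = 3) -> list[Scored]:
    """Generate n variants, score them in ONE batch call, return the top `keep`."""
    variants = generate(element, seeds, n)
    panel_scores = batch_score(variants)  # one call for the whole batch, never per variant
    ranked = sorted(zip(variants, map(mean, panel_scores)),
                    key=lambda pair: pair[1], reverse=True)
    return ranked[:keep]


def optimize_element(element: str, generate: Generate, batch_score: BatchScore,
                     threshold: float = 85.0) -> Scored:
    """Rounds 1 and 2 always run; Round 3 runs only if the leader is below threshold."""
    best: list[Scored] = []
    for round_no in (1, 2, 3):
        if round_no == 3 and best[0][1] >= threshold:
            break
        seeds = [text for text, _ in best]  # earlier winners seed the next round
        best = sorted(best + run_round(element, seeds, generate, batch_score),
                      key=lambda pair: pair[1], reverse=True)[:3]
    return best[0]


def make_stubs():
    """Toy generator and scorer so the sketch runs without any API."""
    state = {"round": 0, "score_calls": 0}

    def generate(element: str, seeds: list[str], n: int) -> list[str]:
        state["round"] += 1
        return [f"{element}#{state['round']}-{i}" for i in range(n)]

    def batch_score(variants: list[str]) -> list[list[float]]:
        state["score_calls"] += 1
        rows = []
        for variant in variants:
            round_no, index = variant.rsplit("#", 1)[1].split("-")
            base = 60.0 + 10 * int(round_no) + int(index)  # made-up score
            rows.append([base - 2, base - 1, base, base + 1, base + 2])
        return rows

    return generate, batch_score, state


generate, batch_score, state = make_stubs()
print(optimize_element("hero headline", generate, batch_score), state["score_calls"])
```

With these stubs the scorer is called once per round, which is the point of batching: the number of scoring calls tracks the number of rounds, not the number of variants.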

5 simulated judges: a mid-market B2B CMO ("would I stop?"), a skeptical founder ("do I believe this?"), a Conversion Rate Optimizer ("clear and action-driving?"), a senior copywriter ("compelling and differentiated?"), and your own CEO/founder voice ("ROI-obsessed, no BS"). Each scores 0-100; the final score is the average across all five.
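As a small illustration of how the final score falls out of the panel, here is a sketch that averages one score per judge. The `Judge` class, the `PANEL` tuple, and `panel_score` are hypothetical names chosen for this example; only the five personas, their questions, the 0-100 range, and the averaging rule come from the description above.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Judge:
    persona: str
    lens: str  # the question this judge asks of the copy


PANEL = (
    Judge("Mid-market B2B CMO", "Would I stop?"),
    Judge("Skeptical founder", "Do I believe this?"),
    Judge("Conversion Rate Optimizer", "Clear and action-driving?"),
    Judge("Senior copywriter", "Compelling and differentiated?"),
    Judge("CEO/founder voice", "ROI-obsessed, no BS"),
)


def panel_score(scores: dict[str, float]) -> float:
    """Average the five judges' 0-100 scores; reject missing or out-of-range values."""
    for judge in PANEL:
        if judge.persona not in scores:
            raise ValueError(f"missing score for {judge.persona}")
        if not 0 <= scores[judge.persona] <= 100:
            raise ValueError(f"score out of range for {judge.persona}")
    return mean(scores[judge.persona] for judge in PANEL)


example = dict(zip((j.persona for j in PANEL), (80, 70, 90, 85, 75)))
print(panel_score(example))
```

Because the final score is a plain average, one skeptical judge can pull a variant below a threshold even when the other four are enthusiastic.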

When to use it

For pre-launch content optimization, when there's no traffic yet. The correct sequence: reach an 85+ simulated score with autoresearch first, then deploy, then validate with real traffic. Commonly cited industry benchmarks explain why: the global average landing-page conversion rate is ~2.35%, headline optimization alone can drive up to a 34% lift, yet a real A/B test can need ~100,000 visitors per variant to detect a difference. Autoresearch compresses the pre-launch iteration into minutes; validation with real traffic still comes afterward.
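To see why real A/B tests need so much traffic at a ~2.35% baseline, here is a rough sketch using the standard two-proportion sample-size approximation. The source does not say which lift, significance level, or power sit behind its visitor figure, so the 10% relative lift, 5% two-sided significance, and 80% power below are assumptions made purely for illustration, and `visitors_per_variant` is a hypothetical helper name.

```python
from math import ceil
from statistics import NormalDist


def visitors_per_variant(baseline: float, relative_lift: float,
                         alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate visitors needed per variant to detect a relative lift.

    Uses the normal-approximation formula for comparing two proportions:
    n = (z_alpha/2 + z_beta)^2 * (p1(1-p1) + p2(1-p2)) / (p2 - p1)^2
    """
    p1 = baseline
    p2 = baseline * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


# Assumed scenario: 2.35% baseline, hoping to detect a 10% relative lift.
print(visitors_per_variant(0.0235, 0.10))
```

Under these assumptions the result lands in the tens of thousands of visitors per variant, and halving the detectable lift roughly quadruples it.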

When NOT to use it: post-launch real-traffic A/B testing — that requires real analytics, not simulated scoring.

Method / frameworks

Karpathy-style autoresearch loop · LLM-as-Judge multi-juror panel (5 personas) · evolutionary variant generation (generate → batch-score → rank → keep top-3 → evolve) · element-specific score dimensions · multi-element cross-breeding · quality gates (<70 don't ship, 80-84 shippable, 85-89 ship with confidence, 90+ rare). Anti-pattern discipline: never call the API per variant (always batch), don't over-optimize a single dimension, don't run more than 5 rounds, don't cross-breed before each element has its own winner.
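The quality gates and the round cap can be expressed as a small lookup, sketched below. `quality_gate` and `may_run_another_round` are hypothetical names. The source does not label the 70-79 band, so the sketch reports it as unspecified rather than guessing, and treating fractional averages such as 84.5 as belonging to the lower band is also an assumption.

```python
MAX_ROUNDS = 5  # from the anti-pattern list: don't run more than 5 rounds


def quality_gate(score: float) -> str:
    """Map a 0-100 panel average to the gate labels given in the description."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 90:
        return "rare"
    if score >= 85:
        return "ship with confidence"
    if score >= 80:
        return "shippable"
    if score >= 70:
        return "unspecified (70-79 is not labeled in the source)"
    return "don't ship"


def may_run_another_round(rounds_done: int) -> bool:
    """True while the loop is still under the 5-round cap."""
    return rounds_done < MAX_ROUNDS


for sample in (65, 82, 87, 93):
    print(sample, quality_gate(sample))
```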

How do I use this skill?

You don't "run" a skill. After installing it, you just tell the agent your task (for example, ask it to optimize a landing page's copy), and the skill kicks in on its own when the task matches its description.

Upload the autoresearch.zip you downloaded as-is — no packaging needed, the format is already correct (folder at root).
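If you want to confirm the archive layout before uploading, a quick local check is sketched below. It assumes that "folder at root" means a single top-level folder that directly contains SKILL.md. The function name `skill_root_folders` is hypothetical, and the folder name inside the zip is not stated in the source.

```python
import zipfile
from pathlib import PurePosixPath


def skill_root_folders(zip_source) -> set[str]:
    """Return the top-level folders in the archive that directly contain SKILL.md.

    `zip_source` may be a file path or a file-like object.
    """
    with zipfile.ZipFile(zip_source) as archive:
        folders = set()
        for name in archive.namelist():
            parts = PurePosixPath(name).parts
            # Expect exactly <folder>/SKILL.md, i.e. two path components.
            if len(parts) == 2 and parts[1] == "SKILL.md":
                folders.add(parts[0])
        return folders


# Usage (path is illustrative):
# print(skill_root_folders("autoresearch.zip"))
```

A result containing exactly one folder name suggests the zip matches the layout described above; an empty set suggests SKILL.md is missing or nested at a different depth.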

Open Settings → Customize → Skills
Upload → select the autoresearch.zip you downloaded
Claude reads SKILL.md; the name + description appear. Ready ✅

Scripts run in Anthropic's code-execution environment (sandbox) — not on your machine.