Interview Evaluation Scorer

Report

Turns messy interview notes into a defensible scorecard and a clear hiring decision.

About the skill

What it does

It takes the slippery "I had a good feeling" verdict of an interview and turns it into a defensible, evidence-bound scorecard. Every competency is rated 1-5 using BARS (Behaviorally Anchored Rating Scales; Smith & Kendall, 1963), so each rating is tied to the observed behavior it stands for rather than being a bare number. Each evidence sentence in the notes is mapped onto the STAR (Situation-Task-Action-Result) frame. Answers missing Action and Result (claims with no measurable outcome) automatically receive a low evidence-strength rating, and adjectives like "natural leader" or "team player" don't enter the score at all.
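The STAR filter described above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual code: the `EvidenceSentence` class, the `evidence_strength` function, and the "high"/"low" labels are hypothetical, and the sketch assumes each sentence has already been tagged with the STAR parts it contains.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EvidenceSentence:
    """One sentence from the interview notes, tagged with its STAR parts."""
    text: str
    situation: bool = False
    task: bool = False
    action: bool = False
    result: bool = False


def evidence_strength(ev: EvidenceSentence) -> Optional[str]:
    """Return "high", "low", or None (excluded from scoring).

    Assumed rule: a sentence with no STAR content at all (a bare
    adjective) is dropped; a sentence needs both Action and Result
    to count as strong evidence; everything else is low.
    """
    if not any((ev.situation, ev.task, ev.action, ev.result)):
        return None  # e.g. "natural leader": never enters the score
    if ev.action and ev.result:
        return "high"
    return "low"


notes = [
    EvidenceSentence("Natural leader."),
    EvidenceSentence("Said she led the migration.", situation=True, action=True),
    EvidenceSentence("Rewrote the ETL job and cut runtime from 6h to 40min.",
                     situation=True, task=True, action=True, result=True),
]
strengths = [evidence_strength(n) for n in notes]
```

Here the first note is excluded, the second is rated low because it states an action without an outcome, and the third is rated high.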

Scores are combined via predictive-validity weighting (Schmidt & Hunter 1998, revised by Sackett et al. 2022): structured interviews and work samples carry high weight because they predict job performance well, while "general impression / culture fit" carries low weight. The composite collapses into a 0-100 Fit Score and a five-tier verdict (Strong Yes → No). If a "must-have" competency falls below its threshold, a knock-out rule fires, so the gap can't be masked by a high average.
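As a rough sketch of how such a composite might work: the weights, tier cut-offs, intermediate tier names, and must-have threshold below are illustrative placeholders, not values taken from the skill or from the cited papers, and treating a knock-out as an automatic "No" is also an assumption.

```python
# Placeholder weights per evidence source (assumed, not the skill's values).
WEIGHTS = {"structured_interview": 0.4, "work_sample": 0.4, "general_impression": 0.2}

# Placeholder cut-offs; only "Strong Yes" and "No" are named in the listing.
TIERS = [(85, "Strong Yes"), (70, "Yes"), (55, "Lean Yes"), (40, "Lean No"), (0, "No")]

MUST_HAVE_THRESHOLD = 3  # assumed minimum 1-5 rating for a must-have competency


def fit_score(ratings: dict[str, float]) -> float:
    """Weighted mean of 1-5 ratings, rescaled linearly onto 0-100."""
    total_weight = sum(WEIGHTS[source] for source in ratings)
    weighted_mean = sum(WEIGHTS[s] * r for s, r in ratings.items()) / total_weight
    return round((weighted_mean - 1) / 4 * 100, 1)


def verdict(ratings: dict[str, float], must_have: dict[str, float]) -> str:
    """Apply the knock-out rule first, then map the Fit Score to a tier."""
    if any(r < MUST_HAVE_THRESHOLD for r in must_have.values()):
        return "No"  # knock-out: a high average cannot mask the gap
    score = fit_score(ratings)
    for floor, label in TIERS:
        if score >= floor:
            return label
    return "No"


ratings = {"structured_interview": 5, "work_sample": 5, "general_impression": 1}
score = fit_score(ratings)
passed = verdict(ratings, must_have={"sql": 4})
knocked_out = verdict(ratings, must_have={"sql": 2})
```

With these placeholder numbers the same ratings yield a positive verdict when the must-have competency is rated 4, and "No" when it is rated 2, regardless of the composite.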

The critical gate is the bias & legal audit: hints of age, gender, marital status, pregnancy, belief or health are treated as job-irrelevant, stripped from the score and flagged as critical (Turkish Labor Law art.5 equal treatment, KVKK art.6 special-category data, EEOC four-fifths (4/5) adverse-impact rule). Ratings are also adjusted for halo/horn and recency effects. With multiple panelists, it computes interrater agreement and flags divergent criteria as "calibration needed".
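For reference, the four-fifths rule compares each group's selection rate with the highest group's rate and treats a ratio below 0.8 as an indication of adverse impact. The listing does not show how the skill applies it, so the following is only a minimal sketch of the arithmetic; the function names and group labels are hypothetical.

```python
def impact_ratios(selected: dict[str, int], applicants: dict[str, int]) -> dict[str, float]:
    """Selection rate of each group divided by the highest group's rate."""
    rates = {g: selected.get(g, 0) / n for g, n in applicants.items() if n > 0}
    top = max(rates.values(), default=0.0)
    if top == 0:
        # Nobody was selected, so no disparity can be measured.
        return {g: 1.0 for g in rates}
    return {g: rate / top for g, rate in rates.items()}


def flagged_groups(selected: dict[str, int], applicants: dict[str, int],
                   threshold: float = 0.8) -> list[str]:
    """Groups whose impact ratio falls below the four-fifths threshold."""
    ratios = impact_ratios(selected, applicants)
    return sorted(g for g, ratio in ratios.items() if ratio < threshold)


applicants = {"group_a": 100, "group_b": 50}
selected = {"group_a": 40, "group_b": 10}
ratios = impact_ratios(selected, applicants)
flags = flagged_groups(selected, applicants)
```

In this made-up pool, group_a is selected at 40% and group_b at 20%, so group_b's ratio is 0.5 and it is flagged.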

When to use it

When you need an objective score + hiring recommendation from one candidate's interview notes.
When you want to calibrate and merge divergent panelist scores in a panel interview.
When you want to compare two finalists side by side on the same rubric.
When a decision rationale must be documented against audit/litigation risk.

Method / frameworks

BARS (Smith & Kendall, 1963) — behaviorally anchored 1-5 scale; bans anchor-less scoring.
STAR structuring — evidence-strength filter separating claim from proof.
Frame-of-Reference (FOR) calibration (Bernardin & Buckley, 1981) — interrater agreement + panelist calibration.
Predictive-validity weighting (Schmidt & Hunter 1998; Sackett et al. 2022).
Adverse Impact / legal frame — EEOC Uniform Guidelines (1978) 4/5 rule + Turkish Labor Law art.5 + KVKK art.6.
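
The panel-calibration step could look something like the sketch below. The listing does not say which agreement statistic the skill uses, so the simple max-minus-min spread here is a stand-in, and `calibration_report`, `max_spread`, and the "agreed" label are hypothetical names.

```python
from statistics import mean


def calibration_report(panel: dict[str, dict[str, int]], max_spread: int = 1) -> dict[str, dict]:
    """Summarize panelist ratings per competency and flag divergence.

    panel maps panelist -> competency -> 1-5 rating. A competency is
    flagged when the gap between the highest and lowest rating exceeds
    max_spread (an assumed tolerance of one scale point).
    """
    competencies = sorted({c for ratings in panel.values() for c in ratings})
    report = {}
    for competency in competencies:
        scores = [r[competency] for r in panel.values() if competency in r]
        spread = max(scores) - min(scores)
        report[competency] = {
            "mean": mean(scores),
            "spread": spread,
            "status": "calibration needed" if spread > max_spread else "agreed",
        }
    return report


panel = {
    "panelist_1": {"communication": 4, "sql": 2},
    "panelist_2": {"communication": 4, "sql": 5},
    "panelist_3": {"communication": 5, "sql": 3},
}
report = calibration_report(panel)
```

Here the panel agrees on communication (spread of 1) but diverges on sql (spread of 3), so only sql is marked "calibration needed".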

How do I use this skill?

You don't "run" a skill. After installing it, you just tell the agent your task (e.g. ask it to do the relevant job), and the skill activates on its own when its description matches the request.

Upload the downloaded mulakat-degerlendirme-notlandirici.zip as-is. No repackaging is needed; the format is already correct (folder at root).

Open Settings → Customize → Skills
Upload → select the mulakat-degerlendirme-notlandirici.zip you downloaded
Claude reads SKILL.md; the name + description appear. Ready ✅

Scripts run in Anthropic's code-execution environment (sandbox) — not on your machine.