Translation Quality Evaluator

Scores a source text and its translation against the MQM error typology and returns a PASS/FAIL/CONDITIONAL verdict.

About the skill

What it does

Compares a source text with its translation at segment (sentence/line) level and maps every deviation to a branch of the MQM (Multidimensional Quality Metrics) error typology: Accuracy (fidelity to source — mistranslation, omission, addition, untranslated), Fluency (target-language correctness — grammar, spelling, punctuation, register), Terminology (term base / domain fit), Style (style guide / tone), Locale Conventions (date/number/currency/measurement format) and Audience/Design (placeholder, markup, length overflow). Each finding receives a severity (neutral/minor/major/critical) and its corresponding penalty weight (0/1/5/10).
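
As a rough illustration of the paragraph above, one finding could be modelled as a small record that ties a segment to an MQM branch and a severity. This is only a sketch: the `Finding` class, the `penalty_by_category` helper, and the lowercase category keys are hypothetical names chosen for the example, not the skill's actual data model. Only the severity weights (0/1/5/10) come from the description.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical identifiers for the six MQM branches listed above.
CATEGORIES = {
    "accuracy", "fluency", "terminology",
    "style", "locale_conventions", "audience_design",
}

# Severity weights as stated in the description.
WEIGHTS = {"neutral": 0, "minor": 1, "major": 5, "critical": 10}


@dataclass(frozen=True)
class Finding:
    """One deviation found in one segment (illustrative structure only)."""
    segment: int
    category: str
    severity: str
    note: str = ""

    def __post_init__(self):
        # Reject labels outside the typology and severity scale.
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category}")
        if self.severity not in WEIGHTS:
            raise ValueError(f"unknown severity: {self.severity}")

    @property
    def penalty(self):
        """Penalty weight implied by the severity."""
        return WEIGHTS[self.severity]


def penalty_by_category(findings):
    """Sum penalty weights per MQM branch."""
    totals = Counter()
    for finding in findings:
        totals[finding.category] += finding.penalty
    return dict(totals)
```

Grouping penalties by branch like this makes it easy to see whether a translation loses points mainly on Accuracy or on Fluency, which a single total would hide.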

The score follows the penalty-weighted MQM Scoring Model: MQM Score = max(0, (1 − Σpenalty / (wordcount × 10)) × 100), where Σpenalty is the sum of the penalty weights of all findings. This 0–100 value is compared against a threshold set per DQF content profile (legal ≥99, marketing ≥95, ui ≥97, utility ≥90) to produce a PASS/FAIL/CONDITIONAL verdict. In zero-tolerance profiles, a single critical error forces an automatic FAIL even when the score is high; this severity hard stop applies independently of the threshold. Every major or critical finding carries a concrete suggested fix that the reviewer can apply directly.
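
The scoring rule and the hard stop can be sketched in a few lines. The formula, weights, and thresholds are taken from the text; the function names and the `zero_tolerance` flag are assumptions made for this example. The description does not say which profiles are zero-tolerance or when CONDITIONAL is returned, so the sketch takes zero tolerance as a parameter and returns only PASS or FAIL.

```python
# Severity weights and profile thresholds as quoted in the description.
PENALTY = {"neutral": 0, "minor": 1, "major": 5, "critical": 10}
THRESHOLD = {"legal": 99, "marketing": 95, "ui": 97, "utility": 90}


def mqm_score(severities, wordcount):
    """Return max(0, (1 - sum(penalty) / (wordcount * 10)) * 100)."""
    if wordcount <= 0:
        raise ValueError("wordcount must be positive")
    total_penalty = sum(PENALTY[s] for s in severities)
    return max(0.0, (1 - total_penalty / (wordcount * 10)) * 100)


def verdict(severities, wordcount, profile, zero_tolerance=False):
    """Return "PASS" or "FAIL" (the CONDITIONAL rule is not specified, so it is omitted)."""
    # Hard stop: in a zero-tolerance profile one critical error fails outright,
    # whatever the numeric score turns out to be.
    if zero_tolerance and "critical" in severities:
        return "FAIL"
    score = mqm_score(severities, wordcount)
    return "PASS" if score >= THRESHOLD[profile] else "FAIL"
```

For example, a 1,000-word legal text with one critical error scores 99.9, which clears the 99 threshold, yet `verdict(["critical"], 1000, "legal", zero_tolerance=True)` still returns FAIL because of the hard stop.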

When to use it

Pre-delivery LQA (Language Quality Assurance) of a translation/localization.
Auditing machine translation post-editing (MTPE) quality against the ISO 18587 light/full post-editing distinction.
Blind comparison / vendor scorecard between two LSPs or translators.
Systematic terminology or style-guide compliance checks.

Method / frameworks

Translation quality evaluation splits into an analytical (error-based) paradigm and a holistic (fluency-based) one; this skill uses the analytical paradigm as its backbone and the holistic one as a cross-check. Canonical frameworks: MQM, the backbone (an error typology aligned with ASTM F2575 / ISO 5060:2024); the MQM Scoring Model (penalty-weighted error rate; minor=1/major=5/critical=10); TAUS DQF (content profiles plus 1-4 Adequacy/Fluency ratings with dynamic thresholds); ISO 17100:2015 (TEP process, reviser competence); ISO 18587:2017 (MT post-editing, light vs full); and, when a reference translation exists, automatic metrics (BLEU, chrF, and COMET, which is neural and the closest of the three to human judgment). The deprecated LISA QA Model is mapped to its MQM equivalent when encountered. Known anti-patterns are explicitly guarded against: mistaking Fluency for Accuracy, scoring preferential changes as errors, reporting severity-blind single scores, and overlooking placeholder or locale errors.

How do I use this skill?

You don't "run" a skill. Once it is installed, describe your task to the agent (for example, ask it to evaluate a translation), and the skill activates on its own when the task matches its description.

Upload the downloaded ceviri-kalite-degerlendirici.zip as-is. No repackaging is needed: the archive already has the correct layout (skill folder at the root).

Open Settings → Customize → Skills
Upload → select the ceviri-kalite-degerlendirici.zip you downloaded
Claude reads SKILL.md, and the skill's name and description appear. Ready ✅

Scripts run in Anthropic's code-execution environment (sandbox) — not on your machine.