Summarization Coherence Judge

SummarizationCoherenceJudge evaluates the writing quality of a summary: structure, clarity, and logical flow. It complements SummarizationConsistencyJudge by focusing on how the summary reads rather than on whether it is factual, and it returns a 0.0–1.0 score derived from a raw 0–10 judgement.

Improving summary readability
```python
from opik.evaluation.metrics import SummarizationCoherenceJudge

metric = SummarizationCoherenceJudge()

score = metric.score(
    output="""SUMMARY: First, the product launched. Revenue grew. Margins fell. Next steps TBD.""",
)

print(score.value)   # 0.0–1.0 after normalisation
print(score.reason)
```

Inputs

| Argument | Type | Required | Description |
| --- | --- | --- | --- |
| `output` | `str` | Yes | Summary text to evaluate. |
| `input` | `str` | No | Original document or talk track, supplied for additional context. |
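
When the source text is available, passing it as `input` gives the judge extra context for its flow and completeness critique. A minimal sketch, reusing the `score` signature shown above (the sample source text is invented for illustration):

```python
from opik.evaluation.metrics import SummarizationCoherenceJudge

metric = SummarizationCoherenceJudge()

# `input` is optional: it lets the judge weigh the summary's structure
# against what the source actually covered.
score = metric.score(
    input="""Q3 update: the product launched in July, revenue grew 12%,
margins compressed, and next steps are still being decided.""",
    output="""SUMMARY: First, the product launched. Revenue grew. Margins fell. Next steps TBD.""",
)

print(score.value)
print(score.reason)
```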

Configuration

| Parameter | Default | Notes |
| --- | --- | --- |
| `model` | `gpt-5-nano` | Upgrade when assessing long-form or domain-specific summaries. |
| `temperature` | `0.0` | Raise slightly (≤0.3) to elicit more diverse stylistic critiques. |
| `track` | `True` | Toggle off to skip logging. |
| `project_name` | `None` | Override when tracking across projects. |
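
For reference, here is a sketch with every parameter overridden. It assumes the options above are accepted as constructor keyword arguments; the model name and project name are hypothetical stand-ins:

```python
from opik.evaluation.metrics import SummarizationCoherenceJudge

metric = SummarizationCoherenceJudge(
    model="gpt-4o",                # hypothetical upgrade for long-form summaries
    temperature=0.2,               # ≤0.3: slightly more varied stylistic critiques
    track=True,                    # default; set to False to skip logging
    project_name="summary-evals",  # hypothetical name for cross-project tracking
)
```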

Pair this judge with SummarizationConsistencyJudge to ensure summaries are both factual and easy to skim. Under the hood, the evaluator returns a raw 0–10 integer that Opik normalises to the 0.0–1.0 range.
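
A sketch of gating on both judges together; it assumes SummarizationConsistencyJudge lives in the same module, exposes the same `score` signature, and accepts the source text as `input`:

```python
from opik.evaluation.metrics import (
    SummarizationCoherenceJudge,
    SummarizationConsistencyJudge,  # assumed import path and signature
)

document = "Full source document or talk track goes here."
summary = "SUMMARY: First, the product launched. Revenue grew. Margins fell. Next steps TBD."

coherence = SummarizationCoherenceJudge().score(output=summary)
consistency = SummarizationConsistencyJudge().score(input=document, output=summary)

# Gate on both axes: the summary must read well AND stay factual.
# The 0.7 threshold is illustrative, not an Opik recommendation.
if coherence.value >= 0.7 and consistency.value >= 0.7:
    print("Summary passes both checks.")
else:
    print(coherence.reason)
    print(consistency.reason)
```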