Summarization Coherence Judge

SummarizationCoherenceJudge evaluates the writing quality of a summary: structure, clarity, and logical flow. It complements SummarizationConsistencyJudge by focusing on how the summary reads rather than on whether it is factual, and it returns a 0.0–1.0 score derived from a raw 0–10 judgement.

Improving summary readability
```python
from opik.evaluation.metrics import SummarizationCoherenceJudge

metric = SummarizationCoherenceJudge()

score = metric.score(
    output="""SUMMARY: First, the product launched. Revenue grew. Margins fell. Next steps TBD.""",
)

print(score.value)   # 0.0–1.0 after normalisation
print(score.reason)
```

Inputs

| Argument | Type | Required | Description |
| --- | --- | --- | --- |
| `output` | `str` | Yes | Summary text to evaluate. |
| `input` | `str` | No | Original document or talk track, supplied for additional context. |
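
When the source text is available, passing it as `input` gives the judge extra context for its flow and completeness critique. A minimal sketch, reusing the `score` signature shown above (the sample source text is invented for illustration):

```python
from opik.evaluation.metrics import SummarizationCoherenceJudge

metric = SummarizationCoherenceJudge()

# `input` is optional: it lets the judge weigh the summary's structure
# against what the source actually covered.
score = metric.score(
    input="""Q3 update: the product launched in July, revenue grew 12%,
margins compressed, and next steps are still being decided.""",
    output="""SUMMARY: First, the product launched. Revenue grew. Margins fell. Next steps TBD.""",
)

print(score.value)
print(score.reason)
```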

Configuration

| Parameter | Default | Notes |
| --- | --- | --- |
| `model` | `gpt-5-nano` | Upgrade when assessing long-form or domain-specific summaries. |
| `temperature` | `0.0` | Raise slightly (≤0.3) to elicit more diverse stylistic critiques. |
| `track` | `True` | Toggle off to skip logging. |
| `project_name` | `None` | Override when tracking across projects. |
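
For reference, here is a sketch with every parameter overridden. It assumes the options above are accepted as constructor keyword arguments; the model name and project name are hypothetical stand-ins:

```python
from opik.evaluation.metrics import SummarizationCoherenceJudge

metric = SummarizationCoherenceJudge(
    model="gpt-4o",                # hypothetical upgrade for long-form summaries
    temperature=0.2,               # ≤0.3: slightly more varied stylistic critiques
    track=True,                    # default; set to False to skip logging
    project_name="summary-evals",  # hypothetical name for cross-project tracking
)
```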

Pair this judge with SummarizationConsistencyJudge to ensure summaries are both factual and easy to skim. Under the hood, the evaluator returns a raw 0–10 integer that Opik normalises to the 0.0–1.0 range.
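
A sketch of gating on both judges together; it assumes SummarizationConsistencyJudge lives in the same module, exposes the same `score` signature, and accepts the source text as `input`:

```python
from opik.evaluation.metrics import (
    SummarizationCoherenceJudge,
    SummarizationConsistencyJudge,  # assumed import path and signature
)

document = "Full source document or talk track goes here."
summary = "SUMMARY: First, the product launched. Revenue grew. Margins fell. Next steps TBD."

coherence = SummarizationCoherenceJudge().score(output=summary)
consistency = SummarizationConsistencyJudge().score(input=document, output=summary)

# Gate on both axes: the summary must read well AND stay factual.
# The 0.7 threshold is illustrative, not an Opik recommendation.
if coherence.value >= 0.7 and consistency.value >= 0.7:
    print("Summary passes both checks.")
else:
    print(coherence.reason)
    print(consistency.reason)
```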