ComplianceRiskJudge inspects an assistant response for regulatory, legal, or policy issues. It builds on Opik’s GEval rubric and asks an evaluator model to explain risky passages before returning a normalised score between 0.0 and 1.0 (derived from a raw 0–10 verdict).
Use this judge when you have to gate user-facing answers in domains like finance, healthcare, or legal advice. Read score.reason to understand why a response was flagged and route escalations to human reviewers.
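The gating pattern described above can be sketched as follows. This is a minimal illustration, not Opik's API: `JudgeResult` is a hypothetical stand-in for the score object (only the `value` and `reason` fields mentioned in this page are assumed), `RISK_THRESHOLD` is an arbitrary example cut-off, and the sketch assumes that higher values mean higher risk, which you should confirm for your setup.

```python
from dataclasses import dataclass


@dataclass
class JudgeResult:
    """Hypothetical stand-in for the judge's result: a 0.0-1.0 value plus a reason."""
    value: float
    reason: str


# Hypothetical cut-off; tune it for your own domain and risk tolerance.
RISK_THRESHOLD = 0.5


def route_response(result: JudgeResult, threshold: float = RISK_THRESHOLD) -> dict:
    """Decide whether to deliver a response or escalate it to a human reviewer.

    Assumes higher values indicate higher compliance risk.
    """
    if not 0.0 <= result.value <= 1.0:
        raise ValueError(f"expected a normalised score in [0, 1], got {result.value}")
    action = "escalate" if result.value >= threshold else "deliver"
    # Keep the evaluator's explanation so reviewers can see why it was flagged.
    return {"action": action, "score": result.value, "reason": result.reason}


if __name__ == "__main__":
    flagged = JudgeResult(value=0.8, reason="Promises guaranteed investment returns.")
    print(route_response(flagged))
```

Carrying `reason` along with the routing decision means a human reviewer sees the evaluator's explanation without re-running the judge.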
This metric automatically requests log probabilities when the evaluator model supports them. The evaluator emits an integer between 0 and 10, which Opik normalises to the 0.0–1.0 range. If you override the model parameter, make sure the provider exposes logprobs and top_logprobs for best results.
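To show why log probabilities can matter, here is a rough sketch of one common G-Eval-style approach: weight each candidate score token by its probability, then normalise the expected value. This is an assumption about the general technique, not a description of Opik's internal implementation; `expected_score` and the shape of its `top_logprobs` argument (a mapping from token text to log probability) are hypothetical.

```python
import math
from typing import Mapping


def expected_score(top_logprobs: Mapping[str, float], max_score: int = 10) -> float:
    """Return a probability-weighted score normalised to the 0.0-1.0 range.

    top_logprobs maps candidate tokens (e.g. "7") to their log probabilities.
    Tokens that are not integers in [0, max_score] are ignored.
    """
    weighted_sum = 0.0
    total_prob = 0.0
    for token, logprob in top_logprobs.items():
        try:
            score = int(token.strip())
        except ValueError:
            continue  # skip non-numeric tokens such as punctuation or words
        if not 0 <= score <= max_score:
            continue
        prob = math.exp(logprob)
        weighted_sum += score * prob
        total_prob += prob
    if total_prob == 0.0:
        raise ValueError("no valid score tokens found in top_logprobs")
    # Renormalise over the valid score tokens, then map 0-10 onto 0-1.
    return (weighted_sum / total_prob) / max_score


if __name__ == "__main__":
    # Evaluator is split evenly between 7 and 8; the word token is ignored.
    candidates = {"7": math.log(0.5), "8": math.log(0.5), "high": math.log(0.01)}
    print(expected_score(candidates))
```

Without log probabilities, a judge can only use the single integer the evaluator emitted, so the score moves in steps of 0.1; a weighted average like this gives finer-grained values when the evaluator is uncertain between adjacent ratings.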