Advanced configuration
Opik’s metrics expose several power-user controls so you can tailor evaluations to your workflows. This guide covers the most common tweaks: asynchronous scoring, evaluator temperature, log-probability handling, and tracking controls.
Asynchronous scoring with ascore
Every built-in metric inherits from BaseMetric, which defines ascore, an asynchronous counterpart to score. Use ascore when you need to run evaluations inside an async pipeline or when the underlying provider (e.g., LangChain, Ragas) requires an event loop.
Within synchronous code you can still call score—Opik will run the async implementation under the hood when needed. When integrating with async frameworks (FastAPI endpoints, streaming agents, or notebooks using nest_asyncio), prefer the explicit await metric.ascore(...) form.
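As a minimal sketch, here is how the two entry points compare, using the built-in Equals metric for illustration (any BaseMetric subclass exposes the same pair):

```python
import asyncio

from opik.evaluation.metrics import Equals

# Equals is a lightweight heuristic metric, used here purely for
# illustration; LLM-judge metrics expose the same score / ascore pair.
metric = Equals()

# Synchronous entry point.
print(metric.score(output="Paris", reference="Paris").value)

# Asynchronous entry point, for use inside an existing event loop.
async def evaluate() -> None:
    result = await metric.ascore(output="Paris", reference="Paris")
    print(result.value)  # 1.0 on an exact match

asyncio.run(evaluate())
```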
Controlling evaluator temperature
GEval-based judges accept a temperature argument. Lower temperatures make the judge’s output more deterministic, which improves reproducibility; higher values introduce more variation into the judge’s reasoning and can surface edge cases in your rubric.
Opik caches evaluator chain-of-thought prompts per (task, criteria, model, completion_kwargs) combination. Changing temperature or other LiteLLM keyword arguments (e.g., top_p) produces a fresh cache entry so experiments stay isolated.
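As a sketch of how this plays out (the task and criteria strings are illustrative, and the example assumes temperature is forwarded to the LiteLLM call as described above):

```python
from opik.evaluation.metrics import GEval

TASK = "You are an expert judge evaluating summaries."
CRITERIA = "The summary must be faithful to the source text."

# Same task and criteria, different temperature: per the caching rule
# above, each judge gets its own chain-of-thought cache entry.
deterministic_judge = GEval(
    task_introduction=TASK,
    evaluation_criteria=CRITERIA,
    temperature=0.0,
)
exploratory_judge = GEval(
    task_introduction=TASK,
    evaluation_criteria=CRITERIA,
    temperature=0.9,
)
```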
Log probabilities and evaluator models
When the LiteLLM backend supports logprobs and top_logprobs, Opik automatically requests them to stabilise GEval scores (mirroring the original paper). If you switch to a model that does not expose log probabilities, the metric still works—the score is computed from the raw judgement only.
You can inspect the evaluator’s capabilities at runtime.
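A minimal sketch using LiteLLM’s get_supported_openai_params helper (gpt-4o stands in for whichever evaluator model you use; Opik’s internal capability check may differ):

```python
import litellm

# Returns the OpenAI-style parameters the model accepts; logprobs
# support determines whether Opik can request them for GEval scoring.
params = litellm.get_supported_openai_params(model="gpt-4o")

print("logprobs" in params)
print("top_logprobs" in params)
```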
If you need to propagate additional LiteLLM options (for example, response_format or frequency_penalty), instantiate LiteLLMChatModel manually and pass it to the metric.
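A minimal sketch (the model name, penalty value, and criteria strings are illustrative; extra keyword arguments are forwarded to the underlying LiteLLM completion call):

```python
from opik.evaluation.metrics import GEval
from opik.evaluation.models import LiteLLMChatModel

# Extra keyword arguments are passed through to litellm.completion.
model = LiteLLMChatModel(
    model_name="gpt-4o",
    frequency_penalty=0.2,
)

metric = GEval(
    task_introduction="You are an expert judge evaluating answers.",
    evaluation_criteria="The answer must directly address the question.",
    model=model,
)
```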
Because the model fingerprint is part of the cache key, changing these kwargs forces a new evaluator rubric to be generated.
Tracking controls
Most metrics accept track and project_name keyword arguments so you can decide whether each run writes to Opik and which project it belongs to.
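For instance (Hallucination stands in for any metric that accepts these keywords):

```python
from opik.evaluation.metrics import Hallucination

# Quick local experiment: nothing is written to Opik.
ad_hoc_metric = Hallucination(track=False)

# Tracked run, grouped under a dedicated project.
tracked_metric = Hallucination(project_name="llm-migration")
```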
Disable tracking when running quick, ad-hoc experiments locally, or set project_name="llm-migration" to group evaluations by initiative.