In Opik 2.0, experiments are project-scoped. Make sure to specify a projectName when calling evaluate() so results are associated with the correct project.
The evaluate function allows you to run comprehensive evaluations of LLM tasks against datasets using customizable metrics.
The function accepts a single options parameter of type EvaluateOptions, which contains the following properties:
The function returns a Promise that resolves to an EvaluationResult object containing:
For reproducible evaluations, use a DatasetVersion instead of Dataset:
If no client is provided, the function uses the global Opik client instance.
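The client fallback and the project-name advice above can be sketched in plain TypeScript. This is only an illustration of the pattern, not the SDK's implementation: `SketchClient`, `SketchEvaluateOptions`, `globalClient`, `resolveClient`, and `requireProjectName` are hypothetical names invented for this sketch, and the real `EvaluateOptions` type has more properties than shown here.

```typescript
// Hypothetical stand-ins for illustration only; these are NOT the Opik SDK types.
interface SketchClient {
  name: string;
}

interface SketchEvaluateOptions {
  projectName?: string;
  client?: SketchClient;
}

// Stand-in for the global client instance used when no client is passed.
const globalClient: SketchClient = { name: "global" };

// Prefer the client supplied in the options; otherwise fall back to the global one.
function resolveClient(options: SketchEvaluateOptions): SketchClient {
  return options.client ?? globalClient;
}

// A pre-flight guard you might add in your own code: because experiments are
// project-scoped, fail early when no project name was given.
function requireProjectName(options: SketchEvaluateOptions): string {
  const name = options.projectName?.trim();
  if (!name) {
    throw new Error("projectName is required so results land in the intended project");
  }
  return name;
}
```

A guard like `requireProjectName` is optional; it simply turns a forgotten project name into an immediate error in your own code rather than leaving the experiment's project unspecified.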