Evaluate Function
The evaluate function runs an LLM task against every item in a dataset and scores the outputs with customizable metrics.
Parameters
The function accepts a single options parameter of type EvaluateOptions, which contains the following properties:
Returns
The function returns a Promise that resolves to an EvaluationResult object containing:
- Aggregated scores across all evaluated samples
- Individual sample results
- Execution metadata
Example Usage
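The following is a minimal, self-contained sketch of the evaluation flow described on this page, not the Opik SDK itself. Every name in it (evaluateSketch, SketchOptions, SketchResult, Metric, Sample, and the property names dataset, task, and metrics) is a hypothetical stand-in chosen for illustration; the real EvaluateOptions and EvaluationResult types may use different names and shapes, and the real function also creates an experiment in Opik, which this sketch does not do. Aggregating scores as a per-metric mean is likewise an assumption.

```typescript
// Hypothetical stand-in types; the real Opik SDK types may differ.
type Sample<I, E> = { input: I; expected: E };

type Metric<O, E> = {
  name: string;
  score: (output: O, expected: E) => number;
};

interface SketchOptions<I, O, E> {
  dataset: Sample<I, E>[];          // items to evaluate
  task: (input: I) => Promise<O>;   // the LLM task under evaluation
  metrics: Metric<O, E>[];          // scoring functions applied to each output
}

interface SketchResult<I, O> {
  aggregatedScores: Record<string, number>;
  sampleResults: { input: I; output: O; scores: Record<string, number> }[];
  metadata: { sampleCount: number; durationMs: number };
}

async function evaluateSketch<I, O, E>(
  options: SketchOptions<I, O, E>
): Promise<SketchResult<I, O>> {
  const startedAt = Date.now();
  const sampleResults: SketchResult<I, O>["sampleResults"] = [];

  try {
    // Run the task on each sample and score its output with every metric.
    for (const sample of options.dataset) {
      const output = await options.task(sample.input);
      const scores: Record<string, number> = {};
      for (const metric of options.metrics) {
        scores[metric.name] = metric.score(output, sample.expected);
      }
      sampleResults.push({ input: sample.input, output, scores });
    }
  } catch (error) {
    // Mirror the documented behavior: log the error, then re-throw it.
    console.error("Evaluation failed:", error);
    throw error;
  }

  // Assumption: aggregate each metric as the mean over all samples.
  const aggregatedScores: Record<string, number> = {};
  for (const metric of options.metrics) {
    const total = sampleResults.reduce(
      (sum, r) => sum + (r.scores[metric.name] ?? 0),
      0
    );
    aggregatedScores[metric.name] =
      sampleResults.length > 0 ? total / sampleResults.length : 0;
  }

  return {
    aggregatedScores,
    sampleResults,
    metadata: { sampleCount: sampleResults.length, durationMs: Date.now() - startedAt },
  };
}

// Usage: the type parameters fix the input, output, and expected-value types.
const resultPromise = evaluateSketch<string, string, string>({
  dataset: [
    { input: "hello", expected: "HELLO" },
    { input: "world", expected: "world" },
  ],
  task: async (input) => input.toUpperCase(),
  metrics: [
    { name: "exact_match", score: (output, expected) => (output === expected ? 1 : 0) },
  ],
});
```

In this sketch the first sample matches and the second does not, so the aggregated exact_match score is 0.5. Passing explicit type parameters, as in the call above, lets the compiler check that the task and the metrics agree on input and output types.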
Notes
- The function automatically creates an experiment in Opik for tracking and analysis
- If no client is provided, it uses the global Opik client instance
- You can provide type parameters to properly type your dataset and task inputs/outputs
- Errors raised during evaluation are logged and then re-thrown to the caller