Evaluate Function
The evaluate function lets you run comprehensive evaluations of LLM tasks against datasets using customizable metrics.
Parameters
The function accepts a single options parameter of type EvaluateOptions, which contains the following properties:
Returns
The function returns a Promise that resolves to an EvaluationResult object containing:
- Aggregated scores across all evaluated samples
- Individual sample results
- Execution metadata
Example Usage
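As a minimal sketch of the flow described above, the following self-contained TypeScript mimics what an evaluation run does: it runs a task over each dataset item, scores each output, and returns aggregated scores, per-sample results, and execution metadata. It does not import or call the Opik SDK. The names evaluateSketch, Metric, SampleResult, SketchResult, and every field name are hypothetical stand-ins chosen for illustration; the real EvaluateOptions and EvaluationResult types may be shaped differently.

```typescript
// Hypothetical stand-in types: these are NOT the Opik SDK's definitions.
type Metric<Out> = {
  name: string;
  score: (output: Out, expected: Out) => number;
};

interface SampleResult<In, Out> {
  input: In;
  output: Out;
  scores: Record<string, number>; // one score per metric name
}

interface SketchResult<In, Out> {
  aggregatedScores: Record<string, number>; // mean score per metric
  sampleResults: SampleResult<In, Out>[];
  metadata: { sampleCount: number; durationMs: number };
}

// Runs the task on every dataset item, scores each output, then averages.
async function evaluateSketch<In, Out>(options: {
  dataset: { input: In; expected: Out }[];
  task: (input: In) => Promise<Out>;
  metrics: Metric<Out>[];
}): Promise<SketchResult<In, Out>> {
  const started = Date.now();
  const sampleResults: SampleResult<In, Out>[] = [];

  for (const item of options.dataset) {
    const output = await options.task(item.input);
    const scores: Record<string, number> = {};
    for (const metric of options.metrics) {
      scores[metric.name] = metric.score(output, item.expected);
    }
    sampleResults.push({ input: item.input, output, scores });
  }

  // Aggregate by taking the mean of each metric across all samples.
  const aggregatedScores: Record<string, number> = {};
  for (const metric of options.metrics) {
    const total = sampleResults.reduce(
      (sum, r) => sum + (r.scores[metric.name] ?? 0),
      0,
    );
    aggregatedScores[metric.name] =
      sampleResults.length > 0 ? total / sampleResults.length : 0;
  }

  return {
    aggregatedScores,
    sampleResults,
    metadata: {
      sampleCount: sampleResults.length,
      durationMs: Date.now() - started,
    },
  };
}

// Usage: a toy task that answers one of two items correctly.
const exactMatch: Metric<string> = {
  name: "exact_match",
  score: (output, expected) => (output === expected ? 1 : 0),
};

const demo = evaluateSketch<string, string>({
  dataset: [
    { input: "2+2", expected: "4" },
    { input: "3+3", expected: "6" },
  ],
  task: async (input) => (input === "2+2" ? "4" : "7"),
  metrics: [exactMatch],
});

demo.then((result) => console.log(result.aggregatedScores));
```

The explicit type arguments on evaluateSketch mirror the note below about typing dataset and task inputs/outputs: with them in place, the compiler rejects a task whose return type does not match the dataset's expected values.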
Notes
- The function automatically creates an experiment in Opik for tracking and analysis
- If no client is provided, it uses the global Opik client instance
- You can provide type parameters to properly type your dataset and task inputs/outputs
- Errors raised during evaluation are logged and then re-thrown, so callers still need their own error handling
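The log-and-re-throw behavior in the last note can be illustrated with a small generic wrapper. This is only a sketch of the general pattern, not the SDK's internals: runLogged and its log parameter are hypothetical names, and how Opik actually logs errors is not specified here.

```typescript
// Hypothetical helper: logs a failure, then re-throws the original error
// so the caller's own try/catch (or .catch) still sees it.
async function runLogged<T>(
  label: string,
  work: () => Promise<T>,
  log: (message: string) => void = console.error,
): Promise<T> {
  try {
    return await work();
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    log(`${label} failed: ${message}`);
    throw error; // same error object, not a wrapped copy
  }
}

// Usage: the caller still handles the failure after it has been logged.
async function main(): Promise<void> {
  try {
    await runLogged("evaluation", async () => {
      throw new Error("task failed");
    });
  } catch (error) {
    console.log("caller handled:", (error as Error).message);
  }
}

main();
```

Because the error is re-thrown rather than swallowed, an evaluation call written this way should be awaited inside a try/catch or followed by a rejection handler.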