EvaluationResult

class opik.evaluation.evaluation_result.EvaluationResult(experiment_id: str, dataset_id: str, experiment_name: str | None, test_results: List[opik.evaluation.test_result.TestResult], experiment_url: str | None, trial_count: int)

Bases: object

experiment_id: str

dataset_id: str

experiment_name: str | None

test_results: List[TestResult]

experiment_url: str | None

trial_count: int

aggregate_evaluation_scores() → EvaluationResultAggregatedScoresView

Aggregates the evaluation scores from the test results and returns them as an EvaluationResultAggregatedScoresView.

The view contains information about the experiment together with the computed aggregated scores. The aggregated scores dictionary has one key per score name found in the test results; each value holds the statistics for that score.

Returns: EvaluationResultAggregatedScoresView object containing details about the experiment and the aggregated scores calculated from test results.
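The aggregation described above can be pictured with a minimal, self-contained sketch. It does not use Opik itself: `FakeScore`, `FakeTestResult`, and `aggregate_scores` are hypothetical stand-ins, and the choice of mean, min, and max as the statistics is an assumption made for illustration. The statistics that Opik actually computes, and the exact shape of the returned view, may differ.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass
class FakeScore:
    # Hypothetical stand-in for one metric result attached to a test result.
    name: str
    value: float


@dataclass
class FakeTestResult:
    # Hypothetical stand-in for opik.evaluation.test_result.TestResult.
    dataset_item_id: str
    scores: List[FakeScore]


def aggregate_scores(
    test_results: List[FakeTestResult],
) -> Dict[str, Dict[str, float]]:
    """Collect the values for each score name, then summarise each group."""
    values_by_name: Dict[str, List[float]] = defaultdict(list)
    for result in test_results:
        for score in result.scores:
            values_by_name[score.name].append(score.value)

    # One key per score name; each value holds that score's statistics.
    # Mean/min/max are assumed here purely for illustration.
    return {
        name: {"mean": mean(values), "min": min(values), "max": max(values)}
        for name, values in values_by_name.items()
    }


results = [
    FakeTestResult("item-1", [FakeScore("accuracy", 1.0), FakeScore("relevance", 0.5)]),
    FakeTestResult("item-2", [FakeScore("accuracy", 0.0), FakeScore("relevance", 1.0)]),
]
aggregated = aggregate_scores(results)
print(aggregated)
```

With a real EvaluationResult, the equivalent step would presumably be a single call to aggregate_evaluation_scores() on the result object, with the statistics read from the returned view.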

group_by_dataset_item_view() → EvaluationResultGroupByDatasetItemsView

Creates a view of the evaluation results grouped by dataset item.

Returns: EvaluationResultGroupByDatasetItemsView containing the organized results with aggregated score statistics.
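Grouping by dataset item is most useful when each item is evaluated more than once (trial_count greater than 1). The sketch below illustrates the general idea with plain Python. `FakeTrial`, `group_by_item`, and `mean_scores` are hypothetical names, and per-item means are an assumed choice of statistic. The internal layout of EvaluationResultGroupByDatasetItemsView is not described in this section and may differ.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass
class FakeTrial:
    # Hypothetical stand-in for one test result from one trial of an item.
    dataset_item_id: str
    trial_id: int
    scores: Dict[str, float]


def group_by_item(trials: List[FakeTrial]) -> Dict[str, List[FakeTrial]]:
    """Bucket trials by the dataset item they were run against."""
    grouped: Dict[str, List[FakeTrial]] = defaultdict(list)
    for trial in trials:
        grouped[trial.dataset_item_id].append(trial)
    return dict(grouped)


def mean_scores(trials: List[FakeTrial]) -> Dict[str, float]:
    """Average each score name across the trials of a single item."""
    values: Dict[str, List[float]] = defaultdict(list)
    for trial in trials:
        for name, value in trial.scores.items():
            values[name].append(value)
    return {name: mean(vals) for name, vals in values.items()}


# Two dataset items, two trials each (i.e. a trial count of 2).
trials = [
    FakeTrial("item-1", 0, {"accuracy": 1.0}),
    FakeTrial("item-1", 1, {"accuracy": 0.0}),
    FakeTrial("item-2", 0, {"accuracy": 1.0}),
    FakeTrial("item-2", 1, {"accuracy": 1.0}),
]
grouped = group_by_item(trials)
per_item_means = {item_id: mean_scores(item_trials) for item_id, item_trials in grouped.items()}
print(per_item_means)
```

Compared with experiment-wide aggregation, a per-item view of this kind makes it easier to spot items whose scores vary between trials.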