Trajectory Accuracy

TrajectoryAccuracy checks how closely a ReAct-style agent followed a sensible sequence of thoughts, actions, and observations to achieve the stated goal. It is useful for auditing complex workflow agents and reinforcement-learning traces.

Auditing an agent run
```python
from opik.evaluation.metrics import TrajectoryAccuracy

metric = TrajectoryAccuracy()

score = metric.score(
    goal="Book travel to Paris",
    trajectory=[
        {
            "thought": "Check available flights",
            "action": "search_flights(destination='Paris')",
            "observation": "Found flights for next week",
        },
        {
            "thought": "Summarise the best option",
            "action": "summarise(options)",
            "observation": "Shared top three flights",
        },
    ],
    final_result="Here are the best flights to Paris next week.",
)

print(score.value)   # Already normalised between 0.0 and 1.0
print(score.reason)  # Explanation of the verdict
```

Inputs

| Argument | Type | Required | Description |
| --- | --- | --- | --- |
| `goal` | `str` | Yes | The agent's objective or task description. |
| `trajectory` | `list[dict]` | Yes | Sequence of steps with `thought`, `action`, and `observation` keys. |
| `final_result` | `str` | Yes | Outcome that the agent reported after completing the trajectory. |
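Each trajectory step is a plain dict carrying those three keys. As a quick sanity check before scoring, the expected shape can be validated up front — a minimal sketch; the `validate_trajectory` helper below is hypothetical and not part of Opik:

```python
# Hypothetical helper: checks each step has the keys the metric expects.
REQUIRED_KEYS = {"thought", "action", "observation"}

def validate_trajectory(trajectory):
    """Raise ValueError if any step is missing a required key."""
    for i, step in enumerate(trajectory):
        missing = REQUIRED_KEYS - step.keys()
        if missing:
            raise ValueError(f"Step {i} is missing keys: {sorted(missing)}")

validate_trajectory([
    {
        "thought": "Check available flights",
        "action": "search_flights(destination='Paris')",
        "observation": "Found flights for next week",
    },
])
```

Catching a malformed step locally is cheaper than a failed judge call against the model.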

Configuration

| Parameter | Default | Notes |
| --- | --- | --- |
| `model` | `gpt-5-nano` | Judge used to score the trajectory. |
| `temperature` | `None` | Forwarded to the underlying model when provided. |
| `track` | `True` | Disable to skip logging to Opik. |
| `project_name` | `None` | Override the tracking project name. |

The metric returns a value in the 0.0–1.0 range together with a detailed explanation highlighting missing steps, misaligned actions, or other issues.
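For illustration, the returned object behaves roughly like the sketch below — a deliberate simplification with only the `value` and `reason` fields used above; the real result class lives in Opik and may carry additional metadata:

```python
from dataclasses import dataclass

@dataclass
class TrajectoryScore:
    """Illustrative stand-in for the metric's result object (not the Opik class)."""
    value: float  # normalised to the 0.0-1.0 range
    reason: str   # judge's explanation of the verdict

    def __post_init__(self):
        # Clamp defensively so downstream dashboards always see a valid score.
        self.value = max(0.0, min(1.0, self.value))

score = TrajectoryScore(value=0.85, reason="All steps aligned with the goal.")
```

Because the score is already normalised, it can be aggregated or thresholded directly without further scaling.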