Trajectory Accuracy

TrajectoryAccuracy checks how closely a ReAct-style agent followed a sensible sequence of thoughts, actions, and observations to achieve the stated goal. It is useful for auditing complex workflow agents and reinforcement-learning traces.

Auditing an agent run
```python
from opik.evaluation.metrics import TrajectoryAccuracy

metric = TrajectoryAccuracy()

score = metric.score(
    goal="Book travel to Paris",
    trajectory=[
        {
            "thought": "Check available flights",
            "action": "search_flights(destination='Paris')",
            "observation": "Found flights for next week",
        },
        {
            "thought": "Summarise the best option",
            "action": "summarise(options)",
            "observation": "Shared top three flights",
        },
    ],
    final_result="Here are the best flights to Paris next week.",
)

print(score.value)   # Already normalised between 0.0 and 1.0
print(score.reason)  # Explanation of the verdict
```

Inputs

| Argument | Type | Required | Description |
| --- | --- | --- | --- |
| `goal` | `str` | Yes | The agent's objective or task description. |
| `trajectory` | `list[dict]` | Yes | Sequence of steps with `thought`, `action`, and `observation` keys. |
| `final_result` | `str` | Yes | Outcome that the agent reported after completing the trajectory. |
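A lightweight pre-flight check can catch malformed steps before they reach the LLM judge. The helper below is a hypothetical sketch and not part of the Opik API; it only enforces the three required keys listed in the table above.

```python
REQUIRED_KEYS = {"thought", "action", "observation"}


def validate_trajectory(trajectory: list[dict]) -> None:
    """Raise ValueError if any step is missing a required key.

    Hypothetical pre-flight check run before calling metric.score();
    the metric itself performs its own parsing.
    """
    for i, step in enumerate(trajectory):
        missing = REQUIRED_KEYS - step.keys()
        if missing:
            raise ValueError(f"step {i} missing keys: {sorted(missing)}")
```

Running this before `metric.score()` turns a vague judge verdict about incomplete steps into an immediate, debuggable error.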

Configuration

| Parameter | Default | Notes |
| --- | --- | --- |
| `model` | `gpt-5-nano` | Judge used to score the trajectory. |
| `temperature` | `None` | Forwarded to the underlying model when provided. |
| `track` | `True` | Disable to skip logging to Opik. When `False`, disables tracing for both the metric and the underlying LLM judge calls. |
| `project_name` | `None` | Override the tracking project name. |
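Assuming the constructor accepts the parameters above as keyword arguments (an assumption based on the table, not verified against the source), a customised setup might look like this; the project name is illustrative:

```python
from opik.evaluation.metrics import TrajectoryAccuracy

# Hypothetical configuration using the parameters from the table above.
metric = TrajectoryAccuracy(
    model="gpt-5-nano",        # judge model
    temperature=0.0,           # forwarded to the judge for determinism
    track=False,               # skip logging to Opik entirely
    project_name="travel-agent-eval",  # ignored when track=False
)
```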

The metric returns a value in the 0.0 to 1.0 range together with a detailed explanation highlighting missing steps, misaligned actions, or other issues.
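In automated evaluation suites, that normalised value is often gated against a pass threshold. A minimal sketch, assuming the caller already has `score.value` in hand; the 0.7 cut-off is illustrative, not a library default:

```python
def trajectory_passes(value: float, threshold: float = 0.7) -> bool:
    """Return True when a normalised trajectory score clears the threshold.

    Hypothetical CI gate; `value` is expected to be the metric's
    already-normalised score in [0.0, 1.0].
    """
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"score expected in [0.0, 1.0], got {value}")
    return value >= threshold
```

Pairing the boolean gate with `score.reason` in the failure message makes a regression immediately actionable.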