
Trajectory accuracy

Trajectory Accuracy

TrajectoryAccuracy checks how closely a ReAct-style agent followed a sensible sequence of thoughts, actions, and observations to achieve the stated goal. It is useful for auditing complex workflow agents and reinforcement-learning traces.

Auditing an agent run
from opik.evaluation.metrics import TrajectoryAccuracy

metric = TrajectoryAccuracy()

score = metric.score(
    goal="Book travel to Paris",
    trajectory=[
        {
            "thought": "Check available flights",
            "action": "search_flights(destination='Paris')",
            "observation": "Found flights for next week",
        },
        {
            "thought": "Summarise the best option",
            "action": "summarise(options)",
            "observation": "Shared top three flights",
        },
    ],
    final_result="Here are the best flights to Paris next week.",
)

print(score.value)   # Already normalised between 0.0 and 1.0
print(score.reason)  # Explanation of the verdict

Inputs

| Argument | Type | Required | Description |
| --- | --- | --- | --- |
| goal | str | Yes | The agent’s objective or task description. |
| trajectory | list[dict] | Yes | Sequence of steps with thought, action, and observation keys. |
| final_result | str | Yes | Outcome that the agent reported after completing the trajectory. |
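
Because each trajectory step is expected to carry thought, action, and observation keys, it can help to check the shape of a trajectory before sending it to the judge. The following is a minimal sketch, not part of Opik: `find_trajectory_problems` and `REQUIRED_KEYS` are hypothetical names introduced here, and whether the metric itself rejects incomplete steps is not stated on this page.

```python
# Hypothetical pre-flight check; these names are illustrative and not part of the Opik SDK.
REQUIRED_KEYS = ("thought", "action", "observation")


def find_trajectory_problems(trajectory):
    """Return a list of readable problems; an empty list means the shape looks right."""
    if not isinstance(trajectory, list):
        return ["trajectory must be a list of dicts"]
    problems = []
    for index, step in enumerate(trajectory):
        if not isinstance(step, dict):
            problems.append(f"step {index} is not a dict")
            continue
        # Collect every key the step lacks, in the documented order.
        missing = [key for key in REQUIRED_KEYS if key not in step]
        if missing:
            problems.append(f"step {index} is missing: {', '.join(missing)}")
    return problems


steps = [
    {
        "thought": "Check available flights",
        "action": "search_flights(destination='Paris')",
        "observation": "Found flights for next week",
    },
    # This step deliberately omits "observation".
    {"thought": "Summarise the best option", "action": "summarise(options)"},
]

print(find_trajectory_problems(steps))  # ['step 1 is missing: observation']
```

Running a check like this first avoids spending a judge call on a trajectory that is malformed for reasons unrelated to the agent's behaviour.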

Configuration

| Parameter | Default | Notes |
| --- | --- | --- |
| model | gpt-5-nano | Judge used to score the trajectory. |
| temperature | None | Forwarded to the underlying model when provided. |
| track | True | Set to False to skip logging to Opik. This disables tracing for both the metric and the underlying LLM judge calls. |
| project_name | None | Override the tracking project name. |
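
As an illustration of how these settings might be combined, the sketch below assumes the parameters in the table are passed as keyword arguments to the `TrajectoryAccuracy` constructor; the page does not show this explicitly, so treat it as an assumption. `build_metric_kwargs` and `make_metric` are hypothetical helpers, not Opik functions.

```python
def build_metric_kwargs(model=None, temperature=None, track=True, project_name=None):
    """Collect constructor keyword arguments, omitting any left at None
    so that the library's own defaults apply."""
    candidates = {
        "model": model,
        "temperature": temperature,
        "track": track,
        "project_name": project_name,
    }
    return {name: value for name, value in candidates.items() if value is not None}


def make_metric(**overrides):
    """Build the metric. Assumes the constructor accepts the table's parameters as keywords."""
    # Imported lazily so this sketch can be loaded without opik installed.
    from opik.evaluation.metrics import TrajectoryAccuracy

    return TrajectoryAccuracy(**build_metric_kwargs(**overrides))


# Example: settings for a local debugging run that should not be logged to Opik.
local_debug_kwargs = build_metric_kwargs(track=False)
print(local_debug_kwargs)  # {'track': False}
```

Leaving `model` and `temperature` unset keeps the documented defaults, so only the values you deliberately override are forwarded.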

The metric returns a value in the 0.0–1.0 range together with a detailed explanation highlighting missing steps, misaligned actions, or other issues.
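
Since the value is normalised to the 0.0–1.0 range, one simple way to act on it is to gate on a cut-off and surface the explanation for anything below it. This is a rough sketch: `triage` is a hypothetical helper, and the 0.7 threshold is an arbitrary illustrative choice rather than a value recommended by Opik.

```python
def triage(value, reason, threshold=0.7):
    """Label a trajectory score as 'pass' or 'review' against an illustrative threshold."""
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"expected a score between 0.0 and 1.0, got {value}")
    status = "pass" if value >= threshold else "review"
    return {"status": status, "value": value, "reason": reason}


# The value and reason would normally come from score.value and score.reason.
result = triage(0.4, "The agent never confirmed the booking.")
print(result["status"])  # review
```

Keeping the reason alongside the status means a reviewer sees why a run was flagged without re-running the judge.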