Observability Overview

If you want to jump straight to code, head to the Getting started guide to add tracing in under five minutes.

LLM applications are more than a single API call. A typical agent involves retrieval steps, tool calls, prompt assembly, multiple LLM invocations, and post-processing — all wired together in ways that are invisible at runtime. When something goes wrong, you need to see exactly what happened at every step.

Opik gives you full visibility into every request your agent handles. Every LLM call, every tool invocation, every retrieval step is captured as a trace you can inspect, search, and analyze.

Why use Opik for observability

Debugging LLM applications without observability means guessing. You see the final output but not why the model hallucinated, which retrieval step returned irrelevant context, or where latency spiked.

With Opik, you can:

  • See the full execution path of every request — from user input through tool calls and LLM completions to the final response
  • Root-cause production issues fast — filter and search traces by status, latency, cost, or custom tags to find the problem in seconds
  • Track costs and latency over time — monitor token usage and spending across models and providers
  • Capture multi-turn conversations — group related traces into threads to understand how interactions evolve across turns
  • Close the feedback loop — attach human or automated scores to traces and use them to drive evaluations
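The idea of grouping related traces into threads can be sketched in plain Python. This is only an illustration of the concept, not Opik's implementation: the record fields (`thread_id`, `turn`) and the helper `group_into_threads` are assumptions made for this example, and the records themselves are made up.

```python
from collections import defaultdict

# Made-up trace records; the field names are assumptions, not Opik's schema.
traces = [
    {"id": "t1", "thread_id": "chat-a", "turn": 1, "input": "Hi"},
    {"id": "t2", "thread_id": "chat-b", "turn": 1, "input": "Reset my password"},
    {"id": "t3", "thread_id": "chat-a", "turn": 2, "input": "What is a span?"},
]


def group_into_threads(traces):
    """Group traces by thread_id, keeping each thread's turns in order."""
    threads = defaultdict(list)
    for trace in traces:
        threads[trace["thread_id"]].append(trace)
    for turns in threads.values():
        turns.sort(key=lambda t: t["turn"])
    return dict(threads)


threads = group_into_threads(traces)
for thread_id, turns in threads.items():
    print(thread_id, [t["id"] for t in turns])
```

Each key in `threads` then corresponds to one conversation, with its traces ordered by turn.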

What you can capture

  • Traces & spans: full execution trees with inputs, outputs, timing, and metadata for every step
  • Conversations: multi-turn threads that group related traces into coherent sessions
  • Cost tracking: token usage and spending broken down by model, provider, and trace
  • Media & attachments: images, audio, video, and files logged alongside your traces
  • User feedback: qualitative and quantitative scores attached to individual traces
  • Agent graphs: visual execution graphs showing how your agent’s steps connect
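As a rough illustration of what cost tracking involves, the sketch below estimates the cost of one LLM call from its token usage. It is not how Opik computes costs: the `Usage` class, the `estimate_cost` helper, the model name, and the per-token prices are all hypothetical placeholders chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    prompt_tokens: int
    completion_tokens: int


# Hypothetical prices in USD per 1,000 tokens, for illustration only.
PRICES_PER_1K = {
    "example-model": {"prompt": 0.50, "completion": 1.50},
}


def estimate_cost(model: str, usage: Usage) -> float:
    """Estimate cost as tokens times the per-1,000-token price, per direction."""
    price = PRICES_PER_1K[model]
    prompt_cost = usage.prompt_tokens / 1000 * price["prompt"]
    completion_cost = usage.completion_tokens / 1000 * price["completion"]
    return prompt_cost + completion_cost


cost = estimate_cost("example-model", Usage(prompt_tokens=2000, completion_tokens=1000))
print(f"${cost:.2f}")
```

Summing such per-call estimates by model, provider, or trace gives the kind of breakdown described above.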

How it works

1. Connect your project

Run opik connect from your agent’s directory to pair it with Opik:

$ opik connect --project <YOUR_PROJECT_NAME>
2. Instrument your code

The fastest way to add tracing is with opik-skills — install the skill and let your coding agent handle the rest:

$ npx skills add comet-ml/opik-skills

Then ask your coding agent:

Instrument my agent with Opik using the /instrument command.

This works with Claude Code, Cursor, Codex, OpenCode, and other coding agents. You can also instrument manually with the SDK:

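import opik

# @opik.track logs each call to the decorated function as a trace.
@opik.track
def my_agent(user_message):
    # retrieve_context and call_llm stand in for your own application code.
    context = retrieve_context(user_message)
    response = call_llm(user_message, context)
    return response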
3. View traces in the dashboard

Every request creates a trace with detailed span-level information. You can inspect the full execution tree, see inputs and outputs at each step, and filter by duration, cost, status, or tags.
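The filtering described above happens in the dashboard, but the underlying logic can be sketched in a few lines of Python. The records below are made up, and the field names (`duration_s`, `cost_usd`, `status`, `tags`) and the `filter_traces` helper are assumptions for illustration rather than Opik's schema or API.

```python
# Made-up trace summaries; field names are assumptions, not Opik's schema.
traces = [
    {"id": "t1", "duration_s": 0.8, "cost_usd": 0.002, "status": "ok", "tags": ["prod"]},
    {"id": "t2", "duration_s": 6.1, "cost_usd": 0.031, "status": "error", "tags": ["prod", "rag"]},
    {"id": "t3", "duration_s": 4.7, "cost_usd": 0.012, "status": "ok", "tags": ["staging"]},
]


def filter_traces(traces, min_duration_s=0.0, status=None, tag=None):
    """Return the traces that satisfy every condition that was supplied."""
    matches = []
    for trace in traces:
        if trace["duration_s"] < min_duration_s:
            continue
        if status is not None and trace["status"] != status:
            continue
        if tag is not None and tag not in trace["tags"]:
            continue
        matches.append(trace)
    return matches


# Find slow requests that were tagged as production traffic.
slow_prod = filter_traces(traces, min_duration_s=2.0, tag="prod")
print([t["id"] for t in slow_prod])
```

Combining conditions this way narrows a large set of traces down to the few worth opening and inspecting step by step.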

4. Analyze and improve

Use traces to debug failures, identify slow steps, and track quality over time. Attach feedback scores, run evaluations against datasets, and use Ollie — Opik’s AI assistant — to help root-cause issues automatically.

Integrations

Opik has first-class support for 30+ frameworks across Python, TypeScript, and OpenTelemetry, so you can start capturing traces without changing how your application is built.

LangChain
LlamaIndex
Anthropic
AWS Bedrock
Google Gemini
CrewAI

View all integrations →

Next steps

  • Getting started — Add observability to your agent in minutes
  • Concepts — Understand traces, spans, threads, and feedback scores
  • Debugging agents with Ollie — Use AI-assisted root-cause analysis
  • Cost tracking — Monitor token usage and spending