🔌 Integrations and SDK

  • Added Cloudflare’s Workers AI integration (docs)
  • Google ADK integration: tracing is now automatically propagated to all sub-agents in agentic systems with the new track_adk_agent_recursive feature, eliminating the need to manually add tracing to each sub-agent.
  • Google ADK integration: session-level information is now retrieved from the ADK framework to enrich thread data.
  • New in the SDK! Real-time tracking for long-running spans/traces is now supported. When enabled (set os.environ["OPIK_LOG_START_TRACE_SPAN"] = "True" in your environment), you can see traces and spans update live in the UI—even for jobs that are still running. This makes debugging and monitoring long-running agents much more responsive and convenient.
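
A minimal sketch of turning on the real-time tracking flag described above. The environment variable name comes from the release note. Setting it before the SDK is imported is a cautious assumption, and `long_running_job` is a hypothetical function used only for illustration. The fallback decorator is there so the snippet still runs where the SDK is not installed.

```python
import os

# Flag name taken from the release note. Setting it before the SDK import
# is an assumption made to be safe, not a documented requirement.
os.environ["OPIK_LOG_START_TRACE_SPAN"] = "True"

try:
    from opik import track  # Opik's tracing decorator
except ImportError:
    # Hypothetical no-op stand-in so this sketch runs without the SDK.
    def track(func):
        return func


@track
def long_running_job(steps: int) -> int:
    """Hypothetical long-running task. With the flag enabled, its span
    should show up in the UI while the function is still executing."""
    total = 0
    for i in range(steps):
        total += i
    return total
```

Calling `long_running_job(...)` from your own code is then enough to produce a trace; no further changes to the function body should be needed.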

🧵 Threads improvements

  • Added Token Count and Cost Metrics in Thread table
  • Added Sorting on all Thread table columns
  • Added Navigation from Thread Detail to all related traces
  • Added support for “pretty mode” in OpenAI Agents threads

🧪 Experiments improvements

  • Added support for filtering by configuration metadata to experiments. It is now also possible to add a new column displaying the configuration in the experiments table.

🛠 Agent Optimizer improvements

  • New Public API for Agent Optimization
  • Added optimization run display link
  • Added optimization_context

🛡️ Security Fixes

  • Fixed: h11 accepted some malformed Chunked-Encoding bodies
  • Fixed: setuptools had a path traversal vulnerability in PackageIndex.download that could lead to Arbitrary File Write
  • Fixed: LiteLLM had an Improper Authorization Vulnerability

👉 See full commit log on GitHub

Releases: 1.7.32, 1.7.33, 1.7.34, 1.7.35, 1.7.36

💡 Product Enhancements

  • Ability to upload CSV datasets directly through the user interface
  • Added experiment cost tracking to the Experiments table
  • Added hints and helpers for onboarding new users across the platform
  • Added “LLM calls count” to the traces table
  • Pretty formatting for complex agentic threads
  • Preview support for MP3 files in the frontend

🛠 SDKs and API Enhancements

  • Good news for JS developers! We’ve released experiments support for the JS SDK (official docs coming very soon)
  • New Experiments Bulk API: a new API has been introduced for logging Experiments in bulk.
  • Rate Limiting improvements both in the API and the SDK

🔌 Integrations

  • Support for OpenAI o3-mini and Groq models added to the Playground
  • OpenAI Agents: context awareness implemented, robustness improved, and thread handling improved
  • Google ADK: added support for multi-agent integration
  • LiteLLM: token and cost tracking added for SDK calls. Integration now compatible with opik.configure(…)

👉 See full commit log on GitHub

Releases: 1.7.27, 1.7.28, 1.7.29, 1.7.30, 1.7.31

✨ New Features

  • Opik Agent Optimizer: A comprehensive toolkit designed to enhance the performance and efficiency of your Large Language Model (LLM) applications. Read more

  • Opik Guardrails: Guardrails help you protect your application from risks inherent in LLMs. Use them to check the inputs and outputs of your LLM calls, and detect issues like off-topic answers or leaking sensitive information. Read more

💡 Product Enhancements

  • New Prompt Selector in Playground — Choose existing prompts from your Prompt Library to streamline your testing workflows.
  • Improved “Pretty Format” for Agents — Enhanced readability for complex threads in the UI.

🔌 Integrations

  • Vertex AI (Gemini) — Offline and online evaluation support integrated directly into Opik. Also available now in the Playground.
  • OpenAI Integration in the JS/TS SDK
  • AWS Strands Agents
  • Agno Framework
  • Google ADK Multi-agent support

🛠 SDKs and API Enhancements

  • OpenAI LLM advanced configurations — Support for custom headers and base URLs.
  • Span Timing Precision — Time resolution improved to microseconds for accurate monitoring.
  • Better Error Messaging — More descriptive errors for SDK validation and runtime failures.
  • Stream-based Tracing and Enhanced Streaming support

👉 See full commit log on GitHub

Releases: 1.7.19, 1.7.20, 1.7.21, 1.7.22, 1.7.23, 1.7.24, 1.7.25, 1.7.26

Opik Dashboard:

Python and JS / TS SDK:

  • Added support for streaming in ADK integration
  • Added cost tracking for the ADK integration
  • Added support for OpenAI responses.parse
  • Reduce the memory and CPU overhead of the Python SDK through various performance optimizations

Deployments:

  • Updated port mapping when using opik.sh
  • Fixed persistence when using Docker compose deployments

Releases: 1.7.15, 1.7.16, 1.7.17, 1.7.18

Opik Dashboard:

  • Updated the experiment page charts to better handle nulls; all metric values are now displayed.
  • Added lazy loading for the trace and span sidebar to better handle very large traces.
  • Added support for trace and span attachments. You can now log PDF, video, and audio files to your traces.
  • Improved performance of some Experiment endpoints

Python and JS / TS SDK:

  • Updated DSPy integration following latest DSPy release
  • New Autogen integration based on Opik’s OpenTelemetry endpoints
  • Added compression to request payload
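
To illustrate what request payload compression buys, here is a small sketch that serializes a payload to JSON and compresses it. The release note does not say which algorithm the SDK uses, so gzip is an assumption here, and `compress_payload`, `decompress_payload`, and the sample payload are hypothetical names made up for the example.

```python
import gzip
import json


def compress_payload(payload: dict) -> bytes:
    """Serialize a payload to JSON and gzip it (gzip is an assumption)."""
    raw = json.dumps(payload).encode("utf-8")
    return gzip.compress(raw)


def decompress_payload(body: bytes) -> dict:
    """Inverse of compress_payload, as a receiving server might apply it."""
    return json.loads(gzip.decompress(body).decode("utf-8"))


# Hypothetical trace payload with repetitive content, which compresses well.
payload = {
    "spans": [{"name": "llm_call", "input": "hello " * 200} for _ in range(20)]
}
raw_size = len(json.dumps(payload).encode("utf-8"))
compressed = compress_payload(payload)
print(f"raw={raw_size} bytes, compressed={len(compressed)} bytes")
```

Trace payloads tend to contain repeated keys and long text fields, which is why compression is usually worthwhile for this kind of traffic.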

Releases: 1.7.12, 1.7.13, 1.7.14

Opik Dashboard:

  • Released Python code metrics for online evaluations for both Opik Cloud and self-hosted deployments. This allows you to define Python functions to evaluate your traces in production.

Python and JS / TS SDK:

  • Fixed LLM as a judge metrics so they return an error rather than a score of 0.5 if the LLM returns a score that wasn’t in the range 0 to 1.
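
The behavior change above can be sketched as a range check that raises instead of substituting a default. This is an illustration of the idea only, not the SDK's actual code; `parse_judge_score` is a hypothetical helper name.

```python
def parse_judge_score(raw_score: float) -> float:
    """Return the score if it lies in [0, 1]; otherwise raise an error
    rather than silently falling back to a neutral value such as 0.5."""
    score = float(raw_score)
    if not 0.0 <= score <= 1.0:
        raise ValueError(
            f"LLM judge returned score {score}, expected a value in [0, 1]"
        )
    return score
```

Raising here makes a misbehaving judge model visible in the results, whereas a default of 0.5 would blend into legitimate scores.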

Deployments:

  • Updated Dockerfiles to ensure all containers run as non-root users.

Release: 1.7.11

Opik Dashboard:

  • Updated the feedback scores UI in the experiment page to make it easier to annotate experiment results.
  • Fixed an issue with base64 encoded images in the experiment sidebar.
  • Improved the loading speeds of the traces table and traces sidebar for traces that have very large payloads (25MB+).

Python and JS / TS SDK:

  • Improved the robustness of LLM as a Judge metrics with better parsing.
  • Fixed usage tracking for Anthropic models hosted on VertexAI.
  • When using LiteLLM, we now fall back to the LiteLLM cost if no model provider or model is specified.
  • Added support for thread_id in the LangGraph integration.

Releases: 1.7.4, 1.7.5, 1.7.6, 1.7.7, 1.7.8

Opik Dashboard:

  • Added search to codeblocks in the input and output fields.
  • Added sorting on feedback scores in the traces and spans tables.
  • Added sorting on feedback scores in the experiments table.

Python and JS / TS SDK:

  • Released a new integration with Google ADK framework.
  • Cleaned up usage information by removing it from the metadata field if it’s already part of the Usage field.
  • Added support for the ROUGE metric. Thanks @rohithmsr!
  • Updated the LangChain callback OpikTracer() to log the data in a structured way rather than as raw text. This is especially useful when using LangGraph.
  • Updated the LangChainJS integration with additional examples and small fixes.
  • Updated the OpenAI integration to support the Responses API.
  • Introduced a new AggregatedMetric metric that can be used to compute aggregations of metrics in experiments.
  • Added logging for LlamaIndex streaming methods.
  • Added a new text property on the Opik.Prompt object.

Releases: 1.6.14, 1.7.0, 1.7.1, 1.7.2

Opik Dashboard:

  • Render markdown in experiment output sidebar
  • The preference between pretty, JSON, and YAML views is now saved
  • We now hide image base64 strings in the traces sidebar to make it easier to read

Python and JS / TS SDK:

General

  • Introduced a new opik.sh installation script

Opik Dashboard:

  • You can now view the number of spans for each trace in the traces table
  • Added the option to search spans from the traces sidebar
  • Improved performance of the traces table

Python and JS / TS SDK:

  • Fixed an issue related to log_probs in the GEval metric
  • Unknown fields are no longer excluded when using the OpenTelemetry integration