🛠 Agent Optimizer 1.0 released!
The Opik Agent Optimizer now supports full agentic systems and not just single prompts.
With support for LangGraph, Google ADK, PydanticAI, and more, this release brings a simplified API, model customization for evaluation, and standardized interfaces to streamline optimization workflows. Learn more in the docs.
🧵 Thread-level improvements
Added Thread-Level Feedback, Tags & Comments: You can now add expert feedback scores directly at the thread level, enabling SMEs to review full agent conversations, flag risks, and collaborate with dev teams more effectively. Added support for thread-level tags and comments to streamline workflows and improve context sharing.

🖥️ UX improvements
- We’ve redesigned the Opik Home Page to deliver a cleaner, more intuitive first-use experience, with a focused value proposition, direct access to key metrics, and a polished look. The demo data has also been upgraded to showcase Opik’s capabilities more effectively for new users. Additionally, we’ve added inter-project comparison capabilities for metrics and cost control, allowing you to benchmark and monitor performance and expenses across multiple projects.


- Improved Error Visualization: Enhanced how span-level errors are surfaced across the project. Errors now bubble up to the project view, with quick-access shortcuts to detailed error logs and variation stats for better debugging and error tracking.
- Improved Sidebar Hotkeys: Updated sidebar hotkeys for more efficient keyboard navigation between items and detail views.
🔌 SDK, integrations and docs
- Added LangChain support in metric classes, allowing use of LangChain as a model proxy alongside LiteLLM for flexible LLM judge customization.
- Added support for the Gemini 2.5 model family.
- Updated pretty mode to support Dify and LangGraph + OpenAI responses.
- Added the OpenAI agents integration cookbook (link).
- Added a cookbook on how to import Hugging Face datasets to Opik.
👉 See full commit log on GitHub
Releases: 1.7.37, 1.7.38, 1.7.39, 1.7.40, 1.7.41, 1.7.42
🔌 Integrations and SDK
- Added Cloudflare’s Workers AI integration (docs)
- Google ADK integration: tracing is now automatically propagated to all sub-agents in agentic systems with the new track_adk_agent_recursive feature, eliminating the need to manually add tracing to each sub-agent.
- Google ADK integration: we now retrieve session-level information from the ADK framework to enrich the threads data.
- New in the SDK! Real-time tracking for long-running spans/traces is now supported. When enabled (set os.environ["OPIK_LOG_START_TRACE_SPAN"] = "True" in your environment), you can see traces and spans update live in the UI, even for jobs that are still running. This makes debugging and monitoring long-running agents much more responsive and convenient.
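
A minimal sketch of enabling the real-time tracking flag from Python. The environment variable name comes from the note above. Setting it before the SDK import is a conservative ordering assumed here, not a documented requirement. The long_running_job function is hypothetical, and the fallback decorator only exists so the sketch runs where the opik package is not installed.

```python
import os

# Set the flag first (assumed-safe ordering: before the SDK is imported).
os.environ["OPIK_LOG_START_TRACE_SPAN"] = "True"

try:
    import opik

    track = opik.track
except ImportError:
    # Fallback so this sketch still runs without the Opik SDK installed.
    def track(func):
        return func


@track
def long_running_job(steps: int) -> int:
    """Hypothetical long-running task whose trace should show up while it runs."""
    total = 0
    for i in range(steps):
        total += i
    return total
```

With the SDK installed and configured, calling long_running_job(...) should produce a trace that appears in the UI before the call returns.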
🧵 Threads improvements
- Added Token Count and Cost Metrics in Thread table
- Added Sorting on all Thread table columns
- Added Navigation from Thread Detail to all related traces
- Added support for “pretty mode” in OpenAI Agents threads
🧪 Experiments improvements
- Added support for filtering by configuration metadata to experiments. It is now also possible to add a new column displaying the configuration in the experiments table.
🛠 Agent Optimizer improvements
- New Public API for Agent Optimization
- Added optimization run display link
- Added optimization_context
🛡️ Security Fixes
- Fixed: h11 accepted some malformed Chunked-Encoding bodies
- Fixed: setuptools had a path traversal vulnerability in PackageIndex.download that could lead to Arbitrary File Write
- Fixed: LiteLLM had an Improper Authorization Vulnerability
👉 See full commit log on GitHub
Releases: 1.7.32, 1.7.33, 1.7.34, 1.7.35, 1.7.36
💡 Product Enhancements
- Ability to upload CSV datasets directly through the user interface
- Added experiment cost tracking to the Experiments table
- Added hints and helpers for onboarding new users across the platform
- Added “LLM calls count” to the traces table
- Pretty formatting for complex agentic threads
- Preview support for MP3 files in the frontend
🛠 SDKs and API Enhancements
- Good news for JS developers! We’ve released experiments support for the JS SDK (official docs coming very soon)
- New Experiments Bulk API: a new API has been introduced for logging Experiments in bulk.
- Rate Limiting improvements both in the API and the SDK
🔌 Integrations
- Support for OpenAI o3-mini and Groq models added to the Playground
- OpenAI Agents: implemented context awareness and improved robustness and thread handling
- Google ADK: added support for multi-agent integration
- LiteLLM: token and cost tracking added for SDK calls. Integration now compatible with opik.configure(…)
👉 See full commit log on GitHub
Releases: 1.7.27, 1.7.28, 1.7.29, 1.7.30, 1.7.31
✨ New Features
- Opik Agent Optimizer: A comprehensive toolkit designed to enhance the performance and efficiency of your Large Language Model (LLM) applications. Read more
- Opik Guardrails: Guardrails help you protect your application from risks inherent in LLMs. Use them to check the inputs and outputs of your LLM calls, and detect issues like off-topic answers or leaking sensitive information. Read more
💡 Product Enhancements
- New Prompt Selector in Playground — Choose existing prompts from your Prompt Library to streamline your testing workflows.
- Improved “Pretty Format” for Agents — Enhanced readability for complex threads in the UI.
🔌 Integrations
- Vertex AI (Gemini) — Offline and online evaluation support integrated directly into Opik. Also available now in the Playground.
- OpenAI Integration in the JS/TS SDK
- AWS Strands Agents
- Agno Framework
- Google ADK Multi-agent support
🛠 SDKs and API Enhancements
- OpenAI LLM advanced configurations — Support for custom headers and base URLs.
- Span Timing Precision — Time resolution improved to microseconds for accurate monitoring.
- Better Error Messaging — More descriptive errors for SDK validation and runtime failures.
- Stream-based Tracing and Enhanced Streaming support
👉 See full commit log on GitHub
Releases: 1.7.19, 1.7.20, 1.7.21, 1.7.22, 1.7.23, 1.7.24, 1.7.25, 1.7.26
Opik Dashboard:
Python and JS / TS SDK:
- Added support for streaming in ADK integration
- Added cost tracking for the ADK integration
- Added support for OpenAI responses.parse
- Reduced the memory and CPU overhead of the Python SDK through various performance optimizations
Deployments:
- Updated port mapping when using opik.sh
- Fixed persistence when using Docker compose deployments
Releases: 1.7.15, 1.7.16, 1.7.17, 1.7.18
Opik Dashboard:
- Updated the experiment page charts to better handle nulls; all metric values are now displayed.
- Added lazy loading for traces and span sidebar to better handle very large traces.
- Added support for trace and span attachments; you can now log PDF, video and audio files to your traces.

- Improved performance of some Experiment endpoints
Python and JS / TS SDK:
- Updated DSPy integration following latest DSPy release
- New Autogen integration based on Opik’s OpenTelemetry endpoints
- Added compression to request payload
Releases: 1.7.12, 1.7.13, 1.7.14
Opik Dashboard:
- Released Python code metrics for online evaluations for both Opik Cloud and self-hosted deployments. This allows you to define Python functions to evaluate your traces in production.
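
As a rough illustration of the kind of deterministic check a Python code metric could run against a trace output, here is a minimal sketch. The exact interface Opik expects is not described here, so ScoreResult below is a hypothetical stand-in dataclass, and response_length_metric and its max_chars parameter are made-up names for illustration only.

```python
from dataclasses import dataclass


@dataclass
class ScoreResult:
    """Hypothetical stand-in for a metric result; not Opik's actual class."""

    name: str
    value: float
    reason: str = ""


def response_length_metric(output: str, max_chars: int = 500) -> ScoreResult:
    """Score 1.0 when the output stays within max_chars, else 0.0."""
    within_limit = len(output) <= max_chars
    return ScoreResult(
        name="response_length",
        value=1.0 if within_limit else 0.0,
        reason=f"{len(output)} characters (limit {max_chars})",
    )
```

A check like this is cheap enough to run on every production trace because it needs no LLM call.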

Python and JS / TS SDK:
- Fixed LLM as a judge metrics so they return an error rather than a score of 0.5 if the LLM returns a score that wasn’t in the range 0 to 1.
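
The behavior change above can be sketched as a range check that raises instead of substituting a default. This is an illustration of the idea only, not the SDK's implementation; parse_judge_score is a hypothetical helper name.

```python
def parse_judge_score(raw_score: float) -> float:
    """Return the judge score if it lies in [0, 1]; raise otherwise.

    Raising surfaces a malformed judge response instead of hiding it
    behind a neutral-looking default such as 0.5.
    """
    score = float(raw_score)
    if not 0.0 <= score <= 1.0:
        raise ValueError(
            f"LLM judge returned score {score}; expected a value between 0 and 1"
        )
    return score
```

Failing loudly keeps out-of-range judge outputs from silently skewing experiment averages toward the midpoint.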
Deployments:
- Updated Dockerfiles to ensure all containers run as non root users.
Release: 1.7.11
Opik Dashboard:
- Updated the feedback scores UI in the experiment page to make it easier to annotate experiment results.
- Fixed an issue with base64 encoded images in the experiment sidebar.
- Improved the loading speeds of the traces table and traces sidebar for traces that have very large payloads (25MB+).
Python and JS / TS SDK:
- Improved the robustness of LLM as a Judge metrics with better parsing.
- Fixed usage tracking for Anthropic models hosted on Vertex AI.
- When using LiteLLM, we fall back to using the LiteLLM cost if no model provider or model is specified.
- Added support for thread_id in the LangGraph integration.
Releases: 1.7.4, 1.7.5, 1.7.6, 1.7.7 and 1.7.8.
Opik Dashboard:
- Added search to code blocks in the input and output fields.
- Added sorting on feedback scores in the traces and spans tables.
- Added sorting on feedback scores in the experiments table.
Python and JS / TS SDK:
- Released a new integration with the Google ADK framework.
- Cleaned up usage information by removing it from the metadata field if it’s already part of the Usage field.
- Added support for the Rouge metric - Thanks @rohithmsr!
- Updated the LangChain callback OpikTracer() to log the data in a structured way rather than as raw text. This is especially useful when using LangGraph.
- Updated the LangChainJS integration with additional examples and small fixes.
- Updated the OpenAI integration to support the Responses API.
- Introduced a new AggregatedMetric metric that can be used to compute aggregations of metrics in experiments.
- Added logging for LlamaIndex streaming methods.
- Added a new text property on the Opik.Prompt object.
Releases: 1.6.14, 1.7.0, 1.7.1, 1.7.2
Opik Dashboard:
- Render markdown in experiment output sidebar
- The preference between the pretty, JSON and YAML views is now saved
- We now hide image base64 strings in the traces sidebar to make it easier to read
Python and JS / TS SDK:
- Released a new integration with Flowise AI
- LangChain JS integration
- Added support for jinja2 prompts