Here are the most relevant improvements we’ve made since the last release:

🚨 Alerts

We’ve launched Alerts — a powerful way to get automated webhook notifications from your Opik workspace whenever important events happen (errors, feedback scores, prompt changes, and more). Opik now sends an HTTP POST to your endpoint with rich, structured event data you can route anywhere.

Now, you can make Opik a seamless part of your end-to-end workflows! With the new Alerts you can:

  • Spot production errors in near real time
  • Track feedback scores to monitor model quality and user satisfaction
  • Audit prompt changes across your workspace
  • Funnel events into your existing workflows and CI/CD pipelines

And this is just v1.0! We’ll keep adding events, advanced filtering, thresholds, and more fine-grained control in future iterations, always based on community feedback.

Alerts configuration interface showing webhook setup and event types

Read the full docs here - Alerts Guide
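To give a feel for the receiving side of an alert, here is a minimal sketch of an endpoint that accepts the HTTP POST and decides where to route it, using only the Python standard library. The `event_type` field, its values, and the destination names are assumptions made for illustration, not the documented payload schema; check the Alerts Guide for the real structure.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def route_event(payload: dict) -> str:
    """Pick a destination for an alert.

    The "event_type" key and the substrings matched below are hypothetical;
    adapt them to the payload your workspace actually sends.
    """
    event_type = str(payload.get("event_type", "")).lower()
    if "error" in event_type:
        return "pager"
    if "feedback" in event_type:
        return "quality-dashboard"
    if "prompt" in event_type:
        return "audit-log"
    return "default"


class AlertHandler(BaseHTTPRequestHandler):
    """Accepts JSON POST requests and routes them with route_event()."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        try:
            payload = json.loads(body or b"{}")
        except json.JSONDecodeError:
            # Reject bodies that are not valid JSON.
            self.send_response(400)
            self.end_headers()
            return
        destination = route_event(payload if isinstance(payload, dict) else {})
        print(f"routing alert to {destination}")
        # Acknowledge receipt without a response body.
        self.send_response(204)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Start the receiver; blocks until interrupted."""
    HTTPServer(("0.0.0.0", port), AlertHandler).serve_forever()
```

Calling `serve()` starts the receiver locally; the webhook URL configured in Opik would then need to point at a publicly reachable address for this endpoint.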

🖼️ Expanded Multimodal Image Support

We’ve added better image support across the platform!

What’s new?

1. Image Support in LLM as a Judge online Evaluations - LLM as a Judge evaluations now support images alongside text, enabling you to evaluate vision models and multimodal applications. Upload images and get comprehensive feedback on both text and visual content.

2. Enhanced Playground Experience - The playground now supports image inputs, allowing you to test prompts with images before running full evaluations. Perfect for experimenting with vision models and multimodal prompts.

3. Improved Data Display - Base64 image previews in data tables, better image handling in trace views, and enhanced pretty formatting for multimodal content.

LLM Judge evaluation interface showing image support for multimodal evaluations

Links to official docs: Evaluating traces with images and Using images in the Playground
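For readers unfamiliar with base64-encoded images, the sketch below shows how raw image bytes become a base64 data URL, the kind of string a table preview could render. The helper name `to_data_url` is hypothetical, and exactly which string formats Opik detects as images is not stated here, so treat this as background rather than a description of Opik's behavior.

```python
import base64


def to_data_url(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a base64 data URL (hypothetical helper)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# The 8-byte PNG file signature stands in for a real image here.
png_signature = b"\x89PNG\r\n\x1a\n"
print(to_data_url(png_signature))  # data:image/png;base64,iVBORw0KGgo=
```

In practice the bytes would come from reading an image file in binary mode.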

Opik Optimizer Updates

1. Multi-Metric Optimization - Optimize multiple metrics simultaneously, with supporting frontend and backend changes. Read more

2. Hierarchical Reflective Optimizer - New optimizer with self-reflective capabilities. Read more about it here
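As a conceptual illustration of scoring against several metrics at once, the sketch below folds two toy metrics into one weighted score. This is a generic weighted-average approach and an assumption for illustration only; it is not the Opik Optimizer API and may not match how the optimizer combines metrics. All function names are hypothetical.

```python
from typing import Callable, Dict, Tuple

# A metric takes (output, expected) and returns a score in [0, 1].
Metric = Callable[[str, str], float]


def exact_match(output: str, expected: str) -> float:
    """1.0 when the output equals the expected answer, ignoring outer whitespace."""
    return 1.0 if output.strip() == expected.strip() else 0.0


def brevity(output: str, expected: str) -> float:
    """Penalize outputs that are longer than the expected answer."""
    return min(1.0, len(expected) / max(len(output), 1))


def composite_score(
    output: str, expected: str, metrics: Dict[str, Tuple[Metric, float]]
) -> float:
    """Weighted average of all metric scores; weights need not sum to 1."""
    total_weight = sum(weight for _, weight in metrics.values())
    weighted = sum(weight * metric(output, expected) for metric, weight in metrics.values())
    return weighted / total_weight


metrics = {
    "exact_match": (exact_match, 0.7),
    "brevity": (brevity, 0.3),
}
print(composite_score("Paris", "Paris", metrics))
print(composite_score("The capital is Paris", "Paris", metrics))
```

The weights express how much each metric matters; changing them shifts which candidate prompts an optimization loop would prefer.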

Enhanced Feedback & Annotation Experience

1. Improved Annotation Queue Export - Enhanced export functionality for annotation queues: export your annotated data seamlessly for further analysis.

2. Annotation Queue UX Enhancements

  • Hotkeys Navigation - Improved keyboard navigation throughout the interface for a faster annotation experience
  • Return to Annotation Queue Button - Easy navigation back to annotation queues
  • Resume Functionality - Continue annotation work where you left off
  • Queue Creation from Traces - Create annotation queues directly from trace tables

3. Inline Feedback Editing - Quickly edit user feedback directly in data tables with our new inline editing feature. Hover over feedback cells to reveal edit options, making annotation workflows faster and more intuitive.

Inline feedback editing interface showing hover-triggered edit options in data tables

Read more about our Annotation Queues

User Experience Enhancements

1. Dark Mode Refinements - Improved dark mode styling across UI components for better visual consistency and user experience.

Dark mode interface showing improved styling and visual consistency across UI components

2. Enhanced Prompt Readability - Better formatting and display of long prompts in the interface, making them easier to read and understand.

3. Improved Online Evaluation Page - Added search, filtering, and sorting capabilities to the online evaluation page for better data management.

4. Better Token and Cost Control

  • Thread Cost Display - Show cost information in thread sidebar headers
  • Sum Statistics - Display sum statistics for cost and token columns in the traces table

Total cost display showing cost information in thread sidebar headers and sum statistics

Total duration display showing duration statistics and timing information

5. Filter-Aware Metric Aggregation - Improved experiment item filtering in the experiment details tables for finer control over your data.

6. Pretty Mode Enhancements - Improved Pretty mode for Input/Output display, with better formatting and readability across the product.

TypeScript SDK Updates

  • Opik Configure Tool - New opik-ts configure tool with a guided developer experience and local flag support
  • Prompt Management - Comprehensive prompt management implementation
  • LangChain Integration - Aligned LangChain integration with Python architecture

Python SDK Improvements

  • Context Managers - New context managers for span and trace creation
  • Bedrock Integration - Enhanced Bedrock integration with invoke_model support
  • Trace Updates - New update_trace() method for easier trace modifications
  • Parallel Agent Support - Support for logging parallel agents in ADK integration
  • Feedback Scores - Enhanced feedback score handling with better category support

Integration Updates

1. OpenTelemetry Improvements

  • Thread ID Support - Added support for thread_id in OpenTelemetry endpoint
  • System Information in Telemetry - Enhanced telemetry with system information

2. Model Support Updates - Added support for Claude Haiku 4.5 and updated model pricing information across the platform.

And much more! 👉 See full commit log on GitHub

Releases: 1.8.63, 1.8.64, 1.8.65, 1.8.66, 1.8.67, 1.8.68, 1.8.69, 1.8.70, 1.8.71, 1.8.72, 1.8.73, 1.8.74, 1.8.75, 1.8.76, 1.8.77, 1.8.78, 1.8.79, 1.8.80, 1.8.81, 1.8.82, 1.8.83