Getting Started

Here are the most relevant improvements we’ve made since the last release:

🔌 Playground & Provider Enhancements

We’ve expanded the Playground with new provider support and enhanced functionality to make prompt experimentation more powerful.

What’s new:

  • Display Metric Results in Output - Playground output cells now display metric results directly, making it easier to evaluate prompt performance at a glance
  • Model Selector for OpikAI Features - Easily select which model powers the Prompt Generator and Prompt Improver features
  • Native AWS Bedrock Integration - Bedrock is now available as a native provider in the Playground, giving you direct access to Amazon’s models without additional configuration
  • Gemini 3 Flash Support - Added support for Gemini 3 Flash in both the Playground and online scoring, expanding your model options for fast, cost-effective evaluations

🧪 Online Evaluation & Scoring

We’ve made online evaluation more flexible and easier to manage across your projects.

What’s improved:

  • Multi-Project Evaluation Rules - Online evaluation rules can now be applied across multiple projects, reducing duplication and simplifying rule management
  • Clone Score Rules - Quickly duplicate existing online score rules to create variations without starting from scratch

🎨 UI & UX Enhancements

We’ve refined the user experience across the platform with improved responsiveness and dashboard polish.

What’s improved:

  • Mobile Responsiveness - Better support for mobile devices when logging traces
  • Dashboard Enhancements - Unified widget editor design, dashboard count in sidebar, and various UX improvements to the dashboard experience

📦 SDK Improvements

We’ve updated our SDKs with new capabilities and modernized dependencies.

What’s new:

  • Python 3.9 End of Life - Python 3.9 has reached end-of-life and is no longer supported. Please upgrade to Python 3.10 or later
  • Experiment Tags in evaluate() - You can now add tags to experiments directly when calling the evaluate() method
  • Vercel AI SDK v6 - Upgraded TypeScript SDK integration from Vercel AI SDK v5 to v6
  • Prompt Version Tags (TypeScript) - TypeScript SDK now supports prompt version tags for better prompt management
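
Because Python 3.9 is no longer supported, it can help to confirm which interpreter an environment runs before upgrading the SDK. The snippet below is a minimal sketch using only the standard library. The helper name is_supported_python and the constant MIN_PYTHON are illustrative and not part of the SDK.

```python
import sys

# Minimum version taken from the release note above (Python 3.10+).
MIN_PYTHON = (3, 10)


def is_supported_python(version=None, minimum=MIN_PYTHON):
    """Return True if `version` (default: the running interpreter) meets `minimum`.

    Hypothetical helper for illustration; it is not provided by the SDK.
    """
    if version is None:
        version = sys.version_info
    # Compare only (major, minor); the patch level does not matter here.
    return tuple(version[:2]) >= tuple(minimum)


if __name__ == "__main__":
    if is_supported_python():
        print("Python version is supported.")
    else:
        found = ".".join(str(part) for part in sys.version_info[:3])
        print(f"Python {found} detected; upgrade to 3.10 or later before updating the SDK.")
```

Running this in CI or in a deployment script before bumping the SDK version would surface an outdated interpreter early, rather than at install time.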

And much more! 👉 See full commit log on GitHub

Releases: 1.9.57, 1.9.58, 1.9.59, 1.9.60, 1.9.61, 1.9.62, 1.9.63, 1.9.64, 1.9.65, 1.9.66, 1.9.67, 1.9.68, 1.9.69, 1.9.70, 1.9.71, 1.9.72, 1.9.73, 1.9.74, 1.9.75, 1.9.76, 1.9.77, 1.9.78