Agent Optimization

Opik Agent Optimizer is a turnkey, open-source SDK for agent and prompt optimization. It automatically tunes prompts, tools, and agent workflows using the datasets, metrics, and traces you already log to Opik. Instead of hand-editing instructions and re-running evaluations, you pick an optimizer (MetaPrompt, Hierarchical Reflective, Evolutionary, GEPA, etc.) and let it iterate for you, either online or fully offline inside Docker or Kubernetes.

*Opik Agent Optimizer dashboard showing optimization progress*

Why teams choose Opik Agent Optimizer

  • Automatic prompt optimization – end-to-end workflow that installs in minutes and runs locally or in your stack.
  • Open-source and framework agnostic – no lock-in, use Opik’s first-party optimizers or community favorites like GEPA in the same SDK.
  • Agent-aware – optimize beyond system prompts, including MCP tool signatures, function-calling schemas, and full multi-agent systems.
  • Deep observability – every trial logs prompts, tool calls, traces, and metric reasons to Opik so you can explain and ship changes confidently.

Key capabilities

  • Optimizers
  • Tool & MCP support
  • Agent integrations
  • Dashboard analytics
  • Secure & offline

How it works

1. Prepare data & metrics

Use Opik datasets (CSV upload, API, or trace exports) plus deterministic metrics/ScoreResult functions. See Define datasets and Define metrics.
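As a concrete example, a deterministic metric can be as simple as an exact-match scorer. The dict below mirrors the name/value/reason shape of Opik's ScoreResult, but the exact interface shown here is an assumption for illustration; see Define metrics for the real signature.

```python
# Illustrative sketch only: a deterministic exact-match metric.
# The returned dict mimics the name/value/reason fields of Opik's
# ScoreResult; the real SDK interface may differ.
def exact_match(expected: str, output: str) -> dict:
    matched = expected.strip().lower() == output.strip().lower()
    return {
        "name": "exact_match",
        "value": 1.0 if matched else 0.0,
        "reason": "exact match" if matched
                  else f"expected {expected!r}, got {output!r}",
    }
```

Deterministic scorers like this make optimization runs reproducible, since the same prompt and dataset always yield the same score.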

2. Pick an optimizer

Choose the best algorithm for your task (see Optimization algorithms). All optimizers expose the same API, so you can swap them easily or chain runs.
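Because every optimizer exposes the same entry point, swapping or chaining them is a one-line change. The sketch below uses toy stand-in classes, not the real SDK classes, purely to illustrate the pattern:

```python
# Toy stand-ins mimicking a shared optimizer interface; the class names
# and the optimize_prompt method are illustrative only, not the real SDK.
class UpperCaseOptimizer:
    def optimize_prompt(self, prompt: str) -> str:
        return prompt.upper()  # pretend "optimization"

class SuffixOptimizer:
    def optimize_prompt(self, prompt: str) -> str:
        return prompt + " Think step by step."

prompt = "Answer the question."
# Swap either class in or out, or chain runs by feeding one optimizer's
# result into the next:
for optimizer in (UpperCaseOptimizer(), SuffixOptimizer()):
    prompt = optimizer.optimize_prompt(prompt)
```

With the real optimizers, the same loop structure lets you run a cheap optimizer first and hand its best prompt to a more expensive one.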

3. Inspect & ship

Results land in the Opik dashboard under Evaluation → Optimization runs, where you can compare prompts, failure modes, and dataset coverage before promoting the change.

Start fast

Install the SDK with pip install opik-optimizer, then run your first optimization against an existing Opik dataset.

Optimization Algorithms

The SDK implements both first-party and community open-source optimization algorithms. Each has its strengths and weaknesses; as a first step, we recommend trying either GEPA or the Hierarchical Reflective Optimizer:

| Algorithm | Description |
| --- | --- |
| MetaPrompt Optimization | Uses an LLM ("reasoning model") to critique and iteratively refine an initial instruction prompt. Good for general prompt wording, clarity, and structural improvements. Supports MCP tool-calling optimization. |
| Hierarchical Reflective Optimization | Uses hierarchical root cause analysis to systematically improve prompts by analyzing failures in batches, synthesizing findings, and addressing identified failure modes. Best for complex prompts requiring systematic refinement based on understanding why they fail. |
| Few-shot Bayesian Optimization | Specifically for chat models, this optimizer uses Bayesian optimization (Optuna) to find the optimal number and combination of few-shot examples (demonstrations) to accompany a system prompt. |
| Evolutionary Optimization | Employs genetic algorithms to evolve a population of prompts. Can discover novel prompt structures and supports multi-objective optimization (e.g., score vs. length). Can use LLMs for advanced mutation/crossover. |
| GEPA Optimization | Wraps the external GEPA package to optimize a single system prompt for single-turn tasks using a reflection model. Requires pip install gepa. |
| Parameter Optimization | Optimizes LLM call parameters (temperature, top_p, etc.) using Bayesian optimization. Uses Optuna for efficient parameter search with global and local search phases. Best for tuning model behavior without changing the prompt. |
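To give a feel for the global-then-local search that Parameter Optimization performs, here is a miniature self-contained version against a synthetic objective. The real optimizer uses Optuna and your actual eval metric; everything below (the objective, ranges, and sample counts) is a toy assumption.

```python
import random

random.seed(0)

def toy_objective(temperature: float) -> float:
    # Synthetic stand-in for an eval score; peaks at temperature = 0.7.
    return 1.0 - abs(temperature - 0.7)

# Global phase: coarse random search over the full parameter range.
candidates = [random.uniform(0.0, 1.5) for _ in range(100)]
best = max(candidates, key=toy_objective)

# Local phase: refine within a narrow window around the global best.
low, high = max(0.0, best - 0.1), min(1.5, best + 0.1)
local = [random.uniform(low, high) for _ in range(100)]
best = max(local + [best], key=toy_objective)
```

The two-phase structure is the key idea: a cheap broad sweep locates a promising region, then a focused search squeezes out the remaining gains.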

Want to see numbers? Check the new optimizer benchmarks page for the latest performance table and instructions for running the benchmark suite yourself.

Next Steps

  1. Explore specific Optimizers for algorithm details.
  2. Refer to the FAQ for common questions and troubleshooting.
  3. Refer to the API Reference for detailed configuration options.

🚀 Want to see Opik Agent Optimizer in action? Check out our Example Projects & Cookbooks for runnable Colab notebooks covering real-world optimization workflows, including HotPotQA and synthetic data generation.