Opik Agent Optimizer: Getting Started

A simple, few-step process to get your prompts optimized

This guide will help you get started with the Opik Agent Optimizer SDK for improving your LLM prompts through systematic optimization.

Prerequisites

  • Python 3.9 or higher
  • An Opik API key (sign up here if you don’t have one)

Getting Started with Optimizers

Here’s a step-by-step guide to get you up and running with Opik Agent Optimizer:


1. Install Opik and the optimizer package

Install the required packages using pip or uv (recommended for faster installation):

# Using pip
pip install opik opik-optimizer

# Using uv (recommended for faster installation)
uv pip install opik-optimizer

Then configure your Opik environment:

# Install the Opik CLI if not already installed
pip install opik

# Configure your API key
opik configure

2. Import necessary modules

First, we import all the required classes and functions from opik and opik_optimizer.

from opik.evaluation.metrics import LevenshteinRatio
from opik_optimizer import MetaPromptOptimizer, ChatPrompt
from opik_optimizer.datasets import tiny_test

3. Define your evaluation dataset

You can use a demo dataset for testing, or create/load your own. You can learn more about creating datasets in the Manage Datasets documentation.

# You can use a demo dataset for testing, or your own dataset
dataset = tiny_test()
print(f"Using dataset: {dataset.name}, with {len(dataset.get_items())} items.")
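If you later build your own dataset, the rest of this guide assumes each item carries a "text" input field (referenced by the prompt template) and a "label" reference answer (referenced by the metric). A minimal sketch of that shape, using made-up questions rather than the real tiny_test contents:

```python
# Hypothetical items in the shape this guide's prompt ("{text}") and
# metric (dataset_item["label"]) expect. Not the actual tiny_test data.
items = [
    {"text": "What is the capital of France?", "label": "Paris"},
    {"text": "Who wrote the novel 1984?", "label": "George Orwell"},
]

# Sanity-check that every item has the fields the later steps rely on.
for item in items:
    assert {"text", "label"} <= item.keys()
print(f"{len(items)} items ready.")
```

With the real SDK you would register such items through Opik's dataset APIs (see the Manage Datasets documentation); the snippet only illustrates the expected field names.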

4. Configure the evaluation metric

The metric tells the optimizer how to score the LLM’s outputs. Here, LevenshteinRatio measures the similarity between the model’s response and the “label” field in the dataset.

# This example uses Levenshtein distance to measure output quality
def levenshtein_ratio(dataset_item, llm_output):
    metric = LevenshteinRatio()
    return metric.score(reference=dataset_item["label"], output=llm_output)
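For intuition, a Levenshtein ratio is a normalized edit-distance similarity: 1.0 for identical strings, dropping as more single-character edits (insertions, deletions, substitutions) are needed. Here is a self-contained sketch of that idea, assuming the common normalization 1 − distance / max(length); Opik's LevenshteinRatio may normalize slightly differently:

```python
def levenshtein_distance(a: str, b: str) -> int:
    # Classic dynamic-programming edit distance, one row at a time.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,              # deletion
                curr[j - 1] + 1,          # insertion
                prev[j - 1] + (ca != cb), # substitution (free if equal)
            ))
        prev = curr
    return prev[-1]

def ratio(a: str, b: str) -> float:
    # Normalize the distance into a 0.0-1.0 similarity score.
    if not a and not b:
        return 1.0
    return 1 - levenshtein_distance(a, b) / max(len(a), len(b))

print(ratio("Paris", "Paris"))  # 1.0
print(ratio("Paris", "paris"))  # 0.8 (one substitution over five characters)
```

A perfect answer scores 1.0, so the optimizer's goal is to push the average score across the dataset toward 1.0.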

5. Define your base prompt

This is the initial instruction that the MetaPromptOptimizer will try to enhance:

prompt = ChatPrompt(
    project_name="Prompt Optimization Quickstart",
    messages=[
        {"role": "system", "content": "You are an expert assistant. Your task is to answer questions accurately and concisely. Consider the context carefully before responding."},
        {"role": "user", "content": "{text}"},
    ],
)
print("Prompt defined.")

6. Choose and configure an optimizer

Instantiate MetaPromptOptimizer, specifying the model to be used in the optimization process.

optimizer = MetaPromptOptimizer(
    model="gpt-4",
)
print(f"Optimizer configured: {type(optimizer).__name__}")

7. Run the optimization

The optimizer.optimize_prompt(...) method is called with the prompt, dataset, and metric to start the optimization process.

print("Starting optimization...")
result = optimizer.optimize_prompt(
    prompt=prompt,
    dataset=dataset,
    metric=levenshtein_ratio,
)
print("Optimization finished.")
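Conceptually, a meta-prompt optimizer proposes candidate prompts, scores each candidate against the dataset with your metric, and keeps the best one. The toy loop below shows only that outer evaluate-and-select idea, with a fake "LLM" and made-up candidates; it is not MetaPromptOptimizer's actual algorithm:

```python
# Toy evaluate-and-select loop; not the real MetaPromptOptimizer internals.
def average_score(prompt_template, dataset, metric, llm):
    # Score every dataset item with the metric and average the results.
    scores = [metric(item, llm(prompt_template.format(**item))) for item in dataset]
    return sum(scores) / len(scores)

def pick_best_prompt(candidates, dataset, metric, llm):
    # Keep whichever candidate prompt scores highest on the dataset.
    return max(candidates, key=lambda p: average_score(p, dataset, metric, llm))

# Fake components so the sketch runs on its own.
dataset = [{"text": "2 + 2 = ?", "label": "4"}]

def fake_llm(prompt):
    # Pretend the model answers tersely only when explicitly told to.
    return "4" if "Answer only with the result" in prompt else "The answer is 4."

def exact_match(item, output):
    return 1.0 if output == item["label"] else 0.0

candidates = [
    "Answer the question: {text}",
    "Answer only with the result. {text}",
]
best = pick_best_prompt(candidates, dataset, exact_match, fake_llm)
print(best)  # "Answer only with the result. {text}"
```

The real optimizer additionally uses an LLM to generate the candidate prompts themselves, which is why step 6 required specifying a model.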

8. View results in the CLI

After optimization completes, call result.display() to see a summary of the optimization, including the best prompt found and its score, directly in your terminal.

print("Optimization Results:")
result.display()

The OptimizationResult object exposes further detail through result.history and result.details. The console output shows the optimization progress and the final scores.

Opik agent optimization progress in CLI

9. View results in the Opik dashboard

In addition to the CLI output, your optimization results are also available in the Opik Agent Optimization dashboard for further analysis and visualization.

Opik agent optimization results in dashboard

Next Steps

  1. Explore different optimization algorithms to choose the best one for your use case
  2. Understand prompt engineering best practices
  3. Set up your own evaluation datasets
  4. Review the API reference for detailed configuration options

🚀 Want to see Opik Agent Optimizer in action? Check out our Example Projects & Cookbooks for runnable Colab notebooks covering real-world optimization workflows, including HotPotQA and synthetic data generation.