MIPRO Optimizer: Agent & Complex Optimization

Optimize complex agent behaviors and tool-using prompts.

The MiproOptimizer is a specialized prompt optimization tool built around MIPRO (Multiprompt Instruction Proposal Optimization), the prompt-optimization algorithm from the DSPy library. It is designed to handle complex optimization tasks, including multi-prompt programs and tool-using agents, through iterative proposal and evaluation of instructions and few-shot demonstrations.

This Optimizer is currently deprecated and will be removed in a future release. We recommend trying out the EvolutionaryOptimizer instead.

When to Use This Optimizer: MiproOptimizer is the preferred choice for complex tasks, especially those involving tool use or requiring multi-step reasoning. If you are building an agent that needs to interact with external tools or follow a sophisticated chain of thought, and you want to optimize the underlying prompts and agent structure, MiproOptimizer (which leverages DSPy) is highly suitable.

Key Trade-offs:

  • The optimization process can be more involved than with single-prompt optimizers due to its reliance on the DSPy framework and its compilation process.
  • Understanding basic DSPy concepts (like Modules, Signatures, Teleprompters) can be beneficial for advanced customization and debugging, though Opik abstracts much of this.
  • Debugging can sometimes be more complex as you are optimizing a program/agent structure, not just a single string.

Got questions about MiproOptimizer or its use with DSPy? The Optimizer & SDK FAQ covers topics such as when to use MiproOptimizer, its relationship with DSPy, the role of num_candidates, and using it without deep DSPy knowledge.

How It Works

The MiproOptimizer leverages the DSPy library, specifically an internal version of its MIPRO (Multiprompt Instruction Proposal Optimization) teleprompter, to optimize potentially complex prompt structures, including those for tool-using agents.

Here’s a simplified overview of the process:

  1. DSPy Program Representation: Your task, as defined by TaskConfig (including the instruction_prompt, input/output fields, and any tools), is translated into a DSPy program structure. If tools are provided, this often involves creating a dspy.ReAct or similar agent module.

  2. Candidate Generation (Compilation): The core of MIPRO involves compiling this DSPy program. This “compilation” is an optimization process itself:

    • It explores different ways to formulate the instructions within the DSPy program’s modules (e.g., the main instruction, tool usage instructions if applicable).
    • It also optimizes the selection of few-shot demonstrations to include within the prompts for these modules, if the DSPy program structure uses them.
    • This is guided by an internal DSPy teleprompter algorithm (like MIPROv2 in the codebase) which uses techniques like bootstrapping demonstrations and proposing instruction variants.
  3. Evaluation: Each candidate program (representing a specific configuration of instructions and/or demonstrations) is evaluated on your training dataset (dataset) using the specified metric. The MiproOptimizer uses DSPy’s evaluation mechanisms, which can handle complex interactions, especially for tool-using agents.

  4. Iterative Refinement: The teleprompter iteratively refines the program based on these evaluations, aiming to find a program configuration that maximizes the metric score. The num_candidates parameter in optimize_prompt influences how many of these configurations are explored.

  5. Result: The MiproOptimizer returns an OptimizationResult containing the best-performing DSPy program structure found. This might be a single optimized prompt string or, for more complex agents, a collection of optimized prompts that make up the agent’s internal logic (e.g., tool_prompts in the OptimizationResult).

Essentially, MiproOptimizer uses the MIPRO algorithm to optimize not just a single prompt string, but potentially a whole system of prompts and few-shot examples that defines how an LLM (or an LLM-based agent) should behave to accomplish a task.
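To make the DSPy program representation (Step 1) more concrete, here is a minimal, hypothetical sketch of the kind of module a tool-using task can map onto. This is not MiproOptimizer’s internal code; it assumes a recent DSPy release in which dspy.ReAct accepts plain Python functions as tools.

import dspy

def search(query: str) -> str:
    """Search for information on a given topic."""
    return "placeholder result"  # stand-in for a real search backend

# The instruction_prompt and the input/output dataset fields roughly correspond
# to a DSPy signature; supplying tools yields a ReAct-style agent module.
agent = dspy.ReAct("question -> answer", tools=[search])

# A MIPRO-style compilation then searches over instruction variants and few-shot
# demonstrations for this module, scoring each candidate with your metric.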

The evaluation of each candidate program (Step 3) is crucial. It uses your metric and dataset to score how well a particular set of agent instructions or prompts performs. Since MiproOptimizer often deals with agentic behavior, understanding Opik’s broader evaluation tools is beneficial:

  • Evaluation Overview
  • Evaluate Agents (particularly relevant)
  • Evaluate Prompts
  • Metrics Overview

Configuration Options

Basic Configuration

from opik_optimizer import MiproOptimizer

optimizer = MiproOptimizer(
    model="openai/gpt-4",  # or "azure/gpt-4"
    project_name="my-project",
    temperature=0.1,
    max_tokens=5000,
    num_threads=8,
    seed=42,
)

Advanced Configuration

The MiproOptimizer leverages the DSPy library for its optimization capabilities, specifically using an internal implementation similar to DSPy’s MIPRO teleprompter (referred to as MIPROv2 in the codebase).

The constructor for MiproOptimizer is simple (model, project_name, **model_kwargs). The complexity of the optimization is managed within the DSPy framework when optimize_prompt is called.

Key aspects passed to optimize_prompt that influence the DSPy optimization include:

  • task_config: This defines the overall task, including the initial instruction_prompt, input_dataset_fields, and output_dataset_field. If task_config.tools are provided, MiproOptimizer will attempt to build and optimize a DSPy agent that uses these tools.
  • metric: Defines how candidate DSPy programs (prompts/agents) are scored. The metric needs to be a function that accepts the parameters dataset_item and llm_output (a minimal example follows this list).
  • num_candidates: This parameter from optimize_prompt controls how many different program configurations are explored during optimization.
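For illustration, a metric matching that signature can be as simple as the sketch below; exact_match is a hypothetical example, not part of the SDK.

def exact_match(dataset_item, llm_output):
    # dataset_item is a row from your Opik dataset; llm_output is the text
    # produced by the candidate program. Return a numeric score (higher is better).
    return 1.0 if llm_output.strip() == dataset_item["answer"].strip() else 0.0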

Example Usage

from opik_optimizer import MiproOptimizer, TaskConfig
from opik.evaluation.metrics import LevenshteinRatio
from opik_optimizer import datasets

# Initialize optimizer
optimizer = MiproOptimizer(
    model="openai/gpt-4",
    project_name="mipro_optimization_project",
    temperature=0.1,
    max_tokens=5000,
)

# Prepare dataset
dataset = datasets.hotpot_300()

# Define metric and task configuration
def levenshtein_ratio(dataset_item, llm_output):
    return LevenshteinRatio().score(reference=dataset_item["answer"], output=llm_output)

# Define some tools
def calculator(expression):
    """Perform mathematical calculations"""
    return str(eval(expression))

def search(query):
    """Search for information on a given topic"""
    # placeholder for search functionality
    return "hello_world"

# Define task configuration with tools
task_config = TaskConfig(
    instruction_prompt="Complete the task using the provided tools.",
    input_dataset_fields=["question"],
    output_dataset_field="answer",
    use_chat_prompt=True,
    tools=[search, calculator],
)

# Run optimization
results = optimizer.optimize_prompt(
    dataset=dataset,
    metric=levenshtein_ratio,
    task_config=task_config,
    num_candidates=1,  # Number of different program configurations to explore
)

# Access results
results.display()

Model Support

The MiproOptimizer supports all models available through LiteLLM. This includes models from OpenAI, Azure OpenAI, Anthropic, Google (Vertex AI / AI Studio), Mistral AI, Cohere, locally hosted models (e.g., via Ollama), and many others.

For detailed instructions on how to specify different models and configure providers, please refer to the main LiteLLM Support for Optimizers documentation page.
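As a quick illustration, the model is selected with a LiteLLM-style identifier string. The identifiers below are examples only and assume the matching provider credentials (or a running local Ollama server) are configured.

from opik_optimizer import MiproOptimizer

optimizer = MiproOptimizer(model="openai/gpt-4o-mini", project_name="my-project")
# optimizer = MiproOptimizer(model="anthropic/claude-3-haiku-20240307", project_name="my-project")
# optimizer = MiproOptimizer(model="ollama/llama3", project_name="my-project")  # locally hosted model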

Best Practices

  1. Task Complexity Assessment

    • Use MiproOptimizer for tasks requiring multi-step reasoning or tool use
    • Consider simpler optimizers for single-prompt tasks
    • Evaluate if your task benefits from DSPy’s programmatic approach
  2. Tool Configuration

    • Provide clear, detailed tool descriptions
    • Include example usage in tool descriptions
    • Ensure tool parameters are well-defined (see the tool sketch after this list)
  3. Dataset Preparation

    • Include examples of tool usage in your dataset
    • Ensure examples cover various tool combinations
    • Include edge cases and error scenarios
  4. Optimization Strategy

    • Start with a reasonable num_candidates (5-10)
    • Monitor optimization progress
    • Adjust based on task complexity
  5. Evaluation Metrics

    • Choose metrics that reflect tool usage success
    • Consider composite metrics for complex tasks
    • Include task-specific evaluation criteria
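To illustrate the tool-configuration guidance above, here is a sketch of a hypothetical tool written so the agent can use it reliably: a descriptive docstring (which serves as the tool description), clear parameter names, type hints, and an example call in the docstring.

def currency_convert(amount: float, rate: float) -> str:
    """Convert an amount of money using the given exchange rate.

    Example: currency_convert(100.0, 0.92) returns "92.0".
    """
    return str(amount * rate)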

Research and References

Next Steps