MetaPrompt Optimizer

Refine and improve LLM prompts with systematic analysis.

The MetaPromptOptimizer is a specialized optimizer for meta-prompt optimization. It improves the structure and effectiveness of prompts through systematic analysis and refinement of prompt templates, instructions, and examples.

When to Use This Optimizer: MetaPromptOptimizer is a strong choice when you have an initial instruction prompt and want to iteratively refine its wording, structure, and clarity using LLM-driven suggestions. It excels at general-purpose prompt improvement where the core idea of your prompt is sound but could be phrased better for the LLM, or when you want to explore variations suggested by a reasoning model.

Key Trade-offs:

  • Relies on the quality of the reasoning_model for generating good candidates.
  • May be less suited than specialized optimizers if your primary goal is only few-shot example selection (see FewShotBayesianOptimizer) or only complex agent/tool-use optimization (see MiproOptimizer).
  • Optimization process involves multiple LLM calls for both reasoning and evaluation, which can impact cost and time.

Looking for common questions about this optimizer or others? Check out the Optimizer & SDK FAQ, which includes specific sections for MetaPromptOptimizer questions like when to use it, understanding model vs. reasoning_model, and how parameters like max_rounds affect optimization.

How It Works

The MetaPromptOptimizer automates the process of prompt refinement by using a “reasoning” LLM to critique and improve your initial prompt. Here’s a conceptual breakdown:

  1. Initial Prompt Evaluation: The messages in your starting prompt are first evaluated on the dataset using the specified metric to establish a baseline score.

  2. Reasoning and Candidate Generation:

    • The optimizer takes your current best prompt and detailed context about the task (derived from the prompt and metric).
    • It then queries a reasoning_model (which can be the same as the evaluation model or a different, potentially more powerful one). This reasoning_model is guided by a system prompt (the “meta-prompt”) that instructs it to act as an expert prompt engineer.
    • The reasoning_model analyzes the provided prompt and task context, then generates a set of new candidate prompts. Each candidate is designed to improve upon the previous best, with specific reasoning for the suggested changes (e.g., “added more specific constraints,” “rephrased for clarity”).
  3. Candidate Evaluation:

    • These newly generated candidate prompts are then evaluated on the dataset.
    • The optimizer uses an adaptive trial strategy: promising candidates (based on initial trials) may receive more evaluation trials up to max_trials_per_candidate.
  4. Selection and Iteration:

    • The best-performing candidate from the current round becomes the new “current best prompt.”
    • This process repeats for up to max_rounds iterations or until the improvement in score falls below improvement_threshold.
  5. Result: The highest-scoring prompt found throughout all rounds is returned as the optimized prompt.

This iterative loop of generation, evaluation, and selection allows the MetaPromptOptimizer to explore different phrasings and structures, guided by the reasoning capabilities of an LLM.
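To make the flow concrete, here is a simplified sketch of that loop. It is illustrative only: helper names such as evaluate and generate_candidates are hypothetical placeholders, not the actual implementation.

```python
# Illustrative sketch of the generate-evaluate-select loop described above.
# The helpers evaluate() and generate_candidates() are hypothetical placeholders.

def meta_prompt_loop(initial_prompt, dataset, metric, reasoning_model,
                     max_rounds, num_prompts_per_round, improvement_threshold):
    best_prompt = initial_prompt
    best_score = evaluate(best_prompt, dataset, metric)  # Step 1: baseline score

    for _ in range(max_rounds):  # Steps 2-4: iterate
        # Ask the reasoning model for improved candidates, each with its own rationale
        candidates = generate_candidates(reasoning_model, best_prompt,
                                         n=num_prompts_per_round)
        # Step 3: evaluate each candidate on the dataset with the metric
        scored = [(evaluate(c, dataset, metric), c) for c in candidates]
        round_best_score, round_best_prompt = max(scored, key=lambda pair: pair[0])

        # Step 4: keep the round winner only if it improves enough on the current best
        if round_best_score - best_score < improvement_threshold:
            break
        best_score, best_prompt = round_best_score, round_best_prompt

    return best_prompt, best_score  # Step 5: best prompt found across all rounds
```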

The evaluation of each candidate prompt (Step 3) uses the metric you provide and runs against the dataset. This process is fundamental to how the optimizer determines which prompts are better. For a deeper understanding of Opik’s evaluation capabilities, refer to:

  • Evaluation Overview
  • Evaluate Prompts
  • Metrics Overview
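As a quick illustration of the metric interface used during candidate evaluation, the sketch below defines a minimal custom metric. The (dataset_item, llm_output) signature matches the example later on this page; the "answer" key assumes a dataset shaped like the hotpot_300 example.

```python
# Minimal sketch of a custom metric: a callable that scores one model output
# against one dataset item and returns a float. The "answer" key assumes a
# dataset shaped like the hotpot_300 example used later on this page.
def exact_match(dataset_item, llm_output):
    return 1.0 if llm_output.strip() == dataset_item["answer"].strip() else 0.0
```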

Configuration Options

Basic Configuration

```python
from opik_optimizer import MetaPromptOptimizer

prompter = MetaPromptOptimizer(
    model="openai/gpt-4",  # or "azure/gpt-4"
    # LLM parameters like temperature, max_tokens are passed as keyword arguments
    temperature=0.1,
    max_tokens=5000,
    # Optimizer-specific parameters
    reasoning_model="openai/gpt-4-turbo",  # Optional: model for generating prompt suggestions
    max_rounds=3,
    num_prompts_per_round=4,
    n_threads=8,
    seed=42,  # Seed for LLM calls if supported by the model
)
```

Advanced Configuration

The MetaPromptOptimizer primarily uses the constructor parameters listed above for its configuration. Key parameters include:

  • model: The LLM used for evaluating generated prompts.
  • reasoning_model: (Optional) A separate, potentially more powerful LLM used by the optimizer to analyze and generate new prompt candidates. If not provided, model is used.
  • max_rounds: The number of optimization iterations.
  • num_prompts_per_round: How many new prompt candidates are generated in each round.
  • initial_trials_per_candidate / max_trials_per_candidate: Control the number of evaluations for each candidate.

Additional LLM call parameters (e.g., temperature, max_tokens, top_p) for both the evaluation model and the reasoning model can be passed as keyword arguments (**model_kwargs) to the constructor. These will be used when making calls to the respective LLMs.

The optimizer iteratively generates new prompt suggestions with its reasoning_model, guided by the performance of previous prompts, and then evaluates those suggestions.
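For illustration, an advanced configuration that pairs a cheaper evaluation model with a stronger reasoning model and tunes the adaptive trial settings might look like the sketch below. Parameter names follow the list above; check them against the version of opik_optimizer you have installed.

```python
from opik_optimizer import MetaPromptOptimizer

# Sketch of an advanced configuration. Parameter names follow the list above;
# verify them against your installed opik_optimizer version.
optimizer = MetaPromptOptimizer(
    model="openai/gpt-4o-mini",        # evaluates candidate prompts on the dataset
    reasoning_model="openai/gpt-4o",   # stronger model that proposes new candidates
    max_rounds=5,                      # number of optimization iterations
    num_prompts_per_round=4,           # candidates generated per round
    initial_trials_per_candidate=2,    # quick first pass over each candidate
    max_trials_per_candidate=6,        # promising candidates get extra trials
    temperature=0.2,                   # forwarded to LLM calls as a model kwarg
    max_tokens=4000,
)
```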

Example Usage

```python
from opik_optimizer import MetaPromptOptimizer
from opik.evaluation.metrics import LevenshteinRatio
from opik_optimizer import datasets, ChatPrompt

# Initialize optimizer
optimizer = MetaPromptOptimizer(
    model="openai/gpt-4",  # or "azure/gpt-4"
    temperature=0.1,
    max_tokens=5000,
    n_threads=8,
    seed=42,
)

# Prepare dataset
dataset = datasets.hotpot_300()

# Define metric and task configuration (see docs for more options)
def levenshtein_ratio(dataset_item, llm_output):
    return LevenshteinRatio().score(reference=dataset_item["answer"], output=llm_output)

prompt = ChatPrompt(
    project_name="my-project",
    messages=[
        {"role": "system", "content": "Provide an answer to the question."},
        {"role": "user", "content": "{question}"},
    ],
)

# Run optimization
results = optimizer.optimize_prompt(
    prompt=prompt,
    dataset=dataset,
    metric=levenshtein_ratio,
)

# Access results
results.display()
```

Model Support

The MetaPromptOptimizer supports all models available through LiteLLM. This provides broad compatibility with providers like OpenAI, Azure OpenAI, Anthropic, Google, and many others, including locally hosted models.

For detailed instructions on how to specify different models and configure providers, please refer to the main LiteLLM Support for Optimizers documentation page.

Configuration Example using LiteLLM model string

```python
optimizer = MetaPromptOptimizer(
    model="google/gemini-pro",  # or any LiteLLM supported model
    temperature=0.1,
    max_tokens=5000,
)
```

Best Practices

  1. Template Design

    • Start with a clear structure (see the example prompt after this list)
    • Use consistent formatting
    • Include placeholders for variables
  2. Instruction Writing

    • Be specific and clear
    • Use active voice
    • Include success criteria
  3. Example Selection

    • Choose diverse examples
    • Ensure relevance to task
    • Balance complexity levels
  4. Optimization Strategy

    • Focus on one component at a time
    • Track changes systematically
    • Validate improvements

Research and References