Extending Optimizers

Extend Opik with custom optimization algorithms and contributions.

Opik Agent Optimizer is designed to be a flexible framework for prompt and agent optimization. While it provides a suite of powerful built-in algorithms, you might have unique optimization strategies or specialized needs. This guide shows how to build your own optimizer by extending the BaseOptimizer class that all built-in optimizers use.

Architecture Overview

All optimizers in the SDK extend BaseOptimizer, so a custom optimizer gets the same evaluation, history-tracking, and cost-accounting infrastructure as the built-in ones.
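
In practice, a custom optimizer is a subclass that supplies its internal prompts, its metadata, and an optimize_prompt() implementation, while inheriting everything else from the base class. A minimal sketch of that shape is shown below (method names mirror the steps in this guide; bodies are elided):

from typing import Any, Callable

from opik import Dataset
from opik_optimizer.api_objects.chat_prompt import ChatPrompt
from opik_optimizer.base_optimizer import BaseOptimizer
from opik_optimizer.optimization_result import OptimizationResult


class MyCustomOptimizer(BaseOptimizer):
    # Internal prompts the algorithm uses; customizable via prompt_overrides (Step 1)
    DEFAULT_PROMPTS: dict[str, str] = {}

    def get_optimizer_metadata(self) -> dict[str, Any]:
        # Optimizer-specific parameters to log with experiments (Step 1)
        ...

    def optimize_prompt(
        self, prompt: ChatPrompt, dataset: Dataset, metric: Callable, **kwargs: Any
    ) -> OptimizationResult:
        # Candidate generation, evaluation, and selection loop (Step 2)
        ...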

Core Concepts for a Custom Optimizer

To design a new optimization algorithm within Opik’s ecosystem, your optimizer needs to interact with several key components:

  1. Prompt (ChatPrompt): Your optimizer takes a ChatPrompt object as input. A chat prompt is a list of messages, where each message has a role, content, and optional additional fields. Messages may contain placeholder variables that are filled in with values from each dataset item at evaluation time.

  2. Evaluation Mechanism (Metric & Dataset): Your optimizer needs a way to score candidate prompts. You do this with a metric (a function that takes dataset_item and llm_output as arguments and returns a float score) and an evaluation dataset (see the sketch after this list).

  3. Optimization Loop: This is the heart of your custom optimizer. It involves:

    • Candidate Generation: Logic for creating new prompt variations. This could be rule-based, LLM-driven, or based on any other heuristic.
    • Candidate Evaluation: Using the metric and dataset to get a score for each candidate.
    • Selection/Progression: Logic to decide which candidates to keep, refine further, or how to adjust the generation strategy based on scores.
    • Termination Condition: Criteria for when to stop the optimization (e.g., number of rounds, score threshold, no improvement).
  4. Returning Results (OptimizationResult): Upon completion, your optimizer returns an OptimizationResult object that standardizes how results are reported, including the best prompt found, its score, history of the optimization process, and cost/usage metrics.
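
To make the first two concepts concrete, here is a minimal sketch of a prompt and a metric. It assumes ChatPrompt accepts a messages list as described above; the {question} placeholder and the expected_answer field are illustrative names, so use whatever your dataset items actually contain.

from opik_optimizer.api_objects.chat_prompt import ChatPrompt

# A prompt with a {question} placeholder that is filled from each dataset item
prompt = ChatPrompt(
    messages=[
        {"role": "system", "content": "You are a concise, factual assistant."},
        {"role": "user", "content": "{question}"},
    ]
)

# A metric: takes a dataset item and the model output, returns a float score
def exact_match(dataset_item: dict, llm_output: str) -> float:
    expected = dataset_item["expected_answer"]  # illustrative field name
    return 1.0 if llm_output.strip() == expected.strip() else 0.0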

Creating a Custom Optimizer

Step 1: Define Your Optimizer Class

Extend BaseOptimizer and define your DEFAULT_PROMPTS, the internal prompts your algorithm uses:

from opik_optimizer.base_optimizer import BaseOptimizer, OptimizationRound
from opik_optimizer.optimization_result import OptimizationResult
from opik_optimizer.api_objects.chat_prompt import ChatPrompt
from opik import Dataset
from typing import Any, Callable


class MyCustomOptimizer(BaseOptimizer):
    """
    A custom optimizer that implements [your algorithm description].
    """

    # Define internal prompts used by your algorithm.
    # Users can customize these via the prompt_overrides parameter.
    DEFAULT_PROMPTS = {
        "analysis_prompt": """Analyze the following prompt and identify improvement opportunities:

Current prompt:
{current_prompt}

Failure cases from evaluation:
{failures}

Identify specific issues and suggest concrete improvements.""",

        "generation_prompt": """Generate an improved version of this prompt:

Original prompt:
{current_prompt}

Focus areas for improvement:
{improvement_focus}

Return only the improved prompt text.""",
    }

    def __init__(
        self,
        model: str,
        max_iterations: int = 5,
        candidates_per_round: int = 3,
        improvement_threshold: float = 0.01,
        verbose: int = 1,
        seed: int = 42,
        **kwargs: Any,
    ) -> None:
        """
        Initialize the custom optimizer.

        Args:
            model: LiteLLM model name for the optimizer's internal LLM calls
            max_iterations: Maximum optimization rounds
            candidates_per_round: Number of candidate prompts to generate per round
            improvement_threshold: Minimum score improvement to continue
            verbose: Logging verbosity (0=off, 1=on)
            seed: Random seed for reproducibility
            **kwargs: Additional BaseOptimizer parameters (model_parameters, etc.)
        """
        super().__init__(model=model, verbose=verbose, seed=seed, **kwargs)
        self.max_iterations = max_iterations
        self.candidates_per_round = candidates_per_round
        self.improvement_threshold = improvement_threshold

    def get_optimizer_metadata(self) -> dict[str, Any]:
        """
        Expose optimizer-specific parameters for logging and tracking.
        This metadata appears in Opik experiment configurations.
        """
        return {
            "max_iterations": self.max_iterations,
            "candidates_per_round": self.candidates_per_round,
            "improvement_threshold": self.improvement_threshold,
        }

Step 2: Implement the optimize_prompt() Method

This is the core method implementing your optimization logic:

def optimize_prompt(
    self,
    prompt: ChatPrompt,
    dataset: Dataset,
    metric: Callable,
    agent: Any = None,
    experiment_config: dict | None = None,
    n_samples: int | None = None,
    auto_continue: bool = False,
    project_name: str = "Optimization",
    optimization_id: str | None = None,
    validation_dataset: Dataset | None = None,
    max_trials: int = 10,
    **kwargs: Any,
) -> OptimizationResult:
    """
    Optimize a prompt using the custom algorithm.

    Args:
        prompt: The ChatPrompt to optimize
        dataset: Training dataset for feedback and context
        metric: Scoring function (dataset_item, llm_output) -> float
        agent: Optional custom agent for evaluation
        experiment_config: Optional experiment metadata
        n_samples: Limit dataset samples per evaluation (None = all)
        project_name: Opik project name for tracing
        validation_dataset: Optional separate dataset for candidate ranking
        max_trials: Maximum evaluation trials
        **kwargs: Algorithm-specific parameters

    Returns:
        OptimizationResult with best prompt, scores, and history
    """
    # 1. Initialize: reset counters and set the project context
    self._reset_counters()
    self.project_name = project_name

    # 2. Evaluate the baseline prompt to establish a starting point
    baseline_score = self.evaluate_prompt(
        prompt=prompt,
        dataset=dataset,
        metric=metric,
        n_samples=n_samples,
        verbose=self.verbose,
    )

    # 3. Check if the baseline is already good enough (skip optimization)
    if self._should_skip_optimization(baseline_score):
        return self._build_early_result(
            optimizer_name=self.__class__.__name__,
            prompt=prompt,
            score=baseline_score,
            metric_name=metric.__name__,
            initial_prompt=prompt,
            details={"reason": "baseline_score_sufficient"},
        )

    # 4. Main optimization loop
    best_prompt = prompt
    best_score = baseline_score
    previous_best_score = baseline_score

    for iteration in range(self.max_iterations):
        # 4a. Generate candidate prompts based on the current best prompt
        candidates = self._generate_candidates(
            current_prompt=best_prompt,
            dataset=dataset,
            metric=metric,
        )

        # 4b. Evaluate each candidate
        # Use validation_dataset if provided, otherwise the training dataset
        eval_dataset = validation_dataset or dataset
        round_best_prompt = best_prompt
        round_best_score = best_score

        for candidate in candidates:
            score = self.evaluate_prompt(
                prompt=candidate,
                dataset=eval_dataset,
                metric=metric,
                n_samples=n_samples,
                verbose=0,  # Reduce noise during candidate evaluation
            )

            # 4c. Keep the best candidate from this round
            if score > round_best_score:
                round_best_score = score
                round_best_prompt = candidate

        # Update the global best if this round improved on it
        if round_best_score > best_score:
            best_score = round_best_score
            best_prompt = round_best_prompt

        # 4d. Record optimization history
        self._add_to_history(OptimizationRound(
            round_number=iteration,
            current_prompt=best_prompt,
            current_score=best_score,
            generated_prompts=candidates,
            best_prompt=best_prompt,
            best_score=best_score,
            improvement=best_score - baseline_score,
        ))

        # 4e. Check termination conditions
        improvement = best_score - previous_best_score
        if improvement < self.improvement_threshold:
            if self.verbose:
                print(f"Converged at iteration {iteration}")
            break

        previous_best_score = best_score

    # 5. Prepare and return the OptimizationResult
    return OptimizationResult(
        optimizer=self.__class__.__name__,
        prompt=best_prompt,
        score=best_score,
        metric_name=metric.__name__,
        initial_prompt=prompt,
        initial_score=baseline_score,
        details={
            "iterations_completed": iteration + 1,
            "total_candidates_evaluated": (iteration + 1) * self.candidates_per_round,
        },
        history=self.get_history(),
        llm_calls=self.llm_call_counter,
        llm_calls_tools=self.llm_calls_tools_counter,
        llm_cost_total=self.llm_cost_total,
        llm_token_usage_total=self.llm_token_usage_total,
    )

Step 3: Implement Candidate Generation

Implement your own logic for creating new prompt variations. Use get_prompt() to access your internal prompts; it respects any prompt_overrides the user supplied:

from opik_optimizer._llm_calls import call_model


def _generate_candidates(
    self,
    current_prompt: ChatPrompt,
    dataset: Dataset,
    metric: Callable,
) -> list[ChatPrompt]:
    """
    Generate candidate prompts using LLM-based improvement.

    Args:
        current_prompt: The prompt to improve
        dataset: Dataset for context (can analyze failures)
        metric: Metric for understanding what "good" means

    Returns:
        List of candidate ChatPrompt objects
    """
    candidates = []

    for i in range(self.candidates_per_round):
        # Get the generation prompt template (respects prompt_overrides)
        generation_request = self.get_prompt(
            "generation_prompt",
            current_prompt=current_prompt.get_messages(),
            improvement_focus=f"variation {i+1}: explore different approaches",
        )

        # Call the LLM to generate an improved prompt
        response = call_model(
            messages=[{"role": "user", "content": generation_request}],
            model=self.model,
            seed=self.seed + i,  # Vary the seed for diversity
            model_parameters=self.model_parameters,
            project_name=self.project_name,
        )

        # Parse the response and create a new ChatPrompt
        new_prompt = self._parse_prompt_from_response(response, current_prompt)
        if new_prompt is not None:
            candidates.append(new_prompt)

    return candidates


def _parse_prompt_from_response(
    self,
    response: str,
    template_prompt: ChatPrompt,
) -> ChatPrompt | None:
    """
    Parse the LLM response into a new ChatPrompt.
    """
    try:
        new_prompt = template_prompt.model_copy(deep=True)
        # Update the system message with the improved prompt
        for msg in new_prompt.messages:
            if msg.get("role") == "system":
                msg["content"] = response.strip()
                break
        return new_prompt
    except Exception:
        return None
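
With the class, optimize_prompt(), and candidate generation in place, running the custom optimizer looks like running any built-in one. A minimal usage sketch, reusing the prompt and exact_match metric from the Core Concepts sketch above; the dataset name is illustrative:

import opik

# `prompt` and `exact_match` as defined in the Core Concepts sketch above
dataset = opik.Opik().get_dataset("my-qa-dataset")  # illustrative dataset name

optimizer = MyCustomOptimizer(
    model="openai/gpt-4o-mini",
    max_iterations=3,
    candidates_per_round=3,
)

result = optimizer.optimize_prompt(
    prompt=prompt,
    dataset=dataset,
    metric=exact_match,
    n_samples=50,  # evaluate each candidate on a 50-item sample
)

print(result.score)   # best score found
print(result.prompt)  # best ChatPrompt found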

What BaseOptimizer Provides

The BaseOptimizer class provides the evaluation and bookkeeping utilities that all built-in optimizers rely on. Reusing them keeps your custom optimizer consistent with the rest of the Opik ecosystem.

| Component | Description |
| --- | --- |
| evaluate_prompt() | Evaluates a prompt against a dataset using the metric. Handles threading, sampling, and result aggregation. |
| get_prompt(key, **fmt) | Gets an internal prompt template with optional formatting. Respects prompt_overrides. |
| list_prompts() | Lists all available prompt keys for this optimizer. |
| _reset_counters() | Resets LLM call/cost counters. Call at the start of optimize_prompt(). |
| _add_to_history() | Tracks optimization rounds for result reporting. |
| _should_skip_optimization() | Checks if the baseline score exceeds the perfect_score threshold. |
| _build_early_result() | Creates an OptimizationResult when skipping optimization. |
| llm_call_counter | Tracks the number of LLM calls made. |
| llm_cost_total | Tracks total API cost (when available from the provider). |
| llm_token_usage_total | Tracks token usage across all calls. |
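
The prompt_overrides mechanism referenced above lets users replace any entry in DEFAULT_PROMPTS without subclassing. A short sketch, assuming prompt_overrides is accepted through the constructor's **kwargs and forwarded to BaseOptimizer:

# Override an internal prompt without subclassing (assumes prompt_overrides
# is forwarded to BaseOptimizer via **kwargs)
optimizer = MyCustomOptimizer(
    model="openai/gpt-4o-mini",
    prompt_overrides={
        "generation_prompt": (
            "Rewrite this prompt so it is shorter and more specific:\n"
            "{current_prompt}\n\nFocus: {improvement_focus}"
        ),
    },
)

# Discover which internal prompt keys exist
print(optimizer.list_prompts())  # ['analysis_prompt', 'generation_prompt']

# get_prompt() returns the override, formatted with the supplied values
print(optimizer.get_prompt(
    "generation_prompt",
    current_prompt="You are a helpful assistant.",
    improvement_focus="brevity",
))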

Using Structured Outputs

For complex generation, use Pydantic models for structured LLM responses:

from opik_optimizer._llm_calls import call_model
from pydantic import BaseModel


class PromptAnalysis(BaseModel):
    issues: list[str]
    suggestions: list[str]
    priority: str


# Returns a parsed Pydantic object, not raw text
analysis = call_model(
    messages=[{"role": "user", "content": "Analyze this prompt: ..."}],
    model=self.model,
    response_model=PromptAnalysis,
    project_name=self.project_name,
)

print(analysis.issues)       # ['Issue 1', 'Issue 2']
print(analysis.suggestions)  # ['Suggestion 1', ...]

How to Contribute

Opik is continuously evolving, and community contributions are valuable!

  • Feature Requests & Ideas: If you have ideas for new optimization algorithms, features, or improvements to existing ones, please share them through our community channels or by raising an issue on our GitHub repository.
  • Bug Reports: If you encounter issues or unexpected behavior, detailed bug reports are greatly appreciated.
  • Use Cases & Feedback: Sharing your use cases and how Opik Agent Optimizer is (or isn’t) meeting your needs helps us prioritize development.
  • Code Contributions: Pull requests for new optimizers are welcome! See the contribution guide for detailed instructions.

Key Takeaways

  • Extend BaseOptimizer to create custom optimization algorithms with full access to Opik’s infrastructure
  • Define DEFAULT_PROMPTS for your algorithm’s internal prompts; users can customize these via prompt_overrides
  • Implement optimize_prompt() with your optimization logic, using the inherited evaluate_prompt() to score candidates
  • Return standardized OptimizationResult objects for consistent reporting and dashboard integration
  • Use _llm_calls.call_model() for LLM interactions with automatic cost/usage tracking

We encourage you to explore the existing optimizer algorithms to see different approaches to these challenges.