The Opik Agent Optimizer can optimize both simple prompts and complex agent workflows. For most use cases, you can optimize prompts directly using ChatPrompt. When you need multi-prompt workflows, agent orchestration, or custom execution logic, you’ll use OptimizableAgent to create a custom agent class.
Use ChatPrompt directly (default approach):
Use OptimizableAgent when you need:
Optimizers work seamlessly with both approaches. The optimizer calls your agent’s invoke_agent() method repeatedly during optimization, passing different prompt candidates to evaluate.
For most optimization tasks, you can use ChatPrompt directly without creating a custom agent. The optimizer uses a default LiteLLM-based agent under the hood.
When integrating with specific agent frameworks (Google ADK, LangGraph, CrewAI, etc.), you’ll create a custom OptimizableAgent subclass. This allows the optimizer to work with your framework’s execution model.
Here’s an example for Google ADK:
The key points:
prompts dict: prompt = list(prompts.values())[0]messages = prompt.get_messages(dataset_item)See sdks/opik_optimizer/scripts/llm_frameworks/ for working examples of framework integrations (ADK, LangGraph, CrewAI, etc.). Each script doubles as both documentation and regression tests.
For multi-step agent workflows, you must use OptimizableAgent because ChatPrompt only handles a single prompt. Multi-prompt optimization allows you to optimize multiple prompts that work together in a pipeline.
Here’s a simple example of a two-step workflow that analyzes input and then generates a response:
When optimizing, pass a dictionary of prompts instead of a single prompt:
The optimizer will optimize both prompts in the dictionary, trying different combinations to improve performance.
The prompts dict keys (like “analyze” and “respond”) are used to identify which prompt to optimize. The optimizer can optimize all prompts or specific ones based on the optimize_prompt parameter.
All OptimizableAgent subclasses must implement invoke_agent():
Parameters:
prompts: Dictionary mapping prompt names to ChatPrompt objectsdataset_item: Dataset row used to format prompt messagesallow_tool_use: Whether tools may be executed (for tool-calling prompts)seed: Optional random seed for reproducibilityReturns: A single string output that will be scored by your metric function
Use ChatPrompt.get_messages() to format the prompt with dataset values:
For multi-prompt workflows, pass additional context when calling get_messages():
prompt.model and prompt.model_kwargs for consistencyseed parameter when making LLM callsself.trace_metadataFor advanced multi-prompt examples, see sdks/opik_optimizer/benchmarks/agents/hotpot_multihop_agent.py which implements a complex multi-hop retrieval pipeline with Wikipedia search.
sdks/opik_optimizer/scripts/llm_frameworks/