Tool Optimization | Opik Documentation

Tool optimization is a specialized feature for optimizing prompts that use external tools and the Model Context Protocol (MCP). It is currently in Beta and is supported only by the MetaPrompt Optimizer.

What is Tool Optimization?

Tool optimization extends traditional prompt optimization to handle prompts that include:

MCP tools - Model Context Protocol tools for external integrations (Beta)
Tool schemas - Structured tool definitions and parameters
Multi-step workflows - Complex agent workflows involving multiple tools

Important Distinction: Many optimizers, including GEPA and MetaPrompt, can optimize agents that use tools; that is, they optimize the prompts of agents that have access to function calling or external tools. True tool optimization (optimizing the tools themselves, their schemas, or MCP integrations) is currently available only through the MetaPrompt Optimizer and is in Beta.

Why Tool Optimization Matters:

Traditional prompt optimization focuses on text-based prompts, but modern AI applications often require:

Integration with external APIs and services
Structured data processing through function calls
Complex multi-step reasoning with tool usage
Dynamic tool selection based on context

Tool optimization ensures these sophisticated prompts can be improved just like simple text prompts.

Supported Tool Types

1. Agent Function Calling (Not True Tool Optimization)

Many optimizers can optimize agents that use function calling, but this is different from true tool optimization. Here’s an example from the GEPA optimizer:

from opik_optimizer import GepaOptimizer, ChatPrompt

# Note: `search_wikipedia`, `dataset`, and `metric` are assumed to be defined elsewhere.

# GEPA example: optimizing an agent with function calling
prompt = ChatPrompt(
    system="You are a helpful assistant. Use the search_wikipedia tool when needed.",
    user="{question}",
    tools=[
        {
            "type": "function",
            "function": {
                "name": "search_wikipedia",
                "description": "This function searches Wikipedia abstracts.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "query": {
                            "type": "string",
                            "description": "The term or phrase to search for."
                        }
                    },
                    "required": ["query"]
                }
            }
        }
    ],
    # Maps each tool name declared above to the Python callable that implements it
    function_map={
        "search_wikipedia": lambda query: search_wikipedia(query, use_api=True)
    }
)

# GEPA optimizes the agent's prompt, not the tools themselves
optimizer = GepaOptimizer(model="gpt-4o-mini")
result = optimizer.optimize_prompt(prompt=prompt, dataset=dataset, metric=metric)

This is agent optimization (optimizing prompts for agents that use tools), not tool optimization (optimizing the tools themselves).
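To make the role of `function_map` concrete, here is a minimal sketch of how a tool call emitted by the model might be routed to a Python callable. The `dispatch_tool_call` helper and the stubbed `search_wikipedia` are hypothetical names written for illustration; this is not the optimizer's internal implementation, only the general pattern behind OpenAI-style function calling.

```python
import json


def search_wikipedia(query: str) -> str:
    """Stub standing in for a real search tool (hypothetical)."""
    return f"Stub abstract for: {query}"


# Maps tool names (as declared in the tool schema) to Python callables.
function_map = {"search_wikipedia": search_wikipedia}


def dispatch_tool_call(tool_call: dict, function_map: dict) -> str:
    """Run one OpenAI-style tool call and return its result as text."""
    name = tool_call["function"]["name"]
    # In this format the arguments arrive as a JSON-encoded string.
    arguments = json.loads(tool_call["function"]["arguments"])
    if name not in function_map:
        return f"Error: unknown tool '{name}'"
    return str(function_map[name](**arguments))


example_call = {
    "type": "function",
    "function": {
        "name": "search_wikipedia",
        "arguments": json.dumps({"query": "Alan Turing"}),
    },
}
tool_result = dispatch_tool_call(example_call, function_map)
print(tool_result)
```

In this pattern the optimizer only changes the prompt text around the tools; the schema and the callable stay fixed, which is why it counts as agent optimization.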

2. MCP (Model Context Protocol) Tools - Beta

True tool optimization is currently only available for MCP tools and is in Beta. MCP tools provide standardized interfaces for external integrations:

# MCP tool optimization example (Beta)
# See scripts/litellm_metaprompt_context7_mcp_example.py for working examples

from opik_optimizer import MetaPromptOptimizer

# MCP tools are configured through mcp.json manifests
# The MetaPrompt Optimizer can optimize MCP tool descriptions and usage
optimizer = MetaPromptOptimizer(model="gpt-4")

# MCP tool optimization is currently in Beta
# Check the scripts/ directory for working examples

MCP Tool Optimization Status:

Currently in Beta
Only supported by MetaPrompt Optimizer
See PR#3341 for implementation details
Working examples available in scripts/litellm_metaprompt_context7_mcp_example.py

How Tool Optimization Works

The MetaPrompt Optimizer handles tool-enabled prompts through a specialized optimization process:

1. Tool-Aware Analysis

The optimizer analyzes:

Tool schemas - Understanding available functions and their parameters
Tool usage patterns - How tools are typically invoked in the prompt
Tool dependencies - Relationships between different tools
Context requirements - What information tools need to function effectively
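As a rough illustration of what analyzing tool schemas can involve, the sketch below walks a list of OpenAI-style tool definitions and extracts each tool's name, description, and parameters. The `summarize_tools` helper is a hypothetical name introduced here; it is not part of the Opik API and does not reflect how the MetaPrompt Optimizer performs its analysis internally.

```python
def summarize_tools(tools: list[dict]) -> list[dict]:
    """Extract name, description, and parameter info from tool schemas."""
    summaries = []
    for tool in tools:
        function = tool.get("function", {})
        parameters = function.get("parameters", {})
        properties = parameters.get("properties", {})
        required = set(parameters.get("required", []))
        summaries.append({
            "name": function.get("name"),
            "description": function.get("description", ""),
            "parameters": {
                param: {"type": spec.get("type"), "required": param in required}
                for param, spec in properties.items()
            },
        })
    return summaries


tools = [
    {
        "type": "function",
        "function": {
            "name": "search_wikipedia",
            "description": "This function searches Wikipedia abstracts.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "The term or phrase to search for.",
                    }
                },
                "required": ["query"],
            },
        },
    }
]

for summary in summarize_tools(tools):
    print(summary)
```

A summary like this is the kind of structured view that could be handed to a reasoning model alongside the prompt when it proposes revisions.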

2. Prompt-Tool Integration Optimization

The optimizer can improve:

Tool selection logic - Better instructions for when to use which tools
Parameter formatting - Clearer guidance on how to structure tool inputs
Error handling - Instructions for handling tool failures or edge cases
Tool chaining - Optimizing multi-step tool workflows

3. Context Enhancement

Tool optimization also improves:

Input validation - Better prompts for validating tool inputs
Output processing - Instructions for handling tool outputs
Fallback strategies - Alternative approaches when tools are unavailable
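Fallback instructions in a prompt only help if a tool failure reaches the model as readable text rather than as an unhandled exception. The sketch below shows one way to arrange that by wrapping a tool before placing it in a `function_map`. The `with_fallback` wrapper and the `flaky_search` tool are hypothetical and exist only for illustration.

```python
from typing import Callable


def with_fallback(tool: Callable[..., str], fallback_message: str) -> Callable[..., str]:
    """Wrap a tool so that failures return a message instead of raising."""
    def wrapped(**kwargs) -> str:
        try:
            return tool(**kwargs)
        except Exception as error:
            # Surface the failure as text so the model can follow its fallback instructions.
            return f"{fallback_message} (tool error: {error})"
    return wrapped


def flaky_search(query: str) -> str:
    """Hypothetical tool that always fails, used to demonstrate the wrapper."""
    raise TimeoutError("search backend timed out")


safe_search = with_fallback(
    flaky_search,
    "Search is unavailable; answer from prior knowledge.",
)
message = safe_search(query="transformers")
print(message)
```

With a wrapper like this in place, the prompt's fallback instructions have a concrete signal to react to during both optimization and evaluation.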

Example: Optimizing a Research Assistant

Let’s see how tool optimization works with a research assistant that uses multiple tools:

from opik_optimizer import MetaPromptOptimizer, ChatPrompt
from opik.evaluation.metrics import LevenshteinRatio

# Define a research assistant prompt with tools
research_prompt = ChatPrompt(
    messages=[
        {
            "role": "system",
            "content": """You are a research assistant. When given a research question:
1. Search for relevant information using the search tool
2. Analyze the results using the analysis tool
3. Provide a comprehensive answer based on your findings

Always cite your sources and be thorough in your research."""
        },
        {
            "role": "user",
            "content": "{research_question}"
        }
    ],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "search_academic_database",
                "description": "Search academic papers and research",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "query": {"type": "string"},
                        "year_range": {"type": "string"},
                        "max_results": {"type": "integer"}
                    }
                }
            }
        },
        {
            "type": "function",
            "function": {
                "name": "analyze_text",
                "description": "Analyze and summarize text content",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "text": {"type": "string"},
                        "analysis_type": {"type": "string"}
                    }
                }
            }
        }
    ]
)

# Initialize the optimizer
optimizer = MetaPromptOptimizer(
    model="openai/gpt-4",
    reasoning_model="openai/gpt-4-turbo"
)

# Define evaluation metric: string similarity between the model output
# and the expected answer stored on each dataset item
def research_quality_metric(dataset_item, llm_output):
    return LevenshteinRatio().score(
        reference=dataset_item['expected_answer'],
        output=llm_output
    )

# Run optimization
# Note: `research_dataset` is assumed to be defined elsewhere, with items that
# contain `research_question` and `expected_answer` fields
result = optimizer.optimize_prompt(
    prompt=research_prompt,
    dataset=research_dataset,
    metric=research_quality_metric,
    n_samples=100,
    max_rounds=5
)

print("Optimized prompt with tools:")
print(result.prompt)

Best Practices for Tool Optimization

1. Tool Schema Design

Clear descriptions - Provide detailed descriptions for each tool
Comprehensive parameters - Include all necessary parameters with types
Example usage - Add examples in tool descriptions when helpful
Error handling - Define expected error conditions and responses
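The checklist above can be partly automated before running an optimization. The following is a minimal sketch of a schema linter; `lint_tool_schema` is a hypothetical helper, and the 20-character threshold for a "short" description is an arbitrary assumption, not a rule from Opik.

```python
def lint_tool_schema(tool: dict) -> list[str]:
    """Return human-readable issues found in one OpenAI-style tool schema."""
    issues = []
    function = tool.get("function", {})
    name = function.get("name", "<unnamed>")
    # Assumption: descriptions under 20 characters are too terse to be useful.
    if len(function.get("description", "")) < 20:
        issues.append(f"{name}: description is missing or very short")
    parameters = function.get("parameters", {})
    properties = parameters.get("properties", {})
    if properties and not parameters.get("required"):
        issues.append(f"{name}: no parameters are marked as required")
    for param, spec in properties.items():
        if "type" not in spec:
            issues.append(f"{name}.{param}: missing type")
        if not spec.get("description"):
            issues.append(f"{name}.{param}: missing description")
    return issues


analyze_text_tool = {
    "type": "function",
    "function": {
        "name": "analyze_text",
        "description": "Analyze and summarize text content",
        "parameters": {
            "type": "object",
            "properties": {
                "text": {"type": "string"},
                "analysis_type": {"type": "string"},
            },
        },
    },
}

issues = lint_tool_schema(analyze_text_tool)
for issue in issues:
    print(issue)
```

Run against a schema with untyped or undescribed parameters, a check like this flags the gaps that tend to make tool selection and parameter formatting harder for the model.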

2. Prompt Structure

Tool introduction - Clearly explain available tools to the model
Usage guidelines - Provide specific instructions on when and how to use tools
Output formatting - Specify how tool outputs should be processed
Fallback instructions - Define what to do when tools fail

3. Evaluation Considerations

Tool usage metrics - Measure not just final output quality but tool usage effectiveness
Multi-step evaluation - Evaluate each step in tool-based workflows
Error rate tracking - Monitor tool failure rates and recovery strategies
Context preservation - Ensure important context is maintained across tool calls
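One way to measure tool usage alongside final answer quality is to inspect the conversation trace. The sketch below counts tool calls and failed tool results in an OpenAI-style message list. The `tool_usage_stats` helper is hypothetical, and it assumes that failed tools return content beginning with "Error", which is a convention you would need to enforce in your own tools.

```python
def tool_usage_stats(messages: list[dict]) -> dict:
    """Count tool calls and tool errors in an OpenAI-style message list."""
    calls = 0
    errors = 0
    tools_used = set()
    for message in messages:
        if message.get("role") == "assistant":
            for call in message.get("tool_calls") or []:
                calls += 1
                tools_used.add(call["function"]["name"])
        elif message.get("role") == "tool":
            # Assumption: failed tools return content starting with "Error".
            if str(message.get("content", "")).startswith("Error"):
                errors += 1
    return {
        "tool_calls": calls,
        "distinct_tools": sorted(tools_used),
        "error_rate": errors / calls if calls else 0.0,
    }


messages = [
    {"role": "user", "content": "What is MCP?"},
    {"role": "assistant", "content": None, "tool_calls": [
        {"id": "call_1", "type": "function",
         "function": {"name": "search_academic_database", "arguments": '{"query": "MCP"}'}},
    ]},
    {"role": "tool", "tool_call_id": "call_1", "content": "Error: timeout"},
    {"role": "assistant", "content": None, "tool_calls": [
        {"id": "call_2", "type": "function",
         "function": {"name": "analyze_text", "arguments": '{"text": "MCP overview"}'}},
    ]},
    {"role": "tool", "tool_call_id": "call_2", "content": "Summary: a protocol for tool integrations"},
    {"role": "assistant", "content": "MCP is a protocol for tool integrations."},
]

stats = tool_usage_stats(messages)
print(stats)
```

Statistics like these could be combined with an output-quality score inside a custom metric function, so that an optimized prompt is not rewarded for good answers reached through wasteful or failing tool calls.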

Limitations and Considerations

Current Limitations

MetaPrompt Only - Tool optimization is currently only available with the MetaPrompt Optimizer
Tool Complexity - Very complex tool workflows may require manual optimization
Tool Availability - Optimization assumes tools are available during evaluation
Schema Changes - Tool schema modifications may require re-optimization

Performance Considerations

Evaluation Cost - Tool-enabled prompts require more LLM calls for evaluation
Tool Latency - External tool calls can slow down optimization
Resource Usage - Complex tool workflows may require significant computational resources
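Because optimization evaluates many prompt candidates against the same dataset, identical tool calls are often repeated. A minimal sketch of reducing that cost with the standard library's `functools.lru_cache` follows. The `slow_search` tool is a hypothetical stand-in, and caching is only appropriate when a tool is deterministic and its arguments are hashable.

```python
from functools import lru_cache

# Tracks how many times the underlying tool actually runs.
calls = {"count": 0}


def slow_search(query: str) -> str:
    """Hypothetical stand-in for a slow external tool call."""
    calls["count"] += 1
    return f"results for {query}"


# Wrap the tool so repeated queries are served from memory.
cached_search = lru_cache(maxsize=256)(slow_search)

for _ in range(3):
    cached_search("prompt optimization")
cached_search("tool schemas")

print(calls["count"])
print(cached_search.cache_info())
```

The cached callable can be placed in a `function_map` in place of the original tool; the cache should be cleared if the tool's underlying data changes between runs.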

Future Roadmap

Tool optimization is an active area of development. Planned improvements include:

Multi-optimizer support - Extending tool optimization to other optimizers
Tool-specific metrics - Specialized evaluation metrics for tool usage
Automated tool discovery - Automatic detection and optimization of tool patterns
Tool performance optimization - Optimizing not just prompts but tool usage efficiency

Getting Started

To start optimizing tool-enabled prompts:

Choose MetaPrompt Optimizer - Currently the only optimizer supporting tool optimization
Define your tools - Create clear tool schemas with comprehensive descriptions
Structure your prompt - Include clear instructions for tool usage
Prepare evaluation data - Ensure your dataset includes tool usage scenarios
Run optimization - Use the standard optimization process with tool-enabled prompts

Need Help?

For questions about tool optimization or to request support for additional optimizers, please reach out on GitHub or check the MetaPrompt Optimizer documentation for detailed configuration options.
