Opik Agent Optimizer API Reference

Technical SDK reference guide

The Opik Agent Optimizer SDK provides a comprehensive set of tools for optimizing LLM prompts and agents. This reference guide documents the standardized API that all optimizers follow, ensuring consistency and interoperability across different optimization algorithms.

Key Features

  • Standardized API: All optimizers expose the same interface through a shared optimize_prompt() method
  • Multiple Algorithms: Support for various optimization strategies including evolutionary, few-shot, meta-prompt, and GEPA
  • MCP Support: Built-in support for Model Context Protocol tool calling
  • Consistent Results: All optimizers return standardized OptimizationResult objects
  • Counter Tracking: Built-in LLM and tool call counters for monitoring usage
  • Backward Compatibility: All original parameters preserved through kwargs extraction
  • Deprecation Warnings: Clear warnings for deprecated parameters with migration guidance

Core Classes

The SDK provides several optimizer classes that all inherit from BaseOptimizer and implement the same standardized interface:

  • ParameterOptimizer: Optimizes LLM call parameters (temperature, top_p, etc.) using Bayesian optimization
  • FewShotBayesianOptimizer: Uses few-shot learning with Bayesian optimization
  • MetaPromptOptimizer: Employs meta-prompting techniques for optimization
  • EvolutionaryOptimizer: Uses genetic algorithms for prompt evolution
  • GepaOptimizer: Leverages GEPA (Genetic-Pareto) optimization approach
  • HierarchicalReflectiveOptimizer: Uses hierarchical root cause analysis for targeted prompt refinement
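
All of these optimizers are designed to be interchangeable. A minimal import sketch is shown below; it assumes the classes are exported from the opik_optimizer package top level, which is how they are commonly imported (adjust if your installed version differs):

from opik_optimizer import (
    ChatPrompt,
    EvolutionaryOptimizer,
    FewShotBayesianOptimizer,
    GepaOptimizer,
    MetaPromptOptimizer,
)
# ParameterOptimizer and HierarchicalReflectiveOptimizer follow the same
# import pattern in recent versions of the package.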

Standardized Method Signatures

All optimizers implement the following core method with an identical signature:

optimize_prompt()

def optimize_prompt(
    self,
    prompt: ChatPrompt | dict[str, ChatPrompt],
    dataset: Dataset,
    metric: MetricFunction,
    agent: OptimizableAgent | None = None,
    experiment_config: dict | None = None,
    n_samples: int | None = None,
    auto_continue: bool = False,
    project_name: str | None = None,
    optimization_id: str | None = None,
    validation_dataset: Dataset | None = None,
    max_trials: int = 10,
    allow_tool_use: bool = True,
    **kwargs: Any,
) -> OptimizationResult
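
As a minimal usage sketch (the dataset name, dataset fields, and metric below are illustrative, not part of the SDK):

from opik import Opik
from opik_optimizer import ChatPrompt, MetaPromptOptimizer

def exact_match(dataset_item, llm_output):
    # Metric contract required by the SDK: (dataset_item, llm_output) -> float
    return float(llm_output.strip() == dataset_item["answer"].strip())

dataset = Opik().get_dataset("my-dataset")  # illustrative dataset name

prompt = ChatPrompt(
    system="You are a concise assistant.",
    user="{question}",  # placeholder resolved from each dataset item
)

optimizer = MetaPromptOptimizer(model="gpt-4o-mini")
result = optimizer.optimize_prompt(
    prompt=prompt,
    dataset=dataset,
    metric=exact_match,
    n_samples=50,
    max_trials=10,
)
result.display()  # prints a summary of the returned OptimizationResult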

Deprecation Warnings

The following parameters are deprecated and will be removed in future versions:

Constructor Parameters

  • num_threads in optimizer constructors: Use n_threads instead

Example Migration

# ❌ Deprecated
optimizer = FewShotBayesianOptimizer(
    model="gpt-4o-mini",
    num_threads=16,  # Deprecated
)

# ✅ Correct
optimizer = FewShotBayesianOptimizer(
    model="gpt-4o-mini",
    n_threads=16,  # Use n_threads instead
)

FewShotBayesianOptimizer

FewShotBayesianOptimizer(
    model: str = 'gpt-4o',
    model_parameters: dict[str, typing.Any] | None = None,
    min_examples: int = 2,
    max_examples: int = 8,
    n_threads: int = 12,
    verbose: int = 1,
    seed: int = 42,
    name: str | None = None,
    enable_columnar_selection: bool = True,
    enable_diversity: bool = True,
    enable_multivariate_tpe: bool = True,
    enable_optuna_pruning: bool = True,
    prompt_overrides: dict[str, str] | collections.abc.Callable[[opik_optimizer.utils.prompt_library.PromptLibrary], None] | None = None,
    skip_perfect_score: bool = True,
    perfect_score: float = 0.95
)

Parameters:

model
strDefaults to gpt-4o
LiteLLM model name for optimizer’s internal reasoning (generating few-shot templates)
model_parameters
dict[str, typing.Any] | None
Optional dict of LiteLLM parameters for optimizer’s internal LLM calls. Common params: temperature, max_tokens, max_completion_tokens, top_p.
min_examples
intDefaults to 2
Minimum number of examples to include in the prompt
max_examples
intDefaults to 8
Maximum number of examples to include in the prompt
n_threads
intDefaults to 12
Number of threads for parallel evaluation
verbose
intDefaults to 1
Controls internal logging/progress bars (0=off, 1=on)
seed
intDefaults to 42
Random seed for reproducibility
name
str | None
enable_columnar_selection
boolDefaults to True
Toggle column-aware example grouping (categorical Optuna params)
enable_diversity
boolDefaults to True
enable_multivariate_tpe
boolDefaults to True
Enable Optuna’s multivariate TPE sampler
enable_optuna_pruning
boolDefaults to True
Enable the Optuna pruner for early stopping
prompt_overrides
dict[str, str] | collections.abc.Callable[[opik_optimizer.utils.prompt_library.PromptLibrary], None] | None
Optional dict or callable to override/customize prompt templates. If a dict, keys should match DEFAULT_PROMPTS keys. If a callable, receives the PromptLibrary instance for in-place modification.
skip_perfect_score
boolDefaults to True
perfect_score
floatDefaults to 0.95
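
A hedged construction sketch for this optimizer (values are illustrative; the prompt_overrides callable form receives the internal PromptLibrary instance for in-place modification, as described above):

from opik_optimizer import FewShotBayesianOptimizer

def tweak_prompts(library):
    # Illustrative override hook: inspect or modify the optimizer's internal
    # prompt templates in place. Valid keys depend on DEFAULT_PROMPTS in your
    # installed version, so treat this as a sketch rather than exact usage.
    pass

optimizer = FewShotBayesianOptimizer(
    model="gpt-4o-mini",
    model_parameters={"temperature": 0.2, "max_tokens": 1024},
    min_examples=2,
    max_examples=6,
    n_threads=8,
    seed=42,
    prompt_overrides=tweak_prompts,  # or a dict keyed by DEFAULT_PROMPTS keys
)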

Methods

begin_round

1begin_round(
2 context: OptimizationContext,
3 extras: Any
4)

Parameters:

context
OptimizationContext
extras
Any

cleanup

1cleanup()

evaluate

1evaluate(
2 context: OptimizationContext,
3 prompts: dict,
4 experiment_config: dict[str, typing.Any] | None = None,
5 sampling_tag: str | None = None
6)

Parameters:

context
OptimizationContext
Optimization context for this run.
prompts
dict
Dict of named prompts to evaluate (e.g., {“main”: ChatPrompt(…)}). Single-prompt optimizations use a dict with one entry.
experiment_config
dict[str, typing.Any] | None
Optional experiment configuration.
sampling_tag
str | None
Optional sampling tag for deterministic subsampling per candidate.

evaluate_prompt

1evaluate_prompt(
2 prompt: opik_optimizer.api_objects.chat_prompt.ChatPrompt | dict[str, opik_optimizer.api_objects.chat_prompt.ChatPrompt],
3 dataset: Dataset,
4 metric: MetricFunction,
5 agent: opik_optimizer.agents.optimizable_agent.OptimizableAgent | None = None,
6 n_threads: int | None = None,
7 verbose: int = 1,
8 dataset_item_ids: list[str] | None = None,
9 experiment_config: dict | None = None,
10 n_samples: int | float | str | None = None,
11 n_samples_strategy: str | None = None,
12 seed: int | None = None,
13 return_evaluation_result: bool = False,
14 allow_tool_use: bool = False,
15 use_evaluate_on_dict_items: bool | None = None,
16 sampling_tag: str | None = None
17)

Parameters:

prompt
opik_optimizer.api_objects.chat_prompt.ChatPrompt | dict[str, opik_optimizer.api_objects.chat_prompt.ChatPrompt]
dataset
Dataset
metric
MetricFunction
agent
opik_optimizer.agents.optimizable_agent.OptimizableAgent | None
n_threads
int | None
verbose
intDefaults to 1
dataset_item_ids
list[str] | None
experiment_config
dict | None
n_samples
int | float | str | None
n_samples_strategy
str | None
seed
int | None
return_evaluation_result
boolDefaults to False
allow_tool_use
boolDefaults to False
use_evaluate_on_dict_items
bool | None
sampling_tag
str | None

evaluate_with_result

1evaluate_with_result(
2 context: OptimizationContext,
3 prompts: dict,
4 experiment_config: dict[str, typing.Any] | None = None,
5 empty_score: float | None = None,
6 n_samples: int | float | str | None = None,
7 n_samples_strategy: str | None = None,
8 sampling_tag: str | None = None
9)

Parameters:

context
OptimizationContext
prompts
dict
experiment_config
dict[str, typing.Any] | None
empty_score
float | None
n_samples
int | float | str | None
n_samples_strategy
str | None
sampling_tag
str | None

finish_candidate

1finish_candidate(
2 context: OptimizationContext,
3 candidate_handle: Any,
4 score: float | None,
5 metrics: dict[str, typing.Any] | None = None,
6 extras: dict[str, typing.Any] | None = None,
7 candidates: list[dict[str, typing.Any]] | None = None,
8 dataset: str | None = None,
9 dataset_split: str | None = None,
10 trial_index: int | None = None,
11 timestamp: str | None = None,
12 round_handle: typing.Any | None = None
13)

Parameters:

context
OptimizationContext
candidate_handle
Any
score
float | None
metrics
dict[str, typing.Any] | None
extras
dict[str, typing.Any] | None
candidates
list[dict[str, typing.Any]] | None
dataset
str | None
dataset_split
str | None
trial_index
int | None
timestamp
str | None
round_handle
typing.Any | None

finish_round

1finish_round(
2 round_handle: Any,
3 context: opik_optimizer.core.state.OptimizationContext | None = None,
4 best_score: float | None = None,
5 best_candidate: typing.Any | None = None,
6 best_prompt: typing.Any | None = None,
7 stop_reason: str | None = None,
8 extras: dict[str, typing.Any] | None = None,
9 candidates: list[dict[str, typing.Any]] | None = None,
10 timestamp: str | None = None,
11 dataset_split: str | None = None,
12 pareto_front: list[dict[str, typing.Any]] | None = None,
13 selection_meta: dict[str, typing.Any] | None = None
14)

Parameters:

round_handle
Any
context
opik_optimizer.core.state.OptimizationContext | None
best_score
float | None
best_candidate
typing.Any | None
best_prompt
typing.Any | None
stop_reason
str | None
extras
dict[str, typing.Any] | None
candidates
list[dict[str, typing.Any]] | None
timestamp
str | None
dataset_split
str | None
pareto_front
list[dict[str, typing.Any]] | None
selection_meta
dict[str, typing.Any] | None

get_config

1get_config(
2 context: OptimizationContext
3)

Parameters:

context
OptimizationContext

get_default_prompt

1get_default_prompt(
2 key: str
3)

Parameters:

key
str
The prompt key to retrieve

get_history_entries

1get_history_entries()

get_history_rounds

1get_history_rounds()

get_metadata

1get_metadata(
2 context: OptimizationContext
3)

Parameters:

context
OptimizationContext

get_optimizer_metadata

1get_optimizer_metadata()

get_prompt

1get_prompt(
2 key: str,
3 fmt: Any
4)

Parameters:

key
str
The prompt key to retrieve
fmt
Any

list_prompts

1list_prompts()

on_trial

1on_trial(
2 context: OptimizationContext,
3 prompts: dict,
4 score: float,
5 prev_best_score: float | None = None
6)

Parameters:

context
OptimizationContext
prompts
dict
score
float
prev_best_score
float | None

optimize_prompt

optimize_prompt(
    prompt: opik_optimizer.api_objects.chat_prompt.ChatPrompt | dict[str, opik_optimizer.api_objects.chat_prompt.ChatPrompt],
    dataset: Dataset,
    metric: MetricFunction,
    agent: opik_optimizer.agents.optimizable_agent.OptimizableAgent | None = None,
    experiment_config: dict | None = None,
    n_samples: int | float | str | None = None,
    n_samples_minibatch: int | None = None,
    n_samples_strategy: str | None = None,
    auto_continue: bool = False,
    project_name: str | None = None,
    optimization_id: str | None = None,
    validation_dataset: opik.api_objects.dataset.dataset.Dataset | None = None,
    max_trials: int = 10,
    allow_tool_use: bool = True,
    optimize_prompt: bool | str | list[str] | None = 'system',
    *args: Any,
    **kwargs: Any
)

Parameters:

prompt
opik_optimizer.api_objects.chat_prompt.ChatPrompt | dict[str, opik_optimizer.api_objects.chat_prompt.ChatPrompt]
The prompt to optimize (single ChatPrompt or dict of prompts)
dataset
Dataset
Opik dataset used as the training set (provides feedback/context during optimization). Note: this parameter is expected to be superseded by dataset_training in a future release.
metric
MetricFunction
A metric function with signature (dataset_item, llm_output) -> float (see the sketch after this parameter list)
agent
opik_optimizer.agents.optimizable_agent.OptimizableAgent | None
Optional agent for prompt execution (defaults to LiteLLMAgent)
experiment_config
dict | None
Optional configuration for the experiment
n_samples
int | float | str | None
Number of samples to use for evaluation
n_samples_minibatch
int | None
Optional number of samples for inner-loop minibatches
n_samples_strategy
str | None
Sampling strategy name (default “random_sorted”)
auto_continue
boolDefaults to False
Whether to continue optimization automatically
project_name
str | None
Opik project name for logging traces (defaults to OPIK_PROJECT_NAME env or “Optimization”)
optimization_id
str | None
Optional ID to use when creating the Opik optimization run
validation_dataset
opik.api_objects.dataset.dataset.Dataset | None
Optional validation dataset for ranking candidates
max_trials
intDefaults to 10
Maximum number of optimization trials
allow_tool_use
boolDefaults to True
Whether tools may be executed during evaluation (default True)
optimize_prompt
bool | str | list[str] | NoneDefaults to system
Which prompt roles to allow for optimization
args
Any
kwargs
Any
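
A short sketch of the metric contract and of the multi-prompt form of the prompt argument, reusing the optimizer, dataset, and ChatPrompt set up in the earlier sketches (field names and prompt keys are illustrative):

def answer_overlap(dataset_item, llm_output):
    # Illustrative metric: fraction of expected-answer tokens found in the output.
    expected = dataset_item["answer"].lower().split()
    produced = llm_output.lower().split()
    return sum(token in produced for token in expected) / len(expected) if expected else 0.0

prompts = {
    "router": ChatPrompt(system="Decide which tool to call.", user="{question}"),
    "answer": ChatPrompt(system="Answer using the tool result.", user="{question}"),
}

result = optimizer.optimize_prompt(
    prompt=prompts,          # dict[str, ChatPrompt] optimizes several named prompts together
    dataset=dataset,
    metric=answer_overlap,   # any callable with signature (dataset_item, llm_output) -> float
    max_trials=20,
)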

post_baseline

1post_baseline(
2 context: OptimizationContext,
3 score: float
4)

Parameters:

context
OptimizationContext
score
float

post_optimize

1post_optimize(
2 context: OptimizationContext,
3 result: OptimizationResult
4)

Parameters:

context
OptimizationContext
result
OptimizationResult

post_round

1post_round(
2 round_handle: Any,
3 context: opik_optimizer.core.state.OptimizationContext | None = None,
4 best_score: float | None = None,
5 best_candidate: typing.Any | None = None,
6 best_prompt: typing.Any | None = None,
7 stop_reason: str | None = None,
8 extras: dict[str, typing.Any] | None = None,
9 candidates: list[dict[str, typing.Any]] | None = None,
10 timestamp: str | None = None,
11 dataset_split: str | None = None,
12 pareto_front: list[dict[str, typing.Any]] | None = None,
13 selection_meta: dict[str, typing.Any] | None = None
14)

Parameters:

round_handle
Any
context
opik_optimizer.core.state.OptimizationContext | None
best_score
float | None
best_candidate
typing.Any | None
best_prompt
typing.Any | None
stop_reason
str | None
extras
dict[str, typing.Any] | None
candidates
list[dict[str, typing.Any]] | None
timestamp
str | None
dataset_split
str | None
pareto_front
list[dict[str, typing.Any]] | None
selection_meta
dict[str, typing.Any] | None

post_trial

1post_trial(
2 context: OptimizationContext,
3 candidate_handle: Any,
4 score: float | None,
5 metrics: dict[str, typing.Any] | None = None,
6 extras: dict[str, typing.Any] | None = None,
7 candidates: list[dict[str, typing.Any]] | None = None,
8 dataset: str | None = None,
9 dataset_split: str | None = None,
10 trial_index: int | None = None,
11 timestamp: str | None = None,
12 round_handle: typing.Any | None = None
13)

Parameters:

context
OptimizationContext
candidate_handle
Any
score
float | None
metrics
dict[str, typing.Any] | None
extras
dict[str, typing.Any] | None
candidates
list[dict[str, typing.Any]] | None
dataset
str | None
dataset_split
str | None
trial_index
int | None
timestamp
str | None
round_handle
typing.Any | None

pre_baseline

1pre_baseline(
2 context: OptimizationContext
3)

Parameters:

context
OptimizationContext

pre_optimize

1pre_optimize(
2 context: OptimizationContext
3)

Parameters:

context
OptimizationContext
The optimization context

pre_round

1pre_round(
2 context: OptimizationContext,
3 extras: Any
4)

Parameters:

context
OptimizationContext
extras
Any

pre_trial

1pre_trial(
2 context: OptimizationContext,
3 candidate: Any,
4 round_handle: typing.Any | None = None
5)

Parameters:

context
OptimizationContext
candidate
Any
round_handle
typing.Any | None

record_candidate_entry

1record_candidate_entry(
2 prompt_or_payload: Any,
3 score: float | None = None,
4 id: str | None = None,
5 metrics: dict[str, typing.Any] | None = None,
6 notes: str | None = None,
7 extra: dict[str, typing.Any] | None = None,
8 context: opik_optimizer.core.state.OptimizationContext | None = None
9)

Parameters:

prompt_or_payload
Any
score
float | None
id
str | None
metrics
dict[str, typing.Any] | None
notes
str | None
extra
dict[str, typing.Any] | None
context
opik_optimizer.core.state.OptimizationContext | None

run_optimization

1run_optimization(
2 context: OptimizationContext
3)

Parameters:

context
OptimizationContext
The optimization context with prompts, dataset, metric, etc.

set_default_dataset_split

1set_default_dataset_split(
2 dataset_split: str | None
3)

Parameters:

dataset_split
str | None

set_pareto_front

1set_pareto_front(
2 pareto_front: list[dict[str, typing.Any]] | None
3)

Parameters:

pareto_front
list[dict[str, typing.Any]] | None

set_selection_meta

1set_selection_meta(
2 selection_meta: dict[str, typing.Any] | None
3)

Parameters:

selection_meta
dict[str, typing.Any] | None

start_candidate

1start_candidate(
2 context: OptimizationContext,
3 candidate: Any,
4 round_handle: typing.Any | None = None
5)

Parameters:

context
OptimizationContext
candidate
Any
round_handle
typing.Any | None

with_dataset_split

1with_dataset_split(
2 dataset_split: str | None
3)

Parameters:

dataset_split
str | None

GepaOptimizer

GepaOptimizer(
    model: str = 'gpt-4o',
    model_parameters: dict[str, typing.Any] | None = None,
    n_threads: int = 12,
    verbose: int = 1,
    seed: int = 42,
    name: str | None = None,
    skip_perfect_score: bool = True,
    perfect_score: float = 0.95,
    prompt_overrides: dict[str, str] | collections.abc.Callable[[opik_optimizer.utils.prompt_library.PromptLibrary], None] | None = None
)

Parameters:

model
strDefaults to gpt-4o
LiteLLM model name for the optimization algorithm
model_parameters
dict[str, typing.Any] | None
Optional dict of LiteLLM parameters for optimizer’s internal LLM calls. Common params: temperature, max_tokens, max_completion_tokens, top_p.
n_threads
intDefaults to 12
Number of parallel threads for evaluation
verbose
intDefaults to 1
Controls internal logging/progress bars (0=off, 1=on)
seed
intDefaults to 42
Random seed for reproducibility
name
str | None
skip_perfect_score
boolDefaults to True
perfect_score
floatDefaults to 0.95
prompt_overrides
dict[str, str] | collections.abc.Callable[[opik_optimizer.utils.prompt_library.PromptLibrary], None] | None
Accepted for API parity, but ignored (GEPA does not expose prompt hooks).
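
A minimal construction sketch (illustrative values; the import path is assumed to match the other optimizers):

from opik_optimizer import GepaOptimizer

optimizer = GepaOptimizer(
    model="gpt-4o-mini",
    model_parameters={"temperature": 0.3},
    n_threads=8,
    seed=7,
    # prompt_overrides is accepted for API parity but ignored by GEPA.
)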

Methods

begin_round

1begin_round(
2 context: OptimizationContext,
3 extras: Any
4)

Parameters:

context
OptimizationContext
extras
Any

cleanup

1cleanup()

evaluate

1evaluate(
2 context: OptimizationContext,
3 prompts: dict,
4 experiment_config: dict[str, typing.Any] | None = None,
5 sampling_tag: str | None = None
6)

Parameters:

context
OptimizationContext
Optimization context for this run.
prompts
dict
Dict of named prompts to evaluate (e.g., {“main”: ChatPrompt(…)}). Single-prompt optimizations use a dict with one entry.
experiment_config
dict[str, typing.Any] | None
Optional experiment configuration.
sampling_tag
str | None
Optional sampling tag for deterministic subsampling per candidate.

evaluate_prompt

1evaluate_prompt(
2 prompt: opik_optimizer.api_objects.chat_prompt.ChatPrompt | dict[str, opik_optimizer.api_objects.chat_prompt.ChatPrompt],
3 dataset: Dataset,
4 metric: MetricFunction,
5 agent: opik_optimizer.agents.optimizable_agent.OptimizableAgent | None = None,
6 n_threads: int | None = None,
7 verbose: int = 1,
8 dataset_item_ids: list[str] | None = None,
9 experiment_config: dict | None = None,
10 n_samples: int | float | str | None = None,
11 n_samples_strategy: str | None = None,
12 seed: int | None = None,
13 return_evaluation_result: bool = False,
14 allow_tool_use: bool | None = None,
15 use_evaluate_on_dict_items: bool | None = None,
16 sampling_tag: str | None = None
17)

Parameters:

prompt
opik_optimizer.api_objects.chat_prompt.ChatPrompt | dict[str, opik_optimizer.api_objects.chat_prompt.ChatPrompt]
dataset
Dataset
metric
MetricFunction
agent
opik_optimizer.agents.optimizable_agent.OptimizableAgent | None
n_threads
int | None
verbose
intDefaults to 1
dataset_item_ids
list[str] | None
experiment_config
dict | None
n_samples
int | float | str | None
n_samples_strategy
str | None
seed
int | None
return_evaluation_result
boolDefaults to False
allow_tool_use
bool | None
use_evaluate_on_dict_items
bool | None
sampling_tag
str | None

evaluate_with_result

1evaluate_with_result(
2 context: OptimizationContext,
3 prompts: dict,
4 experiment_config: dict[str, typing.Any] | None = None,
5 empty_score: float | None = None,
6 n_samples: int | float | str | None = None,
7 n_samples_strategy: str | None = None,
8 sampling_tag: str | None = None
9)

Parameters:

context
OptimizationContext
prompts
dict
experiment_config
dict[str, typing.Any] | None
empty_score
float | None
n_samples
int | float | str | None
n_samples_strategy
str | None
sampling_tag
str | None

finish_candidate

1finish_candidate(
2 context: OptimizationContext,
3 candidate_handle: Any,
4 score: float | None,
5 metrics: dict[str, typing.Any] | None = None,
6 extras: dict[str, typing.Any] | None = None,
7 candidates: list[dict[str, typing.Any]] | None = None,
8 dataset: str | None = None,
9 dataset_split: str | None = None,
10 trial_index: int | None = None,
11 timestamp: str | None = None,
12 round_handle: typing.Any | None = None
13)

Parameters:

context
OptimizationContext
candidate_handle
Any
score
float | None
metrics
dict[str, typing.Any] | None
extras
dict[str, typing.Any] | None
candidates
list[dict[str, typing.Any]] | None
dataset
str | None
dataset_split
str | None
trial_index
int | None
timestamp
str | None
round_handle
typing.Any | None

finish_round

1finish_round(
2 round_handle: Any,
3 context: opik_optimizer.core.state.OptimizationContext | None = None,
4 best_score: float | None = None,
5 best_candidate: typing.Any | None = None,
6 best_prompt: typing.Any | None = None,
7 stop_reason: str | None = None,
8 extras: dict[str, typing.Any] | None = None,
9 candidates: list[dict[str, typing.Any]] | None = None,
10 timestamp: str | None = None,
11 dataset_split: str | None = None,
12 pareto_front: list[dict[str, typing.Any]] | None = None,
13 selection_meta: dict[str, typing.Any] | None = None
14)

Parameters:

round_handle
Any
context
opik_optimizer.core.state.OptimizationContext | None
best_score
float | None
best_candidate
typing.Any | None
best_prompt
typing.Any | None
stop_reason
str | None
extras
dict[str, typing.Any] | None
candidates
list[dict[str, typing.Any]] | None
timestamp
str | None
dataset_split
str | None
pareto_front
list[dict[str, typing.Any]] | None
selection_meta
dict[str, typing.Any] | None

get_config

1get_config(
2 context: OptimizationContext
3)

Parameters:

context
OptimizationContext

get_default_prompt

1get_default_prompt(
2 key: str
3)

Parameters:

key
str
The prompt key to retrieve

get_history_entries

1get_history_entries()

get_history_rounds

1get_history_rounds()

get_metadata

1get_metadata(
2 context: OptimizationContext
3)

Parameters:

context
OptimizationContext

get_optimizer_metadata

1get_optimizer_metadata()

get_prompt

1get_prompt(
2 key: str,
3 fmt: Any
4)

Parameters:

key
str
The prompt key to retrieve
fmt
Any

list_prompts

1list_prompts()

on_trial

1on_trial(
2 context: OptimizationContext,
3 prompts: dict,
4 score: float,
5 prev_best_score: float | None = None
6)

Parameters:

context
OptimizationContext
prompts
dict
score
float
prev_best_score
float | None

optimize_prompt

optimize_prompt(
    prompt: opik_optimizer.api_objects.chat_prompt.ChatPrompt | dict[str, opik_optimizer.api_objects.chat_prompt.ChatPrompt],
    dataset: Dataset,
    metric: MetricFunction,
    agent: opik_optimizer.agents.optimizable_agent.OptimizableAgent | None = None,
    experiment_config: dict | None = None,
    n_samples: int | float | str | None = None,
    n_samples_minibatch: int | None = None,
    n_samples_strategy: str | None = None,
    auto_continue: bool = False,
    project_name: str | None = None,
    optimization_id: str | None = None,
    validation_dataset: opik.api_objects.dataset.dataset.Dataset | None = None,
    max_trials: int = 10,
    allow_tool_use: bool = True,
    optimize_prompt: bool | str | list[str] | None = 'system',
    *args: Any,
    **kwargs: Any
)

Parameters:

prompt
opik_optimizer.api_objects.chat_prompt.ChatPrompt | dict[str, opik_optimizer.api_objects.chat_prompt.ChatPrompt]
The prompt to optimize (single ChatPrompt or dict of prompts)
dataset
Dataset
Opik dataset used as the training set (provides feedback/context during optimization). Note: this parameter is expected to be superseded by dataset_training in a future release.
metric
MetricFunction
A metric function with signature (dataset_item, llm_output) -> float
agent
opik_optimizer.agents.optimizable_agent.OptimizableAgent | None
Optional agent for prompt execution (defaults to LiteLLMAgent)
experiment_config
dict | None
Optional configuration for the experiment
n_samples
int | float | str | None
Number of samples to use for evaluation
n_samples_minibatch
int | None
Optional number of samples for inner-loop minibatches
n_samples_strategy
str | None
Sampling strategy name (default “random_sorted”)
auto_continue
boolDefaults to False
Whether to continue optimization automatically
project_name
str | None
Opik project name for logging traces (defaults to OPIK_PROJECT_NAME env or “Optimization”)
optimization_id
str | None
Optional ID to use when creating the Opik optimization run
validation_dataset
opik.api_objects.dataset.dataset.Dataset | None
Optional validation dataset for ranking candidates
max_trials
intDefaults to 10
Maximum number of optimization trials
allow_tool_use
boolDefaults to True
Whether tools may be executed during evaluation (default True)
optimize_prompt
bool | str | list[str] | NoneDefaults to system
Which prompt roles to allow for optimization
args
Any
kwargs
Any

post_baseline

1post_baseline(
2 context: OptimizationContext,
3 score: float
4)

Parameters:

context
OptimizationContext
score
float

post_optimize

1post_optimize(
2 context: OptimizationContext,
3 result: OptimizationResult
4)

Parameters:

context
OptimizationContext
result
OptimizationResult

post_round

1post_round(
2 round_handle: Any,
3 context: opik_optimizer.core.state.OptimizationContext | None = None,
4 best_score: float | None = None,
5 best_candidate: typing.Any | None = None,
6 best_prompt: typing.Any | None = None,
7 stop_reason: str | None = None,
8 extras: dict[str, typing.Any] | None = None,
9 candidates: list[dict[str, typing.Any]] | None = None,
10 timestamp: str | None = None,
11 dataset_split: str | None = None,
12 pareto_front: list[dict[str, typing.Any]] | None = None,
13 selection_meta: dict[str, typing.Any] | None = None
14)

Parameters:

round_handle
Any
context
opik_optimizer.core.state.OptimizationContext | None
best_score
float | None
best_candidate
typing.Any | None
best_prompt
typing.Any | None
stop_reason
str | None
extras
dict[str, typing.Any] | None
candidates
list[dict[str, typing.Any]] | None
timestamp
str | None
dataset_split
str | None
pareto_front
list[dict[str, typing.Any]] | None
selection_meta
dict[str, typing.Any] | None

post_trial

1post_trial(
2 context: OptimizationContext,
3 candidate_handle: Any,
4 score: float | None,
5 metrics: dict[str, typing.Any] | None = None,
6 extras: dict[str, typing.Any] | None = None,
7 candidates: list[dict[str, typing.Any]] | None = None,
8 dataset: str | None = None,
9 dataset_split: str | None = None,
10 trial_index: int | None = None,
11 timestamp: str | None = None,
12 round_handle: typing.Any | None = None
13)

Parameters:

context
OptimizationContext
candidate_handle
Any
score
float | None
metrics
dict[str, typing.Any] | None
extras
dict[str, typing.Any] | None
candidates
list[dict[str, typing.Any]] | None
dataset
str | None
dataset_split
str | None
trial_index
int | None
timestamp
str | None
round_handle
typing.Any | None

pre_baseline

1pre_baseline(
2 context: OptimizationContext
3)

Parameters:

context
OptimizationContext

pre_optimize

1pre_optimize(
2 context: OptimizationContext
3)

Parameters:

context
OptimizationContext

pre_round

1pre_round(
2 context: OptimizationContext,
3 extras: Any
4)

Parameters:

context
OptimizationContext
extras
Any

pre_trial

1pre_trial(
2 context: OptimizationContext,
3 candidate: Any,
4 round_handle: typing.Any | None = None
5)

Parameters:

context
OptimizationContext
candidate
Any
round_handle
typing.Any | None

record_candidate_entry

1record_candidate_entry(
2 prompt_or_payload: Any,
3 score: float | None = None,
4 id: str | None = None,
5 metrics: dict[str, typing.Any] | None = None,
6 notes: str | None = None,
7 extra: dict[str, typing.Any] | None = None,
8 context: opik_optimizer.core.state.OptimizationContext | None = None
9)

Parameters:

prompt_or_payload
Any
score
float | None
id
str | None
metrics
dict[str, typing.Any] | None
notes
str | None
extra
dict[str, typing.Any] | None
context
opik_optimizer.core.state.OptimizationContext | None

run_optimization

1run_optimization(
2 context: OptimizationContext
3)

Parameters:

context
OptimizationContext
The optimization context with prompts, dataset, metric, etc.

set_default_dataset_split

1set_default_dataset_split(
2 dataset_split: str | None
3)

Parameters:

dataset_split
str | None

set_pareto_front

1set_pareto_front(
2 pareto_front: list[dict[str, typing.Any]] | None
3)

Parameters:

pareto_front
list[dict[str, typing.Any]] | None

set_selection_meta

1set_selection_meta(
2 selection_meta: dict[str, typing.Any] | None
3)

Parameters:

selection_meta
dict[str, typing.Any] | None

start_candidate

1start_candidate(
2 context: OptimizationContext,
3 candidate: Any,
4 round_handle: typing.Any | None = None
5)

Parameters:

context
OptimizationContext
candidate
Any
round_handle
typing.Any | None

with_dataset_split

1with_dataset_split(
2 dataset_split: str | None
3)

Parameters:

dataset_split
str | None

MetaPromptOptimizer

MetaPromptOptimizer(
    model: str = 'gpt-4o',
    model_parameters: dict[str, typing.Any] | None = None,
    prompts_per_round: int = 4,
    enable_context: bool = True,
    num_task_examples: int = 5,
    task_context_columns: list[str] | None = None,
    n_threads: int = 12,
    verbose: int = 1,
    seed: int = 42,
    name: str | None = None,
    use_hall_of_fame: bool = True,
    prompt_overrides: dict[str, str] | collections.abc.Callable[[opik_optimizer.utils.prompt_library.PromptLibrary], None] | None = None,
    skip_perfect_score: bool = True,
    perfect_score: float = 0.95
)

Parameters:

model
strDefaults to gpt-4o
LiteLLM model name for optimizer’s internal reasoning/generation calls
model_parameters
dict[str, typing.Any] | None
Optional dict of LiteLLM parameters for optimizer’s internal LLM calls. Common params: temperature, max_tokens, max_completion_tokens, top_p.
prompts_per_round
intDefaults to 4
Number of candidate prompts to generate per optimization round
enable_context
boolDefaults to True
Whether to include task-specific context learning when reasoning
num_task_examples
intDefaults to 5
Number of dataset examples to show in the task context
task_context_columns
list[str] | None
Specific dataset columns to include in context (None = all input columns)
n_threads
intDefaults to 12
Number of parallel threads for prompt evaluation
verbose
intDefaults to 1
Controls internal logging/progress bars (0=off, 1=on)
seed
intDefaults to 42
Random seed for reproducibility
name
str | None
use_hall_of_fame
boolDefaults to True
Enable Hall of Fame pattern extraction and re-injection
prompt_overrides
dict[str, str] | collections.abc.Callable[[opik_optimizer.utils.prompt_library.PromptLibrary], None] | None
Optional dict or callable to customize internal prompts.
skip_perfect_score
boolDefaults to True
perfect_score
floatDefaults to 0.95
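
A minimal construction sketch (column names are illustrative and must exist in your dataset):

from opik_optimizer import MetaPromptOptimizer

optimizer = MetaPromptOptimizer(
    model="gpt-4o",
    prompts_per_round=4,
    enable_context=True,
    num_task_examples=5,
    task_context_columns=["question", "answer"],  # illustrative dataset columns
    use_hall_of_fame=True,
    n_threads=8,
)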

Methods

begin_round

1begin_round(
2 context: OptimizationContext,
3 extras: Any
4)

Parameters:

context
OptimizationContext
extras
Any

cleanup

1cleanup()

evaluate

1evaluate(
2 context: OptimizationContext,
3 prompts: dict,
4 experiment_config: dict[str, typing.Any] | None = None,
5 sampling_tag: str | None = None
6)

Parameters:

context
OptimizationContext
Optimization context for this run.
prompts
dict
Dict of named prompts to evaluate (e.g., {“main”: ChatPrompt(…)}). Single-prompt optimizations use a dict with one entry.
experiment_config
dict[str, typing.Any] | None
Optional experiment configuration.
sampling_tag
str | None
Optional sampling tag for deterministic subsampling per candidate.

evaluate_prompt

1evaluate_prompt(
2 prompt: opik_optimizer.api_objects.chat_prompt.ChatPrompt | dict[str, opik_optimizer.api_objects.chat_prompt.ChatPrompt],
3 dataset: Dataset,
4 metric: MetricFunction,
5 agent: opik_optimizer.agents.optimizable_agent.OptimizableAgent | None = None,
6 n_threads: int | None = None,
7 verbose: int = 1,
8 dataset_item_ids: list[str] | None = None,
9 experiment_config: dict | None = None,
10 n_samples: int | float | str | None = None,
11 n_samples_strategy: str | None = None,
12 seed: int | None = None,
13 return_evaluation_result: bool = False,
14 allow_tool_use: bool | None = None,
15 use_evaluate_on_dict_items: bool | None = None,
16 sampling_tag: str | None = None
17)

Parameters:

prompt
opik_optimizer.api_objects.chat_prompt.ChatPrompt | dict[str, opik_optimizer.api_objects.chat_prompt.ChatPrompt]
dataset
Dataset
metric
MetricFunction
agent
opik_optimizer.agents.optimizable_agent.OptimizableAgent | None
n_threads
int | None
verbose
intDefaults to 1
dataset_item_ids
list[str] | None
experiment_config
dict | None
n_samples
int | float | str | None
n_samples_strategy
str | None
seed
int | None
return_evaluation_result
boolDefaults to False
allow_tool_use
bool | None
use_evaluate_on_dict_items
bool | None
sampling_tag
str | None

evaluate_with_result

1evaluate_with_result(
2 context: OptimizationContext,
3 prompts: dict,
4 experiment_config: dict[str, typing.Any] | None = None,
5 empty_score: float | None = None,
6 n_samples: int | float | str | None = None,
7 n_samples_strategy: str | None = None,
8 sampling_tag: str | None = None
9)

Parameters:

context
OptimizationContext
prompts
dict
experiment_config
dict[str, typing.Any] | None
empty_score
float | None
n_samples
int | float | str | None
n_samples_strategy
str | None
sampling_tag
str | None

finish_candidate

1finish_candidate(
2 context: OptimizationContext,
3 candidate_handle: Any,
4 score: float | None,
5 metrics: dict[str, typing.Any] | None = None,
6 extras: dict[str, typing.Any] | None = None,
7 candidates: list[dict[str, typing.Any]] | None = None,
8 dataset: str | None = None,
9 dataset_split: str | None = None,
10 trial_index: int | None = None,
11 timestamp: str | None = None,
12 round_handle: typing.Any | None = None
13)

Parameters:

context
OptimizationContext
candidate_handle
Any
score
float | None
metrics
dict[str, typing.Any] | None
extras
dict[str, typing.Any] | None
candidates
list[dict[str, typing.Any]] | None
dataset
str | None
dataset_split
str | None
trial_index
int | None
timestamp
str | None
round_handle
typing.Any | None

finish_round

1finish_round(
2 round_handle: Any,
3 context: opik_optimizer.core.state.OptimizationContext | None = None,
4 best_score: float | None = None,
5 best_candidate: typing.Any | None = None,
6 best_prompt: typing.Any | None = None,
7 stop_reason: str | None = None,
8 extras: dict[str, typing.Any] | None = None,
9 candidates: list[dict[str, typing.Any]] | None = None,
10 timestamp: str | None = None,
11 dataset_split: str | None = None,
12 pareto_front: list[dict[str, typing.Any]] | None = None,
13 selection_meta: dict[str, typing.Any] | None = None
14)

Parameters:

round_handle
Any
context
opik_optimizer.core.state.OptimizationContext | None
best_score
float | None
best_candidate
typing.Any | None
best_prompt
typing.Any | None
stop_reason
str | None
extras
dict[str, typing.Any] | None
candidates
list[dict[str, typing.Any]] | None
timestamp
str | None
dataset_split
str | None
pareto_front
list[dict[str, typing.Any]] | None
selection_meta
dict[str, typing.Any] | None

get_config

1get_config(
2 context: OptimizationContext
3)

Parameters:

context
OptimizationContext

get_default_prompt

1get_default_prompt(
2 key: str
3)

Parameters:

key
str
The prompt key to retrieve

get_history_entries

1get_history_entries()

get_history_rounds

1get_history_rounds()

get_metadata

1get_metadata(
2 context: OptimizationContext
3)

Parameters:

context
OptimizationContext

get_optimizer_metadata

1get_optimizer_metadata()

get_prompt

1get_prompt(
2 key: str,
3 fmt: Any
4)

Parameters:

key
str
The prompt key to retrieve
fmt
Any

list_prompts

1list_prompts()

on_trial

1on_trial(
2 context: OptimizationContext,
3 prompts: dict,
4 score: float,
5 prev_best_score: float | None = None
6)

Parameters:

context
OptimizationContext
prompts
dict
score
float
prev_best_score
float | None

optimize_prompt

optimize_prompt(
    prompt: opik_optimizer.api_objects.chat_prompt.ChatPrompt | dict[str, opik_optimizer.api_objects.chat_prompt.ChatPrompt],
    dataset: Dataset,
    metric: MetricFunction,
    agent: opik_optimizer.agents.optimizable_agent.OptimizableAgent | None = None,
    experiment_config: dict | None = None,
    n_samples: int | float | str | None = None,
    n_samples_minibatch: int | None = None,
    n_samples_strategy: str | None = None,
    auto_continue: bool = False,
    project_name: str | None = None,
    optimization_id: str | None = None,
    validation_dataset: opik.api_objects.dataset.dataset.Dataset | None = None,
    max_trials: int = 10,
    allow_tool_use: bool = True,
    optimize_prompt: bool | str | list[str] | None = 'system',
    *args: Any,
    **kwargs: Any
)

Parameters:

prompt
opik_optimizer.api_objects.chat_prompt.ChatPrompt | dict[str, opik_optimizer.api_objects.chat_prompt.ChatPrompt]
The prompt to optimize (single ChatPrompt or dict of prompts)
dataset
Dataset
Opik dataset used as the training set (provides feedback/context during optimization). Note: this parameter is expected to be superseded by dataset_training in a future release.
metric
MetricFunction
A metric function with signature (dataset_item, llm_output) -> float
agent
opik_optimizer.agents.optimizable_agent.OptimizableAgent | None
Optional agent for prompt execution (defaults to LiteLLMAgent)
experiment_config
dict | None
Optional configuration for the experiment
n_samples
int | float | str | None
Number of samples to use for evaluation
n_samples_minibatch
int | None
Optional number of samples for inner-loop minibatches
n_samples_strategy
str | None
Sampling strategy name (default “random_sorted”)
auto_continue
boolDefaults to False
Whether to continue optimization automatically
project_name
str | None
Opik project name for logging traces (defaults to OPIK_PROJECT_NAME env or “Optimization”)
optimization_id
str | None
Optional ID to use when creating the Opik optimization run
validation_dataset
opik.api_objects.dataset.dataset.Dataset | None
Optional validation dataset for ranking candidates
max_trials
intDefaults to 10
Maximum number of optimization trials
allow_tool_use
boolDefaults to True
Whether tools may be executed during evaluation (default True)
optimize_prompt
bool | str | list[str] | NoneDefaults to system
Which prompt roles to allow for optimization
args
Any
kwargs
Any

post_baseline

1post_baseline(
2 context: OptimizationContext,
3 score: float
4)

Parameters:

context
OptimizationContext
score
float

post_optimize

1post_optimize(
2 context: OptimizationContext,
3 result: OptimizationResult
4)

Parameters:

context
OptimizationContext
result
OptimizationResult

post_round

1post_round(
2 round_handle: Any,
3 context: opik_optimizer.core.state.OptimizationContext | None = None,
4 best_score: float | None = None,
5 best_candidate: typing.Any | None = None,
6 best_prompt: typing.Any | None = None,
7 stop_reason: str | None = None,
8 extras: dict[str, typing.Any] | None = None,
9 candidates: list[dict[str, typing.Any]] | None = None,
10 timestamp: str | None = None,
11 dataset_split: str | None = None,
12 pareto_front: list[dict[str, typing.Any]] | None = None,
13 selection_meta: dict[str, typing.Any] | None = None
14)

Parameters:

round_handle
Any
context
opik_optimizer.core.state.OptimizationContext | None
best_score
float | None
best_candidate
typing.Any | None
best_prompt
typing.Any | None
stop_reason
str | None
extras
dict[str, typing.Any] | None
candidates
list[dict[str, typing.Any]] | None
timestamp
str | None
dataset_split
str | None
pareto_front
list[dict[str, typing.Any]] | None
selection_meta
dict[str, typing.Any] | None

post_trial

1post_trial(
2 context: OptimizationContext,
3 candidate_handle: Any,
4 score: float | None,
5 metrics: dict[str, typing.Any] | None = None,
6 extras: dict[str, typing.Any] | None = None,
7 candidates: list[dict[str, typing.Any]] | None = None,
8 dataset: str | None = None,
9 dataset_split: str | None = None,
10 trial_index: int | None = None,
11 timestamp: str | None = None,
12 round_handle: typing.Any | None = None
13)

Parameters:

context
OptimizationContext
candidate_handle
Any
score
float | None
metrics
dict[str, typing.Any] | None
extras
dict[str, typing.Any] | None
candidates
list[dict[str, typing.Any]] | None
dataset
str | None
dataset_split
str | None
trial_index
int | None
timestamp
str | None
round_handle
typing.Any | None

pre_baseline

1pre_baseline(
2 context: OptimizationContext
3)

Parameters:

context
OptimizationContext

pre_optimize

1pre_optimize(
2 context: OptimizationContext
3)

Parameters:

context
OptimizationContext
The optimization context

pre_round

1pre_round(
2 context: OptimizationContext,
3 extras: Any
4)

Parameters:

context
OptimizationContext
extras
Any

pre_trial

1pre_trial(
2 context: OptimizationContext,
3 candidate: Any,
4 round_handle: typing.Any | None = None
5)

Parameters:

context
OptimizationContext
candidate
Any
round_handle
typing.Any | None

record_candidate_entry

1record_candidate_entry(
2 prompt_or_payload: Any,
3 score: float | None = None,
4 id: str | None = None,
5 metrics: dict[str, typing.Any] | None = None,
6 notes: str | None = None,
7 extra: dict[str, typing.Any] | None = None,
8 context: opik_optimizer.core.state.OptimizationContext | None = None
9)

Parameters:

prompt_or_payload
Any
score
float | None
id
str | None
metrics
dict[str, typing.Any] | None
notes
str | None
extra
dict[str, typing.Any] | None
context
opik_optimizer.core.state.OptimizationContext | None

run_optimization

1run_optimization(
2 context: OptimizationContext
3)

Parameters:

context
OptimizationContext
The optimization context with prompts, dataset, metric, etc.

set_default_dataset_split

1set_default_dataset_split(
2 dataset_split: str | None
3)

Parameters:

dataset_split
str | None

set_pareto_front

1set_pareto_front(
2 pareto_front: list[dict[str, typing.Any]] | None
3)

Parameters:

pareto_front
list[dict[str, typing.Any]] | None

set_selection_meta

1set_selection_meta(
2 selection_meta: dict[str, typing.Any] | None
3)

Parameters:

selection_meta
dict[str, typing.Any] | None

start_candidate

1start_candidate(
2 context: OptimizationContext,
3 candidate: Any,
4 round_handle: typing.Any | None = None
5)

Parameters:

context
OptimizationContext
candidate
Any
round_handle
typing.Any | None

with_dataset_split

1with_dataset_split(
2 dataset_split: str | None
3)

Parameters:

dataset_split
str | None

EvolutionaryOptimizer

EvolutionaryOptimizer(
    model: str = 'gpt-4o',
    model_parameters: dict[str, typing.Any] | None = None,
    population_size: int = 30,
    num_generations: int = 15,
    mutation_rate: float = 0.2,
    crossover_rate: float = 0.8,
    tournament_size: int = 4,
    elitism_size: int = 3,
    adaptive_mutation: bool = True,
    enable_moo: bool = True,
    enable_llm_crossover: bool = True,
    enable_semantic_crossover: bool = False,
    output_style_guidance: str | None = None,
    infer_output_style: bool = False,
    n_threads: int = 12,
    verbose: int = 1,
    seed: int = 42,
    name: str | None = None,
    prompt_overrides: dict[str, str] | collections.abc.Callable[[opik_optimizer.utils.prompt_library.PromptLibrary], None] | None = None,
    skip_perfect_score: bool = True,
    perfect_score: float = 0.95
)

Parameters:

model
strDefaults to gpt-4o
LiteLLM model name for optimizer’s internal operations (mutations, crossover, etc.)
model_parameters
dict[str, typing.Any] | None
Optional dict of LiteLLM parameters for optimizer’s internal LLM calls. Common params: temperature, max_tokens, max_completion_tokens, top_p.
population_size
intDefaults to 30
Number of prompts in the population
num_generations
intDefaults to 15
Number of generations to run
mutation_rate
floatDefaults to 0.2
Mutation rate for genetic operations
crossover_rate
floatDefaults to 0.8
Crossover rate for genetic operations
tournament_size
intDefaults to 4
Tournament size for selection
elitism_size
intDefaults to 3
Number of elite prompts to preserve across generations
adaptive_mutation
boolDefaults to True
Whether to use adaptive mutation that adjusts based on population diversity
enable_moo
boolDefaults to True
Whether to enable multi-objective optimization (optimizes metric and prompt length)
enable_llm_crossover
boolDefaults to True
Whether to enable LLM-based crossover operations
enable_semantic_crossover
boolDefaults to False
Whether to use semantic crossover before standard LLM crossover
output_style_guidance
str | None
Optional guidance for output style in generated prompts
infer_output_style
boolDefaults to False
Whether to automatically infer output style from the dataset
n_threads
intDefaults to 12
Number of threads for parallel evaluation
verbose
intDefaults to 1
Controls internal logging/progress bars (0=off, 1=on)
seed
intDefaults to 42
Random seed for reproducibility
name
str | None
prompt_overrides
dict[str, str] | collections.abc.Callable[[opik_optimizer.utils.prompt_library.PromptLibrary], None] | None
Optional dict or callable to customize internal prompts.
skip_perfect_score
boolDefaults to True
perfect_score
floatDefaults to 0.95
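
A minimal construction sketch (illustrative values):

from opik_optimizer import EvolutionaryOptimizer

optimizer = EvolutionaryOptimizer(
    model="gpt-4o-mini",
    population_size=20,
    num_generations=10,
    mutation_rate=0.2,
    crossover_rate=0.8,
    elitism_size=3,
    enable_moo=True,             # also tracks prompt length as a second objective
    enable_llm_crossover=True,
    n_threads=8,
)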

Methods

begin_round

1begin_round(
2 context: OptimizationContext,
3 extras: Any
4)

Parameters:

context
OptimizationContext
extras
Any

cleanup

1cleanup()

evaluate

1evaluate(
2 context: OptimizationContext,
3 prompts: dict,
4 experiment_config: dict[str, typing.Any] | None = None,
5 sampling_tag: str | None = None
6)

Parameters:

context
OptimizationContext
Optimization context for this run.
prompts
dict
Dict of named prompts to evaluate (e.g., {“main”: ChatPrompt(…)}). Single-prompt optimizations use a dict with one entry.
experiment_config
dict[str, typing.Any] | None
Optional experiment configuration.
sampling_tag
str | None
Optional sampling tag for deterministic subsampling per candidate.

evaluate_prompt

1evaluate_prompt(
2 prompt: opik_optimizer.api_objects.chat_prompt.ChatPrompt | dict[str, opik_optimizer.api_objects.chat_prompt.ChatPrompt],
3 dataset: Dataset,
4 metric: MetricFunction,
5 agent: opik_optimizer.agents.optimizable_agent.OptimizableAgent | None = None,
6 n_threads: int | None = None,
7 verbose: int = 1,
8 dataset_item_ids: list[str] | None = None,
9 experiment_config: dict | None = None,
10 n_samples: int | float | str | None = None,
11 n_samples_strategy: str | None = None,
12 seed: int | None = None,
13 return_evaluation_result: bool = False,
14 allow_tool_use: bool | None = None,
15 use_evaluate_on_dict_items: bool | None = None,
16 sampling_tag: str | None = None
17)

Parameters:

prompt
opik_optimizer.api_objects.chat_prompt.ChatPrompt | dict[str, opik_optimizer.api_objects.chat_prompt.ChatPrompt]
dataset
Dataset
metric
MetricFunction
agent
opik_optimizer.agents.optimizable_agent.OptimizableAgent | None
n_threads
int | None
verbose
intDefaults to 1
dataset_item_ids
list[str] | None
experiment_config
dict | None
n_samples
int | float | str | None
n_samples_strategy
str | None
seed
int | None
return_evaluation_result
boolDefaults to False
allow_tool_use
bool | None
use_evaluate_on_dict_items
bool | None
sampling_tag
str | None

evaluate_with_result

1evaluate_with_result(
2 context: OptimizationContext,
3 prompts: dict,
4 experiment_config: dict[str, typing.Any] | None = None,
5 empty_score: float | None = None,
6 n_samples: int | float | str | None = None,
7 n_samples_strategy: str | None = None,
8 sampling_tag: str | None = None
9)

Parameters:

context
OptimizationContext
prompts
dict
experiment_config
dict[str, typing.Any] | None
empty_score
float | None
n_samples
int | float | str | None
n_samples_strategy
str | None
sampling_tag
str | None

finish_candidate

1finish_candidate(
2 context: OptimizationContext,
3 candidate_handle: Any,
4 score: float | None,
5 metrics: dict[str, typing.Any] | None = None,
6 extras: dict[str, typing.Any] | None = None,
7 candidates: list[dict[str, typing.Any]] | None = None,
8 dataset: str | None = None,
9 dataset_split: str | None = None,
10 trial_index: int | None = None,
11 timestamp: str | None = None,
12 round_handle: typing.Any | None = None
13)

Parameters:

context
OptimizationContext
candidate_handle
Any
score
float | None
metrics
dict[str, typing.Any] | None
extras
dict[str, typing.Any] | None
candidates
list[dict[str, typing.Any]] | None
dataset
str | None
dataset_split
str | None
trial_index
int | None
timestamp
str | None
round_handle
typing.Any | None

finish_round

1finish_round(
2 round_handle: Any,
3 context: opik_optimizer.core.state.OptimizationContext | None = None,
4 best_score: float | None = None,
5 best_candidate: typing.Any | None = None,
6 best_prompt: typing.Any | None = None,
7 stop_reason: str | None = None,
8 extras: dict[str, typing.Any] | None = None,
9 candidates: list[dict[str, typing.Any]] | None = None,
10 timestamp: str | None = None,
11 dataset_split: str | None = None,
12 pareto_front: list[dict[str, typing.Any]] | None = None,
13 selection_meta: dict[str, typing.Any] | None = None
14)

Parameters:

round_handle
Any
context
opik_optimizer.core.state.OptimizationContext | None
best_score
float | None
best_candidate
typing.Any | None
best_prompt
typing.Any | None
stop_reason
str | None
extras
dict[str, typing.Any] | None
candidates
list[dict[str, typing.Any]] | None
timestamp
str | None
dataset_split
str | None
pareto_front
list[dict[str, typing.Any]] | None
selection_meta
dict[str, typing.Any] | None

get_config

1get_config(
2 context: OptimizationContext
3)

Parameters:

context
OptimizationContext

get_default_prompt

1get_default_prompt(
2 key: str
3)

Parameters:

key
str
The prompt key to retrieve

get_history_entries

1get_history_entries()

get_history_rounds

1get_history_rounds()

get_metadata

1get_metadata(
2 context: OptimizationContext
3)

Parameters:

context
OptimizationContext

get_optimizer_metadata

1get_optimizer_metadata()

get_prompt

1get_prompt(
2 key: str,
3 fmt: Any
4)

Parameters:

key
str
The prompt key to retrieve
fmt
Any

list_prompts

1list_prompts()

on_trial

1on_trial(
2 context: OptimizationContext,
3 prompts: dict,
4 score: float,
5 prev_best_score: float | None = None
6)

Parameters:

context
OptimizationContext
prompts
dict
score
float
prev_best_score
float | None

optimize_prompt

optimize_prompt(
    prompt: opik_optimizer.api_objects.chat_prompt.ChatPrompt | dict[str, opik_optimizer.api_objects.chat_prompt.ChatPrompt],
    dataset: Dataset,
    metric: MetricFunction,
    agent: opik_optimizer.agents.optimizable_agent.OptimizableAgent | None = None,
    experiment_config: dict | None = None,
    n_samples: int | float | str | None = None,
    n_samples_minibatch: int | None = None,
    n_samples_strategy: str | None = None,
    auto_continue: bool = False,
    project_name: str | None = None,
    optimization_id: str | None = None,
    validation_dataset: opik.api_objects.dataset.dataset.Dataset | None = None,
    max_trials: int = 10,
    allow_tool_use: bool = True,
    optimize_prompt: bool | str | list[str] | None = 'system',
    *args: Any,
    **kwargs: Any
)

Parameters:

prompt
opik_optimizer.api_objects.chat_prompt.ChatPrompt | dict[str, opik_optimizer.api_objects.chat_prompt.ChatPrompt]
The prompt to optimize (single ChatPrompt or dict of prompts)
dataset
Dataset
Opik dataset used as the training set (provides feedback/context during optimization). Note: this parameter is expected to be superseded by dataset_training in a future release.
metric
MetricFunction
A metric function with signature (dataset_item, llm_output) -> float
agent
opik_optimizer.agents.optimizable_agent.OptimizableAgent | None
Optional agent for prompt execution (defaults to LiteLLMAgent)
experiment_config
dict | None
Optional configuration for the experiment
n_samples
int | float | str | None
Number of samples to use for evaluation
n_samples_minibatch
int | None
Optional number of samples for inner-loop minibatches
n_samples_strategy
str | None
Sampling strategy name (default “random_sorted”)
auto_continue
boolDefaults to False
Whether to continue optimization automatically
project_name
str | None
Opik project name for logging traces (defaults to OPIK_PROJECT_NAME env or “Optimization”)
optimization_id
str | None
Optional ID to use when creating the Opik optimization run
validation_dataset
opik.api_objects.dataset.dataset.Dataset | None
Optional validation dataset for ranking candidates
max_trials
intDefaults to 10
Maximum number of optimization trials
allow_tool_use
boolDefaults to True
Whether tools may be executed during evaluation (default True)
optimize_prompt
bool | str | list[str] | NoneDefaults to system
Which prompt roles to allow for optimization
args
Any
kwargs
Any

post_baseline

1post_baseline(
2 context: OptimizationContext,
3 score: float
4)

Parameters:

context
OptimizationContext
score
float

post_optimize

1post_optimize(
2 context: OptimizationContext,
3 result: OptimizationResult
4)

Parameters:

context
OptimizationContext
result
OptimizationResult

post_round

1post_round(
2 round_handle: Any,
3 context: opik_optimizer.core.state.OptimizationContext | None = None,
4 best_score: float | None = None,
5 best_candidate: typing.Any | None = None,
6 best_prompt: typing.Any | None = None,
7 stop_reason: str | None = None,
8 extras: dict[str, typing.Any] | None = None,
9 candidates: list[dict[str, typing.Any]] | None = None,
10 timestamp: str | None = None,
11 dataset_split: str | None = None,
12 pareto_front: list[dict[str, typing.Any]] | None = None,
13 selection_meta: dict[str, typing.Any] | None = None
14)

Parameters:

round_handle
Any
context
opik_optimizer.core.state.OptimizationContext | None
best_score
float | None
best_candidate
typing.Any | None
best_prompt
typing.Any | None
stop_reason
str | None
extras
dict[str, typing.Any] | None
candidates
list[dict[str, typing.Any]] | None
timestamp
str | None
dataset_split
str | None
pareto_front
list[dict[str, typing.Any]] | None
selection_meta
dict[str, typing.Any] | None

post_trial

1post_trial(
2 context: OptimizationContext,
3 candidate_handle: Any,
4 score: float | None,
5 metrics: dict[str, typing.Any] | None = None,
6 extras: dict[str, typing.Any] | None = None,
7 candidates: list[dict[str, typing.Any]] | None = None,
8 dataset: str | None = None,
9 dataset_split: str | None = None,
10 trial_index: int | None = None,
11 timestamp: str | None = None,
12 round_handle: typing.Any | None = None
13)

Parameters:

context
OptimizationContext
candidate_handle
Any
score
float | None
metrics
dict[str, typing.Any] | None
extras
dict[str, typing.Any] | None
candidates
list[dict[str, typing.Any]] | None
dataset
str | None
dataset_split
str | None
trial_index
int | None
timestamp
str | None
round_handle
typing.Any | None

pre_baseline

1pre_baseline(
2 context: OptimizationContext
3)

Parameters:

context
OptimizationContext

pre_optimize

1pre_optimize(
2 context: OptimizationContext
3)

Parameters:

context
OptimizationContext

pre_round

1pre_round(
2 context: OptimizationContext,
3 extras: Any
4)

Parameters:

context
OptimizationContext
extras
Any

pre_trial

1pre_trial(
2 context: OptimizationContext,
3 candidate: Any,
4 round_handle: typing.Any | None = None
5)

Parameters:

context
OptimizationContext
candidate
Any
round_handle
typing.Any | None

record_candidate_entry

1record_candidate_entry(
2 prompt_or_payload: Any,
3 score: float | None = None,
4 id: str | None = None,
5 metrics: dict[str, typing.Any] | None = None,
6 notes: str | None = None,
7 extra: dict[str, typing.Any] | None = None,
8 context: opik_optimizer.core.state.OptimizationContext | None = None
9)

Parameters:

prompt_or_payload
Any
score
float | None
id
str | None
metrics
dict[str, typing.Any] | None
notes
str | None
extra
dict[str, typing.Any] | None
context
opik_optimizer.core.state.OptimizationContext | None

run_optimization

1run_optimization(
2 context: OptimizationContext
3)

Parameters:

context
OptimizationContext
The optimization context with prompts, dataset, metric, etc.

set_default_dataset_split

1set_default_dataset_split(
2 dataset_split: str | None
3)

Parameters:

dataset_split
str | None

set_pareto_front

1set_pareto_front(
2 pareto_front: list[dict[str, typing.Any]] | None
3)

Parameters:

pareto_front
list[dict[str, typing.Any]] | None

set_selection_meta

1set_selection_meta(
2 selection_meta: dict[str, typing.Any] | None
3)

Parameters:

selection_meta
dict[str, typing.Any] | None

start_candidate

1start_candidate(
2 context: OptimizationContext,
3 candidate: Any,
4 round_handle: typing.Any | None = None
5)

Parameters:

context
OptimizationContext
candidate
Any
round_handle
typing.Any | None

with_dataset_split

1with_dataset_split(
2 dataset_split: str | None
3)

Parameters:

dataset_split
str | None

HierarchicalReflectiveOptimizer

1HierarchicalReflectiveOptimizer(
2 model: str = 'gpt-4o',
3 model_parameters: dict[str, typing.Any] | None = None,
4 reasoning_model: str | None = None,
5 reasoning_model_parameters: dict[str, typing.Any] | None = None,
6 max_parallel_batches: int = 5,
7 batch_size: int = 25,
8 convergence_threshold: float = 0.01,
9 n_threads: int = 12,
10 verbose: int = 1,
11 seed: int = 42,
12 name: str | None = None,
13 prompt_overrides: dict[str, str] | collections.abc.Callable[[opik_optimizer.utils.prompt_library.PromptLibrary], None] | None = None,
14 skip_perfect_score: bool = True,
15 perfect_score: float = 0.95
16)

Parameters:

model
strDefaults to gpt-4o
LiteLLM model name for the optimization algorithm (reasoning and analysis)
model_parameters
dict[str, typing.Any] | None
Optional dict of LiteLLM parameters for optimizer’s internal LLM calls. Common params: temperature, max_tokens, max_completion_tokens, top_p.
reasoning_model
str | None
reasoning_model_parameters
dict[str, typing.Any] | None
max_parallel_batches
intDefaults to 5
Maximum number of batches to process concurrently during hierarchical root cause analysis
batch_size
intDefaults to 25
Number of test cases per batch for root cause analysis
convergence_threshold
floatDefaults to 0.01
Stop if relative improvement is below this threshold
n_threads
intDefaults to 12
Number of parallel threads for evaluation
verbose
intDefaults to 1
Controls internal logging/progress bars (0=off, 1=on)
seed
intDefaults to 42
Random seed for reproducibility
name
str | None
prompt_overrides
dict[str, str] | collections.abc.Callable[[opik_optimizer.utils.prompt_library.PromptLibrary], None] | None
Optional dict or callable to override/customize prompt templates. If a dict, keys should match DEFAULT_PROMPTS keys. If a callable, receives the PromptLibrary instance for in-place modification.
skip_perfect_score
boolDefaults to True
perfect_score
floatDefaults to 0.95

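For orientation, a minimal construction sketch using only the constructor arguments documented above (values are illustrative, and the top-level import is assumed to mirror the other optimizers):

1from opik_optimizer import HierarchicalReflectiveOptimizer
2
3optimizer = HierarchicalReflectiveOptimizer(
4 model="gpt-4o", # model used for reasoning and analysis
5 max_parallel_batches=5, # concurrent batches during root cause analysis
6 batch_size=25, # test cases per analysis batch
7 convergence_threshold=0.01, # stop when relative improvement drops below this
8 n_threads=12,
9 seed=42,
10)
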
Methods

begin_round

1begin_round(
2 context: OptimizationContext,
3 extras: Any
4)

Parameters:

context
OptimizationContext
extras
Any

cleanup

1cleanup()

evaluate

1evaluate(
2 context: OptimizationContext,
3 prompts: dict,
4 experiment_config: dict[str, typing.Any] | None = None,
5 sampling_tag: str | None = None
6)

Parameters:

context
OptimizationContext
Optimization context for this run.
prompts
dict
Dict of named prompts to evaluate (e.g., {“main”: ChatPrompt(…)}). Single-prompt optimizations use a dict with one entry.
experiment_config
dict[str, typing.Any] | None
Optional experiment configuration.
sampling_tag
str | None
Optional sampling tag for deterministic subsampling per candidate.

evaluate_prompt

1evaluate_prompt(
2 prompt: opik_optimizer.api_objects.chat_prompt.ChatPrompt | dict[str, opik_optimizer.api_objects.chat_prompt.ChatPrompt],
3 dataset: Dataset,
4 metric: MetricFunction,
5 agent: opik_optimizer.agents.optimizable_agent.OptimizableAgent | None = None,
6 n_threads: int | None = None,
7 verbose: int = 1,
8 dataset_item_ids: list[str] | None = None,
9 experiment_config: dict | None = None,
10 n_samples: int | float | str | None = None,
11 n_samples_strategy: str | None = None,
12 seed: int | None = None,
13 return_evaluation_result: bool = False,
14 allow_tool_use: bool | None = None,
15 use_evaluate_on_dict_items: bool | None = None,
16 sampling_tag: str | None = None
17)

Parameters:

prompt
opik_optimizer.api_objects.chat_prompt.ChatPrompt | dict[str, opik_optimizer.api_objects.chat_prompt.ChatPrompt]
dataset
Dataset
metric
MetricFunction
agent
opik_optimizer.agents.optimizable_agent.OptimizableAgent | None
n_threads
int | None
verbose
intDefaults to 1
dataset_item_ids
list[str] | None
experiment_config
dict | None
n_samples
int | float | str | None
n_samples_strategy
str | None
seed
int | None
return_evaluation_result
boolDefaults to False
allow_tool_use
bool | None
use_evaluate_on_dict_items
bool | None
sampling_tag
str | None

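A brief usage sketch; my_dataset and my_metric are placeholders for objects you would supply, and judging by the return_evaluation_result flag the default return is an aggregate score:

1score = optimizer.evaluate_prompt(
2 prompt=prompt, # a ChatPrompt or dict of ChatPrompts
3 dataset=my_dataset, # an Opik Dataset
4 metric=my_metric, # (dataset_item, llm_output) -> float
5 n_samples=50, # evaluate on a subset of items
6)
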
evaluate_with_result

1evaluate_with_result(
2 context: OptimizationContext,
3 prompts: dict,
4 experiment_config: dict[str, typing.Any] | None = None,
5 empty_score: float | None = None,
6 n_samples: int | float | str | None = None,
7 n_samples_strategy: str | None = None,
8 sampling_tag: str | None = None
9)

Parameters:

context
OptimizationContext
prompts
dict
experiment_config
dict[str, typing.Any] | None
empty_score
float | None
n_samples
int | float | str | None
n_samples_strategy
str | None
sampling_tag
str | None

finish_candidate

1finish_candidate(
2 context: OptimizationContext,
3 candidate_handle: Any,
4 score: float | None,
5 metrics: dict[str, typing.Any] | None = None,
6 extras: dict[str, typing.Any] | None = None,
7 candidates: list[dict[str, typing.Any]] | None = None,
8 dataset: str | None = None,
9 dataset_split: str | None = None,
10 trial_index: int | None = None,
11 timestamp: str | None = None,
12 round_handle: typing.Any | None = None
13)

Parameters:

context
OptimizationContext
candidate_handle
Any
score
float | None
metrics
dict[str, typing.Any] | None
extras
dict[str, typing.Any] | None
candidates
list[dict[str, typing.Any]] | None
dataset
str | None
dataset_split
str | None
trial_index
int | None
timestamp
str | None
round_handle
typing.Any | None

finish_round

1finish_round(
2 round_handle: Any,
3 context: opik_optimizer.core.state.OptimizationContext | None = None,
4 best_score: float | None = None,
5 best_candidate: typing.Any | None = None,
6 best_prompt: typing.Any | None = None,
7 stop_reason: str | None = None,
8 extras: dict[str, typing.Any] | None = None,
9 candidates: list[dict[str, typing.Any]] | None = None,
10 timestamp: str | None = None,
11 dataset_split: str | None = None,
12 pareto_front: list[dict[str, typing.Any]] | None = None,
13 selection_meta: dict[str, typing.Any] | None = None
14)

Parameters:

round_handle
Any
context
opik_optimizer.core.state.OptimizationContext | None
best_score
float | None
best_candidate
typing.Any | None
best_prompt
typing.Any | None
stop_reason
str | None
extras
dict[str, typing.Any] | None
candidates
list[dict[str, typing.Any]] | None
timestamp
str | None
dataset_split
str | None
pareto_front
list[dict[str, typing.Any]] | None
selection_meta
dict[str, typing.Any] | None

get_config

1get_config(
2 context: OptimizationContext
3)

Parameters:

context
OptimizationContext

get_default_prompt

1get_default_prompt(
2 key: str
3)

Parameters:

key
str
The prompt key to retrieve

get_history_entries

1get_history_entries()

get_history_rounds

1get_history_rounds()

get_metadata

1get_metadata(
2 context: OptimizationContext
3)

Parameters:

context
OptimizationContext

get_optimizer_metadata

1get_optimizer_metadata()

get_prompt

1get_prompt(
2 key: str,
3 fmt: Any
4)

Parameters:

key
str
The prompt key to retrieve
fmt
Any

list_prompts

1list_prompts()

on_trial

1on_trial(
2 context: OptimizationContext,
3 prompts: dict,
4 score: float,
5 prev_best_score: float | None = None
6)

Parameters:

context
OptimizationContext
prompts
dict
score
float
prev_best_score
float | None

optimize_prompt

1optimize_prompt(
2 prompt: opik_optimizer.api_objects.chat_prompt.ChatPrompt | dict[str, opik_optimizer.api_objects.chat_prompt.ChatPrompt],
3 dataset: Dataset,
4 metric: MetricFunction,
5 agent: opik_optimizer.agents.optimizable_agent.OptimizableAgent | None = None,
6 experiment_config: dict | None = None,
7 n_samples: int | float | str | None = None,
8 n_samples_minibatch: int | None = None,
9 n_samples_strategy: str | None = None,
10 auto_continue: bool = False,
11 project_name: str | None = None,
12 optimization_id: str | None = None,
13 validation_dataset: opik.api_objects.dataset.dataset.Dataset | None = None,
14 max_trials: int = 10,
15 allow_tool_use: bool = True,
16 optimize_prompt: bool | str | list[str] | None = 'system',
17 *args: Any,
18 **kwargs: Any
19)

Parameters:

prompt
opik_optimizer.api_objects.chat_prompt.ChatPrompt | dict[str, opik_optimizer.api_objects.chat_prompt.ChatPrompt]
The prompt to optimize (single ChatPrompt or dict of prompts)
dataset
Dataset
Opik dataset (training set, used for feedback/context). Note: this parameter is expected to be deprecated in favor of dataset_training; for now it serves as the training dataset.
metric
MetricFunction
A metric function with signature (dataset_item, llm_output) -> float
agent
opik_optimizer.agents.optimizable_agent.OptimizableAgent | None
Optional agent for prompt execution (defaults to LiteLLMAgent)
experiment_config
dict | None
Optional configuration for the experiment
n_samples
int | float | str | None
Number of samples to use for evaluation
n_samples_minibatch
int | None
Optional number of samples for inner-loop minibatches
n_samples_strategy
str | None
Sampling strategy name (default “random_sorted”)
auto_continue
boolDefaults to False
Whether to continue optimization automatically
project_name
str | None
Opik project name for logging traces (defaults to OPIK_PROJECT_NAME env or “Optimization”)
optimization_id
str | None
Optional ID to use when creating the Opik optimization run
validation_dataset
opik.api_objects.dataset.dataset.Dataset | None
Optional validation dataset for ranking candidates
max_trials
intDefaults to 10
Maximum number of optimization trials
allow_tool_use
boolDefaults to True
Whether tools may be executed during evaluation (default True)
optimize_prompt
bool | str | list[str] | NoneDefaults to system
Which prompt roles to allow for optimization
args
Any
kwargs
Any

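An end-to-end sketch of optimize_prompt(); my_dataset and my_metric are placeholders, {question} is an assumed dataset field, and the top-level imports are assumed:

1from opik_optimizer import ChatPrompt, HierarchicalReflectiveOptimizer
2
3prompt = ChatPrompt(
4 system="You are a helpful assistant.",
5 user="{question}", # filled from each dataset item
6)
7
8optimizer = HierarchicalReflectiveOptimizer(model="gpt-4o")
9result = optimizer.optimize_prompt(
10 prompt=prompt,
11 dataset=my_dataset, # placeholder Opik Dataset
12 metric=my_metric, # (dataset_item, llm_output) -> float
13 max_trials=10,
14)
15print(result.score, result.prompt)
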
post_baseline

1post_baseline(
2 context: OptimizationContext,
3 score: float
4)

Parameters:

context
OptimizationContext
score
float

post_optimize

1post_optimize(
2 context: OptimizationContext,
3 result: OptimizationResult
4)

Parameters:

context
OptimizationContext
result
OptimizationResult

post_round

1post_round(
2 round_handle: Any,
3 context: opik_optimizer.core.state.OptimizationContext | None = None,
4 best_score: float | None = None,
5 best_candidate: typing.Any | None = None,
6 best_prompt: typing.Any | None = None,
7 stop_reason: str | None = None,
8 extras: dict[str, typing.Any] | None = None,
9 candidates: list[dict[str, typing.Any]] | None = None,
10 timestamp: str | None = None,
11 dataset_split: str | None = None,
12 pareto_front: list[dict[str, typing.Any]] | None = None,
13 selection_meta: dict[str, typing.Any] | None = None
14)

Parameters:

round_handle
Any
context
opik_optimizer.core.state.OptimizationContext | None
best_score
float | None
best_candidate
typing.Any | None
best_prompt
typing.Any | None
stop_reason
str | None
extras
dict[str, typing.Any] | None
candidates
list[dict[str, typing.Any]] | None
timestamp
str | None
dataset_split
str | None
pareto_front
list[dict[str, typing.Any]] | None
selection_meta
dict[str, typing.Any] | None

post_trial

1post_trial(
2 context: OptimizationContext,
3 candidate_handle: Any,
4 score: float | None,
5 metrics: dict[str, typing.Any] | None = None,
6 extras: dict[str, typing.Any] | None = None,
7 candidates: list[dict[str, typing.Any]] | None = None,
8 dataset: str | None = None,
9 dataset_split: str | None = None,
10 trial_index: int | None = None,
11 timestamp: str | None = None,
12 round_handle: typing.Any | None = None
13)

Parameters:

context
OptimizationContext
candidate_handle
Any
score
float | None
metrics
dict[str, typing.Any] | None
extras
dict[str, typing.Any] | None
candidates
list[dict[str, typing.Any]] | None
dataset
str | None
dataset_split
str | None
trial_index
int | None
timestamp
str | None
round_handle
typing.Any | None

pre_baseline

1pre_baseline(
2 context: OptimizationContext
3)

Parameters:

context
OptimizationContext

pre_optimize

1pre_optimize(
2 context: OptimizationContext
3)

Parameters:

context
OptimizationContext
The optimization context

pre_round

1pre_round(
2 context: OptimizationContext,
3 extras: Any
4)

Parameters:

context
OptimizationContext
extras
Any

pre_trial

1pre_trial(
2 context: OptimizationContext,
3 candidate: Any,
4 round_handle: typing.Any | None = None
5)

Parameters:

context
OptimizationContext
candidate
Any
round_handle
typing.Any | None

record_candidate_entry

1record_candidate_entry(
2 prompt_or_payload: Any,
3 score: float | None = None,
4 id: str | None = None,
5 metrics: dict[str, typing.Any] | None = None,
6 notes: str | None = None,
7 extra: dict[str, typing.Any] | None = None,
8 context: opik_optimizer.core.state.OptimizationContext | None = None
9)

Parameters:

prompt_or_payload
Any
score
float | None
id
str | None
metrics
dict[str, typing.Any] | None
notes
str | None
extra
dict[str, typing.Any] | None
context
opik_optimizer.core.state.OptimizationContext | None

run_optimization

1run_optimization(
2 context: OptimizationContext
3)

Parameters:

context
OptimizationContext
The optimization context with prompts, dataset, metric, etc.

set_default_dataset_split

1set_default_dataset_split(
2 dataset_split: str | None
3)

Parameters:

dataset_split
str | None

set_pareto_front

1set_pareto_front(
2 pareto_front: list[dict[str, typing.Any]] | None
3)

Parameters:

pareto_front
list[dict[str, typing.Any]] | None

set_selection_meta

1set_selection_meta(
2 selection_meta: dict[str, typing.Any] | None
3)

Parameters:

selection_meta
dict[str, typing.Any] | None

start_candidate

1start_candidate(
2 context: OptimizationContext,
3 candidate: Any,
4 round_handle: typing.Any | None = None
5)

Parameters:

context
OptimizationContext
candidate
Any
round_handle
typing.Any | None

with_dataset_split

1with_dataset_split(
2 dataset_split: str | None
3)

Parameters:

dataset_split
str | None

ParameterOptimizer

1ParameterOptimizer(
2 model: str = 'gpt-4o',
3 model_parameters: dict[str, typing.Any] | None = None,
4 default_n_trials: int = 20,
5 local_search_ratio: float = 0.3,
6 local_search_scale: float = 0.2,
7 n_threads: int = 12,
8 verbose: int = 1,
9 seed: int = 42,
10 name: str | None = None,
11 skip_perfect_score: bool = True,
12 perfect_score: float = 0.95
13)

Parameters:

model
strDefaults to gpt-4o
LiteLLM model name (used for metadata, not for optimization calls)
model_parameters
dict[str, typing.Any] | None
Optional dict of LiteLLM parameters for optimizer’s internal LLM calls. Common params: temperature, max_tokens, max_completion_tokens, top_p.
default_n_trials
intDefaults to 20
Default number of optimization trials to run
local_search_ratio
floatDefaults to 0.3
Ratio of trials to dedicate to local search refinement (0.0-1.0)
local_search_scale
floatDefaults to 0.2
Scale factor for narrowing search space during local search
n_threads
intDefaults to 12
Number of parallel threads for evaluation
verbose
intDefaults to 1
Controls internal logging/progress bars (0=off, 1=on)
seed
intDefaults to 42
Random seed for reproducibility
name
str | None
skip_perfect_score
boolDefaults to True
perfect_score
floatDefaults to 0.95

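A minimal construction sketch with illustrative values:

1from opik_optimizer import ParameterOptimizer
2
3optimizer = ParameterOptimizer(
4 model="gpt-4o", # used for metadata, not for optimization calls
5 default_n_trials=20,
6 local_search_ratio=0.3, # fraction of trials spent on local refinement
7 local_search_scale=0.2, # how much the search space narrows during local search
8 n_threads=12,
9 seed=42,
10)
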
Methods

begin_round

1begin_round(
2 context: OptimizationContext,
3 extras: Any
4)

Parameters:

context
OptimizationContext
extras
Any

cleanup

1cleanup()

evaluate

1evaluate(
2 context: OptimizationContext,
3 prompts: dict,
4 experiment_config: dict[str, typing.Any] | None = None,
5 sampling_tag: str | None = None
6)

Parameters:

context
OptimizationContext
Optimization context for this run.
prompts
dict
Dict of named prompts to evaluate (e.g., {“main”: ChatPrompt(…)}). Single-prompt optimizations use a dict with one entry.
experiment_config
dict[str, typing.Any] | None
Optional experiment configuration.
sampling_tag
str | None
Optional sampling tag for deterministic subsampling per candidate.

evaluate_prompt

1evaluate_prompt(
2 prompt: opik_optimizer.api_objects.chat_prompt.ChatPrompt | dict[str, opik_optimizer.api_objects.chat_prompt.ChatPrompt],
3 dataset: Dataset,
4 metric: MetricFunction,
5 agent: opik_optimizer.agents.optimizable_agent.OptimizableAgent | None = None,
6 n_threads: int | None = None,
7 verbose: int = 1,
8 dataset_item_ids: list[str] | None = None,
9 experiment_config: dict | None = None,
10 n_samples: int | float | str | None = None,
11 n_samples_strategy: str | None = None,
12 seed: int | None = None,
13 return_evaluation_result: bool = False,
14 allow_tool_use: bool | None = None,
15 use_evaluate_on_dict_items: bool | None = None,
16 sampling_tag: str | None = None
17)

Parameters:

prompt
opik_optimizer.api_objects.chat_prompt.ChatPrompt | dict[str, opik_optimizer.api_objects.chat_prompt.ChatPrompt]
dataset
Dataset
metric
MetricFunction
agent
opik_optimizer.agents.optimizable_agent.OptimizableAgent | None
n_threads
int | None
verbose
intDefaults to 1
dataset_item_ids
list[str] | None
experiment_config
dict | None
n_samples
int | float | str | None
n_samples_strategy
str | None
seed
int | None
return_evaluation_result
boolDefaults to False
allow_tool_use
bool | None
use_evaluate_on_dict_items
bool | None
sampling_tag
str | None

evaluate_with_result

1evaluate_with_result(
2 context: OptimizationContext,
3 prompts: dict,
4 experiment_config: dict[str, typing.Any] | None = None,
5 empty_score: float | None = None,
6 n_samples: int | float | str | None = None,
7 n_samples_strategy: str | None = None,
8 sampling_tag: str | None = None
9)

Parameters:

context
OptimizationContext
prompts
dict
experiment_config
dict[str, typing.Any] | None
empty_score
float | None
n_samples
int | float | str | None
n_samples_strategy
str | None
sampling_tag
str | None

finish_candidate

1finish_candidate(
2 context: OptimizationContext,
3 candidate_handle: Any,
4 score: float | None,
5 metrics: dict[str, typing.Any] | None = None,
6 extras: dict[str, typing.Any] | None = None,
7 candidates: list[dict[str, typing.Any]] | None = None,
8 dataset: str | None = None,
9 dataset_split: str | None = None,
10 trial_index: int | None = None,
11 timestamp: str | None = None,
12 round_handle: typing.Any | None = None
13)

Parameters:

context
OptimizationContext
candidate_handle
Any
score
float | None
metrics
dict[str, typing.Any] | None
extras
dict[str, typing.Any] | None
candidates
list[dict[str, typing.Any]] | None
dataset
str | None
dataset_split
str | None
trial_index
int | None
timestamp
str | None
round_handle
typing.Any | None

finish_round

1finish_round(
2 round_handle: Any,
3 context: opik_optimizer.core.state.OptimizationContext | None = None,
4 best_score: float | None = None,
5 best_candidate: typing.Any | None = None,
6 best_prompt: typing.Any | None = None,
7 stop_reason: str | None = None,
8 extras: dict[str, typing.Any] | None = None,
9 candidates: list[dict[str, typing.Any]] | None = None,
10 timestamp: str | None = None,
11 dataset_split: str | None = None,
12 pareto_front: list[dict[str, typing.Any]] | None = None,
13 selection_meta: dict[str, typing.Any] | None = None
14)

Parameters:

round_handle
Any
context
opik_optimizer.core.state.OptimizationContext | None
best_score
float | None
best_candidate
typing.Any | None
best_prompt
typing.Any | None
stop_reason
str | None
extras
dict[str, typing.Any] | None
candidates
list[dict[str, typing.Any]] | None
timestamp
str | None
dataset_split
str | None
pareto_front
list[dict[str, typing.Any]] | None
selection_meta
dict[str, typing.Any] | None

get_config

1get_config(
2 context: OptimizationContext
3)

Parameters:

context
OptimizationContext

get_default_prompt

1get_default_prompt(
2 key: str
3)

Parameters:

key
str
The prompt key to retrieve

get_history_entries

1get_history_entries()

get_history_rounds

1get_history_rounds()

get_metadata

1get_metadata(
2 context: OptimizationContext
3)

Parameters:

context
OptimizationContext

get_optimizer_metadata

1get_optimizer_metadata()

get_prompt

1get_prompt(
2 key: str,
3 fmt: Any
4)

Parameters:

key
str
The prompt key to retrieve
fmt
Any

list_prompts

1list_prompts()

on_trial

1on_trial(
2 context: OptimizationContext,
3 prompts: dict,
4 score: float,
5 prev_best_score: float | None = None
6)

Parameters:

context
OptimizationContext
prompts
dict
score
float
prev_best_score
float | None

optimize_parameter

1optimize_parameter(
2 prompt: opik_optimizer.api_objects.chat_prompt.ChatPrompt | dict[str, opik_optimizer.api_objects.chat_prompt.ChatPrompt],
3 dataset: Dataset,
4 metric: MetricFunction,
5 parameter_space: opik_optimizer.algorithms.parameter_optimizer.ops.search_ops.ParameterSearchSpace | collections.abc.Mapping[str, typing.Any],
6 validation_dataset: opik.api_objects.dataset.dataset.Dataset | None = None,
7 experiment_config: dict | None = None,
8 max_trials: int | None = None,
9 n_samples: int | float | str | None = None,
10 n_samples_minibatch: int | None = None,
11 n_samples_strategy: str | None = None,
12 agent: opik_optimizer.agents.optimizable_agent.OptimizableAgent | None = None,
13 project_name: str = 'Optimization',
14 sampler: optuna.samplers._base.BaseSampler | None = None,
15 callbacks: list[collections.abc.Callable[[optuna.study.study.Study, optuna.trial._frozen.FrozenTrial], None]] | None = None,
16 timeout: float | None = None,
17 local_trials: int | None = None,
18 local_search_scale: float | None = None,
19 optimization_id: str | None = None
20)

Parameters:

prompt
opik_optimizer.api_objects.chat_prompt.ChatPrompt | dict[str, opik_optimizer.api_objects.chat_prompt.ChatPrompt]
The prompt or dict of prompts to evaluate with tuned parameters. When a dict is provided, parameters are optimized independently for each prompt.
dataset
Dataset
Dataset providing evaluation examples
metric
MetricFunction
Objective function to maximize
parameter_space
opik_optimizer.algorithms.parameter_optimizer.ops.search_ops.ParameterSearchSpace | collections.abc.Mapping[str, typing.Any]
Definition of the search space for tunable parameters. For multi-prompt, params without a prefix are expanded per prompt. Params already prefixed (e.g., ‘analyze.temperature’) are kept as-is.
validation_dataset
opik.api_objects.dataset.dataset.Dataset | None
Optional validation dataset. Note: Due to the internal implementation of ParameterOptimizer, this parameter is currently not fully utilized and we recommend not using it for this optimizer.
experiment_config
dict | None
Optional experiment metadata
max_trials
int | None
Total number of trials (if None, uses default_n_trials)
n_samples
int | float | str | None
Number of dataset samples to evaluate per trial (None for all)
n_samples_minibatch
int | None
Optional number of samples for inner-loop minibatches
n_samples_strategy
str | None
Sampling strategy name (default “random_sorted”)
agent
opik_optimizer.agents.optimizable_agent.OptimizableAgent | None
Optional custom agent instance to execute evaluations
project_name
strDefaults to Optimization
Opik project name for logging traces (default: “Optimization”)
sampler
optuna.samplers._base.BaseSampler | None
Optuna sampler to use (default: TPESampler with seed)
callbacks
list[collections.abc.Callable[[optuna.study.study.Study, optuna.trial._frozen.FrozenTrial], None]] | None
List of callback functions for Optuna study
timeout
float | None
Maximum time in seconds for optimization
local_trials
int | None
Number of trials for local search (overrides local_search_ratio)
local_search_scale
float | None
Scale factor for local search narrowing (0.0-1.0)
optimization_id
str | None
Optional ID to use when creating the Opik optimization run; when provided it must be a valid UUIDv7 string.

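A hedged call sketch; search_space stands in for a ParameterSearchSpace (or plain mapping) you build yourself, and my_dataset/my_metric are placeholders:

1result = optimizer.optimize_parameter(
2 prompt=prompt, # ChatPrompt or dict of ChatPrompts
3 dataset=my_dataset, # evaluation examples
4 metric=my_metric, # objective to maximize
5 parameter_space=search_space, # see ParameterSearchSpace / ParameterSpec below
6 max_trials=30,
7 n_samples=50,
8)
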
post_baseline

1post_baseline(
2 context: OptimizationContext,
3 score: float
4)

Parameters:

context
OptimizationContext
score
float

post_optimize

1post_optimize(
2 context: OptimizationContext,
3 result: OptimizationResult
4)

Parameters:

context
OptimizationContext
result
OptimizationResult

post_round

1post_round(
2 round_handle: Any,
3 context: opik_optimizer.core.state.OptimizationContext | None = None,
4 best_score: float | None = None,
5 best_candidate: typing.Any | None = None,
6 best_prompt: typing.Any | None = None,
7 stop_reason: str | None = None,
8 extras: dict[str, typing.Any] | None = None,
9 candidates: list[dict[str, typing.Any]] | None = None,
10 timestamp: str | None = None,
11 dataset_split: str | None = None,
12 pareto_front: list[dict[str, typing.Any]] | None = None,
13 selection_meta: dict[str, typing.Any] | None = None
14)

Parameters:

round_handle
Any
context
opik_optimizer.core.state.OptimizationContext | None
best_score
float | None
best_candidate
typing.Any | None
best_prompt
typing.Any | None
stop_reason
str | None
extras
dict[str, typing.Any] | None
candidates
list[dict[str, typing.Any]] | None
timestamp
str | None
dataset_split
str | None
pareto_front
list[dict[str, typing.Any]] | None
selection_meta
dict[str, typing.Any] | None

post_trial

1post_trial(
2 context: OptimizationContext,
3 candidate_handle: Any,
4 score: float | None,
5 metrics: dict[str, typing.Any] | None = None,
6 extras: dict[str, typing.Any] | None = None,
7 candidates: list[dict[str, typing.Any]] | None = None,
8 dataset: str | None = None,
9 dataset_split: str | None = None,
10 trial_index: int | None = None,
11 timestamp: str | None = None,
12 round_handle: typing.Any | None = None
13)

Parameters:

context
OptimizationContext
candidate_handle
Any
score
float | None
metrics
dict[str, typing.Any] | None
extras
dict[str, typing.Any] | None
candidates
list[dict[str, typing.Any]] | None
dataset
str | None
dataset_split
str | None
trial_index
int | None
timestamp
str | None
round_handle
typing.Any | None

pre_baseline

1pre_baseline(
2 context: OptimizationContext
3)

Parameters:

context
OptimizationContext

pre_optimize

1pre_optimize(
2 context: OptimizationContext
3)

Parameters:

context
OptimizationContext
The optimization context

pre_round

1pre_round(
2 context: OptimizationContext,
3 extras: Any
4)

Parameters:

context
OptimizationContext
extras
Any

pre_trial

1pre_trial(
2 context: OptimizationContext,
3 candidate: Any,
4 round_handle: typing.Any | None = None
5)

Parameters:

context
OptimizationContext
candidate
Any
round_handle
typing.Any | None

record_candidate_entry

1record_candidate_entry(
2 prompt_or_payload: Any,
3 score: float | None = None,
4 id: str | None = None,
5 metrics: dict[str, typing.Any] | None = None,
6 notes: str | None = None,
7 extra: dict[str, typing.Any] | None = None,
8 context: opik_optimizer.core.state.OptimizationContext | None = None
9)

Parameters:

prompt_or_payload
Any
score
float | None
id
str | None
metrics
dict[str, typing.Any] | None
notes
str | None
extra
dict[str, typing.Any] | None
context
opik_optimizer.core.state.OptimizationContext | None

set_default_dataset_split

1set_default_dataset_split(
2 dataset_split: str | None
3)

Parameters:

dataset_split
str | None

set_pareto_front

1set_pareto_front(
2 pareto_front: list[dict[str, typing.Any]] | None
3)

Parameters:

pareto_front
list[dict[str, typing.Any]] | None

set_selection_meta

1set_selection_meta(
2 selection_meta: dict[str, typing.Any] | None
3)

Parameters:

selection_meta
dict[str, typing.Any] | None

start_candidate

1start_candidate(
2 context: OptimizationContext,
3 candidate: Any,
4 round_handle: typing.Any | None = None
5)

Parameters:

context
OptimizationContext
candidate
Any
round_handle
typing.Any | None

with_dataset_split

1with_dataset_split(
2 dataset_split: str | None
3)

Parameters:

dataset_split
str | None

ParameterSearchSpace

1ParameterSearchSpace(
2 parameters: list[opik_optimizer.algorithms.parameter_optimizer.ops.search_ops.ParameterSpec]
3)

Parameters:

parameters
list[opik_optimizer.algorithms.parameter_optimizer.ops.search_ops.ParameterSpec]

ParameterSpec

1ParameterSpec(
2 name: str,
3 description: str | None = None,
4 distribution: ParameterType,
5 low: float | None = None,
6 high: float | None = None,
7 step: float | None = None,
8 scale: Literal['linear', 'log'] = 'linear',
9 choices: list[Any] | None = None,
10 target: str | collections.abc.Sequence[str] | None = None,
11 default: Any | None = None
12)

Parameters:

name
str
description
str | None
distribution
ParameterType
low
float | None
high
float | None
step
float | None
scale
Literal['linear', 'log']Defaults to linear
choices
list[Any] | None
target
str | collections.abc.Sequence[str] | None
default
Any | None

ParameterType

1ParameterType(
2 *args: Any,
3 **kwds: Any
4)

Parameters:

args
Any
kwds
Any

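A construction sketch tying these three classes together. ParameterType member names are not listed in this reference, so ParameterType.FLOAT below is an assumption, and the dotted target path is illustrative:

1# ParameterType.FLOAT is assumed; check the enum members in your installed version.
2spec = ParameterSpec(
3 name="temperature",
4 distribution=ParameterType.FLOAT,
5 low=0.0,
6 high=1.0,
7 scale="linear",
8 target="model_parameters.temperature", # illustrative target path
9)
10search_space = ParameterSearchSpace(parameters=[spec])
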
BaseOptimizer

1BaseOptimizer(
2 model: str,
3 verbose: int = 1,
4 seed: int = 42,
5 model_parameters: dict[str, typing.Any] | None = None,
6 reasoning_model: str | None = None,
7 reasoning_model_parameters: dict[str, typing.Any] | None = None,
8 name: str | None = None,
9 skip_perfect_score: bool = True,
10 perfect_score: float = 0.95,
11 prompt_overrides: dict[str, str] | collections.abc.Callable[[opik_optimizer.utils.prompt_library.PromptLibrary], None] | None = None,
12 display: opik_optimizer.utils.display.run.RunDisplay | None = None
13)

Parameters:

model
str
verbose
intDefaults to 1
seed
intDefaults to 42
model_parameters
dict[str, typing.Any] | None
reasoning_model
str | None
reasoning_model_parameters
dict[str, typing.Any] | None
name
str | None
skip_perfect_score
boolDefaults to True
perfect_score
floatDefaults to 0.95
prompt_overrides
dict[str, str] | collections.abc.Callable[[opik_optimizer.utils.prompt_library.PromptLibrary], None] | None
display
opik_optimizer.utils.display.run.RunDisplay | None

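BaseOptimizer is not instantiated directly; the concrete optimizers above inherit these constructor arguments, so the shared options can be passed to any of them:

1# Shared BaseOptimizer arguments, passed through a concrete subclass.
2optimizer = ParameterOptimizer(
3 model="gpt-4o",
4 verbose=1,
5 seed=42,
6 skip_perfect_score=True,
7 perfect_score=0.95,
8)
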
Methods

begin_round

1begin_round(
2 context: OptimizationContext,
3 extras: Any
4)

Parameters:

context
OptimizationContext
extras
Any

cleanup

1cleanup()

evaluate

1evaluate(
2 context: OptimizationContext,
3 prompts: dict,
4 experiment_config: dict[str, typing.Any] | None = None,
5 sampling_tag: str | None = None
6)

Parameters:

context
OptimizationContext
Optimization context for this run.
prompts
dict
Dict of named prompts to evaluate (e.g., {“main”: ChatPrompt(…)}). Single-prompt optimizations use a dict with one entry.
experiment_config
dict[str, typing.Any] | None
Optional experiment configuration.
sampling_tag
str | None
Optional sampling tag for deterministic subsampling per candidate.

evaluate_prompt

1evaluate_prompt(
2 prompt: opik_optimizer.api_objects.chat_prompt.ChatPrompt | dict[str, opik_optimizer.api_objects.chat_prompt.ChatPrompt],
3 dataset: Dataset,
4 metric: MetricFunction,
5 agent: opik_optimizer.agents.optimizable_agent.OptimizableAgent | None = None,
6 n_threads: int | None = None,
7 verbose: int = 1,
8 dataset_item_ids: list[str] | None = None,
9 experiment_config: dict | None = None,
10 n_samples: int | float | str | None = None,
11 n_samples_strategy: str | None = None,
12 seed: int | None = None,
13 return_evaluation_result: bool = False,
14 allow_tool_use: bool | None = None,
15 use_evaluate_on_dict_items: bool | None = None,
16 sampling_tag: str | None = None
17)

Parameters:

prompt
opik_optimizer.api_objects.chat_prompt.ChatPrompt | dict[str, opik_optimizer.api_objects.chat_prompt.ChatPrompt]
dataset
Dataset
metric
MetricFunction
agent
opik_optimizer.agents.optimizable_agent.OptimizableAgent | None
n_threads
int | None
verbose
intDefaults to 1
dataset_item_ids
list[str] | None
experiment_config
dict | None
n_samples
int | float | str | None
n_samples_strategy
str | None
seed
int | None
return_evaluation_result
boolDefaults to False
allow_tool_use
bool | None
use_evaluate_on_dict_items
bool | None
sampling_tag
str | None

evaluate_with_result

1evaluate_with_result(
2 context: OptimizationContext,
3 prompts: dict,
4 experiment_config: dict[str, typing.Any] | None = None,
5 empty_score: float | None = None,
6 n_samples: int | float | str | None = None,
7 n_samples_strategy: str | None = None,
8 sampling_tag: str | None = None
9)

Parameters:

context
OptimizationContext
prompts
dict
experiment_config
dict[str, typing.Any] | None
empty_score
float | None
n_samples
int | float | str | None
n_samples_strategy
str | None
sampling_tag
str | None

finish_candidate

1finish_candidate(
2 context: OptimizationContext,
3 candidate_handle: Any,
4 score: float | None,
5 metrics: dict[str, typing.Any] | None = None,
6 extras: dict[str, typing.Any] | None = None,
7 candidates: list[dict[str, typing.Any]] | None = None,
8 dataset: str | None = None,
9 dataset_split: str | None = None,
10 trial_index: int | None = None,
11 timestamp: str | None = None,
12 round_handle: typing.Any | None = None
13)

Parameters:

context
OptimizationContext
candidate_handle
Any
score
float | None
metrics
dict[str, typing.Any] | None
extras
dict[str, typing.Any] | None
candidates
list[dict[str, typing.Any]] | None
dataset
str | None
dataset_split
str | None
trial_index
int | None
timestamp
str | None
round_handle
typing.Any | None

finish_round

1finish_round(
2 round_handle: Any,
3 context: opik_optimizer.core.state.OptimizationContext | None = None,
4 best_score: float | None = None,
5 best_candidate: typing.Any | None = None,
6 best_prompt: typing.Any | None = None,
7 stop_reason: str | None = None,
8 extras: dict[str, typing.Any] | None = None,
9 candidates: list[dict[str, typing.Any]] | None = None,
10 timestamp: str | None = None,
11 dataset_split: str | None = None,
12 pareto_front: list[dict[str, typing.Any]] | None = None,
13 selection_meta: dict[str, typing.Any] | None = None
14)

Parameters:

round_handle
Any
context
opik_optimizer.core.state.OptimizationContext | None
best_score
float | None
best_candidate
typing.Any | None
best_prompt
typing.Any | None
stop_reason
str | None
extras
dict[str, typing.Any] | None
candidates
list[dict[str, typing.Any]] | None
timestamp
str | None
dataset_split
str | None
pareto_front
list[dict[str, typing.Any]] | None
selection_meta
dict[str, typing.Any] | None

get_config

1get_config(
2 context: OptimizationContext
3)

Parameters:

context
OptimizationContext

get_default_prompt

1get_default_prompt(
2 key: str
3)

Parameters:

key
str
The prompt key to retrieve

get_history_entries

1get_history_entries()

get_history_rounds

1get_history_rounds()

get_metadata

1get_metadata(
2 context: OptimizationContext
3)

Parameters:

context
OptimizationContext

get_prompt

1get_prompt(
2 key: str,
3 fmt: Any
4)

Parameters:

key
str
The prompt key to retrieve
fmt
Any

list_prompts

1list_prompts()

on_trial

1on_trial(
2 context: OptimizationContext,
3 prompts: dict,
4 score: float,
5 prev_best_score: float | None = None
6)

Parameters:

context
OptimizationContext
prompts
dict
score
float
prev_best_score
float | None

optimize_prompt

1optimize_prompt(
2 prompt: opik_optimizer.api_objects.chat_prompt.ChatPrompt | dict[str, opik_optimizer.api_objects.chat_prompt.ChatPrompt],
3 dataset: Dataset,
4 metric: MetricFunction,
5 agent: opik_optimizer.agents.optimizable_agent.OptimizableAgent | None = None,
6 experiment_config: dict | None = None,
7 n_samples: int | float | str | None = None,
8 n_samples_minibatch: int | None = None,
9 n_samples_strategy: str | None = None,
10 auto_continue: bool = False,
11 project_name: str | None = None,
12 optimization_id: str | None = None,
13 validation_dataset: opik.api_objects.dataset.dataset.Dataset | None = None,
14 max_trials: int = 10,
15 allow_tool_use: bool = True,
16 optimize_prompt: bool | str | list[str] | None = 'system',
17 *args: Any,
18 **kwargs: Any
19)

Parameters:

prompt
opik_optimizer.api_objects.chat_prompt.ChatPrompt | dict[str, opik_optimizer.api_objects.chat_prompt.ChatPrompt]
The prompt to optimize (single ChatPrompt or dict of prompts)
dataset
Dataset
Opik dataset (training set, used for feedback/context). Note: this parameter is expected to be deprecated in favor of dataset_training; for now it serves as the training dataset.
metric
MetricFunction
A metric function with signature (dataset_item, llm_output) -> float
agent
opik_optimizer.agents.optimizable_agent.OptimizableAgent | None
Optional agent for prompt execution (defaults to LiteLLMAgent)
experiment_config
dict | None
Optional configuration for the experiment
n_samples
int | float | str | None
Number of samples to use for evaluation
n_samples_minibatch
int | None
Optional number of samples for inner-loop minibatches
n_samples_strategy
str | None
Sampling strategy name (default “random_sorted”)
auto_continue
boolDefaults to False
Whether to continue optimization automatically
project_name
str | None
Opik project name for logging traces (defaults to OPIK_PROJECT_NAME env or “Optimization”)
optimization_id
str | None
Optional ID to use when creating the Opik optimization run
validation_dataset
opik.api_objects.dataset.dataset.Dataset | None
Optional validation dataset for ranking candidates
max_trials
intDefaults to 10
Maximum number of optimization trials
allow_tool_use
boolDefaults to True
Whether tools may be executed during evaluation (default True)
optimize_prompt
bool | str | list[str] | NoneDefaults to system
Which prompt roles to allow for optimization
args
Any
kwargs
Any

post_baseline

1post_baseline(
2 context: OptimizationContext,
3 score: float
4)

Parameters:

context
OptimizationContext
score
float

post_optimize

1post_optimize(
2 context: OptimizationContext,
3 result: OptimizationResult
4)

Parameters:

context
OptimizationContext
result
OptimizationResult

post_round

1post_round(
2 round_handle: Any,
3 context: opik_optimizer.core.state.OptimizationContext | None = None,
4 best_score: float | None = None,
5 best_candidate: typing.Any | None = None,
6 best_prompt: typing.Any | None = None,
7 stop_reason: str | None = None,
8 extras: dict[str, typing.Any] | None = None,
9 candidates: list[dict[str, typing.Any]] | None = None,
10 timestamp: str | None = None,
11 dataset_split: str | None = None,
12 pareto_front: list[dict[str, typing.Any]] | None = None,
13 selection_meta: dict[str, typing.Any] | None = None
14)

Parameters:

round_handle
Any
context
opik_optimizer.core.state.OptimizationContext | None
best_score
float | None
best_candidate
typing.Any | None
best_prompt
typing.Any | None
stop_reason
str | None
extras
dict[str, typing.Any] | None
candidates
list[dict[str, typing.Any]] | None
timestamp
str | None
dataset_split
str | None
pareto_front
list[dict[str, typing.Any]] | None
selection_meta
dict[str, typing.Any] | None

post_trial

1post_trial(
2 context: OptimizationContext,
3 candidate_handle: Any,
4 score: float | None,
5 metrics: dict[str, typing.Any] | None = None,
6 extras: dict[str, typing.Any] | None = None,
7 candidates: list[dict[str, typing.Any]] | None = None,
8 dataset: str | None = None,
9 dataset_split: str | None = None,
10 trial_index: int | None = None,
11 timestamp: str | None = None,
12 round_handle: typing.Any | None = None
13)

Parameters:

context
OptimizationContext
candidate_handle
Any
score
float | None
metrics
dict[str, typing.Any] | None
extras
dict[str, typing.Any] | None
candidates
list[dict[str, typing.Any]] | None
dataset
str | None
dataset_split
str | None
trial_index
int | None
timestamp
str | None
round_handle
typing.Any | None

pre_baseline

1pre_baseline(
2 context: OptimizationContext
3)

Parameters:

context
OptimizationContext

pre_optimize

1pre_optimize(
2 context: OptimizationContext
3)

Parameters:

context
OptimizationContext
The optimization context

pre_round

1pre_round(
2 context: OptimizationContext,
3 extras: Any
4)

Parameters:

context
OptimizationContext
extras
Any

pre_trial

1pre_trial(
2 context: OptimizationContext,
3 candidate: Any,
4 round_handle: typing.Any | None = None
5)

Parameters:

context
OptimizationContext
candidate
Any
round_handle
typing.Any | None

record_candidate_entry

1record_candidate_entry(
2 prompt_or_payload: Any,
3 score: float | None = None,
4 id: str | None = None,
5 metrics: dict[str, typing.Any] | None = None,
6 notes: str | None = None,
7 extra: dict[str, typing.Any] | None = None,
8 context: opik_optimizer.core.state.OptimizationContext | None = None
9)

Parameters:

prompt_or_payload
Any
score
float | None
id
str | None
metrics
dict[str, typing.Any] | None
notes
str | None
extra
dict[str, typing.Any] | None
context
opik_optimizer.core.state.OptimizationContext | None

run_optimization

1run_optimization(
2 context: OptimizationContext
3)

Parameters:

context
OptimizationContext
The optimization context with prompts, dataset, metric, etc.

set_default_dataset_split

1set_default_dataset_split(
2 dataset_split: str | None
3)

Parameters:

dataset_split
str | None

set_pareto_front

1set_pareto_front(
2 pareto_front: list[dict[str, typing.Any]] | None
3)

Parameters:

pareto_front
list[dict[str, typing.Any]] | None

set_selection_meta

1set_selection_meta(
2 selection_meta: dict[str, typing.Any] | None
3)

Parameters:

selection_meta
dict[str, typing.Any] | None

start_candidate

1start_candidate(
2 context: OptimizationContext,
3 candidate: Any,
4 round_handle: typing.Any | None = None
5)

Parameters:

context
OptimizationContext
candidate
Any
round_handle
typing.Any | None

with_dataset_split

1with_dataset_split(
2 dataset_split: str | None
3)

Parameters:

dataset_split
str | None

ChatPrompt

1ChatPrompt(
2 name: str = 'chat-prompt',
3 system: str | None = None,
4 user: str | None = None,
5 messages: list[dict[str, typing.Any]] | None = None,
6 tools: list[dict[str, typing.Any]] | None = None,
7 function_map: dict[str, collections.abc.Callable] | None = None,
8 model: str = 'gpt-4o-mini',
9 model_parameters: dict[str, typing.Any] | None = None,
10 model_kwargs: dict[str, typing.Any] | None = None,
11 **kwargs: Any
12)

Parameters:

name
strDefaults to chat-prompt
system
str | None
the system prompt
user
str | None
messages
list[dict[str, typing.Any]] | None
a list of role/content message dictionaries; message content may contain {input-dataset-field} placeholders that are filled from the dataset item
tools
list[dict[str, typing.Any]] | None
function_map
dict[str, collections.abc.Callable] | None
model
strDefaults to gpt-4o-mini
model_parameters
dict[str, typing.Any] | None
model_kwargs
dict[str, typing.Any] | None
kwargs
Any

Methods

copy

1copy()

get_messages

1get_messages(
2 dataset_item: dict[str, typing.Any] | None = None
3)

Parameters:

dataset_item
dict[str, typing.Any] | None

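A small sketch of constructing a prompt and rendering its messages; the {question} field name is illustrative:

1prompt = ChatPrompt(
2 name="qa-prompt",
3 system="Answer concisely.",
4 user="{question}", # filled from the dataset item
5 model="gpt-4o-mini",
6)
7
8# Render messages for one dataset item (field name is illustrative).
9messages = prompt.get_messages({"question": "What is Opik?"})
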
replace_in_messages

1replace_in_messages(
2 messages: list,
3 label: str,
4 value: str
5)

Parameters:

messages
list
label
str
value
str

set_messages

1set_messages(
2 messages: list
3)

Parameters:

messages
list

to_dict

1to_dict()

AlgorithmResult

1AlgorithmResult(
2 best_prompts: dict,
3 best_score: float,
4 history: Sequence = <factory>,
5 metadata: dict = <factory>
6)

Parameters:

best_prompts
dict
best_score
float
history
SequenceDefaults to <factory>
metadata
dictDefaults to <factory>

OptimizationResult

1OptimizationResult(
2 schema_version: str = 'v1',
3 details_version: str = 'v1',
4 optimizer: str = 'Optimizer',
5 prompt: opik_optimizer.api_objects.chat_prompt.ChatPrompt | dict[str, opik_optimizer.api_objects.chat_prompt.ChatPrompt],
6 score: float,
7 metric_name: str,
8 optimization_id: str | None = None,
9 dataset_id: str | None = None,
10 initial_prompt: opik_optimizer.api_objects.chat_prompt.ChatPrompt | dict[str, opik_optimizer.api_objects.chat_prompt.ChatPrompt] | None = None,
11 initial_score: float | None = None,
12 details: dict[str, Any],
13 history: list[dict[str, Any]] = [],
14 llm_calls: int | None = None,
15 llm_calls_tools: int | None = None,
16 llm_cost_total: float | None = None,
17 llm_token_usage_total: dict[str, int] | None = None
18)

Parameters:

schema_version
strDefaults to v1
details_version
strDefaults to v1
optimizer
strDefaults to Optimizer
prompt
opik_optimizer.api_objects.chat_prompt.ChatPrompt | dict[str, opik_optimizer.api_objects.chat_prompt.ChatPrompt]
score
float
metric_name
str
optimization_id
str | None
dataset_id
str | None
initial_prompt
opik_optimizer.api_objects.chat_prompt.ChatPrompt | dict[str, opik_optimizer.api_objects.chat_prompt.ChatPrompt] | None
initial_score
float | None
details
dict[str, Any]
history
list[dict[str, Any]]Defaults to []
llm_calls
int | None
llm_calls_tools
int | None
llm_cost_total
float | None
llm_token_usage_total
dict[str, int] | None

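These fields can be read directly from the object returned by optimize_prompt(); a brief sketch (the optimizer call itself is abbreviated):

1result = optimizer.optimize_prompt(prompt=prompt, dataset=my_dataset, metric=my_metric)
2
3print(result.score, result.metric_name) # best score achieved
4print(result.initial_score) # baseline score, if recorded
5print(result.llm_calls, result.llm_cost_total) # usage counters
6best_prompt = result.prompt # ChatPrompt or dict of ChatPrompts
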
OptimizationContext

1OptimizationContext(
2 prompts: dict,
3 initial_prompts: dict,
4 is_single_prompt_optimization: bool,
5 dataset: Dataset,
6 evaluation_dataset: Dataset,
7 validation_dataset: opik.api_objects.dataset.dataset.Dataset | None,
8 metric: MetricFunction,
9 agent: opik_optimizer.agents.optimizable_agent.OptimizableAgent | None,
10 optimization: opik.api_objects.optimization.optimization.Optimization | None,
11 optimization_id: str | None,
12 experiment_config: dict[str, typing.Any] | None,
13 n_samples: int | float | str | None,
14 max_trials: int,
15 project_name: str,
16 n_samples_minibatch: int | None = None,
17 n_samples_strategy: str = 'random_sorted',
18 allow_tool_use: bool = True,
19 baseline_score: float | None = None,
20 extra_params: dict = <factory>,
21 trials_completed: int = 0,
22 should_stop: bool = False,
23 finish_reason: Optional = None,
24 current_best_score: float | None = None,
25 current_best_prompt: dict[str, opik_optimizer.api_objects.chat_prompt.ChatPrompt] | None = None,
26 dataset_split: str | None = None
27)

Parameters:

prompts
dict
initial_prompts
dict
is_single_prompt_optimization
bool
dataset
Dataset
evaluation_dataset
Dataset
validation_dataset
opik.api_objects.dataset.dataset.Dataset | None
metric
MetricFunction
agent
opik_optimizer.agents.optimizable_agent.OptimizableAgent | None
optimization
opik.api_objects.optimization.optimization.Optimization | None
optimization_id
str | None
experiment_config
dict[str, typing.Any] | None
n_samples
int | float | str | None
max_trials
int
project_name
str
n_samples_minibatch
int | None
n_samples_strategy
strDefaults to random_sorted
allow_tool_use
boolDefaults to True
baseline_score
float | None
extra_params
dictDefaults to <factory>
trials_completed
intDefaults to 0
should_stop
boolDefaults to False
finish_reason
Optional
current_best_score
float | None
current_best_prompt
dict[str, opik_optimizer.api_objects.chat_prompt.ChatPrompt] | None
dataset_split
str | None

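The context is the object that the lifecycle hooks above (pre_optimize, pre_trial, post_trial, etc.) receive; a simplified sketch of reading a few documented fields inside a custom hook (the subclass shown is hypothetical):

1class MyOptimizer(ParameterOptimizer):
2 def pre_optimize(self, context):
3  print(context.project_name, context.max_trials, context.n_samples)
4  if context.validation_dataset is None:
5   print("no validation dataset; candidates ranked on the training data")
6  super().pre_optimize(context)
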
OptimizationHistoryState

1OptimizationHistoryState(
2 context: Any = None
3)

Parameters:

context
Any

Methods

clear

1clear()

end_round

1end_round(
2 round_handle: Any,
3 best_score: float | None = None,
4 best_candidate: typing.Any | None = None,
5 best_prompt: typing.Any | None = None,
6 stop_reason: str | None = None,
7 extras: dict[str, typing.Any] | None = None,
8 candidates: list[dict[str, typing.Any]] | None = None,
9 timestamp: str | None = None,
10 pareto_front: list[dict[str, typing.Any]] | None = None,
11 selection_meta: dict[str, typing.Any] | None = None,
12 dataset_split: str | None = None
13)

Parameters:

round_handle
Any
best_score
float | None
best_candidate
typing.Any | None
best_prompt
typing.Any | None
stop_reason
str | None
extras
dict[str, typing.Any] | None
candidates
list[dict[str, typing.Any]] | None
timestamp
str | None
pareto_front
list[dict[str, typing.Any]] | None
selection_meta
dict[str, typing.Any] | None
dataset_split
str | None

finalize_stop

1finalize_stop(
2 stop_reason: str | None = None
3)

Parameters:

stop_reason
str | None

get_entries

1get_entries()

get_rounds

1get_rounds()

record_trial

1record_trial(
2 round_handle: Any,
3 score: float | None,
4 candidate: typing.Any | None = None,
5 trial_index: int | None = None,
6 metrics: dict[str, typing.Any] | None = None,
7 dataset: str | None = None,
8 dataset_split: str | None = None,
9 extras: dict[str, typing.Any] | None = None,
10 candidates: list[dict[str, typing.Any]] | None = None,
11 timestamp: str | None = None,
12 stop_reason: str | None = None,
13 candidate_id_prefix: str | None = None
14)

Parameters:

round_handle
Any
score
float | None
candidate
typing.Any | None
trial_index
int | None
metrics
dict[str, typing.Any] | None
dataset
str | None
dataset_split
str | None
extras
dict[str, typing.Any] | None
candidates
list[dict[str, typing.Any]] | None
timestamp
str | None
stop_reason
str | None
candidate_id_prefix
str | None

set_context

1set_context(
2 context: Any
3)

Parameters:

context
Any

set_default_dataset_split

1set_default_dataset_split(
2 dataset_split: str | None
3)

Parameters:

dataset_split
str | None

set_pareto_front

1set_pareto_front(
2 pareto_front: list[dict[str, typing.Any]] | None
3)

Parameters:

pareto_front
list[dict[str, typing.Any]] | None

set_selection_meta

1set_selection_meta(
2 selection_meta: dict[str, typing.Any] | None
3)

Parameters:

selection_meta
dict[str, typing.Any] | None

start_round

1start_round(
2 round_index: int | None = None,
3 extras: dict[str, typing.Any] | None = None,
4 timestamp: str | None = None
5)

Parameters:

round_index
int | None
extras
dict[str, typing.Any] | None
timestamp
str | None

with_dataset_split

1with_dataset_split(
2 dataset_split: str | None
3)

Parameters:

dataset_split
str | None
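Taken together, the methods above support a simple round/trial bookkeeping loop; an illustrative sketch (start_round is assumed to return the handle passed to the later calls, and the score/candidate values are placeholders):

1history = OptimizationHistoryState()
2round_handle = history.start_round(round_index=0)
3history.record_trial(round_handle, score=0.72, candidate=prompt, trial_index=0)
4history.end_round(round_handle, best_score=0.72, best_prompt=prompt)
5rounds = history.get_rounds()
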

OptimizationRound

1OptimizationRound(
2 round_index: int,
3 trials: list = <factory>,
4 best_score: float | None = None,
5 best_so_far: float | None = None,
6 best_prompt: typing.Any | None = None,
7 best_candidate: typing.Any | None = None,
8 candidates: list[dict[str, typing.Any]] | None = None,
9 generated_prompts: list[dict[str, typing.Any]] | None = None,
10 stop_reason: str | None = None,
11 stopped: bool | None = None,
12 dataset_split: str | None = None,
13 extras: dict[str, typing.Any] | None = None,
14 timestamp: str = <factory>
15)

Parameters:

round_index
int
trials
listDefaults to <factory>
best_score
float | None
best_so_far
float | None
best_prompt
typing.Any | None
best_candidate
typing.Any | None
candidates
list[dict[str, typing.Any]] | None
generated_prompts
list[dict[str, typing.Any]] | None
stop_reason
str | None
stopped
bool | None
dataset_split
str | None
extras
dict[str, typing.Any] | None
timestamp
strDefaults to <factory>

Methods

to_dict

1to_dict()

OptimizationTrial

1OptimizationTrial(
2 trial_index: int | None,
3 score: float | None,
4 candidate: Any,
5 metrics: dict[str, typing.Any] | None = None,
6 dataset: str | None = None,
7 dataset_split: str | None = None,
8 candidate_id: str | None = None,
9 extras: dict[str, typing.Any] | None = None,
10 timestamp: str = <factory>
11)

Parameters:

trial_index
int | None
score
float | None
candidate
Any
metrics
dict[str, typing.Any] | None
dataset
str | None
dataset_split
str | None
candidate_id
str | None
extras
dict[str, typing.Any] | None
timestamp
strDefaults to <factory>

Methods

to_dict

1to_dict()

OptimizableAgent

1OptimizableAgent(
2 prompt: Any = None,
3 project_name: Any = None,
4 **kwargs: Any
5)

Parameters:

prompt
Any
project_name
Any
kwargs
Any

Methods

init_agent

1init_agent(
2 prompt: Any
3)

Parameters:

prompt
Any

init_llm

1init_llm()

invoke

1invoke(
2 messages: list,
3 seed: int | None = None
4)

Parameters:

messages
list
List of message dictionaries
seed
int | None
Optional seed for reproducibility

invoke_agent

1invoke_agent(
2 prompts: Any,
3 dataset_item: Any,
4 allow_tool_use: Any = False,
5 seed: Any = None
6)

Parameters:

prompts
Any
dataset_item
Any
allow_tool_use
AnyDefaults to False
seed
Any

invoke_agent_candidates

1invoke_agent_candidates(
2 prompts: Any,
3 dataset_item: Any,
4 allow_tool_use: Any = False,
5 seed: Any = None
6)

Parameters:

prompts
Any
Mapping of prompt name to ChatPrompt.
dataset_item
Any
Dataset row used to render the prompt messages.
allow_tool_use
AnyDefaults to False
Whether tool execution is allowed in this invocation.
seed
Any
Optional seed for reproducibility.

invoke_dataset_item

1invoke_dataset_item(
2 dataset_item: dict
3)

Parameters:

dataset_item
dict

invoke_prompt

1invoke_prompt(
2 prompt: Any,
3 dataset_item: Any,
4 allow_tool_use: Any = False,
5 seed: Any = None
6)

Parameters:

prompt
Any
dataset_item
Any
allow_tool_use
AnyDefaults to False
seed
Any

llm_invoke

1llm_invoke(
2 query: str | None = None,
3 messages: list[dict[str, str]] | None = None,
4 seed: int | None = None,
5 allow_tool_use: bool | None = False
6)

Parameters:

query
str | None
messages
list[dict[str, str]] | None
seed
int | None
allow_tool_use
bool | NoneDefaults to False

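Custom agents are supplied by subclassing; a rough sketch that only overrides invoke() (your version may also require implementing init_llm/init_agent, and my_llm_client is a hypothetical client):

1class MyAgent(OptimizableAgent):
2 def invoke(self, messages, seed=None):
3  # Call your own model stack and return its text output.
4  return my_llm_client.complete(messages)
5
6agent = MyAgent(prompt=prompt, project_name="Optimization")
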
MultiMetricObjective

1MultiMetricObjective(
2 metrics: list,
3 weights: list[float] | None = None,
4 name: str = 'multi_metric_objective',
5 reason: str | None = None,
6 reason_builder: collections.abc.Callable[[list[opik.evaluation.metrics.score_result.ScoreResult], list[float], float], str | None] | None = None
7)

Parameters:

metrics
list
weights
list[float] | None
name
strDefaults to multi_metric_objective
reason
str | None
reason_builder
collections.abc.Callable[[list[opik.evaluation.metrics.score_result.ScoreResult], list[float], float], str | None] | None

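A combination sketch; metric_a and metric_b stand in for your own metric functions:

1objective = MultiMetricObjective(
2 metrics=[metric_a, metric_b], # placeholder metric functions
3 weights=[0.7, 0.3], # relative weight of each metric
4 name="quality_and_brevity",
5)
6# Intended to be passed wherever a single metric is expected, e.g. metric=objective.
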
PromptLibrary

1PromptLibrary(
2 defaults: dict,
3 overrides: dict[str, str] | collections.abc.Callable[[opik_optimizer.utils.prompt_library.PromptLibrary], None] | None = None
4)

Parameters:

defaults
dict
Dictionary of default prompt templates
overrides
dict[str, str] | collections.abc.Callable[[opik_optimizer.utils.prompt_library.PromptLibrary], None] | None
Optional dict or callable to customize prompts

Methods

get

1get(
2 key: str,
3 fmt: object
4)

Parameters:

key
str
The prompt key to retrieve
fmt
object

get_default

1get_default(
2 key: str
3)

Parameters:

key
str
The prompt key to retrieve

keys

1keys()

set

1set(
2 key: str,
3 value: str
4)

Parameters:

key
str
The prompt key to set
value
str
The new prompt template

update

1update(
2 overrides: dict
3)

Parameters:

overrides
dict
Dictionary of key-value pairs to update
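
The prompt_overrides argument on the optimizer constructors above is backed by this class; a sketch of both override forms, using the methods listed above (the dict key shown is hypothetical and must match a DEFAULT_PROMPTS key in your installed version):

1# Dict form: keys must match DEFAULT_PROMPTS keys ("analysis_prompt" is hypothetical).
2optimizer = HierarchicalReflectiveOptimizer(
3 model="gpt-4o",
4 prompt_overrides={"analysis_prompt": "Focus on factual errors first."},
5)
6
7# Callable form: receives the PromptLibrary instance for in-place modification.
8def customize(library):
9 for key in library.keys():
10  library.set(key, library.get_default(key) + "\nBe concise.")
11
12optimizer = HierarchicalReflectiveOptimizer(model="gpt-4o", prompt_overrides=customize)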