Few-shot Bayesian Optimization

The FewShotBayesianOptimizer is a prompt optimization tool that combines few-shot learning with Bayesian optimization. It iteratively improves prompts by learning which examples from your dataset to include and by systematically exploring the space of prompt configurations.

How It Works

  1. Initialization

    • Takes a dataset of input-output pairs
    • Configures optimization parameters
    • Sets up evaluation metrics
  2. Bayesian Optimization

    • Uses Gaussian Process to model the optimization space
    • Selects promising prompt configurations
    • Balances exploration and exploitation (a minimal sketch of this loop follows the list)
  3. Few-shot Learning

    • Dynamically selects relevant examples
    • Adapts to different problem types
    • Optimizes example selection
  4. Evaluation

    • Multi-threaded performance testing
    • Comprehensive metrics tracking
    • Validation against test set
  5. Refinement

    • Iterative improvement based on results
    • Adaptive parameter adjustment
    • Convergence monitoring
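
For intuition, here is a minimal, self-contained sketch of the loop described above: a Gaussian Process with a Matérn-5/2 kernel models the metric score as a function of a single configuration knob (the number of few-shot examples), and an Expected Improvement acquisition picks the next candidate to evaluate. This is illustrative only; the optimizer's actual internals, search space, and defaults may differ, and the scores below are made up.

```python
import numpy as np
from scipy.stats import norm

def matern52(x1, x2, length_scale=1.0):
    # Matern-5/2 kernel: k(r) = (1 + s + s^2/3) * exp(-s), with s = sqrt(5)*r/length_scale
    s = np.sqrt(5) * abs(x1 - x2) / length_scale
    return (1 + s + s**2 / 3) * np.exp(-s)

def gp_posterior(X, y, x_new, noise_level=0.1):
    # Standard GP regression: posterior mean and std at x_new given observations (X, y).
    K = np.array([[matern52(a, b) for b in X] for a in X]) + noise_level * np.eye(len(X))
    k = np.array([matern52(a, x_new) for a in X])
    K_inv = np.linalg.inv(K)
    mu = k @ K_inv @ np.asarray(y)
    var = matern52(x_new, x_new) - k @ K_inv @ k
    return mu, np.sqrt(max(var, 1e-9))

def expected_improvement(mu, sigma, best_y):
    # "ei": expected amount by which a candidate beats the best score seen so far.
    z = (mu - best_y) / sigma
    return (mu - best_y) * norm.cdf(z) + sigma * norm.pdf(z)

candidates = list(range(2, 9))             # min_examples=2 .. max_examples=8
X_obs, y_obs = [2.0, 8.0], [0.61, 0.55]    # initial evaluations (made-up scores)

for _ in range(5):
    ei = [expected_improvement(*gp_posterior(X_obs, y_obs, float(c)), max(y_obs))
          for c in candidates]
    nxt = float(candidates[int(np.argmax(ei))])  # most promising candidate
    score = 0.70 - 0.01 * abs(nxt - 5)           # stand-in for a real prompt evaluation
    X_obs.append(nxt)
    y_obs.append(score)

best = int(np.argmax(y_obs))
print(f"best number of examples: {X_obs[best]:.0f} (score {y_obs[best]:.3f})")
```

In the real optimizer these pieces correspond to the kernel, acquisition_function, length_scale, and noise_level options shown in the configuration sections below.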

Configuration Options

Basic Configuration

```python
from opik_optimizer import FewShotBayesianOptimizer

optimizer = FewShotBayesianOptimizer(
    model="openai/gpt-4",  # or "azure/gpt-4"
    project_name="my-project",
    temperature=0.1,
    max_tokens=5000,
    num_threads=8,
    seed=42
)
```
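
The model string follows LiteLLM's provider/model convention (for example, "openai/gpt-4" or "azure/gpt-4"), so the same configuration works across providers; see Model Support below.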

Advanced Configuration

```python
optimizer = FewShotBayesianOptimizer(
    model="openai/gpt-4",
    project_name="my-project",
    temperature=0.1,
    max_tokens=5000,
    num_threads=8,
    seed=42,
    min_examples=2,             # Fewest few-shot examples per candidate prompt
    max_examples=8,             # Most few-shot examples per candidate prompt
    n_initial_prompts=5,        # Prompts sampled before Bayesian optimization begins
    n_iterations=10,            # Bayesian optimization iterations
    acquisition_function="ei",  # Expected Improvement
    kernel="matern",            # Kernel for Gaussian Process
    length_scale=1.0,           # Kernel length scale
    noise_level=0.1             # Observation noise level
)
```
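
As a rough guide (this is general Gaussian Process behavior, not documented defaults): a smaller length_scale lets the surrogate model's predictions change quickly between nearby configurations, while a larger one assumes smoother behavior; a higher noise_level tells the model to trust individual scores less, which can help when evaluations are noisy, as LLM-based metrics often are.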

Example Usage

```python
# Run optimization
results = optimizer.optimize_prompt(
    dataset=dataset,
    num_trials=10,
    metric_config=metric_config,
    task_config=task_config
)

# Access results
results.display()
```

```python
from opik_optimizer import FewShotBayesianOptimizer
from opik.evaluation.metrics import LevenshteinRatio
from opik_optimizer import (
    MetricConfig,
    TaskConfig,
    from_llm_response_text,
    from_dataset_field,
)
from opik_optimizer.demo import get_or_create_dataset

# Initialize optimizer
optimizer = FewShotBayesianOptimizer(
    model="openai/gpt-4",
    temperature=0.1,
    max_tokens=5000
)

# Prepare dataset
dataset = get_or_create_dataset("hotpot-300")

# Define metric and task configuration (see docs for more options)
metric_config = MetricConfig(
    metric=LevenshteinRatio(),
    inputs={
        "output": from_llm_response_text(),  # Model's output
        "reference": from_dataset_field(name="answer"),  # Ground truth
    }
)
task_config = TaskConfig(
    type="text_generation",
    instruction_prompt="Provide an answer to the question.",
    input_dataset_fields=["question"],
    output_dataset_field="answer",
    use_chat_prompt=True
)

# Run optimization
results = optimizer.optimize_prompt(
    dataset=dataset,
    metric_config=metric_config,
    task_config=task_config
)

# Access results
results.display()
```
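
In this example, LevenshteinRatio scores the model's response against the dataset's answer field, and the optimizer searches for the instruction and few-shot examples that maximize that score.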

Model Support

The FewShotBayesianOptimizer supports all models available through LiteLLM. For a complete list of supported models and providers, see the LiteLLM Integration documentation.

Common Providers

  • OpenAI (gpt-4, gpt-3.5-turbo, etc.)
  • Azure OpenAI
  • Anthropic (Claude)
  • Google (Gemini)
  • Mistral
  • Cohere

Configuration Example

```python
optimizer = FewShotBayesianOptimizer(
    model="openai/gpt-4",  # or any LiteLLM supported model
    project_name="my-project",
    temperature=0.1,
    max_tokens=5000
)
```

Best Practices

  1. Dataset Preparation

    • Minimum 50 examples recommended
    • Diverse and representative samples
    • Clear input-output pairs
  2. Parameter Tuning

    • Start with default parameters
    • Adjust based on problem complexity
    • Monitor convergence metrics
  3. Evaluation Strategy

    • Use a separate validation set (see the split sketch after this list)
    • Track multiple metrics
    • Document optimization history
  4. Performance Optimization

    • Adjust num_threads based on resources
    • Balance min_examples and max_examples
    • Monitor memory usage
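
As a concrete starting point for the evaluation strategy above, here is a minimal sketch of holding out a validation split before optimization. It assumes your raw examples are plain dicts; split_examples and the field names are illustrative, not part of the opik_optimizer API.

```python
import random

def split_examples(examples, val_fraction=0.2, seed=42):
    """Shuffle deterministically, then hold out a validation slice."""
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]  # (train, validation)

# Hypothetical raw data; in practice, load your own input-output pairs.
examples = [{"question": f"q{i}", "answer": f"a{i}"} for i in range(300)]
train_examples, val_examples = split_examples(examples)
print(len(train_examples), len(val_examples))  # 240 60
```

Run the optimizer against the training split only, and score the final optimized prompt on the held-out validation examples to check that the improvement generalizes.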
