Synthetic Data Optimizer Cookbook

Advanced example notebook using synthetic datasets

This page is a high-level entry point for the synthetic data workflow. Use the notebook or SDK script to run the full example end-to-end.

Launch the example

The notebook is the fastest way to explore synthetic data optimization in your browser.

Platform                   Launch Link
Google Colab (Preferred)   Open in Colab
GitHub                     View the notebook on GitHub

What this example covers

  • Generating synthetic Q&A datasets from Opik traces
  • Using TinyQA (via tinyqabenchmarkpp) and variants like TinyQA++
  • Optimizing prompts with MetaPrompt on synthetic data
  • Reviewing results in the Opik UI
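
The evaluation step that a prompt optimizer repeats can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the notebook's code and not the Opik SDK API: the dataset, `stub_model`, `exact_match`, `score_prompt`, and `pick_best_prompt` are all hypothetical names invented here. It does not implement MetaPrompt itself; it only scores a fixed set of candidate prompts on a few Q&A items and keeps the highest-scoring one.

```python
# Hypothetical stand-in for a synthetic Q&A dataset. Real items would be
# generated (for example from traces), not hand-written like these.
SYNTHETIC_QA = [
    {"question": "What is 2 + 2?", "answer": "4"},
    {"question": "What is the capital of France?", "answer": "Paris"},
    {"question": "How many days are in a week?", "answer": "7"},
]

# Canned answers used by the stub model below.
_CANNED = {item["question"]: item["answer"] for item in SYNTHETIC_QA}


def stub_model(prompt: str, question: str) -> str:
    """Hypothetical model: replies tersely only when the prompt asks for it."""
    answer = _CANNED[question]
    if "concise" in prompt.lower():
        return answer
    return f"The answer is {answer}."


def exact_match(output: str, expected: str) -> float:
    """Return 1.0 when output equals the expected answer (case-insensitive)."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0


def score_prompt(prompt, dataset, model) -> float:
    """Average the exact-match score of one prompt over the whole dataset."""
    scores = [
        exact_match(model(prompt, item["question"]), item["answer"])
        for item in dataset
    ]
    return sum(scores) / len(scores)


def pick_best_prompt(candidates, dataset, model):
    """Score every candidate prompt and return the best one with its score."""
    scored = [(score_prompt(p, dataset, model), p) for p in candidates]
    best_score, best_prompt = max(scored, key=lambda pair: pair[0])
    return best_prompt, best_score


candidates = [
    "Answer the question.",
    "Answer the question. Be concise: reply with the answer only.",
]
best_prompt, best_score = pick_best_prompt(candidates, SYNTHETIC_QA, stub_model)
print(best_prompt, best_score)
```

In a real run the stub would be replaced by an actual model call, and the candidate prompts would be proposed by the optimizer rather than listed by hand.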

Where the full implementation lives

SDK codebase: browse sdks/opik_optimizer/ for dataset utilities, metrics, and optimizer implementations.

Next steps