🚀 Gretel to Opik Integration: Creating Q&A Datasets for Model Evaluation

The Story: You need high-quality Q&A datasets to evaluate your AI models, but creating them manually is time-consuming and expensive. This cookbook shows you how to use Gretel’s synthetic data generation to create diverse, realistic Q&A datasets and import them into Opik for model evaluation and optimization.

What you’ll accomplish:

  1. Generate synthetic Q&A data using Gretel Data Designer
  2. Convert it to Opik format
  3. Import into Opik for model evaluation
  4. See your dataset in the Opik UI

📋 Prerequisites

  • Gretel Account: Sign up at gretel.ai and get your API key
  • Comet Account: Sign up at comet.com for Opik access

Let’s get started! 🎯

🛠️ Two Approaches Available

This cookbook demonstrates two methods for generating synthetic data with Gretel:

  1. Data Designer (recommended for custom datasets): Create datasets from scratch with precise control
  2. Safe Synthetics (recommended for existing data): Generate synthetic versions of existing datasets

We’ll start with Data Designer, then show Safe Synthetics as an alternative.

💾 Step 1: Install Required Packages

We’ll install the Gretel client, the Opik SDK, and pandas:

```python
%pip install gretel-client opik pandas --upgrade --quiet
```

🔐 Step 2: Authentication Setup

Let’s authenticate with both Gretel and Opik:

```python
import os
import getpass

import opik
import pandas as pd

print("🔐 Setting up authentication...")

# Set up Gretel API key
if "GRETEL_API_KEY" not in os.environ:
    os.environ["GRETEL_API_KEY"] = getpass.getpass("Enter your Gretel API key: ")

# Set up Opik (will prompt for API key if not configured)
opik.configure()

print("✅ Authentication completed!")
```

📊 Step 3: Generate Q&A Dataset with Gretel Data Designer

Now we’ll use Gretel Data Designer to generate synthetic Q&A data. We’ll create questions and answers about AI and machine learning:

```python
from gretel_client.navigator_client import Gretel  # Data Designer is accessed via the navigator client
from gretel_client.data_designer import columns as C
from gretel_client.data_designer import params as P

print("🤖 Setting up Q&A dataset generation with Gretel Data Designer...")

# Initialize Data Designer using the navigator client and its factory method
gretel_navigator = Gretel()
dd = gretel_navigator.data_designer.new(model_suite="apache-2.0")

# Add topic column (categorical sampler)
dd.add_column(
    C.SamplerColumn(
        name="topic",
        type=P.SamplerType.CATEGORY,
        params=P.CategorySamplerParams(
            values=[
                "neural networks", "deep learning", "machine learning", "NLP",
                "computer vision", "reinforcement learning", "AI ethics", "data science",
            ]
        ),
    )
)

# Add difficulty column
dd.add_column(
    C.SamplerColumn(
        name="difficulty",
        type=P.SamplerType.CATEGORY,
        params=P.CategorySamplerParams(
            values=["beginner", "intermediate", "advanced"]
        ),
    )
)

# Add question column (LLM-generated)
dd.add_column(
    C.LLMTextColumn(
        name="question",
        prompt=(
            "Generate a challenging, specific question about {{ topic }} "
            "at {{ difficulty }} level. The question should be clear, focused, "
            "and something a student or practitioner might actually ask."
        ),
    )
)

# Add answer column (LLM-generated)
dd.add_column(
    C.LLMTextColumn(
        name="answer",
        prompt=(
            "Provide a clear, accurate, and comprehensive answer to this {{ difficulty }}-level "
            "question about {{ topic }}: '{{ question }}'. The answer should be educational "
            "and directly address all aspects of the question."
        ),
    )
)

print("📊 Generating Q&A dataset...")

# Generate the dataset
workflow_run = dd.create(num_records=20, wait_until_done=True)
synthetic_df = workflow_run.dataset.df

print(f"✅ Generated {len(synthetic_df)} Q&A pairs!")
print(f"\n📊 Dataset shape: {synthetic_df.shape}")
print(f"📋 Columns: {list(synthetic_df.columns)}")

# Display first few rows
print("\n📄 Sample data:")
synthetic_df.head(3)
```
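LLM-generated columns can occasionally come back empty or truncated, so it can be worth a quick sanity pass before converting. The helper below is a minimal sketch (not part of the Gretel SDK); the column names match the ones defined above, and the toy DataFrame simply stands in for `synthetic_df`:

```python
import pandas as pd

def drop_incomplete_pairs(df, min_answer_chars=20):
    """Drop rows where the question or answer is missing or suspiciously short."""
    mask = (
        df["question"].fillna("").str.strip().str.len().gt(0)
        & df["answer"].fillna("").str.strip().str.len().ge(min_answer_chars)
    )
    return df[mask].reset_index(drop=True)

# Toy example (stand-in for synthetic_df)
df = pd.DataFrame({
    "question": ["What is overfitting?", "", "Define dropout"],
    "answer": [
        "Overfitting is when a model memorizes training data instead of generalizing.",
        "x",
        None,
    ],
})
clean = drop_incomplete_pairs(df)
print(len(clean))  # only the first row survives
```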

🔄 Step 4: Convert to Opik Format

Let’s convert our Gretel-generated data to the format Opik expects:

```python
import json

def convert_to_opik_format(df):
    """Convert Gretel Q&A data to Opik dataset format."""
    opik_items = []

    for _, row in df.iterrows():
        # Create Opik dataset item
        item = {
            "input": {
                "question": row["question"]
            },
            "expected_output": row["answer"],
            "metadata": {
                "topic": row.get("topic", "AI/ML"),
                "difficulty": row.get("difficulty", "unknown"),
                "source": "gretel_navigator"
            }
        }
        opik_items.append(item)

    return opik_items

print("🔄 Converting to Opik format...")

opik_data = convert_to_opik_format(synthetic_df)

print(f"✅ Converted {len(opik_data)} items to Opik format!")
print("\n📋 Sample converted item:")
print(json.dumps(opik_data[0], indent=2))
```
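Before uploading, a quick schema check can catch conversion bugs early. `validate_opik_items` is a small sketch of our own (not an Opik SDK call) that verifies each converted item has the shape built above; the toy item stands in for `opik_data`:

```python
def validate_opik_items(items):
    """Raise AssertionError if any converted item is missing the keys built above."""
    for i, item in enumerate(items):
        assert isinstance(item.get("input"), dict) and "question" in item["input"], f"item {i}: bad input"
        assert isinstance(item.get("expected_output"), str) and item["expected_output"], f"item {i}: bad expected_output"
        assert isinstance(item.get("metadata"), dict), f"item {i}: bad metadata"
    return len(items)

# Toy example (stand-in for opik_data)
items = [{
    "input": {"question": "What is ML?"},
    "expected_output": "A subset of AI.",
    "metadata": {"topic": "ML", "difficulty": "beginner", "source": "gretel_navigator"},
}]
print(validate_opik_items(items))  # 1
```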

📤 Step 5: Push Dataset to Opik

Now let’s upload our dataset to Opik where it can be used for model evaluation:

```python
print("📤 Pushing dataset to Opik...")

# Initialize Opik client
opik_client = opik.Opik()

# Create the dataset
dataset_name = "gretel-ai-qa-dataset"
dataset = opik_client.get_or_create_dataset(
    name=dataset_name,
    description="Synthetic Q&A dataset generated using Gretel Data Designer for AI/ML evaluation",
)

# Insert the data
dataset.insert(opik_data)

print(f"✅ Successfully created dataset: {dataset.name}")
print(f"🆔 Dataset ID: {dataset.id}")
print(f"📊 Total items: {len(opik_data)}")
```

The dataset can now be viewed in the Opik UI:

[Screenshot: gretel_opik_integration]

✅ Step 6: Verify Your Dataset

Let’s confirm the dataset was created successfully and see how to use it:

```python
print("🔍 Verifying dataset creation...")

# Try to retrieve the dataset
try:
    retrieved_dataset = opik_client.get_dataset(dataset_name)
    print(f"✅ Dataset verified: {retrieved_dataset.name}")
    print(f"🆔 Dataset ID: {retrieved_dataset.id}")

    print("\n🎯 Next steps:")
    print("1. Go to https://www.comet.com")
    print("2. Navigate to Opik → Datasets")
    print(f"3. Find your dataset: {dataset_name}")
    print("4. Use it to evaluate your AI models!")

except Exception as e:
    print(f"❌ Could not verify dataset: {e}")
    print("Please check your Opik configuration and try again.")
```

🧪 Step 7: Example Model Evaluation

Here’s how you can use your new dataset to evaluate a model with Opik:

```python
# Example: Simple Q&A model evaluation
@opik.track
def simple_qa_model(input_data):
    """A simple example model that generates responses to questions."""
    question = input_data.get("question", "")

    # This is just an example - replace with your actual model
    if "neural network" in question.lower():
        return "A neural network is a computational model inspired by biological neural networks."
    elif "machine learning" in question.lower():
        return "Machine learning is a subset of AI that enables systems to learn from data."
    else:
        return "This is a complex AI/ML topic that requires detailed explanation."

print("🧪 Example model evaluation setup:")
print(f"Dataset: {dataset_name}")
print("Model: simple_qa_model (replace with your actual model)")
print("\n🎉 Integration complete! Your Gretel-generated dataset is ready for model evaluation in Opik.")
```
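To get a feel for scoring before wiring up Opik's evaluation runner, you can loop over the converted items locally. The sketch below is purely illustrative: `toy_qa_model` mimics the tracked model above (without the Opik decorator, so it runs anywhere), and `token_overlap` is a crude hypothetical metric standing in for the scoring metrics Opik provides:

```python
def toy_qa_model(input_data):
    """Stand-in for simple_qa_model above, minus Opik tracking."""
    question = input_data.get("question", "").lower()
    if "machine learning" in question:
        return "Machine learning is a subset of AI that enables systems to learn from data."
    return "This is a complex AI/ML topic that requires detailed explanation."

def token_overlap(expected, actual):
    """Fraction of expected tokens that also appear in the model output (a crude proxy metric)."""
    expected_tokens = set(expected.lower().split())
    actual_tokens = set(actual.lower().split())
    return len(expected_tokens & actual_tokens) / max(len(expected_tokens), 1)

# Toy items in the same shape as opik_data from Step 4
eval_items = [{
    "input": {"question": "What is machine learning?"},
    "expected_output": "Machine learning is a subset of AI that enables systems to learn from data.",
}]

scores = [token_overlap(it["expected_output"], toy_qa_model(it["input"])) for it in eval_items]
print(f"mean score: {sum(scores) / len(scores):.2f}")  # 1.00 for this toy item
```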

Congratulations! 🎉 You’ve successfully:

  1. Generated synthetic Q&A data using Gretel Data Designer’s advanced column types
  2. Converted the data to Opik’s expected format
  3. Created a dataset in Opik for model evaluation
  4. Set up the foundation for AI model testing and optimization

The key advantage of Gretel Data Designer is its modular approach: you define exactly the data you want using samplers (for categorical columns) and LLM columns (for generated text), giving you precise control over your synthetic dataset.


🔗 Next Steps

  • View your dataset: Go to your Comet workspace → Opik → Datasets
  • Evaluate models: Use the dataset to test your Q&A models
  • Optimize prompts: Use Opik’s Agent Optimizer with your synthetic data
  • Scale up: Generate larger datasets for more comprehensive testing

📚 Resources

  • Gretel Data Designer documentation
  • Gretel Safe Synthetics documentation
  • Opik documentation

Happy evaluating! 🚀

🔄 Alternative: Using Gretel Safe Synthetics

If you have an existing Q&A dataset and want to create a synthetic version, you can use Gretel Safe Synthetics instead:

```python
%%capture
%pip install -U gretel-client
```

Step A: Prepare Sample Data

```python
import pandas as pd
from gretel_client.navigator_client import Gretel

# Initialize Gretel client
gretel = Gretel(api_key="prompt")

# Option 1: Use Gretel's sample ecommerce dataset (has 200+ records)
my_data_source = "https://gretel-datasets.s3.us-west-2.amazonaws.com/ecommerce_customers.csv"

# Option 2: Create your own Q&A dataset (needs 200+ records for holdout)
# For demonstration, we'll repeat a small set of rows to build a larger dataset
sample_questions = [
    'What is machine learning?',
    'How do neural networks work?',
    'What is the difference between AI and ML?',
    'Explain deep learning concepts',
    'What are the applications of NLP?'
] * 50  # Repeat to get 250 records

sample_answers = [
    'Machine learning is a subset of AI that enables systems to learn from data.',
    'Neural networks are computational models inspired by biological neural networks.',
    'AI is the broader concept while ML is a specific approach to achieve AI.',
    'Deep learning uses multi-layer neural networks to model complex patterns.',
    'NLP applications include chatbots, translation, sentiment analysis, and text generation.'
] * 50  # Repeat to get 250 records

sample_data = {
    'question': sample_questions,
    'answer': sample_answers,
    'topic': ['ML', 'Neural Networks', 'AI/ML', 'Deep Learning', 'NLP'] * 50,
    'difficulty': ['beginner', 'intermediate', 'beginner', 'advanced', 'intermediate'] * 50,
}

original_df = pd.DataFrame(sample_data)
print(f"📄 Original dataset: {len(original_df)} records")
print(original_df.head())

# Important: Gretel requires at least 200 records to use holdout
if len(original_df) < 200:
    print("⚠️ Warning: Dataset has fewer than 200 records. Holdout will be disabled.")
```

Step B: Generate Synthetic Version

```python
# For a quick demo with a small dataset, disable holdout
synthetic_dataset = gretel.safe_synthetic_dataset \
    .from_data_source(original_df, holdout=None) \
    .synthesize(num_records=5) \
    .create()

# Wait for completion and get results
synthetic_dataset.wait_until_done()
synthetic_df_safe = synthetic_dataset.dataset.df

print(f"✅ Generated {len(synthetic_df_safe)} synthetic Q&A pairs using Safe Synthetics!")
print(synthetic_df_safe.head())
```

Step C: View Results and Quality Report

```python
# Preview synthetic data
print("🔍 Synthetic dataset preview:")
print(synthetic_dataset.dataset.df.head())

# View quality report table
print("📊 Quality Report Summary:")
print(synthetic_dataset.report.table)

# View detailed HTML report in notebook
# synthetic_dataset.report.display_in_notebook()

# Access workflow details
print("\n🔧 Workflow Configuration:")
print(synthetic_dataset.config_yaml)

# List all workflow steps
print("\n📋 Workflow Steps:")
for step in synthetic_dataset.steps:
    print(f"- {step.name}")
```

Step D: Convert to Opik and Upload

```python
def convert_to_opik_format(df):
    """Convert Gretel Q&A data to Opik dataset format."""
    opik_items = []

    for _, row in df.iterrows():
        # Create Opik dataset item
        item = {
            "input": {
                "question": row["question"]
            },
            "expected_output": row["answer"],
            "metadata": {
                "topic": row.get("topic", "AI/ML"),
                "difficulty": row.get("difficulty", "unknown"),
                "source": "gretel_safe_synthetics"
            }
        }
        opik_items.append(item)

    return opik_items

# Initialize Opik client if not already defined
opik_client = opik.Opik()

# Convert and upload to Opik (same process as before)
opik_data_safe = convert_to_opik_format(synthetic_df_safe)

# Create dataset in Opik
dataset_safe = opik_client.get_or_create_dataset(
    name="gretel-safe-synthetics-qa-dataset",
    description="Synthetic Q&A dataset generated using Gretel Safe Synthetics",
)

dataset_safe.insert(opik_data_safe)
print(f"✅ Safe Synthetics dataset created: {dataset_safe.name}")
```

The dataset can now be viewed in the Opik UI:

[Screenshot: gretel_opik_integration_synthetics]

🚨 Important: Dataset Size Requirements

| Dataset Size | Holdout Setting | Example |
| --- | --- | --- |
| < 200 records | `holdout=None` | `from_data_source(df, holdout=None)` |
| 200+ records | Default (5%) or custom | `from_data_source(df)` or `from_data_source(df, holdout=0.1)` |
| Large datasets | Custom percentage/count | `from_data_source(df, holdout=250)` |
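The rules in the table can be captured in a small helper. `choose_holdout` is a hypothetical convenience function of our own (not part of the Gretel SDK) that returns the value you would pass as `holdout=`:

```python
def choose_holdout(num_records, custom=None):
    """Pick a holdout value per the size rules above; None disables holdout entirely."""
    if num_records < 200:
        return None      # too small: holdout must be disabled
    if custom is not None:
        return custom    # explicit percentage (e.g. 0.1) or record count (e.g. 250)
    return 0.05          # mirror Gretel's default 5% holdout

print(choose_holdout(150))        # None
print(choose_holdout(1000))       # 0.05
print(choose_holdout(5000, 250))  # 250
```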

🤔 When to Use Which Approach?

| Use Case | Recommended Approach | Why |
| --- | --- | --- |
| Creating new datasets from scratch | Data Designer | More control, custom column types, guided generation |
| Synthesizing existing datasets | Safe Synthetics | Preserves statistical relationships, privacy-safe |
| Custom data structures | Data Designer | Flexible column definitions, template system |
| Production data replication | Safe Synthetics | Maintains data utility while ensuring privacy |

Both approaches integrate seamlessly with Opik for model evaluation! 🎯