Import/Export via the command line

The opik export and opik import commands enable you to:

  • Export: Export specific traces, spans, datasets, prompts, and experiments from a workspace to local JSON or CSV files
  • Import: Import data from local JSON files into a project
  • Migrate: Move data between projects or environments, including experiments and prompts
  • Backup: Create local backups of specific project data

opik export WORKSPACE TYPE NAME

Exports specific data types from the specified workspace to local files.

Arguments:

  • WORKSPACE: The workspace name to export from
  • TYPE: The type of data to export (dataset, project, experiment, or prompt)
  • NAME: Exact name of the item to export

Options:

  • --path, -p: Directory to save exported data (default: opik_exports)
  • --max-results: Maximum number of items to export per data type (default: 1000)
  • --filter: Filter string using Opik Query Language (OQL) to narrow down the search (for projects only)
  • --force: Re-download items even if they already exist locally
  • --format: Format for exporting data (json or csv, default: json)
  • --debug: Enable debug output to show detailed information about the export process

Experiment-specific options:

  • --dataset NAME: Filter experiments by dataset name (only experiments using this dataset will be exported)
  • --max-traces INTEGER: Maximum number of traces to export (limits total traces downloaded)

Examples:

# Export specific dataset by exact name
opik export my-workspace dataset "my-test-dataset"

# Export specific project by exact name
opik export my-workspace project "my-project" --filter "status = 'completed'"

# Export specific experiment by exact name
opik export my-workspace experiment "my-experiment" --force

# Export experiment with dataset filtering
opik export my-workspace experiment "my-experiment" --dataset "my-dataset"

# Export experiment with trace limit
opik export my-workspace experiment "my-experiment" --max-traces 100

# Export specific prompt by exact name
opik export my-workspace prompt "my-template"

# Export with custom output directory
opik export my-workspace dataset "my-dataset" --path ./backup_data

# Export using default directory (opik_exports)
opik export my-workspace dataset "my-dataset"

# Export with filter and limit
opik export my-workspace project "my-project" --filter "start_time >= '2024-01-01T00:00:00Z'" --max-results 100

# Export traces in CSV format for analysis
opik export my-workspace project "my-project" --format csv --path ./csv_data

# Export with debug output
opik export my-workspace dataset "my-dataset" --debug --force

# Export datasets in CSV format for analysis
opik export my-workspace dataset "my-dataset" --format csv --path ./analysis_data

# Export prompts in CSV format for analysis
opik export my-workspace prompt "my-template" --format csv --path ./analysis_data

# Export experiments in CSV format for analysis
opik export my-workspace experiment "my-experiment" --format csv --path ./analysis_data

opik import WORKSPACE TYPE NAME

Imports specific data types from local files to the specified workspace.

Arguments:

  • WORKSPACE: The workspace name to import to
  • TYPE: The type of data to import (dataset, project, experiment, or prompt)
  • NAME: Name pattern to match items (case-insensitive substring matching)

Options:

  • --path, -p: Directory containing exported data (default: opik_exports)
  • --dry-run: Show what would be imported without actually importing
  • --debug: Enable debug output to show detailed information about the import process

Note: Experiment imports automatically recreate experiments where possible. No additional flags are needed.

Examples:

# Import datasets from default directory (opik_exports)
opik import my-workspace dataset "my-dataset"

# Import projects from default directory
opik import my-workspace project "my-project"

# Import experiments from default directory (automatically recreates experiments)
opik import my-workspace experiment "my-experiment"

# Import prompts from default directory
opik import my-workspace prompt "my-prompt"

# Import with name pattern matching
opik import my-workspace dataset "test"

# Import from custom path
opik import my-workspace dataset "my-dataset" --path ./custom-exports

# Dry run to see what would be imported
opik import my-workspace project "my-project" --dry-run

# Import with debug output
opik import my-workspace experiment "my-experiment" --debug

File Format

JSON Format (Default)

The exported data is stored in JSON files with the following structure:

OUTPUT_DIR/
└── WORKSPACE/
    ├── datasets/
    │   ├── dataset_DATASET_NAME_1.json
    │   └── dataset_DATASET_NAME_2.json
    ├── projects/
    │   └── PROJECT_NAME/
    │       ├── trace_TRACE_ID_1.json
    │       └── trace_TRACE_ID_2.json
    ├── experiments/
    │   ├── experiment_EXPERIMENT_NAME_1.json
    │   └── experiment_EXPERIMENT_NAME_2.json
    └── prompts/
        ├── prompt_PROMPT_NAME_1.json
        └── prompt_PROMPT_NAME_2.json

Each trace file contains:

{
  "trace": {
    "id": "trace-uuid",
    "name": "trace-name",
    "start_time": "2024-01-01T00:00:00Z",
    "end_time": "2024-01-01T00:01:00Z",
    "input": {...},
    "output": {...},
    "metadata": {...},
    "tags": [...],
    "thread_id": "thread-uuid"
  },
  "spans": [
    {
      "id": "span-uuid",
      "name": "span-name",
      "start_time": "2024-01-01T00:00:00Z",
      "end_time": "2024-01-01T00:01:00Z",
      "input": {...},
      "output": {...},
      "metadata": {...},
      "type": "general",
      "model": "gpt-4",
      "provider": "openai"
    }
  ],
  "downloaded_at": "2024-01-01T00:00:00Z",
  "project_name": "source-project"
}
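Because each trace file is plain JSON, you can inspect exports programmatically. The following is a minimal Python sketch against the structure above; the directory path reflects the default opik_exports layout and is illustrative:

import json
from pathlib import Path

# Illustrative path; adjust to your --path, workspace, and project name
export_dir = Path("opik_exports/my-workspace/projects/my-project")

for trace_file in sorted(export_dir.glob("trace_*.json")):
    data = json.loads(trace_file.read_text())
    trace = data["trace"]
    spans = data.get("spans", [])
    print(f"{trace['name']} ({trace['id']}): {len(spans)} spans, "
          f"{trace['start_time']} -> {trace['end_time']}")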

Each evaluation rule file contains:

{
  "id": "rule-uuid",
  "name": "rule-name",
  "project_id": "project-uuid",
  "project_name": "project-name",
  "sampling_rate": 1.0,
  "enabled": true,
  "filters": [...],
  "action": "evaluator",
  "type": "llm_as_judge",
  "created_at": "2024-01-01T00:00:00Z",
  "created_by": "user-id",
  "last_updated_at": "2024-01-01T00:00:00Z",
  "last_updated_by": "user-id",
  "evaluator_data": {
    "llm_as_judge_code": {
      "prompt": "Evaluate the response...",
      "model": "gpt-4",
      "temperature": 0.0
    }
  },
  "downloaded_at": "2024-01-01T00:00:00Z"
}
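To review exported evaluation rules locally, a sketch like the following can work; it assumes the rule JSON files sit somewhere under the exported workspace directory, and the search pattern is purely illustrative:

import json
from pathlib import Path

workspace_dir = Path("opik_exports/my-workspace")  # illustrative path

# Walk the export and list enabled evaluation rules
for rule_file in workspace_dir.rglob("*rule*.json"):  # illustrative file pattern
    rule = json.loads(rule_file.read_text())
    if rule.get("enabled"):
        print(f"{rule['name']} ({rule['type']}) on project {rule['project_name']}, "
              f"sampling_rate={rule['sampling_rate']}")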

Each experiment file contains:

{
  "experiment": {
    "id": "experiment-uuid",
    "name": "experiment-name",
    "dataset_name": "dataset-name",
    "type": "regular",
    "metadata": {...},
    "created_at": "2024-01-01T00:00:00Z"
  },
  "items": [
    {
      "trace_id": "trace-uuid",
      "dataset_item_id": "dataset-item-uuid",
      "dataset_item_data": {...},
      "feedback_scores": [...],
      "trace_reference": {
        "trace_id": "trace-uuid",
        "note": "Full trace data not included to avoid duplication"
      }
    }
  ],
  "downloaded_at": "2024-01-01T00:00:00Z"
}
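You can summarize an exported experiment without re-importing it. This sketch aggregates feedback scores per metric, assuming each entry in feedback_scores carries a name and a value field (the file path is illustrative):

import json
from collections import defaultdict
from pathlib import Path

# Illustrative path; adjust to your export directory and experiment file name
experiment_file = Path("opik_exports/my-workspace/experiments/experiment_my-experiment_1.json")
data = json.loads(experiment_file.read_text())

totals = defaultdict(list)
for item in data.get("items", []):
    for score in item.get("feedback_scores") or []:
        totals[score["name"]].append(score["value"])  # assumes name/value keys

for metric, values in totals.items():
    print(f"{metric}: mean={sum(values) / len(values):.3f} over {len(values)} items")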

Each prompt file contains:

{
  "name": "prompt-name",
  "current_version": {
    "prompt": "Your prompt template here...",
    "metadata": {...},
    "type": "MUSTACHE",
    "commit": "commit-hash"
  },
  "history": [
    {
      "prompt": "Previous version of the prompt...",
      "metadata": {...},
      "type": "MUSTACHE",
      "commit": "previous-commit-hash"
    }
  ],
  "downloaded_at": "2024-01-01T00:00:00Z"
}
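To review the version history of an exported prompt, a minimal sketch like this follows the structure above (the file path is illustrative):

import json
from pathlib import Path

# Illustrative path; adjust to your export directory and prompt file name
prompt_file = Path("opik_exports/my-workspace/prompts/prompt_my-template_1.json")
data = json.loads(prompt_file.read_text())

print(f"Prompt: {data['name']}")
print(f"Current commit: {data['current_version']['commit']}")
for version in data.get("history", []):
    preview = version["prompt"].replace("\n", " ")[:60]
    print(f"  {version['commit']}: {preview}...")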

CSV Format

When using --format csv, data is exported as CSV files with a flattened structure. This format is ideal for:

  • Data Analysis: Easy to import into Excel, Google Sheets, or data analysis tools
  • Large Datasets: More efficient storage for large numbers of traces
  • Spreadsheet Integration: Direct compatibility with business intelligence tools

CSV File Structure:

OUTPUT_DIR/
└── WORKSPACE/
    ├── datasets/
    │   └── datasets_DATASET_NAME.csv        # Dataset data in CSV format
    ├── projects/
    │   └── PROJECT_NAME/
    │       └── traces_PROJECT_NAME.csv      # All traces in a single CSV file
    ├── experiments/
    │   └── experiments_EXPERIMENT_NAME.csv  # Experiment data in CSV format
    └── prompts/
        └── prompts_PROMPT_NAME.csv          # Prompt data in CSV format

CSV Format Benefits:

  • Single File: All data combined into one CSV file per data type
  • Flattened Structure: Nested JSON data is flattened with dot notation
  • Column Headers: Clear column names for easy analysis
  • Compatible: Works with Excel, Google Sheets, pandas, etc.
  • Universal Format: All data types (datasets, projects, experiments, prompts) support CSV export

Example CSV Structure:

trace_id,trace_name,start_time,end_time,thread_id,span_id,span_name,span_type,span_model,span_provider,input,output,metadata
trace-123,my-trace,2024-01-01T00:00:00Z,2024-01-01T00:01:00Z,thread-456,span-789,llm-call,llm,gpt-4,openai,"{""prompt"":""Hello""}","{""response"":""Hi""}","{""tokens"":10}"
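The flattened CSV loads directly into pandas. A minimal sketch, assuming the column names shown above and a path following the CSV file structure (both illustrative):

import json
import pandas as pd

# Illustrative path; adjust to your --path, workspace, and project name
df = pd.read_csv("./csv_data/my-workspace/projects/my-project/traces_my-project.csv")

# input/output/metadata are stored as JSON strings; parse them back into dicts
for column in ("input", "output", "metadata"):
    if column in df.columns:
        df[column] = df[column].apply(lambda v: json.loads(v) if isinstance(v, str) else v)

print(df[["trace_id", "span_name", "span_model"]].head())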

Use Cases

1. Project Migration

# Export all data from the source project
opik export my-workspace project "old-project" --path ./migration_data

# Import into the new workspace
opik import new-workspace project "old-project" --path ./migration_data

2. Data Backup

# Create backup of specific data (requires knowing exact names)
opik export my-workspace dataset "my-dataset" --path ./backup_$(date +%Y%m%d)
opik export my-workspace project "my-project" --path ./backup_$(date +%Y%m%d)
opik export my-workspace experiment "my-experiment" --path ./backup_$(date +%Y%m%d)
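If you back up several items on a schedule, a small wrapper around the CLI keeps the list of items in one place. This is a sketch only; it assumes the opik CLI is on your PATH, and the workspace and item names are placeholders for your own data:

import subprocess
from datetime import date

WORKSPACE = "my-workspace"
ITEMS = [  # (type, exact name) pairs to back up -- adjust to your data
    ("dataset", "my-dataset"),
    ("project", "my-project"),
    ("experiment", "my-experiment"),
]

backup_dir = f"./backup_{date.today():%Y%m%d}"
for item_type, name in ITEMS:
    # Invoke the documented export command for each item
    subprocess.run(
        ["opik", "export", WORKSPACE, item_type, name, "--path", backup_dir],
        check=True,
    )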

3. Environment Sync

# Sync from staging to production
opik export my-workspace project "staging-project" --filter "tags contains 'ready-for-prod'"
opik import my-workspace project "staging-project"

4. Data Analysis

# Export specific traces for analysis
opik export my-workspace project "my-project" --filter "start_time >= '2024-01-01T00:00:00Z'" --max-results 1000
# Analyze the JSON files locally
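For the local analysis step, a minimal Python sketch like the following reports span counts and trace durations; it assumes the trace file structure documented above, and the paths are illustrative:

import json
from datetime import datetime
from pathlib import Path

# Illustrative default export location for the project above
project_dir = Path("opik_exports/my-workspace/projects/my-project")

durations = []
for trace_file in project_dir.glob("trace_*.json"):
    data = json.loads(trace_file.read_text())
    trace = data["trace"]
    start = datetime.fromisoformat(trace["start_time"].replace("Z", "+00:00"))
    end = datetime.fromisoformat(trace["end_time"].replace("Z", "+00:00"))
    durations.append((end - start).total_seconds())

if durations:
    print(f"{len(durations)} traces, avg duration {sum(durations) / len(durations):.2f}s")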

5. Dataset Management

# Export specific dataset from a workspace
opik export my-workspace dataset "my-dataset"

# Import the dataset into another workspace (uses default opik_exports directory)
opik import another-workspace dataset "my-dataset"

6. Data Analysis with CSV

# Export traces in CSV format for analysis
opik export my-workspace project "my-project" --format csv --path ./analysis_data

# Export datasets in CSV format for analysis
opik export my-workspace dataset "my-dataset" --format csv --path ./analysis_data

# Export experiments in CSV format for analysis
opik export my-workspace experiment "my-experiment" --format csv --path ./analysis_data

# Export prompts in CSV format for analysis
opik export my-workspace prompt "my-template" --format csv --path ./analysis_data

# Open in Excel or Google Sheets for analysis
# Or use with pandas in Python:
# import pandas as pd
# df = pd.read_csv('./analysis_data/my-workspace/projects/my-project/traces_my-project.csv')

7. Prompt Management

# Export specific prompt from a workspace
opik export my-workspace prompt "my-template" --path ./prompt_backup

# Export another prompt template
opik export my-workspace prompt "system-prompt" --path ./templates

# Import prompts into another workspace
opik import another-workspace prompt "my-template" --path ./prompt_backup

# Import with name pattern matching
opik import another-workspace prompt "production" --path ./prompt_backup

8. Experiment Migration

# Export specific experiment from source workspace
opik export my-workspace experiment "my-experiment" --path ./experiment_data

# Export experiment with dataset filtering
opik export my-workspace experiment "evaluation-exp" --dataset "test-dataset"

# Export experiment with trace limit (useful for large experiments)
opik export my-workspace experiment "large-experiment" --max-traces 50

# Import experiments (automatically recreates experiments)
opik import my-workspace experiment "my-experiment" --path ./experiment_data

# Import experiments matching name pattern
opik import my-workspace experiment "evaluation" --path ./experiment_data

Troubleshooting

Common Issues

  1. "No traces found"

    • Check if the project name is correct
    • Verify you have access to the project
    • Try without filters first
  2. "Project directory not found"

    • Make sure you've exported data first
    • Check the input directory path
    • Verify the project name matches
  3. "Opik SDK not available"

    • Ensure Opik is properly installed
    • Check your Python environment
    • Verify the installation with opik healthcheck
  4. "Dataset/Project/Experiment/Prompt not found"

    • Check that the exact name is correct
    • Verify you have access to the item
    • Use --debug for more detailed error information
  5. "No datasets/projects found"

    • The system will show available items to help you choose the right name
    • Check spelling and case sensitivity
    • Ensure the item exists in the workspace
  6. "Dataset not found"

    • The system will show datasets used by matching experiments
    • Verify the dataset name is correct
    • Use --debug to see detailed search information

Getting Help

# Get help for export command
opik export --help

# Get help for import command
opik import --help

# Check system health
opik healthcheck

Example Workflow

Here’s a complete example of exporting and importing trace data:

JSON Format Workflow

# 1. Export specific data from source workspace (JSON format)
opik export my-workspace dataset "my-dataset" --path ./temp_data
opik export my-workspace project "my-project" --path ./temp_data
opik export my-workspace experiment "my-experiment" --path ./temp_data
opik export my-workspace prompt "my-template" --path ./temp_data

# Alternative: Export experiment with specific dataset filtering
opik export my-workspace experiment "evaluation-exp" --dataset "test-dataset" --max-traces 100 --path ./temp_data

# 2. Inspect the exported data
ls ./temp_data/my-workspace/datasets/
ls ./temp_data/my-workspace/projects/
ls ./temp_data/my-workspace/experiments/
ls ./temp_data/my-workspace/prompts/
cat ./temp_data/my-workspace/projects/my-project/trace_*.json | head -20

# 3. Dry run import to see what would be imported
opik import my-workspace dataset "my-dataset" --path ./temp_data --dry-run
opik import my-workspace project "my-project" --path ./temp_data --dry-run
opik import my-workspace experiment "my-experiment" --path ./temp_data --dry-run
opik import my-workspace prompt "my-template" --path ./temp_data --dry-run

# 4. Actually import all data including experiments and prompts
opik import my-workspace dataset "my-dataset" --path ./temp_data
opik import my-workspace project "my-project" --path ./temp_data
opik import my-workspace experiment "my-experiment" --path ./temp_data
opik import my-workspace prompt "my-template" --path ./temp_data

# 5. Clean up temporary data
rm -rf ./temp_data

CSV Format Workflow

# 1. Export data in CSV format for analysis
opik export my-workspace project "my-source-project" --format csv --path ./csv_data
opik export my-workspace dataset "my-dataset" --format csv --path ./csv_data
opik export my-workspace experiment "my-experiment" --format csv --path ./csv_data
opik export my-workspace prompt "my-template" --format csv --path ./csv_data

# 2. Inspect the CSV files
ls ./csv_data/my-workspace/projects/my-source-project/
ls ./csv_data/my-workspace/datasets/
ls ./csv_data/my-workspace/experiments/
ls ./csv_data/my-workspace/prompts/

# All traces for the project are combined into a single CSV file
head -5 ./csv_data/my-workspace/projects/my-source-project/traces_my-source-project.csv
head -5 ./csv_data/my-workspace/datasets/datasets_my-dataset.csv

# 3. Analyze with pandas (optional)
python -c "
import pandas as pd

df_traces = pd.read_csv('./csv_data/my-workspace/projects/my-source-project/traces_my-source-project.csv')
df_datasets = pd.read_csv('./csv_data/my-workspace/datasets/datasets_my-dataset.csv')
print(f'Exported {len(df_traces)} trace records')
print(f'Exported {len(df_datasets)} dataset records')
print('Trace columns:', df_traces.columns.tolist())
print('Dataset columns:', df_datasets.columns.tolist())
"

# 4. For import, you would need to convert back to JSON format
# (CSV format is primarily for analysis, not import)

This workflow lets you safely migrate all data, including experiments and prompts, between workspaces while maintaining data integrity and keeping the process visible at each step. The CSV format is particularly useful for data analysis and reporting, while the JSON format preserves the complete structure needed to recreate experiments and prompts. The command structure, with separate commands for datasets, projects, experiments, and prompts, makes it easier to manage specific data types.