Import/Export by command line

The export/import command-line functions enable you to:

  • Export: Save all traces, spans, datasets, prompts, and evaluation rules from a project to local JSON or CSV files
  • Import: Load data from local JSON files into a project
  • Migrate: Move data between projects or environments
  • Backup: Create local backups of your project data

opik export WORKSPACE_OR_PROJECT

Exports all trace data from the specified workspace or project to local files.

Arguments:

  • WORKSPACE_OR_PROJECT: Either a workspace name (e.g., “my-workspace”) to export all projects, or workspace/project (e.g., “my-workspace/my-project”) to export a specific project

Options:

  • --path, -p: Directory to save exported data (default: ./)
  • --max-results: Maximum number of items to export per data type (default: 1000)
  • --filter: Filter string using Opik Query Language (OQL) to narrow down the exported data
  • --include: Data types to include (traces, datasets, prompts)
  • --exclude: Data types to exclude
  • --all: Include all data types
  • --name: Filter items by name using Python regex patterns
  • --trace-format: Format for exporting traces (json or csv, default: json)
  • --debug: Enable debug output to show detailed information about the export process

Examples:

# Export all traces from a project
opik export my-workspace/my-project

# Export all data types from a workspace
opik export my-workspace --all

# Export only datasets
opik export my-workspace/my-project --include datasets

# Export with custom output directory
opik export my-workspace/my-project --path ./backup_data

# Export with filter and limit
opik export my-workspace/my-project --filter "start_time >= '2024-01-01T00:00:00Z'" --max-results 100

# Export traces in CSV format for analysis
opik export my-workspace/my-project --trace-format csv --path ./csv_data

# Export with debug output
opik export my-workspace/my-project --debug --trace-format csv

opik import WORKSPACE_FOLDER WORKSPACE_NAME

Imports trace data from local files to the specified workspace or project.

Arguments:

  • WORKSPACE_FOLDER: Directory containing JSON files to import
  • WORKSPACE_NAME: The name of the workspace or workspace/project to import traces to

Options:

  • --dry-run: Show what would be imported without actually importing
  • --include: Data types to include (traces, datasets, prompts)
  • --exclude: Data types to exclude
  • --all: Include all data types
  • --name: Filter items by name using Python regex patterns

Examples:

# Import traces to a project
opik import ./my-data my-workspace/my-target-project

# Import all data types
opik import ./my-data my-workspace/my-target-project --all

# Import only datasets
opik import ./my-data my-workspace/my-target-project --include datasets

# Import with custom input directory
opik import ./backup_data my-workspace/my-target-project

# Dry run to see what would be imported
opik import ./my-data my-workspace/my-target-project --dry-run

File Format

JSON Format (Default)

The exported data is stored in JSON files with the following structure:

OUTPUT_DIR/
└── WORKSPACE/
    └── PROJECT_NAME/
        ├── trace_TRACE_ID_1.json
        ├── trace_TRACE_ID_2.json
        ├── dataset_DATASET_NAME.json
        └── prompt_PROMPT_NAME.json

Each trace file contains:

{
  "trace": {
    "id": "trace-uuid",
    "name": "trace-name",
    "start_time": "2024-01-01T00:00:00Z",
    "end_time": "2024-01-01T00:01:00Z",
    "input": {...},
    "output": {...},
    "metadata": {...},
    "tags": [...],
    "thread_id": "thread-uuid"
  },
  "spans": [
    {
      "id": "span-uuid",
      "name": "span-name",
      "start_time": "2024-01-01T00:00:00Z",
      "end_time": "2024-01-01T00:01:00Z",
      "input": {...},
      "output": {...},
      "metadata": {...},
      "type": "general",
      "model": "gpt-4",
      "provider": "openai"
    }
  ],
  "downloaded_at": "2024-01-01T00:00:00Z",
  "project_name": "source-project"
}
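
To sanity-check an export, you can walk these files with a few lines of Python. A minimal sketch, assuming the directory layout and file structure shown above; the path is an example, so adjust it to your --path, workspace, and project:

import json
from pathlib import Path

# Example path; matches an export run with --path ./temp_data
export_dir = Path("./temp_data/my-workspace/my-source-project")

for trace_file in sorted(export_dir.glob("trace_*.json")):
    data = json.loads(trace_file.read_text())
    trace = data["trace"]
    print(f"{trace['id']}: {trace['name']} ({len(data.get('spans', []))} spans)")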

Each evaluation rule file contains:

{
  "id": "rule-uuid",
  "name": "rule-name",
  "project_id": "project-uuid",
  "project_name": "project-name",
  "sampling_rate": 1.0,
  "enabled": true,
  "filters": [...],
  "action": "evaluator",
  "type": "llm_as_judge",
  "created_at": "2024-01-01T00:00:00Z",
  "created_by": "user-id",
  "last_updated_at": "2024-01-01T00:00:00Z",
  "last_updated_by": "user-id",
  "evaluator_data": {
    "llm_as_judge_code": {
      "prompt": "Evaluate the response...",
      "model": "gpt-4",
      "temperature": 0.0
    }
  },
  "downloaded_at": "2024-01-01T00:00:00Z"
}
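
If you want a quick inventory of the exported rules, you can detect them by their contents. A rough sketch; the file naming for rule exports isn't shown in the directory layout above, so this scans every JSON file and keys off the "action" field:

import json
from pathlib import Path

export_dir = Path("./temp_data/my-workspace/my-source-project")

# Rule files are identified by their fields, not their file names
for path in sorted(export_dir.glob("*.json")):
    data = json.loads(path.read_text())
    if isinstance(data, dict) and data.get("action") == "evaluator":
        print(f"{data['name']}: enabled={data['enabled']}, sampling_rate={data['sampling_rate']}")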

CSV Format

When using --trace-format csv, traces are exported as CSV files with a flattened data structure. This format is ideal for:

  • Data Analysis: Easy to import into Excel, Google Sheets, or data analysis tools
  • Large Datasets: More efficient storage for large numbers of traces
  • Spreadsheet Integration: Direct compatibility with business intelligence tools

CSV File Structure:

OUTPUT_DIR/
└── WORKSPACE/
    └── PROJECT_NAME/
        ├── traces.csv                # All traces in a single CSV file
        ├── dataset_DATASET_NAME.json
        └── prompt_PROMPT_NAME.json

CSV Format Benefits:

  • Single File: All traces combined into one traces.csv file
  • Flattened Structure: Nested JSON data is flattened with dot notation
  • Column Headers: Clear column names for easy analysis
  • Compatible: Works with Excel, Google Sheets, pandas, etc.

Example CSV Structure:

trace_id,trace_name,start_time,end_time,thread_id,span_id,span_name,span_type,span_model,span_provider,input,output,metadata
trace-123,my-trace,2024-01-01T00:00:00Z,2024-01-01T00:01:00Z,thread-456,span-789,llm-call,llm,gpt-4,openai,"{""prompt"":""Hello""}","{""response"":""Hi""}","{""tokens"":10}"
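
When working with this file programmatically, note that the input, output, and metadata columns hold JSON strings. A minimal sketch with pandas, assuming the columns shown above (the path is an example):

import json
import pandas as pd

df = pd.read_csv("./csv_data/my-workspace/my-project/traces.csv")

# Decode the JSON-encoded columns so nested values are usable in analysis
for col in ("input", "output", "metadata"):
    if col in df.columns:
        df[col] = df[col].map(lambda s: json.loads(s) if isinstance(s, str) else s)

print(df[["trace_id", "span_name", "span_model"]].head())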

Use Cases

1. Project Migration

# Export all data from source project
opik export my-workspace/old-project --all --path ./migration_data

# Import to new project (specify the workspace/project directory)
opik import ./migration_data/my-workspace/old-project my-workspace/new-project --all

2. Data Backup

# Create backup of all data
opik export my-workspace/production-project --all --path ./backup_$(date +%Y%m%d)

3. Environment Sync

# Sync from staging to production
opik export my-workspace/staging-project --filter "tags contains 'ready-for-prod'" --path ./exported_data
opik import ./exported_data my-workspace/production-project

4. Data Analysis

# Export specific traces for analysis
opik export my-workspace/my-project --filter "start_time >= '2024-01-01T00:00:00Z'" --max-results 1000
# Analyze the JSON files locally

5. Dataset Management

# Export datasets from a project
opik export my-workspace/my-project --include datasets --path ./exported_data

# Import datasets to another project
opik import ./exported_data my-workspace/target-project --include datasets

6. Data Analysis with CSV

# Export traces in CSV format for analysis
opik export my-workspace/my-project --trace-format csv --path ./analysis_data

# Open in Excel or Google Sheets for analysis
# Or use with pandas in Python:
# import pandas as pd
# df = pd.read_csv('./analysis_data/my-workspace/my-project/traces.csv')

Error Handling

The commands include comprehensive error handling:

  • Network errors: Automatic retry with user feedback
  • Authentication errors: Clear error messages with setup instructions
  • File system errors: Proper directory creation and permission handling
  • Data validation: JSON format validation and error reporting

Progress Tracking

Both commands show progress indicators:

  • Export: Shows number of traces found and export progress
  • Import: Shows number of files found and import progress
  • Rich output: Color-coded status messages and progress bars

Limitations

  • Large datasets: For projects with many traces, consider using filters to limit exports
  • Network dependency: Requires active connection to Opik server
  • Authentication: Must be properly configured with API keys
  • File size: Large trace files may take time to process

Troubleshooting

Common Issues

  1. “No traces found”

    • Check if the project name is correct
    • Verify you have access to the project
    • Try without filters first
  2. “Project directory not found”

    • Make sure you’ve exported data first
    • Check the input directory path
    • Verify the project name matches
  3. “Opik SDK not available”

    • Ensure Opik is properly installed
    • Check your Python environment
    • Verify the installation with opik healthcheck

Getting Help

# Get help for export command
opik export --help

# Get help for import command
opik import --help

# Check system health
opik healthcheck

Example Workflow

Here’s a complete example of exporting and importing trace data:

JSON Format Workflow

# 1. Export traces from source project (JSON format)
opik export my-workspace/my-source-project --path ./temp_data

# 2. Inspect the exported data
ls ./temp_data/my-workspace/my-source-project/
cat ./temp_data/my-workspace/my-source-project/trace_*.json | head -20

# 3. Dry run import to see what would be imported
opik import ./temp_data my-workspace/my-target-project --dry-run

# 4. Actually import the traces
opik import ./temp_data my-workspace/my-target-project

# 5. Clean up temporary data
rm -rf ./temp_data

CSV Format Workflow

# 1. Export traces in CSV format for analysis
opik export my-workspace/my-source-project --trace-format csv --path ./csv_data

# 2. Inspect the CSV file
ls ./csv_data/my-workspace/my-source-project/
head -5 ./csv_data/my-workspace/my-source-project/traces.csv

# 3. Analyze with pandas (optional)
python -c "
import pandas as pd
df = pd.read_csv('./csv_data/my-workspace/my-source-project/traces.csv')
print(f'Exported {len(df)} trace records')
print(df.columns.tolist())
"

# 4. For import, you would need to convert back to JSON format
# (CSV format is primarily for analysis, not import)
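
If you do need to move CSV-exported traces back through the importer, one option is to regroup the rows into per-trace JSON files first. A rough sketch, assuming one CSV row per span with the columns shown earlier; the input/output/metadata columns are omitted because their trace-versus-span mapping isn't specified by the CSV layout, and the importer's exact schema may differ:

import json
from pathlib import Path

import pandas as pd

df = pd.read_csv("./csv_data/my-workspace/my-source-project/traces.csv")
out_dir = Path("./json_data/my-workspace/my-source-project")
out_dir.mkdir(parents=True, exist_ok=True)

# One CSV row per span; group rows back into one JSON document per trace
for trace_id, rows in df.groupby("trace_id"):
    first = rows.iloc[0]
    doc = {
        "trace": {
            "id": trace_id,
            "name": first["trace_name"],
            "start_time": first["start_time"],
            "end_time": first["end_time"],
            "thread_id": first["thread_id"],
        },
        "spans": [
            {
                "id": row["span_id"],
                "name": row["span_name"],
                "type": row["span_type"],
                "model": row["span_model"],
                "provider": row["span_provider"],
            }
            for _, row in rows.iterrows()
        ],
    }
    (out_dir / f"trace_{trace_id}.json").write_text(json.dumps(doc, indent=2))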

This workflow ensures you can safely migrate trace data between projects while maintaining data integrity and providing visibility into the process. The CSV format is particularly useful for data analysis and reporting.