In Opik 2.0, datasets are project-scoped. Make sure to specify a projectName when creating datasets so they are associated with the correct project.
The Opik TypeScript SDK provides an API for creating and managing datasets. Datasets in Opik are collections of data items used for purposes such as evaluation.
A dataset in Opik is a named collection of data items. Each dataset:
One of the key features of the Opik SDK is strong TypeScript typing support for datasets. You can define custom types for your dataset items to ensure type safety throughout your application:
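As a minimal sketch of the idea, the snippet below types a dataset with a custom item shape. The `QAItem` type, the `TypedDataset` interface, and the `InMemoryDataset` class are hypothetical stand-ins written for illustration so the example runs without a server; only the `insert` and `getItems` signatures are taken from the API reference in this document.

```typescript
// Hypothetical item shape; the field names are illustrative only.
type QAItem = {
  question: string;
  answer: string;
  metadata?: { category: string };
};

// Minimal interface mirroring the insert/getItems signatures from the API reference.
interface TypedDataset<T> {
  insert(items: T[]): Promise<void>;
  getItems(nbSamples?: number): Promise<T[]>;
}

// In-memory stand-in so the sketch is runnable; this is NOT the SDK's Dataset class.
class InMemoryDataset<T> implements TypedDataset<T> {
  private items: T[] = [];

  async insert(items: T[]): Promise<void> {
    this.items.push(...items);
  }

  async getItems(nbSamples?: number): Promise<T[]> {
    return nbSamples === undefined
      ? [...this.items]
      : this.items.slice(0, nbSamples);
  }
}

async function typedItemsDemo(): Promise<string[]> {
  const dataset: TypedDataset<QAItem> = new InMemoryDataset<QAItem>();
  await dataset.insert([
    { question: "What is the capital of France?", answer: "Paris" },
  ]);
  // await dataset.insert([{ prompt: "..." }]); // rejected at compile time: not a QAItem
  const items = await dataset.getItems();
  // `item.question` is known to be a string, so no casts are needed here.
  return items.map((item) => item.question);
}
```

With the real client, the same type argument would be supplied to `createDataset<T>`, `getDataset<T>`, or `getOrCreateDataset<T>`, so that the returned `Dataset<T>` carries the item type.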
The Opik SDK automatically handles deduplication when inserting items into a dataset. This feature ensures that identical items are not added multiple times.
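The document does not say how the SDK decides that two items are identical. The sketch below shows one plausible approach, content-based deduplication over a canonical JSON form. It is an assumption for illustration, not the SDK's implementation; in particular, treating objects with different key order as equal is a choice made here. `canonicalJson` and `dedupe` are hypothetical helper names.

```typescript
// Serialize a value with object keys sorted, so key order does not affect the result.
function canonicalJson(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map(canonicalJson).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const entries = Object.keys(obj)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${canonicalJson(obj[key])}`);
    return `{${entries.join(",")}}`;
  }
  // String() covers `undefined`, for which JSON.stringify returns undefined at runtime.
  return String(JSON.stringify(value));
}

// Return only the incoming items whose content is not already present
// in `existing` or earlier in `incoming`.
function dedupe<T>(existing: T[], incoming: T[]): T[] {
  const seen = new Set(existing.map(canonicalJson));
  const added: T[] = [];
  for (const item of incoming) {
    const key = canonicalJson(item);
    if (!seen.has(key)) {
      seen.add(key);
      added.push(item);
    }
  }
  return added;
}
```

Under this scheme, inserting the same batch twice adds nothing the second time, which is the behavior the paragraph above describes.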
Dataset versions are immutable snapshots. Use DatasetVersion for reproducible evaluations: the same data is used regardless of later changes to the dataset.
Pass a DatasetVersion to evaluate() for reproducible experiments:
When comparing experiments (A/B tests), use the same dataset version to isolate the effect of your changes from data variations.
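To illustrate why a shared snapshot matters, here is a small sketch that scores two task variants against the same frozen set of items. It deliberately does not call the SDK's `evaluate()`. The `VersionView` interface mirrors only the documented `getItems` signature of a dataset version, while `frozenView`, `accuracy`, `Item`, and the two variants are hypothetical names introduced for this example.

```typescript
// Read-only view mirroring the documented DatasetVersion getItems signature.
interface VersionView<T> {
  getItems(nbSamples?: number): Promise<T[]>;
}

// Hypothetical item shape and task type for this sketch.
type Item = { input: string; expected: string };
type Task = (input: string) => string;

// Stand-in for a version snapshot: the items are copied and frozen up front.
function frozenView<T>(items: T[]): VersionView<T> {
  const snapshot = Object.freeze([...items]);
  return {
    async getItems(nbSamples?: number): Promise<T[]> {
      return snapshot.slice(0, nbSamples ?? snapshot.length);
    },
  };
}

// Fraction of items for which the task output equals `expected`.
async function accuracy(view: VersionView<Item>, task: Task): Promise<number> {
  const items = await view.getItems();
  if (items.length === 0) return 0;
  const correct = items.filter((item) => task(item.input) === item.expected);
  return correct.length / items.length;
}

async function compareVariants(): Promise<{ a: number; b: number }> {
  const v1 = frozenView<Item>([
    { input: "hello", expected: "HELLO" },
    { input: "World", expected: "WORLD" },
  ]);
  const variantA: Task = (s) => s.toUpperCase();
  const variantB: Task = (s) => s; // leaves the input unchanged
  // Both variants read from the same snapshot, so any score difference
  // comes from the variants rather than from the data.
  return { a: await accuracy(v1, variantA), b: await accuracy(v1, variantB) };
}
```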
The generic type parameter T represents the DatasetItem type that defines
the structure of items stored in this dataset.
createDataset<T>
Creates a new dataset.
Arguments:
name: string - The name of the dataset
description?: string - Optional description of the dataset
projectName?: string - Optional project name to scope the dataset. If not provided, uses the client’s configured project.
Returns: Promise<Dataset<T>> - A promise that resolves to the created Dataset object
getDataset<T>
Retrieves an existing dataset by name.
Arguments:
name: string - The name of the dataset to retrieve
projectName?: string - Optional project name to scope the dataset lookup. If not provided, uses the client’s configured project.
Returns: Promise<Dataset<T>> - A promise that resolves to the Dataset object
getOrCreateDataset<T>
Retrieves an existing dataset by name or creates it if it doesn’t exist.
Arguments:
name: string - The name of the dataset
description?: string - Optional description (used only if creating a new dataset)
projectName?: string - Optional project name to scope the dataset. If not provided, uses the client’s configured project.
Returns: Promise<Dataset<T>> - A promise that resolves to the existing or newly created Dataset object
getDatasets<T>
Retrieves a list of datasets.
Arguments:
maxResults?: number - Optional maximum number of datasets to retrieve (default: 100)
projectName?: string - Optional project name to filter datasets by. If not provided, uses the client’s configured project.
Returns: Promise<Dataset<T>[]> - A promise that resolves to an array of Dataset objects
deleteDataset
Deletes a dataset by name.
Arguments:
name: string - The name of the dataset to delete
Returns: Promise<void>
insert
Inserts new items into the dataset with automatic deduplication.
Arguments:
items: T[] - List of objects to add to the dataset
Returns: Promise<void>
update
Updates existing items in the dataset.
Arguments:
items: T[] - List of objects to update in the dataset (must include IDs)
Returns: Promise<void>
delete
Deletes items from the dataset.
Arguments:
itemIds: string[] - List of item IDs to delete
Returns: Promise<void>
clear
Deletes all items from the dataset.
Returns: Promise<void>
getItems
Retrieves items from the dataset.
Arguments:
nbSamples?: number - Optional number of items to retrieve (if not set, all items are returned)
lastRetrievedId?: string - Optional ID of the last retrieved item for pagination
Returns: Promise<T[]> - A promise that resolves to an array of dataset items
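The two getItems arguments can be combined to read a large dataset in pages. The sketch below is written against a minimal interface that mirrors the documented signature. It assumes that each returned item exposes its ID as an `id` field and that passing `lastRetrievedId` resumes after that item; neither detail is confirmed by this document. `PagedDataset`, `WithId`, and `collectInPages` are hypothetical names.

```typescript
// Minimal interface mirroring the documented getItems signature.
interface PagedDataset<T> {
  getItems(nbSamples?: number, lastRetrievedId?: string): Promise<T[]>;
}

// Assumption: each returned item exposes its ID as an `id` field.
type WithId = { id: string };

// Fetch every item, `pageSize` items at a time.
async function collectInPages<T extends WithId>(
  dataset: PagedDataset<T>,
  pageSize: number,
): Promise<T[]> {
  if (pageSize < 1) {
    throw new RangeError("pageSize must be at least 1");
  }
  const collected: T[] = [];
  let lastRetrievedId: string | undefined = undefined;
  while (true) {
    const page: T[] = await dataset.getItems(pageSize, lastRetrievedId);
    collected.push(...page);
    // A short (or empty) page means there is nothing left to fetch.
    if (page.length < pageSize) break;
    lastRetrievedId = page[page.length - 1].id;
  }
  return collected;
}
```

Paging keeps each request small, at the cost of one extra request when the item count is an exact multiple of the page size.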
insertFromJson
Inserts items from a JSON string into the dataset.
Arguments:
jsonArray: string - JSON string in array format
keysMapping?: Record<string, string> - Optional dictionary that maps JSON keys to dataset item field names
ignoreKeys?: string[] - Optional array of keys to ignore when constructing dataset items
Returns: Promise<void>
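To make the roles of keysMapping and ignoreKeys concrete, the following sketch shows how a JSON array could be turned into dataset items before insertion. It is an illustration of the described behavior, not the SDK's code. It assumes that ignoreKeys are matched against the original JSON keys (before renaming); `parseJsonItems` is a hypothetical helper name.

```typescript
// Parse a JSON array string into plain item objects, renaming keys via
// `keysMapping` and dropping any key listed in `ignoreKeys`.
function parseJsonItems(
  jsonArray: string,
  keysMapping: Record<string, string> = {},
  ignoreKeys: string[] = [],
): Record<string, unknown>[] {
  const parsed: unknown = JSON.parse(jsonArray);
  if (!Array.isArray(parsed)) {
    throw new Error("Expected a JSON string in array format");
  }
  return parsed.map((entry) => {
    const source = entry as Record<string, unknown>;
    const item: Record<string, unknown> = {};
    for (const key of Object.keys(source)) {
      // Assumption: ignoreKeys refers to keys as they appear in the JSON input.
      if (ignoreKeys.indexOf(key) !== -1) continue;
      // Keys without a mapping entry keep their original name.
      item[keysMapping[key] ?? key] = source[key];
    }
    return item;
  });
}

const exampleItems = parseJsonItems(
  '[{"q": "hi", "a": "there", "debug": 1}]',
  { q: "question", a: "answer" },
  ["debug"],
);
// exampleItems is [{ question: "hi", answer: "there" }]
```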
toJson
Exports the dataset to a JSON string.
Arguments:
keysMapping?: Record<string, string> - Optional dictionary that maps dataset item field names to output JSON keys
Returns: Promise<string> - A JSON string representation of all items in the dataset
getVersionView
Get a read-only view of a specific dataset version.
Arguments:
versionName: string - The version name (e.g., “v1”, “v2”)
Returns: Promise<DatasetVersion<T>>
Throws: DatasetVersionNotFoundError if version doesn’t exist
getCurrentVersionName
Get the name of the latest version.
Returns: Promise<string | undefined> - Version name or undefined if no versions
getVersionInfo
Get metadata about the latest version.
Returns: Promise<DatasetVersionPublic | undefined> - Version info or undefined
A read-only view of dataset items at a specific version. The view cannot modify data.
getItems
Retrieve items from this version.
Arguments:
nbSamples?: number - Number of items to retrieve (default: all)
Returns: Promise<T[]> - Array of dataset items
toJson
Export version items to JSON string.
Arguments:
keysMapping?: Record<string, string> - Map field names to output keys
Returns: Promise<string> - JSON string
getVersionInfo
Get the full version metadata object.
Returns: DatasetVersionPublic - Version info