Log experiments with the REST API
If you’re working in Python or JavaScript, the easiest way to integrate with Opik is through our official SDKs.
But if your stack includes something else, like Go, Java, Kotlin, or Ruby, no problem! That’s where the REST API comes in: it gives you flexible, language-agnostic access to log and manage your projects and experiments directly in Opik.
This guide shows you how to record experiment results for your LLM application using the Experiments bulk logging API.
The full API reference for the Record experiment items in bulk endpoint is available here.
Endpoint Overview
The Record Experiment Items in Bulk endpoint allows you to log multiple experiment item evaluations in a single request, including optional model outputs, traces, spans, and structured feedback scores.
Method: PUT
URL: /api/v1/private/experiments/items/bulk
Request Size Limit: The maximum allowed payload size is 4MB. For larger submissions, please divide the data into smaller batches.
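If your items would push a request past that limit, you can split them into fixed-size batches and send one request per batch. A rough Python sketch (the batch size is arbitrary and should be tuned to your item sizes):

```python
def batched(items, batch_size=250):
    """Yield successive slices of `items`, each at most `batch_size` long."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]
```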
Minimum Required Fields
At a minimum, your request must contain:
- experiment_name (string): Name of the experiment.
- dataset_name (string): Name of the dataset the evaluation is tied to.
- items (list of objects): Each object must include a unique dataset_item_id (UUID).
This minimal structure is sufficient to register the dataset item to the experiment.
Optional Enhancements for Richer Evaluation
Each item can optionally include:
- evaluate_task_result: A map, list, or string representing the output of your application.
- trace: An object representing the full execution trace. Note that you can provide either evaluate_task_result or trace, not both.
- spans: A list of structured span objects representing sub-steps or stages of execution.
- feedback_scores: A list of structured objects that describe evaluation signals. Each feedback score includes:
  - name
  - category_name
  - value
  - reason
  - source
Tip: use feedback scores to record evaluations such as accuracy, fluency, or custom criteria from heuristics or human reviewers.
Example Use Cases
Here are a few common ways teams can use the bulk logging endpoint to evaluate their LLM applications effectively:
- Register a dataset item with minimal fields.
- Log application responses with evaluate_task_result.
- Attach feedback scores like accuracy scores or annotations.
- Enable full experiment observability with traces and spans.
Example Requests
1. Minimal Payload
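A sketch of the minimal call in Python with the requests library, assuming a local open-source deployment at the default URL; the experiment name, dataset name, and UUID are placeholders:

```python
import requests

# Assumes a local open-source Opik deployment; no auth headers required.
url = "http://localhost:5173/api/v1/private/experiments/items/bulk"

payload = {
    "experiment_name": "geography-bot-v1",   # placeholder experiment name
    "dataset_name": "geography-questions",   # placeholder dataset name
    "items": [
        {
            # UUID of an existing dataset item (placeholder value)
            "dataset_item_id": "4a7c2cfb-a1bf-4d76-8b5d-191f17f108ad"
        }
    ],
}

response = requests.put(url, json=payload)
response.raise_for_status()  # any 2xx status means the items were recorded
```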
2. Add Model Output
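Building on the minimal payload, each item can carry your application's output in evaluate_task_result. The rest of the request is identical to the example above; the answer text here is illustrative:

```python
items = [
    {
        "dataset_item_id": "4a7c2cfb-a1bf-4d76-8b5d-191f17f108ad",
        # The output of your LLM application for this dataset item;
        # a map, list, or plain string are all accepted.
        "evaluate_task_result": {
            "output": "Paris is the capital of France."
        },
    }
]
```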
3. Include Feedback Scores
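Feedback scores attach structured evaluation signals to an item, using the fields listed earlier (name, category_name, value, reason, source). In this sketch the score name, category, value, and reason are illustrative:

```python
items = [
    {
        "dataset_item_id": "4a7c2cfb-a1bf-4d76-8b5d-191f17f108ad",
        "evaluate_task_result": {"output": "Paris is the capital of France."},
        "feedback_scores": [
            {
                "name": "accuracy",            # evaluation criterion
                "category_name": "correctness",
                "value": 1.0,                  # numeric score
                "reason": "Matches the reference answer exactly.",
                "source": "sdk",               # e.g. produced programmatically
            }
        ],
    }
]
```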
4. Full Payload with Multiple Items - Example in Python
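A fuller Python sketch with several items, each carrying a trace, spans, and feedback scores. These items use trace instead of evaluate_task_result, since the two are mutually exclusive. The trace and span field names below (name, input, output, start_time, end_time, type) follow common Opik conventions but should be checked against the API reference, and all values are placeholders:

```python
import requests

# Local open-source deployment assumed; for Opik Cloud, add the auth
# headers described in the Authentication section below.
url = "http://localhost:5173/api/v1/private/experiments/items/bulk"

def make_item(dataset_item_id, question, answer):
    """Build one experiment item with a trace, one span, and a score."""
    return {
        "dataset_item_id": dataset_item_id,
        "trace": {
            "name": "qa-trace",
            "input": {"question": question},
            "output": {"answer": answer},
            "start_time": "2024-01-01T00:00:00Z",
            "end_time": "2024-01-01T00:00:05Z",
        },
        "spans": [
            {
                "name": "llm-call",
                "type": "llm",
                "input": {"prompt": question},
                "output": {"completion": answer},
                "start_time": "2024-01-01T00:00:01Z",
                "end_time": "2024-01-01T00:00:04Z",
            }
        ],
        "feedback_scores": [
            {"name": "accuracy", "value": 1.0, "source": "sdk"}
        ],
    }

payload = {
    "experiment_name": "geography-bot-v1",
    "dataset_name": "geography-questions",
    "items": [
        make_item("4a7c2cfb-a1bf-4d76-8b5d-191f17f108ad",
                  "What is the capital of France?", "Paris"),
        make_item("f3e9c1a2-7b4d-4e8f-9a6b-2c1d0e5f8a3b",
                  "What is the capital of Japan?", "Tokyo"),
    ],
}

requests.put(url, json=payload).raise_for_status()
```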
5. Full Payload with Multiple Items - Example in Java (Using Jackson + HttpClient)
The following example shows how to stream dataset items and log experiment results in bulk using Java.
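A minimal sketch of the bulk-upload call using Jackson and Java's built-in HttpClient; for brevity, the dataset item ID is a hardcoded placeholder rather than streamed from the dataset, and all names are illustrative:

```java
import com.fasterxml.jackson.databind.ObjectMapper;

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;
import java.util.Map;

public class ExperimentBulkLogger {

    public static void main(String[] args) throws Exception {
        // Local open-source deployment assumed; names and IDs are placeholders.
        String url = "http://localhost:5173/api/v1/private/experiments/items/bulk";

        Map<String, Object> item = Map.of(
                "dataset_item_id", "4a7c2cfb-a1bf-4d76-8b5d-191f17f108ad",
                "evaluate_task_result", Map.of("output", "Paris is the capital of France."),
                "feedback_scores", List.of(
                        Map.of("name", "accuracy", "value", 1.0, "source", "sdk")));

        Map<String, Object> payload = Map.of(
                "experiment_name", "geography-bot-v1",
                "dataset_name", "geography-questions",
                "items", List.of(item));

        // Serialize the payload with Jackson, then send it with HttpClient.
        String body = new ObjectMapper().writeValueAsString(payload);

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(url))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(body))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println("Status: " + response.statusCode());
    }
}
```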
Authentication
Depending on your deployment, you can access the Experiments REST API either without authentication, for local open-source (on-premise) setups, or with API key authentication, for the Opik Cloud environment.
Open-Source (No Auth Required)
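For a local deployment, requests go straight to the private API without any auth headers; for example:

```python
import requests

# Default local URL for an open-source deployment (adjust if yours differs);
# no authentication headers are needed.
url = "http://localhost:5173/api/v1/private/experiments/items/bulk"
payload = {
    "experiment_name": "geography-bot-v1",
    "dataset_name": "geography-questions",
    "items": [{"dataset_item_id": "4a7c2cfb-a1bf-4d76-8b5d-191f17f108ad"}],
}
requests.put(url, json=payload).raise_for_status()
```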
Opik Cloud
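For Opik Cloud, the same request carries your API key and workspace name as headers. A sketch, assuming the comet.com base URL and Opik Cloud's convention of passing the API key directly in the authorization header, with placeholder credentials:

```python
import requests

# Opik Cloud base URL; API key and workspace name are placeholders.
url = "https://www.comet.com/opik/api/v1/private/experiments/items/bulk"
headers = {
    "authorization": "<your-api-key>",          # the raw key, no "Bearer " prefix
    "Comet-Workspace": "<your-workspace-name>",
}
payload = {
    "experiment_name": "geography-bot-v1",
    "dataset_name": "geography-questions",
    "items": [{"dataset_item_id": "4a7c2cfb-a1bf-4d76-8b5d-191f17f108ad"}],
}
requests.put(url, headers=headers, json=payload).raise_for_status()
```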
Environment Variables
To promote security, flexibility, and reusability, it is recommended to manage authentication credentials using environment variables. This approach prevents hardcoding sensitive information and allows seamless configuration across different environments.
You can define environment variables directly in your system environment or load them via a .env file using tools like dotenv. Alternatively, credentials and other configurations can also be managed through a centralized configuration file, depending on your deployment setup and preference.
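A sketch of loading credentials from the environment in Python; the variable names here (OPIK_API_KEY, OPIK_WORKSPACE, OPIK_URL_OVERRIDE) follow the Opik SDK's conventions, but any naming scheme works:

```python
import os

# Optionally load a .env file first:
#   from dotenv import load_dotenv; load_dotenv()

API_KEY = os.environ["OPIK_API_KEY"]        # fails fast if unset
WORKSPACE = os.environ["OPIK_WORKSPACE"]
BASE_URL = os.environ.get("OPIK_URL_OVERRIDE", "https://www.comet.com/opik/api")

headers = {"authorization": API_KEY, "Comet-Workspace": WORKSPACE}
url = f"{BASE_URL}/v1/private/experiments/items/bulk"
```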