Structured Output Tracking for Instructor with Opik

Instructor is a Python library for working with structured outputs from LLMs, built on top of Pydantic. It provides a simple way to manage schema validation, retries, and streaming responses.

In this guide, we will showcase how to integrate Opik with Instructor so that all the Instructor calls are logged as traces in Opik.

Account Setup

Comet provides a hosted version of the Opik platform: simply create an account and grab your API key.

You can also run the Opik platform locally, see the installation guide for more information.

Getting Started

Installation

First, ensure you have both opik and instructor installed:

pip install opik instructor

Configuring Opik

Configure the Opik Python SDK for your deployment type. See the Python SDK Configuration guide for detailed instructions on:

  • CLI configuration: opik configure
  • Code configuration: opik.configure() (a minimal sketch follows this list)
  • Self-hosted vs Cloud vs Enterprise setup
  • Configuration files and environment variables
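
As an illustration, programmatic configuration might look like the following sketch. The parameter names (api_key, workspace, use_local) follow the Opik SDK's configure helper, but check the configuration guide above for the options that apply to your deployment:

import opik

# Comet-hosted Opik: supply your API key and workspace
# (the values below are placeholders)
opik.configure(api_key="YOUR_OPIK_API_KEY", workspace="YOUR_WORKSPACE")

# For a local self-hosted deployment, you can instead run:
# opik.configure(use_local=True)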

Configuring Instructor

In order to use Instructor, you will need to configure your LLM provider API keys. For this example, we’ll use OpenAI, Anthropic, and Gemini. You can find or create your API keys in each provider’s console.

You can set them as environment variables:

export OPENAI_API_KEY="YOUR_API_KEY"
export ANTHROPIC_API_KEY="YOUR_API_KEY"
export GOOGLE_API_KEY="YOUR_API_KEY"

Or set them programmatically:

import os
import getpass

if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter your OpenAI API key: ")

if "ANTHROPIC_API_KEY" not in os.environ:
    os.environ["ANTHROPIC_API_KEY"] = getpass.getpass("Enter your Anthropic API key: ")

if "GOOGLE_API_KEY" not in os.environ:
    os.environ["GOOGLE_API_KEY"] = getpass.getpass("Enter your Google API key: ")

Using Opik with the Instructor library

In order to log traces from Instructor into Opik, we wrap the underlying LLM client with Opik's tracking before handing it to Instructor. This will log each LLM call to the Opik platform.

For all the integrations, we will first add tracking to the LLM client and then pass it to the Instructor library:

from opik.integrations.openai import track_openai
import instructor
from pydantic import BaseModel
from openai import OpenAI

# We will first create the OpenAI client and wrap it with
# `track_openai` so that calls are logged to Opik
openai_client = track_openai(OpenAI())

# Patch the OpenAI client for Instructor
client = instructor.from_openai(openai_client)

# Define your desired output structure
class UserInfo(BaseModel):
    name: str
    age: int

user_info = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=UserInfo,
    messages=[{"role": "user", "content": "John Doe is 30 years old."}],
)

print(user_info)

Thanks to the track_openai wrapper, all calls made to OpenAI will be logged to the Opik platform. This approach also works well if you are using the opik.track decorator, as it will automatically log the LLM call made with Instructor to the relevant trace.
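
For example, here is a minimal sketch of how the decorator and the tracked client combine. It reuses the client and UserInfo defined above; the extract_user_info function name is purely illustrative:

import opik

@opik.track
def extract_user_info(text: str) -> UserInfo:
    # The Instructor call below is logged as a span nested under
    # the trace created for this function
    return client.chat.completions.create(
        model="gpt-4o-mini",
        response_model=UserInfo,
        messages=[{"role": "user", "content": text}],
    )

print(extract_user_info("John Doe is 30 years old."))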

Integrating with other LLM providers

The Instructor library supports many LLM providers beyond OpenAI, including Anthropic, AWS Bedrock, and Gemini. Opik supports the majority of these providers as well.

Here are the code snippets needed for the integration with different providers. Each snippet reuses the UserInfo model defined in the OpenAI example above:

Anthropic

from opik.integrations.anthropic import track_anthropic
import instructor
from anthropic import Anthropic

# Add Opik tracking
anthropic_client = track_anthropic(Anthropic())

# Patch the Anthropic client for Instructor
client = instructor.from_anthropic(
    anthropic_client, mode=instructor.Mode.ANTHROPIC_JSON
)

# UserInfo is the Pydantic model defined in the OpenAI example above
user_info = client.chat.completions.create(
    model="claude-3-5-sonnet-20241022",
    response_model=UserInfo,
    messages=[{"role": "user", "content": "John Doe is 30 years old."}],
    max_tokens=1000,
)

print(user_info)

Gemini

from opik.integrations.genai import track_genai
import instructor
from google import genai

# Add Opik tracking
gemini_client = track_genai(genai.Client())

# Patch the GenAI client for Instructor
client = instructor.from_genai(
    gemini_client, mode=instructor.Mode.GENAI_STRUCTURED_OUTPUTS
)

# UserInfo is the Pydantic model defined in the OpenAI example above
user_info = client.chat.completions.create(
    model="gemini-2.0-flash-001",
    response_model=UserInfo,
    messages=[{"role": "user", "content": "John Doe is 30 years old."}],
)

print(user_info)

You can read more about how to use the Instructor library in its documentation.