Custom model
Opik provides a set of LLM-as-a-Judge metrics that are designed to be model-agnostic and can be used with any LLM. To achieve this, we use the LiteLLM library to abstract the LLM calls.
By default, Opik will use the `gpt-4o` model. However, you can change this by setting the `model` parameter when initializing your metric to any model supported by LiteLLM:
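For example, a minimal sketch using the `Hallucination` metric (the model string is illustrative; any LiteLLM-supported model identifier works):

```python
from opik.evaluation.metrics import Hallucination

# Any LiteLLM-supported model string can be passed here; this one is illustrative.
hallucination_metric = Hallucination(model="anthropic/claude-3-5-sonnet-20241022")
```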
Using a model supported by LiteLLM
To use many of the models supported by LiteLLM, you also need to pass additional parameters. For this, you can use the `LiteLLMChatModel` class and pass it to the metric:
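A sketch of this pattern, assuming that extra keyword arguments such as `base_url` are forwarded to the underlying LiteLLM completion call (the model name and URL are placeholders):

```python
from opik.evaluation.metrics import Hallucination
from opik.evaluation.models import LiteLLMChatModel

# Assumption: additional keyword arguments (e.g. base_url) are forwarded
# to the underlying LiteLLM completion call.
model = LiteLLMChatModel(
    model_name="ollama/llama3",
    base_url="http://localhost:11434",
)

hallucination_metric = Hallucination(model=model)
```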
Creating Your Own Custom Model Class
Opik’s LLM-as-a-Judge metrics, such as `Hallucination`, are designed to work with various language models. While Opik supports many models out-of-the-box via LiteLLM, you can integrate any LLM by creating a custom model class. This involves subclassing `opik.evaluation.models.OpikBaseModel` and implementing its required methods.
The OpikBaseModel
Interface
OpikBaseModel
is an abstract base class that defines the interface Opik metrics use to interact with LLMs. To create a compatible custom model, you must implement the following methods:
- `__init__(self, model_name: str)`: Initializes the base model with a given model name.
- `generate_string(self, input: str, **kwargs: Any) -> str`: Simplified interface to generate a string output from the model.
- `generate_provider_response(self, **kwargs: Any) -> Any`: Generates a provider-specific response. Can be used to interface with the underlying model provider (e.g., OpenAI, Anthropic) and get the raw output.
Implementing a Custom Model for an OpenAI-like API
Here’s an example of a custom model class that interacts with an LLM service exposing an OpenAI-compatible API endpoint.
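A minimal sketch of such a class, assuming a standard OpenAI-style chat-completions endpoint; the endpoint path, payload fields, and class name are illustrative:

```python
from typing import Any

import requests

from opik.evaluation.models import OpikBaseModel


class CustomOpenAICompatibleModel(OpikBaseModel):
    """Custom model that calls an OpenAI-compatible chat-completions API."""

    def __init__(self, model_name: str, base_url: str, api_key: str):
        super().__init__(model_name)
        self.base_url = base_url  # e.g. "http://localhost:8000/v1/chat/completions"
        self.api_key = api_key

    def generate_string(self, input: str, **kwargs: Any) -> str:
        # Simplified interface used by Opik metrics: wrap the prompt in an
        # OpenAI-style message list and return only the generated text.
        response = self.generate_provider_response(
            messages=[{"role": "user", "content": input}], **kwargs
        )
        return response["choices"][0]["message"]["content"]

    def generate_provider_response(self, **kwargs: Any) -> Any:
        # Raw provider call. Adjust the payload if your provider deviates
        # from the common OpenAI chat-completions structure.
        headers = {
            "Authorization": f"Bearer {self.api_key}",
            "Content-Type": "application/json",
        }
        payload = {
            "model": self.model_name,  # assumes OpikBaseModel stores model_name
            "messages": kwargs["messages"],
        }
        response = requests.post(self.base_url, headers=headers, json=payload, timeout=60)
        response.raise_for_status()
        return response.json()
```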
Key considerations for the implementation:

- API Endpoint and Payload: Adjust `base_url` and the JSON payload to match your specific LLM provider’s requirements if they deviate from the common OpenAI structure.
- Model Name: The `model_name` passed to `__init__` is used as the `model` parameter in the API call. Ensure this matches an available model on your LLM service.
Using the Custom Model with the `Hallucination` Metric
In order to run an evaluation using your custom model with the `Hallucination` metric, you will first need to instantiate our `CustomOpenAICompatibleModel` class and pass it to the `Hallucination` class. The evaluation can then be kicked off by calling the `Hallucination.score()` method.
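A sketch of this flow; the endpoint, model name, and API key are placeholders:

```python
from opik.evaluation.metrics import Hallucination

# Placeholder endpoint, model name, and key for an OpenAI-compatible service.
model = CustomOpenAICompatibleModel(
    model_name="my-local-model",
    base_url="http://localhost:8000/v1/chat/completions",
    api_key="YOUR_API_KEY",
)

metric = Hallucination(model=model)
result = metric.score(
    input="What is the capital of France?",
    output="The capital of France is Berlin.",
)
print(result.value)
```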
Key considerations for the implementation:

- ScoreResult Output: `Hallucination.score()` returns a `ScoreResult` object containing the metric name (`name`), score value (`value`), optional explanation (`reason`), metadata (`metadata`), and a failure flag (`scoring_failed`).
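Continuing the example above, each field can be read directly from the returned object:

```python
print(result.name)            # metric name
print(result.value)           # numeric score
print(result.reason)          # optional explanation from the judge model
print(result.metadata)        # optional metadata dict
print(result.scoring_failed)  # True if the metric failed to compute
```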