
Weekly Changelog

Week of 2024-12-02

Opik Dashboard:

  • Added a new created_by column for each table to indicate who created the record
  • Masked the API key in the user menu
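
The masking behaviour can be illustrated with a small sketch (this is an illustration of the UI behaviour only; `mask_api_key` and the four-character suffix are assumptions, not the dashboard's actual code):

```python
def mask_api_key(key: str, visible: int = 4) -> str:
    """Replace all but the last few characters of a secret with asterisks,
    as UIs commonly do when displaying API keys."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]
```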

SDK:

  • Implemented background batch sending of traces to speed up processing of trace creation requests
  • Updated OpenAI integration to track cost of LLM calls
  • Updated prompt.format method to raise an error when it is called with the wrong arguments
  • Updated the Opik client so it accepts the api_key parameter as a positional argument
  • Improved the prompt template for the hallucination metric
  • Introduced a new opik_check_tls_certificate configuration option to disable the TLS certificate check.
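
Background batch sending can be pictured with a minimal sketch: callers enqueue traces and return immediately, while a worker thread flushes batches to the backend. This illustrates the technique only; `BatchSender` and its parameters are hypothetical, not the SDK's actual implementation.

```python
import queue
import threading


class BatchSender:
    """Sketch of background batching: send() only enqueues, so the caller
    never waits on the network; a worker thread flushes batches."""

    def __init__(self, flush_fn, batch_size=10, flush_interval=0.05):
        self._queue = queue.Queue()
        self._flush_fn = flush_fn          # an HTTP POST in a real SDK
        self._batch_size = batch_size
        self._flush_interval = flush_interval
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def send(self, trace):
        self._queue.put(trace)             # returns immediately

    def _run(self):
        batch = []
        while True:
            try:
                item = self._queue.get(timeout=self._flush_interval)
                if item is None:           # sentinel: stop after final flush
                    break
                batch.append(item)
            except queue.Empty:
                pass
            # Flush when the batch is full or the queue has drained.
            if len(batch) >= self._batch_size or (batch and self._queue.empty()):
                self._flush_fn(batch)
                batch = []
        if batch:
            self._flush_fn(batch)

    def close(self):
        self._queue.put(None)
        self._worker.join()
```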

Week of 2024-11-25

Opik Dashboard:

  • Feedback scores are now displayed as separate columns in the traces and spans table
  • Introduced a new project dashboard to see trace count, feedback scores and token count over time
  • Project statistics are now displayed in the traces and spans table header; this is especially useful for tracking average feedback scores
  • Redesigned the experiment item sidebar to make it easier to review experiment results
  • Annotating feedback scores in the UI now feels much faster
  • Support exporting traces as a JSON file in addition to CSV
  • Sidebars now close when clicking outside of them
  • Dataset groups in the experiment page are now sorted by last updated date
  • Updated scrollbar styles for Windows users

SDK:

  • Improved the robustness to connection issues by adding retry logic.
  • Updated the OpenAI integration to track structured output calls using beta.chat.completions.parse.
  • Fixed issue with update_current_span and update_current_trace that did not support updating the output field.
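
The retry behaviour can be sketched as exponential backoff around a network call. This is a hypothetical helper, not the SDK's actual code; the retried exception types and delays are assumptions.

```python
import time


def with_retries(fn, max_attempts=3, base_delay=0.01, retry_on=(ConnectionError,)):
    """Retry transient failures with exponential backoff,
    re-raising after the last attempt."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts:
                raise
            # 1x, 2x, 4x, ... the base delay between attempts
            time.sleep(base_delay * 2 ** (attempt - 1))
```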

Week of 2024-11-18

Opik Dashboard:

  • Updated the majority of tables to increase information density, making it easier to review many traces at once.
  • Images logged to datasets and experiments are now displayed in the UI. Both image URLs and base64-encoded images are supported.

SDK:

  • The scoring_metrics argument is now optional in the evaluate method. This is useful if you are looking at evaluating your LLM calls manually in the Opik UI.
  • When uploading a dataset, the SDK now prints a link to the dataset in the UI.
  • Usage is now correctly logged when using the LangChain OpenAI integration.
  • Implemented a batching mechanism for uploading spans and dataset items to avoid 413 Request Entity Too Large errors.
  • Removed pandas and numpy as mandatory dependencies.
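
The batching that avoids 413 errors can be sketched as splitting uploads so each request body stays under a size limit. `chunk_by_payload_size` and the limit value are hypothetical; the real SDK's policy may differ.

```python
import json


def chunk_by_payload_size(items, max_bytes=1_000_000):
    """Split items into batches whose serialized size stays under the
    server's request limit, so no single upload triggers a 413."""
    batches, current, current_size = [], [], 0
    for item in items:
        size = len(json.dumps(item).encode("utf-8"))
        if current and current_size + size > max_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(item)
        current_size += size
    if current:
        batches.append(current)
    return batches
```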

Week of 2024-11-11

Opik Dashboard:

  • Added the option to sort the projects table by Last updated, Created at and Name columns.
  • Updated the logic for displaying images: instead of relying on the format of the response, we now use regex rules to detect whether the trace or span input includes a base64-encoded image or a URL.
  • Improved performance of the Traces table by truncating trace inputs and outputs if they contain base64-encoded images.
  • Fixed some issues with rendering trace inputs and outputs in YAML format.
  • Added grouping and charts to the experiments page
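
The regex-based image detection mentioned above can be sketched roughly as follows (the exact patterns used by the dashboard are not documented here; these rules are illustrative assumptions):

```python
import re

# Match data-URI/base64 image payloads and common image URLs
# anywhere in a trace input rendered as a string.
BASE64_IMAGE_RE = re.compile(
    r"data:image/(?:png|jpeg|gif|webp);base64,[A-Za-z0-9+/=]+"
)
IMAGE_URL_RE = re.compile(
    r"https?://\S+\.(?:png|jpe?g|gif|webp)", re.IGNORECASE
)


def contains_image(text: str) -> bool:
    """Return True if the text appears to embed or link to an image."""
    return bool(BASE64_IMAGE_RE.search(text) or IMAGE_URL_RE.search(text))
```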

SDK:

  • New integration: Anthropic integration

    from anthropic import Anthropic, AsyncAnthropic
    from opik.integrations.anthropic import track_anthropic

    client = Anthropic()
    client = track_anthropic(client, project_name="anthropic-example")

    message = client.messages.create(
        max_tokens=1024,
        messages=[
            {
                "role": "user",
                "content": "Tell a fact",
            }
        ],
        model="claude-3-opus-20240229",
    )
    print(message)
  • Added a new evaluate_experiment method in the SDK that can be used to re-score an existing experiment; learn more in the Update experiments guide.

Week of 2024-11-04

Opik Dashboard:

  • Added a new Prompt library page to manage your prompts in the UI.

SDK:

  • Introduced the Prompt object in the SDK to manage prompts stored in the library. See the Prompt Management guide for more details.
  • Introduced an Opik.search_spans method to search for spans in a project. See the Search spans guide for more details.
  • Released a new integration with AWS Bedrock for using Opik with Bedrock models.
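
The idea behind the Prompt object can be pictured with a minimal stand-in: a named template with `{{placeholder}}` variables and a `format` method that rejects mismatched arguments. `PromptSketch` is a hypothetical illustration, not the SDK class.

```python
import re


class PromptSketch:
    """Minimal stand-in for a library-managed prompt template."""

    def __init__(self, name: str, prompt: str):
        self.name = name
        self.prompt = prompt
        # Collect {{placeholder}} variable names from the template.
        self.variables = set(re.findall(r"\{\{(\w+)\}\}", prompt))

    def format(self, **kwargs):
        # Raise when called with missing or unexpected arguments.
        if set(kwargs) != self.variables:
            raise ValueError(
                f"expected variables {sorted(self.variables)}, "
                f"got {sorted(kwargs)}"
            )
        result = self.prompt
        for key, value in kwargs.items():
            result = result.replace("{{" + key + "}}", str(value))
        return result
```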

Week of 2024-10-28

Opik Dashboard:

  • Added a new Feedback modal in the UI so you can easily provide feedback on any parts of the platform.

SDK:

  • Released a new evaluation metric: GEval. This LLM-as-a-Judge metric is task-agnostic and can be used to evaluate any LLM call based on your own custom evaluation criteria.
  • Allow users to specify the path to the Opik configuration file using the OPIK_CONFIG_PATH environment variable, read more about it in the Python SDK Configuration guide.
  • You can now configure the project_name as part of the evaluate method so that traces are logged to a specific project instead of the default one.
  • Added a new Opik.search_traces method to search for traces; this includes support for a search string to return only specific traces.
  • Enforced structured outputs for LLM-as-a-Judge metrics so that they are more reliable (they will no longer fail when decoding the LLM response).
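
Why structured outputs make judge metrics more reliable can be sketched as schema validation of the judge's reply before the score is used. The field names and types below are illustrative assumptions, not Opik's actual schema.

```python
import json

# Hypothetical judge-response schema: a numeric score plus a reason string.
REQUIRED_FIELDS = {"score": float, "reason": str}


def parse_judge_response(raw: str) -> dict:
    """Validate that the judge replied with JSON in the expected shape,
    failing loudly instead of silently mis-decoding free text."""
    data = json.loads(raw)                 # raises on non-JSON replies
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        if not isinstance(data[field], expected_type):
            raise ValueError(f"field {field!r} is not {expected_type.__name__}")
    return data
```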

Week of 2024-10-21

Opik Dashboard:

  • Added the option to download traces and LLM calls as CSV files from the UI
  • Introduced a new quickstart guide to help you get started
  • Updated datasets to support a more flexible data schema; you can now insert items with any key-value pairs, not just input and expected_output. See more in the SDK section below.
  • Multiple small UX improvements (more informative empty state for projects, updated icons, feedback tab in the experiment page, etc).
  • Fixed an issue with \t characters breaking the YAML code block in the traces page.

SDK:

  • Datasets now support a more flexible data schema; items can be inserted with any key-value pairs:

    import opik

    client = opik.Opik()
    dataset = client.get_or_create_dataset(name="Demo Dataset")
    dataset.insert([
        {"user_question": "Hello, what can you do ?", "expected_output": {"assistant_answer": "I am a chatbot assistant that can answer questions and help you with your queries!"}},
        {"user_question": "What is the capital of France?", "expected_output": {"assistant_answer": "Paris"}},
    ])
  • Released WatsonX, Gemini and Groq integrations based on the LiteLLM integration.

  • The context field is now optional in the Hallucination metric.

  • LLM as a Judge metrics now support customizing the LLM provider by specifying the model parameter. See more in the Customizing LLM as a Judge metrics section.

  • Fixed an issue when updating feedback scores using the update_current_span and update_current_trace methods. See this GitHub issue for more details.

Week of 2024-10-14

Opik Dashboard:

  • Fixed handling of large experiment names in breadcrumbs and popups
  • Added filtering options for experiment items in the experiment page

SDK:

  • Allow users to configure the project name in the LangChain integration

Week of 2024-10-07

Opik Dashboard:

  • Added an Updated At column in the project page
  • Added support for filtering by token usage in the trace page

SDK:

  • Added a link to the trace project when traces are logged for the first time in a session
  • Added a link to the experiment page when calling the evaluate method
  • Added a project_name parameter to the opik.Opik client and the opik.track decorator
  • Added a new nb_samples parameter in the evaluate method to specify the number of samples to use for the evaluation
  • Released the LiteLLM integration
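
What a tracing decorator with a project_name parameter records can be sketched locally. `track_sketch` and the in-memory `TRACES` list are hypothetical stand-ins for opik.track and the Opik backend, not the SDK's actual code.

```python
import functools

TRACES = []  # stand-in for traces sent to the backend


def track_sketch(project_name="Default Project"):
    """Record a function's name, inputs, and output as a trace,
    tagged with the destination project."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            output = fn(*args, **kwargs)
            TRACES.append({
                "project": project_name,
                "name": fn.__name__,
                "input": {"args": args, "kwargs": kwargs},
                "output": output,
            })
            return output
        return wrapper
    return decorator
```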

Week of 2024-09-30

Opik Dashboard:

  • Added option to delete experiments from the UI
  • Updated empty state for projects with no traces
  • Removed tooltip delay for the reason icon in the feedback score components

SDK:

  • Introduced a new get_or_create_dataset method to the opik.Opik client. This method will create a new dataset if it does not exist.
  • When inserting items into a dataset, duplicate items are now silently ignored instead of being ingested.
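
The duplicate-handling behaviour can be sketched as content-hashing each item and skipping hashes already seen. `DatasetSketch` is a hypothetical illustration; the SDK's actual dedup key may differ.

```python
import hashlib
import json


class DatasetSketch:
    """Sketch of silent dedup: items are hashed by their (key-order
    independent) JSON content, and repeated inserts are no-ops."""

    def __init__(self):
        self._items = []
        self._seen = set()

    def insert(self, items):
        for item in items:
            key = hashlib.sha256(
                json.dumps(item, sort_keys=True).encode("utf-8")
            ).hexdigest()
            if key in self._seen:
                continue               # duplicate: silently ignored
            self._seen.add(key)
            self._items.append(item)

    def __len__(self):
        return len(self._items)
```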