Pattern’s Data-Driven Approach to LLM Evaluation with Opik

Words by Caroline Borders

September 26, 2025

Pattern Group Inc. (“Pattern”) is a leader in accelerating brands on global ecommerce marketplaces, leveraging proprietary technology and AI. At the core of these capabilities is the AI Operations team, which is responsible for building the foundational AI technologies that support both technical and non-technical users. Their work enables employees across the company to integrate AI into a wide range of projects, from automating internal processes to enhancing customer-facing products and services.

We spoke with Jeremy Mumford, Lead AI Engineer, and Cayden Blake, AI Engineer, from Pattern’s AI Ops team to learn how they leveraged Opik, Comet’s open-source LLM evaluation platform, to achieve measurable efficiency gains and identify opportunities for significant cost savings, enhancing an already valuable workflow that drives growth for their customers.

Automating Content Analysis to Optimize Marketplace Listings

One of the AI Ops team’s responsibilities is maintaining Content Brief, an AI-powered, data-driven content optimization tool for creating product content that drives growth across every marketplace. Content Brief offers insights into how to improve copy and images, helping brands make data-backed decisions so that their content resonates more strongly with target buyers.

Automating the Content Brief system with AI is imperative: manual processes would be too slow and resource-intensive to keep up with growing user demand across multiple marketplaces. The team relied on large models to power the system, but that approach proved costly. The challenge was to reduce costs without compromising the performance and quality their clients depended on, and that is where Opik came in.

“The workflow was quite costly with the models we were using. Our hope was to investigate this process and use Opik to uncover a more cost-effective model without sacrificing performance.”

-Cayden Blake, AI Engineer

The Evaluation Process With Opik

The AI Ops team initiated the project by extracting production data from their Content Briefs and converting it to reusable datasets for evaluation within Opik. With those datasets, they defined a set of evaluation metrics tailored to the specific workflow, which included user-defined metrics and custom LLM-as-a-judge metrics to assess the outputs of various models and determine their performance.
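As a rough illustration of the workflow described above, the sketch below converts production-style records into reusable dataset items and scores a model's outputs with a user-defined metric. It is plain Python, not Opik's SDK, and everything in it is an assumption made for illustration: the record fields (`product_title`, `required_keywords`), the `keyword_coverage` metric, the stub model, and the sample records. An LLM-as-a-judge metric would have the same shape, with the scoring function calling a model instead of matching strings.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable


@dataclass(frozen=True)
class DatasetItem:
    """One evaluation example built from a (hypothetical) production record."""
    input_text: str
    required_keywords: tuple[str, ...]


def to_dataset(records: list[dict]) -> list[DatasetItem]:
    """Convert raw production records into reusable evaluation items."""
    return [
        DatasetItem(
            input_text=r["product_title"],
            required_keywords=tuple(k.lower() for k in r["required_keywords"]),
        )
        for r in records
    ]


def keyword_coverage(output: str, item: DatasetItem) -> float:
    """User-defined metric: fraction of required keywords found in the output."""
    if not item.required_keywords:
        return 1.0
    text = output.lower()
    hits = sum(1 for k in item.required_keywords if k in text)
    return hits / len(item.required_keywords)


def evaluate_model(model: Callable[[str], str], dataset: list[DatasetItem]) -> float:
    """Run the model over every item and return the mean metric score."""
    return mean(keyword_coverage(model(item.input_text), item) for item in dataset)


# Made-up records standing in for production data.
records = [
    {"product_title": "Steel water bottle", "required_keywords": ["insulated", "steel"]},
    {"product_title": "Trail running shoes", "required_keywords": ["lightweight", "grip"]},
]
dataset = to_dataset(records)


def stub_model(title: str) -> str:
    """Placeholder for a real LLM call."""
    return f"{title}: durable, insulated, and lightweight."


score = evaluate_model(stub_model, dataset)
print(f"mean keyword coverage: {score:.2f}")
```

Because the dataset is built once and the metric is an ordinary function, the same items can be replayed against any number of candidate models, which is what makes the comparison repeatable.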

Opik’s Experiment Dashboard

Using Opik’s experiment dashboard, they ran side-by-side comparisons of multiple models, analyzing both performance scores and cost data in one centralized view. The team also used the Prompt Playground to test prompt variations before applying them to the full dataset. They then confirmed the experiment’s findings with light prompt tuning and a layer of human review.

The results gave the team clear evidence of tradeoffs: some models delivered strong results but at a high cost, while others were inexpensive but underperformed. By running structured evaluations in Opik, they identified the “sweet spot”: a smaller, more efficient model that delivered the same quality as their baseline approach. The resulting savings per Content Brief are expected to reduce aggregate annual costs by approximately $60,000.
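One way to make the “sweet spot” selection explicit is to pick the cheapest model whose quality stays within a small tolerance of the baseline. The sketch below is a hypothetical illustration of that rule, not Pattern's actual procedure: the model names, the `tolerance` value, and all scores and costs are placeholder numbers invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelResult:
    name: str
    quality: float         # mean evaluation score, higher is better
    cost_per_brief: float  # average cost of one run, in dollars


def pick_sweet_spot(results, baseline_name, tolerance=0.01):
    """Return the cheapest model whose quality is within `tolerance` of the baseline."""
    baseline = next(r for r in results if r.name == baseline_name)
    # The baseline always qualifies, so the candidate list is never empty.
    candidates = [r for r in results if r.quality >= baseline.quality - tolerance]
    return min(candidates, key=lambda r: r.cost_per_brief)


# Placeholder numbers for illustration only; these are not Pattern's results.
results = [
    ModelResult("large-baseline", quality=0.90, cost_per_brief=1.00),
    ModelResult("mid-size", quality=0.90, cost_per_brief=0.40),
    ModelResult("small", quality=0.70, cost_per_brief=0.05),
]

choice = pick_sweet_spot(results, baseline_name="large-baseline")
print(choice.name)
```

With these placeholder values the cheapest model is rejected because its quality falls outside the tolerance, and the mid-size model is chosen over the baseline because it matches quality at lower cost.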

“The biggest advantage of using Opik was achieving cost savings while maintaining quality. Benchmark evaluations gave us a scientific way to confirm that a smaller, more efficient model produced consistent results across a wide range of data.”

-Jeremy Mumford, Lead AI Engineer

Looking Ahead: Opik as a Core Part of AI Ops

“It was very satisfying to feel more organized with my AI workflows, just being able to see everything in one place. It made me feel like I actually knew what was going on, instead of just hoping it worked in the background.”

-Cayden Blake, AI Engineer

Opik has given the AI Ops team a structured approach to evaluating AI systems, replacing trial and error with a centralized platform for running essential tests before any system reaches production. This approach has strengthened individual workflows and created a more disciplined method for scaling AI across the company.

“We are making use of nearly every feature Opik has available today. We use tracing features and the prompt library for version control, the evaluation suite to build benchmarks with both custom and out-of-the-box metrics, and online evaluation to validate models in real time. We’re using the full spectrum of features and have exciting ideas for what to do next.”

-Jeremy Mumford, Lead AI Engineer

Looking ahead, they plan to integrate Opik into their CI/CD pipelines so that every new or updated workflow undergoes standardized evaluation before release, establishing Opik as a standard checkpoint for engineers across the company.
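A standardized evaluation checkpoint in a CI/CD pipeline often comes down to comparing evaluation scores against agreed minimums and failing the build when any metric falls short. The sketch below shows that idea in plain Python; it does not use Opik's API, and the metric names, thresholds, and scores are hypothetical stand-ins for whatever an evaluation run would report.

```python
def check_thresholds(scores: dict[str, float], thresholds: dict[str, float]) -> list[str]:
    """Return one failure message per metric that is missing or below its minimum."""
    failures = []
    for metric, minimum in thresholds.items():
        value = scores.get(metric)
        if value is None:
            failures.append(f"{metric}: no score reported")
        elif value < minimum:
            failures.append(f"{metric}: {value:.2f} is below the minimum {minimum:.2f}")
    return failures


def gate(scores: dict[str, float], thresholds: dict[str, float]) -> int:
    """Print any failures and return a process exit code (0 = pass, 1 = fail)."""
    failures = check_thresholds(scores, thresholds)
    for message in failures:
        print(f"FAIL {message}")
    return 1 if failures else 0


# Hypothetical values; in a pipeline the scores would come from an evaluation run.
example_scores = {"keyword_coverage": 0.92, "judge_score": 0.81}
example_thresholds = {"keyword_coverage": 0.90, "judge_score": 0.85}

# A CI script would pass this value to sys.exit() so a failing gate blocks the release.
exit_code = gate(example_scores, example_thresholds)
```

Treating a missing score as a failure is a deliberate choice here: a workflow that silently skips a metric should not pass the checkpoint.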

“Opik being open-source was one of the reasons we chose it. Beyond the peace of mind of knowing we can self-host if we want, the ability to debug and submit product requests when we notice things has been really helpful in making sure the product meets our needs.”

-Jeremy Mumford, Lead AI Engineer

For Pattern, Opik has evolved from a model evaluation tool into a hub for AI observability, enabling the company to innovate faster while maintaining confidence and control. If you are interested in how Opik can accelerate your AI workflows in ecommerce, click the button below to learn more.

Contact Us

Pattern accelerates brands on global ecommerce marketplaces, leveraging proprietary technology and AI. Utilizing more than 46 trillion data points and sophisticated machine learning and AI models, Pattern optimizes and automates all levers of ecommerce growth for global brands, including advertising, content management, logistics and fulfillment, pricing, forecasting, and customer service. Hundreds of global brands depend on Pattern’s ecommerce acceleration platform every day to drive profitable revenue growth across 60+ global marketplaces, including Amazon, Walmart.com, Target.com, eBay, Tmall, TikTok Shop, JD, and Mercado Libre. To learn more, visit pattern.com or email press@pattern.com.

Pattern

Pattern provides ecommerce solutions to accelerate brands on marketplaces.

Industry: Ecommerce & Retail

Technologies: Opik
