-
SelfCheckGPT for LLM Evaluation
Detecting hallucinations in language models is challenging. There are three general approaches. The problem with many LLM-as-a-Judge techniques is that…
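For quick orientation before reading the full post: the core idea behind SelfCheckGPT is to resample the model several times and flag sentences the samples fail to support. The sketch below is deliberately simplified and not the post's implementation; real SelfCheckGPT variants typically score consistency with an NLI or QA model, and the token-overlap scorer here is just a self-contained stand-in.

```python
# Minimal SelfCheckGPT-style sketch: resample the model, then flag sentences
# the samples don't support. Plain token overlap keeps the example
# self-contained; the actual method uses stronger consistency scorers.

def token_overlap(sentence: str, sample: str) -> float:
    """Fraction of the sentence's tokens that also appear in one sample."""
    sent = set(sentence.lower().split())
    return len(sent & set(sample.lower().split())) / max(len(sent), 1)

def selfcheck_scores(sentences: list[str], samples: list[str]) -> list[float]:
    """Higher score = less supported by the resampled outputs."""
    return [
        1.0 - sum(token_overlap(s, smp) for smp in samples) / len(samples)
        for s in sentences
    ]

# In practice `samples` come from rerunning the same prompt at temperature > 0,
# e.g. samples = [llm(prompt, temperature=1.0) for _ in range(5)]  # hypothetical
sentences = ["Paris is the capital of France.", "Paris has a population of 90 million."]
samples = ["Paris is the capital of France.", "France's capital is Paris."]
print(selfcheck_scores(sentences, samples))  # the fabricated claim scores higher
```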
-
LLM Juries for Evaluation
Evaluating the correctness of generated responses is an inherently challenging task. LLM-as-a-Judge evaluators have gained popularity for their ability to…
-
LLM Monitoring & Maintenance in Production Applications
Generative AI has become a transformative force, revolutionizing how businesses engage with users through chatbots, content creation, and personalized recommendations…
-
G-Eval for LLM Evaluation
LLM-as-a-judge evaluators have gained widespread adoption due to their flexibility, scalability, and close alignment with human judgment. They excel at…
-
BERTScore For LLM Evaluation
BERTScore represents a pivotal shift in LLM evaluation, moving beyond traditional heuristic-based metrics like BLEU and ROUGE to a…
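As a quick taste of what that shift looks like in practice, here is a minimal example using the open-source `bert-score` package (pip install bert-score); this is an illustration, not necessarily the exact setup the post walks through.

```python
from bert_score import score

candidates = ["The weather is freezing cold today."]
references = ["It is bitterly cold outside today."]

# P/R/F1 come from greedy cosine matching of contextual token embeddings,
# which is what lets BERTScore credit paraphrases that BLEU/ROUGE would miss.
P, R, F1 = score(candidates, references, lang="en")
print(f"BERTScore F1: {F1.mean().item():.3f}")
```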
-
Building ClaireBot, an AI Personal Stylist Chatbot
Follow the evolution of my personal AI project and discover how to integrate image analysis, LLMs, and LLM-as-a-judge evaluation…
-
Perplexity for LLM Evaluation
Perplexity is, historically speaking, one of the “standard” evaluation metrics for language models. And while recent years have seen a…
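For reference, the metric itself is compact: perplexity is the exponentiated average negative log-likelihood the model assigns to each token, so lower means the model is less "surprised" by the text. A minimal sketch, assuming you already have per-token natural-log probabilities from your model:

```python
import math

def perplexity(token_logprobs: list[float]) -> float:
    """Exponentiated average negative log-likelihood; lower is better."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

# e.g. three tokens the model found fairly likely:
print(perplexity([-0.2, -0.9, -0.4]))  # = exp(0.5) ≈ 1.65
```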
-
Meet Opik: Your New Tool to Evaluate, Test, and Monitor LLM Applications
Today, we’re thrilled to introduce Opik – an open-source, end-to-end LLM development platform that provides the observability tools you need…
-
Building a Low-Cost Local LLM Server to Run 70 Billion Parameter Models
A guest post from Fabrício Ceolin, DevOps Engineer at Comet. Inspired by the growing demand for large-scale language models, Fabrício…