-
SelfCheckGPT for LLM Evaluation
Detecting hallucinations in language models is challenging. There are three general approaches. The problem with many LLM-as-a-Judge techniques is that…
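As a rough illustration (not code from the post, and the model name is a placeholder): SelfCheckGPT's core idea is that facts a model actually knows tend to reappear across independently sampled responses, while hallucinations vary from sample to sample. The sketch below samples several responses and scores consistency with a crude token-overlap heuristic standing in for SelfCheckGPT's NLI/QA/n-gram scorers.

```python
from openai import OpenAI

client = OpenAI()
prompt = "Who discovered penicillin, and in what year?"

# 1. Sample several stochastic responses to the same prompt.
response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": prompt}],
    temperature=1.0,
    n=6,
)
samples = [choice.message.content.lower() for choice in response.choices]

# 2. Treat the first response as the "answer" and score how consistent it is
#    with the remaining samples. A simple token-overlap ratio stands in for
#    SelfCheckGPT's actual consistency scorers.
def overlap(a: str, b: str) -> float:
    tokens_a, tokens_b = set(a.split()), set(b.split())
    return len(tokens_a & tokens_b) / max(len(tokens_a), 1)

answer, others = samples[0], samples[1:]
consistency = sum(overlap(answer, s) for s in others) / len(others)
print(f"Consistency: {consistency:.2f} (low scores hint at hallucination)")
```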
-
LLM Juries for Evaluation
Evaluating the correctness of generated responses is an inherently challenging task. LLM-as-a-Judge evaluators have gained popularity for their ability to…
-
A Simple Recipe for LLM Observability
So, you’re building an AI application on top of an LLM, and you’re planning on setting it live in production…
-
G-Eval for LLM Evaluation
LLM-as-a-judge evaluators have gained widespread adoption due to their flexibility, scalability, and close alignment with human judgment. They excel at…
-
Build Multi-Index Advanced RAG Apps
Welcome to Lesson 12 of 12 in our free course series, LLM Twin: Building Your Production-Ready AI Replica. You’ll learn…
-
Build a scalable RAG ingestion pipeline using 74.3% less code
Welcome to Lesson 11 of 12 in our free course series, LLM Twin: Building Your Production-Ready AI Replica. You’ll learn…
-
BERTScore For LLM Evaluation
BERTScore represents a pivotal shift in LLM evaluation, moving beyond traditional heuristic-based metrics like BLEU and ROUGE to a…
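As a quick, hedged illustration of that shift (a sketch, not the post's code): the open-source bert-score package compares candidate and reference texts through contextual embeddings, so a paraphrase with little exact n-gram overlap can still score well where BLEU or ROUGE would penalize it.

```python
from bert_score import score  # pip install bert-score

# The candidate paraphrases the reference with almost no shared n-grams,
# which n-gram metrics punish; embedding-based matching is more forgiving.
candidates = ["The weather turned out to be lovely this afternoon."]
references = ["It was a beautiful afternoon outside."]

# Returns per-example precision, recall, and F1 tensors.
P, R, F1 = score(candidates, references, lang="en", verbose=False)
print(f"BERTScore F1: {F1.mean().item():.3f}")
```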
-
Building ClaireBot, an AI Personal Stylist Chatbot
Follow the evolution of my personal AI project and discover how to integrate image analysis, LLMs, and LLM-as-a-judge evaluation…
-
Perplexity for LLM Evaluation
Perplexity is, historically speaking, one of the “standard” evaluation metrics for language models. And while recent years have seen a…
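As a refresher (a minimal sketch, not the post's code; GPT-2 is used purely as an example model): perplexity is the exponential of the average negative log-likelihood a model assigns to a sequence, so lower values mean the model finds the text less "surprising".

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

text = "Perplexity measures how well a language model predicts a sample of text."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # With labels supplied, the model returns the mean cross-entropy loss
    # over the predicted next tokens.
    loss = model(**inputs, labels=inputs["input_ids"]).loss

# Perplexity = exp(mean negative log-likelihood); lower is better.
print(f"Perplexity: {torch.exp(loss).item():.2f}")
```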
-
OpenAI Evals: Log Datasets & Evaluate LLM Performance with Opik
OpenAI’s Python library is quickly becoming one of the most-downloaded Python packages. With an easy-to-use SDK and access…