Tag: LLM Evaluation

  • Vincent Koc

    March 27, 2025
    Academic Research, Comet Community Hub

    LLM Evaluation Complexities for Non-Latin Languages

    Large language models (LLMs) have revolutionized natural language processing, yet most development and evaluation efforts have historically centered around Latin-script…

  • Abby Morgan

    March 26, 2025
    Comet Community Hub, LLMOps, Tutorials

    SelfCheckGPT for LLM Evaluation

    Detecting hallucinations in language models is challenging. There are three general approaches: Measuring token-level probability distributions for indications that a…

  • Abby Morgan

    January 28, 2025
    Comet Community Hub, LLMOps, Machine Learning, Product, Tutorials

    G-Eval for LLM Evaluation

    LLM-as-a-judge evaluators have gained widespread adoption due to their flexibility, scalability, and close alignment with human judgment. They excel at…

  • Gourav Bais

    December 19, 2024
    LLMOps

    Intro to LLM Observability: What to Monitor & How to Get Started

    While LLM usage is soaring, productionizing an LLM-powered application or software product presents new and different challenges compared to traditional…

  • Abby Morgan

    December 19, 2024
    Comet Community Hub, LLMOps, Tutorials

    BERTScore For LLM Evaluation

    BERTScore represents a pivotal shift in LLM evaluation, moving beyond traditional heuristic-based metrics like BLEU and ROUGE to a…


©2025 Comet ML, Inc. – All Rights Reserved

Terms of Service

Privacy Policy

CCPA Privacy Notice

Cookie Settings

We use cookies to collect statistical usage information about our website and its visitors and ensure we give you the best experience on our website. Please refer to our Privacy Policy to learn more.OkPrivacy policy