{"id":12649,"date":"2025-01-29T09:22:36","date_gmt":"2025-01-29T17:22:36","guid":{"rendered":"https:\/\/live-cometml.pantheonsite.io\/?p=12649"},"modified":"2025-11-11T22:19:18","modified_gmt":"2025-11-11T22:19:18","slug":"llm-observability-architecture-engineering","status":"publish","type":"post","link":"https:\/\/www.comet.com\/site\/blog\/llm-observability-architecture-engineering\/","title":{"rendered":"Building Opik: A Scalable Open-Source LLM Observability Platform"},"content":{"rendered":"\n<figure class=\"wp-block-image aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"576\" src=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/BuildingOpik-Blog-scaled-1-1024x576.jpg\" alt=\"building an open-source llm observability platform at scale\" class=\"wp-image-12650\" srcset=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/BuildingOpik-Blog-scaled-1-1024x576.jpg 1024w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/BuildingOpik-Blog-scaled-1-300x169.jpg 300w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/BuildingOpik-Blog-scaled-1-768x432.jpg 768w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/BuildingOpik-Blog-scaled-1-1536x864.jpg 1536w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/BuildingOpik-Blog-scaled-1-2048x1152.jpg 2048w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p>Opik is an open-source platform for evaluating, testing, and monitoring LLM applications, created by Comet. When teams integrate language models into their applications, they need ways to debug complex systems, analyze performance, and understand how their development work affects responses returned by an LLM in terms of accuracy, relevance, context awareness, and other qualities. 
The platform they use to log, evaluate, and iterate on this work needs to be both user friendly and highly scalable, and accommodate many distinct tasks and use cases.<\/p>\n\n\n\n<p>To build Opik, the Comet team tapped into decades of combined experience training, deploying, and monitoring machine learning models, with the goal of making data science workflows more accessible to teams building with LLMs.<\/p>\n\n\n\n<p>In this post, we\u2019ll share the Comet engineering team\u2019s perspective on building Opik, exploring the architectural decisions and technical details that enable Opik to provide robust tracing, evaluation, and production monitoring capabilities.<\/p>\n\n\n\n<p>While our focus is on how Opik\u2019s architecture supports state-of-the-art <a href=\"https:\/\/www.comet.com\/site\/blog\/llm-evaluation-guide\/\">LLM evaluation<\/a>, the underlying design patterns and technologies can serve as a reference for many other production systems.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-key-requirements\">Key Requirements<\/h2>\n\n\n\n<p>Opik\u2019s development was shaped by several key requirements:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Accessibility:<\/strong> The easiest way to use Opik is by creating a free account on comet.com.<\/li>\n\n\n\n<li><strong>Open-source and easy deployment:<\/strong>\n<ul class=\"wp-block-list\">\n<li>Opik is designed as a fully open-source, self-contained application. 
It can be installed and run on a single host with minimal setup.<\/li>\n\n\n\n<li>Users can also deploy Opik on their own infrastructure using production-ready Kubernetes Helm Charts.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Fast iterations and rapid feedback:<\/strong> Since customers asked for a solution quickly, our team\u2019s goal was to launch early, then iterate based on user feedback.<\/li>\n\n\n\n<li><strong>State-of-the-art architecture and technology:<\/strong> Opik adopts a service-oriented architecture, with each component focusing on a specific functionality.\n<ul class=\"wp-block-list\">\n<li>It relies on proven, highly scalable open-source systems like ClickHouse, MySQL, Redis, and Nginx.<\/li>\n\n\n\n<li>It\u2019s implemented using reliable languages, frameworks, and libraries such as Python, Java with Dropwizard, and TypeScript with React.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Scalability:<\/strong> Opik supports up to 100,000 traces per month under professional plans, with unlimited traces under enterprise plans.<\/li>\n\n\n\n<li><strong>Core Functionality:<\/strong> Opik addresses core aspects of LLMOps: tracking LLM calls and traces, automating LLM evaluation, and collecting and monitoring feedback scores and token usage over time.<\/li>\n\n\n\n<li><strong>Usability:<\/strong> Opik\u2019s user interfaces emphasize a positive user experience, improved continuously through community feedback and iterative design.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-challenges-and-solutions\"><span style=\"font-weight: 400;\">Challenges and Solutions<\/span><\/h2>\n\n\n\n<p>One prominent challenge in building Opik was the unpredictable nature of LLM calls and traces, which create multiple types of concurrent events (traces, spans, feedback scores, dataset items, experiment items, etc.). These events often arrive out of order, which makes data consistency harder to maintain.<\/p>\n\n\n\n<p>Addressing these challenges required balancing tradeoffs. 
The nature of data generated by LLM applications indicated that Opik should be built as an eventually consistent system. Performance and scalability take precedence over strict consistency in Opik\u2019s architecture, which led us to carefully select technologies that meet these performance requirements.<\/p>\n\n\n\n<p>This is why Opik leverages:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><span style=\"font-weight: 400;\"><strong>ClickHouse:<\/strong> for large-scale data ingestion and fast queries (e.g., for traces or experiments).<\/span><\/li>\n\n\n\n<li><span style=\"font-weight: 400;\"><strong>MySQL:<\/strong> for data that demands ACID properties, such as projects or feedback definitions requiring transactional guarantees.<br><br><\/span><\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-high-level-architecture\">High-Level Architecture<\/h2>\n\n\n\n<p>Opik\u2019s architecture consists of multiple services that each handle a specific role, including:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>A backend service:<\/strong> Java + Dropwizard.<\/li>\n\n\n\n<li><strong>A frontend application:<\/strong> TypeScript + React, served by Nginx.<\/li>\n\n\n\n<li><strong>Data stores:<\/strong>\n<ul class=\"wp-block-list\">\n<li>ClickHouse for large-scale data.\n<ul class=\"wp-block-list\">\n<li>With Zookeeper to coordinate the cluster.<\/li>\n\n\n\n<li>With an Operator to provide operational and performance metrics.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>MySQL for transactional data.<\/li>\n\n\n\n<li>Redis for caching, rate limiting, distributed locks and streams.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1019\" height=\"775\" src=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/llm-evaluation-architecture.png\" alt=\"software architecture diagram showing infrastructure and data systems for llm evaluation\" class=\"wp-image-12667\" 
srcset=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/llm-evaluation-architecture.png 1019w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/llm-evaluation-architecture-300x228.png 300w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/llm-evaluation-architecture-768x584.png 768w\" sizes=\"auto, (max-width: 1019px) 100vw, 1019px\" \/><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-performance\">Performance<\/h2>\n\n\n\n<p><span style=\"font-weight: 400;\">We conducted performance tests to measure ingestion and display latencies for Opik using 10,000 and 100,000 traces, each containing three spans, using our <a href=\"https:\/\/www.comet.com\/site\/pricing\/\">free and pro plans<\/a> as a reference.&nbsp;<\/span><\/p>\n\n\n\n<p>From these tests, we can conclude:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Opik can ingest about 3,000 events per second in a local setup.<\/li>\n\n\n\n<li>Scaling from 40,000 events to 400,000 events (10x) did not degrade throughput (still ~3,000 events\/sec).<\/li>\n\n\n\n<li>Ingested traces appear in the UI almost immediately (under 0.04 seconds).<\/li>\n<\/ul>\n\n\n\n<p><span style=\"font-weight: 400;\">These results are remarkable given the constraints of a simple local installation:<\/span><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Single replicas (no redundancy) for backend and frontend services.<\/li>\n\n\n\n<li>Containers with low resources allocated to ClickHouse, MySQL, and Redis.<\/li>\n\n\n\n<li>Docker limited to a maximum of 8 CPU cores and 18 GB total memory.<\/li>\n\n\n\n<li>The Opik Python SDK and the Opik platform sharing local machine resources.<\/li>\n<\/ul>\n\n\n\n<p>Setup:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>The tests were performed on a MacBook Pro with an Apple M3 Pro CPU, 36 GB of memory, and Opik v1.4.5 running locally via Docker Compose:\n<ul class=\"wp-block-list\">\n<li><a 
href=\"https:\/\/www.comet.com\/docs\/opik\/self-host\/local_deployment\/\">https:\/\/www.comet.com\/docs\/opik\/self-host\/local_deployment\/<\/a><\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>A single instance of the Python SDK (on the same machine) sent traces and spans to the Opik platform.<\/li>\n<\/ul>\n\n\n\n<p>Metrics:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Ingestion latency: Time taken for Opik to ingest all traces\/spans.<\/li>\n\n\n\n<li>Dashboard display latency: Time until a trace becomes visible in the UI after ingestion.<\/li>\n<\/ul>\n\n\n\n<p>Results:<\/p>\n\n\n\n<p>1) 10,000 traces (30,000 spans, ~40,000 total events)<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Ingestion latency: 13.44 seconds (\u2248 2,976 events\/sec).<\/li>\n\n\n\n<li>Dashboard display latency: 0.03 seconds.<\/li>\n<\/ul>\n\n\n\n<p>2) 100,000 traces (300,000 spans, ~400,000 total events)<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Ingestion latency: 2 minutes 8.84 seconds (\u2248 3,104 events\/sec).<\/li>\n\n\n\n<li>Dashboard display latency: 0.04 seconds.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-identifier-strategy\">Identifier Strategy<\/h2>\n\n\n\n<p>During Opik\u2019s development, we needed an identifier strategy that favors scalability for data entities such as traces and experiments. We identified these requirements:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Avoid leaking sensitive information externally (e.g., total record count).<\/li>\n\n\n\n<li>Balance the trade-off between possible metadata leakage (e.g., a timestamp) and performance gains.<\/li>\n\n\n\n<li>Extremely low probability of collisions.<\/li>\n\n\n\n<li>Independence from external coordination or configuration.<\/li>\n\n\n\n<li>K-sorted (to avoid index rebalancing and benefit from sorted data storage).<\/li>\n\n\n\n<li>Compact enough for economical disk usage (tables, indexes etc.) but large enough to accommodate high data ingestion rates.<\/li>\n\n\n\n<li>Multi-language support. 
Minimum: Java, Python, TypeScript\/JavaScript.<\/li>\n<\/ul>\n\n\n\n<p>We evaluated:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>ksuid.<\/li>\n\n\n\n<li>ulid.<\/li>\n\n\n\n<li>xid.<\/li>\n\n\n\n<li>UUID version 7.<\/li>\n<\/ul>\n\n\n\n<p>We measured each candidate in both ClickHouse and MySQL for:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Generation speed.<\/li>\n\n\n\n<li>Insertion speed.<\/li>\n\n\n\n<li>Sort\/retrieval speed.<\/li>\n\n\n\n<li>Disk space usage (both tables and indexes).<\/li>\n<\/ul>\n\n\n\n<p>Based on these measurements, we selected UUID v7 for most identifiers due to its performance, sortability, and standardization. For cases where exposing the creation timestamp might be problematic, we use UUID v4.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-backend\">Backend<\/h2>\n\n\n\n<p>Opik\u2019s backend uses Java 21 LTS and Dropwizard 4, structured as a RESTful web service offering public API endpoints for core functionality. Full API documentation is available here:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/www.comet.com\/docs\/opik\/reference\/rest_api\/opik-rest-api\/\">https:\/\/www.comet.com\/docs\/opik\/reference\/rest_api\/opik-rest-api\/<\/a><\/li>\n<\/ul>\n\n\n\n<p>We rely on well-known open-source libraries to avoid reinventing the wheel:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Lombok:<\/strong> to reduce Java boilerplate.<\/li>\n\n\n\n<li><strong>Guice:<\/strong> for dependency injection.<\/li>\n\n\n\n<li><strong>OpenTelemetry:<\/strong> for vendor-neutral observability.<\/li>\n\n\n\n<li><strong>Reactive programming:<\/strong> using ClickHouse\u2019s R2DBC client.<\/li>\n\n\n\n<li><strong>Liquibase:<\/strong> for automated database migrations (ClickHouse and MySQL).<\/li>\n\n\n\n<li><strong>MapStruct:<\/strong> to auto-generate Java object mappers.<\/li>\n\n\n\n<li><strong>Spotless:<\/strong> for automated code formatting.<\/li>\n\n\n\n<li><strong>PODAM:<\/strong> for autogenerated test 
data.<\/li>\n\n\n\n<li><strong>TestContainers:<\/strong> for embedded ClickHouse, MySQL, and Redis in integration tests.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-frontend\">Frontend<\/h2>\n\n\n\n<p>Opik\u2019s user interface is built with:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>TypeScript + React:<\/strong> for stateful, component-based UI development.<\/li>\n\n\n\n<li><strong>Vite:<\/strong> as a fast and modern build system.<\/li>\n\n\n\n<li><strong>Tailwind CSS:<\/strong> for utility-first styling.<\/li>\n\n\n\n<li><strong>TanStack Router:<\/strong> for client-side routing.<\/li>\n\n\n\n<li><strong>TanStack Query:<\/strong> for data synchronization and caching.<\/li>\n\n\n\n<li><strong>TanStack Table:<\/strong> for robust table rendering and interactions.<\/li>\n\n\n\n<li><strong>Zustand:<\/strong> for managing complex UI state.<\/li>\n<\/ul>\n\n\n\n<p>The frontend is served by Nginx, which also functions as a reverse proxy. In the fully open-source version, Nginx does not enforce rate limits by default (though it can be configured to do so).<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-sdks\">SDKs<\/h2>\n\n\n\n<p>Currently, Opik offers a Python SDK, and a TypeScript SDK will be released soon. Much of the boilerplate code for the SDKs is automatically generated from the OpenAPI specification using Fern. This approach helps us:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Keep SDKs in sync with API changes (the client and data models are regenerated automatically).<\/li>\n\n\n\n<li>Simplify development and reduce manual coding errors.<\/li>\n<\/ul>\n\n\n\n<p>The Python SDK uses a message queue and multiple workers, so it sends data to the Opik API asynchronously. 
This design keeps network latency and transient errors from blocking your LLM application.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-clickhouse\">ClickHouse<\/h2>\n\n\n\n<p>To meet Opik\u2019s scalability requirements for high-volume data ingestion and fast queries, we compared:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Apache Druid,<\/li>\n\n\n\n<li>ClickHouse,<\/li>\n\n\n\n<li>and StarRocks<\/li>\n<\/ul>\n\n\n\n<p>against 22 different criteria (e.g., performance, scalability, operability, popularity, and licensing). After weighing these factors against Opik\u2019s functional and non-functional requirements, we chose ClickHouse.<\/p>\n\n\n\n<p>Opik uses ClickHouse for data that requires near real-time ingestion and analytical queries, such as:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>LLM calls and traces<\/li>\n\n\n\n<li>Feedback scores<\/li>\n\n\n\n<li>Datasets and experiments<\/li>\n<\/ul>\n\n\n\n<p>ClickHouse\u2019s MergeTree engine family is vital for high ingest speeds and large data volumes. We use the ReplacingMergeTree engine variant to minimize costly data mutations (updates and deletes): with this engine, an update can be inserted as a new row version, and ClickHouse discards older versions that share the same sorting key during background merges. 
Some highlights:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Liquibase: manages schema definitions and versioning consistently.<\/li>\n\n\n\n<li>UUID v7: used for primary keys, leveraging natural timestamp ordering on disk to improve query performance.<\/li>\n\n\n\n<li>Primary key design: fields with lower cardinality appear first in the key to help with data partitioning and query efficiency.<\/li>\n\n\n\n<li>Deployment: Kubernetes Helm Charts for ClickHouse (alongside Zookeeper, the ClickHouse operator, and metrics exporters for Grafana\/Prometheus).<\/li>\n<\/ul>\n\n\n\n<p>The image below details the schema used by Opik in ClickHouse:<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"1005\" height=\"419\" src=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/opik-clickhouse.png\" alt=\"\" class=\"wp-image-12674\" srcset=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/opik-clickhouse.png 1005w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/opik-clickhouse-300x125.png 300w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/opik-clickhouse-768x320.png 768w\" sizes=\"auto, (max-width: 1005px) 100vw, 1005px\" \/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-mysql\">MySQL<\/h2>\n\n\n\n<p>MySQL provides ACID-compliant transactional storage for Opik\u2019s lower-volume but critical data, such as:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Feedback definitions<\/li>\n\n\n\n<li>Metadata containers, e.g., projects that group related traces<\/li>\n\n\n\n<li>Configuration data<\/li>\n<\/ul>\n\n\n\n<p>Again, Liquibase automates schema management and keeps MySQL definitions in sync with the rest of the platform.<\/p>\n\n\n\n<p>The image below details the schema used by Opik in MySQL:<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"512\" height=\"232\" 
src=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/opik-mysql-schema-details.jpg\" alt=\"\" class=\"wp-image-12675\" srcset=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/opik-mysql-schema-details.jpg 512w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/opik-mysql-schema-details-300x136.jpg 300w\" sizes=\"auto, (max-width: 512px) 100vw, 512px\" \/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-redis\">Redis<\/h2>\n\n\n\n<p>Redis is employed as:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>A distributed cache: for high-speed lookups.<\/li>\n\n\n\n<li>A distributed lock: for coordinating safe access to certain shared resources.<\/li>\n\n\n\n<li>A rate limiter: to enforce throughput limits and protect scalability.<\/li>\n\n\n\n<li>A streaming mechanism: Redis streams power Opik\u2019s Online evaluation functionality; future iterations may integrate Kafka or similar platforms for even higher scalability.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-self-hosting\">Self Hosting<\/h2>\n\n\n\n<p>The easiest way to use Opik is via a free comet.com account. 
However, Opik\u2019s full open-source version can also be self-hosted with all core features (tracing, evaluation, production monitoring), but without the integrated user management provided by comet.com.<\/p>\n\n\n\n<p>There are two main ways to self-host:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Local Installation\n<ul class=\"wp-block-list\">\n<li>Based on Docker and Docker Compose.<\/li>\n\n\n\n<li>Requires only Docker installed on your machine.<\/li>\n\n\n\n<li>Quick-start instructions:\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/github.com\/comet-ml\/opik\/tree\/main\/deployment\/docker-compose\">https:\/\/github.com\/comet-ml\/opik\/tree\/main\/deployment\/docker-compose<\/a><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Kubernetes Installation\n<ul class=\"wp-block-list\">\n<li>Recommended for production-ready deployments.<\/li>\n\n\n\n<li>Highly configurable open-source Helm Charts (battle-tested at Comet).<br><br>More info:\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/github.com\/comet-ml\/opik\/tree\/main\/deployment\/helm_chart\/opik\">https:\/\/github.com\/comet-ml\/opik\/tree\/main\/deployment\/helm_chart\/opik<\/a><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<p>Comet also provides scalable, managed deployment solutions if you prefer a hands-off approach.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-observability\">Observability<\/h2>\n\n\n\n<p>Opik is built and runs on top of open-source infrastructure (MySQL, Redis, Kubernetes, and more), making it straightforward to integrate with popular observability stacks such as Grafana and Prometheus. 
Specifically:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>The backend uses OpenTelemetry for vendor-neutral instrumentation.<\/li>\n\n\n\n<li>ClickHouse deployments include an operator for real-time performance monitoring and metric exports to Grafana\/Prometheus.<\/li>\n\n\n\n<li>Other components (MySQL, Redis, Kubernetes) also have well-documented strategies for monitoring.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-community-contributions\">Community Contributions<\/h2>\n\n\n\n<p>Opik\u2019s roadmap and extensibility thrive on active community collaboration. We\u2019re excited to see how users contribute by writing code, improving documentation, and sharing feature ideas. If you\u2019d like to get involved, here are a few ways to get started:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Submit <a href=\"https:\/\/github.com\/comet-ml\/opik\/issues\">feature requests<\/a> and <a href=\"https:\/\/github.com\/comet-ml\/opik\/issues\">bug reports<\/a>.<\/li>\n\n\n\n<li>Open <a href=\"https:\/\/github.com\/comet-ml\/opik\/pulls\">Pull Requests<\/a> to propose code or documentation changes.<\/li>\n<\/ul>\n\n\n\n<p>Before contributing, please make sure to review our <a href=\"https:\/\/github.com\/comet-ml\/opik\/blob\/main\/CLA.md\">Contributor License Agreement<\/a> and <a href=\"https:\/\/github.com\/comet-ml\/opik\/blob\/main\/LICENSE\">License<\/a>. This ensures a smooth process and clarifies how your contributions are used and recognized. Together, we can make Opik an even more powerful platform for <a href=\"https:\/\/www.comet.com\/site\/blog\/llm-observability\/\">LLM observability<\/a>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-future-directions\">Future Directions<\/h2>\n\n\n\n<p>Opik\u2019s architecture is designed with extensibility in mind. Recent updates include a new Online evaluation feature that allows traces to be scored in real time, using an LLM as a judge. 
We plan to add user-defined Python code metrics soon, and a TypeScript\/JavaScript SDK is also underway.<\/p>\n\n\n\n<p>Some upcoming features will introduce notable architectural changes. For example, we plan to support file attachments like images or PDFs in new traces, which will require integrating an object storage system (e.g., Amazon S3 for AWS-based deployments or MinIO for self-hosting). You can explore more details on our public roadmap:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/www.comet.com\/docs\/opik\/roadmap\/\">https:\/\/www.comet.com\/docs\/opik\/roadmap\/<\/a><\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-conclusion\">Conclusion<\/h2>\n\n\n\n<p>Opik is a significant step forward in LLM evaluation and observability, combining cutting-edge technologies with a carefully planned, modular architecture. Its open-source nature and free availability empower a growing community of users, while Comet\u2019s infrastructure offers scaling options and commercial support if needed. Whether you adopt the managed service or self-host Opik, you gain a powerful, flexible framework for building next-generation LLM applications.<\/p>\n\n\n\n<p>For more information, visit Opik\u2019s <a href=\"https:\/\/github.com\/comet-ml\/opik\">GitHub repository<\/a> and <a href=\"https:\/\/www.comet.com\/docs\/opik\/\">documentation<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Opik is an open-source platform for evaluating, testing, and monitoring LLM applications, created by Comet. When teams integrate language models into their applications, they need ways to debug complex systems, analyze performance, and understand how their development work affects responses returned by an LLM in terms of accuracy, relevance, context awareness, and other qualities. 
The [&hellip;]<\/p>\n","protected":false},"author":132,"featured_media":12650,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"customer_name":"","customer_description":"","customer_industry":"","customer_technologies":"","customer_logo":"","_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[65,9],"tags":[],"coauthors":[227],"class_list":["post-12649","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-llmops","category-product"],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v25.9 (Yoast SEO v25.9) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Building Opik: A Scalable Open-Source LLM Observability Platform<\/title>\n<meta name=\"description\" content=\"Comet&#039;s engineering team shares architectural decisions and more in a behind-the-scenes look at building this highly scalable production system.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.comet.com\/site\/blog\/llm-observability-architecture-engineering\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Building Opik: A Scalable Open-Source LLM Observability Platform\" \/>\n<meta property=\"og:description\" content=\"Comet&#039;s engineering team shares architectural decisions and more in a behind-the-scenes look at building this highly scalable production system.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.comet.com\/site\/blog\/llm-observability-architecture-engineering\/\" \/>\n<meta property=\"og:site_name\" content=\"Comet\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/cometdotml\" \/>\n<meta property=\"article:published_time\" content=\"2025-01-29T17:22:36+00:00\" 
\/>\n<meta property=\"article:modified_time\" content=\"2025-11-11T22:19:18+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/BuildingOpik-Blog-scaled-1.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"2560\" \/>\n\t<meta property=\"og:image:height\" content=\"1440\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Andr\u00e9s Cruz\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@Cometml\" \/>\n<meta name=\"twitter:site\" content=\"@Cometml\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Andr\u00e9s Cruz\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"10 minutes\" \/>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"Building Opik: A Scalable Open-Source LLM Observability Platform","description":"Comet's engineering team shares architectural decisions and more in a behind-the-scenes look at building this highly scalable production system.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.comet.com\/site\/blog\/llm-observability-architecture-engineering\/","og_locale":"en_US","og_type":"article","og_title":"Building Opik: A Scalable Open-Source LLM Observability Platform","og_description":"Comet's engineering team shares architectural decisions and more in a behind-the-scenes look at building this highly scalable production 
system.","og_url":"https:\/\/www.comet.com\/site\/blog\/llm-observability-architecture-engineering\/","og_site_name":"Comet","article_publisher":"https:\/\/www.facebook.com\/cometdotml","article_published_time":"2025-01-29T17:22:36+00:00","article_modified_time":"2025-11-11T22:19:18+00:00","og_image":[{"width":2560,"height":1440,"url":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/BuildingOpik-Blog-scaled-1.jpg","type":"image\/jpeg"}],"author":"Andr\u00e9s Cruz","twitter_card":"summary_large_image","twitter_creator":"@Cometml","twitter_site":"@Cometml","twitter_misc":{"Written by":"Andr\u00e9s Cruz","Est. reading time":"10 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.comet.com\/site\/blog\/llm-observability-architecture-engineering\/#article","isPartOf":{"@id":"https:\/\/www.comet.com\/site\/blog\/llm-observability-architecture-engineering\/"},"author":{"name":"Mike Ranellone","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/b0df8d0db9a521af425e33f561b39c6a"},"headline":"Building Opik: A Scalable Open-Source LLM Observability Platform","datePublished":"2025-01-29T17:22:36+00:00","dateModified":"2025-11-11T22:19:18+00:00","mainEntityOfPage":{"@id":"https:\/\/www.comet.com\/site\/blog\/llm-observability-architecture-engineering\/"},"wordCount":1996,"publisher":{"@id":"https:\/\/www.comet.com\/site\/#organization"},"image":{"@id":"https:\/\/www.comet.com\/site\/blog\/llm-observability-architecture-engineering\/#primaryimage"},"thumbnailUrl":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/BuildingOpik-Blog-scaled-1.jpg","articleSection":["LLMOps","Product"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/www.comet.com\/site\/blog\/llm-observability-architecture-engineering\/","url":"https:\/\/www.comet.com\/site\/blog\/llm-observability-architecture-engineering\/","name":"Building Opik: A Scalable Open-Source LLM Observability 
Platform","isPartOf":{"@id":"https:\/\/www.comet.com\/site\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.comet.com\/site\/blog\/llm-observability-architecture-engineering\/#primaryimage"},"image":{"@id":"https:\/\/www.comet.com\/site\/blog\/llm-observability-architecture-engineering\/#primaryimage"},"thumbnailUrl":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/BuildingOpik-Blog-scaled-1.jpg","datePublished":"2025-01-29T17:22:36+00:00","dateModified":"2025-11-11T22:19:18+00:00","description":"Comet's engineering team shares architectural decisions and more in a behind-the-scenes look at building this highly scalable production system.","breadcrumb":{"@id":"https:\/\/www.comet.com\/site\/blog\/llm-observability-architecture-engineering\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.comet.com\/site\/blog\/llm-observability-architecture-engineering\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/blog\/llm-observability-architecture-engineering\/#primaryimage","url":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/BuildingOpik-Blog-scaled-1.jpg","contentUrl":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/BuildingOpik-Blog-scaled-1.jpg","width":2560,"height":1440,"caption":"building an open-source llm observability platform at scale"},{"@type":"BreadcrumbList","@id":"https:\/\/www.comet.com\/site\/blog\/llm-observability-architecture-engineering\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.comet.com\/site\/"},{"@type":"ListItem","position":2,"name":"Building Opik: A Scalable Open-Source LLM Observability Platform"}]},{"@type":"WebSite","@id":"https:\/\/www.comet.com\/site\/#website","url":"https:\/\/www.comet.com\/site\/","name":"Comet","description":"Build Better Models 
Faster","publisher":{"@id":"https:\/\/www.comet.com\/site\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.comet.com\/site\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.comet.com\/site\/#organization","name":"Comet ML, Inc.","alternateName":"Comet","url":"https:\/\/www.comet.com\/site\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/#\/schema\/logo\/image\/","url":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/logo_comet_square.png","contentUrl":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/logo_comet_square.png","width":310,"height":310,"caption":"Comet ML, Inc."},"image":{"@id":"https:\/\/www.comet.com\/site\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/cometdotml","https:\/\/x.com\/Cometml","https:\/\/www.youtube.com\/channel\/UCmN63HKvfXSCS-UwVwmK8Hw"]},{"@type":"Person","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/b0df8d0db9a521af425e33f561b39c6a","name":"Mike Ranellone","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/image\/f3d00b8d32dc8a46ea04b9a2ad465d29","url":"https:\/\/secure.gravatar.com\/avatar\/56dc2f32e4fc99604d8c4344d1a10237e5298fd2609bbc8d79d5ef1ab5b2e3a1?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/56dc2f32e4fc99604d8c4344d1a10237e5298fd2609bbc8d79d5ef1ab5b2e3a1?s=96&d=mm&r=g","caption":"Mike 
Ranellone"},"sameAs":["https:\/\/www.comet.com\/"],"url":"https:\/\/www.comet.com\/site\/blog\/author\/mikercomet-com\/"}]}},"jetpack_featured_media_url":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/BuildingOpik-Blog-scaled-1.jpg","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts\/12649","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/users\/132"}],"replies":[{"embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/comments?post=12649"}],"version-history":[{"count":2,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts\/12649\/revisions"}],"predecessor-version":[{"id":18355,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts\/12649\/revisions\/18355"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/media\/12650"}],"wp:attachment":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/media?parent=12649"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/categories?post=12649"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/tags?post=12649"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/coauthors?post=12649"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}