Contributing to the Backend

This guide will help you get started with contributing to the Opik backend.

Before you start, please review our general Contribution Overview and the Contributor License Agreement (CLA).

Project Structure

The Opik backend is a Java application (source in apps/opik-backend) that forms the core of the Opik platform. It handles data ingestion, storage, API requests, and more.

Setting Up Your Development Environment

1. Run External Dependencies

The backend relies on ClickHouse, MySQL, and Redis. You can start these using Docker Compose from the deployment/docker-compose directory in the repository:

$cd deployment/docker-compose
>
># Optionally, force a pull of the latest images
># docker compose pull
>
>docker compose up clickhouse redis mysql -d

2. Install Prerequisites for Backend Development

Ensure you have the following installed:

  • Java Development Kit (JDK) - Version specified in pom.xml (e.g., JDK 17 or newer)
  • Apache Maven - For building the project
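
Before building, you can confirm both tools are on your PATH (the exact version output will vary by machine):

$java -version
>mvn -version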

3. Build and Run the Backend Locally

Navigate to the backend project directory and use Maven commands:

$cd apps/opik-backend
>
># Build the Opik application (compiles, runs tests, packages)
>mvn clean install
>
># Start the Opik application server
>java -jar target/opik-backend-{project.pom.version}.jar server config.yml

Replace {project.pom.version} with the actual version number found in the pom.xml file (e.g., 0.1.0-SNAPSHOT). Once started, the Opik API should be accessible at http://localhost:8080.
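
If you prefer not to look up the version by hand, one option (assuming a standard Maven setup) is to ask Maven to print it:

$cd apps/opik-backend
>
># Print the project version from pom.xml
>mvn help:evaluate -Dexpression=project.version -q -DforceStdout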

4. Code Formatting

We use Spotless for code formatting. Before submitting a PR, please ensure your code is formatted correctly:

$cd apps/opik-backend
>mvn spotless:apply

Our CI (Continuous Integration) checks formatting with mvn spotless:check and fails the build if any file is not formatted correctly.
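
You can run the same check locally to catch formatting failures before CI does:

$cd apps/opik-backend
>mvn spotless:check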

5. Running Tests

Ensure your changes pass all backend tests:

$cd apps/opik-backend
>mvn test

The test suite uses the Testcontainers library to run integration tests against real instances of the external services (ClickHouse, MySQL, etc.). Ports for these services are assigned randomly by the library during tests to avoid conflicts.
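
While iterating on a change, it can be faster to run a single test class through Maven Surefire; the class name below is a placeholder, so substitute one from the test sources:

$cd apps/opik-backend
>
># Run only the named test class (placeholder name)
>mvn test -Dtest=YourResourceTest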

6. Submitting a Pull Request

After implementing your changes, ensuring tests pass, and formatting your code, commit your work and open a Pull Request against the main branch of the comet-ml/opik repository.
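
A typical flow, assuming you work from a fork of comet-ml/opik, looks like the following sketch (the branch name and commit message are placeholders):

$git checkout -b fix/backend-change
>
># ...edit, format, and test your changes...
>
>git commit -am "Describe your change"
>git push origin fix/backend-change
># Then open a Pull Request against the main branch on GitHub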

Advanced Backend Topics

To check the health of your locally running backend application, you can access the health check endpoint in your browser or via curl at http://localhost:8080/healthcheck.
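
For example:

$curl http://localhost:8080/healthcheck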

Opik uses Liquibase for managing database schema changes (DDL migrations) for both MySQL and ClickHouse.

  • Location: Migrations are located at apps/opik-backend/src/main/resources/liquibase/{{DB}}/migrations (where {{DB}} is db for MySQL or dbAnalytics for ClickHouse).
  • Automation: Execution is typically automated via the apps/opik-backend/run_db_migrations.sh script, Docker images, and Helm charts in deployed environments.
  • Manual Execution: To run DDL migrations manually (replace {project.pom.version} and {database} as needed):
    • Check pending migrations: java -jar target/opik-backend-{project.pom.version}.jar {database} status config.yml
    • Run migrations: java -jar target/opik-backend-{project.pom.version}.jar {database} migrate config.yml
    • Create schema tag: java -jar target/opik-backend-{project.pom.version}.jar {database} tag config.yml {tag_name}
    • Rollback migrations: java -jar target/opik-backend-{project.pom.version}.jar {database} rollback config.yml --count 1 or java -jar target/opik-backend-{project.pom.version}.jar {database} rollback config.yml --tag {tag_name}
  • Requirements for DDL Migrations:
    • Must be backward compatible (new fields optional/defaulted, column removal in stages, no renaming of active tables/columns).
    • Must be independent of application code changes.
    • Must not cause downtime.
    • Must have a unique name.
    • Must contain a rollback statement (or use empty if Liquibase cannot auto-generate one); see the sketch after this list. Refer to Evolutionary Database Design and the Liquibase Rollback docs.
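
To illustrate the rollback requirement, here is a minimal sketch of a Liquibase formatted-SQL changeset. The file name, author, table, and column are all hypothetical; follow the naming conventions you see in the existing migrations directory:

$cat > src/main/resources/liquibase/db/migrations/xxx_add_notes_column.sql <<'EOF'
>--liquibase formatted sql
>--changeset your_name:xxx_add_notes_column
>ALTER TABLE example_table ADD COLUMN notes VARCHAR(255) DEFAULT NULL;
>--rollback ALTER TABLE example_table DROP COLUMN notes;
>EOF

The new column is optional with a default, which keeps the change backward compatible, and the --rollback line tells Liquibase exactly how to undo it.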

DML (Data Manipulation Language) migrations are for changes to data itself, not the schema.

  • Execution: These are not run automatically. They must be run manually by a system admin using a database client.
  • Documentation: DML migrations are documented in CHANGELOG.md, and the scripts are placed at apps/opik-backend/data-migrations along with detailed instructions.
  • Requirements for DML Migrations:
    • Must be backward compatible (no data deletion unless 100% safe, allow rollback, no performance degradation).
    • Must include detailed execution instructions.
    • Must be batched appropriately to avoid disrupting operations (see the sketch after this list).
    • Must not cause downtime.
    • Must have a unique name.
    • Must contain a rollback statement.
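
As an illustration of the batching requirement, a DML backfill against MySQL can process rows in bounded chunks until none remain. This is only a sketch: the table, column, credentials, and batch size are hypothetical:

$while :; do
>  n=$(mysql -N -u opik -popik opik -e \
>    "UPDATE example_table SET status = 'migrated' WHERE status = 'pending' LIMIT 10000; SELECT ROW_COUNT();")
>  [ "$n" -eq 0 ] && break  # Stop once a batch updates no rows
>done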

You can query the ClickHouse REST endpoint directly. For example, to get the version:

$echo 'SELECT version()' | curl -H 'X-ClickHouse-User: opik' -H 'X-ClickHouse-Key: opik' 'http://localhost:8123/' -d @-

Sample output: 23.8.15.35