Guaranteed availability

Your agent shouldn’t break because of a single failed network call. The SDK has two layers of protection built in — an in-memory cache and a hardcoded fallback — so your agent always has a configuration to work with.

Caching layer

The SDK caches the last successfully fetched configuration in memory with a default TTL of 300 seconds (5 minutes). A background thread refreshes stale entries before they expire, so your agent never blocks on a network call during normal operation.

If Opik is unreachable during a refresh, the SDK keeps serving the cached value — your agent keeps running.

You can tune the cache TTL by setting the OPIK_CONFIG_TTL_SECONDS environment variable:

export OPIK_CONFIG_TTL_SECONDS=60  # refresh every minute
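The caching behavior described above can be pictured with a small sketch. This is not the SDK's implementation: `StaleTolerantCache`, `fetch`, and `clock` are hypothetical names, and the refresh happens synchronously on read rather than on a background thread, purely to keep the example short. It also assumes the TTL variable is parsed as a number of seconds.

```python
import os
import time
from typing import Callable, Optional


class StaleTolerantCache:
    """Illustrative only: a hypothetical class, not the Opik SDK's cache."""

    def __init__(
        self,
        fetch: Callable[[], dict],
        ttl_seconds: Optional[float] = None,
        clock: Callable[[], float] = time.monotonic,
    ):
        if ttl_seconds is None:
            # Same variable name and 300-second default as described above
            # (how the SDK parses it is an assumption).
            ttl_seconds = float(os.environ.get("OPIK_CONFIG_TTL_SECONDS", "300"))
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Optional[dict] = None
        self._fetched_at: Optional[float] = None

    def get(self) -> Optional[dict]:
        now = self._clock()
        expired = self._fetched_at is None or now - self._fetched_at >= self._ttl
        if expired:
            try:
                self._value = self._fetch()
                self._fetched_at = now
            except Exception:
                # Backend unreachable: keep serving the last good value
                # (None if nothing was ever fetched).
                pass
        return self._value
```

The key design point is the `except` branch: a failed refresh leaves the previous value in place instead of propagating the error to the caller.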

Fallback configuration

You can provide a hardcoded fallback for the very first call (cold start), or for any call where the cache is empty and Opik is unreachable. The SDK returns it instead of raising an error, so your agent can start serving requests right away, even before it has contacted the backend.

import opik

client = opik.Opik()

class MyConfig(opik.Config):
    model: str = "gpt-4o-mini"
    temperature: float = 0.7
    system_prompt: opik.Prompt = None

FALLBACK = MyConfig(
    model="gpt-4o-mini",
    temperature=0.7,
    system_prompt=opik.Prompt(
        name="system_prompt",
        prompt="You are an AI assistant.",
    ),
)

@opik.track(project_name="my-agent")
def run_agent(user_input: str):
    cfg = client.get_or_create_config(fallback=FALLBACK)

    if cfg.is_fallback:
        print("Warning: running on fallback configuration")

    # call_llm stands in for your own LLM call; it is not defined here.
    response = call_llm(
        model=cfg.model,
        temperature=cfg.temperature,
        system_prompt=str(cfg.system_prompt),
    )
    return response

Define your fallback as a top-level constant so it stays consistent across calls and is easy to spot in code review.

Detecting fallback usage

The returned config has an is_fallback (Python) / isFallback (TypeScript) property. It is true when the SDK used your local fallback instead of a backend-fetched configuration, which makes it useful for logging or alerting.
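As a rough illustration of the logging and alerting use case, the helper below warns and counts each time a config reports `is_fallback`. The names `note_config_source`, `fallback_counter`, and the logger name are hypothetical, and the in-process counter is a stand-in for whatever metrics client you already use. The only thing assumed about the config object is the `is_fallback` attribute described above.

```python
import logging
from collections import Counter

logger = logging.getLogger("my_agent.config")

# Hypothetical in-process counter; swap in your own metrics client.
fallback_counter = Counter()


def note_config_source(cfg) -> bool:
    """Log a warning and bump the counter when cfg came from the local fallback."""
    used_fallback = bool(getattr(cfg, "is_fallback", False))
    if used_fallback:
        fallback_counter["fallback"] += 1
        logger.warning(
            "Serving fallback configuration (hits so far: %d)",
            fallback_counter["fallback"],
        )
    return used_fallback
```

Calling `note_config_source(cfg)` right after fetching the config keeps the check in one place, so an alert rule can watch the counter rather than parsing printed output.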

In practice you’ll rarely hit the fallback. Once the first successful fetch populates the cache, the SDK keeps it fresh in the background. The fallback only kicks in during a cold start when the backend isn’t reachable.
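One way to read the resolution order described above is sketched below. This is an interpretation, not the SDK's code: `resolve_config`, `cached`, and `fetch` are hypothetical names, and plain dictionaries stand in for config objects.

```python
from typing import Callable, Optional, Tuple


def resolve_config(
    cached: Optional[dict],
    fetch: Callable[[], dict],
    fallback: dict,
) -> Tuple[dict, bool]:
    """Return (config, is_fallback) for a single lookup."""
    if cached is not None:
        # Warm cache: serve it, the fallback is never consulted.
        return cached, False
    try:
        # Cold start with a reachable backend.
        return fetch(), False
    except Exception:
        # Cold start with an unreachable backend: the only fallback path.
        return fallback, True
```

Under this reading, the fallback flag can only be set on the last branch, which is why it should be rare once the first fetch has succeeded.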