ARC-AGI Optimization Tutorial
Tutorial example using ARC-AGI style code tasks
This guide introduces ARC-AGI, why it is a strong fit for optimizer-driven prompt iteration, and where to find the full, runnable implementation in the SDK.
Codebase entry point: sdks/opik_optimizer/scripts/arc_agi/tasks_optimizer.py and the ARC-AGI utilities in sdks/opik_optimizer/scripts/arc_agi/.
What is ARC-AGI?
ARC-AGI tasks are grid-based reasoning puzzles that test an agent's ability to infer transformation rules from a few examples. They are a natural fit for optimization because small prompt changes can dramatically improve generalization across tasks.
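To make the task shape concrete, here is a minimal sketch of a toy puzzle and a candidate solver. It assumes the public ARC JSON layout (a "train" and a "test" list of input/output grid pairs). The task itself, the transform function, and the fits_examples helper are hypothetical illustrations, not code from the SDK.

```python
from typing import Callable, Dict, List

Grid = List[List[int]]  # each cell holds a colour index

# Hypothetical toy task, laid out like a public ARC JSON task:
# a few "train" demonstrations plus held-out "test" pairs.
toy_task: Dict[str, List[Dict[str, Grid]]] = {
    "train": [
        {"input": [[1, 0, 0], [2, 0, 0]], "output": [[0, 0, 1], [0, 0, 2]]},
        {"input": [[0, 3], [4, 0]], "output": [[3, 0], [0, 4]]},
    ],
    "test": [{"input": [[5, 0, 6]], "output": [[6, 0, 5]]}],
}


def transform(grid: Grid) -> Grid:
    """Candidate rule inferred from the train pairs: mirror each row left to right."""
    return [list(reversed(row)) for row in grid]


def fits_examples(solver: Callable[[Grid], Grid], pairs: List[Dict[str, Grid]]) -> bool:
    """Return True when the solver reproduces every expected output exactly."""
    return all(solver(pair["input"]) == pair["output"] for pair in pairs)


train_ok = fits_examples(transform, toy_task["train"])
test_ok = fits_examples(transform, toy_task["test"])
print(train_ok, test_ok)
```

A rule that fits the train pairs but fails the test pair has overfit the demonstrations, which is the generalization gap that prompt changes aim to close.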
Why use optimizers here?
ARC-AGI evaluation is deterministic and repeatable, which makes it ideal for iterative optimization. HRPO is especially useful because it captures failure modes and proposes targeted fixes.
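The determinism comes from comparing grids directly: the same prediction always earns the same score. The sketch below shows two such scoring functions. The names exact_match and cell_accuracy are assumptions made for illustration and may not match the metrics the SDK actually ships.

```python
from typing import List

Grid = List[List[int]]


def exact_match(predicted: Grid, expected: Grid) -> float:
    """1.0 only when the whole grid matches, otherwise 0.0."""
    return 1.0 if predicted == expected else 0.0


def cell_accuracy(predicted: Grid, expected: Grid) -> float:
    """Fraction of matching cells; 0.0 when the grid shapes differ."""
    same_shape = len(predicted) == len(expected) and all(
        len(p_row) == len(e_row) for p_row, e_row in zip(predicted, expected)
    )
    total = sum(len(row) for row in expected)
    if not same_shape or total == 0:
        return 0.0
    hits = sum(
        p_cell == e_cell
        for p_row, e_row in zip(predicted, expected)
        for p_cell, e_cell in zip(p_row, e_row)
    )
    return hits / total


expected = [[1, 2], [3, 4]]
near_miss = [[1, 2], [3, 0]]
print(exact_match(near_miss, expected), cell_accuracy(near_miss, expected))
```

A partial-credit signal such as cell accuracy can separate a near miss from a complete failure, which gives an optimizer more to work with than a pass/fail score alone.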
How the SDK implementation works
The SDK ships a full ARC-AGI workflow you can run locally:
- Dataset loader: sdks/opik_optimizer/src/opik_optimizer/datasets/arc_agi2.py loads ARC-AGI-2 tasks and embeds optional grid images.
- Prompt templates: sdks/opik_optimizer/scripts/arc_agi/prompts/ contains system and HRPO prompt templates.
- Evaluator + metrics: sdks/opik_optimizer/scripts/arc_agi/utils/code_evaluator.py executes candidate solvers and scores ARC-AGI metrics via utils/metrics.py.
- Optimizer wiring: tasks_optimizer.py connects dataset, HRPO, metrics, and logging into a repeatable run.
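The "execute candidate solvers and score them" step can be sketched roughly as follows. This is not the code in code_evaluator.py: the evaluate_candidate function, the convention that a candidate defines transform(grid), and the failure-note format are all assumptions made for illustration.

```python
from typing import Any, Dict, List, Tuple

Grid = List[List[int]]


def evaluate_candidate(source: str, pairs: List[Dict[str, Grid]]) -> Tuple[float, List[str]]:
    """Run solver source that defines transform(grid); return (score, failure notes)."""
    namespace: Dict[str, Any] = {}
    try:
        # Assumption: no sandboxing here, so only run trusted code this way.
        exec(source, namespace)
    except Exception as exc:
        return 0.0, [f"load error: {exc!r}"]

    solver = namespace.get("transform")
    if not callable(solver):
        return 0.0, ["source does not define transform(grid)"]

    passed = 0
    failures: List[str] = []
    for index, pair in enumerate(pairs):
        try:
            predicted = solver(pair["input"])
        except Exception as exc:
            failures.append(f"pair {index}: raised {exc!r}")
            continue
        if predicted == pair["output"]:
            passed += 1
        else:
            failures.append(f"pair {index}: expected {pair['output']}, got {predicted}")

    score = passed / len(pairs) if pairs else 0.0
    return score, failures


pairs = [
    {"input": [[1, 0]], "output": [[0, 1]]},
    {"input": [[2, 3]], "output": [[3, 2]]},
]
good_source = "def transform(grid):\n    return [row[::-1] for row in grid]\n"
bad_source = "def transform(grid):\n    return grid\n"

good_result = evaluate_candidate(good_source, pairs)
bad_result = evaluate_candidate(bad_source, pairs)
print(good_result)
print(bad_result[0], len(bad_result[1]))
```

Returning failure notes alongside the score is a deliberate choice in this sketch: a reflective optimizer needs a description of what went wrong, not just a number, in order to propose a targeted fix.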
If you want to run the code as-is, start with the tasks_optimizer.py entry point and follow the CLI flags listed at the top of that file.
Next steps
- Explore the ARC-AGI scripts in the repo and swap in your own datasets or prompt templates.
- Review the run summaries under scripts/arc_agi/ to compare optimizer iterations.