ARC-AGI-3 Kaggle Gateway Deployment Specification
Technical specification for the official ARC Prize 2026 Kaggle evaluation contract: validation vs scoring rerun, gateway sidecar integration, agent registration, submission.parquet schema, template-agent requirements, and shared ASRA notebook tooling. Documents the root cause of generic Kaggle Error failures and the gateway pattern that produced the first successful submission.
Status: Published preprint (SciLayer Systems)
Companion: ASRA Integrated Architecture
Purpose: Define the evaluation contract for ARC Prize 2026 — ARC-AGI-3 on Kaggle so notebooks pass validation and private scoring rerun. This spec is derived from the official ARC-AGI-3 Kaggle Starter and ASRA's gateway migration (Phase 1 v3, ref 53652655, public score 0.03).
1. Problem statement
Many ARC-AGI-3 submissions fail with a generic message:
Kaggle Error: A system error. Please try resubmitting to resolve the error and contact Kaggle Support if it persists.
Misleading symptom: Save & Run All succeeds, submission.parquet appears in notebook outputs, and the Kaggle API returns a submission reference. Failure occurs only on the private scoring rerun — a second execution Kaggle triggers after Submit.
Root cause: The notebook implemented the wrong evaluation contract — typically a standalone agent that calls run_swarm() against the public ARC API, instead of registering with the gateway sidecar and running main.py --agent myagent.
This is an evaluation plumbing problem, independent of agent intelligence. A perfect agent will still ERROR if the gateway path is skipped.
2. Two execution modes
Kaggle ARC-AGI-3 notebooks must branch on KAGGLE_IS_COMPETITION_RERUN:
| Mode | Trigger | What runs | Who writes submission.parquet |
|---|---|---|---|
| Validation | env var unset (Save & Run All, submit gate) | Install wheels, write agent to /tmp, write dummy parquet | Notebook (placeholder row) |
| Scoring rerun | KAGGLE_IS_COMPETITION_RERUN=1 (hidden, after Submit) | Wait for gateway → copy framework → register agent → python main.py --agent myagent | Gateway sidecar records actions and emits real parquet |
Submit click
 │
 ├─► Validation run (user-visible)
 │      Install → write /tmp/my_agent.py → dummy parquet
 │
 └─► Scoring rerun (hidden)
        curl gateway health
        → copy ARC-AGI-3-Agents
        → register MyAgent
        → main.py --agent myagent
        → gateway writes submission.parquet
Design implication: Track A (plumbing) and Track B (agent score) are separate. Fix Track A before iterating on Track B.
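The mode split above can be sketched as a small helper. This is a minimal illustration, not the starter's code: the function name `select_mode` is hypothetical, and treating any non-empty value of the variable as "set" is an assumption (the table only documents the value 1).

```python
import os

RERUN_FLAG = "KAGGLE_IS_COMPETITION_RERUN"


def select_mode(environ=None):
    """Return 'scoring' when the rerun flag is set, otherwise 'validation'.

    Assumption: any non-empty value counts as set; the spec documents '1'.
    """
    env = os.environ if environ is None else environ
    return "scoring" if env.get(RERUN_FLAG) else "validation"


# In the notebook, the result would choose between the rerun branch (3.3)
# and the dummy-submission branch (3.4).
mode = select_mode()
print(f"mode: {mode}")
```

Passing the environment as an argument keeps the branch testable locally, where the variable is never set.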
3. Notebook cell layout (required)
Four code cells, in order:
3.1 Install
Install from competition wheels only (no internet):
!pip install --no-index --find-links \
    /kaggle/input/competitions/arc-prize-2026-arc-agi-3/arc_agi_3_wheels \
    arc-agi python-dotenv
3.2 Write agent
Use %%writefile /tmp/my_agent.py — not /kaggle/working/my_agent.py. Keeping the agent in /tmp avoids extra output files in the Submit UI and matches the official starter.
3.3 Run / score (rerun branch)
When KAGGLE_IS_COMPETITION_RERUN is set:
- Health check: curl --fail --retry … http://gateway:8001/api/games
- Copy framework: cp -r …/ARC-AGI-3-Agents /kaggle/working/
- Install agent: cp /tmp/my_agent.py …/agents/templates/my_agent.py
- Register agent: rewrite agents/__init__.py with 'myagent': MyAgent
- Gateway config: write .env with HOST=gateway, PORT=8001, OPERATION_MODE=online
- Execute: cd …/ARC-AGI-3-Agents && python main.py --agent myagent
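The health-check step can also be expressed in Python instead of curl. The sketch below is a stand-in under stated assumptions: the function names, the `attempts` and `delay_s` defaults, and the 5-second timeout are hypothetical, since the retry settings are elided in the list above. Only the URL comes from this spec.

```python
import http.client
import time
import urllib.request

GATEWAY_HEALTH_URL = "http://gateway:8001/api/games"  # URL from the spec


def http_ok(url):
    """Return True if a GET on url answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return 200 <= resp.status < 300
    except (OSError, http.client.HTTPException):
        # URLError/HTTPError are OSError subclasses: gateway not ready yet.
        return False


def wait_for_gateway(url=GATEWAY_HEALTH_URL, attempts=10, delay_s=1.0,
                     probe=http_ok, sleep=time.sleep):
    """Poll url until probe succeeds; False once attempts are exhausted.

    attempts and delay_s are illustrative defaults, not starter values.
    """
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(delay_s)
    return False
```

A rerun branch would call `wait_for_gateway()` first and stop early if it returns False, rather than launching main.py against a sidecar that is not yet listening.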
3.4 Dummy submission (validation branch)
When not rerun:
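import pandas as pd

# Placeholder row that only satisfies the validation gate;
# the gateway writes the real rows during the scoring rerun.
submission = pd.DataFrame(
    data=[['1_0', '1', True, 1]],
    columns=['row_id', 'game_id', 'end_of_game', 'score'],
)
submission.to_parquet('/kaggle/working/submission.parquet', index=False)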
Required columns: row_id, game_id, end_of_game, score.
Do not use alternate schemas such as game_id, score, levels_completed, actions, completed — Run All may pass, but the end-to-end pipeline expects the starter contract.
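A quick local guard against schema drift might look like the following. It is a sketch only: `check_submission_columns` is a hypothetical helper, and whether the real pipeline rejects extra columns or cares about column order is not stated here, so the function reports differences without deciding on them.

```python
REQUIRED_COLUMNS = ("row_id", "game_id", "end_of_game", "score")


def check_submission_columns(columns):
    """Compare column names against the starter schema.

    Returns (missing, unexpected), each a list in a stable order.
    """
    present = list(columns)
    missing = [c for c in REQUIRED_COLUMNS if c not in present]
    unexpected = [c for c in present if c not in REQUIRED_COLUMNS]
    return missing, unexpected


# The alternate schema named above fails the check:
legacy = ["game_id", "score", "levels_completed", "actions", "completed"]
missing, unexpected = check_submission_columns(legacy)
print("missing:", missing)
print("unexpected:", unexpected)
```

With a pandas frame, the call would be `check_submission_columns(submission.columns)`.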
4. Template agent requirements
The agent spliced into /tmp/my_agent.py must satisfy:
| Requirement | Rationale |
|---|---|
| Class named MyAgent | Framework registration key 'myagent' |
| Subclass agents.agent.Agent | Official agent interface |
| No venv bootstrap in agent file | Wheels installed in notebook cell 1 |
| No run_swarm() in agent file | Scoring runs via main.py, not direct Swarm |
| No if __name__ == "__main__" block | Agent is imported, not executed standalone |
| Implement choose_action(frames, latest_frame) | Per-step decision hook |
| Optional append_frame post-step hook | Transition logging inside episode |
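Several of these requirements can be checked statically before a notebook is built. The sketch below uses Python's standard `ast` module; `check_template_agent` is a hypothetical helper, not part of the ASRA tooling described here. It matches the base class by name only and does not detect venv bootstrap code, so it is a partial check.

```python
import ast


def _is_main_guard(test):
    """True for the expression `__name__ == "__main__"`."""
    return (isinstance(test, ast.Compare)
            and isinstance(test.left, ast.Name)
            and test.left.id == "__name__"
            and len(test.comparators) == 1
            and isinstance(test.comparators[0], ast.Constant)
            and test.comparators[0].value == "__main__")


def _name_of(node):
    """Best-effort name for a Name or Attribute node, else None."""
    return node.id if isinstance(node, ast.Name) else getattr(node, "attr", None)


def check_template_agent(source):
    """Return a list of violations of the template rules (empty if none)."""
    tree = ast.parse(source)
    problems = []
    cls = next((n for n in tree.body
                if isinstance(n, ast.ClassDef) and n.name == "MyAgent"), None)
    if cls is None:
        problems.append("no top-level class named MyAgent")
    else:
        if "Agent" not in {_name_of(b) for b in cls.bases}:
            problems.append("MyAgent does not subclass Agent")
        methods = {n.name for n in cls.body if isinstance(n, ast.FunctionDef)}
        if "choose_action" not in methods:
            problems.append("MyAgent does not define choose_action")
    if any(isinstance(n, ast.Call) and _name_of(n.func) == "run_swarm"
           for n in ast.walk(tree)):
        problems.append("run_swarm() is called")
    if any(isinstance(n, ast.If) and _is_main_guard(n.test) for n in tree.body):
        problems.append('if __name__ == "__main__" block present')
    return problems
```

Running it on the text written to /tmp/my_agent.py, for example `check_template_agent(open("/tmp/my_agent.py").read())`, should return an empty list for a conforming template. Because the source is only parsed, never imported, the check works without the framework installed.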
Two deployment shapes:
| Shape | Path | Use |
|---|---|---|
| Kaggle template | asra_phaseN_kaggle_template_agent.py | Competition scoring |
| Local dev | asra_phaseN_my_agent.py | Swarm, self-test, local episodes |
Only the template form belongs in the Kaggle notebook.
5. Gateway sidecar configuration
The scoring rerun writes .env for the copied framework:
SCHEME=http
HOST=gateway
PORT=8001
ARC_API_KEY=test-key-123
ARC_BASE_URL=http://gateway:8001/
OPERATION_MODE=online
ENVIRONMENTS_DIR=
RECORDINGS_DIR=/kaggle/working/server_recording
The sidecar at http://gateway:8001/ proxies game interaction during scoring. Agents must not hardcode the public three.arcprize.org endpoint for Kaggle scoring.
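One way to produce this file from the rerun branch is to render it from a dictionary. This is a sketch: the keys and values are copied from the listing above, while `render_env`, `base_url`, and `write_env` are hypothetical helper names, and the target directory is left as a parameter because the framework path is not spelled out in full here.

```python
from pathlib import Path

# Keys and values as listed in this section.
GATEWAY_ENV = {
    "SCHEME": "http",
    "HOST": "gateway",
    "PORT": "8001",
    "ARC_API_KEY": "test-key-123",
    "ARC_BASE_URL": "http://gateway:8001/",
    "OPERATION_MODE": "online",
    "ENVIRONMENTS_DIR": "",
    "RECORDINGS_DIR": "/kaggle/working/server_recording",
}


def render_env(settings):
    """Render settings as KEY=VALUE lines, one per entry."""
    return "".join(f"{key}={value}\n" for key, value in settings.items())


def base_url(settings):
    """Rebuild the base URL from SCHEME, HOST, and PORT for a consistency check."""
    return f"{settings['SCHEME']}://{settings['HOST']}:{settings['PORT']}/"


def write_env(framework_dir, settings=GATEWAY_ENV):
    """Write .env into framework_dir (caller supplies the copied framework path)."""
    if base_url(settings) != settings["ARC_BASE_URL"]:
        raise ValueError("ARC_BASE_URL disagrees with SCHEME/HOST/PORT")
    path = Path(framework_dir) / ".env"
    path.write_text(render_env(settings))
    return path
```

The consistency check catches the case this section warns about: a base URL that still points at the public endpoint while HOST and PORT name the sidecar.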
Trimmed agents/__init__.py pattern:
from typing import Type

from dotenv import load_dotenv

from .agent import Agent, Playback
from .swarm import Swarm
from .templates.random_agent import Random
from .templates.my_agent import MyAgent

load_dotenv()

AVAILABLE_AGENTS: dict[str, Type[Agent]] = {
    'random': Random,
    'myagent': MyAgent,
}
6. Submission artifacts
| Artifact | When | Producer | Notes |
|---|---|---|---|
| /tmp/my_agent.py | Both modes | Notebook cell 2 | Not in working output |
| submission.parquet | Validation | Notebook cell 4 | Dummy gate row |
| submission.parquet | Scoring rerun | Gateway | Real scores |
| server_recording/ | Scoring rerun | Gateway | Optional replay data |
Push/submit tooling: validation runs need only submission.parquet in kernel outputs. The agent file lives in /tmp.
7. ASRA shared tooling
ASRA centralizes gateway notebook generation in kaggle-notebooks/_shared/:
| Module | Role |
|---|---|
| gateway_notebook.py | Builds 4-cell .ipynb from template agent body |
| phase_registry.py | Phase 1–9 metadata (agent tag, kernel slug, paths) |
| extract_template_agent.py | Extract MyAgent subclass from legacy my_agent.py |
| push_and_submit.py | Kaggle API push + submit with output checks |
| submit.sh | Per-phase CLI wrapper |
Rebuild all phase notebooks:
cd kaggle-notebooks/_shared
./stage0_setup.sh
Per-phase submit:
cd kaggle-notebooks/phaseN
./submit.sh all "asra-vX.X-phaseN v3 official gateway pattern"
8. Migration checklist (Phases 1–9)
Apply before any score-focused iteration:
- Rebuild notebook via gateway_notebook.py (official 4-cell layout)
- Extract *_kaggle_template_agent.py (MyAgent, no bootstrap)
- Dummy gate parquet: row_id, game_id, end_of_game, score
- Agent at /tmp/my_agent.py
- Rerun branch: gateway health → copy → register → main.py --agent myagent
- Submit message includes v3 official gateway pattern for traceability
- Confirm status Succeeded (not Kaggle Error) before tuning agent logic
Verified submissions (ASRA):
| Phase | Ref | Status | Public score |
|---|---|---|---|
| 1 v3 | 53652655 | Succeeded | 0.03 |
| 2 v3 | 53660658 | Succeeded | 0.00 |
Phases 3–9: gateway notebooks have been rebuilt locally; competition submissions are pending because of the daily submission quota.
9. Common false fixes (do not rely on these alone)
These changes help local development but do not fix the scoring rerun if the gateway path is missing:
| Attempted fix | Why it fails alone |
|---|---|
| Move venv to /tmp | Scoring still skips gateway |
| Wheels-only install mirror | Same wrong execution path |
| Change dummy parquet column count | Gate may pass; rerun still errors |
| Add --self-test in notebook | Validation-only; not scoring contract |
| Direct run_swarm() in agent | Not how Kaggle scores |
10. Parallel: AI Agent Security competition
The same lesson applies across Kaggle agent competitions:
| Competition | Wrong assumption | Correct path |
|---|---|---|
| AI Agent Security | Write submission.csv manually | Start inference server; gateway writes CSV on rerun |
| ARC Prize 2026 | Run standalone agent + Swarm | Gateway sidecar + main.py --agent myagent |
Always read the official starter notebook and match its rerun branch.
11. References
- ARC-AGI-3 Kaggle Starter — canonical evaluation pattern
- ASRA kaggle-notebooks/_shared — shared builder and submit tooling
- ASRA Phase 1 kernel — reference implementation
- ASRA Integrated Architecture — stack context
12. One-line takeaway
Generic Kaggle Error on ARC-AGI-3 usually means the notebook never ran the official gateway evaluation path. Match the starter's validation/scoring split, register MyAgent with the sidecar, and use the official dummy parquet schema — then agent improvements show up in the public score.