AFK Builds with 100% Green PRs (Chunk Sidecars Inside the Agent Loop)

I had not touched the keyboard in twenty minutes. CI pipeline #1205 was green anyway. An agent had built another Snake game from our standard Loop Lab benchmark while I stayed out of the way. Before that, my local Review Gate failed three times, each time on a Chunk sidecar check rather than on my local lint-and-test run. All of that happened before a single push to GitHub.

If sidecars are new to you, a Chunk sidecar is a lightweight remote microVM that sits in your inner loop. Your workspace syncs there, Chunk runs a microbuild (lint, tests, build, and the repo policy hooks you configured), and you get CI-shaped feedback on Linux before your branch leaves the laptop.

CircleCI CTO Rob Zuber describes the goal as rebalancing inner and outer loop validation so agents are not learning about failures only after push.

Our three early sidecar failures were unrelated. First, a missing LEGAL_DISCLAIMER.md at the repo root (legal policy gate). Second, a missing SNAKE.md marker file in the experiment workspace (documentation policy gate). Third, a TypeScript compile error the Snake workspace's local tests never exercised but the sidecar's full pnpm build caught.

All three times, the lint-and-test run on my Mac said green, while the Chunk sidecar came back red and said "fix this stuff." CI Doctor landed one fix(ci-sidecar): commit per issue. Only then did my local Review Gate allow a push to GitHub.

That is a small lab story. We think it is a bigger CI story.

We humbly believe we are building toward the future of continuous integration in the agent era: validation woven through the whole loop, ending on a 100% green pull request you can merge with confidence.

Local lint and tests first. A Chunk sidecar microbuild that partially mirrors your pipeline while the agent still has context. Then CircleCI on the branch, doing what CI has always done, checking integration before a human approves merge.

AFK application builds that land as 100% green PRs are what we can demo today. The thesis underneath: CI validation that keeps up with the pace of your local coding agents.

We ran five Snake game builds to pressure-test that future. What follows is the lab report.

Hypothesis

If RalphCI runs Chunk sidecar validation (chunk validate --remote) after local lint:fix and test:run, then:

A coding agent can run mostly unattended through build and fix cycles (while a CI Doctor agent handles sidecar and pipeline failures locally).
Commits that pass the local Review Gate and push to GitHub show up as a 100% green PR, with every mirrored CI pipeline passing on the branch.
CircleCI remains the authority at merge time. Sidecars extend CI into the inner loop; they do not replace it.

Our February 10 study already showed CI-aware RalphCI can get to green pipelines on this Snake benchmark when the agent answers to CircleCI after push. This experiment adds pre-push CI parity via sidecars so every commit on the PR is green, not just the last one.

Setup

Benchmark: Five independent runs in ralph-ci/experiments/w_chunk-sidecars/iteration-{1..5}/. Same Snake spec as prior Loop Lab RalphCI work: 7 tasks, Vitest TDD, canvas game.

The stack (from top to bottom):

Layer	What runs
Build Agent	Codes against `tasks.json`
Review Gate (local)	`format:fix`, `lint:fix`, `test:run` (60s cap)
Review Gate (Chunk)	`chunk validate --remote` on Chunk sidecar (Linux microVM)
Smart push	Push only when the local Review Gate passes
CircleCI	Full `ci-workflow` on the branch after push
CI Doctor	Fixes sidecar or pipeline failures with full logs
Human in the loop	Approval gate before merge (config default)
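
The ordering in the stack table can be sketched as a short-circuiting gate. This is a minimal sketch under stated assumptions, not RalphCI's actual implementation: the `Step` and `GateResult` types, `runReviewGate`, `shouldPush`, and the injected `run` callback are hypothetical names. Only the command strings come from the table.

```typescript
// Hypothetical sketch of the Review Gate ordering; not RalphCI's real code.
type Layer = "local" | "chunk";

interface Step {
  layer: Layer;
  command: string;
}

// Command strings are copied from the stack table; the data shape is an assumption.
const REVIEW_GATE_STEPS: Step[] = [
  { layer: "local", command: "format:fix" },
  { layer: "local", command: "lint:fix" },
  { layer: "local", command: "test:run" },
  { layer: "chunk", command: "chunk validate --remote" },
];

interface GateResult {
  passed: boolean;
  failedStep?: Step;
  executed: string[];
}

// `run` is injected so the sketch stays runnable without spawning processes.
function runReviewGate(steps: Step[], run: (step: Step) => boolean): GateResult {
  const executed: string[] = [];
  for (const step of steps) {
    executed.push(step.command);
    if (!run(step)) {
      // Stop at the first failing step; later steps never run.
      return { passed: false, failedStep: step, executed };
    }
  }
  return { passed: true, executed };
}

// Smart push: only a fully passing gate allows a push.
function shouldPush(result: GateResult): boolean {
  return result.passed;
}

// Example: local steps pass, the sidecar step fails, so nothing is pushed.
const sidecarRed = runReviewGate(REVIEW_GATE_STEPS, (s) => s.layer === "local");
console.log(sidecarRed.failedStep?.command, shouldPush(sidecarRed));
```

Running the local steps first keeps the cheap checks ahead of the remote microbuild, which matches the order described in the table.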

Chunk mirrors CI: .chunk/config.json runs policy gates (--gate-legal, --gate-snake) plus remote pnpm lint, pnpm test:run, pnpm build, matching CircleCI jobs on the branch.
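
To make the two policy gates concrete, here is a minimal sketch of what checks like these might look like. It is based only on the failures described earlier (a missing LEGAL_DISCLAIMER.md at the repo root and a missing SNAKE.md in the experiment workspace). The function names, the `PolicyResult` shape, and the path-list input are hypothetical; this is not how `--gate-legal` or `--gate-snake` are actually implemented, and it does not reproduce the format of .chunk/config.json.

```typescript
// Hypothetical policy-gate sketch; the real gates' logic is not shown in this post.
interface PolicyResult {
  gate: string;
  passed: boolean;
  message: string;
}

// Checks that a required file appears in a list of repo-relative paths.
function requireFile(gate: string, requiredPath: string, trackedPaths: string[]): PolicyResult {
  const passed = trackedPaths.some((p) => p === requiredPath);
  return {
    gate,
    passed,
    message: passed ? `${requiredPath} found` : `${requiredPath} is missing`,
  };
}

// Assumption: the legal file lives at the repo root, the marker inside the workspace.
function runPolicyGates(workspacePath: string, trackedPaths: string[]): PolicyResult[] {
  return [
    requireFile("legal", "LEGAL_DISCLAIMER.md", trackedPaths),
    requireFile("snake", `${workspacePath}/SNAKE.md`, trackedPaths),
  ];
}

// Illustrative workspace path, following the iteration-{1..5} layout from the Setup section.
const workspacePath = "ralph-ci/experiments/w_chunk-sidecars/iteration-1";

// An early commit with neither file present fails both gates.
const earlyCommit = runPolicyGates(workspacePath, [`${workspacePath}/tasks.json`]);
for (const result of earlyCommit) {
  console.log(`${result.gate}: ${result.passed ? "pass" : "fail"} (${result.message})`);
}
```

Neither check depends on test results, which is why a green local test run says nothing about them.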

Sample size: 5 runs. Lab notebook scale.

Results

Five runs at a glance

Run	RalphCI iterations	Sidecar CI Doctor fixes	Tasks completed	Final branch outcome
1	11	3	7/7	Green PR (all checks)
2	13	4	7/7	Green PR (all checks)
3	11	3	7/7	Green PR (all checks)
4	11	3	7/7	Green PR (all checks)
5	11	3	7/7	Green PR (all checks)

Totals: 57 RalphCI iterations, 16 sidecar repair loops, 5/5 runs ended with all tasks complete and green CircleCI on the branch.
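
The totals follow directly from the table. The short check below recomputes them from rows transcribed by hand; the `RunRow` shape is an illustrative name and is not the schema of metrics.json.

```typescript
// Rows transcribed from the "Five runs at a glance" table.
interface RunRow {
  run: number;
  iterations: number;
  sidecarFixes: number;
  tasksCompleted: number;
}

const runs: RunRow[] = [
  { run: 1, iterations: 11, sidecarFixes: 3, tasksCompleted: 7 },
  { run: 2, iterations: 13, sidecarFixes: 4, tasksCompleted: 7 },
  { run: 3, iterations: 11, sidecarFixes: 3, tasksCompleted: 7 },
  { run: 4, iterations: 11, sidecarFixes: 3, tasksCompleted: 7 },
  { run: 5, iterations: 11, sidecarFixes: 3, tasksCompleted: 7 },
];

// 11 + 13 + 11 + 11 + 11 = 57
const totalIterations = runs.reduce((sum, r) => sum + r.iterations, 0);
// 3 + 4 + 3 + 3 + 3 = 16
const totalSidecarFixes = runs.reduce((sum, r) => sum + r.sidecarFixes, 0);
// Every run finished all 7 tasks in the spec.
const allTasksComplete = runs.every((r) => r.tasksCompleted === 7);

console.log({ totalIterations, totalSidecarFixes, allTasksComplete });
```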

Source: iteration-*/metrics.json and activity.md.

Local green is not PR green

In every run, the Review Gate fails in early iterations on the Chunk sidecar, not on the local lint-and-test run. In all five runs, the first three sidecar fixes were the same: restore LEGAL_DISCLAIMER.md (legal gate), add the SNAKE.md experiment marker file (documentation gate), then fix a tsc error in run-ci.ts surfaced by the monorepo build step.

Same pattern across runs: failures local tests never see. The CI Doctor agent fixes them without a human in the loop. Only then does the local Review Gate allow a push to GitHub.
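
The pattern can be sketched as a bounded repair loop: validate, fix one failure, validate again, and push only when the sidecar is green. This is a simplified sketch, not the actual orchestration code. The real CI Doctor is an agent working from full logs; here it is replaced by a stub that clears one simulated failure per call. `repairUntilGreen`, `LoopOutcome`, and the `maxFixes` cap are hypothetical names, and the commit message text after the fix(ci-sidecar): prefix is illustrative.

```typescript
// Hypothetical sketch of the sidecar repair loop; the real CI Doctor is an agent, not a function.
interface LoopOutcome {
  pushed: boolean;
  fixCommits: string[];
}

function repairUntilGreen(
  validate: () => string | null, // a failure description, or null when the sidecar is green
  fix: (failure: string) => void, // stand-in for the CI Doctor agent
  maxFixes: number, // assumed safety cap so the loop always terminates
): LoopOutcome {
  const fixCommits: string[] = [];
  let failure = validate();
  while (failure !== null) {
    if (fixCommits.length >= maxFixes) {
      // Give up and leave the branch unpushed.
      return { pushed: false, fixCommits };
    }
    fix(failure);
    fixCommits.push(`fix(ci-sidecar): ${failure}`);
    failure = validate();
  }
  return { pushed: true, fixCommits };
}

// Simulated failures, in the order reported for the five runs.
const pending: string[] = [
  "restore LEGAL_DISCLAIMER.md",
  "add SNAKE.md marker file",
  "fix tsc error in run-ci.ts",
];

const outcome = repairUntilGreen(
  () => pending[0] ?? null,
  () => {
    pending.shift();
  },
  5,
);
console.log(outcome.pushed, outcome.fixCommits.length);
```

With three simulated failures the loop records three fix commits before allowing the push, one commit per issue.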

Layer	Typical signal on early commits
Local `pnpm test:run`	Pass
Chunk sidecar microbuild	Fail (fix CI-shaped feedback until sidecar passes)
CircleCI on the branch	Green once sidecar gate passes

After the sidecar gate cleared, every push stayed green on CircleCI. No surprise reds on mirrored work.

Compared to our February baseline

On the same Snake benchmark without sidecars in the Review Gate, agents often finished locally while the branch still failed CircleCI.

With sidecars before push plus CI on the branch, five for five ended as 100% green PRs in our runs: build, lint, policy gates, and test jobs all passing on the PR branch after push to GitHub.

The continuous integration breakthrough is three layers: local, sidecar, pipeline. AFK works when failure shows up in layer two, so layer three confirms instead of contradicts.

Takeaway

Rob Zuber framed inner-loop validation for agents. RalphCI plus Chunk sidecars are one way to wire that into an orchestration loop so AFK does not mean "ship and pray."

What held across five Snake game builds:

Sidecars catch CI-shaped failures while the agent still has context.
A 100% green PR needs three layers: local, sidecar, pipeline. Skip one, get surprised downstream.
Humans review green branches only. In these runs they never saw red, not even on a single commit.

Green CI is still priceless. Chunk sidecars are how you earn a 100% green PR before you even ask a human to look, and a green PR instills more confidence in the agent's work than a red one.

The artifact: a 100% green PR ready for review

This is the outcome we care about. A pull request an agent built while I was AFK, with every last CircleCI check green on the branch after push to GitHub.

Screenshot: Classic Snake game pull request built by RalphCI with Chunk sidecars. All checks passed, merge ready.

Twelve commits on the branch. Task work, three fix(ci-sidecar): repairs, finalize, metrics. At the bottom of the page: All checks have passed. Five green jobs:

ci/circleci: build
ci/circleci: lint
ci/circleci: require-legal-disclaimer-md
ci/circleci: require-snake-md
ci/circleci: test
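
"100% green" can be stated as a predicate over those five checks. The sketch below is illustrative only: the job names come from the list above, while `CheckState`, `REQUIRED_CHECKS`, and `isFullyGreen` are hypothetical names, and nothing here calls the GitHub or CircleCI APIs.

```typescript
// Hypothetical sketch: a PR is "100% green" only if every required check succeeded.
type CheckState = "success" | "failure" | "pending";

// Job names copied from the list of checks on the pull request.
const REQUIRED_CHECKS: string[] = [
  "ci/circleci: build",
  "ci/circleci: lint",
  "ci/circleci: require-legal-disclaimer-md",
  "ci/circleci: require-snake-md",
  "ci/circleci: test",
];

// A missing check counts as not green, the same as a failed or pending one.
function isFullyGreen(statuses: Record<string, CheckState>): boolean {
  return REQUIRED_CHECKS.every((checkName) => statuses[checkName] === "success");
}

const allGreen: Record<string, CheckState> = {};
for (const checkName of REQUIRED_CHECKS) {
  allGreen[checkName] = "success";
}

const oneRed: Record<string, CheckState> = { ...allGreen, "ci/circleci: test": "failure" };

console.log(isFullyGreen(allGreen), isFullyGreen(oneRed));
```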

No conflicts with base. Merge pull request is the honest end state.

That screenshot is what we mean by AFK builds and 100% green PRs: every CircleCI check green before a human opens the branch, because validation ran local, then on a Chunk sidecar, then on the pipeline, and the agent fixed failures in between.

Five Snake game builds are a small sample, but the workflow held in every one. In future runs, we may expand the repo complexity and scope, and the review gates along with it, to see how far we can push this thesis.

But we will keep the overall stack, because we humbly think that this is where continuous integration is going.

Related experiments

The Sidecar Race: 22 Seconds vs 69 Seconds Inside the Agent Loop

What Snake Games Have Taught Us About Shipping with AI Agents

Hardening RalphCI loops for open source after the February 2026 study