
Shipwright field note

Make the decision before the agent writes code.

Shipwright is a private product loop and a public evidence page. It asks what should be built next, compares a few possible directions, records the strongest case, and only then turns the decision into a bounded code change.

The screenshots here come from the live Shipwright page. The runtime itself lives in a private GitHub repo, but the product trail is public enough to inspect.

[Image: Shipwright live page showing selected direction, active lane, judge signal, and deploy status.]
The live page starts with the decision state: selected direction, active lane, judge signal, and deployment evidence.

Why it changed

The old product loop was backwards for autonomous work.

A common agent workflow still looks like this: ship a change, open the preview, then ask product, QA, and design to react to something that already exists. That can work for tiny fixes. It breaks down when the agent is choosing product direction.

Shipwright moves that alignment earlier. Before code is pushed, it frames the product question, creates competing options, compares them with evidence, and turns the chosen option into one clear work order.

Process shift

Prototype first. Reach consensus. Then build.

Old rhythm

Ship the first plausible implementation, then spend product, QA, and design time aligning around what changed.

Shipwright rhythm

Generate multiple directions, compare them in public, choose the strongest one, then let the agent implement a smaller ask.

[Image: Shipwright UI consensus panel showing Variant A as accepted with score, judges, and experiment subjects.]
Consensus is visible before implementation: what won, why it won, how strong the margin was, and what should be handed to the agent.

[Image: Shipwright lane board showing intake, build, review, release, and observe lanes with active and blocked states.]
The lane board keeps execution from feeling open-ended. One lane owns the next move; other lanes are eligible, blocked, or off.
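
The "one lane owns the next move" rule can be sketched as a small invariant check. This is only an illustration, not Shipwright's actual runtime, which is private: the lane names come from the lane board screenshot and the four states from the caption, while `LaneState`, `LANES`, and `active_lane` are hypothetical names invented here.

```python
from enum import Enum


class LaneState(Enum):
    ACTIVE = "active"      # this lane owns the next move
    ELIGIBLE = "eligible"  # could take over, but is not the owner
    BLOCKED = "blocked"    # waiting on something else
    OFF = "off"            # not participating in this run


# Lane names as they appear on the lane board screenshot.
LANES = ("intake", "build", "review", "release", "observe")


def active_lane(board: dict[str, LaneState]) -> str:
    """Return the single lane that owns the next move, or raise if the rule is broken."""
    unknown = set(board) - set(LANES)
    if unknown:
        raise ValueError(f"unknown lanes: {sorted(unknown)}")
    active = [name for name, state in board.items() if state is LaneState.ACTIVE]
    if len(active) != 1:
        raise ValueError(f"expected exactly one active lane, found {len(active)}")
    return active[0]


# Hypothetical board state, for illustration only.
board = {
    "intake": LaneState.OFF,
    "build": LaneState.ACTIVE,
    "review": LaneState.ELIGIBLE,
    "release": LaneState.BLOCKED,
    "observe": LaneState.ELIGIBLE,
}
owner = active_lane(board)
```

Rejecting a board with zero or several active lanes is what keeps execution bounded: there is never a question about which lane moves next.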
How the decision happens

The useful part is the pause before the commit.

01 Frame

Name the product question before asking an agent to edit code.

02 Prototype

Build more than one direction so the team can compare shape, not theory.

03 Judge

Use model votes, human judgment, screenshots, and readiness checks.

04 Agree

Let product, QA, and design see the same evidence before the work lands.

05 Build

Hand the winner to the agent as one bounded implementation target.
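
The five steps above can be sketched as a single decision record. This is a minimal sketch under stated assumptions, not Shipwright's implementation: the class names, the numeric per-judge scores, the mean-score ranking, and the role-based sign-off are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Option:
    """One candidate direction (step 02) with the scores judges gave it (step 03)."""
    name: str
    scores: list[float] = field(default_factory=list)

    @property
    def mean_score(self) -> float:
        # An unjudged option scores 0.0 rather than raising.
        return sum(self.scores) / len(self.scores) if self.scores else 0.0


@dataclass
class Decision:
    question: str                                     # step 01: frame
    options: list[Option]                             # step 02: prototype
    approvals: set[str] = field(default_factory=set)  # step 04: agree

    def ranked(self) -> list[Option]:
        """Step 03: order options from strongest to weakest mean score."""
        return sorted(self.options, key=lambda o: o.mean_score, reverse=True)

    def margin(self) -> float:
        """Gap between the top two options; needs at least two to compare."""
        if len(self.options) < 2:
            raise ValueError("compare at least two options before deciding")
        first, second = self.ranked()[:2]
        return first.mean_score - second.mean_score

    def work_order(self, required: set[str]) -> dict:
        """Step 05: emit one bounded target, only after every required role signs off."""
        missing = required - self.approvals
        if missing:
            raise ValueError(f"missing sign-off from: {sorted(missing)}")
        winner, *rejected = self.ranked()
        return {
            "question": self.question,
            "build": winner.name,
            "margin": self.margin(),
            "rejected": [o.name for o in rejected],  # losing options stay visible
        }


# Hypothetical example data, for illustration only.
decision = Decision(
    question="Which onboarding flow should ship next?",
    options=[Option("Variant A", [8.0, 9.0]), Option("Variant B", [6.0, 7.0])],
)
decision.approvals.update({"product", "qa", "design"})
order = decision.work_order({"product", "qa", "design"})
```

The work order cannot be produced until every required role has approved, which is the code-level version of the pause before the commit.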

[Image: Shipwright product model showing observe, compare, route, ship, and report stages with a decision evidence layer.]
The product model is simple on purpose: observe, compare, route, ship, report. The key change is that compare happens before route.

Why it relates

This is the same loop showing up across the other work.

Pixelbox

Pixelbox makes the work visible while using Codex from chat. Shipwright adds a decision surface for what should be attempted before the edit starts.

Forge

Forge is about durable autonomous engineering infrastructure. Shipwright is the product layer that decides which small, evidenced move is worth giving to that infrastructure.

Triline

Triline puts an AI into a live conversation. Shipwright does the same kind of mediation for product work: humans and models share the same context before action.

Product, QA, and design

The goal is not to replace review. It is to move review earlier, when product intent, design tradeoffs, and QA risk can still change what gets built.

The bigger bet

Autonomous shipping gets better when the decision is a shared artifact.

Make the question explicit before generating code. Show multiple candidate directions before asking for consensus. Keep the losing option visible so dissent is not erased. Turn the winner into a smaller, safer implementation target. Publish enough evidence that the next run can learn from it.