Shipwright is a private product loop and a public evidence page. It
asks what should be built next, compares a few possible directions,
records the strongest case, and only then turns the decision into a
bounded code change.
The screenshots here come from the live Shipwright page. The runtime
itself lives in a private GitHub repo, but the product trail is
public enough to inspect.
The live page starts with the decision state: selected direction,
active lane, judge signal, and deployment evidence.
Why it changed
The old product loop was backwards for autonomous work.
A common agent workflow still looks like this: ship a change,
open the preview, then ask product, QA, and design to react to
something that already exists. That can work for tiny fixes. It
breaks down when the agent is choosing product direction.
Shipwright moves that alignment earlier. Before code is pushed,
it frames the product question, creates competing options,
compares them with evidence, and turns the chosen option into one
clear work order.
Process shift
Prototype first. Reach consensus. Then build.
Old rhythm
Ship the first plausible implementation, then spend product,
QA, and design time aligning around what changed.
Shipwright rhythm
Generate multiple directions, compare them in public, choose
the strongest one, then let the agent implement a smaller ask.
Consensus is visible before implementation: what won, why it won,
how strong the margin was, and what should be handed to the agent.
The lane board keeps execution from feeling open-ended. One lane
owns the next move; other lanes are eligible, blocked, or off.
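As a rough illustration of the lane board, the rule that exactly one lane owns the next move can be written as a small check. This is a minimal sketch, not Shipwright's actual runtime: the four states come from the text above, while the `Lane` type, the `active_lane` helper, and the lane names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class LaneState(Enum):
    ACTIVE = "active"      # owns the next move
    ELIGIBLE = "eligible"  # could own a later move
    BLOCKED = "blocked"    # waiting on something else
    OFF = "off"            # not in play


@dataclass
class Lane:
    name: str  # hypothetical lane names, for illustration only
    state: LaneState


def active_lane(lanes: list[Lane]) -> Lane:
    """Return the single lane that owns the next move.

    Raises ValueError unless exactly one lane is ACTIVE.
    """
    owners = [lane for lane in lanes if lane.state is LaneState.ACTIVE]
    if len(owners) != 1:
        raise ValueError(f"expected exactly one active lane, found {len(owners)}")
    return owners[0]


board = [
    Lane("onboarding", LaneState.ACTIVE),
    Lane("billing", LaneState.ELIGIBLE),
    Lane("search", LaneState.BLOCKED),
    Lane("experiments", LaneState.OFF),
]
print(active_lane(board).name)
```

Rejecting a board with zero or several active lanes is what keeps execution bounded: the agent never has to guess which lane it is working in.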
How the decision happens
The useful part is the pause before the commit.
01
Frame
Name the product question before asking an agent to edit code.
02
Prototype
Build more than one direction so the team can compare shape, not theory.
03
Judge
Use model votes, human judgment, screenshots, and readiness checks.
04
Agree
Let product, QA, and design see the same evidence before the work lands.
05
Build
Hand the winner to the agent as one bounded implementation target.
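The Judge and Build steps can be sketched as a vote tally that produces a decision record and then a single work order. This is a simplified sketch under stated assumptions: it reduces every signal (model votes, human judgment, readiness checks) to one vote per judge, and the `Decision` type, the `judge` and `work_order` functions, the tie rule, and the option names are all hypothetical rather than Shipwright's real interface.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Decision:
    question: str             # the framed product question
    winner: str               # option with the most votes
    runner_up: Optional[str]  # losing option stays on the record
    margin: float             # winner's vote share minus the runner-up's
    tally: dict[str, int]     # full vote count per option


def judge(question: str, votes: list[str]) -> Decision:
    """Tally one vote per judge and record what won and by how much."""
    if not votes:
        raise ValueError("cannot judge without votes")
    ranked = Counter(votes).most_common()
    winner, top = ranked[0]
    runner_up, second = ranked[1] if len(ranked) > 1 else (None, 0)
    margin = (top - second) / len(votes)
    return Decision(question, winner, runner_up, margin, dict(ranked))


def work_order(decision: Decision) -> str:
    """Turn a decided question into one bounded implementation target."""
    # Assumed rule: a tie means there is no consensus yet, so nothing is built.
    if decision.margin <= 0:
        raise ValueError("no clear winner; gather more evidence before building")
    return f"Implement '{decision.winner}' to answer: {decision.question}"


votes = ["guided-setup", "empty-state", "guided-setup", "guided-setup", "empty-state"]
decision = judge("What should a new user see first?", votes)
print(decision.winner, decision.tally)
print(work_order(decision))
```

Keeping the runner-up and the full tally on the record means the losing option stays visible, and refusing to emit a work order on a tie keeps the build step from starting before there is a decision to hand over.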
The product model is simple on purpose: observe, compare, route,
ship, report. The key change is that compare happens before route.
Why it relates
The same loop shows up across the other projects.
Pixelbox
Pixelbox makes the work visible when you use Codex from chat.
Shipwright adds a decision surface for what should be attempted
before the edit starts.
Forge
Forge is about durable autonomous engineering infrastructure.
Shipwright is the product layer that decides which small,
evidenced move is worth giving to that infrastructure.
Triline
Triline puts an AI into a live conversation. Shipwright does the
same kind of mediation for product work: humans and models share
the same context before action.
Product, QA, and design
The goal is not to replace review. It is to move review earlier,
when product intent, design tradeoffs, and QA risk can still
change what gets built.
The bigger bet
Autonomous shipping gets better when the decision is a shared artifact.
Make the question explicit before generating code.
Show multiple candidate directions before asking for consensus.
Keep the losing option visible so dissent is not erased.
Turn the winner into a smaller, safer implementation target.
Publish enough evidence that the next run can learn from it.