v0.0 · harness engineering platform

Compose. Measure. Swap.

Forgeplane is the source-of-truth coordination layer for AI agent systems. Bring your own model gateway, durable runtime, memory store, and inference fleet. The platform underneath measures whether the composition still works after you swap any one of them out, and tells you what each turn cost.

Request access

Spine: Kernel + Rigor
Conversation: Rig, cost-aware multi-model
Telemetry: OpenTelemetry, every turn, every Fitting
Posture: Bring your own everything

Thesis

The platform that refuses the bet.

The labs lap themselves every ninety days. Pin yourself to one and you rebuild the platform every quarter. Three claims, in order:

01
Swappable by design

Every Fitting plugs into the Kernel through a typed contract. Memory backend, durable runtime, model provider, inference fleet, isolation envelope, policy engine: replace any of them with your own implementation and the platform stays.
02
Conformance is measured, not asserted

Rigor is an executable benchmarking suite that scores agent behavior across Fitting swaps. Replace your memory backend, your inference fleet, your durable runtime: Rigor tells you exactly how the swap moves your benchmarks, in numbers, before you ship.
03
Cost discipline is a platform default

Rig routes by task shape: cheap models for grep work, frontier models for review and planning, workhorse models for the 90% middle. Every turn reports what it cost. Every Fitting reports its spend.
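Routing by task shape can be pictured as a small lookup from shape to model tier, with the turn's cost derived from that tier's price. This is a minimal sketch, not Rig's actual API: `TaskShape`, `ModelTier`, `ROUTES`, and the prices are all hypothetical names and illustrative numbers.

```python
from dataclasses import dataclass
from enum import Enum


class TaskShape(Enum):
    SEARCH = "search"    # grep-style lookups
    ROUTINE = "routine"  # the bulk middle of the work
    REVIEW = "review"    # review and planning


@dataclass(frozen=True)
class ModelTier:
    name: str                 # hypothetical tier label, not a real model id
    usd_per_1k_tokens: float  # illustrative price, not a quoted rate


# Hypothetical routing table: one tier per task shape.
ROUTES = {
    TaskShape.SEARCH: ModelTier("cheap", 0.0005),
    TaskShape.ROUTINE: ModelTier("workhorse", 0.003),
    TaskShape.REVIEW: ModelTier("frontier", 0.015),
}


def route(shape: TaskShape) -> ModelTier:
    """Pick the tier assigned to this task shape."""
    return ROUTES[shape]


def turn_cost(shape: TaskShape, tokens: int) -> float:
    """Cost of one turn in USD at the routed tier's price."""
    tier = route(shape)
    return tokens / 1000 * tier.usd_per_1k_tokens
```

Keeping the table as data rather than branching logic means a price change or a new tier is a one-line edit, which suits a platform where the model lineup is expected to change.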

Lineup

The platform, then the Fittings.

Core is always Forgeplane. Reference Fittings are the canonical implementations, each swappable for any implementation that satisfies the same contract.
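To make "satisfies the same contract" concrete, here is one way a typed contract could look in Python, using a structural `Protocol`. It is a sketch under assumptions: `MemoryFitting`, `DictMemory`, and `install` are hypothetical names invented for illustration, not Forgeplane's published interface.

```python
from typing import Dict, Optional, Protocol, runtime_checkable


@runtime_checkable
class MemoryFitting(Protocol):
    """Hypothetical contract for a memory backend."""

    def remember(self, key: str, value: str) -> None: ...

    def recall(self, key: str) -> Optional[str]: ...


class DictMemory:
    """Stand-in implementation backed by a plain dict."""

    def __init__(self) -> None:
        self._items: Dict[str, str] = {}

    def remember(self, key: str, value: str) -> None:
        self._items[key] = value

    def recall(self, key: str) -> Optional[str]:
        return self._items.get(key)


def install(fitting: MemoryFitting) -> MemoryFitting:
    """Accept any object that structurally matches the contract."""
    # runtime_checkable only verifies that the methods exist,
    # not their signatures; a static type checker covers the rest.
    if not isinstance(fitting, MemoryFitting):
        raise TypeError("object does not satisfy the MemoryFitting contract")
    return fitting
```

`DictMemory` never inherits from `MemoryFitting`; it qualifies by shape alone, which is the property that lets a third-party implementation stand in for a reference one.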

Core

Core · always

Kernel

role · runtime spine

Supervises Fittings, dispatches typed contract calls, propagates identity, persists the durable session event log.

Core · always

Rigor

role · conformance + benchmarking

Executable benchmarking suite that scores agent behavior across Fitting swaps. The measurement that makes pluggable architecture safe.
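Scoring a swap comes down to running the same cases against both implementations and reporting the difference. The sketch below shows that idea in its simplest form; `pass_count` and `compare` are hypothetical helpers, and real Rigor suites presumably score far richer behavior than exact-match answers.

```python
from typing import Callable, Dict, Sequence, Tuple

Case = Tuple[str, str]  # (query, expected answer)
Backend = Callable[[str], str]


def pass_count(backend: Backend, cases: Sequence[Case]) -> int:
    """Number of cases where the backend returns the expected answer."""
    return sum(1 for query, expected in cases if backend(query) == expected)


def compare(baseline: Backend, candidate: Backend,
            cases: Sequence[Case]) -> Dict[str, int]:
    """Run the same cases against both backends and report the delta."""
    base = pass_count(baseline, cases)
    cand = pass_count(candidate, cases)
    return {
        "total": len(cases),
        "baseline": base,
        "candidate": cand,
        "delta": cand - base,  # negative means the swap regressed
    }
```

A negative `delta` is the signal to look at before shipping: the candidate passes fewer cases than the implementation it would replace.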

Reference Fittings

Reference · swappable

Rig

role · conversation

The agent harness. Runs interactive turns; consumed as a library by other Fittings.

Reference · swappable

Factory

role · workflows

The software factory. Plan, chunk, review, execute. Temporal-backed.

Reference · swappable

Heartwood

role · memory

Curated memory and knowledge. Durable inner wood, durable knowledge.

Reference · swappable

Ember

role · local inference

Local inference serving. Warm-loaded models that retain heat.

Reference · swappable

Sandbox

role · isolation

A family: Process, Container, Hardened, Remote. Operator picks.

Reference · swappable

Vise

role · policy

Authorization gate. Basic allowlist, primed for OPA / Cedar swaps.

Reference · swappable

Smith

role · surface

The canonical TUI and CLI. Web, IDE, Slack adapters join later.

Reference · swappable

Dead Drop

role · egress

Managed external network access. Critical for air-gapped deployments.

Trace

Every turn reports what it cost.

Token spend, latency, fallbacks, and conformance scores travel through the same OpenTelemetry shape the rest of your stack already uses.

> compare heartwood-v3 with redis-vec

opus-4.7 delegated:

└─ ◆ rigor · memory.parity · 128 cases → 122 pass · 14.2s · $0.31

└─ ◆ glm-5.1 · prose-read · 12.1k → 800 tokens · 4.8s · $0.06

Heartwood passes 122/128; redis-vec passes 119/128. Recall-at-3 regresses 4.1pts on memory.parity.recency. Cost parity within 8%. Heartwood wins on this suite. Run rigor memory.parity --json for the per-case diff.

opus-4.7 · 2 delegations · 19.0s · $0.37 · stop
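The footer line is a roll-up of the delegation lines above it. A minimal sketch of that arithmetic, using the figures from this transcript: `Delegation` and `roll_up` are hypothetical names, and whether the orchestrating model's own tokens are billed separately is not stated here, so the sketch sums delegations only.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Delegation:
    label: str
    seconds: Decimal
    usd: Decimal


def roll_up(delegations: Sequence[Delegation]) -> Tuple[int, Decimal, Decimal]:
    """Return (count, total seconds, total USD) for one turn."""
    total_seconds = sum((d.seconds for d in delegations), Decimal("0"))
    total_usd = sum((d.usd for d in delegations), Decimal("0"))
    return len(delegations), total_seconds, total_usd


# Figures copied from the two delegation lines in the transcript.
delegations = [
    Delegation("rigor · memory.parity", Decimal("14.2"), Decimal("0.31")),
    Delegation("glm-5.1 · prose-read", Decimal("4.8"), Decimal("0.06")),
]

count, seconds, usd = roll_up(delegations)
print(f"{count} delegations · {seconds}s · ${usd}")
# prints: 2 delegations · 19.0s · $0.37
```

`Decimal` is used instead of `float` so that cent-level totals add up exactly.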
