v0.0 · harness engineering platform
Compose. Measure. Swap.
Forgeplane is the source-of-truth coordination layer for AI agent systems. Bring your model gateway, durable runtime, memory store, inference fleet. The platform underneath measures whether the composition still works after you swap any one of them out, and tells you what each turn cost.
Thesis
The platform that refuses the bet.
The labs lap themselves every ninety days. Pin yourself to one and you rebuild the platform every quarter. Three claims, in order:
- 01
Swappable by design
Every Fitting plugs into the Kernel through a typed contract. Memory backend, durable runtime, model provider, inference fleet, isolation envelope, policy engine: replace any of them with your own implementation, the platform stays.
- 02
Conformance is measured, not asserted
Rigor is an executable benchmarking suite that scores agent behavior across Fitting swaps. Replace your memory backend, your inference fleet, your durable runtime: Rigor tells you exactly how the swap moves your benchmarks, in numbers, before you ship.
- 03
Cost discipline is a platform default
Rig routes by task shape: cheap models for grep work, frontier models for review and planning, workhorse models for the 90% middle. Every turn reports what it cost. Every Fitting reports its spend.
Lineup
The platform, then the Fittings.
Core is always Forgeplane. Reference Fittings are canonical, swappable for any implementation that satisfies the same contract.
Supervises Fittings, dispatches typed contract calls, propagates identity, persists the durable session event log.
Executable benchmarking suite that scores agent behavior across Fitting swaps. The measurement that makes pluggable architecture safe.
The agent harness. Runs interactive turns; consumed as a library by other Fittings.
The software factory. Plan, chunk, review, execute. Temporal-backed.
Curated memory and knowledge. Durable inner wood, durable knowledge.
Local inference serving. Warm-loaded models that retain heat.
A family: Process, Container, Hardened, Remote. Operator picks.
Authorization gate. Basic allowlist, primed for OPA / Cedar swaps.
The canonical TUI and CLI. Web, IDE, Slack adapters join later.
Managed external network access. Air-gap critical.
Trace
Every turn reports what it cost.
Token spend, latency, fallbacks, and conformance scores travel through the same OpenTelemetry shape the rest of your stack already uses.