Self-contained domain-expert AI

v0.1

MaxSavant

An expert in your most complex enterprise system — in a box. It does the real work, and it earned that expertise by drilling against a safe fork of your system until every answer was provably right.

PRODUCT INTRODUCTION · 2026

Not a database. Not a proofing tool.

01 — The Problem

Frontier models are brilliant in general — and shallow on your specific system.

Point a general model at a complex ERP or database platform and it writes plausible-but-wrong output. Teams fix this by hand — feeding docs and examples, wiring tools, then babysitting a trial-and-error loop until the model finally "gets it." Bespoke, slow, and locked in one engineer's head.

1000s

of opaque APIs the model has never reliably seen

Non-obvious

data models with gotchas no doc spells out

Strict

correctness rules where "close" silently breaks production

MAXSAVANT 02 / 15

02 — Why it's unsolved

Sovereignty makes it worse.

To keep data in-region, private deployments run smaller open models — which are even shallower on the domain. The harder the constraint, the worse general models perform.

The gap

There is no automated way to take a general — ideally sovereign — model and make it deeply, verifiably competent at one specialized system, and keep that competence persistent and improving.

03 — What MaxSavant is

A self-contained AI that is a deep expert in your stack — and earned it by drilling against your real system.

It is

+ A domain-expert agent you deploy and use: it does real work on your system.

+ Intelligence in a box: it holds the docs, data model, APIs and gotchas, and it acts.

+ Self-improving: it gets sharper the more it's used, on its own.

It is not

– A QA / testing / "proofing" tool that grades someone else's AI.

– A verification service. The verifier is inside it: the engine, not the product.

– A database. It runs on Postgres, but the DB is plumbing, not the point.

04 — How it works

The self-learning loop

v1 changes no model weights — pure scaffolding. The engine that makes it expert.

01

Attempt

Agent retrieves docs, schema & skills, plans an action, acts via MCP tools.

→

02

Execute

Runs against an isolated copy-on-write fork of the dev instance.

→

03 · CORE

Verify

The verifier scores pass / fail by running against the fork of the real instance. Auditable truth.

→

04

Iterate

On fail, structured feedback — errors, diffs, failing checks — drives the retry.

→

05

Distill skill

Every success becomes a reusable skill in persistent memory.

↻ SLEEP-TIME An offline job re-reads accumulated episodes, merges & prunes skills, and re-optimizes prompts — the system consolidates while idle.
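The five steps above can be sketched as a small control loop. This is a minimal illustration under stated assumptions, not MaxSavant's implementation: `run_loop`, `Verdict`, the callable signatures, the plain dict standing in for persistent skill memory, and the attempt budget of 4 are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Verdict:
    passed: bool
    feedback: str = ""  # errors, diffs, failing checks (hypothetical shape)


def run_loop(
    task: str,
    attempt: Callable[[str, str], str],     # (task, feedback) -> action
    execute: Callable[[str], str],          # action -> result, run on the fork
    verify: Callable[[str, str], Verdict],  # (task, result) -> verdict
    skills: dict[str, str],                 # stand-in for persistent skill memory
    max_attempts: int = 4,                  # assumed retry budget
) -> Optional[str]:
    """Attempt, execute, verify, iterate; distill a skill on the first pass."""
    feedback = ""
    for _ in range(max_attempts):
        action = attempt(task, feedback)    # 01 Attempt
        result = execute(action)            # 02 Execute
        verdict = verify(task, result)      # 03 Verify
        if verdict.passed:
            skills[task] = action           # 05 Distill skill
            return action
        feedback = verdict.feedback         # 04 Iterate with structured feedback
    return None                             # budget exhausted, nothing distilled


# Toy run: the first draft fails, the retry uses the feedback and passes.
skills: dict[str, str] = {}
winner = run_loop(
    task="add region dimension",
    attempt=lambda task, fb: "migration-v2" if fb else "migration-v1",
    execute=lambda action: f"applied {action}",
    verify=lambda task, result: Verdict(result.endswith("v2"), "constraint failed"),
    skills=skills,
)
print(winner, skills)
```

Only a verified pass writes to `skills`, which mirrors the rule that every stored skill has verifier evidence behind it.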

05 — The moat

The verifier is what makes competence provable — and hard to copy.

It requires a real, forkable environment plus curated oracles per domain. That is the central cost — and the barrier.

PRIMARY

Deterministic

Runs the output against the real instance. Does the migration apply? Do the rows reconcile? Do tests pass, types check, invariants hold? Binary, auditable, cheap to trust.

SECONDARY

LLM-judge

Scores semantic quality where determinism can't — readability, idiomatic style, security smells. Used only to rank among passers, never to override a hard fail.

Verifier results are first-class data — the evidence behind every skill, and the metric behind every benchmark.
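The split between the two verifier tiers can be shown in a few lines: the deterministic result gates, and the judge score only orders candidates that already passed. A minimal sketch, assuming a hypothetical `Candidate` record and a `judge` callable that returns a float; neither name comes from the product.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Candidate:
    name: str
    passed_checks: bool  # outcome of the deterministic verifier
    output: str


def pick_best(
    candidates: list[Candidate],
    judge: Callable[[Candidate], float],  # hypothetical LLM-judge score, higher is better
) -> Optional[Candidate]:
    """Rank only among passers; a hard fail is never rescued by the judge."""
    passers = [c for c in candidates if c.passed_checks]
    if not passers:
        return None
    return max(passers, key=judge)


candidates = [
    Candidate("elegant-but-broken", passed_checks=False, output="..."),
    Candidate("plain", passed_checks=True, output="..."),
    Candidate("idiomatic", passed_checks=True, output="..."),
]
scores = {"elegant-but-broken": 0.99, "plain": 0.40, "idiomatic": 0.80}
best = pick_best(candidates, judge=lambda c: scores[c.name])
print(best.name if best else None)
```

The failing candidate has the highest judge score here and is still excluded, which is the point of keeping the judge secondary.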

06 — Worked example · database vertical

TASK "Add a region dimension to the revenue rollup and backfill it."

RETRIEVE

Pulls the relevant schema and a known-good migration example.

DRAFT & RUN

Drafts the migration via MCP tools, runs it on a forked dev DB.

VERIFY → FAIL

A constraint fails. Structured feedback returns the exact error.

2ND PASS

Fixes the constraint; migration applies, rollup totals reconcile.

SKILL SAVED

"Safe schema-migration + reconciliation" — ready for next time.

retrieve→ run→ fail→ fix→ verified + remembered
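The "rollup totals reconcile" check in this example is the kind of deterministic test a verifier can run. The sketch below uses Python's built-in `sqlite3` with an in-memory database as a stand-in for the forked Postgres instance. The table `fact_revenue` is named in the deck; `revenue_rollup`, the column names, and storing amounts as integer cents are assumptions made for illustration.

```python
import sqlite3


def rollup_delta(conn: sqlite3.Connection) -> int:
    """Return fact total minus rollup total, in cents; 0 means they reconcile."""
    fact = conn.execute(
        "SELECT COALESCE(SUM(amount_cents), 0) FROM fact_revenue"
    ).fetchone()[0]
    rollup = conn.execute(
        "SELECT COALESCE(SUM(total_cents), 0) FROM revenue_rollup"
    ).fetchone()[0]
    return fact - rollup


def rebuild_rollup(conn: sqlite3.Connection) -> None:
    """Recompute the per-region rollup from the fact table."""
    conn.execute("DELETE FROM revenue_rollup")
    conn.execute(
        "INSERT INTO revenue_rollup (region, total_cents) "
        "SELECT region, SUM(amount_cents) FROM fact_revenue GROUP BY region"
    )


conn = sqlite3.connect(":memory:")  # stand-in for a forked dev database
conn.executescript(
    """
    CREATE TABLE fact_revenue (
        id INTEGER PRIMARY KEY,
        region TEXT NOT NULL,
        amount_cents INTEGER NOT NULL
    );
    CREATE TABLE revenue_rollup (
        region TEXT PRIMARY KEY,
        total_cents INTEGER NOT NULL
    );
    INSERT INTO fact_revenue (region, amount_cents)
    VALUES ('EMEA', 10000), ('EMEA', 5000), ('APAC', 2500);
    """
)
rebuild_rollup(conn)
delta_after_rebuild = rollup_delta(conn)

# A new fact row without a rollup rebuild makes the check fail.
conn.execute("INSERT INTO fact_revenue (region, amount_cents) VALUES ('EMEA', 700)")
delta_when_stale = rollup_delta(conn)
print(delta_after_rebuild, delta_when_stale)
```

Integer cents keep the comparison exact, so a pass means a delta of exactly zero rather than "close".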

07 — The Operator Console

Where the learning is visible — and provable.

Illustrative mockup · bespoke, part of the moat — no off-the-shelf tool shows this.

MaxSavant

● Loop Monitor

○ Verifier

○ Skill Library

○ Domains

○ MCP / Tools

○ Audit Log

DOMAIN

Acme ERP

Postgres · in-region

Loop Monitor / revenue-rollup-region

LOOP RUNNING · EP 1,284

CURRENT TASK · ATTEMPT 2 / 4 · sandbox: fork-9f3a ✓

Add a region dimension to the revenue rollup and backfill it.

THOUGHT Need to alter fact_revenue and rebuild the rollup view.

ACTION run_migration(0142_add_region.sql)

OBSERVE applied in 1.8s · 3 objects altered

ACTION reconcile_rollup(period=Q1)

VERIFY totals reconcile · Δ = 0.00 · PASS

▍distilling skill

VERIFIER · HELD-OUT

87.4% pass rate

baseline 61.2% · +43% rel.

IMPROVEMENT CURVE · pass % vs episodes · baseline shown · 1,284 episodes

SKILLS DISTILLED · 142

◆ Safe schema-migration + reconciliation · v3 · 28×

◆ Idempotent backfill with batching · v2 · 17×
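The "+43% rel." figure in the mockup follows from the two pass rates shown. A short check of that arithmetic, with `relative_lift` as a hypothetical helper name:

```python
def relative_lift(baseline_pct: float, current_pct: float) -> float:
    """Relative improvement of current over baseline, in percent."""
    return (current_pct - baseline_pct) / baseline_pct * 100.0


lift = relative_lift(61.2, 87.4)
print(round(lift))  # 43
```

The same two numbers give an absolute gain of 26.2 percentage points, so it is worth stating which of the two a reported lift refers to.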

08 — Using the agent

Two surfaces. The console proves it; the chat puts it to work.

White-label chat · commodity. Points at the sovereign model.

Acme ERP Expert · Sovereign model · in-region

Our Q1 EMEA revenue looks off in the rollup. Can you find and fix it?

The EMEA rows skipped the new region mapping for two subsidiaries. I drafted the corrective backfill, ran it on a fork, and the verifier confirms Q1 totals now reconcile (Δ = 0.00).

✓ verified · used skill: Safe schema-migration + reconciliation

Ask the expert anything about Acme ERP…

↑

Every answer is backed by a verified action, not a guess.

The agent acts through the same MCP tools, runs on a sandbox fork, and shows the verifier's receipt — so an operator can trust the work.

▪ Cites the skills it reused

▪ Runs in-region on a swappable open model

▪ Full audit trail back to the console

09 — Why this scales

One pipeline. Swap the domain plug-in — and re-run for the next system.

Supply five things per vertical and MaxSavant produces a specialized agent — onboarding a domain means authoring the verifier + oracle set, not rebuilding the platform.

01

Dev/test env

Containerized, forkable instance of the target system.

02

Verifier

Deterministic checker + optional LLM-judge.

03

Task / oracle set

Tasks with machine-checkable success criteria.

04

Corpora

Docs, example code, API specs for RAG + Auto-MCP.

05

Tool adapters

Or just a spec to auto-generate them.

DATABASE → ERP → CODE → … Same engine, new vertical. The cost of the first domain becomes leverage on every next one.
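The five per-vertical inputs listed above could be captured in one plug-in record that the pipeline validates before a run. This is a sketch under assumptions: `DomainPlugin`, `OracleTask`, and every field name are hypothetical, chosen only to mirror the five items.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass(frozen=True)
class OracleTask:
    prompt: str
    check: Callable[[str], bool]  # machine-checkable success criterion


@dataclass
class DomainPlugin:
    name: str
    env_image: str                          # 01 containerized, forkable instance
    verifier: Callable[[str], bool]         # 02 deterministic checker
    tasks: list[OracleTask]                 # 03 task / oracle set
    corpora: list[str]                      # 04 docs, example code, API specs
    tool_adapters: list[str] = field(default_factory=list)  # 05 tool adapters
    tool_spec: Optional[str] = None         # or a spec to generate them from

    def problems(self) -> list[str]:
        """List what is still missing before the domain can be onboarded."""
        found = []
        if not self.env_image:
            found.append("missing dev/test environment")
        if not self.tasks:
            found.append("empty task / oracle set")
        if not self.corpora:
            found.append("no corpora")
        if not self.tool_adapters and self.tool_spec is None:
            found.append("need tool adapters or a spec")
        return found


plugin = DomainPlugin(
    name="database",
    env_image="registry.example/dev-db:latest",  # placeholder image name
    verifier=lambda output: "ERROR" not in output,
    tasks=[OracleTask("add a column", check=lambda out: out == "ok")],
    corpora=["docs/schema.md"],
    tool_spec="openapi.yaml",                    # placeholder spec path
)
print(plugin.problems())
```

Keeping the platform code unaware of anything beyond this record is one way to make "swap the domain plug-in" literal.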

10 — Sovereign by default

Built for the customers general AI can't reach.

Permissive-only

Apache-2.0 / MIT / BSD / PostgreSQL throughout. No AGPL/SSPL surprises. SBOM maintained.

Self- & EU-hostable

Runs entirely in the customer's environment. Nothing leaves the region.

Per-tenant isolation

Corpus, skills & run history stay in the customer's own Postgres, in-region — never pooled across tenants, no external egress.

Model-swappable

Thin abstraction layer — swap the base model as the frontier moves.
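A thin abstraction layer of this kind can be as small as one interface that the agent depends on. The sketch below uses `typing.Protocol`; the `TextModel` interface, its single `complete` method, and the two stub models are assumptions for illustration, not the product's API.

```python
from typing import Protocol


class TextModel(Protocol):
    """The only surface the agent is allowed to depend on (hypothetical)."""

    def complete(self, prompt: str) -> str: ...


class StubModelA:
    def complete(self, prompt: str) -> str:
        return f"A says: {prompt}"


class StubModelB:
    def complete(self, prompt: str) -> str:
        return f"B says: {prompt}"


class Agent:
    def __init__(self, model: TextModel) -> None:
        self.model = model

    def swap_model(self, model: TextModel) -> None:
        # Skills, corpus and run history live outside the model, so a swap
        # replaces only this reference.
        self.model = model

    def answer(self, question: str) -> str:
        return self.model.complete(question)


agent = Agent(StubModelA())
first = agent.answer("status?")
agent.swap_model(StubModelB())
second = agent.answer("status?")
print(first, "|", second)
```

Because v1 changes no weights, everything the system has learned sits in scaffolding on the agent side of this interface.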

11 — The stack

Capabilities, not vendors. The whole stack is white-labeled.

Every layer is interchangeable, permissive open-source, and self-hostable. We ship the capability — never a product name.

KNOWLEDGE STORE

Bundled vector + graph + text store

One self-hosted engine — branded as MaxSavant. The system's memory.

SOVEREIGN MODEL

Swappable open-weights core

Reasoning model served in-region; replace it as the frontier moves.

AGENT RUNTIME

Graph-based orchestration

Drives the attempt → verify → iterate loop.

AUTO-TOOLS · MCP

Generated, pruned tool layer

Built from the target system's own API surface, then trimmed to what works.

SKILL MEMORY

Persistent, versioned skills

Distilled from every verified success and reused.

OPTIMIZER

Automatic prompt & example tuning

Learns from winning trajectories — no weight changes in v1.

OPERATOR CONSOLE

Bespoke proof-of-learning UI

The surface that shows the agent getting sharper.

WHITE-LABEL CHAT

Rebrandable chat surface

Over the sovereign model — in the customer's own colors.

PERMISSIVE OSS ONLY Apache-2.0 / MIT / BSD / PostgreSQL throughout, with no copyleft surprises. Everything self-hostable and rebrandable. You ship MaxSavant, not a parts list.
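The SKILL MEMORY layer above describes skills as persistent and versioned. One way to picture that is an append-only version history per skill plus a reuse counter. The sketch keeps everything in memory for brevity, whereas the deck places skills in the customer's Postgres; `SkillStore`, `SkillVersion`, and their fields are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SkillVersion:
    version: int
    recipe: str
    evidence: str  # reference to the verifier result behind this version


@dataclass
class SkillStore:
    history: dict[str, list[SkillVersion]] = field(default_factory=dict)
    uses: dict[str, int] = field(default_factory=dict)

    def save(self, name: str, recipe: str, evidence: str) -> SkillVersion:
        """Append a new version; earlier versions stay available for audit."""
        versions = self.history.setdefault(name, [])
        entry = SkillVersion(len(versions) + 1, recipe, evidence)
        versions.append(entry)
        return entry

    def latest(self, name: str) -> SkillVersion:
        return self.history[name][-1]

    def record_use(self, name: str) -> int:
        """Count a reuse of an existing skill and return the new total."""
        if name not in self.history:
            raise KeyError(name)
        self.uses[name] = self.uses.get(name, 0) + 1
        return self.uses[name]


store = SkillStore()
store.save("safe-migration", "draft, run on fork, reconcile", evidence="run-1")
store.save("safe-migration", "draft, run on fork, reconcile, batch", evidence="run-2")
store.record_use("safe-migration")
reuse_count = store.record_use("safe-migration")
print(store.latest("safe-migration").version, reuse_count)
```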

12 — Why now

Three curves cross at once.

01 · CAPABILITY

Open models are finally good enough

Apache-licensed models can be made expert with scaffolding alone — no weight changes required in v1.

02 · METHOD

Verifiable-reward learning matured

Execution-feedback and skill-learning results are now published and repeatable — the loop is proven science.

03 · DEMAND

Sovereignty mandates create the buyer

EU and regulated enterprises that can't use a US frontier API still need expert AI, and they have budget for it.

13 — Business model

Land one domain. Expand across the stack.

LAND

Platform license

per domain · annual

The scaffolding tier: deployed expert agent + Operator Console for one system. Priced on the engineering it replaces.

EXPAND

Additional verticals

+ per domain plug-in

Same engine, new target system. Marginal cost falls as the pipeline hardens; net-revenue retention compounds.

PREMIUM

Post-training tier

gated SKU

Supervised fine-tuning (SFT) + reinforcement learning with verifiable rewards (RLVR) against the same verifier, for customers who plateau on scaffolding and have the volume to justify GPU spend.

Pricing structure illustrative · specific figures TBD with design partners.

14 — MVP & roadmap

A focused Phase 1 with a hard finish line.

PHASE 1 · IN SCOPE

✓ One checkable database/code vertical

✓ Branded Postgres bundle + schema

✓ RAG + pruned Auto-MCP toolset

✓ Loop with working deterministic verifier

✓ Skill distillation into the DB

✓ Thin console + white-label chat

OUT OF MVP

Post-training tier · Multi-tenant scale-out · Second vertical

DEFINITION OF DONE

On a fixed held-out task set, the MaxSavant agent shows a statistically meaningful, verifier-scored improvement over a vanilla baseline on the same model.

≥ double-digit

relative lift — consistent with published skill-learning results.
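The definition of done compares two systems on the same fixed task set, so the outcomes are paired. One possible way to score it, offered as an assumption rather than the project's chosen method, is relative lift plus an exact McNemar test on the tasks where the two systems disagree. `paired_lift` is a hypothetical helper and the data in the usage is made up for the example.

```python
from math import comb


def paired_lift(baseline: list[bool], agent: list[bool]) -> tuple[float, float]:
    """Return (relative lift, two-sided exact McNemar p-value) for paired pass/fail."""
    if len(baseline) != len(agent) or not baseline:
        raise ValueError("need two non-empty result lists of equal length")
    base_rate = sum(baseline) / len(baseline)
    agent_rate = sum(agent) / len(agent)
    if base_rate == 0:
        raise ValueError("relative lift is undefined for a 0% baseline")
    lift = (agent_rate - base_rate) / base_rate

    # Only discordant tasks carry information about which system is better.
    wins = sum(a and not b for b, a in zip(baseline, agent))
    losses = sum(b and not a for b, a in zip(baseline, agent))
    n = wins + losses
    if n == 0:
        return lift, 1.0
    k = min(wins, losses)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2**n  # P(X <= k), X ~ Bin(n, 0.5)
    return lift, min(1.0, 2 * tail)


# Illustrative data only: 20 held-out tasks, baseline passes 6, agent passes 16.
baseline_results = [True] * 6 + [False] * 14
agent_results = [True] * 16 + [False] * 4
lift, p_value = paired_lift(baseline_results, agent_results)
print(f"relative lift {lift:.1%}, p = {p_value:.4f}")
```

Fixing the task set and the base model, as the definition requires, is what makes the pairing valid.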
