Before writing a single line, MAESTRO runs an adversarial debate between two frontier models — each reading your actual files, challenging every assumption, and grounding every decision in evidence. Nothing executes until you approve the spec. Nothing commits until it passes every quality gate.
OpenAI GPT-5.4 proposes. Claude Sonnet 4.6 challenges. Each reads your actual codebase — imports, call graphs, architecture — and disputes every claim that isn't grounded in evidence. The spec that reaches execution has survived an adversarial process no single model can replicate.
MAESTRO builds a live structural graph of your codebase using Tree-sitter AST parsing. Semantic search runs HyDE — generating hypothetical code snippets to bridge the vocabulary gap between intent and implementation. Episodic memory records every past mission on your project: what failed, why, and what was learned. Every mission makes the next one smarter.
After debate, MAESTRO produces a specification: exact files, constraints, test expectations, confidence score, risk level. You read it. You approve it. Only then does execution begin. Not a suggestion — a contract between you and the agent.
Every mission runs inside an isolated Firecracker microVM. Seven deterministic safety checks run before review. A five-dimension structured review scores the output before commit. If anything fails, targeted revision is triggered with exact instructions — not a retry from scratch.
Branch snapshot, full file inventory, language detection. MAESTRO knows exactly what it is touching before it starts.
AST-derived structural graph queries (Memgraph), HyDE-enhanced semantic search (FAISS), web research via Exa, and episodic memory from every past mission on this project. MAESTRO understands your architecture before the debate begins.
GPT-5.4 proposes. Claude Sonnet 4.6 reads your files and disputes every ungrounded claim. The spec that emerges has survived adversarial scrutiny.
Exact files to modify, constraints, test expectations, confidence score, risk level. You read it. You approve it. Nothing happens without your sign-off.
Isolated Firecracker microVM. Claude multi-agent executor with parallel sub-agents for complex tasks. No access to your system during execution.
Seven deterministic checks: secrets scan, scope enforcement, import validation, shrink detection, destructive operations, protected files, threat model.
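The shape of such a gate can be sketched in a few lines — each check is a pure function over the proposed diff, so results are reproducible run to run. The two checks below are hypothetical simplifications for illustration, not MAESTRO's actual logic:

```python
# Sketch of a deterministic pre-review gate. Check names mirror the list
# above; their implementations here are illustrative assumptions only.
import re

def secrets_scan(diff: str) -> bool:
    # Fail if anything resembling a hard-coded credential appears.
    return not re.search(r"(api[_-]?key|secret|password)\s*=\s*['\"]\w+", diff, re.I)

def scope_enforcement(diff: str, allowed_files: set[str]) -> bool:
    # Every file touched by the diff must appear in the approved spec.
    touched = set(re.findall(r"^\+\+\+ b/(.+)$", diff, re.M))
    return touched <= allowed_files

def run_gate(diff: str, allowed_files: set[str]) -> dict[str, bool]:
    # Run every check and report each verdict; no LLM in the loop.
    return {
        "secrets_scan": secrets_scan(diff),
        "scope_enforcement": scope_enforcement(diff, allowed_files),
    }

diff = "+++ b/app/auth.py\n+api_key = 'abc123'\n"
results = run_gate(diff, {"app/auth.py"})  # secrets check fails, scope passes
```

Because every check is deterministic, a failed gate produces the same verdict on every re-run — which is what makes targeted revision instructions possible.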
Five-dimension scoring, severity gate, targeted revision
Clean diff committed, episode stored, codebase re-indexed
Every function, class, method, and import relationship in your codebase is indexed into a live knowledge graph using Tree-sitter AST parsing. MAESTRO queries this graph to find callers, dependencies, and impact chains before proposing any change.
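MAESTRO does this with Tree-sitter across languages; the same indexing idea can be sketched with Python's stdlib `ast` module — walk each function definition, record the calls it makes, then invert that into a caller index:

```python
# Illustrative sketch using stdlib `ast` in place of Tree-sitter:
# extract call relationships, then invert them for impact analysis.
import ast
from collections import defaultdict

source = """
def load_user(uid): ...
def handler(req):
    return load_user(req.uid)
"""

calls = defaultdict(set)  # function name -> names it calls
for node in ast.walk(ast.parse(source)):
    if isinstance(node, ast.FunctionDef):
        for sub in ast.walk(node):
            if isinstance(sub, ast.Call) and isinstance(sub.func, ast.Name):
                calls[node.name].add(sub.func.id)

# Invert the edges to answer "who calls load_user?"
callers = defaultdict(set)
for fn, callees in calls.items():
    for callee in callees:
        callers[callee].add(fn)
```

Querying `callers["load_user"]` yields `handler` — the impact chain a change to `load_user` must account for.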
Hypothetical Document Embedding bridges the gap between a developer's intent and implementation vocabulary. Instead of searching for "rate limiter" and hoping for a match, MAESTRO generates a hypothetical code snippet and searches by embedding similarity — finding what you mean, not just what you typed.
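The HyDE flow is easy to see in miniature. In production the hypothetical snippet comes from an LLM and retrieval runs over FAISS dense vectors; in this toy sketch the generator is a stub and similarity is Jaccard overlap on tokens, just to show the shape:

```python
# Toy HyDE sketch: generate a hypothetical snippet for the query, then
# rank real code by similarity to that snippet rather than to the query.
import re

def generate_hypothetical(query: str) -> str:
    # Stub for the LLM step: imagine what the answer's code might look like.
    return "def check_rate_limit(user, window): tokens = bucket.take(user)"

def tokens(text: str) -> set[str]:
    return set(re.findall(r"\w+", text.lower()))

def search(query: str, corpus: dict[str, str]) -> str:
    hypo = tokens(generate_hypothetical(query))
    def score(snippet: str) -> float:  # Jaccard similarity
        t = tokens(snippet)
        return len(hypo & t) / len(hypo | t)
    return max(corpus, key=lambda path: score(corpus[path]))

corpus = {
    "app/throttle.py": "def check_rate_limit(user, window): ...",
    "app/auth.py": "def verify_password(user, pw): ...",
}
best = search("where do we throttle requests?", corpus)
```

The query never mentions "rate limit", yet `app/throttle.py` wins — the hypothetical snippet supplies the vocabulary the query lacked.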
Every mission outcome is stored per project: what was attempted, what failed, what was learned. Before the next mission on your project, MAESTRO retrieves relevant past episodes and feeds them into the debate. The more you use MAESTRO on a project, the better it gets at that project specifically.
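The store-then-retrieve shape described above can be sketched as follows — the ranking here (keyword overlap on mission goals) is an illustrative assumption, not MAESTRO's actual retrieval:

```python
# Sketch of per-project episodic memory: record each mission outcome,
# then recall the most relevant past episodes for a new goal.
import re
from dataclasses import dataclass, field

@dataclass
class Episode:
    goal: str
    outcome: str   # e.g. "success" or "failed: flaky auth test"
    lesson: str

@dataclass
class EpisodicMemory:
    episodes: list[Episode] = field(default_factory=list)

    def record(self, ep: Episode) -> None:
        self.episodes.append(ep)

    def recall(self, goal: str, k: int = 3) -> list[Episode]:
        # Rank stored episodes by shared words with the new goal.
        words = set(re.findall(r"\w+", goal.lower()))
        ranked = sorted(
            self.episodes,
            key=lambda e: len(words & set(re.findall(r"\w+", e.goal.lower()))),
            reverse=True,
        )
        return ranked[:k]

memory = EpisodicMemory()
memory.record(Episode("add JWT refresh rotation", "failed", "tests mock clock"))
memory.record(Episode("rename CLI flags", "success", "docs also reference flags"))
relevant = memory.recall("rotate JWT tokens", k=1)
```

Recalled lessons are injected into the debate as evidence, so a mistake made once is argued against the next time.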
"Refactor the authentication service to replace JWT with rotating refresh tokens, update all 14 dependent endpoints, add Redis token blacklist, and maintain 100% test coverage."
"Migrate the entire mission pipeline from synchronous FastAPI handlers to Temporal workflows with full saga rollback semantics, heartbeats, and idempotency guarantees."
"Implement an in-memory caching layer with TTL, ETag/304 support, per-endpoint cache invalidation, cache statistics endpoint, and comprehensive integration tests."
Development environment
Delegate without leaving your editor. Approve the spec inline; the commit lands in your git panel. Native extension available on both the VS Code and Cursor marketplaces.
MAESTRO exposes a full Model Context Protocol server. Any MCP-compatible tool — Cursor, Windsurf, Claude Code — can trigger missions programmatically. One protocol, every IDE.
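Under the hood, MCP is JSON-RPC 2.0, and tool invocations use the `tools/call` method. The sketch below builds the request an MCP client would send to trigger a mission; the tool name `run_mission` and its arguments are hypothetical, not MAESTRO's published schema:

```python
# Build an MCP `tools/call` request (JSON-RPC 2.0). The tool name and
# argument shape are assumptions for illustration.
import json

def tool_call(request_id: int, tool: str, arguments: dict) -> str:
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })

msg = tool_call(1, "run_mission", {"goal": "add ETag support to /items"})
```

Because the protocol is the integration surface, any MCP-compatible client sends this same message shape — no per-IDE plugin code required.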
Tag @maestro on any issue or PR. MAESTRO opens a PR with the fix. GitLab CI/CD, self-hosted instances, and GitHub Actions templates all supported.
Research & intelligence
Live web search during every research phase. Current library docs, recent CVEs, updated API references. The debate is grounded in what is true today — not training data from months ago.
MAESTRO fetches any live URL during research — GitHub READMEs, internal Confluence pages, staging environments, public documentation. If it is reachable, it is readable.
Import project specs directly from Notion pages. Pull UI component specs from Figma designs. MAESTRO reads your actual documentation before the debate begins, not a summary of it.
Project context
MAESTRO reads your Supabase schema, Row Level Security policies, and edge function definitions before proposing any database change. No hallucinated migrations — the debate knows your actual data layer.
Meeting transcripts become project context. Architectural decisions made in a call, requirements discussed in a standup — MAESTRO reads them and feeds them into the research phase automatically.
Assign a Linear ticket directly to MAESTRO. Tag @maestro in any Slack channel. The spec approval appears in the thread. Your team sees exactly what will change before a single file is touched.
Deployment & testing
After a successful mission, MAESTRO triggers a Vercel deployment automatically. Your preview URL is live before you finish reading the commit message.
E2E tests run inside the Firecracker sandbox alongside your test suite. MAESTRO verifies that what it built actually works in a browser — not just that the unit tests pass.
When your test suite fails in CI, MAESTRO triggers automatically — it analyzes the failure, proposes a fix, and opens a PR. Routine regressions fix themselves. Your engineers work on harder problems.
Spots are limited. Applications reviewed within 48 hours.
10 spots. Completely free. We review every application personally and respond within 48 hours.
Questions? Write to us directly