One goal. Autonomous execution.
Submit a development objective. CodeHelm decomposes it into a structured task plan, dispatches specialized AI agents to work in parallel, validates the results against quality gates and acceptance criteria, and iterates autonomously — until the work is done.
No credit card required · 14-day Pro trial · BYOK / BYOS
The execution engine between your spec and validated code
CodeHelm is not a wrapper around Claude or Codex. It is an orchestration runtime that takes a development objective, breaks it into a structured task plan, and coordinates multiple specialized AI agents to carry out the work — with quality gates and an LLM critic evaluating every result before anything is accepted.
When results fail validation, CodeHelm re-plans and retries automatically. When runs succeed, the changes are merged and a handoff summary is generated. The whole process is auditable, observable, and team-safe from day one.
Not a wrapper. An execution engine.
Every part of CodeHelm is designed around autonomous, validated, multi-agent software execution — not just API key management.
Autonomous Decomposition
CodeHelm parses your objective with an LLM, extracts requirements and acceptance criteria, then generates a structured task DAG — before a single agent runs.
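A plan like this can be pictured as a dependency graph. The sketch below is illustrative only — the task names and dict-of-sets schema are assumptions, not CodeHelm's actual plan format — but it shows how a task DAG fixes a valid execution order before any agent runs:

```python
# Hypothetical task plan: each task maps to the set of tasks it depends on.
# Names and structure are illustrative, not CodeHelm's real schema.
from graphlib import TopologicalSorter

task_dag = {
    "design-api-schema": set(),
    "implement-endpoints": {"design-api-schema"},
    "build-frontend-form": {"design-api-schema"},
    "integration-tests": {"implement-endpoints", "build-frontend-form"},
}

# A valid execution order respects every dependency edge; independent
# tasks (the two middle ones) could be dispatched to agents in parallel.
order = list(TopologicalSorter(task_dag).static_order())
print(order)
```

Tasks with no edge between them are exactly the ones an orchestrator can hand to different agents simultaneously.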
Multi-Agent Dispatch
Tasks are assigned to specialized agents: Cursor for frontend work, Codex for backend logic, Claude Code for architecture and refactoring. Each runs in an isolated git worktree.
Plugin Skills
Activate reusable plugins like Ottili Frontend Design to bias every run with domain-specific instructions. Pro workspaces can create custom plugins for their own workflows.
Quality Gates
Every completed task passes through configurable validation checkpoints: syntax checks, lint, test suites, build steps, and health checks — before results are accepted.
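A gate chain like that can be sketched as an ordered list of checks where the first failure rejects the result. The gate names and check functions below are assumptions for illustration, not CodeHelm's configuration API:

```python
# Minimal sketch of sequential quality gates: each gate inspects the
# artifact; the first failure stops the chain and names the failing gate.

def syntax_gate(artifact: str) -> bool:
    # e.g. compile-check generated Python source before accepting it
    try:
        compile(artifact, "<generated>", "exec")
        return True
    except SyntaxError:
        return False

def run_gates(artifact, gates):
    """Run gates in order; return (accepted, failing_gate_name)."""
    for name, check in gates:
        if not check(artifact):
            return (False, name)
    return (True, None)

gates = [
    ("syntax", syntax_gate),
    ("non-empty", lambda src: bool(src.strip())),
]

ok, failed_at = run_gates("def add(a, b):\n    return a + b\n", gates)
```

Real gates would shell out to linters, test suites, and build steps; the control flow stays the same.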
LLM Critic Evaluation
An LLM evaluator reviews each result against the original acceptance criteria. If it doesn't meet the bar, CodeHelm re-plans the task and retries with an adjusted approach.
Automatic Retry & Refinement
Failed validation triggers a re-plan loop, not a dead end. CodeHelm adjusts the approach and dispatches again — up to configurable iteration limits.
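The re-plan loop reduces to a small control structure. The `execute`, `validate`, and `replan` callables below are hypothetical stand-ins, not CodeHelm's actual interfaces — a sketch of the shape, not the implementation:

```python
# Sketch of a bounded re-plan/retry loop: execute, validate, and on
# failure adjust the plan before the next attempt.

def run_with_retries(task, execute, validate, replan, max_iterations=3):
    """Execute a task, re-planning on failed validation up to a cap."""
    for attempt in range(1, max_iterations + 1):
        result = execute(task)
        if validate(result):
            return result, attempt
        task = replan(task, result)  # adjust the approach, then retry
    raise RuntimeError(f"failed validation after {max_iterations} attempts")

# Toy run: validation only passes once the plan has been refined twice.
result, used = run_with_retries(
    task={"detail": 0},
    execute=lambda t: t["detail"],
    validate=lambda r: r >= 2,
    replan=lambda t, r: {"detail": t["detail"] + 1},
)
```

The cap is what turns "retry" into a budget rather than an infinite loop.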
Continuous Improvement
Between major runs, the Micro Evolution Engine scans for improvement signals: TODO/FIXME comments, failed tests, log errors, and health alerts, then resolves them conservatively.
Isolated Execution & Merge
Agents work in separate git worktrees so changes never conflict mid-run. On validation success, the Orchestrator merges results and runs a final gate pass on the combined output.
Full Lifecycle Observability
Every run moves through a structured state machine with 13 states. Every transition is logged, timestamped, and auditable. Prometheus metrics and webhook callbacks for every event.
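An explicit state machine with an append-only transition log can be sketched in a few lines. The state names below are illustrative (CodeHelm's actual machine has 13 states) and the class is not its real API:

```python
# Sketch of a run lifecycle: only whitelisted transitions are legal,
# and every transition is appended to an immutable, timestamped log.
import time

TRANSITIONS = {
    "created": {"planning"},
    "planning": {"executing"},
    "executing": {"validating"},
    "validating": {"merging", "replanning"},
    "replanning": {"executing"},
    "merging": {"completed"},
}

class Run:
    def __init__(self):
        self.state = "created"
        self.log = []  # append-only: (timestamp, from_state, to_state)

    def transition(self, new_state):
        if new_state not in TRANSITIONS.get(self.state, set()):
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.log.append((time.time(), self.state, new_state))
        self.state = new_state

run = Run()
for s in ["planning", "executing", "validating", "merging", "completed"]:
    run.transition(s)
```

Metrics and webhook emission would hang off the same transition hook, since every event passes through one choke point.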
Team-Safe Orchestration
Multi-workspace isolation with RBAC (owner, admin, member, viewer). AI credentials are workspace-scoped, encrypted at rest, and never proxied through CodeHelm's infrastructure.
What happens during a CodeHelm run
From objective to validated output — every step is automated, observable, and recoverable.
Submit objective
Provide a development goal, spec document, or task description. CodeHelm handles the rest.
Plan & decompose
An LLM parses your objective, identifies requirements, and generates a task dependency graph with agent assignments.
Agents execute
Specialized agents (Cursor, Codex, Claude Code) work simultaneously on their assigned tasks in isolated git worktrees.
Validate & merge
Quality gates and an LLM critic evaluate every result. Failures trigger automatic re-planning. Successful output is merged and delivered.
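The four steps above can be sketched as one orchestration loop. The `plan`, `execute`, `evaluate`, and `merge` callables are hypothetical placeholders, not CodeHelm's real API:

```python
# End-to-end sketch: plan, execute, validate, merge — with a bounded
# re-plan loop when validation fails.

def orchestrate(objective, plan, execute, evaluate, merge, max_replans=2):
    tasks = plan(objective)                       # 1. plan & decompose
    for _ in range(max_replans + 1):
        results = [execute(t) for t in tasks]     # 2. agents execute (parallel in practice)
        if all(evaluate(r) for r in results):     # 3. gates + LLM critic
            return merge(results)                 # 4. merge validated output
        tasks = plan(objective)                   # failed validation -> re-plan
    raise RuntimeError("validation failed after re-plan limit")

# Toy run with stand-in callables.
out = orchestrate(
    "add a health endpoint",
    plan=lambda obj: [f"{obj}: task {i}" for i in range(2)],
    execute=lambda t: t.upper(),
    evaluate=lambda r: r.isupper(),
    merge=lambda rs: " | ".join(rs),
)
```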
Orchestrated delivery, not one-shot drafts
Single-run AI tools are fast. CodeHelm is thorough. These are different products for different problems.
You pay for orchestration.
Not for tokens.
CodeHelm does not proxy your AI requests. Your API keys go directly to your chosen provider. We charge for the execution layer — the planning, agent coordination, validation, retry logic, and observability infrastructure that makes autonomous AI development reliable and team-safe.
Simple, transparent pricing
Pay for orchestration infrastructure. Your AI tokens stay between you and your provider.
Plus
For individual engineers and small projects
or €190/year — save 16%
Included:
- 1 workspace
- Up to 3 repositories
- Up to 5 AI providers
- Plugin library access
- Unlimited runs
- 7-day log retention
- Community support

Not included:
- Custom plugin creation
- Team members
- RBAC & roles
- Webhook callbacks
Autonomous execution requires serious auditability
When AI agents are making code changes on your behalf, you need a complete record of everything that happened and why.
Append-only audit log
Every state transition, agent dispatch, validation result, and retry is logged and immutable. Full lifecycle history per run.
Encrypted credentials
All AI provider keys and subscriptions are encrypted at rest with AES-256. Never proxied through our infrastructure.
Workspace RBAC
Fine-grained roles — owner, admin, member, viewer — scoped per workspace. Agent credentials and run permissions are isolated.
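A role model like this boils down to a role-to-permission mapping checked per workspace. The permission names below mirror the roles above but are an assumption for illustration, not CodeHelm's actual permission set:

```python
# Illustrative RBAC check: each role grants a fixed permission set,
# and every check is scoped to one workspace's role assignment.

ROLE_PERMISSIONS = {
    "viewer": {"read"},
    "member": {"read", "run"},
    "admin": {"read", "run", "manage_credentials"},
    "owner": {"read", "run", "manage_credentials", "manage_roles"},
}

def can(role: str, permission: str) -> bool:
    """Return True if the given workspace role grants the permission."""
    return permission in ROLE_PERMISSIONS.get(role, set())

assert can("member", "run")
assert not can("viewer", "run")
```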
SOC2-ready design
Designed with SOC2 Type II compliance principles from day one. Structured logs, controlled access, and no AI token logging.