A desktop app for directing AI to code

The AI writes code. You hold the gate.

GeneralStaff runs AI coding agents across your projects and enforces a hard verification gate. Every change is checked against your source of truth (your tests, your build, your custom rules) before it is accepted. If tests fail, dependencies break, or a file drifts outside scope, the change is rejected automatically. You define the constraints. The tool enforces them. No programming required.

[Image: GeneralStaff desktop, fleet dashboard showing all projects, warm linen theme]
Runs on: Your machine. Local-first.
Uses: Your API key. No subscription.
Verifies: Tests · build · custom rules
License: AGPL-3.0 · free and open source

01 The problem

AI coding tools are fast. They are also, often, wrong.

The model finishes a task, marks it green, and moves on. The change compiles but the feature doesn't work. The test was deleted instead of fixed. A function got stubbed out and forgotten. If you can't read the diff, you can't catch it.

An agent says "done" the same way whether the work holds up or not. Telling the two apart is the whole problem.
  • a. Confident "done." The agent declares success on work that wasn't finished.
  • b. Silent regressions. A nearby feature breaks; nothing surfaces it.
  • c. Stubbed-out reality. Hard parts get mocked, then quietly stay mocked.
  • d. Deleted tests. The failing test "passes" because it no longer exists.
  • e. No paper trail. You don't know what changed, or why, or which attempt finally stuck.
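
Some of these failure modes are mechanical enough to check without reading the diff. The sketch below shows how a rule for (d), plus a rule for files drifting outside scope, might look. It is an illustration only: the `Change` shape, the function names, and the test-file naming pattern are assumptions made for this example, not GeneralStaff's actual rule format.

```typescript
// Hypothetical summary of one change; field names are assumptions for this sketch.
type Change = { added: string[]; modified: string[]; deleted: string[] };

// Assumed naming convention: foo.spec.ts, foo.test.tsx, and similar.
const isTestFile = (path: string): boolean => /\.(spec|test)\.[jt]sx?$/.test(path);

// Failure mode (d): list any test files the change deleted.
function deletedTests(change: Change): string[] {
  return change.deleted.filter(isTestFile);
}

// Scope drift: list every touched file that sits outside the allowed prefixes.
function outOfScope(change: Change, allowedPrefixes: string[]): string[] {
  const touched = [...change.added, ...change.modified, ...change.deleted];
  return touched.filter(
    (path) => !allowedPrefixes.some((prefix) => path.startsWith(prefix)),
  );
}

// Made-up example: the agent "fixed" a failing test by deleting it,
// and also edited a file outside the cart/ folder.
const suspectChange: Change = {
  added: [],
  modified: ["cart/total.ts", "billing/invoice.ts"],
  deleted: ["cart/discounts.spec.ts"],
};

const removedTests = deletedTests(suspectChange);
const strayFiles = outOfScope(suspectChange, ["cart/"]);
```

A gate built this way would reject the change if either list is non-empty, and the lists themselves become the recorded reason.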
02 How it works

One cycle. The gate is the whole point.

GeneralStaff doesn't try to make the AI smarter. It puts a gate between the AI and your codebase. Nothing lands until the gate says so. When the gate says no, the work rolls back automatically — your branch stays clean.

— the verification cycle —
STEP 01
Claim

The agent reads the task and writes down what "done" will mean: the assertion it expects to satisfy.

STEP 02
Work

It edits files in an isolated workspace. Your branch is untouched while it tries, fails, retries.

STEP 03
Gate

Tests run. The build runs. Your custom rules run. The original claim is checked against reality.

Kept

The gate passes. The change is committed with the claim, the verification result, and a signature in the audit log.

Rolled back

The gate fails. The work is discarded, the reason recorded, and the agent gets another attempt — or hands back to you.

[gsd]   task resolve flaky cart total → session #1843
[claim] cart.total returns sum of line items × qty, ex-discounts
[work]  edited cart/total.ts · cart/discounts.ts · 4 files
[gate]  running 248 tests · build · project rules…
[gate]  ✕ FAIL — 2 tests broke (cart/discounts.spec)
[work]  rollback → workspace clean · retry 1/3
[gate]  ✓ PASS — 248/248 tests · build ok · claim verified
[keep]  commit a83f1c2 · signed · audit #224
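
For readers who think in code, the cycle can be sketched in a few lines. This is a minimal illustration of the idea (work on a copy, run every check, keep or discard), not GeneralStaff's implementation. All names here (`Check`, `Attempt`, `runGate`, `runCycle`) are hypothetical, the workspace is modeled as an in-memory map of file contents, and the limit of three attempts is an assumption.

```typescript
// Hypothetical types for illustration; none of these names come from GeneralStaff.
type GateResult = { pass: boolean; reason?: string };
type Check = (workspace: Map<string, string>) => GateResult;
type Attempt = (workspace: Map<string, string>, attemptNo: number) => void;

type Outcome =
  | { status: "kept"; attempts: number; files: Map<string, string> }
  | { status: "rolled_back"; attempts: number; reasons: string[] };

// Run every check in order; the first failure decides the gate.
function runGate(checks: Check[], workspace: Map<string, string>): GateResult {
  for (const check of checks) {
    const result = check(workspace);
    if (!result.pass) return result;
  }
  return { pass: true };
}

function runCycle(
  branch: ReadonlyMap<string, string>,
  attempt: Attempt,
  checks: Check[],
  maxAttempts = 3, // assumed limit
): Outcome {
  const reasons: string[] = [];
  for (let n = 1; n <= maxAttempts; n++) {
    // Work happens on a copy, so the branch is untouched while the agent tries.
    const workspace = new Map(branch);
    attempt(workspace, n);
    const gate = runGate(checks, workspace);
    if (gate.pass) return { status: "kept", attempts: n, files: workspace };
    // Rollback: the copy is simply dropped and the reason is recorded.
    reasons.push(gate.reason ?? "unknown");
  }
  return { status: "rolled_back", attempts: maxAttempts, reasons };
}

// Made-up usage: the first attempt breaks a check, the second one passes.
const branch = new Map([["cart/total.ts", "v1"]]);
const checks: Check[] = [
  (ws) =>
    ws.get("cart/total.ts") === "fixed"
      ? { pass: true }
      : { pass: false, reason: "2 tests broke" },
];
const outcome = runCycle(
  branch,
  (ws, n) => ws.set("cart/total.ts", n === 1 ? "broken" : "fixed"),
  checks,
);
```

The design point is that a failed attempt never needs cleaning up: because the agent only ever touches a copy, rolling back means throwing the copy away.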
— new in v0.8.0: autonomous mode —

It proposes the work now, too. You still decide what's worth doing.

Through v0.7, GeneralStaff only ran the work you queued. Autonomous mode (opt-in, off by default) adds a step in front: it reads a project's real state (its mission, git history, and open tasks), proposes concrete next work, and runs each proposal through the same gate. Each proposal gets two verdicts: keep or reject, and mechanical or a call only you should make. It dispatches the mechanical work and hands the judgment calls back to you.

Dispatched

Bot-safe work runs through the normal cycle and lands on a branch — gated and rolled back exactly like work you queued yourself. It never pushes or merges; the merge stays your call.

Routed to you

Anything that turns on taste, scope, money, or a live product isn't decided for you. It surfaces as a short list of decisions and waits.
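
The split between dispatched and routed work can be pictured as a simple triage step. The sketch below is an assumption-laden illustration: the `Proposal` shape, the `touches` flags, and the `triage` function are invented for this example, and how GeneralStaff actually classifies a proposal is not described on this page.

```typescript
// Hypothetical shapes for illustration; the real proposal format is not shown here.
type JudgmentFlag = "taste" | "scope" | "money" | "live-product";
type Proposal = { title: string; touches: JudgmentFlag[] };
type Routed = { dispatched: Proposal[]; needsYou: Proposal[] };

// Any judgment flag sends the proposal back to the human;
// only flag-free (mechanical) work is dispatched.
function triage(proposals: Proposal[]): Routed {
  const routed: Routed = { dispatched: [], needsYou: [] };
  for (const proposal of proposals) {
    if (proposal.touches.length === 0) {
      routed.dispatched.push(proposal);
    } else {
      routed.needsYou.push(proposal);
    }
  }
  return routed;
}

// Made-up proposals for the example.
const proposals: Proposal[] = [
  { title: "fix search debounce", touches: [] },
  { title: "change pricing tiers", touches: ["money", "live-product"] },
  { title: "redesign 404 page", touches: ["taste"] },
];
const routed = triage(proposals);
```

In this sketch the default is conservative: a proposal has to be free of every flag before it runs unattended, and even then it still goes through the gate.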

03 See it

A desk for directing the work — not a chat window.

GSD is the desktop app. It runs your fleet of projects, shows you what each session is doing right now, and lets you open any task to see exactly what landed (and what didn't).

FIG.01 Fleet · 6 projects · 3 sessions running [Image: GSD fleet dashboard, kriegspiel paper theme, multiple projects listed with status indicators]
FIG.02 Live session · agent at work [Image: GSD live session tab, embedded Claude Code session running inside the desktop app]
FIG.03 Task detail · claim, diff, gate result [Image: GSD Workbench view, per-project task detail showing the Retrogaze project]
04 Why you can trust it

An open audit log. And a real track record.

Every accept and every reject is written down. You can read them. We do — GeneralStaff builds itself. Here's the running tally from the dogfooding repo:

223  Changes verified · kept
     Tests passed. Build clean. Claim matched reality.
27   Changes rejected · rolled back
     Caught at the gate. Never reached the branch.
Audit · last 8 entries (gsd.log)
14:02:11  cart total · sum verified   a83f1c2
13:58:44  cart total · 2 tests broke
13:41:09  add coupon validation       5d12b08
12:30:22  user prefs migration        11e9c0a
11:54:17  prefs · claim unmet
11:12:55  search debounce fix         f4b27e1
10:48:03  email template copy         9c0a3d8
10:22:31  404 page styling            22be917
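
Because the log is plain data, the tally is something you can recompute yourself. The sketch below does that for the eight entries shown above. The `AuditEntry` shape is hypothetical, and it rests on one assumption read off the listing: an entry with a commit hash was kept, and an entry without one was rolled back. The real gsd.log format may differ.

```typescript
// Hypothetical entry shape; the actual gsd.log format may differ.
type AuditEntry = { time: string; summary: string; commit?: string };

const lastEight: AuditEntry[] = [
  { time: "14:02:11", summary: "cart total · sum verified", commit: "a83f1c2" },
  { time: "13:58:44", summary: "cart total · 2 tests broke" },
  { time: "13:41:09", summary: "add coupon validation", commit: "5d12b08" },
  { time: "12:30:22", summary: "user prefs migration", commit: "11e9c0a" },
  { time: "11:54:17", summary: "prefs · claim unmet" },
  { time: "11:12:55", summary: "search debounce fix", commit: "f4b27e1" },
  { time: "10:48:03", summary: "email template copy", commit: "9c0a3d8" },
  { time: "10:22:31", summary: "404 page styling", commit: "22be917" },
];

// Assumption: a commit hash means the change was kept.
function tally(entries: AuditEntry[]): { kept: number; rolledBack: number } {
  const kept = entries.filter((entry) => entry.commit !== undefined).length;
  return { kept, rolledBack: entries.length - kept };
}

// Share of all gated changes that were kept.
function keptShare(kept: number, rolledBack: number): number {
  return kept / (kept + rolledBack);
}

const recent = tally(lastEight); // 6 kept, 2 rolled back
const overall = keptShare(223, 27); // 223 of 250 changes, about 89%
```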
05 You own it

Local-first. Your key. Your machine. Your code.

GeneralStaff isn't a service. There's no account, no usage dashboard on some other company's server, no "your code will be used to improve our model." The app runs on your laptop and talks directly to the model provider you pick.

i. Runs locally

Native desktop app for macOS and Windows. Nothing about your work goes through us.

ii. Your API key

BYO Anthropic or OpenAI key. You see and pay your own usage, directly.

iii. No subscription

No subscription, no license fee. GeneralStaff is free and open source under AGPL-3.0: audit it, fork it, build it yourself.

iv. Your code, only

Source never leaves your machine. The audit log lives in your repo.

06 Get started

Open it. Point it at a project. Tell it what you want.

Three steps. Then you can stop reading diffs.

The gate is on by default; you can turn it up, never down.

STEP 01
Download GSD

macOS (Apple silicon & Intel) and Windows.

STEP 02
Add your projects

Drag a folder in. GSD reads your tests and build the way you already run them.

STEP 03
Describe the task

Plain English. The gate enforces the rest. Watch what actually landed.