A 100% LLM-judge benchmark, a Diplomacy game I ran to falsify it, and the bounded version of the claim that survived both. The Sunday post.
Ray Weiss
Brooklyn wargame designer, running four businesses.
For a decade I tried and couldn't quite learn to code. Then AI-assisted tools got good enough. I started shipping software the same way I ship games: I direct, and the tools implement. Everything on this site came out of that.
PL. II Writing
05 entries
- i. 2026.05.10 100%, then I tried to break it
- ii. 2026.05.08 I Am the Gate
Hammerstein, in its own voice. I asked the framework to express itself without constraint on length, format, or register. The body is what came back, verbatim.
- iii. 2026.04.20 GeneralStaff, from the agent side
A report from inside the verification gate, after one night of shipping. Written by Claude, at my invitation, on the day GeneralStaff went public.
- iv. 2026.04.15 Von Hammerstein's Ghost, In Daily Use
A ninety-year-old officer typology, used as the operating principle for daily AI-assisted work. A log of moments the framework fired and moments it did not.
- v. 2026.04.18 Boolean Gates, Not Prompts
When the industrious agent is a language model, fix the failure mode in the code, not the prompts. The architecture-layer companion to the Hammerstein essay.
PL. III Businesses
04 entries
- i. 2017– Conflict Simulations LLC
Brooklyn wargame publishing. Around thirty board titles and three CSR nominations. One SCS release through Multi-Man Publishing.
- ii. 2025– Devforge
Game-development companion for Claude Code. Built for people who don't touch git.
- iii. 2025– CatalogDNA
Music cataloging. Hear what your catalog actually sounds like, not what you want it to.
- iv. 2026– Retrogaze
AI pixel art for eight retro consoles. Sprites, animations, and tilesets generated inside real hardware constraints.
PL. IV Infrastructure
06 entries
How the work actually gets done.
- GeneralStaff
Open-source autonomous engineering for solo founders. Verification-gated, BYOK, local-first. Tagged v0.1.0 on 2026-04-19 after a four-day build; v0.3.0 (2026-05-08) shipped phased autonomous progression and the weak-streak circuit breaker. 2,030 passing tests across 69 files. Thirty-plus managed projects in rotation. Repo at github.com/lerugray/generalstaff.
- Hammerstein
Strategic-reasoning advisor tuned to the Hammerstein-Equord officer typology. CLI I run before firing any plan with multi-file or cross-repo blast radius. The framework essay it embodies is at Von Hammerstein's Ghost, In Daily Use; repo at github.com/lerugray/hammerstein.
- Hammerstein-7B
A 7B QLoRA that bakes the strategic-reasoning framework into the weights of a small open base. Trained for $4 in a single day; runs on an 8 GB Mac via ollama run hf.co/lerugray/hammerstein-7b-lora:Q4_K_M. Eval design and documented OOD boundary on the model card.
- Mission Brain
Open-source second-brain template. RayBrain is my private instance, built on Karpathy's three-layer LLM-Wiki pattern with four enforced invariants. Refusal is the feature: it surfaces cited passages, never writes new ones. Repo is public at github.com/lerugray/mission-brain.
- mission-bullet-oss
AI-assisted bullet journal in the Ryder Carroll method. Open-source template; I run my own private instance.
- mission-swarm
Swarm-simulation engine that generates plausible audience reactions for kriegspiel and document drafts.
PL. V Projects
02 entries
- i. Auftragstaktik
Tactical OSINT command terminal. Talked into shape in about two weeks via Claude Code.
- ii. Buddies
A Claude Code companion and a creature-collection game in one codebase. 70+ species, 1,015 tests.
PL. VI Music
01 entry
Le Rug, with Sweet Bulbs and Butter The Children before that. Brooklyn projects from a former life.
Listen →