Current Toolchain
Last Updated: April 2026
This is our opinionated, point-in-time stack. The pillars are tool-agnostic and durable. This file is not. Expect it to change as the ecosystem evolves.
AI Coding Assistants
Daily Driver: Claude Code
Claude Code is our primary AI coding tool. It runs in the terminal, understands your codebase through agentic exploration, and supports the full development lifecycle from planning through implementation. Key reasons we chose it:
- Agentic mode with tool use (file editing, shell commands, web search) in a single session
- CLAUDE.md and rules files for persistent project context and coding standards
- Hooks system for enforcing guardrails automatically (linting, formatting, test runs)
- Session persistence with /resume and /rename for picking up where you left off
- Remote control for mobile monitoring of long-running tasks
- Skills and plugins ecosystem for extending capabilities
- Scheduled tasks for recurring automation
We use Opus 4.7 for complex planning, architecture decisions, and code review. Sonnet 4.6 handles routine implementation work at lower cost with strong quality. Our rough split: Opus for ~30% of work (planning, debugging, architecture), Sonnet for ~70% (implementation, tests, boilerplate). This maps to the 70/30 model strategy described in Pillar 7.
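The 70/30 split can be sketched as a simple routing rule. The task categories and model labels below are our own convention for illustration, not API identifiers:

```python
# Hypothetical task router reflecting the ~30/70 Opus/Sonnet split described above.
# Task categories and model labels are illustrative, not pinned API names.
PLANNING_TASKS = {"planning", "architecture", "debugging", "code-review"}

def pick_model(task_type: str) -> str:
    """Route high-leverage reasoning work to the larger model,
    routine implementation work to the cheaper one."""
    if task_type in PLANNING_TASKS:
        return "opus"    # ~30% of work: planning, debugging, architecture
    return "sonnet"      # ~70% of work: implementation, tests, boilerplate
```

The point is that the routing decision is made per task, not per project: a single feature might use Opus for the plan and Sonnet for the implementation.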
Secondary: Codex
OpenAI’s Codex serves as our secondary when we need a different model perspective or when a task benefits from OpenAI’s ecosystem. Useful for cross-checking architectural decisions against a second model’s reasoning and for tasks where model diversity catches blind spots.
Spec and Planning Tools
Primary: OpenSpec.dev
OpenSpec is our primary spec-first development tool. It follows a structured workflow: specify, clarify, plan, task, implement. By researching the codebase before generating code, it catches implicit requirements that raw prompting misses. It is also lighter on tokens than alternatives, which makes it practical for daily use on Pro plans across both brownfield and greenfield work.
The core discipline OpenSpec enforces (and that we expect even without the tool): research the codebase first, surface assumptions through clarifying questions, create a plan with discrete tasks, then implement. See Pillar 2: Planning Before Code.
Secondary: SpecKit
SpecKit’s heavier research phase produces meaningfully better output when thoroughness matters more than cost. It generates massive context files and catches requirements hiding in code comments. The token cost is real, so we reach for SpecKit when a project demands deep codebase analysis upfront.
Dev Infrastructure
CLAUDE.md / Rules Files
Every project has a CLAUDE.md (or equivalent rules file) checked into source control. This is non-negotiable. It contains project context, coding standards, architecture patterns, and explicit “do not” instructions. See Pillar 1: Context Engineering.
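A minimal CLAUDE.md might look like the following. The project name, stack, and specific rules are illustrative, not a mandated template:

```markdown
# Project: Acme Billing API

## Context
- Python 3.12, FastAPI, PostgreSQL via SQLAlchemy 2.x
- Monorepo: service code in src/, tests in tests/

## Standards
- Type hints on all public functions; run ruff and mypy before committing
- Every behavior change ships with a test

## Do not
- Do not touch migrations/ without an explicit request
- Do not add new dependencies without flagging them in the plan first
```

The "Do not" section is the part teams most often skip, and the part that pays off fastest.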
MCP Servers
We use MCP servers to give our AI tools direct access to external systems: database schemas, API documentation, browser automation, and project management tools. Giving the AI tools to work with, rather than describing systems manually, consistently produces better results.
Common MCP servers in our stack:
- Playwright for browser automation and UI verification
- Database connectors for schema introspection
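Project-scoped MCP servers are typically declared in a .mcp.json checked into the repo. A sketch wiring up the two servers above might look like this; the package names and connection string are illustrative, so check each server's own docs for the exact invocation:

```json
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    },
    "postgres": {
      "command": "npx",
      "args": ["@modelcontextprotocol/server-postgres", "postgresql://localhost/dev"]
    }
  }
}
```

Checking this file into source control means every developer's AI session gets the same tool access.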
Hooks
Claude Code hooks enforce guardrails at the tool level: pre-commit linting, automatic test runs after code changes, formatting enforcement. These are configured per-project and checked into source control so every developer gets the same constraints. See Pillar 5: Guardrails and Quality.
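As a sketch, a PostToolUse hook in .claude/settings.json can run the formatter after every file edit. The matcher syntax and command below are illustrative; consult the hooks documentation for your Claude Code version:

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          { "type": "command", "command": "ruff format ." }
        ]
      }
    ]
  }
}
```

Because the hook fires at the tool level, the AI cannot skip it the way it might skip an instruction buried in a prompt.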
Skills
Reusable skill packages extend AI capabilities for specific tasks (frontend design, document generation, etc.). We install relevant skills globally and per-project as needed. This ecosystem is growing fast; check skills.sh for what is available.
promptfoo
Our recommended tool for prompt evaluation. promptfoo is open source and lets you define test cases, run them against multiple prompts or models, and compare results systematically. Use it to evaluate rules files, test prompt variations, benchmark model selection for specific tasks, and catch regressions when you change your AI configuration. Prompts are software; promptfoo lets you test them like software. See Pillar 9: Evaluation and Measurement.
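A minimal promptfooconfig.yaml sketch, assuming promptfoo's declarative prompts/providers/tests format; the provider identifiers and test content are placeholders:

```yaml
# Compare two prompt variants across models on the same test cases.
prompts:
  - "Summarize in one sentence: {{article}}"
  - "You are a precise technical editor. Summarize in one sentence: {{article}}"

providers:
  # Placeholder model ids; substitute whatever you are benchmarking.
  - openai:gpt-4o-mini
  - anthropic:messages:claude-3-5-sonnet-latest

tests:
  - vars:
      article: "The release notes describe a breaking change to the auth API."
    assert:
      - type: contains
        value: "auth"
```

Running the eval (typically `promptfoo eval`) produces a side-by-side matrix of every prompt against every provider, which is how regressions in a rules-file change become visible.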
Version Control
Git is non-negotiable. Commit early, commit often. AI-assisted development makes aggressive version control even more critical because you need clean rollback points when AI-generated changes don’t work out.
Voice as Input
SuperWhisper
Our recommended voice input tool. SuperWhisper runs locally on macOS and provides high-quality speech-to-text that feeds directly into any text field, including your terminal and editor. It keeps you in flow by letting you dictate prompts, requirements, and architectural thinking faster than you can type. Particularly effective for ideation, specification drafting, and talking through a problem before committing to code.
Voice dictation is an underused input method for AI-assisted development. The AI handles the messiness of spoken language well, and you can always refine the prompt after dictation. Some developers use voice for the initial brain dump and then edit for precision before sending.
Voice is not a replacement for written specs or structured prompts. It is an accelerant for getting ideas out of your head and into the AI’s context quickly.
Cost Awareness
AI model pricing spans a wide range. As of April 2026, input token costs range roughly from $0.25 to $15 per million tokens across major providers, with output tokens running $1.25 to $75 per million. That is a 60x spread. Understanding where your usage falls on that spectrum is a professional responsibility, not an optional concern.
Practical cost levers to be aware of: model tier selection (the 70/30 strategy above), context window size (larger contexts cost more per request), prompt length (concise prompts save tokens without sacrificing quality), and caching (reusing context across related requests). See Pillar 7: Workflow and Tooling for principles on model selection and cost optimization.
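To make the spread concrete, here is a back-of-the-envelope comparison. The per-million-token prices are illustrative points inside the ranges cited above, not quotes for any specific model, and the monthly volumes are hypothetical:

```python
# Back-of-the-envelope token cost comparison.
# Prices are illustrative points inside the $/M-token ranges cited above.
def cost_usd(input_toks: int, output_toks: int,
             in_price_per_m: float, out_price_per_m: float) -> float:
    """Total cost for a given token volume at given per-million prices."""
    return (input_toks / 1e6) * in_price_per_m + (output_toks / 1e6) * out_price_per_m

# A hypothetical month of usage: 50M input tokens, 10M output tokens.
premium = cost_usd(50_000_000, 10_000_000, 15.00, 75.00)  # top-of-range pricing
budget  = cost_usd(50_000_000, 10_000_000, 0.25, 1.25)    # bottom-of-range pricing
blended = 0.3 * premium + 0.7 * budget                    # the 70/30 split above

print(f"premium: ${premium:,.2f}")   # $1,500.00
print(f"budget:  ${budget:,.2f}")    # $25.00
print(f"blended: ${blended:,.2f}")   # $467.50
```

Even at these rough numbers, routing 70% of work to the cheaper tier cuts the bill by roughly two thirds versus running everything on the premium model.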