
Architecture Overview

The autoswe system is an automated loop that rewrites a Python infrastructure-provisioning codebase into Go, one module at a time. It runs entirely on an NVIDIA DGX Spark (GB10), with a local Qwen3.6 model as the only writer and Codex GPT-5.5 as the sole independent judge 1. The system iterates on each module until a fixed parity matrix passes, recording every byte of model I/O, every judge verdict, and every commit in local JSONL transcripts 2. The architecture has four parts: an operator interface, a local AI writer harness, an external AI judge, and a static site documentation pipeline.
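As a rough illustration of what append-only JSONL capture can look like, the sketch below writes one JSON object per line and reads the records back. The record fields (ts, kind, payload) and the function names are assumptions made for this example; the source does not describe the actual transcript schema.

```python
import json
import time
from pathlib import Path


def append_record(transcript: Path, kind: str, payload: dict) -> dict:
    """Append one event to a JSONL transcript and return the record written.

    The field names here are hypothetical, not the autoswe schema.
    """
    record = {"ts": time.time(), "kind": kind, "payload": payload}
    transcript.parent.mkdir(parents=True, exist_ok=True)
    # One JSON object per line; append mode never rewrites earlier records.
    with transcript.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record


def read_records(transcript: Path) -> list[dict]:
    """Load every record back in write order, skipping blank lines."""
    with transcript.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

Because json.dumps escapes embedded newlines, multi-line model output still occupies a single transcript line, which keeps the file safe to tail or grep.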

The system operates within a tmux session on the GB10 hardware, where the orchestrator, writer, and judge logs are visible in separate panes 1. The orchestrator manages the loop state, while the writer harness routes all model calls to a local vLLM Qwen3.6 backend 2. The only outbound call from the system is to Codex GPT-5.5, which serves as the independent judge 1.


The orchestrator is managed by a Python TUI script located at scripts/autoswe.py 2. This script is symlinked to ~/.local/bin/autoswe for operator access. The loop lifecycle runs under a systemd user unit named autoswe.service.

The orchestrator maintains durable state in state.json within the runtime directory ~/.autoswe/rebuild/runtime/. Operator controls are implemented as sentinel files in that same directory, which hooks detect:

  • Soft Pause: Touching the PAUSE sentinel blocks Bash/Edit/Write at the next vertex boundary, allowing in-flight Read/Grep/Codex calls to complete.
  • Hard Pause: Touching the PAUSE_HARD sentinel blocks every tool type immediately.
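
The two pause levels can be sketched as a single check that a hook might run before each tool call. This is a minimal sketch, not the actual hook: the function name tool_allowed and the constant SOFT_BLOCKED are hypothetical, and the sketch does not model the "next vertex boundary" timing of the soft pause. Only the sentinel names and the tool groupings come from the description above.

```python
from pathlib import Path

# Tools that a soft pause blocks, per the description above.
SOFT_BLOCKED = {"Bash", "Edit", "Write"}


def tool_allowed(runtime_dir: Path, tool: str) -> bool:
    """Return False if a pause sentinel in runtime_dir blocks this tool.

    Hypothetical helper: a real hook would also have to decide when the
    soft pause takes effect (the "vertex boundary"), which is not modeled.
    """
    # Hard pause: every tool type is blocked immediately.
    if (runtime_dir / "PAUSE_HARD").exists():
        return False
    # Soft pause: only mutating tools are blocked; Read/Grep/Codex still run.
    if (runtime_dir / "PAUSE").exists() and tool in SOFT_BLOCKED:
        return False
    return True
```

Under these assumptions, tool_allowed(Path.home() / ".autoswe/rebuild/runtime", "Read") keeps returning True during a soft pause and returns False only once PAUSE_HARD exists.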

The operator can attach to the tmux session using autoswe attach or view the status using autoswe status 1.

The system uses a single local model, Qwen3.6, for authoring and planning, and an external model, Codex GPT-5.5, for judging 2. All five subagents run under the same harness (Claude Code → local Qwen3.6 via autosre claude).

Before any code is written, a same-model plan-review step occurs. A separate Qwen3.6 session reviews the plan against six checks: reinvention, minimum-sufficient, premature-abstraction, deliverable-size, dependency-hygiene, and test-plan. The verdict is either PROCEED, SCOPE_DOWN, or REWRITE.
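
One way to picture the plan-review result is as a small validated record. The sketch below is an assumption-heavy illustration: the six check names and three verdicts come from the text, but the PlanReview type, the parse_review function, the input dictionary shape, and the rule that every check must carry a finding are all hypothetical. The source does not say how the reviewer's output is structured or how the checks map to a verdict.

```python
from dataclasses import dataclass, field
from enum import Enum


class Verdict(Enum):
    PROCEED = "PROCEED"
    SCOPE_DOWN = "SCOPE_DOWN"
    REWRITE = "REWRITE"


# The six checks named in the text.
CHECKS = (
    "reinvention",
    "minimum-sufficient",
    "premature-abstraction",
    "deliverable-size",
    "dependency-hygiene",
    "test-plan",
)


@dataclass
class PlanReview:
    verdict: Verdict
    findings: dict[str, str] = field(default_factory=dict)


def parse_review(raw: dict) -> PlanReview:
    """Validate a reviewer response (hypothetical shape) into a PlanReview."""
    verdict = Verdict(raw["verdict"])  # raises ValueError on an unknown verdict
    findings = dict(raw.get("findings", {}))
    unknown = set(findings) - set(CHECKS)
    missing = set(CHECKS) - set(findings)
    # Assumption: a review is only usable if it addresses exactly the six checks.
    if unknown or missing:
        raise ValueError(f"unknown checks {sorted(unknown)}, missing checks {sorted(missing)}")
    return PlanReview(verdict=verdict, findings=findings)
```

Rejecting malformed reviews up front means the orchestrator only ever branches on one of the three documented verdicts.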

The judge roles do not use Qwen3.6; they orchestrate calls to Codex GPT-5.5 via Bash.