Architecture Overview

The autoswe system is an automated loop that rewrites a Python infrastructure-provisioning codebase into Go, one module at a time, running entirely on an NVIDIA DGX Spark (GB10) with local Qwen3.6 as the only writer and Codex GPT-5.5 as the sole independent judge ¹. The system iterates each module until a fixed parity matrix passes, capturing every byte of model I/O, every judge verdict, and every commit to local JSONL transcripts ². The architecture is divided into an operator interface, a local AI writer harness, an external AI judge, and a static site documentation pipeline.
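The transcript capture described above can be illustrated with a minimal sketch of append-only JSONL logging. The function names, the record fields (`ts`, `kind`, `payload`), and the file layout are assumptions made for illustration; the source does not specify the actual transcript schema.

```python
import json
import time
from pathlib import Path


def append_event(path: Path, kind: str, payload: dict) -> dict:
    """Append one event as a single JSON line and return the record written.

    The record shape is hypothetical, not the project's real schema.
    """
    record = {"ts": time.time(), "kind": kind, "payload": payload}
    path.parent.mkdir(parents=True, exist_ok=True)
    # Append mode keeps earlier events intact; one JSON object per line.
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record


def read_events(path: Path) -> list[dict]:
    """Read every event back, skipping blank lines."""
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

An append-only, line-per-record file is convenient here because a partially written final line cannot corrupt earlier records, and the log can be tailed in a tmux pane while the loop runs.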

High-Level System Design

The system operates within a tmux session on the GB10 hardware, where the orchestrator, writer, and judge logs are visible in separate panes ¹. The orchestrator manages the loop state, while the writer harness routes all model calls to a local vLLM Qwen3.6 backend ². The only outbound call from the system is to Codex GPT-5.5, which serves as the independent judge ¹.
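As a rough sketch of the loop the orchestrator manages, the following walks a module through the stage names shown in the README diagram and repeats until a parity check passes. The callables `run_stage` and `parity_passes`, the `max_iterations` cap, and the point at which parity is evaluated are all assumptions; the source only states that each module iterates until the parity matrix passes.

```python
from typing import Callable

# Stage names copied from the README diagram; everything else is hypothetical.
STAGES = [
    "PLAN", "WRITE", "STATIC CHECK", "SANDBOX TEST",
    "MUTATION TEST", "REVIEW", "DECIDE", "COMMIT + RETRO",
]


def run_module(
    module: str,
    run_stage: Callable[[str, str], bool],
    parity_passes: Callable[[str], bool],
    max_iterations: int = 10,
) -> int:
    """Iterate the stages until parity passes; return the iterations used.

    run_stage returns False to abandon the current iteration early.
    Raises RuntimeError if parity is not reached within max_iterations.
    """
    for iteration in range(1, max_iterations + 1):
        for stage in STAGES:
            if not run_stage(module, stage):
                break  # a stage failed: start a fresh iteration
        else:
            # Every stage completed; check parity (assumed to happen here).
            if parity_passes(module):
                return iteration
    raise RuntimeError(f"{module}: parity not reached in {max_iterations} iterations")
```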

The Orchestrator and Operator Interface

The orchestrator is managed by a Python TUI script located at scripts/autoswe.py ². This script is symlinked to ~/.local/bin/autoswe for operator access. The loop lifecycle runs under a systemd user unit named autoswe.service.

The orchestrator maintains durable state in state.json within the runtime directory ~/.autoswe/rebuild/runtime/. Operator controls are implemented via sentinel files in the runtime directory, which are detected by hooks.

Soft Pause: Touching the PAUSE sentinel blocks Bash/Edit/Write at the next vertex boundary, allowing in-flight Read/Grep/Codex calls to complete.
Hard Pause: Touching the PAUSE_HARD sentinel blocks every tool type immediately.
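The two pause levels can be sketched as a small hook predicate. This is an illustration under stated assumptions: the function name `is_blocked` and the way the runtime directory is passed in are hypothetical, and the "next vertex boundary" timing of the soft pause is not modelled; only the tool groupings come from the description above.

```python
from pathlib import Path

# Tools that a soft pause blocks, per the description above.
SOFT_BLOCKED = {"Bash", "Edit", "Write"}


def is_blocked(tool: str, runtime_dir: Path) -> bool:
    """Return True if a tool call should be refused given the sentinels present."""
    if (runtime_dir / "PAUSE_HARD").exists():
        return True  # hard pause: every tool type is blocked
    if (runtime_dir / "PAUSE").exists():
        return tool in SOFT_BLOCKED  # soft pause: only Bash/Edit/Write
    return False
```

Checking the hard sentinel first means that when both files exist, the stricter rule applies.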

The operator can attach to the tmux session using autoswe attach or view the status using autoswe status ¹.

AI Agents and Subagents

The system uses a single local model, Qwen3.6, for authoring and planning, and an external model, Codex GPT-5.5, for judging ². All five subagents run under the same harness (Claude Code → local Qwen3.6 via autosre claude).

Writer and Planner

Before any code is written, every iteration passes a same-model plan-review step. Qwen3.6 drafts a plan, and a separate Qwen3.6 session, running in fresh context with the plan-self-review skill mounted, reviews it against six checks: reinvention, minimum-sufficient, premature-abstraction, deliverable-size, dependency-hygiene, and test-plan. The reviewer returns one of three verdicts: PROCEED, SCOPE_DOWN, or REWRITE.
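One way the orchestrator might consume a plan review is sketched below. The three verdict tokens and the six check names come from the text; the parsing rules (take the last verdict token found, report checks the review never mentions) and the names `parse_verdict` and `missing_checks` are assumptions, since the source does not describe the review output format.

```python
import re
from enum import Enum


class Verdict(Enum):
    PROCEED = "PROCEED"
    SCOPE_DOWN = "SCOPE_DOWN"
    REWRITE = "REWRITE"


CHECKS = (
    "reinvention", "minimum-sufficient", "premature-abstraction",
    "deliverable-size", "dependency-hygiene", "test-plan",
)

_VERDICT_RE = re.compile(r"\b(PROCEED|SCOPE_DOWN|REWRITE)\b")


def parse_verdict(review_text: str) -> Verdict:
    """Return the last verdict token in the review; raise if none is present."""
    matches = _VERDICT_RE.findall(review_text)
    if not matches:
        raise ValueError("review contains no recognised verdict")
    return Verdict(matches[-1])


def missing_checks(review_text: str) -> list[str]:
    """List the checks the review text never mentions (hypothetical sanity check)."""
    lowered = review_text.lower()
    return [name for name in CHECKS if name not in lowered]
```

Failing loudly on a missing verdict, rather than defaulting to PROCEED, keeps a malformed review from silently letting a plan through.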

Judges

The judge roles do not rely on Qwen3.6 for their verdicts; they orchestrate calls to Codex GPT-5.5 via Bash.

Static Site Documentation Pipeline

README.md L1-71 (showing 40 of 71)

# autoswe

> **Disclaimer: unofficial and unsupported.** Provided for testing and
> evaluation only, on an "AS IS" basis, with no warranty and no support. Not
> affiliated with or endorsed by Dell. See [DISCLAIMER.md](DISCLAIMER.md).

Wiki: https://sddcinfo.github.io/autoswe/


An automated loop that rewrites a Python infrastructure-provisioning codebase into Go, one module at a time, running entirely on an NVIDIA DGX Spark (GB10) with **local Qwen3.6** as the only writer and **Codex GPT-5.5** as the sole independent judge. It iterates each module until a fixed parity matrix passes.

> **Status:** under active development. This README is the operator quickstart for GB10.

## How it works

```
GB10 (NVIDIA DGX Spark, ARM64)

  tmux session 'sddc-rebuild'
    pane 0: orchestrator log
    pane 1: writer transcript
    pane 2: judge verdicts
    pane 3: GPU + translate watch
        |
        v
  writer harness  (Claude Code -> local Qwen3.6 on :8011 via vLLM)
        |
        v
  /loop sddc-rebuild:
    PLAN -> WRITE -> STATIC CHECK -> SANDBOX TEST
         -> MUTATION TEST -> REVIEW (codex) -> DECIDE -> COMMIT + RETRO
        |
        v  (only outbound call)
  Codex  (OpenAI, GPT-5.5)
```

## Operator quickstart

```bash
# one-time
```

AGENTS.md L1-120 (showing 40 of 120)

# AGENTS.md - autoswe

This repo is the **scaffold** for an autonomous self-improving Go rewrite. The rewrite output lives in a separate repo (`$SDDC_OUT`). This file is the canonical agent contract - Claude Code, Codex CLI, and opencode all read it.

`CLAUDE.md` is a symlink to this file so the Claude Code harness picks it up.

---

## What this system does

A local Qwen3.6 model on GB10 rewrites the existing Python `sddcinfo` codebase into a new Go project, one module at a time, iterating until a non-negotiable parity matrix is fully green. Codex GPT-5.5 (high reasoning) is the **sole external judge** - it never authors code, only critiques what Qwen produced.

The loop runs autonomously in a tmux session that humans can attach to locally on GB10 or via SSH. Every byte of model I/O, every judge verdict, every commit is captured to local JSONL transcripts. The accumulated journey docs (`docs/journey/iter-NNNN.md`) become an educational training artifact.

Full design: see internal design notes.

## Less is more - the supervision principle

Every iteration is gated by a **same-model plan-review step** before any code
is written. Qwen3.6 drafts a plan; a separate Qwen3.6 session, in fresh
context with the `plan-self-review` skill mounted, reviews it against six
checks: reinvention, minimum-sufficient, premature-abstraction,
deliverable-size, dependency-hygiene, test-plan. It returns PROCEED,
SCOPE_DOWN, or REWRITE.

This is the cheapest place to enforce simplicity - a plan revision is
hundreds of tokens; a code-and-tests-and-mutation revision is thousands plus
a Codex burn. **Same-model self-review of a plan, before code, catches the
dominant failure mode (over-engineering) at the cheapest possible step.**

The `simplicity` rubric dimension (in `rubrics/code-quality.yaml`) is a
backstop on the OUTPUT side: even if a plan slips through PLAN_REVIEW, the
diff itself gets scored. The retro judge's `Complexity delta` section in
each journey entry is the longitudinal signal: are plan-review loops actually
firing? Are simplicity scores trending up? If not, the inner loop isn't
doing its job and the rubric/skill needs sharpening.

See `steering-rules/feedback_less_is_more.md`
for the parent rule this whole machinery enforces.