Agent Swarm & Demos

The autosre swarm orchestration layer launches multi-agent Claude Code teams through a single launcher that hides provider and execution-mode differences. It supports two providers: local, which routes Claude Code through an Anthropic-compatible proxy to a local vLLM, llama.cpp, or Ollama backend, and anthropic, which targets the online API. In both cases the launcher isolates the environment by purging inherited ANTHROPIC_* variables ¹. Agents are instantiated from pre-defined TaskTemplate configurations, each of which assigns specialized roles, such as Security Reviewer or Incident Commander, to a fixed number of agents ². Execution runs in either interactive mode, which replaces the current process for live debugging, or eval mode, which spawns a subprocess and captures a structured JSON transcript for later analysis ¹.

Provider Orchestration and Environment Isolation

The SwarmLauncher class manages the lifecycle of agent teams by constructing the appropriate environment and command-line arguments for the claude CLI. It supports two provider modes: local and anthropic. In both modes, the launcher purges all ANTHROPIC_* environment variables (such as ANTHROPIC_API_KEY, ANTHROPIC_BASE_URL, and ANTHROPIC_MODEL) to prevent configuration leakage between modes.

For the local provider, the launcher re-applies environment overrides from the configured Backend instance and sets ANTHROPIC_API_KEY to a dummy value (local-vllm), as the local proxy does not validate the key. For the anthropic provider, all override keys remain unset, allowing Claude Code to fall back to its native authentication configuration stored in ~/.claude.
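The purge-then-reapply behavior described above can be sketched as a small pure function. This is a minimal illustration, not the code in launcher.py: the function name build_env and the backend_overrides parameter are hypothetical, and the real launcher obtains its overrides from the configured Backend instance.

```python
from __future__ import annotations

import os
from typing import Literal, Mapping

Provider = Literal["local", "anthropic"]


def build_env(
    provider: Provider,
    backend_overrides: Mapping[str, str],
    base_env: Mapping[str, str] | None = None,
) -> dict[str, str]:
    """Return a child environment with ANTHROPIC_* isolation applied.

    Hypothetical helper: the name and signature are assumptions.
    """
    source = os.environ if base_env is None else base_env
    # Step 1 (both providers): drop every inherited ANTHROPIC_* variable.
    env = {k: v for k, v in source.items() if not k.startswith("ANTHROPIC_")}
    if provider == "local":
        # Step 2 (local only): re-apply the backend's overrides, then set the
        # placeholder key, which the local proxy does not validate.
        env.update(backend_overrides)
        env["ANTHROPIC_API_KEY"] = "local-vllm"
    # For "anthropic", nothing is re-added, so Claude Code falls back to its
    # native auth configuration.
    return env


# Example shell state with stale overrides left over from an earlier session.
shell = {
    "PATH": "/usr/bin",
    "ANTHROPIC_BASE_URL": "https://stale.example",
    "ANTHROPIC_MODEL": "some-old-model",
}
local_env = build_env("local", {"ANTHROPIC_BASE_URL": "http://127.0.0.1:8000"}, shell)
online_env = build_env("anthropic", {}, shell)
```

Passing base_env explicitly keeps the function easy to test; when it is omitted, the sketch reads from os.environ.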

Execution Modes: Interactive vs. Eval

The launcher supports two execution modes determined by the presence of an EvalLaunchSpec.

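Interactive Mode

In interactive mode (the default), the launcher replaces the current process with Claude Code via os.execvpe. This is the behavior autosre swarm launch has always had, and it suits live debugging because the terminal is handed directly to the agent session.

Eval Mode

In eval mode, the launcher spawns a subprocess via subprocess.Popen. It sets the working directory to a read-only snapshot of the target repository (worktree_path) and passes the rendered suite prompt via the --print argument. The subprocess is run with --output-format=stream-json, --verbose, and --include-partial-messages so that structured output is captured in a transcript file (transcript.jsonl).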
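As a rough sketch of the eval-mode invocation, the command line and the transcript capture might be assembled as follows. The helper names build_eval_argv and run_eval are hypothetical; only the flags, the use of subprocess.Popen, and the piping of stdout to a transcript file come from the description above. How the real launcher handles stderr is not shown in the source, so this sketch leaves it inherited.

```python
from __future__ import annotations

import subprocess
from pathlib import Path
from typing import Mapping, Sequence


def build_eval_argv(prompt: str, claude_bin: str = "claude") -> list[str]:
    """Assemble the eval-mode command line (hypothetical helper)."""
    return [
        claude_bin,
        "--print",
        prompt,
        "--output-format=stream-json",
        "--verbose",
        "--include-partial-messages",
    ]


def run_eval(
    argv: Sequence[str],
    worktree_path: Path,
    transcript_path: Path,
    env: Mapping[str, str] | None = None,
) -> int:
    """Run argv inside the snapshot, piping stdout to the transcript file."""
    with open(transcript_path, "wb") as transcript:
        proc = subprocess.Popen(
            list(argv),
            cwd=worktree_path,   # read-only snapshot of the target repository
            stdout=transcript,   # every streamed JSON line lands in the file
            env=None if env is None else dict(env),
        )
        return proc.wait()


argv = build_eval_argv("Review this repository for security issues.")
```

Keeping argv construction separate from process spawning means the command line can be inspected or logged without launching Claude Code.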

Eval mode uses a restricted settings profile that denies Bash execution and allows only file-inspection tools (Read, Glob, Grep) plus a narrow set of write/edit capabilities. The read-only constraint itself is enforced at the filesystem layer by autosre/eval/snapshot.py, which sets worktree files to mode 0o444 (read-only for owner, group, and others). The launcher also injects an x-autosre-run-id header into API requests via ANTHROPIC_CUSTOM_HEADERS so that proxy logs can be tagged with the run.
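The two mechanisms in this paragraph, filesystem-level read-only enforcement and run tagging, can be illustrated with the sketch below. Both function names are hypothetical and the code is not taken from snapshot.py or launcher.py. The "Name: Value" layout of ANTHROPIC_CUSTOM_HEADERS is an assumption based on how Claude Code documents that variable, and the sketch assumes a POSIX filesystem.

```python
from __future__ import annotations

import stat
import uuid
from pathlib import Path

READ_ONLY = 0o444  # r--r--r--


def make_read_only(worktree: Path) -> int:
    """Set every regular file under worktree to 0o444; return the file count."""
    count = 0
    for path in worktree.rglob("*"):
        # Skip directories and symlinks; only regular files are changed.
        if path.is_file() and not path.is_symlink():
            path.chmod(READ_ONLY)
            count += 1
    return count


def tag_run(env: dict[str, str], run_id: str | None = None) -> dict[str, str]:
    """Return a copy of env carrying the run-id header (assumed format)."""
    run_id = run_id or uuid.uuid4().hex
    tagged = dict(env)
    tagged["ANTHROPIC_CUSTOM_HEADERS"] = f"x-autosre-run-id: {run_id}"
    return tagged


def mode_of(path: Path) -> int:
    """Return just the permission bits of path, e.g. 0o444."""
    return stat.S_IMODE(path.stat().st_mode)
```

Directories are left writable in this sketch, so files could still be created or deleted inside them; whether the real snapshot also restricts directories is not stated in the source.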

Task Templates and Agent Roles

Agent swarms are configured using TaskTemplate dataclasses that define the number of agents, their roles, and the initial prompt ². The SwarmLauncher accepts a TaskTemplate to format the initial prompt with role descriptions.

The repository includes several pre-defined templates:

code-review: 4 agents (Security Reviewer, Performance Analyst, Code Quality, Documentation & Tests).
architecture-analysis: 3 agents (Scalability Architect, Security Architect, Maintainability Architect).
incident-response: 5 agents (Incident Commander, Log Analyst, Metrics Analyst, Root Cause Analyst, Remediation Engineer).
content-generation: 3 agents (Researcher, Writer, Editor/Reviewer).
data-analysis: 3 agents (Data Explorer, Visualization Specialist, Insights Analyst).
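To show what a formatted prompt looks like, the snippet below reproduces the TaskTemplate dataclass from templates.py and applies it to a stand-in incident-response template. The role names match the list above, but the description and initial_prompt strings are illustrative placeholders, not the values shipped in the repository.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class TaskTemplate:
    """Mirror of the dataclass in autosre/swarm/templates.py."""

    name: str
    description: str
    num_agents: int
    agent_roles: list[str]
    initial_prompt: str

    def format_prompt(self) -> str:
        """Append a numbered role list to the initial prompt."""
        roles_text = "\n".join(
            f"- Agent {i + 1}: {role}" for i, role in enumerate(self.agent_roles)
        )
        return f"{self.initial_prompt}\n\nAgent roles:\n{roles_text}"


# Placeholder strings below are assumptions made for illustration only.
demo = TaskTemplate(
    name="incident-response",
    description="Illustrative stand-in for the shipped template",
    num_agents=5,
    agent_roles=[
        "Incident Commander",
        "Log Analyst",
        "Metrics Analyst",
        "Root Cause Analyst",
        "Remediation Engineer",
    ],
    initial_prompt="Investigate the elevated 5xx rate on the checkout service.",
)

# Sanity check: the declared agent count should match the role list.
assert demo.num_agents == len(demo.agent_roles)

print(demo.format_prompt())
# Investigate the elevated 5xx rate on the checkout service.
#
# Agent roles:
# - Agent 1: Incident Commander
# - Agent 2: Log Analyst
# - Agent 3: Metrics Analyst
# - Agent 4: Root Cause Analyst
# - Agent 5: Remediation Engineer
```

The dataclass shown in the excerpt does not itself check that num_agents equals the number of roles, so a check like the assertion above is one way to catch a mismatch when defining new templates.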

In eval mode, the system prompt is constructed by combining the suite’s specific reviewer persona (if provided) with harness rules that enforce the read-only contract and output file constraints ¹.

Demo Framework and SRE Scenarios

The autosre.demos package provides an enterprise demo framework for showcasing SRE capabilities ³. It exposes DemoRunner, DemoScenario, and DemoPhase classes to orchestrate live presentations. The framework supports audience-adaptive scenarios via AUDIENCE_PROFILES and AudienceProfile.

The demo framework integrates with the swarm orchestration layer to drive vLLM backends and Claude Code agent swarms during live presentations. This allows for the demonstration of complex SRE tasks, such as incident response simulations, using the multi-agent templates defined in the swarm package ². The DemoRunner likely coordinates the execution of these scenarios, leveraging the SwarmLauncher to manage the underlying agent processes ³.

autosre/swarm/launcher.py L1-120 (showing 40 of 120)

"""Swarm launcher - launches Claude Code agent teams with task templates.

Supports two providers:

- ``local`` (default): routes Claude Code at the local vLLM/llama.cpp/Ollama
  backend through the existing Anthropic-compatible proxy. This is the
  behavior autosre has always had.
- ``anthropic``: launches Claude Code against the online Anthropic API using
  Claude Code's own native auth. No environment variables are injected; all
  ``ANTHROPIC_*`` overrides the user may have in their shell are purged so
  that the online session does not accidentally point at a local proxy.

Supports two execution modes:

- Interactive (default): replaces the current process with
  ``os.execvpe``. This is what ``autosre swarm launch`` has always done.
- Eval (``eval_mode`` set): runs Claude Code as a subprocess with stdout
  piped to a transcript file. Used by ``autosre eval run`` so every turn
  can be captured and later normalized into ``turns.jsonl``.
"""

from __future__ import annotations

import json
import os
import shutil
import subprocess
import tempfile
import time
import uuid
from dataclasses import dataclass, field
from pathlib import Path
from typing import TYPE_CHECKING, Any, Literal

import click

from autosre.claude_version import require_claude_version
from autosre.models import OPUS_1M

if TYPE_CHECKING:

autosre/swarm/templates.py L1-109 (showing 40 of 109)

"""Pre-defined task templates for agent swarms."""

from __future__ import annotations

from dataclasses import dataclass


@dataclass
class TaskTemplate:
    """Pre-defined task template for agent swarm launches.

    Provides structured initial prompts and role assignments
    for Claude Code agent teams.
    """

    name: str
    description: str
    num_agents: int
    agent_roles: list[str]
    initial_prompt: str

    def format_prompt(self) -> str:
        """Format the initial prompt with agent role descriptions."""
        roles_text = "\n".join(
            f"- Agent {i + 1}: {role}" for i, role in enumerate(self.agent_roles)
        )
        return f"{self.initial_prompt}\n\nAgent roles:\n{roles_text}"


TASK_TEMPLATES: dict[str, TaskTemplate] = {
    "code-review": TaskTemplate(
        name="code-review",
        description="Multi-agent code review from security, performance, quality, and documentation perspectives",
        num_agents=4,
        agent_roles=[
            "Security Reviewer - identify vulnerabilities, injection risks, auth issues",
            "Performance Analyst - find bottlenecks, memory leaks, inefficient patterns",
            "Code Quality - assess readability, maintainability, design patterns",
            "Documentation & Tests - check coverage, missing tests, unclear docs",
        ],

autosre/demos/__init__.py L1-18

"""Enterprise demo framework for GB10 showcases.

Provides audience-adaptive demo scenarios that orchestrate
vLLM backends and Claude Code agent swarms for live presentations.
"""

from autosre.demos.audience import AUDIENCE_PROFILES, AudienceProfile
from autosre.demos.runner import DemoRunner
from autosre.demos.scenario import DemoPhase, DemoScenario

__all__ = [
    "AUDIENCE_PROFILES",
    "AudienceProfile",
    "DemoPhase",
    "DemoRunner",
    "DemoScenario",
]