pptcraft

ppt-craft is an unofficial, local “Claude-for-PowerPoint” add-in designed for the GB10 architecture that runs directly inside PowerPoint (online or desktop) ¹. It replaces external API dependencies by leveraging a local Qwen3.6 vLLM backend to orchestrate AI-driven generation of presentation decks. The system enables users to interact with a Claude-style taskpane for chat and slide management while the backend handles the complex rendering and OOXML manipulation.

The architecture relies on a shared vLLM endpoint to manage the heavy computational load of the 35B parameter model, which requires approximately 35 GB of VRAM and a 3-minute cold-load time. By sharing this instance across multiple clients, ppt-craft optimizes resource usage while providing a REST API interface that connects the Office.js add-in to the Python-pptx and lxml-based rendering engine.

Subsystem	Description
UI Clients	The PowerPoint host environment running the Office.js taskpane SPA for chat and slide tree interaction.
REST API	The FastAPI server running on port 3030 that handles requests and manages the connection between the UI and the backend.
Local LLM	The shared vLLM endpoint hosting the Qwen3.6-FP8 model, managed by autosre and accessible at localhost:8010.
Rendering Engine	The core logic using python-pptx and lxml to unpack, edit, and validate OOXML structures for slide generation.

README.md L1-111 (showing 40 of 111)

# ppt-craft

> **Disclaimer: unofficial and unsupported.** Provided for testing and
> evaluation only, on an "AS IS" basis, with no warranty and no support. Not
> affiliated with or endorsed by Dell. See [DISCLAIMER.md](DISCLAIMER.md).

Wiki: https://sddcinfo.github.io/pptcraft/


Local "Claude-for-PowerPoint" add-in for the GB10 - runs **inside
PowerPoint** (online or desktop), powered by the **local Qwen3.6 vLLM
backend** instead of Anthropic's API.

```
┌────────────────────────────────────────────────────────────────────┐
│ PowerPoint host (Online in browser, OR desktop on Mac/Windows) │
│ ┌────────────────────────────────────────────────────────────┐ │
│ │ Claude-style taskpane (Office.js SPA) │ │
│ │ chat panel · live slide tree · action bar │ │
│ │ Office.js: insertSlidesFromBase64, getFileAsync, … │ │
│ └────────────────────────────┬───────────────────────────────┘ │
└──────────────────────────────┬│────────────────────────────────────┘
                               ││ HTTPS (mkcert local CA) / WSS / REST
┌──────────────────────────────▼▼────────────────────────────────────┐
│ GB10 - ppt-craft serve (FastAPI on :3030, HTTPS) │
│ python-pptx + lxml on OOXML │
│ shared vLLM @ localhost:8010 (Qwen3.6-FP8) │
└────────────────────────────────────────────────────────────────────┘
```

## Backend sharing

`ppt-craft` reuses the **same shared vLLM endpoint** as
`meeting-scribe` and `autosre` - `http://localhost:8010` by default
(autosre-managed `Qwen/Qwen3.6-35B-A3B-FP8`). One model, three clients.
The 35B takes ~3 min to cold-load and needs ~35 GB VRAM, so the three
clients share one instance rather than each loading their own.

Each consumer can run **independently** by pointing at a different URL: