Architecture Overview
The gb10-provision system is a self-contained, state-free PXE/onboarding control plane designed to bring a fresh NVIDIA GB10 (DGX Spark) appliance online over an isolated provisioning LAN. It operates as a set of container images and a Helm chart deployed on the appliance’s own k3s cluster. The architecture relies on an isolated-LAN security model where the API binds only to the provisioning interface and assumes the network is physically or logically isolated. The system is state-free, meaning no site inventory, credentials, MACs, or operator data are stored in the repository; all host-specific information is supplied at deploy time via Helm values or environment variables.
Core Components
The system consists of three primary components: two container images and a Helm chart that orchestrates them:
- gb10-dnsmasq: Provides DHCP and TFTP/PXE services specifically for the isolated provisioning network.
- gb10-provision-api: A FastAPI-based control plane that serves cloud-init and autoinstall configurations. It tracks per-client onboarding state and renders user-data.
- Helm Chart (gb10-provision): Wires dnsmasq, nginx, and the API together.
Data Flow and Interaction
The provisioning process involves the interaction between the client appliance and the control plane components. The gb10-dnsmasq component handles the initial network discovery via DHCP and TFTP/PXE. The gb10-provision-api serves the necessary configuration files, such as cloud-init and autoinstall seeds, to the client. The Helm chart ensures these components are correctly integrated within the k3s environment.
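The per-client onboarding state that the API tracks can be pictured with a minimal in-memory sketch. Everything here is an assumption for illustration: the `Stage` names, the `OnboardingTracker` class, and the use of an opaque client identifier as the key are hypothetical and are not taken from the gb10-provision-api source.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional


class Stage(Enum):
    # Hypothetical stage names; the real API may track different states.
    DISCOVERED = "discovered"    # client has PXE-booted via dnsmasq
    SEED_SERVED = "seed_served"  # cloud-init/autoinstall seed was fetched
    ONBOARDED = "onboarded"      # client reported that install finished


@dataclass
class OnboardingTracker:
    """In-memory per-client state, keyed by an opaque client identifier."""

    clients: Dict[str, Stage] = field(default_factory=dict)

    def advance(self, client_id: str, stage: Stage) -> Stage:
        # Record the latest stage seen for this client.
        self.clients[client_id] = stage
        return stage

    def stage_of(self, client_id: str) -> Optional[Stage]:
        # Unknown clients have no recorded stage yet.
        return self.clients.get(client_id)
```

Keeping this state in memory rather than in the repository is consistent with the state-free principle: nothing about a site's clients is committed to source control.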
Design Principles
The architecture is guided by three key design principles:
- State-free: The system does not maintain any persistent state regarding site inventory, credentials, MAC addresses, or operator data. All host-specific data is provided dynamically at deployment time.
- Isolated-LAN Security: The API binds exclusively to the provisioning interface (e.g., `--host $(LAN_IP)`) and never to `0.0.0.0`. It runs unprivileged and assumes the provisioning network is isolated; it is not intended for exposure on a routable network.
- Strict Templating: Jinja rendering uses `StrictUndefined`, so any missing variable causes a loud failure rather than generating a broken autoinstall seed.
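The "never `0.0.0.0`" rule from the isolated-LAN principle could be enforced with a small startup guard. This is a sketch under stated assumptions: `validate_bind_host` is a hypothetical helper and is not part of the project, and it uses only the standard-library `ipaddress` module.

```python
import ipaddress


def validate_bind_host(host: str) -> str:
    """Return host unchanged if it is a concrete IP address, else raise ValueError."""
    # ip_address() itself raises ValueError for input that is not an IP literal.
    addr = ipaddress.ip_address(host)
    # is_unspecified is True for the wildcard addresses 0.0.0.0 and ::.
    if addr.is_unspecified:
        raise ValueError(f"refusing to bind to wildcard address {host}")
    return host
```

A deployment might call such a guard on the value of `LAN_IP` before starting the server, so that a misconfigured value fails at startup instead of silently exposing the API on every interface.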
# gb10-provision
> **Disclaimer: unofficial and unsupported.** Provided for testing and
> evaluation only, on an "AS IS" basis, with no warranty and no support. Not
> affiliated with or endorsed by Dell. See [DISCLAIMER.md](DISCLAIMER.md).
Wiki: https://sddcinfo.github.io/provisioning/
A self-contained, state-free PXE/onboarding **control plane** for bringing a fresh
NVIDIA GB10 (DGX Spark) appliance online over an isolated provisioning LAN.
It ships as container images + a Helm chart deployed on the appliance's own k3s:
```
containers/
  gb10-dnsmasq/         # DHCP + TFTP/PXE for the isolated provisioning network
  gb10-provision-api/   # FastAPI control-plane: serves cloud-init/autoinstall,
                        # tracks per-client onboarding state, renders user-data
helm/
  gb10-provision/       # Helm chart wiring dnsmasq + nginx + the API together
```
## Design
- **State-free.** No site inventory, credentials, MACs, or operator data live in this
repo. Everything host-specific is supplied at deploy time via Helm values / env.
- **Isolated-LAN security model.** The API binds only to the provisioning interface
(`--host $(LAN_IP)`, never `0.0.0.0`) and runs unprivileged. It assumes the
provisioning network is physically/logically isolated; it is **not** intended to be
exposed on a routable network.
- **Strict templating.** Jinja rendering uses `StrictUndefined`, so a missing variable
fails loudly rather than shipping a broken autoinstall seed.
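
The strict-templating behaviour can be illustrated with a short Jinja sketch. It assumes the `jinja2` package is installed; the one-line template and the `render_seed` helper are hypothetical stand-ins, not the project's actual autoinstall template or rendering code.

```python
from jinja2 import Environment, StrictUndefined, UndefinedError

# StrictUndefined makes any use of a missing variable raise at render time.
env = Environment(undefined=StrictUndefined)

# Hypothetical one-line template standing in for a real autoinstall seed.
template = env.from_string("hostname: {{ hostname }}")


def render_seed(**values: str) -> str:
    """Render the seed; raises UndefinedError if a variable is missing."""
    return template.render(**values)


if __name__ == "__main__":
    print(render_seed(hostname="gb10-node"))
    try:
        render_seed()  # no hostname supplied
    except UndefinedError as exc:
        print(f"render failed loudly: {exc}")
```

With Jinja's default `Undefined`, the second call would instead render an empty value and produce a seed with a blank hostname, which is the failure mode this design choice avoids.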
## Deploy
```bash
# Import the images into the appliance's k3s containerd, then:
helm upgrade --install gb10-provision ./helm/gb10-provision \
--namespace gb10-provision --create-namespace \