
Helm Deployment

The gb10-provision Helm chart deploys a PXE provisioning stack consisting of dnsmasq, nginx, and a provision-api service, designed specifically for bare-metal GB10 onboarding 1. Because PXE-booting devices lack an IP address and require direct LAN access, all pods utilize hostNetwork: true to bind directly to the physical LAN interface rather than using ClusterIP services 2. The architecture relies on a shared lan-setup initContainer to dynamically discover the live interface IP at runtime, ensuring that services bind to the correct address even after network changes or reboots 3.

The chart is defined in Chart.yaml as an application type with version 0.1.0 and app version 1.0.0 1. Configuration is managed via values.yaml, which requires the lanIface (the physical interface name) to be set explicitly, as there is no default 2. The lanIp defaults to 10.88.0.1 but is primarily used in “direct” mode to assert the IP on the interface.

Key configuration values include:

  • provisionPort: Defaults to 8088, serving as the single source of truth for the API’s bind port, container port, and nginx upstream.
  • nginxHttpPort: Defaults to 80, the port nginx serves HTTP content on.
  • dataPath: Defaults to /data/provision, a hostPath volume pre-populated by the CLI.
  • dnsmasq.mode: Can be direct (full DHCP on isolated LAN) or proxy (proxy-DHCP on shared L2 switch).
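As a rough illustration of how these values fit together, the sketch below mirrors them in a small Python dataclass with the defaults listed above. It is not the chart's schema: the class name, the Python field names, and the validation rules are assumptions made for this example, and `eth0` in the usage note is a placeholder interface name.

```python
import ipaddress
from dataclasses import dataclass

VALID_MODES = {"direct", "proxy"}


@dataclass(frozen=True)
class ProvisionValues:
    """Illustrative mirror of the values described above (hypothetical names)."""
    lan_iface: str                      # lanIface: required, no default
    dnsmasq_mode: str                   # dnsmasq.mode: "direct" or "proxy"
    lan_ip: str = "10.88.0.1"           # lanIp
    provision_port: int = 8088          # provisionPort
    nginx_http_port: int = 80           # nginxHttpPort
    data_path: str = "/data/provision"  # dataPath

    def __post_init__(self):
        if not self.lan_iface:
            raise ValueError("lanIface must be set explicitly")
        if self.dnsmasq_mode not in VALID_MODES:
            raise ValueError(f"dnsmasq.mode must be one of {sorted(VALID_MODES)}")
        # Raises a ValueError subclass if the address is malformed.
        ipaddress.IPv4Address(self.lan_ip)
        for port in (self.provision_port, self.nginx_http_port):
            if not 0 < port < 65536:
                raise ValueError(f"port out of range: {port}")
```

Constructing `ProvisionValues(lan_iface="eth0", dnsmasq_mode="direct")` succeeds with the documented defaults, while an empty `lan_iface` is rejected, matching the rule that the interface has no default.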

To handle dynamic IP addressing without baking IPs into manifests, the chart uses a shared lan-setup initContainer defined in _helpers.tpl 3. This initContainer runs in every pod and writes the live bind IP to a shared emptyDir volume at /run/lan/ip.

The behavior of lan-setup depends on the dnsmasq.mode:

  • Direct Mode: The initContainer asserts the configured lanIp on the interface using NET_ADMIN capabilities, then publishes that IP.
  • Proxy Mode: The initContainer waits for the interface to acquire any IPv4 address via DHCP lease, then publishes that live IP.

The main containers (nginx, provision-api) mount this volume read-only and read /run/lan/ip at startup to bind to the correct address 4 5.
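The handoff between lan-setup and the main containers can be sketched in Python as follows. This is a minimal model under stated assumptions, not the chart's initContainer script: the function names are hypothetical, the list of interface addresses is passed in rather than discovered, and a temporary directory stands in for /run/lan.

```python
import ipaddress
import tempfile
from pathlib import Path
from typing import List, Optional


def choose_bind_ip(mode: str, lan_ip: str, iface_ipv4: List[str]) -> Optional[str]:
    """Pick the IP lan-setup would publish, or None if it should keep waiting."""
    if mode == "direct":
        # Direct mode: the configured lanIp is asserted on the interface and published.
        return lan_ip
    if mode == "proxy":
        # Proxy mode: publish whatever IPv4 address the interface has acquired, if any.
        return iface_ipv4[0] if iface_ipv4 else None
    raise ValueError(f"unknown dnsmasq.mode: {mode!r}")


def publish_ip(run_dir: Path, ip: str) -> None:
    """Write the bind IP to <run_dir>/ip (the shared emptyDir in the chart)."""
    ipaddress.IPv4Address(ip)  # refuse to publish a malformed address
    (run_dir / "ip").write_text(ip + "\n")


def read_bind_ip(run_dir: Path) -> str:
    """What a main container would do at startup: read and validate the IP."""
    ip = (run_dir / "ip").read_text().strip()
    ipaddress.IPv4Address(ip)
    return ip


if __name__ == "__main__":
    run_dir = Path(tempfile.mkdtemp())  # stand-in for /run/lan
    publish_ip(run_dir, choose_bind_ip("direct", "10.88.0.1", []))
    print(read_bind_ip(run_dir))
```

Because initContainers finish before the main containers start, the reader in this sketch does not retry; it assumes the file is already present.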


The nginx deployment serves static assets (boot chain, images, repos) and proxies dynamic requests to the API 4. It uses a ConfigMap to store the nginx configuration template, which includes ${LAN_IP} as a placeholder. At container start, envsubst replaces ${LAN_IP} with the live IP from /run/lan/ip before starting nginx. The deployment uses a Recreate strategy to ensure only one pod binds the LAN ports at a time. A checksum annotation on the pod template triggers a rollout if the nginx configuration content changes.
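The substitution step can be illustrated with a few lines of Python. The sketch replaces only the ${LAN_IP} placeholder and leaves nginx's own $variables alone; whether the chart restricts envsubst to that one variable is an assumption here, and the template text is invented for the example rather than taken from the chart's ConfigMap.

```python
def render_nginx_conf(template: str, lan_ip: str) -> str:
    """Replace the ${LAN_IP} placeholder and nothing else."""
    return template.replace("${LAN_IP}", lan_ip)


# Hypothetical template fragment; the real ConfigMap content may differ.
TEMPLATE = (
    "listen ${LAN_IP}:80;\n"
    "proxy_set_header Host $host;\n"
    "proxy_pass http://${LAN_IP}:8088;\n"
)

if __name__ == "__main__":
    print(render_nginx_conf(TEMPLATE, "10.88.0.1"))
```

Limiting the replacement to one named placeholder matters because an nginx configuration normally contains runtime variables such as $host that must survive templating untouched.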

The dnsmasq deployment handles DHCP and TFTP services 6. It binds to the specified lanIface and excludes the loopback interface.

  • Direct Mode: Acts as the primary DHCP server, handing out addresses from dhcpRange and booting EFI clients via TFTP.
  • Proxy Mode: Acts as a proxy-DHCP, providing PXE boot information without handing out IPs, coexisting with an existing DHCP server.

The dnsmasq container requires the NET_ADMIN and NET_RAW capabilities. Like nginx, its deployment uses the Recreate strategy.
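To make the difference between the two modes concrete, the following Python sketch builds the dnsmasq options that would distinguish them. It is not the chart's template. The option names (interface, except-interface, enable-tftp, tftp-root, dhcp-range and its proxy keyword) are standard dnsmasq options, but the function, its parameters, and every example value in the tests are assumptions made for illustration.

```python
from typing import List


def dnsmasq_lines(mode: str, lan_iface: str, dhcp_range: str,
                  network: str, tftp_root: str) -> List[str]:
    """Return illustrative dnsmasq config lines for direct or proxy mode."""
    lines = [
        f"interface={lan_iface}",   # bind to the LAN interface
        "except-interface=lo",      # exclude loopback
        "enable-tftp",
        f"tftp-root={tftp_root}",   # hypothetical TFTP root path
    ]
    if mode == "direct":
        # Full DHCP: hand out leases from the configured range.
        lines.append(f"dhcp-range={dhcp_range}")
    elif mode == "proxy":
        # Proxy-DHCP: serve PXE boot information only, on the given subnet.
        lines.append(f"dhcp-range={network},proxy")
    else:
        raise ValueError(f"unknown dnsmasq.mode: {mode!r}")
    return lines
```

In proxy mode the dhcp-range line names a subnet instead of an address pool, which is what lets dnsmasq coexist with the DHCP server that already owns address assignment.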

The provision-api is a FastAPI/uvicorn service that generates cloud-init data and handles heartbeat/client registration 5. It is not exposed directly; nginx proxies requests for /cloud-init/, /heartbeat, /clients, and /onboard.service to it 7. The API binds to the live IP read from /run/lan/ip and the port defined in provisionPort 5. It also uses the Recreate strategy.
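The split between static content and proxied requests can be modelled with a short Python function. This is only a sketch: it assumes simple prefix matching on the four paths named above, whereas the actual nginx location rules may match differently, and the request paths used in the usage note are made up.

```python
# Paths that nginx forwards to provision-api, as listed above.
PROXIED_PREFIXES = ("/cloud-init/", "/heartbeat", "/clients", "/onboard.service")


def route(path: str) -> str:
    """Return "api" for requests forwarded to provision-api, else "static"."""
    return "api" if path.startswith(PROXIED_PREFIXES) else "static"


def upstream_url(bind_ip: str, provision_port: int = 8088) -> str:
    """Address nginx would proxy to: the live bind IP plus provisionPort."""
    return f"http://{bind_ip}:{provision_port}"
```

For example, `route("/cloud-init/user-data")` yields `"api"` and `route("/images/base.img")` yields `"static"`. Deriving the upstream from provisionPort reflects the point that this value is the single source of truth for both the API's bind port and the nginx upstream.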

The provisioning stack supports a fully offline PXE boot process for GB10 devices. The flow is as follows:

  1. PXE Boot: The GB10 broadcasts a DHCP request. dnsmasq responds with boot information (shim image).
  2. TFTP/HTTP: The client downloads the signed shim via TFTP, then GRUB loads the kernel and initrd via HTTP from nginx.
  3. Dynamic Server Resolution: GRUB resolves the provisioning server dynamically using ${net_default_server}.
  4. Onboarding: On first boot, the onboard service reads the persisted server address to communicate with the provision-api.