Helm Deployment

The gb10-provision Helm chart deploys a PXE provisioning stack for bare-metal GB10 onboarding, made up of dnsmasq, nginx, and a provision-api service ¹. A PXE-booting device has no IP address yet, so ordinary ClusterIP services cannot route to it; every pod therefore runs with hostNetwork: true and binds DHCP, TFTP, and HTTP directly on the physical LAN interface ². A shared lan-setup initContainer discovers the live interface IP at runtime, so services bind to the correct address even after a reboot or an interface IP change ³.

Chart Structure and Configuration

Chart.yaml declares an application chart with version 0.1.0 and app version 1.0.0 ¹. Configuration lives in values.yaml. lanIface (the physical interface name) has no default and must be set explicitly, because every daemon binds to that interface only, so that DHCP answers never leak to the AP/WAN side ². lanIp defaults to 10.88.0.1 and matters mainly in direct mode, where lan-setup asserts it on the interface as a /24; in proxy mode the published address is whatever the interface currently holds.

Key configuration values include:

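provisionPort: Defaults to 8088. Single source of truth for the API's bind port, its containerPort, and the nginx upstream. Port 8080 is deliberately avoided because hostNetwork pods bind real host ports and 8080 collides with another service on the host.
nginxHttpPort: Defaults to 80. The port on which nginx serves static and proxied content (HTTP only).
dataPath: Defaults to /data/provision, a hostPath directory pre-populated by sddc gb10 serve up before the Helm install.
dnsmasq.mode: Either direct (the default; full DHCP server on an isolated LAN) or proxy (proxy-DHCP on a shared L2 switch, with no address handout).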
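
As an illustration of how these values combine, the following is a minimal sketch of an override file for proxy mode on a shared switch. The file name values-proxy.yaml and the interface name enp1s0 are hypothetical; the proxySubnet value is the example given in the chart's own values.yaml comment, and that comment indicates serve up normally sets it.

```yaml
# values-proxy.yaml (hypothetical file name)
# Interface name is an assumption: use the NIC attached to the provisioning LAN.
lanIface: "enp1s0"

# Chart defaults repeated here only for clarity.
provisionPort: 8088
nginxHttpPort: 80
dataPath: "/data/provision"

dnsmasq:
  mode: "proxy"                 # proxy-DHCP: PXE boot info only, no address handout
  proxySubnet: "192.168.8.0"    # example subnet from the values.yaml comment
```

A file like this could be passed to Helm with the standard -f flag (for example, helm upgrade --install with a release name of your choice), although the chart comments suggest that sddc gb10 serve up usually drives the install and applies overrides via --set.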

Network Initialization and IP Discovery

To avoid baking IP addresses into manifests, the chart uses a shared lan-setup initContainer, defined by the lanInit helper in _helpers.tpl ³. It runs on every pod (re)start and writes the live bind IP to /run/lan/ip on a shared emptyDir volume, so addressing self-heals across reboots and interface IP changes.

The behavior of lan-setup depends on the dnsmasq.mode:

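Direct Mode: The initContainer idempotently (re)asserts the configured lanIp as a /24 on the interface, which requires the NET_ADMIN capability (granted only to this initContainer), then publishes that IP.
Proxy Mode: The initContainer runs unprivileged, waits until the interface has any IPv4 address, then publishes whatever address is actually present, so nginx and the API bind the current lease rather than a stale rendered value.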

The main containers (nginx, provision-api) mount this volume read-only and read /run/lan/ip at startup to bind to the correct address ⁴ ⁵.
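
The wiring described above can be sketched as a standalone Pod manifest. This is an illustration of the pattern in direct mode, not the chart's actual lanInit helper: the pod name, the main container image, the interface name, and the script body are all assumptions. Only the lan-setup name, the lan-ip emptyDir, the /run/lan/ip path, and the reuse of the dnsmasq image come from the chart.

```yaml
# Sketch only. Image names, the interface name, and the script body are
# assumptions; the chart's real lanInit helper is not reproduced here.
apiVersion: v1
kind: Pod
metadata:
  name: lan-setup-demo                   # hypothetical name
spec:
  hostNetwork: true                      # initContainer shares the host network namespace
  volumes:
    - name: lan-ip                       # shared emptyDir that carries /run/lan/ip
      emptyDir: {}
  initContainers:
    - name: lan-setup
      image: sddcinfo/gb10-dnsmasq:dev   # chart reuses the dnsmasq image (has iproute2)
      securityContext:
        capabilities:
          add: ["NET_ADMIN"]             # needed in direct mode only
      env:
        - name: LAN_IFACE
          value: "enp1s0"                # hypothetical interface name
        - name: LAN_IP
          value: "10.88.0.1"
      command: ["/bin/sh", "-c"]
      args:
        - |
          set -eu
          # direct mode: idempotently (re)assert the address, then publish it
          ip addr replace "${LAN_IP}/24" dev "${LAN_IFACE}"
          printf '%s\n' "${LAN_IP}" > /run/lan/ip
      volumeMounts:
        - name: lan-ip
          mountPath: /run/lan
  containers:
    - name: main
      image: example/main:dev            # hypothetical main container
      volumeMounts:
        - name: lan-ip
          mountPath: /run/lan
          readOnly: true                 # main container only reads the published IP
```

Because the initContainer must finish before the main container starts, the file is guaranteed to exist by the time the service reads it. A proxy-mode variant would drop the NET_ADMIN capability and replace the ip addr replace step with a wait loop on the interface's current IPv4 address.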

Component Deployments

Nginx

The nginx deployment serves static assets (boot chain, images, repos) and proxies dynamic requests to the API ⁴. It uses a ConfigMap to store the nginx configuration template, which includes ${LAN_IP} as a placeholder. At container start, envsubst replaces ${LAN_IP} with the live IP from /run/lan/ip before starting nginx. The deployment uses a Recreate strategy to ensure only one pod binds the LAN ports at a time. A checksum annotation on the pod template triggers a rollout if the nginx configuration content changes.
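
The two mechanisms in this paragraph, the checksum annotation and the start-time envsubst step, might look roughly like the following pod template fragment. It is a sketch under stated assumptions: the annotation key, the nginx image tag, and the mount path /config are hypothetical, while the gb10-provision.nginxConf helper, the nginx.conf.template key, and /run/lan/ip come from the chart.

```yaml
# Sketch of the pattern, not the chart's actual template.
template:
  metadata:
    annotations:
      # Hash of the rendered config: any change alters the pod template,
      # so the Deployment replaces the pod. The key name is an assumption.
      checksum/nginx-conf: {{ include "gb10-provision.nginxConf" . | sha256sum }}
  spec:
    hostNetwork: true
    containers:
      - name: nginx
        image: nginx:stable              # hypothetical image tag
        command: ["/bin/sh", "-c"]
        args:
          - |
            set -eu
            export LAN_IP="$(cat /run/lan/ip)"
            # Restrict substitution to ${LAN_IP} so nginx variables
            # such as $host are left untouched.
            envsubst '${LAN_IP}' \
              < /config/nginx.conf.template \
              > /etc/nginx/nginx.conf
            exec nginx -g 'daemon off;'
```

Passing the variable list to envsubst is what keeps $host intact in the proxy_set_header line; without that allow-list, envsubst would replace every $name it recognizes as an environment reference.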

dnsmasq

dnsmasq provides the DHCP (67/udp) and TFTP (69/udp) services ⁶. It binds only to the interface named by lanIface and excludes the loopback interface.

Direct Mode: Acts as the primary DHCP server, handing out addresses from dhcpRange and booting EFI clients via TFTP.
Proxy Mode: Acts as a proxy-DHCP server that supplies only PXE boot information and hands out no addresses, so it coexists with an existing DHCP server.

The dnsmasq container requires the NET_ADMIN and NET_RAW capabilities. Like nginx, its deployment uses the Recreate strategy.
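
To make the difference between the two modes concrete, here is a hedged sketch of how a mode-aware dnsmasq configuration could be templated. Whether the chart uses a ConfigMap or command-line arguments is not shown in the excerpts, so the ConfigMap name and layout are assumptions; the dnsmasq options themselves (interface, except-interface, bind-interfaces, enable-tftp, tftp-root, dhcp-range with the proxy keyword) are standard. Boot-file options are omitted because the file names are not given.

```yaml
# Sketch only: the ConfigMap and its name are assumptions.
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ .Release.Name }}-dnsmasq-conf   # hypothetical name
data:
  dnsmasq.conf: |
    # Answer only on the provisioning interface, never on loopback.
    interface={{ .Values.lanIface }}
    except-interface=lo
    bind-interfaces
    enable-tftp
    tftp-root={{ .Values.dnsmasq.tftpRoot }}
    {{- if eq .Values.dnsmasq.mode "proxy" }}
    # proxy mode: no leases, PXE information only
    dhcp-range={{ .Values.dnsmasq.proxySubnet }},proxy
    {{- else }}
    # direct mode: full DHCP server on the isolated LAN
    dhcp-range={{ .Values.dnsmasq.dhcpRange }}
    {{- end }}
```

With the default values, the direct branch would render dhcp-range=10.88.0.100,10.88.0.200,12h.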

Provision API

The provision-api is a FastAPI/uvicorn service that generates cloud-init data and handles heartbeat/client registration ⁵. It is not exposed directly; nginx proxies requests for /cloud-init/, /heartbeat, /clients, and /onboard.service to it, forwarding the client's Host header so the API can render the provisioning server address dynamically instead of echoing a fixed IP ⁷. The API binds to the live IP read from /run/lan/ip and the port defined in provisionPort ⁵. It also uses the Recreate strategy.
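
One way the bind step could be expressed in the container spec is shown below. The PORT environment variable and the provisionPort value come from the chart's deployment template; the command line, and in particular the module path app.main:app, is a hypothetical placeholder, since the real entrypoint is not part of the excerpts.

```yaml
# Sketch only: the uvicorn invocation and module path are assumptions.
containers:
  - name: provision-api
    image: {{ .Values.provisionApi.image }}
    env:
      - name: PORT
        value: {{ .Values.provisionPort | quote }}
    command: ["/bin/sh", "-c"]
    args:
      # Bind to the address published by lan-setup, not 0.0.0.0.
      - exec uvicorn app.main:app --host "$(cat /run/lan/ip)" --port "${PORT}"
    volumeMounts:
      - name: lan-ip
        mountPath: /run/lan
        readOnly: true
```

Binding to the published address rather than all interfaces keeps the API reachable only on the provisioning LAN, which matches the chart's stated goal of confining every daemon to that interface.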

Data Flow and Boot Process

The provisioning stack supports a fully offline PXE boot process for GB10 devices. The flow is as follows:

PXE Boot: The GB10 broadcasts a DHCP request. dnsmasq responds with boot information (shim image).
TFTP/HTTP: The client downloads the signed shim via TFTP, then GRUB loads the kernel and initrd via HTTP from nginx.
Dynamic Server Resolution: GRUB resolves the provisioning server dynamically using ${net_default_server}.
Onboarding: On first boot, the onboard service reads the persisted server address to communicate with the provision-api.

helm/gb10-provision/Chart.yaml L1-7

apiVersion: v2
name: gb10-provision
description: PXE provisioning server for bare GB10 onboarding (dnsmasq + nginx + provision-api)
type: application
version: 0.1.0
appVersion: "1.0.0"

helm/gb10-provision/values.yaml L1-63 (showing 40 of 63)

# gb10-provision Helm values
#
# All pods run with hostNetwork: true so DHCP/TFTP/HTTP bind on the
# physical LAN interface (10.88.0.1). A fresh PXE-booting GB10 has no
# IP yet, so ordinary ClusterIP services cannot route to it.
#
# Provisioning traffic is plain HTTP over the isolated 10.88.0.0/24 LAN
# (one host on, one host off). The initrd/subiquity environment has no
# trust anchor for a self-signed leaf, so no TLS is used here. The
# admin-side trust gate lives on :4443 (cookie auth, different iface).

# LAN interface NAME assigned by `sddc gb10 serve up` before Helm install.
# Required - no default. Every daemon binds to this iface ONLY so DHCP
# answers never leak to the AP/WAN.
lanIface: ""

# LAN interface IP assigned by `sddc gb10 serve up` before Helm install.
lanIp: "10.88.0.1"

# Internal port the provision-api (FastAPI/uvicorn) listens on, reached only via
# nginx proxy_pass. Single source of truth (both containerPort and --port). NOT
# 8080 - that collides with meeting-scribe's host-server fallback on this box
# (hostNetwork pods bind real host ports). `serve up` overrides via --set.
provisionPort: 8088

# Port nginx serves static + proxied content on (HTTP only).
nginxHttpPort: 80

# HostPath pre-populated by `sddc gb10 serve up` before Helm install.
dataPath: "/data/provision"

dnsmasq:
  image: "sddcinfo/gb10-dnsmasq:dev"
  # "direct": full DHCP server on an isolated 10.88.0.1/24 LAN (two GB10s
  # back-to-back). "proxy": proxy-DHCP on a shared L2 switch - no address
  # handout, coexists with the existing DHCP, only answers PXE boot info.
  mode: "direct"
  dhcpRange: "10.88.0.100,10.88.0.200,12h"  # used in direct mode
  proxySubnet: ""  # used in proxy mode, e.g. 192.168.8.0 (set by `serve up`)
  tftpRoot: "/data/provision/tftp"

helm/gb10-provision/templates/_helpers.tpl L1-120 (showing 40 of 120)

{{/*
Expand the name of the chart.
*/}}
{{- define "gb10-provision.name" -}}
{{- .Chart.Name | trunc 63 | trimSuffix "-" }}
{{- end }}

{{/*
Common labels
*/}}
{{- define "gb10-provision.labels" -}}
helm.sh/chart: {{ .Chart.Name }}-{{ .Chart.Version }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end }}

{{/*
Selector labels for a given component
*/}}
{{- define "gb10-provision.selectorLabels" -}}
app.kubernetes.io/name: {{ include "gb10-provision.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
app.kubernetes.io/component: {{ .component }}
{{- end }}

{{/*
lanInit - mode-aware initContainer that establishes the LAN-IP precondition and
publishes the *live* bind address to a shared volume (/run/lan/ip) for the main
(host-networked) container to read. Runs on every pod (re)start, so addressing
self-heals across reboots AND across an iface-IP change (the NM dispatcher rolls
the pods; the fresh initContainer re-discovers and republishes - see serve.py).

  direct: idempotently (re)assert {{ .Values.lanIp }}/24, then publish it.
          Needs NET_ADMIN (mutates host networking) - granted ONLY here.
  proxy:  wait for the iface to have ANY IPv4, then publish whatever it actually
          is (NOT a baked value) so nginx/api bind the current lease, not a
          stale rendered IP. Unprivileged.

Reuses .Values.dnsmasq.image (has iproute2). Pods are already hostNetwork:true,
so the initContainer shares the host netns. Mounts the shared lan-ip emptyDir.
*/}}

helm/gb10-provision/templates/deployment-nginx.yaml L1-87 (showing 40 of 87)

apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ .Release.Name }}-nginx-conf
  namespace: {{ .Release.Namespace }}
  labels:
    {{- include "gb10-provision.labels" . | nindent 4 }}
data:
  # ${LAN_IP} is NOT a Helm value - it's an envsubst placeholder filled at
  # container start from /run/lan/ip (the live iface IP published by lan-setup),
  # so nginx binds whatever address the box currently has, with no baked IP. The
  # config body lives in the gb10-provision.nginxConf helper so the deployment
  # can checksum the exact rendered content (below) and roll on any change.
  nginx.conf.template: |
    {{- include "gb10-provision.nginxConf" . | nindent 4 }}
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}-nginx
  namespace: {{ .Release.Namespace }}
  labels:
    {{- include "gb10-provision.labels" . | nindent 4 }}
    app.kubernetes.io/component: nginx
spec:
  replicas: 1
  # hostNetwork singleton: the new pod can't bind the LAN ports while the
  # old one holds them, so replace (not roll) on upgrade.
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app.kubernetes.io/name: {{ include "gb10-provision.name" . }}
      app.kubernetes.io/instance: {{ .Release.Name }}
      app.kubernetes.io/component: nginx
  template:
    metadata:
      annotations:
        # Roll the pod whenever the rendered nginx config changes (a new
        # location, header, or port) - without this a config-only `serve up`

helm/gb10-provision/templates/deployment-provision-api.yaml L1-64 (showing 40 of 64)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}-provision-api
  namespace: {{ .Release.Namespace }}
  labels:
    {{- include "gb10-provision.labels" . | nindent 4 }}
    app.kubernetes.io/component: provision-api
spec:
  replicas: 1
  # hostNetwork singleton: the new pod can't bind the LAN ports while the
  # old one holds them, so replace (not roll) on upgrade.
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app.kubernetes.io/name: {{ include "gb10-provision.name" . }}
      app.kubernetes.io/instance: {{ .Release.Name }}
      app.kubernetes.io/component: provision-api
  template:
    metadata:
      labels:
        app.kubernetes.io/name: {{ include "gb10-provision.name" . }}
        app.kubernetes.io/instance: {{ .Release.Name }}
        app.kubernetes.io/component: provision-api
    spec:
      hostNetwork: true
      initContainers:
        {{- include "gb10-provision.lanInit" . | nindent 8 }}
      containers:
        - name: provision-api
          image: {{ .Values.provisionApi.image }}
          ports:
            - containerPort: {{ .Values.provisionPort }}
              protocol: TCP
          env:
            - name: PORT
              value: {{ .Values.provisionPort | quote }}
            - name: DATA_PATH
              value: {{ .Values.dataPath | quote }}

helm/gb10-provision/templates/deployment-dnsmasq.yaml L1-98 (showing 40 of 98)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}-dnsmasq
  namespace: {{ .Release.Namespace }}
  labels:
    {{- include "gb10-provision.labels" . | nindent 4 }}
    app.kubernetes.io/component: dnsmasq
spec:
  replicas: 1
  # hostNetwork singleton: the new pod can't bind the LAN ports while the
  # old one holds them, so replace (not roll) on upgrade.
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app.kubernetes.io/name: {{ include "gb10-provision.name" . }}
      app.kubernetes.io/instance: {{ .Release.Name }}
      app.kubernetes.io/component: dnsmasq
  template:
    metadata:
      labels:
        app.kubernetes.io/name: {{ include "gb10-provision.name" . }}
        app.kubernetes.io/instance: {{ .Release.Name }}
        app.kubernetes.io/component: dnsmasq
    spec:
      # hostNetwork: binds DHCP (67/udp) and TFTP (69/udp) on the physical
      # LAN interface so a PXE-booting GB10 (no IP yet) can reach them.
      hostNetwork: true
      {{- if .Values.nodeName }}
      nodeSelector:
        kubernetes.io/hostname: {{ .Values.nodeName | quote }}
      {{- end }}
      initContainers:
        {{- include "gb10-provision.lanInit" . | nindent 8 }}
      containers:
        - name: dnsmasq
          image: {{ .Values.dnsmasq.image }}
          securityContext:
            capabilities:

helm/gb10-provision/templates/_helpers.tpl L121-147

    location /casper/      { alias /data/provision/www/casper/; autoindex on; }
    location /oemdata/     { alias /data/provision/www/oemdata/; autoindex on; }
    location /grub/        { alias /data/provision/www/grub/;    autoindex on; }
    location /repos/       { alias /data/provision/repos/;       autoindex on; }
    location /wheels/      { alias /data/provision/wheels/;      autoindex on; }
    location /images/      { alias /data/provision/images/;      autoindex on; }
    location /bin/         { alias /data/provision/bin/;         autoindex on; }
    # 100%-offline caches served on the LAN (no internet on the target).
    location /hf/          { alias /data/provision/hf/;          autoindex on; }
    location /mise/        { alias /data/provision/mise/;        autoindex on; }
    location /apt/         { alias /data/provision/apt/;         autoindex on; }
    location /fwupd/       { alias /data/provision/fwupd/;       autoindex on; }
    location /manifest.yaml { alias /data/provision/manifest.yaml; }
    # Proxy dynamic cloud-init + heartbeat to provision-api. Forward the client's
    # Host (the address it used to reach us) so the API renders provisioning_server
    # dynamically - without this, proxy_pass rewrites Host to the upstream and the
    # API would echo a fixed IP. ($host is an nginx var; envsubst's ${LAN_IP}
    # allow-list leaves it untouched.)
    proxy_set_header Host $host;
    location /cloud-init/  { proxy_pass http://${LAN_IP}:{{ .Values.provisionPort }}/cloud-init/; }
    location /heartbeat    { proxy_pass http://${LAN_IP}:{{ .Values.provisionPort }}/heartbeat;  }
    location /clients      { proxy_pass http://${LAN_IP}:{{ .Values.provisionPort }}/clients;    }
    location /onboard.service { proxy_pass http://${LAN_IP}:{{ .Values.provisionPort }}/onboard.service; }
  }
}
{{- end }}