Architecture Overview

The Meeting Scribe system is a real-time bilingual meeting transcription appliance built on a FastAPI core that orchestrates audio ingestion, AI-driven speech-to-text, translation, and text-to-speech services. The architecture separates request handling in the server process from long-lived background tasks and external AI containers, so live audio processing stays responsive even under transient overload. HTTPS traffic is served only on the appliance’s Access Point (AP) IP. Local hardware controls, captive portal redirection, and a dedicated kiosk loopback listener cover the remaining client access patterns, and a shared runtime state module keeps these concerns separate.

High-Level System Topology

The application runs as a single FastAPI instance managed by Uvicorn, bound to the appliance’s AP IP (10.42.0.1) on port 443 for HTTPS traffic ¹. This main listener is the primary entry point for browser-facing routes, all of which sit behind the admin cookie and guest PIN gate. To support network discovery and captive portal redirection, the system can also run a lightweight HTTP sub-app on port 80 at the AP IP, which serves static handoff pages and answers OS captive-portal probes without stateful logic ². Additionally, a kiosk loopback listener binds to 127.0.0.1 on port 8444 (configurable via SCRIBE_KIOSK_LISTENER_PORT, disabled with SCRIBE_DISABLE_KIOSK_LISTENER=1) and serves headless Chromium clients over plain HTTP. The kiosk routes (/kiosk, /kiosk-bootstrap, /api/kiosk/*) are restricted to this listener by the require_kiosk_listener dependency. The listener serves the same app, so other routes are technically reachable over the loopback socket, but every state-changing route goes through role and origin checks that fail for the kiosk cookie.

Server Process and Background Daemons

The FastAPI server process handles all incoming HTTP and WebSocket connections, but it delegates heavy lifting to background daemons to maintain low latency for live audio streams. The core logic for meeting lifecycle, speaker enrollment, and session management is centralized in meeting_scribe.runtime.state, which acts as the single source of truth for shared globals like config, storage, resampler, and translation_queue ¹. This separation allows route modules and WebSocket handlers to access backend state without circular imports or tight coupling to the server module.
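The shared-state pattern can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the SimpleNamespace stands in for a module such as meeting_scribe.runtime.state, and init_state and enqueue_translation are hypothetical names, not functions from the project.

```python
import asyncio
from types import SimpleNamespace

# Stand-in for a state module: in a real package this would be a module
# with plain module-level attributes rather than a SimpleNamespace.
state = SimpleNamespace(config=None, translation_queue=None)


def init_state(config: dict) -> None:
    """Populate the shared state once at startup (hypothetical helper)."""
    state.config = config
    state.translation_queue = asyncio.Queue(maxsize=config["queue_size"])


async def enqueue_translation(text: str) -> int:
    """Route-style handler that reads state at call time, not import time."""
    await state.translation_queue.put(text)
    return state.translation_queue.qsize()


async def main() -> int:
    init_state({"queue_size": 8})
    return await enqueue_translation("hello")


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The handler refers to state.translation_queue through the namespace instead of importing the name directly. A direct import would copy the binding that existed at import time (None in this sketch) and would never see the value assigned during startup.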

Background tasks are managed through the FastAPI lifespan mechanism, which initializes backends and starts long-lived worker tasks. A critical example is the TTS pipeline, which uses a single long-lived worker task to drain a FIFO queue of translation events. Under transient overload, listeners therefore hear segments in order and with some delay instead of losing audio immediately. The queue is bounded to prevent memory exhaustion: only when the cap is hit is the oldest item dropped, which keeps playback roughly in sync with live speech. Other background monitors, such as the silence watchdog and loop lag monitor, also run as independent tasks to ensure system health without blocking the main event loop.
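The bounded drop-oldest queue and its single worker can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's TTS code: DropOldestQueue, tts_worker, and the speak callback are hypothetical names, and the real queue may be built differently.

```python
import asyncio
from collections import deque
from contextlib import suppress


class DropOldestQueue:
    """Bounded FIFO that evicts the oldest item when full (hypothetical)."""

    def __init__(self, maxsize: int) -> None:
        if maxsize < 1:
            raise ValueError("maxsize must be at least 1")
        self._items: deque = deque()
        self._maxsize = maxsize
        self._not_empty = asyncio.Event()
        self.dropped = 0  # number of evicted items, useful for metrics

    def put_nowait(self, item) -> None:
        # At the cap, discard the oldest entry so playback stays near live.
        if len(self._items) >= self._maxsize:
            self._items.popleft()
            self.dropped += 1
        self._items.append(item)
        self._not_empty.set()

    async def get(self):
        # Wait until a producer has added something, then pop in FIFO order.
        while not self._items:
            self._not_empty.clear()
            await self._not_empty.wait()
        return self._items.popleft()


async def tts_worker(queue: DropOldestQueue, speak) -> None:
    """Single long-lived consumer: segments are spoken strictly in order."""
    while True:
        segment = await queue.get()
        await speak(segment)


async def demo():
    spoken = []

    async def speak(segment) -> None:
        spoken.append(segment)

    queue = DropOldestQueue(maxsize=3)
    for i in range(5):  # burst of five events into a queue capped at three
        queue.put_nowait(f"seg-{i}")
    worker = asyncio.create_task(tts_worker(queue, speak))
    while len(spoken) < 3:
        await asyncio.sleep(0)  # yield so the worker can drain the queue
    worker.cancel()
    with suppress(asyncio.CancelledError):
        await worker
    return spoken, queue.dropped


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In the demo, the two oldest segments are evicted during the burst and the remaining three are spoken in their original order. Using one consumer task is what preserves ordering; several concurrent workers could finish out of order.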

External AI Containers and Hardware Integration

The system integrates with external AI containers for Automatic Speech Recognition (ASR) and Text-to-Speech (TTS). The ASR backend uses vLLM with the Qwen3-ASR model, while the TTS pipeline relies on a resident faster-qwen3-tts container. These backends are constructed during the application lifespan and assigned to the shared state, allowing the server to interact with them asynchronously. The architecture includes deep health probes and container auto-restart mechanisms to handle backend failures gracefully.
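One way a probe-and-restart loop could be structured is sketched below. The source does not describe how the health probes or restarts are implemented, so this is purely an assumption: supervise_backend, its parameters, and the failure threshold are hypothetical, and the restart callback stands in for whatever mechanism actually restarts a container.

```python
import asyncio
from typing import Awaitable, Callable, Optional


async def supervise_backend(
    probe: Callable[[], Awaitable[bool]],
    restart: Callable[[], Awaitable[None]],
    *,
    max_failures: int = 3,
    interval: float = 5.0,
    rounds: Optional[int] = None,
) -> int:
    """Probe a backend periodically; restart after consecutive failures.

    Returns the number of restarts performed. ``rounds`` bounds the loop
    for demos and tests; None means run until cancelled.
    """
    failures = 0
    restarts = 0
    completed = 0
    while rounds is None or completed < rounds:
        completed += 1
        try:
            healthy = await probe()
        except Exception:
            healthy = False  # a probe that raises counts as unhealthy
        if healthy:
            failures = 0
        else:
            failures += 1
            if failures >= max_failures:
                await restart()
                restarts += 1
                failures = 0  # give the restarted backend a fresh budget
        await asyncio.sleep(interval)
    return restarts


async def demo() -> int:
    results = iter([True, False, False, False, True])

    async def probe() -> bool:
        return next(results)

    async def restart() -> None:
        pass  # placeholder for a real container restart

    return await supervise_backend(
        probe, restart, max_failures=3, interval=0.0, rounds=5
    )


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Requiring several consecutive failures before restarting avoids bouncing a backend that is only briefly slow, at the cost of a longer detection time.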

Local hardware integration is managed through the hotspot and terminal modules. The main listener binds to the AP IP on the appliance’s wireless interface (wlP9s9) with the IP_FREEBIND socket option, so the bind succeeds even before NetworkManager has assigned the address to the interface ². Hardware-specific controls, such as Bluetooth and speakerphone settings, are exposed via dedicated admin routes that interact with the underlying OS ¹. The embedded terminal panel provides an admin-only in-browser shell, leveraging tmux for session management, with configuration written at startup to ensure consistency across restarts.
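The effect of IP_FREEBIND can be shown with a small socket helper. This is a sketch, not the project's _make_tcp_socket: make_listen_socket is a hypothetical name, and the fallback value 15 is the Linux constant from <linux/in.h>, used only if the Python build does not export socket.IP_FREEBIND.

```python
import socket
import sys

# Linux-specific option number; fall back to the kernel header value.
IP_FREEBIND = getattr(socket, "IP_FREEBIND", 15)


def make_listen_socket(host: str, port: int, freebind: bool = True) -> socket.socket:
    """Create a listening TCP socket, optionally with IP_FREEBIND set."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        if freebind and sys.platform.startswith("linux"):
            # Allows bind() to an address that is not yet configured
            # on any local interface.
            sock.setsockopt(socket.IPPROTO_IP, IP_FREEBIND, 1)
        sock.bind((host, port))
        sock.listen(128)
    except OSError:
        sock.close()
        raise
    return sock


if __name__ == "__main__":
    s = make_listen_socket("127.0.0.1", 0)  # port 0 picks a free port
    print(s.getsockname())
    s.close()
```

Without the option, binding to an address that no interface holds yet fails with EADDRNOTAVAIL. A loopback listener does not need it, which matches the freebind=False argument in the kiosk listener code.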

Separation of Concerns and Module Structure

To maintain clarity and testability, the codebase enforces a strict separation of concerns. Route modules are extracted into meeting_scribe.routes.* and meeting_scribe.hotspot.*, keeping the main server.py file focused on wiring and configuration. This modular approach allows for independent testing of routes and handlers using ASGITransport without spinning up the full server ². Middleware registration, including GZip compression and static cache headers, is handled in a dedicated meeting_scribe.middlewares module to ensure consistent ordering and application across all requests ¹.

src/meeting_scribe/server.py L1-120 (showing 40 of 120)

"""FastAPI server - GB10 real-time bilingual meeting transcription.

HTTPS only, bound to the appliance's AP IP (mandatory self-signed cert
at certs/cert.pem); admin cookie + guest PIN gate every browser-facing
route. Runs one recording at a time. Wires together: storage, audio
resample, ASR (vLLM Qwen3-ASR), translation, TTS, WS broadcast.
"""

from __future__ import annotations

import asyncio
import logging
import os
from contextlib import suppress
from pathlib import Path
from typing import TYPE_CHECKING

import uvicorn
from fastapi import FastAPI
from fastapi.staticfiles import StaticFiles

from meeting_scribe.config import ServerConfig
from meeting_scribe.runtime import state
from meeting_scribe.runtime.net import (
    _make_tcp_socket,
    _NoSignalServer,
    _serve_three_apps,
    _serve_two_apps,
)
from meeting_scribe.server_support.page_cache import cache_html
from meeting_scribe.speaker.enrollment import SpeakerEnrollmentStore
from meeting_scribe.terminal.auth import (
    _KIOSK_COOKIE_HMAC_INFO,
    KIOSK_COOKIE_NAME,
    CookieSigner,
    TicketStore,
    derive_cookie_subkey,
)
from meeting_scribe.terminal.bootstrap import BootstrapConfig, register_bootstrap_routes
from meeting_scribe.terminal.registry import ActiveTerminals

src/meeting_scribe/server.py L481-557 (showing 40 of 77)

            log_level="info",
            lifespan="off",
        )
        captive_server = _NoSignalServer(captive_config)

    # Kiosk loopback listener on 127.0.0.1:<port> (plain HTTP).
    # Serves only /kiosk, /kiosk-bootstrap, /api/kiosk/* via the
    # ``require_kiosk_listener`` dependency on those routes. Other
    # routes returned by the canonical app are technically reachable
    # over this socket too but every state-changing one goes through
    # role + origin checks that fail for the kiosk cookie.
    # Disabled when ``SCRIBE_DISABLE_KIOSK_LISTENER=1`` (unit tests;
    # dev sidecars where port 8444 is taken).
    kiosk_disabled = os.environ.get("SCRIBE_DISABLE_KIOSK_LISTENER") == "1"
    kiosk_port = int(os.environ.get("SCRIBE_KIOSK_LISTENER_PORT", "8444"))
    kiosk_server: _NoSignalServer | None = None
    kiosk_sockets: list = []
    if not kiosk_disabled:
        kiosk_sockets = [_make_tcp_socket("127.0.0.1", kiosk_port, freebind=False)]
        kiosk_config = uvicorn.Config(
            app,
            log_level="info",
            lifespan="off",
            ws_ping_interval=20,
            ws_ping_timeout=45,
        )
        kiosk_server = _NoSignalServer(kiosk_config)

    listener_summary = [f"main=https://{AP_IP}:{main_port}"]
    if captive_disabled:
        listener_summary.append("captive=disabled")
    else:
        listener_summary.append(f"captive=http://{AP_IP}:80")
    if kiosk_disabled:
        listener_summary.append("kiosk=disabled")
    else:
        listener_summary.append(f"kiosk=http://127.0.0.1:{kiosk_port}")
    logger.info("starting listeners: %s", ", ".join(listener_summary))

    try: