Architecture Overview

The Meeting Scribe system is a real-time bilingual meeting transcription appliance built on a FastAPI core that orchestrates audio ingestion, AI-driven speech-to-text, translation, and text-to-speech services. The architecture separates the request-handling server process from asynchronous background daemons and external AI containers, so that live audio processing remains responsive even under transient overload. The main HTTPS listener binds exclusively to the appliance’s Access Point (AP) IP. Alongside it, the system integrates local hardware controls, captive portal redirection, and a dedicated kiosk loopback listener to serve different client access patterns, while a shared runtime state module keeps these concerns separated.

The application runs as a single FastAPI instance managed by Uvicorn, bound to the appliance’s AP IP (10.42.0.1) on port 443 for HTTPS traffic [1]. This main listener is the primary entry point for browser-facing routes, which are protected by an admin cookie gate or a guest PIN. To support network discovery and captive portal redirection, the system may also run a lightweight HTTP sub-app on port 80 at the AP IP; it serves static handoff pages and answers OS captive-portal probes without any stateful logic [2]. Additionally, a kiosk loopback listener binds to 127.0.0.1 on port 8444 (configurable via SCRIBE_KIOSK_LISTENER_PORT) to serve headless Chromium clients, keeping kiosk-specific routes isolated from standard admin traffic.
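The listener layout described above can be summarized as a small table of bind targets. The following is a minimal sketch, not the project's actual startup code: the Listener dataclass and build_listeners function are hypothetical names introduced for illustration, and only the addresses, ports, and the SCRIBE_KIOSK_LISTENER_PORT variable come from the text.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Listener:
    """One bind target (hypothetical helper type, not from the codebase)."""
    name: str
    host: str
    port: int


def build_listeners(env=None, captive_portal=True):
    """Return the bind targets described in the text.

    `env` defaults to os.environ; passing a dict makes the function testable.
    `captive_portal` models the fact that the port-80 sub-app is optional.
    """
    env = os.environ if env is None else env
    # The kiosk port defaults to 8444 and can be overridden via the environment.
    kiosk_port = int(env.get("SCRIBE_KIOSK_LISTENER_PORT", "8444"))

    listeners = [Listener("main-https", "10.42.0.1", 443)]
    if captive_portal:
        listeners.append(Listener("captive-portal-http", "10.42.0.1", 80))
    listeners.append(Listener("kiosk-loopback", "127.0.0.1", kiosk_port))
    return listeners


if __name__ == "__main__":
    for listener in build_listeners():
        print(f"{listener.name}: {listener.host}:{listener.port}")
```

Keeping the kiosk listener on 127.0.0.1 means it is reachable only from processes on the appliance itself, which is what isolates it from clients on the AP network.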

The FastAPI server process handles all incoming HTTP and WebSocket connections, but it delegates long-running work to background daemons to keep latency low for live audio streams. The core logic for meeting lifecycle, speaker enrollment, and session management is centralized in meeting_scribe.runtime.state, which acts as the single source of truth for shared globals such as config, storage, resampler, and translation_queue [1]. This separation allows route modules and WebSocket handlers to access backend state without circular imports or tight coupling to the server module.
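The shared-state pattern can be illustrated with a short sketch. This is an assumption about the general shape, not the contents of meeting_scribe.runtime.state: the RuntimeState class, init_state, handle_status, and the language_pair config key are all hypothetical. Only the four attribute names (config, storage, resampler, translation_queue) come from the text.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class RuntimeState:
    """Stand-in for the module-level globals of a shared state module."""
    config: Optional[Any] = None
    storage: Optional[Any] = None
    resampler: Optional[Any] = None
    translation_queue: Optional[Any] = None


# A single shared instance; in a real package this would live in its own
# module that both the server and the route modules import.
state = RuntimeState()


def init_state(config):
    """Called once at startup (for example from the lifespan hook)."""
    state.config = config


def handle_status():
    """Example route handler body.

    It reads `state` at call time rather than importing the server module,
    which is what avoids circular imports.
    """
    if state.config is None:
        raise RuntimeError("runtime state has not been initialized")
    return {"language_pair": state.config["language_pair"]}
```

Because handlers look up attributes on the shared object when they run, tests can populate the state with fakes before calling a handler.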

Background tasks are managed through the FastAPI lifespan mechanism, which initializes backends and starts long-lived worker tasks. A critical example is the TTS pipeline, which uses a single long-lived worker task to drain a FIFO queue of translation events. Under transient overload, this design lets listeners hear segments in order, with some delay, instead of having audio dropped immediately. The queue is bounded to prevent memory exhaustion; only when the cap is hit is the oldest item dropped, which keeps playback roughly in sync with live speech. Other background monitors, such as the silence watchdog and loop lag monitor, also run as independent tasks to check system health without blocking the main event loop.
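The queueing behavior described above (one worker, FIFO order, drop-oldest when full) can be sketched with asyncio. This is a minimal illustration under stated assumptions, not the project's TTS code: DropOldestQueue, tts_worker, and the speak callback are hypothetical names, and the real queue cap is not given in the text.

```python
import asyncio
from collections import deque


class DropOldestQueue:
    """Bounded FIFO that evicts the oldest item when full (hypothetical helper)."""

    def __init__(self, maxsize):
        self._items = deque()
        self._maxsize = maxsize
        self._not_empty = asyncio.Event()
        self.dropped = 0  # number of evicted items, useful for metrics

    def put_nowait(self, item):
        # Never blocks the producer: when at capacity, drop the oldest item.
        if len(self._items) >= self._maxsize:
            self._items.popleft()
            self.dropped += 1
        self._items.append(item)
        self._not_empty.set()

    async def get(self):
        # Wait until at least one item is available, then return the oldest.
        while not self._items:
            self._not_empty.clear()
            await self._not_empty.wait()
        return self._items.popleft()


async def tts_worker(queue, speak):
    """Single long-lived consumer: one segment at a time, in arrival order."""
    while True:
        segment = await queue.get()
        await speak(segment)


async def demo():
    queue = DropOldestQueue(maxsize=3)
    spoken = []

    async def speak(segment):
        spoken.append(segment)

    # Simulate a burst of five segments arriving before the worker runs.
    for i in range(5):
        queue.put_nowait(i)

    worker = asyncio.create_task(tts_worker(queue, speak))
    await asyncio.sleep(0.05)  # let the worker drain the queue
    worker.cancel()
    try:
        await worker
    except asyncio.CancelledError:
        pass
    return spoken, queue.dropped
```

With a cap of 3 and a burst of five segments, the two oldest are evicted and the remaining three are spoken in order. The standard asyncio.Queue would instead make the producer wait or raise QueueFull, which is why the sketch uses a custom eviction policy.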

External AI Containers and Hardware Integration

The system integrates with external AI containers for Automatic Speech Recognition (ASR) and Text-to-Speech (TTS). The ASR backend uses vLLM with the Qwen3-ASR model, while the TTS pipeline relies on a resident faster-qwen3-tts container. These backends are constructed during the application lifespan and assigned to the shared state, allowing the server to interact with them asynchronously. The architecture includes deep health probes and container auto-restart mechanisms to handle backend failures gracefully.

Local hardware integration is managed through the hotspot and terminal modules. The system binds to the appliance’s network interface (wlP9s9) using IP_FREEBIND so that the bind succeeds even before NetworkManager has fully configured the IP address [2]. Hardware-specific controls, such as Bluetooth and speakerphone settings, are exposed via dedicated admin routes that interact with the underlying OS [1]. The embedded terminal panel provides an admin-only in-browser shell that uses tmux for session management, with its configuration written at startup to ensure consistency across restarts.
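The IP_FREEBIND technique can be sketched with the standard socket module. This is an illustration rather than the project's binding code: bind_freebind is a hypothetical helper, the option is Linux-specific, and the fallback value 15 is the constant from the Linux headers, used in case the Python build does not export socket.IP_FREEBIND.

```python
import socket
import sys

# Linux defines IP_FREEBIND as 15; fall back to that value if the socket
# module on this Python build does not expose the constant.
IP_FREEBIND = getattr(socket, "IP_FREEBIND", 15)


def bind_freebind(host, port, backlog=128):
    """Return a listening TCP socket bound to host:port.

    On Linux, IP_FREEBIND lets bind() succeed even if `host` is not yet
    assigned to any local interface (for example, before the AP address
    has been configured).
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        if sys.platform.startswith("linux"):
            sock.setsockopt(socket.IPPROTO_IP, IP_FREEBIND, 1)
        sock.bind((host, port))
        sock.listen(backlog)
    except OSError:
        sock.close()
        raise
    return sock


if __name__ == "__main__":
    # Port 0 asks the kernel for any free port, which avoids needing root.
    listener = bind_freebind("127.0.0.1", 0)
    print("listening on", listener.getsockname())
    listener.close()
```

Without the option, binding to an address that no interface holds yet fails with an "address not available" error, so a service that starts before the network is fully configured would fail at startup.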

Separation of Concerns and Module Structure

To maintain clarity and testability, the codebase enforces a strict separation of concerns. Route modules are extracted into meeting_scribe.routes.* and meeting_scribe.hotspot.*, keeping the main server.py file focused on wiring and configuration. This modular approach allows routes and handlers to be tested independently using ASGITransport, without starting the full server [2]. Middleware registration, including GZip compression and static cache headers, is handled in a dedicated meeting_scribe.middlewares module so that middleware is applied in a consistent order across all requests [1].
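In-process testing works because an ASGI application is just an async callable, so a test harness can invoke it directly instead of opening a network socket. The sketch below shows that idea using only the standard library; it is not the project's test suite and does not use httpx. The app, cache_header_middleware, call_in_process names and the Cache-Control value are hypothetical.

```python
import asyncio


async def app(scope, receive, send):
    """Minimal ASGI app standing in for a route module."""
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain")],
    })
    await send({"type": "http.response.body", "body": b"ok"})


def cache_header_middleware(inner):
    """Wrap an ASGI app and add a cache header to every response.

    The header value is an illustrative assumption.
    """
    async def wrapped(scope, receive, send):
        async def send_with_header(message):
            if message["type"] == "http.response.start":
                headers = list(message.get("headers", []))
                headers.append((b"cache-control", b"public, max-age=3600"))
                message = {**message, "headers": headers}
            await send(message)

        await inner(scope, receive, send_with_header)

    return wrapped


async def call_in_process(asgi_app, path):
    """Drive the app with a synthetic GET request; no server, no sockets."""
    scope = {
        "type": "http",
        "asgi": {"version": "3.0"},
        "http_version": "1.1",
        "method": "GET",
        "path": path,
        "raw_path": path.encode(),
        "query_string": b"",
        "headers": [],
    }
    messages = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        messages.append(message)

    await asgi_app(scope, receive, send)
    status = messages[0]["status"]
    headers = dict(messages[0]["headers"])
    body = b"".join(m.get("body", b"") for m in messages[1:])
    return status, headers, body
```

A transport such as ASGITransport follows the same principle while presenting a regular HTTP client interface, so a test can exercise routes and middleware together without binding a port.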