Real-Time Communication
The WebSocket infrastructure in meetingscribe splits real-time traffic across four endpoints so that the high-throughput audio-input hot path stays isolated from the view-broadcast and audio-output state machines. Each endpoint has one job: ingesting binary PCM audio for ASR and diarization, replaying and streaming the read-only transcript, delivering synthesized audio to listeners, or applying admin-only state mutations such as language-pair and interpretation changes.
Audio Input and Control
The /api/ws endpoint is the admin-only entry point for upstream audio and JSON control messages. It receives binary 16 kHz PCM data and forwards it to the Automatic Speech Recognition (ASR) and diarization systems [1]. Keeping this traffic on its own endpoint keeps the audio hot path separate from the view-broadcast and audio-output state machines.
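To make the 16 kHz PCM arithmetic concrete, here is a minimal sketch of how a binary chunk could be measured and split into fixed-length frames. The text only states the sample rate; the 16-bit mono sample layout, the 20 ms frame length, and the helper names `pcm_duration_ms` and `split_frames` are assumptions for illustration, not part of meeting_scribe.

```python
SAMPLE_RATE_HZ = 16_000  # stated in the text
BYTES_PER_SAMPLE = 2     # assumption: 16-bit mono samples


def pcm_duration_ms(chunk: bytes) -> float:
    """Return the duration of a raw PCM chunk in milliseconds."""
    if len(chunk) % BYTES_PER_SAMPLE:
        raise ValueError("chunk does not contain a whole number of samples")
    samples = len(chunk) // BYTES_PER_SAMPLE
    return samples * 1000 / SAMPLE_RATE_HZ


def split_frames(chunk: bytes, frame_ms: int = 20) -> list:
    """Split a PCM buffer into fixed-length frames, dropping any short tail."""
    frame_bytes = (SAMPLE_RATE_HZ * frame_ms // 1000) * BYTES_PER_SAMPLE
    usable = len(chunk) - len(chunk) % frame_bytes
    return [chunk[i:i + frame_bytes] for i in range(0, usable, frame_bytes)]


# One second of silence: 16,000 samples of two zero bytes each.
silence = bytes(SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)
frames = split_frames(silence)
```

Under these assumptions a 20 ms frame is 320 samples, or 640 bytes, so one second of audio yields 50 frames.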
Transcript Broadcasting and Session Management
The /api/ws/view endpoint provides a read-only transcript stream for clients such as popouts, kiosks, and guests. When a client joins a meeting late, the endpoint replays the existing meeting journal to bring it up to date; after the replay, the only messages the endpoint receives from the client are language-preference updates. Because language changes are pushed over the socket, clients can re-render language-dependent UI elements, such as column headers, A/B toggle button labels, slide-language selectors, and monolingual modes, without polling the /api/status endpoint [2].
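The replay-then-live pattern described above can be sketched as follows. This is an illustration only: `FakeSocket`, `serve_view_client`, the `replay` flag, and the event field names are hypothetical stand-ins, and the real endpoint may structure its journal and messages differently.

```python
import asyncio


class FakeSocket:
    """Hypothetical stand-in for a WebSocket; records what would be sent."""

    def __init__(self) -> None:
        self.sent = []

    async def send_json(self, message: dict) -> None:
        self.sent.append(message)


async def serve_view_client(socket, journal, live: asyncio.Queue) -> None:
    """Replay the journal first, then forward live events until a None sentinel."""
    for event in journal:
        # The "replay" flag is an assumption; it lets a client tell history from live data.
        await socket.send_json({**event, "replay": True})
    while (event := await live.get()) is not None:
        await socket.send_json(event)


async def demo() -> list:
    socket = FakeSocket()
    journal = [{"type": "transcript", "text": "hello"}]
    live = asyncio.Queue()
    await live.put({"type": "meeting_meta_changed", "language_pair": ["en", "ja"]})
    await live.put(None)  # sentinel: stop serving
    await serve_view_client(socket, journal, live)
    return socket.sent


sent = asyncio.run(demo())
```

Sending the journal before reading from the live queue guarantees a late joiner sees history in order before any new event.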
Audio Output and Synthesis
The /api/ws/audio-out endpoint is the guest-scope listener endpoint. On connect, the client negotiates an audio format; it then receives synthesized Text-to-Speech (TTS) audio and, optionally, pass-through of the original audio [1]. Helpers tightly bound to a single endpoint live in that endpoint's module, whereas helpers shared across the audio delivery broadcast pipeline, such as _deliver_audio_to_listener and _send_passthrough_audio, stay in server.py for now and are lazy-imported.
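The text does not say how the format negotiation works, so the following is only one plausible shape for it: the client offers a list of formats and the server picks the first one it prefers. The format names and the `negotiate_format` helper are hypothetical.

```python
from typing import Optional

# Hypothetical format names, listed in server preference order.
SERVER_FORMATS = ("opus", "pcm16")


def negotiate_format(offered: list) -> Optional[str]:
    """Return the server's most preferred format that the client also offers."""
    offered_set = {str(name).strip().lower() for name in offered}
    for name in SERVER_FORMATS:
        if name in offered_set:
            return name
    return None  # no common format; the caller would reject the connection


chosen = negotiate_format(["PCM16", "opus"])
```

Iterating over the server's list rather than the client's keeps the preference order under server control.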
Admin State Mutation and Role-Based Visibility
The /api/ws/admin endpoint is restricted to admins and is the channel for state mutations, including language-pair configuration and interpretation settings. An admin sets the language pair with a set_language_pair message; the handler validates the codes, updates the current meeting, and broadcasts a meeting_meta_changed event to view clients [2]. Interpretation status changes are likewise routed through canonical handlers so that collision validation is applied, and the resulting status is broadcast to all connected clients via _broadcast_interpretation_status.
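The validation step for set_language_pair can be tried in isolation with the sketch below. It mirrors the normalization and the two error checks from the handler source, but `SUPPORTED` is a hypothetical stand-in for `meeting_scribe.languages.is_supported`, and the success return shape is an assumption because the handler excerpt is truncated before it returns.

```python
# Hypothetical stand-in for meeting_scribe.languages.is_supported.
SUPPORTED = {"en", "ja", "de"}


def validate_language_pair(pair) -> dict:
    """Normalize codes, then require 1 or 2 distinct, supported entries."""
    # Trim, lowercase, and drop empty entries before counting.
    parts = [str(p).strip().lower() for p in pair if str(p).strip()]
    if len(parts) not in (1, 2) or len(set(parts)) != len(parts):
        return {"ok": False, "error": "language_pair must contain 1 or 2 distinct codes"}
    if not all(p in SUPPORTED for p in parts):
        return {"ok": False, "error": "unsupported language in language_pair"}
    # Assumed success shape; the real handler also updates meeting state and broadcasts.
    return {"ok": True, "language_pair": parts}


result = validate_language_pair([" EN ", "ja"])
```

Because normalization happens before the distinctness check, a pair such as `["en", "EN "]` is rejected as a duplicate.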
"""WebSocket route modules.
Four WS endpoints, each in its own module so the audio-input
hot-path stays isolated from the view-broadcast and audio-output
state machines:
* ``audio_input`` (``/api/ws``) - admin-only audio upstream + JSON
control messages. Receives binary 16 kHz PCM, forwards to ASR
and diarization.
* ``view_broadcast`` (``/api/ws/view``) - read-only transcript
stream. Replays the meeting journal to a late-joining client
then receives only language-preference updates.
* ``audio_output`` (``/api/ws/audio-out``) - guest-scope listener
endpoint. Negotiates an audio format on connect, then receives
synthesized TTS audio plus optional pass-through original audio.
* ``admin`` (``/api/ws/admin``) - admin-only state mutation channel.
Helpers tightly bound to one endpoint live in that endpoint's
module; helpers shared across the audio delivery broadcast pipeline
(``_deliver_audio_to_listener``, ``_send_passthrough_audio`` etc.)
stay in ``server.py`` for now and are lazy-imported.
"""
"""``/api/ws/admin`` - admin-only state mutation WebSocket."""
from __future__ import annotations
import json as _json
import logging
from fastapi import APIRouter, WebSocket, WebSocketDisconnect
from meeting_scribe.audio.audio_routing import get_routing_settings
from meeting_scribe.languages import is_supported
from meeting_scribe.runtime import state
from meeting_scribe.server_support.admin_guard import require_admin_ws
from meeting_scribe.server_support.broadcast import _broadcast_json
from meeting_scribe.server_support.settings_store import (
_effective_interpretation_enabled,
_load_settings_override,
)
logger = logging.getLogger(__name__)
router = APIRouter()
async def _broadcast_interpretation_status(payload: dict) -> None:
await _broadcast_json({"type": "interpretation_status", **payload})
async def _set_language_pair(pair: list[str]) -> dict:
parts = [str(p).strip().lower() for p in pair if str(p).strip()]
if len(parts) not in (1, 2) or len(set(parts)) != len(parts):
return {"ok": False, "error": "language_pair must contain 1 or 2 distinct codes"}
if not all(is_supported(p) for p in parts):
return {"ok": False, "error": "unsupported language in language_pair"}
if state.current_meeting is None:
return {"ok": False, "error": "no active meeting"}
state.current_meeting.language_pair = parts
try:
state.storage._write_meta(state.current_meeting)
except Exception: