Real-Time Communication

The WebSocket infrastructure in meetingscribe splits real-time traffic across four endpoints, each in its own module, so that the audio-input hot path stays isolated from the view-broadcast and audio-output state machines. The four endpoints handle, respectively, binary PCM ingestion for ASR and diarization, read-only transcript replay, synthesized audio delivery, and admin-only state mutations such as language-pair and interpretation settings.

Audio Input and Control

The /api/ws endpoint is the admin-only channel for upstream audio and JSON control messages. It receives binary 16 kHz PCM data and forwards it to the Automatic Speech Recognition (ASR) and diarization systems ¹. Keeping this hot path in its own module isolates it from the view-broadcast and audio-output state machines.
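As an illustration of what a binary upstream frame might look like, the following sketch converts float samples to PCM bytes and splits them into fixed-duration frames. Only the 16 kHz sample rate comes from the description above; the 16-bit little-endian mono encoding, the 20 ms frame size, and the function names are assumptions made for this example, not details confirmed by the source.

```python
import struct

SAMPLE_RATE_HZ = 16_000  # stated in the endpoint description
FRAME_MS = 20            # hypothetical frame duration, chosen for illustration


def float_to_pcm16(samples):
    """Convert floats in [-1.0, 1.0] to 16-bit little-endian PCM bytes.

    The 16-bit mono encoding is an assumption; the source only says
    "16 kHz PCM".
    """
    clipped = [max(-1.0, min(1.0, s)) for s in samples]
    ints = [int(round(s * 32767)) for s in clipped]
    return struct.pack(f"<{len(ints)}h", *ints)


def split_frames(pcm, frame_ms=FRAME_MS):
    """Split a PCM byte string into frames of frame_ms milliseconds each."""
    bytes_per_frame = SAMPLE_RATE_HZ * frame_ms // 1000 * 2  # 2 bytes per sample
    return [pcm[i:i + bytes_per_frame] for i in range(0, len(pcm), bytes_per_frame)]


if __name__ == "__main__":
    one_second_of_silence = float_to_pcm16([0.0] * SAMPLE_RATE_HZ)
    frames = split_frames(one_second_of_silence)
    print(len(frames), len(frames[0]))
```

Each element of `frames` would then be sent as one binary WebSocket message by whatever client library is in use.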

Transcript Broadcasting and Session Management

The /api/ws/view endpoint provides a read-only transcript stream for clients such as popouts, kiosks, or guests. When a client connects late to a meeting, this endpoint replays the existing meeting journal to bring the client up to speed before switching to receiving only language-preference updates. This mechanism allows clients to re-render language-dependent UI elements, such as column headers, A/B toggle button labels, slide-language selectors, and monolingual modes, without needing to poll the /api/status endpoint ².
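The replay step for a late-joining client can be pictured with a minimal sketch like the one below. The event shape, the `replay_journal` name, and the `fake_send` stand-in are all hypothetical; the actual journal format and send path in meetingscribe are not shown in this section.

```python
import asyncio
import json


async def replay_journal(journal, send_text):
    """Send each stored event to one client, oldest first; return the count."""
    sent = 0
    for event in journal:
        await send_text(json.dumps(event))
        sent += 1
    return sent


async def demo():
    received = []

    async def fake_send(text):
        # Stand-in for a WebSocket send call; it just records the frame.
        received.append(json.loads(text))

    # Hypothetical journal entries, invented for this example.
    journal = [
        {"type": "transcript", "seq": 1, "text": "Good morning."},
        {"type": "transcript", "seq": 2, "text": "Let's begin."},
    ]
    count = await replay_journal(journal, fake_send)
    return count, received


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Replaying in journal order means a late joiner sees the same sequence of events as a client that was connected from the start.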

Audio Output and Synthesis

The /api/ws/audio-out endpoint is the guest-scope listener endpoint. On connect, the client negotiates an audio format; it then receives synthesized Text-to-Speech (TTS) audio plus optional pass-through of the original audio ¹. Helpers tightly bound to a single endpoint live in that endpoint's module, whereas helpers shared across the audio delivery broadcast pipeline, such as _deliver_audio_to_listener and _send_passthrough_audio, remain in server.py for now and are lazy-imported.
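The source does not describe how format negotiation works, so the following is only one plausible shape for it: the client lists formats in order of preference and the server picks the first one it supports. The format names, `SERVER_FORMATS`, and `negotiate_format` are hypothetical and are not taken from the meetingscribe code.

```python
# Hypothetical set of formats the server can produce (assumption).
SERVER_FORMATS = ("opus", "pcm16")


def negotiate_format(client_preferences, server_formats=SERVER_FORMATS):
    """Return the first client-preferred format the server supports, or None."""
    for fmt in client_preferences:
        if fmt in server_formats:
            return fmt
    return None


if __name__ == "__main__":
    print(negotiate_format(["mp3", "pcm16"]))
    print(negotiate_format(["flac"]))
```

Returning `None` rather than raising leaves the caller free to decide whether to close the connection or fall back to a default.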

Admin State Mutation and Role-Based Visibility

The /api/ws/admin endpoint is restricted to admin users and serves as the channel for state mutations, including language pair configuration and interpretation settings. Admins set the language pair with the set_language_pair message type. The handler normalizes the codes (trimmed and lowercased), requires one or two distinct supported codes and an active meeting, then updates the current meeting state and broadcasts a meeting_meta_changed event to view clients ². Similarly, interpretation status changes are routed through canonical handlers to ensure collision validation, and the resulting status is broadcast to all connected clients via _broadcast_interpretation_status.
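The code-validation step for set_language_pair can be tried in isolation with a sketch like the following. It mirrors the normalization and the two error messages from the admin.py excerpt, but replaces `meeting_scribe.languages.is_supported` with a hypothetical `SUPPORTED` set and omits the active-meeting check, persistence, and broadcast, so it is a simplified stand-in and not the project's handler.

```python
# Hypothetical stand-in for meeting_scribe.languages.is_supported().
SUPPORTED = {"en", "ja", "de"}


def validate_language_pair(pair):
    """Normalize and validate a language pair; return a result dict."""
    # Trim, lowercase, and drop empty entries, as in the excerpt.
    parts = [str(p).strip().lower() for p in pair if str(p).strip()]
    if len(parts) not in (1, 2) or len(set(parts)) != len(parts):
        return {"ok": False, "error": "language_pair must contain 1 or 2 distinct codes"}
    if not all(p in SUPPORTED for p in parts):
        return {"ok": False, "error": "unsupported language in language_pair"}
    # The "language_pair" key in the success result is an assumption.
    return {"ok": True, "language_pair": parts}


if __name__ == "__main__":
    print(validate_language_pair([" EN ", "ja"]))
    print(validate_language_pair(["en", "en"]))
    print(validate_language_pair(["en", "xx"]))
```

Note that a single code is accepted, which matches the monolingual mode mentioned for view clients.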

src/meeting_scribe/ws/__init__.py L1-23

"""WebSocket route modules.

Four WS endpoints, each in its own module so the audio-input
hot-path stays isolated from the view-broadcast and audio-output
state machines:

* ``audio_input`` (``/api/ws``) - admin-only audio upstream + JSON
  control messages. Receives binary 16 kHz PCM, forwards to ASR
  and diarization.
* ``view_broadcast`` (``/api/ws/view``) - read-only transcript
  stream. Replays the meeting journal to a late-joining client
  then receives only language-preference updates.
* ``audio_output`` (``/api/ws/audio-out``) - guest-scope listener
  endpoint. Negotiates an audio format on connect, then receives
  synthesized TTS audio plus optional pass-through original audio.
* ``admin`` (``/api/ws/admin``) - admin-only state mutation channel.

Helpers tightly bound to one endpoint live in that endpoint's
module; helpers shared across the audio delivery broadcast pipeline
(``_deliver_audio_to_listener``, ``_send_passthrough_audio`` etc.)
stay in ``server.py`` for now and are lazy-imported.
"""

src/meeting_scribe/ws/admin.py L1-120 (showing 40 of 120)

"""``/api/ws/admin`` - admin-only state mutation WebSocket."""

from __future__ import annotations

import json as _json
import logging

from fastapi import APIRouter, WebSocket, WebSocketDisconnect

from meeting_scribe.audio.audio_routing import get_routing_settings
from meeting_scribe.languages import is_supported
from meeting_scribe.runtime import state
from meeting_scribe.server_support.admin_guard import require_admin_ws
from meeting_scribe.server_support.broadcast import _broadcast_json
from meeting_scribe.server_support.settings_store import (
    _effective_interpretation_enabled,
    _load_settings_override,
)

logger = logging.getLogger(__name__)
router = APIRouter()


async def _broadcast_interpretation_status(payload: dict) -> None:
    await _broadcast_json({"type": "interpretation_status", **payload})


async def _set_language_pair(pair: list[str]) -> dict:
    parts = [str(p).strip().lower() for p in pair if str(p).strip()]
    if len(parts) not in (1, 2) or len(set(parts)) != len(parts):
        return {"ok": False, "error": "language_pair must contain 1 or 2 distinct codes"}
    if not all(is_supported(p) for p in parts):
        return {"ok": False, "error": "unsupported language in language_pair"}
    if state.current_meeting is None:
        return {"ok": False, "error": "no active meeting"}

    state.current_meeting.language_pair = parts
    try:
        state.storage._write_meta(state.current_meeting)
    except Exception: