Real-Time Communication

The WebSocket infrastructure in meetingscribe is split across four endpoints so that high-throughput audio processing stays isolated from state management and broadcast logic. Keeping the audio-input hot path decoupled from the view-broadcast and audio-output state machines lets each endpoint specialize in one job: ingesting binary PCM streams for ASR and diarization, replaying read-only transcripts, delivering synthesized audio, or applying admin-only state mutations. Language-preference updates and interpretation status are broadcast to connected clients as part of the same pipeline.

The /api/ws endpoint is the only entry point for admin-only upstream audio and JSON control messages. It receives binary 16 kHz PCM data and forwards it to the Automatic Speech Recognition (ASR) and diarization systems [1]. Because ingestion runs on its own endpoint, the heavy computational load of audio processing does not interfere with the view-broadcast or audio-output state machines.
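The split between binary audio and JSON control traffic on a single socket can be sketched as below. This is a minimal illustration rather than the project's code: `classify_frame` is a hypothetical helper, the 16-bit signed little-endian mono sample layout is an assumption (the text only states 16 kHz PCM), and the `example_control` message type is a placeholder.

```python
import json
import struct
from typing import Union

SAMPLE_RATE_HZ = 16_000  # from the text: the endpoint receives 16 kHz PCM


def classify_frame(frame: Union[bytes, bytearray, str]) -> dict:
    """Sort one incoming WebSocket frame into audio or control.

    Assumptions (not confirmed by the source): binary frames hold 16-bit
    signed little-endian mono samples; text frames hold a JSON object
    with a "type" field.
    """
    if isinstance(frame, (bytes, bytearray)):
        if len(frame) % 2 != 0:
            raise ValueError("PCM frame must contain whole 16-bit samples")
        count = len(frame) // 2
        # "<" forces little-endian, "h" is a signed 16-bit integer
        samples = struct.unpack(f"<{count}h", bytes(frame))
        return {
            "kind": "audio",
            "samples": samples,
            "duration_ms": 1000 * count / SAMPLE_RATE_HZ,
        }
    message = json.loads(frame)
    if not isinstance(message, dict) or "type" not in message:
        raise ValueError("control message must be a JSON object with a 'type'")
    return {"kind": "control", "message": message}
```

In a real handler the audio branch would hand the samples to the ASR and diarization pipeline, while the control branch would dispatch on the message type.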

Transcript Broadcasting and Session Management

The /api/ws/view endpoint provides a read-only transcript stream for clients such as popouts, kiosks, and guests. When a client joins a meeting late, the endpoint first replays the existing meeting journal to bring the client up to date, then switches to receiving only language-preference updates. Those updates let clients re-render language-dependent UI elements, such as column headers, A/B toggle button labels, slide-language selectors, and monolingual modes, without polling the /api/status endpoint [2].
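The replay-then-live pattern for a late joiner can be sketched with plain asyncio. Everything here is illustrative: `view_stream`, the entry shapes, and the `None` shutdown sentinel are assumptions, and only the `meeting_meta_changed` event name comes from this page.

```python
import asyncio
from typing import AsyncIterator


async def view_stream(
    journal: list[dict], updates: asyncio.Queue
) -> AsyncIterator[dict]:
    """Yield the recorded journal first, then live updates as they arrive."""
    # Copy the journal so entries appended during replay do not change
    # what this loop iterates over.
    for entry in list(journal):
        yield entry
    while True:
        update = await updates.get()
        if update is None:  # hypothetical sentinel meaning "stream closed"
            return
        yield update


async def demo() -> list[dict]:
    updates: asyncio.Queue = asyncio.Queue()
    updates.put_nowait({"type": "meeting_meta_changed"})
    updates.put_nowait(None)
    journal = [{"type": "segment", "text": "hello"}]  # hypothetical entry shape
    return [event async for event in view_stream(journal, updates)]
```

Calling `asyncio.run(demo())` returns the journal entry followed by the live update, which is the order a late-joining client would need.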

The /api/ws/audio-out endpoint serves guest-scope listeners. On connection, the client negotiates an audio format; it then receives synthesized Text-to-Speech (TTS) audio and, optionally, pass-through of the original audio [1]. Helpers tightly bound to this endpoint, such as _deliver_audio_to_listener and _send_passthrough_audio, live in server.py and are lazy-imported to maintain modularity.
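One common way to negotiate a format is for the client to send an ordered preference list and for the server to pick the first entry it supports. The sketch below assumes that scheme; the source does not describe the actual handshake, and `negotiate_format` and the format names are placeholders.

```python
from typing import Optional, Sequence

# Placeholder names: the formats meetingscribe supports are not listed here.
SERVER_FORMATS = ("format_a", "format_b")


def negotiate_format(
    client_prefs: Sequence[str],
    server_formats: Sequence[str] = SERVER_FORMATS,
) -> Optional[str]:
    """Return the first client-preferred format the server also supports."""
    supported = set(server_formats)
    for fmt in client_prefs:
        if fmt in supported:
            return fmt
    return None  # no overlap: the caller decides whether to close or fall back
```

Honoring the client's order keeps the decision on the listener side, which fits a guest device that knows what it can decode.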

Admin State Mutation and Role-Based Visibility

The /api/ws/admin endpoint is restricted to admin users and is the channel for state mutations, including language-pair configuration and interpretation settings. Admins set the language pair with the set_language_pair message type; the handler validates the codes, updates the current meeting state, and broadcasts a meeting_meta_changed event to view clients [2]. Interpretation status changes are likewise routed through canonical handlers so that collision validation always runs, and the resulting status is broadcast to all connected clients via _broadcast_interpretation_status.
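The validate, update, broadcast sequence for set_language_pair can be sketched as follows. Only the message type and the meeting_meta_changed event name come from the text; the `languages` payload field, the `KNOWN_CODES` set, the `MeetingState` class, and the `broadcast` callback are hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable

KNOWN_CODES = {"en", "ja", "de"}  # hypothetical set of accepted language codes


@dataclass
class MeetingState:
    language_pair: tuple[str, str] = ("en", "ja")


def handle_set_language_pair(
    message: dict,
    state: MeetingState,
    broadcast: Callable[[dict], None],
) -> bool:
    """Validate the codes, update the state, then notify view clients."""
    codes = message.get("languages")  # hypothetical payload field
    valid = (
        isinstance(codes, list)
        and len(codes) == 2
        and all(isinstance(c, str) and c in KNOWN_CODES for c in codes)
        and codes[0] != codes[1]
    )
    if not valid:
        return False  # state is untouched and nothing is broadcast
    state.language_pair = (codes[0], codes[1])
    broadcast(
        {"type": "meeting_meta_changed", "languages": list(state.language_pair)}
    )
    return True
```

Validating before mutating means a rejected message leaves both the meeting state and the view clients unchanged.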
