
Testing & Benchmarks

The testing infrastructure for meetingscribe validates client-side state management and cross-window synchronization without the heavy computational resources of the production environment. A custom test harness mounts the real WebSocket broadcasting router and stubs the GPU-dependent backends, so the client-side handling of transcript events, speaker pulses, and reconnection is exercised against the production broadcast code. This makes regression tests for specific UI bugs, such as grid clearing or language routing errors, deterministic in a standard CI environment that lacks GPU access.

The primary browser tests are located in tests/browser/test_cross_window_sync.py and are marked with pytest.mark.browser to distinguish them from unit tests 1. These tests utilize Playwright to simulate user interactions across multiple browser contexts, specifically focusing on the synchronization between the admin window and the pop-out transcript view.

The test suite is built around a custom live_meeting_server fixture. This fixture starts a background uvicorn server running a FastAPI application that mirrors the production server’s WebSocket behavior 2. Crucially, this harness mounts the real view_broadcast router, ensuring that the WebSocket shape, journal replay logic, and connection registry are exercised exactly as they are in production 1. However, it stubs the rest of the surface that the pop-out initialization touches, such as language lists and meeting status, to avoid dependencies on external services or hardware 2.
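The background-server pattern behind such a fixture can be sketched with the standard library alone. This is a stand-in, not the project's fixture: it swaps uvicorn and FastAPI for `http.server`, omits the WebSocket router entirely, and the endpoint paths and payloads (`/api/languages`, `/api/status`) are hypothetical placeholders for the stubbed surface.

```python
import json
import threading
from contextlib import contextmanager
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical stub payloads; the real paths and shapes are not given in the source.
STUB_RESPONSES = {
    "/api/languages": {"languages": ["en", "ja", "de"]},
    "/api/status": {"state": "idle"},
}


class StubHandler(BaseHTTPRequestHandler):
    """Serves canned JSON instead of calling GPU-backed services."""

    def do_GET(self):
        payload = STUB_RESPONSES.get(self.path)
        if payload is None:
            self.send_error(404)
            return
        body = json.dumps(payload).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep test output quiet


@contextmanager
def live_stub_server():
    """Start the stub server on a free port in a background thread."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), StubHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        yield server.server_address  # (host, port) actually bound
    finally:
        server.shutdown()
        server.server_close()
        thread.join(timeout=5)


def fetch_json(address, path):
    """GET a stubbed path and decode the JSON body."""
    host, port = address
    conn = HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path)
        return json.loads(conn.getresponse().read().decode("utf-8"))
    finally:
        conn.close()
```

In a pytest suite, the context manager would typically be wrapped in a fixture that yields the address to each test, so that the server is torn down even when a test fails.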

Key test cases include:

  • Cross-window consistency: Verifying that two independent pop-out viewers ingest the same broadcast events and render identical segment IDs 3.
  • Speaker pulse regression: Ensuring that frequent speaker_pulse events do not clear the pop-out grid, a bug caused by control messages being incorrectly funneled into the segment store.
  • Reconnection logic: Confirming that a pop-out window reconnecting after a disconnect replays the journal to catch up with the admin view without duplicating segments 4.
  • Language routing: Validating that source and translation text appear in the correct columns for various language pairs (e.g., English-Japanese, English-German) 5.
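The first three invariants above can be illustrated with a toy model of a viewer's segment store. The real client logic lives in JavaScript, so this Python sketch is only an analogy: the `SegmentStore` class, the `transcript` event type, and the `segment_id` and `text` field names are assumptions, and only `speaker_pulse` is a name taken from the text.

```python
from dataclasses import dataclass, field
from typing import Optional

# Event types treated as control messages (assumed set; only
# "speaker_pulse" is named in the documentation).
CONTROL_EVENTS = {"speaker_pulse"}

# Hypothetical journal of broadcast events, in arrival order.
JOURNAL = [
    {"type": "transcript", "segment_id": "seg-1", "text": "Hello"},
    {"type": "speaker_pulse", "speaker": "A"},
    {"type": "transcript", "segment_id": "seg-2", "text": "Guten Tag"},
]


@dataclass
class SegmentStore:
    """Toy stand-in for one viewer's segment store, keyed by segment ID."""

    segments: dict = field(default_factory=dict)
    last_pulse: Optional[dict] = None

    def ingest(self, event: dict) -> None:
        if event.get("type") in CONTROL_EVENTS:
            # Control messages update transient state only and must
            # never reach the segment store.
            self.last_pulse = event
            return
        # Upsert by ID so that replaying the journal cannot duplicate rows.
        self.segments[event["segment_id"]] = event

    def segment_ids(self) -> list:
        return list(self.segments)


def replay(store: SegmentStore, journal: list) -> None:
    """Feed every journal event to the store, as a reconnect would."""
    for event in journal:
        store.ingest(event)
```

Keying the store by segment ID makes replay idempotent, which is one simple way to satisfy the reconnection case; the actual client may deduplicate differently.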

No dedicated benchmark harnesses for translation or speakerphone hardware are documented. The testing strategy explicitly avoids the real meeting_scribe application lifespan because it boots vLLM backends, which require GPUs that are not available in the CI environment 1. The focus is instead on client-side state management and WebSocket handling, with backends stubbed to return static or simulated data 2.

The CI pipeline runs the browser tests using the custom harness described above. Because the harness bypasses the heavy ASR and translation backends, the tests can execute in environments without GPU support 1. The critical path of the client-side JavaScript, specifically scribe-app.js, is still exercised: each browser context connects to the same server instance and receives broadcasts via the shared ws_connections set 6.
