Slide Processing

The slide processing pipeline in meetingscribe handles the ingestion, conversion, translation, and rendering of PowerPoint presentations through a multi-phase background worker system. The architecture prioritizes perceived performance by splitting processing into a fast validation phase and slower rendering and extraction phases, so the deck can be activated immediately after validation. Blocking work such as LibreOffice conversion runs in background threads via asyncio.to_thread(), which keeps the main event loop responsive and lets text extraction and translation proceed in parallel with the slower image rendering steps.

The pipeline is defined in the src/meeting_scribe/slides module, which serves as the entry point for slide upload, translation, and rendering logic 1. The core orchestration happens in worker.py, which manages the lifecycle of a slide deck through distinct stages tracked in a meta.json file. The worker checks for the availability of LibreOffice via check_worker_available() before proceeding 2. The processing is split into phases to optimize latency: Phase 1 validates the file, Phase 2 extracts text and renders originals, and Phase 3 handles translation reinsertion and final rendering.
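As a rough illustration of the availability check, the sketch below looks for the required external tools on PATH using shutil.which. The binary names (soffice, pdftoppm) and the function signature are assumptions made for this example; the actual check_worker_available() in worker.py may be implemented differently.

```python
import shutil
from typing import Sequence


def check_worker_available(
    binaries: Sequence[str] = ("soffice", "pdftoppm"),
) -> bool:
    """Return True only if every required external tool is found on PATH.

    The default binary names are assumptions, not taken from the project.
    """
    return all(shutil.which(name) is not None for name in binaries)
```

Checking up front lets the upload handler reject a deck with a clear error instead of failing partway through rendering.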

The first phase is designed to complete in under one second so that the system can give the user immediate feedback. The run_validate function executes _validate_sync in a background thread using asyncio.to_thread(). This synchronous function writes the uploaded bytes to _upload.pptx and calls validate_pptx_contents to verify the file’s integrity. If validation fails, meta.json is updated with a FAILED status and the error message, and a RuntimeError is raised to halt the pipeline. On success, the slide count is recorded in the metadata and the stage is marked as DONE.
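The validation flow can be sketched roughly as follows. This is a minimal approximation, not the project's code: the ZIP-based slide count stands in for validate_pptx_contents, and the meta.json keys (status, error, slide_count) and function signatures are hypothetical.

```python
import asyncio
import json
import zipfile
from pathlib import Path


def _validate_sync(deck_dir: Path, data: bytes) -> int:
    """Blocking validation step, meant to run in a worker thread."""
    upload = deck_dir / "_upload.pptx"
    upload.write_bytes(data)
    meta_path = deck_dir / "meta.json"
    try:
        # Stand-in for validate_pptx_contents: a PPTX is a ZIP archive
        # whose slides are stored as ppt/slides/slideN.xml.
        with zipfile.ZipFile(upload) as zf:
            slide_count = sum(
                1
                for name in zf.namelist()
                if name.startswith("ppt/slides/slide") and name.endswith(".xml")
            )
        if slide_count == 0:
            raise ValueError("no slides found")
    except (zipfile.BadZipFile, ValueError) as exc:
        # Record the failure (hypothetical keys), then halt the pipeline.
        meta_path.write_text(json.dumps({"status": "FAILED", "error": str(exc)}))
        raise RuntimeError(f"validation failed: {exc}") from exc
    meta_path.write_text(json.dumps({"status": "DONE", "slide_count": slide_count}))
    return slide_count


async def run_validate(deck_dir: Path, data: bytes) -> int:
    # Off-load the blocking file I/O so the event loop stays responsive.
    return await asyncio.to_thread(_validate_sync, deck_dir, data)
```

Because the failure is written to meta.json before the exception propagates, a client polling the metadata sees the error even if it never observes the exception.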

Phase 2: Text Extraction and Original Rendering

Phase 2 is split into two concurrent tasks to minimize total processing time. The system prefers running run_extract_text and run_render_originals concurrently rather than sequentially.
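A minimal sketch of the concurrent arrangement, assuming both steps are blocking functions wrapped with asyncio.to_thread(). The two worker functions here are stand-ins that only sleep; their names, return values, and the run_phase_two wrapper are hypothetical.

```python
import asyncio
import time


def _extract_text_sync() -> str:
    time.sleep(0.05)  # stand-in for the fast python-pptx extraction
    return "text"


def _render_originals_sync() -> str:
    time.sleep(0.2)  # stand-in for the slow LibreOffice rendering
    return "images"


async def run_phase_two() -> list[str]:
    # Both blocking steps run in worker threads at the same time, so the
    # total wall time is close to the slower step, not the sum of both.
    return await asyncio.gather(
        asyncio.to_thread(_extract_text_sync),
        asyncio.to_thread(_render_originals_sync),
    )
```

asyncio.gather returns results in argument order regardless of which task finishes first, so callers can unpack the text and image results positionally.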

The run_extract_text function calls _extract_text_sync, which uses python-pptx to extract text runs and speaker notes 3. This step is fast (~1-2 seconds) and produces text_extract.json and slide_notes.json, so translation services can start processing text while the slower image rendering continues in the background. Speaker-note extraction is non-fatal: if it fails, a sentinel JSON file is written so that consumers can distinguish “no notes” from “extraction failure”.
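The sentinel idea can be illustrated with the sketch below. The JSON layout (extraction_failed, error, notes) and the write_slide_notes helper are hypothetical; the source does not specify the actual sentinel format.

```python
import json
from pathlib import Path
from typing import Callable, Dict


def write_slide_notes(out_dir: Path, extract: Callable[[], Dict[str, str]]) -> bool:
    """Write slide_notes.json; return False if extraction failed.

    The key names used here are assumptions made for illustration.
    """
    path = out_dir / "slide_notes.json"
    try:
        notes = extract()
    except Exception as exc:  # non-fatal: record the failure and carry on
        sentinel = {"extraction_failed": True, "error": str(exc), "notes": {}}
        path.write_text(json.dumps(sentinel))
        return False
    path.write_text(json.dumps({"extraction_failed": False, "notes": notes}))
    return True
```

With this layout, an empty notes object alongside extraction_failed set to false means the deck has no notes, while the same empty object with the flag set to true means the notes could not be read.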

The run_render_originals function handles the conversion of the original PPTX to images, a process that typically takes 25-30 seconds for a 50-slide deck 2. It calls _render_originals_sync, which uses convert_pptx_to_images (LibreOffice + pdftoppm) to generate PNGs in the original/ directory. Progress is broadcast to the event loop via _make_thread_safe_progress, which wraps the async ProgressBroadcast callable to ensure thread safety. A legacy function run_render_and_extract exists for backward compatibility but performs these steps sequentially and is discouraged for new code.
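One common way to bridge a worker thread and an async callback is asyncio.run_coroutine_threadsafe, sketched below. This is an assumption about how a wrapper like _make_thread_safe_progress could work, not the project's implementation; the (done, total) callback signature and the demo function are hypothetical.

```python
import asyncio
import concurrent.futures
from typing import Any, Callable, Coroutine, List, Tuple

# Assumed shape of the async progress callback: (done, total) -> None.
ProgressBroadcast = Callable[[int, int], Coroutine[Any, Any, None]]


def make_thread_safe_progress(
    loop: asyncio.AbstractEventLoop, broadcast: ProgressBroadcast
) -> Callable[[int, int], concurrent.futures.Future]:
    """Wrap an async callback so a worker thread can call it safely."""

    def report(done: int, total: int) -> concurrent.futures.Future:
        # Schedules the coroutine on the loop's own thread and returns a
        # concurrent.futures.Future the worker may wait on or ignore.
        return asyncio.run_coroutine_threadsafe(broadcast(done, total), loop)

    return report


async def demo() -> List[Tuple[int, int]]:
    seen: List[Tuple[int, int]] = []

    async def broadcast(done: int, total: int) -> None:
        seen.append((done, total))

    report = make_thread_safe_progress(asyncio.get_running_loop(), broadcast)

    def render_sync() -> None:
        for page in range(1, 4):
            # Waiting on the result keeps this demo deterministic; a real
            # renderer could fire and forget instead.
            report(page, 3).result(timeout=5)

    await asyncio.to_thread(render_sync)
    return seen
```

Calling the async callback directly from the worker thread would not be safe, because event loop objects are not thread-safe; routing the call through the loop avoids that.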

Phase 3: Translation Reinsertion and Final Rendering

Once translations are available, the pipeline proceeds to reinsert the translated text into the PPTX and render the final slides. The run_reinsert function executes _run_reinsert_sync in a background thread. This function calls reinsert_translated_text to modify the PPTX, saving it as translated.pptx. Subsequently, render_translated_to_images generates the final PNGs in the translated/ directory. The pipeline marks the stage as complete and records the completed_at timestamp upon finishing.
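Recording completion might look like the following sketch. The meta.json keys (stage, status, completed_at), the ISO 8601 UTC timestamp format, and the write-then-rename step are all assumptions for illustration; the source only states that a completed_at timestamp is recorded.

```python
import json
import os
from datetime import datetime, timezone
from pathlib import Path


def mark_stage_complete(meta_path: Path, stage: str) -> dict:
    """Merge completion fields into meta.json (hypothetical key names)."""
    meta = json.loads(meta_path.read_text()) if meta_path.exists() else {}
    meta["stage"] = stage
    meta["status"] = "DONE"
    meta["completed_at"] = datetime.now(timezone.utc).isoformat()
    # Write to a temporary file, then rename, so a concurrent reader
    # never observes a half-written meta.json.
    tmp = meta_path.with_name(meta_path.name + ".tmp")
    tmp.write_text(json.dumps(meta, indent=2))
    os.replace(tmp, meta_path)
    return meta
```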

To improve user experience, the system supports an “express lane” that renders only the first 1-2 translated slides quickly. The run_partial_translated_render function handles this by creating a unique scratch directory for each invocation to avoid race conditions when multiple batches are processed in parallel. It calls render_partial_translated to generate PNGs for specific slide indices, then cleans up the temporary work directory. Additionally, run_translated_pdf_only provides a post-express finalizer that generates translated.pptx and original.pdf without re-rendering PNGs, leveraging the fact that the express lane already produced the necessary images.
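The per-invocation scratch directory can be sketched with tempfile.mkdtemp, which creates a uniquely named directory on every call. The render_partial helper, its render callback, and the partial_ prefix are hypothetical stand-ins; only the unique-directory-then-cleanup pattern reflects the description above.

```python
import os
import shutil
import tempfile
from pathlib import Path
from typing import Callable, List, Sequence


def render_partial(
    deck_dir: Path,
    slide_indices: Sequence[int],
    render: Callable[[Path, Sequence[int]], List[Path]],
) -> List[Path]:
    """Render selected slides in a private scratch directory.

    `render` is a hypothetical callback that writes PNGs into the scratch
    directory and returns their paths.
    """
    # mkdtemp yields a fresh directory per call, so parallel batches never
    # share intermediate files.
    scratch = Path(tempfile.mkdtemp(prefix="partial_", dir=deck_dir))
    try:
        out_dir = deck_dir / "translated"
        out_dir.mkdir(exist_ok=True)
        results = []
        for png in render(scratch, slide_indices):
            target = out_dir / png.name
            os.replace(png, target)  # same filesystem, overwrites atomically
            results.append(target)
        return results
    finally:
        # Clean up the scratch directory even if rendering fails.
        shutil.rmtree(scratch, ignore_errors=True)
```

Placing the scratch directory inside the deck directory keeps the final move on one filesystem, which is what allows os.replace to be atomic.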