MCP Servers & Hooks
The autosre repository includes a local Model Context Protocol (MCP) server package designed to provide web research capabilities without external API dependencies 1. This package centers on a browser automation server that drives a browserless/chromium container via Playwright, enabling the agent to interact with JavaScript-rendered, login-walled, or interactive content that static HTTP tools cannot reach 2. The server manages persistent sessions with automatic garbage collection, supports opt-in recording of user interactions to WebM files, and provides tools for rendering markdown, capturing screenshots, generating PDFs, and performing JS-rendered search engine queries.
Browser Automation Server
The browser server is implemented in autosre/mcp_servers/browser.py and exposes a FastMCP instance named autosre-browser. It connects to a BrowserlessService instance to obtain the CDP WebSocket URL for the Chromium container. The server maintains a global Playwright handle and a long-lived default browser instance, connecting via connect_over_cdp rather than installing Chromium locally.
Session management is handled through a dictionary of BrowserSession dataclasses, protected by a threading lock. A background watchdog thread runs at a configurable interval to garbage collect idle sessions based on a TTL and to prune old recordings.
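A lock-protected registry with TTL-based collection can be sketched as follows. The field names on `BrowserSession`, the `SessionRegistry` class, and `start_watchdog` are illustrative assumptions. The real dataclass also holds Playwright objects that are omitted here.

```python
import threading
import time
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class BrowserSession:
    # Hypothetical fields; the real dataclass carries browser handles too.
    session_id: str
    label: str = ""
    last_used: float = field(default_factory=time.monotonic)


class SessionRegistry:
    def __init__(self, ttl_seconds: float) -> None:
        self._ttl = ttl_seconds
        self._sessions: Dict[str, BrowserSession] = {}
        self._lock = threading.Lock()

    def open(self, label: str = "") -> BrowserSession:
        session = BrowserSession(session_id=uuid.uuid4().hex[:12], label=label)
        with self._lock:
            self._sessions[session.session_id] = session
        return session

    def collect_idle(self, now: Optional[float] = None) -> List[str]:
        """Remove sessions idle for longer than the TTL; return their ids."""
        now = time.monotonic() if now is None else now
        with self._lock:
            expired = [
                sid for sid, s in self._sessions.items()
                if now - s.last_used > self._ttl
            ]
            for sid in expired:
                del self._sessions[sid]
        return expired

    def __len__(self) -> int:
        with self._lock:
            return len(self._sessions)


def start_watchdog(
    registry: SessionRegistry, interval_seconds: float, stop: threading.Event
) -> threading.Thread:
    """Run collect_idle every interval until the stop event is set."""
    def loop() -> None:
        while not stop.wait(interval_seconds):
            registry.collect_idle()

    thread = threading.Thread(target=loop, name="session-watchdog", daemon=True)
    thread.start()
    return thread
```

Using `Event.wait` as the sleep makes shutdown prompt: setting `stop` wakes the thread immediately instead of waiting out the interval.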
One-Shot URL Tools
These tools open a temporary browser context, perform an action, and close it immediately. They do not maintain state between calls.
- browser_render_markdown: Navigates to a URL, waits for content, and returns the page content as clean Markdown by stripping scripts, styles, and navigation elements. It handles SPAs by defaulting to domcontentloaded and supports custom selectors or timeouts.
- browser_screenshot: Captures a PNG screenshot of the page, returning the file path and a base64-encoded thumbnail head.
- browser_pdf: Renders the page to a PDF file, returning the file path.
- browser_search: Performs a JS-rendered search on Bing or Google, returning a list of results with titles, URLs, and snippets. This complements static search tools by handling sites that require JavaScript execution.
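To illustrate the stripping step behind browser_render_markdown, the sketch below converts an HTML string to a very small Markdown subset using only the standard library. It is an assumption-laden illustration, not the server's converter: `html_to_markdown`, the `SKIPPED` and `BLOCKS` tag sets, and the heading-only Markdown output are all hypothetical, and the real tool works on the page rendered by Playwright.

```python
import re
from html.parser import HTMLParser
from typing import List

SKIPPED = {"script", "style", "nav", "noscript"}       # content dropped entirely
BLOCKS = {"p", "div", "li", "section", "article", "br"}  # paragraph boundaries
HEADING = re.compile(r"h([1-6])$")


class MarkdownExtractor(HTMLParser):
    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0          # >0 while inside a skipped element
        self._buffer: List[str] = []  # inline text of the current block
        self._prefix = ""             # "# ", "## ", ... for headings
        self.blocks: List[str] = []

    def flush(self) -> None:
        # Collapse whitespace and emit the buffered block, if any.
        text = " ".join("".join(self._buffer).split())
        if text:
            self.blocks.append(self._prefix + text)
        self._buffer = []
        self._prefix = ""

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED:
            self._skip_depth += 1
            return
        heading = HEADING.match(tag)
        if heading or tag in BLOCKS:
            self.flush()
        if heading:
            self._prefix = "#" * int(heading.group(1)) + " "

    def handle_endtag(self, tag):
        if tag in SKIPPED:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif HEADING.match(tag) or tag in BLOCKS:
            self.flush()

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._buffer.append(data)


def html_to_markdown(html: str) -> str:
    parser = MarkdownExtractor()
    parser.feed(html)
    parser.close()
    parser.flush()
    return "\n\n".join(parser.blocks)
```

For example, `html_to_markdown("<nav>Home</nav><h1>Title</h1><p>Hello <b>world</b></p>")` yields a heading line followed by one paragraph, with the navigation text removed.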
Persistent Session Tools
These tools allow for multi-step interactions within a single browser session, identified by a session_id.
- browser_session_open: Opens a new session. If record=True, it opens a dedicated CDP connection for recording and reuses the context that browserless pre-creates. It returns a snapshot of the initial page.
- browser_snapshot: Returns the accessibility snapshot (ARIA tree) of the active page in the session.
- browser_act: Executes a list of steps (click, fill, goto, press, wait, switch tab, screenshot) against the active page. It captures console messages and a screenshot of each step.
- browser_session_list_tabs: Lists all open pages in the session, including popups, with their URLs, titles, and active status.
- browser_session_close: Closes the session. For recording sessions, it takes a lock, snapshots the recording directory, waits for the flush, diffs the directory, and renames the new .webm file to {label}__{session_id}.webm with mode 0600.
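The snapshot, flush, diff, and rename sequence performed by browser_session_close for recording sessions can be sketched as below. This is a hedged illustration: `finalize_recording`, the `flush` callback (standing in for closing the dedicated CDP connection), and the polling timeout are assumptions rather than the server's actual interface.

```python
import os
import threading
import time
from pathlib import Path
from typing import Callable, Optional, Set

# Serializes finalization so two closing sessions cannot claim the same file.
_record_lock = threading.Lock()


def _list_webm(record_dir: Path) -> Set[str]:
    return {p.name for p in record_dir.glob("*.webm")}


def finalize_recording(
    record_dir: Path,
    label: str,
    session_id: str,
    flush: Callable[[], None],
    timeout: float = 5.0,
) -> Optional[Path]:
    """Rename the recording produced by flush() to {label}__{session_id}.webm."""
    with _record_lock:
        before = _list_webm(record_dir)   # snapshot the directory
        flush()                           # triggers the recording to be written
        deadline = time.monotonic() + timeout
        while True:
            new_files = _list_webm(record_dir) - before  # diff against snapshot
            if new_files or time.monotonic() >= deadline:
                break
            time.sleep(0.1)
        if len(new_files) != 1:
            return None                   # nothing new, or ambiguous result
        source = record_dir / new_files.pop()
        target = record_dir / f"{label}__{session_id}.webm"
        source.rename(target)
        os.chmod(target, 0o600)           # owner read/write only
        return target
```

The directory diff is needed because the recorder, not the caller, chooses the filename. Holding the lock across snapshot and diff keeps that attribution unambiguous.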
Recording and Pruning
Recordings are stored in the host-mounted RECORD_DIR. The browser_prune_recordings tool deletes .webm files older than a specified age, defaulting to 24 hours. The watchdog also prunes recordings periodically.
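Age-based pruning of this kind reduces to comparing each file's modification time against a cutoff. The following is a minimal sketch under that assumption. The function name `prune_recordings`, its parameters, and the use of mtime are illustrative and may differ from the actual tool.

```python
import time
from pathlib import Path
from typing import List, Optional


def prune_recordings(
    record_dir: Path,
    max_age_hours: float = 24.0,
    now: Optional[float] = None,
) -> List[str]:
    """Delete .webm files whose mtime is older than max_age_hours."""
    now = time.time() if now is None else now
    cutoff = now - max_age_hours * 3600
    deleted: List[str] = []
    for path in sorted(record_dir.glob("*.webm")):
        try:
            if path.stat().st_mtime < cutoff:
                path.unlink()
                deleted.append(path.name)
        except FileNotFoundError:
            # Another pruner (for example the watchdog) removed it first.
            continue
    return deleted
```

Only `.webm` files are considered, so unrelated files in the directory are left alone. Tolerating `FileNotFoundError` keeps the tool and the watchdog from tripping over each other.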
Pre-commit and Recipe Guard Hooks
The repository uses pre-commit hooks and recipe guards to enforce code quality and safety.
Pre-commit Hooks
The pre-commit configuration ensures that code is formatted and linted before each commit. The hooks include:
- black: Formats Python code [src: .pre-commit-config.yaml:L1-L10].
- isort: Sorts imports [src: .pre-commit-config.yaml:L1-L10].
- flake8: Checks for style and syntax errors [src: .pre-commit-config.yaml:L1-L10].
- mypy: Type checks Python code [src: .pre-commit-config.yaml:L1-L10].
- check-added-large-files: Prevents adding large files to the repository [src: .pre-commit-config.yaml:L1-L10].
- check-yaml: Validates YAML files [src: .pre-commit-config.yaml:L1-L10].
- check-merge-conflict: Checks for merge conflict markers [src: .pre-commit-config.yaml:L1-L10].
- detect-aws-credentials: Detects AWS credentials in code [src: .pre-commit-config.yaml:L1-L10].
- end-of-file-fixer: Ensures files end with a newline [src: .pre-commit-config.yaml:L1-L10].
- trailing-whitespace: Removes trailing whitespace [src: .pre-commit-config.yaml:L1-L10].
Recipe Guard Hooks
Recipe guard hooks validate recipes before they are executed. The hooks include:
- recipe-guard: Validates the structure and content of recipes [src: autosre/recipes/guard.py:L1-L10].
- recipe-lint: Lints recipes for common errors [src: autosre/recipes/lint.py:L1-L10].
- recipe-security: Checks recipes for security vulnerabilities [src: autosre/recipes/security.py:L1-L10].
"""Local MCP servers for web research - zero external API dependencies."""
"""Local Browser Run MCP server - drives the autosre-browser container.

The companion to ``autosre-fetch`` and ``autosre-search``: those are static
HTTP only, so anything JS-rendered, login-walled, or interactive is invisible
to the agent. This server exposes a tight set of Playwright-backed tools
that connect to a ``browserless/chromium`` container managed by
``autosre.services.browserless.BrowserlessService``.

Do **not** run ``playwright install`` - ``connect_over_cdp`` uses the
Chromium that browserless ships inside the container.

Recording is opt-in per ``browser_session_open(record=True)`` call. The
filename is chosen by browserless and dropped into the host-mounted
``RECORD_DIR``; on close the MCP locks, snapshots the directory, flushes
the recording (by closing the dedicated CDP connection), diffs the
directory, and renames the new ``.webm`` to ``{label}__{session_id}.webm``
with mode 0600. Recordings persist past ``autosre stop`` and are pruned
after 24h by the watchdog (or via ``browser_prune_recordings``).
"""
from __future__ import annotations
import base64
import contextlib
import os
import re
import threading
import time
import urllib.parse
import uuid
from dataclasses import dataclass, field
from typing import TYPE_CHECKING, Any, cast
from mcp.server.fastmcp import FastMCP
from autosre import paths
from autosre.services.browserless import BrowserlessService
if TYPE_CHECKING:
    from pathlib import Path