Code Generation Engine

The PowerShell module generator is a deterministic, offline Python engine that synthesizes the Glean module from vendored OpenAPI specifications and Speakeasy overlays. It processes REST API definitions to produce individual PowerShell cmdlet files, a module manifest, and a machine-readable contract file for testing. The system handles complex naming disambiguation, parameter mapping, and pagination logic to ensure the generated code adheres to PowerShell best practices while maintaining strict uniqueness invariants for function names and aliases.

The generator begins by loading OpenAPI specifications from the specs directory, distinguishing between client and indexing APIs based on their base paths (/rest/api/v1 and /api/index/v1 respectively) 1. It then parses Speakeasy overlay files located in the overlays directory to apply custom naming and grouping rules. The engine extracts overlay entries that define name overrides or group assignments, matching them against operations by stripping inconsistent base prefixes to ensure accurate path suffix matching 2.
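
The prefix stripping described above can be sketched as follows. This is a minimal illustration, not the generator's actual code: the two base paths come from the text, while `BASE_PATHS` and `split_base` are hypothetical names.

```python
# Base paths as stated in the text; the names below are hypothetical.
BASE_PATHS = {"client": "/rest/api/v1", "indexing": "/api/index/v1"}


def split_base(path):
    """Return (api, suffix). api is None when no known base prefix matches."""
    for api, base in BASE_PATHS.items():
        if path.startswith(base + "/"):
            return api, path[len(base):]
    return None, path


# An overlay target written with the base prefix and a spec path written
# without it reduce to the same suffix, so they can be matched reliably.
overlay_suffix = split_base("/rest/api/v1/search")[1]
spec_suffix = split_base("/search")[1]
```

Comparing suffixes rather than full paths means an overlay entry still matches when the overlay and the spec disagree about whether the base path is included.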

While parsing the overlays, the generator also identifies operations marked for removal by remove: true actions. Each removal is stored as an (api, stripped_path, method) tuple in a set, which the collection phase uses to filter out deprecated or unwanted endpoints. Because overlays are applied at generation time, the API surface can be customized as the overlay author intended without modifying the source OpenAPI specs.
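
A rough sketch of how such a removal set might be built is shown below. It assumes overlay actions have already been parsed into dictionaries with `target` and `remove` keys, and that targets use the bracketed form `$["paths"]["<path>"]["<method>"]`; the real overlays may use other JSONPath spellings. `collect_removals` and `TARGET_RE` are hypothetical names.

```python
import re

# Assumed target shape: $["paths"]["/some/path"]["post"]
TARGET_RE = re.compile(
    r'^\$\["paths"\]\["(?P<path>[^"]+)"\]\["(?P<method>[a-z]+)"\]$'
)


def collect_removals(api, base, actions):
    """Return a set of (api, stripped_path, method) tuples for remove actions."""
    removals = set()
    for action in actions:
        if action.get("remove") is not True:
            continue  # update actions are handled elsewhere
        match = TARGET_RE.match(action.get("target", ""))
        if match is None:
            continue  # target does not address a single operation
        path = match.group("path")
        if path.startswith(base + "/"):
            path = path[len(base):]  # strip the base prefix before storing
        removals.add((api, path, match.group("method")))
    return removals


actions = [
    {"target": '$["paths"]["/rest/api/v1/chat"]["post"]', "remove": True},
    {"target": '$["paths"]["/rest/api/v1/search"]["post"]', "update": {}},
]
removals = collect_removals("client", "/rest/api/v1", actions)
```

Storing tuples in a set makes the later check during collection a constant-time membership test.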

Operation Collection and Parameter Resolution

Once inputs are processed, the engine iterates through every operation in the loaded specifications. For each HTTP method (GET, POST, PUT, PATCH, DELETE) defined on a path, it skips the operation if it appears in the removal set. The generator then resolves parameters by combining path-item-level and operation-level parameters, with operation-level definitions taking precedence when the two conflict 3.
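
The precedence rule can be illustrated with a short sketch. It assumes parameters are plain dictionaries as they appear in a parsed OpenAPI document and that a parameter is identified by its name and location, as OpenAPI defines; `merge_parameters` is a hypothetical name.

```python
def merge_parameters(path_item_params, operation_params):
    """Combine both levels; operation-level entries win on (name, in) conflicts."""
    merged = {}
    for param in path_item_params:
        merged[(param["name"], param["in"])] = param
    for param in operation_params:
        # Same key replaces the path-item definition but keeps its position.
        merged[(param["name"], param["in"])] = param
    return list(merged.values())


path_level = [
    {"name": "id", "in": "path", "required": True},
    {"name": "limit", "in": "query", "schema": {"type": "string"}},
]
operation_level = [
    {"name": "limit", "in": "query", "schema": {"type": "integer"}},
]
resolved = merge_parameters(path_level, operation_level)
```

Keying on the (name, in) pair keeps a query parameter and a header parameter with the same name distinct.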

The engine determines the body schema for each operation, identifying whether the request body is a JSON object, raw content, or multipart data. This distinction is critical for generating correct PowerShell parameter types and handling logic. The system also resolves $ref pointers within the schema, flattening allOf compositions while preserving oneOf and anyOf structures as opaque types 1.
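
As an illustration of the schema handling, the sketch below resolves local $ref pointers and merges allOf members into a single object schema, leaving oneOf and anyOf schemas untouched. It is a simplification under stated assumptions: only `#/...` references are supported, JSON Pointer escapes and reference cycles are not handled, and `resolve_ref` and `flatten` are hypothetical names.

```python
def resolve_ref(spec, ref):
    """Follow a local reference such as '#/components/schemas/Base'."""
    if not ref.startswith("#/"):
        raise ValueError(f"only local refs are handled in this sketch: {ref}")
    node = spec
    for part in ref[2:].split("/"):
        node = node[part]
    return node


def flatten(spec, schema):
    """Resolve $ref and merge allOf members; other schemas pass through as-is."""
    if "$ref" in schema:
        return flatten(spec, resolve_ref(spec, schema["$ref"]))
    if "allOf" not in schema:
        return schema  # includes oneOf/anyOf, which stay opaque
    properties, required = {}, []
    # Keywords written next to allOf are treated as one more member.
    siblings = {k: v for k, v in schema.items() if k != "allOf"}
    for member in list(schema["allOf"]) + [siblings]:
        member = flatten(spec, member)
        properties.update(member.get("properties", {}))
        for name in member.get("required", []):
            if name not in required:
                required.append(name)
    return {"type": "object", "properties": properties, "required": required}


spec = {"components": {"schemas": {
    "Base": {"type": "object",
             "properties": {"id": {"type": "string"}},
             "required": ["id"]},
    "Doc": {"allOf": [{"$ref": "#/components/schemas/Base"},
                      {"properties": {"title": {"type": "string"}}}]},
}}}
flat = flatten(spec, {"$ref": "#/components/schemas/Doc"})
union = {"oneOf": [{"type": "string"}, {"type": "integer"}]}
```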

A core component of the generator is the naming engine, which maps Speakeasy method names and HTTP verbs to approved PowerShell Verb-Noun pairs. It uses a predefined mapping of methods (e.g., “create” to “New”, “get” to “Get”) and falls back to HTTP verb mappings if no specific method rule exists 4. The engine derives the noun from the last segment of the API group path, applying singularization rules to handle irregular plurals and standard suffixes like “ies” or “ses”.
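
The mapping might look roughly like the sketch below. Only the "create" to "New" and "get" to "Get" entries come from the text; the HTTP fallback table, the irregular-plural table, the exact suffix rules, and all function names are assumptions made for illustration.

```python
# From the text: method names map to approved verbs.
METHOD_VERBS = {"create": "New", "get": "Get"}
# Assumed fallback table; the generator's real mapping may differ.
HTTP_VERBS = {"GET": "Get", "POST": "Invoke", "PUT": "Set",
              "PATCH": "Update", "DELETE": "Remove"}
# Assumed irregular plurals.
IRREGULAR = {"people": "person", "indices": "index"}


def singularize(word):
    """Naive singularization; exceptions belong in IRREGULAR."""
    lower = word.lower()
    if lower in IRREGULAR:
        return IRREGULAR[lower]
    if lower.endswith("ies"):
        return lower[:-3] + "y"   # policies -> policy
    if lower.endswith("ses"):
        return lower[:-2]         # statuses -> status (not right for every word)
    if lower.endswith("s") and not lower.endswith("ss"):
        return lower[:-1]         # documents -> document
    return lower


def cmdlet_name(group, method, http_method):
    """Build Verb-Noun from the method name, falling back to the HTTP verb."""
    verb = METHOD_VERBS.get(method.lower()) or HTTP_VERBS[http_method.upper()]
    noun = singularize(group.split(".")[-1])  # last segment of the group path
    return f"{verb}-{noun[:1].upper()}{noun[1:]}"
```

For example, `cmdlet_name("client.documents", "get", "GET")` yields `Get-Document`, and a method with no specific rule, such as a hypothetical "check" on GET, falls back to `Get`.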

To guarantee uniqueness, the generator detects collisions where multiple operations would produce the same Verb-Noun pair. It resolves each collision by appending a descriptive suffix derived from the method name or, if necessary, a numeric index. The primary function name is assigned according to a priority order of methods, while aliases are generated as fully qualified names (e.g., Glean.client.search.query) to provide alternative invocation paths.
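
One way to picture the collision handling is the sketch below. It assumes the operations arrive already sorted by priority, so the first claimant keeps the plain name; the actual priority order is not given in the text, and `resolve_collisions` and `alias_for` are hypothetical names. The alias format follows the example in the text.

```python
def resolve_collisions(operations):
    """operations: (base_name, method) pairs in priority order.

    Returns one unique function name per operation, in the same order.
    """
    taken = set()
    names = []
    for base, method in operations:
        candidate = base
        if candidate in taken:
            # First fallback: suffix derived from the method name.
            candidate = base + method[:1].upper() + method[1:]
        unique, index = candidate, 2
        while unique in taken:
            # Last resort: numeric index.
            unique = f"{candidate}{index}"
            index += 1
        taken.add(unique)
        names.append(unique)
    return names


def alias_for(group, method):
    """Fully qualified alias, e.g. Glean.client.search.query."""
    return f"Glean.{group}.{method}"


names = resolve_collisions([
    ("Get-Document", "get"),
    ("Get-Document", "list"),
    ("Get-Document", "list"),
])
```

Because every chosen name is recorded in `taken` before the next operation is processed, the output cannot contain duplicates, which is the invariant the text describes.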

The generator uses Jinja2 templates to render PowerShell code. For each operation, it builds a context dictionary containing resolved parameters, body schema details, pagination metadata, and idempotency flags. Parameters are decorated with PowerShell literals for help text and style attributes, and a runnable example is synthesized from mandatory parameters 5.

The engine writes individual .ps1 files for each cmdlet into Public/Client or Public/Indexing subdirectories, depending on the API type. It also generates a module manifest (Glean.psd1) listing all exported functions and aliases, and an aliases.json file for runtime alias registration. Finally, a contract.json file is produced, containing a machine-readable summary of all operations, their parameters, and metadata, which is used by the test suite to verify the generated code 6.
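
Since the engine is described as deterministic, the contract file would need to serialize identically on every run. The sketch below shows one way to get that property with the standard library; the field names (`function`, `alias`, `parameters`) and `build_contract` are hypothetical, and the real contract.json layout may differ.

```python
import json


def build_contract(operations):
    """Serialize operations so the output is independent of input order."""
    entries = sorted(operations, key=lambda op: op["function"])
    # sort_keys fixes key order inside each entry; the trailing newline
    # keeps the file friendly to diff tools.
    return json.dumps({"operations": entries}, indent=2, sort_keys=True) + "\n"


op_a = {"function": "Get-Document", "alias": "Glean.client.documents.get",
        "parameters": ["Id"]}
op_b = {"function": "New-Policy", "alias": "Glean.client.policies.create",
        "parameters": ["Name"]}
first = build_contract([op_b, op_a])
second = build_contract([op_a, op_b])
```

A byte-identical file across runs lets a test suite compare the generated contract against a committed copy with a plain equality check.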