Implementation Guide

This document describes how py-launch-lab works at a technical level — the detection pipeline, data model, report generation, and AI integration.

Architecture at a Glance

CLI (cli.py)
  ├─ matrix run   → runner.py  → subprocess → Process/Window detection
  │                                │
  │                                ▼
  │                          collect.py  → artifacts/json/<id>.json
  │
  └─ report build → html_report.py
                        ├─ load_all_results()        ← collect.py
                        ├─ check_expectations()      ← expectations.py
                        ├─ _try_ollama_summary()     ← Ollama API
                        └─ _render_html_report()     → artifacts/html/report.html

Scenario Definitions (`matrix.py`)

Every test scenario is a frozen @dataclass with these fields:

Field	Purpose
`scenario_id`	Unique key, used as artifact filename
`launcher`	Top-level executable (e.g. `python`, `uv`, `venv-direct`)
`mode`	Launch mode (e.g. `script.py`, `run script.pyw`)
`fixture`	Which fixture package or script to use
`args`	Command-line arguments to pass to the launcher
`windows_only`	Skip on non-Windows platforms
`requires_uv`	Skip if `uv` is not on PATH

Adding a scenario is data-only — you add a Scenario() entry in matrix.py and (optionally) an entry in expectations.py. No code changes to runner.py are required.
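As a rough illustration of the field table above, a scenario record could look like the sketch below. The field types, the default values, and the example entry (its ID and fixture name) are assumptions made for illustration; only the field names come from the table.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Scenario:
    """Illustrative stand-in for the scenario record described above."""

    scenario_id: str                 # unique key, also used as the artifact filename
    launcher: str                    # e.g. "python", "uv", "venv-direct"
    mode: str                        # e.g. "script.py"
    fixture: str                     # fixture package or script
    args: tuple[str, ...] = ()       # command-line arguments for the launcher
    windows_only: bool = False       # skip on non-Windows platforms
    requires_uv: bool = False        # skip when uv is not on PATH


# Hypothetical entry: the ID and fixture name are made up for this sketch.
example = Scenario(
    scenario_id="python-script-py",
    launcher="python",
    mode="script.py",
    fixture="raw_py",
    args=("script.py",),
)
```

Because the dataclass is frozen, assigning to a field after construction raises FrozenInstanceError, which keeps the matrix purely declarative.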

The Detection Pipeline (`runner.py`)

Two-Phase Process Observation

Each scenario is executed twice:

Phase 1: Window and Console Detection (Windows only)

detect_proc = subprocess.Popen(
    cmd,
    creationflags=subprocess.CREATE_NEW_CONSOLE,
    # NO stdout/stderr pipes — the process gets a real console
)

The process is launched with CREATE_NEW_CONSOLE and no pipes. Pipes suppress console allocation, which would make detection useless.
The runner polls aggressively (10 iterations × 50 ms, then 300 ms) for fast-exiting processes.
If the process is still alive, it snapshots:
- Process tree via CreateToolhelp32Snapshot (ctypes)
- Visible windows via EnumWindows (user32)
- Console host by looking for conhost.exe / WindowsTerminal.exe in the child process tree
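The polling cadence described above could be sketched as follows. Reading "10 iterations × 50 ms, then 300 ms" as ten quick polls followed by a single 300 ms wait is an assumption, and the function name and its injectable poll/sleep parameters are hypothetical.

```python
import time
from typing import Callable, Optional


def wait_for_early_exit(
    poll: Callable[[], Optional[int]],
    fast_polls: int = 10,
    fast_interval: float = 0.050,
    settle: float = 0.300,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[int]:
    """Return the exit code if the process ends during the window, else None."""
    # Fast phase: catch processes that exit almost immediately.
    for _ in range(fast_polls):
        code = poll()
        if code is not None:
            return code
        sleep(fast_interval)
    # Settle phase: one longer wait before the final check (assumed reading).
    sleep(settle)
    return poll()
```

With a real process this would be called as wait_for_early_exit(detect_proc.poll). A None result means the process is still alive and can be snapshotted.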

Phase 2: Output Capture (all platforms)

proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
out, err = proc.communicate(timeout=timeout)

This captures stdout, stderr, and the exit code.
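One detail worth spelling out is timeout handling: communicate() raises TimeoutExpired but does not kill the child. A minimal sketch of a capture helper that handles this is shown below. The helper name and its return shape are assumptions, not the runner's actual interface.

```python
import subprocess
import sys


def capture_output(cmd, timeout=10.0):
    """Run cmd with piped output; return (stdout, stderr, exit_code, timed_out)."""
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    try:
        out, err = proc.communicate(timeout=timeout)
        timed_out = False
    except subprocess.TimeoutExpired:
        # communicate() leaves the child running on timeout:
        # kill it, then drain whatever output was produced.
        proc.kill()
        out, err = proc.communicate()
        timed_out = True
    return out, err, proc.returncode, timed_out


out, err, code, timed_out = capture_output([sys.executable, "-c", "print('hello')"])
```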

Keepalive Strategy

Many processes (especially uv, uvx, uvw, and venv entry-point wrappers) exit before Phase 1 can snapshot them. When Phase 1 detects an early exit, the runner invokes _try_keepalive_detection(), which re-launches the same executable with a long-lived command:

Executable Type	Keepalive Command
Python-like (`python`, `pythonw`)	`<exe> -c "import time; time.sleep(10)"`
uv-like (`uv`, `uvx`, `uvw`)	`<exe> run python -c "import time; time.sleep(10)"`
Shim (`pyshim-win`)	`<exe> --hide-console -- python -c "import time; time.sleep(10)"`
Venv wrappers	Sibling `python.exe -c "import time; time.sleep(10)"`

The keepalive process is launched with CREATE_NEW_CONSOLE, observed for 800 ms, then killed.
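The mapping in the table above can be sketched as a small pure function. The function name, the is_venv_wrapper flag, and the decision to classify executables by file stem are assumptions. The real _try_keepalive_detection() may classify executables differently.

```python
from pathlib import PureWindowsPath

SLEEP_SNIPPET = "import time; time.sleep(10)"


def keepalive_command(exe: str, is_venv_wrapper: bool = False) -> list[str]:
    """Map an executable to a long-lived command, following the table above."""
    path = PureWindowsPath(exe)
    stem = path.stem.lower()
    if is_venv_wrapper:
        # Venv wrappers: run the sibling interpreter instead of the wrapper.
        return [str(path.with_name("python.exe")), "-c", SLEEP_SNIPPET]
    if stem in {"python", "pythonw"}:
        return [exe, "-c", SLEEP_SNIPPET]
    if stem in {"uv", "uvx", "uvw"}:
        return [exe, "run", "python", "-c", SLEEP_SNIPPET]
    if stem == "pyshim-win":
        return [exe, "--hide-console", "--", "python", "-c", SLEEP_SNIPPET]
    raise ValueError(f"no keepalive strategy for {exe!r}")
```

Returning an argument list rather than a shell string avoids quoting problems when the command is handed to subprocess.Popen.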

Inference Fallback

If direct detection and keepalive both fail (or aren't available), the runner falls back to PE-subsystem-based inference:

CUI executable → console_window = True, visible_window = False
GUI executable → console_window = False, visible_window = True (if the scenario name/mode contains "gui")
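The two rules above amount to a small decision function, sketched here under stated assumptions: subsystems are passed as plain strings, a GUI executable whose scenario name and mode lack "gui" is treated as having no visible window, and any other subsystem yields None for both values. None of these details is confirmed by the runner's source.

```python
from typing import Optional, Tuple


def infer_windows(
    subsystem: str, scenario_id: str, mode: str
) -> Tuple[Optional[bool], Optional[bool]]:
    """Return (console_window, visible_window) inferred from the PE subsystem."""
    if subsystem == "CUI":
        return True, False
    if subsystem == "GUI":
        # A visible window is only assumed when the scenario advertises a GUI.
        looks_gui = "gui" in scenario_id.lower() or "gui" in mode.lower()
        return False, looks_gui
    # Not a PE file or unknown subsystem: nothing can be inferred (assumption).
    return None, None
```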

Child Python Subsystem Override

Venv entry-point wrappers (pip/uv-generated .exe files) internally launch the venv's python.exe or pythonw.exe. The wrapper's PE subsystem may not reflect the child interpreter's subsystem.

The function _detect_child_python_subsystem():

- Checks if the executable is a venv wrapper (has a sibling python.exe in the same Scripts/ directory)
- Determines which interpreter the wrapper calls (GUI wrappers → pythonw.exe, console wrappers → python.exe)
- Inspects the child interpreter's PE subsystem

Critical override: if the wrapper is GUI but the child interpreter is CUI, the runner forces console_window = True, because the CUI child triggers Windows console allocation regardless of the wrapper's subsystem. This situation occurs in uv venvs due to uv#9781, which is under investigation at joelvaneenwyk/uv#1 with a fix in progress at joelvaneenwyk/uv#2.
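The override logic can be illustrated with two small helpers. Both function names are hypothetical and subsystems are passed as plain strings. The sketch covers only the decision itself, not the PE inspection that feeds it.

```python
from pathlib import PureWindowsPath


def child_interpreter(wrapper: str, wrapper_subsystem: str) -> str:
    """Pick the sibling interpreter a wrapper is expected to launch."""
    name = "pythonw.exe" if wrapper_subsystem == "GUI" else "python.exe"
    return str(PureWindowsPath(wrapper).with_name(name))


def effective_console_window(
    wrapper_subsystem: str, child_subsystem: str, inferred: bool
) -> bool:
    """A CUI child under a GUI wrapper still causes a console to be allocated."""
    if wrapper_subsystem == "GUI" and child_subsystem == "CUI":
        return True
    # Otherwise keep whatever detection or inference produced.
    return inferred
```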

PE Inspection (`inspect_pe.py`)

Reads the PE optional header from any Windows executable using raw struct unpacking:

# Seek to PE offset (stored at 0x3C in DOS header)
f.seek(0x3C)
pe_offset = struct.unpack("<I", f.read(4))[0]

# Skip the 4-byte PE signature and the 20-byte COFF header.
# The Subsystem field is at offset 68 from the optional header start.
f.seek(pe_offset + 4 + 20 + 68)
subsystem = struct.unpack("<H", f.read(2))[0]

Returns one of: Subsystem.GUI, Subsystem.CUI, Subsystem.NOT_PE, Subsystem.UNKNOWN, or None (file not found).
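For experimentation, the same offsets can be exercised against a synthetic file. The sketch below is a simplified stand-in, not the module's actual code: it returns plain strings instead of the Subsystem enum, reads the whole file into memory, and the names read_subsystem and synthetic_pe are hypothetical. The subsystem values 2 (GUI) and 3 (CUI) come from the PE format.

```python
import struct
import tempfile
from pathlib import Path
from typing import Optional

IMAGE_SUBSYSTEM_WINDOWS_GUI = 2
IMAGE_SUBSYSTEM_WINDOWS_CUI = 3


def read_subsystem(path) -> Optional[str]:
    """Return "GUI", "CUI", "UNKNOWN", "NOT_PE", or None if the file is missing."""
    try:
        data = Path(path).read_bytes()
    except FileNotFoundError:
        return None
    if len(data) < 0x40 or data[:2] != b"MZ":
        return "NOT_PE"
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    subsystem_at = pe_offset + 4 + 20 + 68  # signature + COFF header + field offset
    if data[pe_offset:pe_offset + 4] != b"PE\0\0" or len(data) < subsystem_at + 2:
        return "NOT_PE"
    (value,) = struct.unpack_from("<H", data, subsystem_at)
    names = {IMAGE_SUBSYSTEM_WINDOWS_GUI: "GUI", IMAGE_SUBSYSTEM_WINDOWS_CUI: "CUI"}
    return names.get(value, "UNKNOWN")


def synthetic_pe(subsystem: int) -> bytes:
    """Build the smallest byte string read_subsystem accepts (not a loadable image)."""
    pe_offset = 0x40
    dos = b"MZ" + b"\0" * 0x3A + struct.pack("<I", pe_offset)  # 64-byte DOS stub
    optional = bytearray(70)
    struct.pack_into("<H", optional, 68, subsystem)
    return dos + b"PE\0\0" + b"\0" * 20 + bytes(optional)


with tempfile.TemporaryDirectory() as tmp:
    gui = Path(tmp, "gui.exe")
    gui.write_bytes(synthetic_pe(IMAGE_SUBSYSTEM_WINDOWS_GUI))
    cui = Path(tmp, "cui.exe")
    cui.write_bytes(synthetic_pe(IMAGE_SUBSYSTEM_WINDOWS_CUI))
    text = Path(tmp, "notes.txt")
    text.write_bytes(b"hello")
    results = (
        read_subsystem(gui),
        read_subsystem(cui),
        read_subsystem(text),
        read_subsystem(Path(tmp, "missing.exe")),
    )
```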

Windows Detection (`detect_windows.py`)

Three core detection functions, all implemented via ctypes, plus a helper that records process creation flags:

`get_process_tree(pid)`

Uses CreateToolhelp32Snapshot + Process32First/Process32Next to enumerate all processes, filters to direct children of the given PID, and retrieves full image paths via QueryFullProcessImageNameW.

`detect_console_host(pid)`

Calls get_process_tree() and checks for conhost.exe, WindowsTerminal.exe, or OpenConsole.exe among child processes. Returns True if any console host is found.
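The name check itself is independent of the Windows APIs and can be sketched on its own. The helper below is hypothetical: it takes child image paths (as get_process_tree() is described to provide) and assumes a case-insensitive comparison on the file name.

```python
from pathlib import PureWindowsPath
from typing import Iterable

CONSOLE_HOSTS = {"conhost.exe", "windowsterminal.exe", "openconsole.exe"}


def has_console_host(child_images: Iterable[str]) -> bool:
    """True if any child image path names a known console host."""
    return any(
        PureWindowsPath(image).name.lower() in CONSOLE_HOSTS
        for image in child_images
    )
```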

`detect_visible_window(pid)`

Uses EnumWindows to iterate all top-level windows, checks each window's owner PID via GetWindowThreadProcessId, and calls IsWindowVisible. Returns True if any visible window belongs to the target process.
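A minimal ctypes sketch of this approach is shown below. It is an illustration rather than the module's actual code: the early return on non-Windows platforms and the explicit argtypes declarations are choices made for this example.

```python
import ctypes
import sys


def detect_visible_window(pid: int) -> bool:
    """True if a visible top-level window belongs to pid (always False off Windows)."""
    if sys.platform != "win32":
        return False
    from ctypes import wintypes

    user32 = ctypes.WinDLL("user32", use_last_error=True)
    enum_proc = ctypes.WINFUNCTYPE(wintypes.BOOL, wintypes.HWND, wintypes.LPARAM)
    user32.EnumWindows.argtypes = [enum_proc, wintypes.LPARAM]
    user32.EnumWindows.restype = wintypes.BOOL
    user32.GetWindowThreadProcessId.argtypes = [
        wintypes.HWND,
        ctypes.POINTER(wintypes.DWORD),
    ]
    user32.GetWindowThreadProcessId.restype = wintypes.DWORD
    user32.IsWindowVisible.argtypes = [wintypes.HWND]
    user32.IsWindowVisible.restype = wintypes.BOOL

    found = False

    def on_window(hwnd, _lparam):
        nonlocal found
        owner = wintypes.DWORD()
        user32.GetWindowThreadProcessId(hwnd, ctypes.byref(owner))
        if owner.value == pid and user32.IsWindowVisible(hwnd):
            found = True
            return False  # stop enumerating
        return True  # keep going

    user32.EnumWindows(enum_proc(on_window), 0)
    return found
```

Declaring argtypes keeps window handles from being truncated on 64-bit Windows, where ctypes would otherwise treat them as C ints.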

`get_creation_flags(pid)`

Attempts to retrieve the creation flags used when the process was spawned. This is recorded in the JSON artifact for forensic inspection.

Expected Behaviour & Anomaly Detection (`expectations.py`)

Expectations Dictionary

Each entry in the expectations dictionary maps a scenario to an ExpectedBehaviour definition:

@dataclass(frozen=True)
class ExpectedBehaviour:
    pe_subsystem: Subsystem | None = None
    console_window: bool | None = None
    visible_window: bool | None = None
    stdout_available: bool | None = None
    exit_code: int = 0
    explanation: str = ""
    doc_url: str = ""

Anomaly Checker

check_expectations(result) compares the actual ScenarioResult against the expected behaviour field-by-field. Any deviation produces an Anomaly:

@dataclass
class Anomaly:
    field: str        # e.g. "Console Window"
    expected: str     # e.g. "No"
    actual: str       # e.g. "Yes"
    explanation: str  # Why this happened
    doc_url: str      # Link to upstream issue or docs
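The field-by-field comparison can be sketched as follows. This is a simplified illustration: Expected is a trimmed stand-in for ExpectedBehaviour, the actual result is a plain dict rather than a ScenarioResult, and treating a None expectation as "no expectation" is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Expected:  # trimmed stand-in for ExpectedBehaviour
    console_window: Optional[bool] = None
    visible_window: Optional[bool] = None
    exit_code: int = 0
    explanation: str = ""
    doc_url: str = ""


@dataclass
class Anomaly:
    field: str
    expected: str
    actual: str
    explanation: str
    doc_url: str


def _show(value) -> str:
    """Render booleans as Yes/No, everything else via str()."""
    if isinstance(value, bool):
        return "Yes" if value else "No"
    return str(value)


def check(expected: Expected, actual: dict) -> list:
    """Compare field by field; None in Expected means no expectation (assumed)."""
    labels = {
        "console_window": "Console Window",
        "visible_window": "Visible Window",
        "exit_code": "Exit Code",
    }
    anomalies = []
    for name, label in labels.items():
        want = getattr(expected, name)
        got = actual.get(name)
        if want is not None and got != want:
            anomalies.append(
                Anomaly(label, _show(want), _show(got),
                        expected.explanation, expected.doc_url)
            )
    return anomalies


exp = Expected(console_window=False, explanation="GUI wrapper should not open a console")
found = check(exp, {"console_window": True, "visible_window": True, "exit_code": 0})
```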

HTML Report (`html_report.py`)

A self-contained HTML file with embedded CSS and JavaScript. Features:

Single Unified Table

All scenarios in one table — no per-launcher sections. Columns:

Column	Source
Scenario ID	`scenario_id`
Launcher	`launcher`
Mode	`mode`
PE Subsystem	`pe_subsystem`
Console Window	`console_window_detected`
Visible Window	`visible_window_detected`
stdout	`stdout_available`
Exit Code	`exit_code`
Command Line	`command_line` (relative paths)

Column Filters

Each column header has a filter row with dropdowns for enum columns and text inputs for free-text columns. Filtering is done entirely in JavaScript on the client side.

Sortable Headers

Click any column header to sort ascending/descending.

Anomaly Highlighting

Rows with anomalies get an anomaly-row CSS class (subtle red background). Below each anomaly row, an expandable anomaly-detail-row shows:
- Which fields deviate from expectations
- Expected vs actual values
- An explanation of why this happened
- Links to upstream issues

Command Line Column

Absolute paths are converted to project-relative paths for readability (e.g. .cache/matrix_venv/Scripts/python.exe instead of the full path).
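One way to perform this conversion is sketched below. The helper name is hypothetical, and leaving arguments outside the project root unchanged is an assumption. PureWindowsPath is used so the example behaves the same on any platform.

```python
from pathlib import PureWindowsPath


def relativize(arg: str, project_root: str) -> str:
    """Rewrite arg relative to project_root when it lies inside it."""
    path = PureWindowsPath(arg)
    try:
        return path.relative_to(PureWindowsPath(project_root)).as_posix()
    except ValueError:
        # Not under the project root (or not a path at all): leave untouched.
        return arg
```

Applied to each element of a command line, this leaves flags such as -c and bare script names as they are and shortens only paths under the project root.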

Ollama AI Integration

The report builder optionally calls a local Ollama instance to generate a natural-language summary paragraph.

How It Works

Builds a compact JSON payload summarising all scenario results and anomalies
Sends it to POST /api/generate on the local Ollama server
Uses curl (to avoid adding a requests dependency)
The AI summary is inserted at the top of the HTML report

Configuration

Environment Variable	Default	Purpose
`OLLAMA_MODEL`	`llama3.2`	Which Ollama model to use
`OLLAMA_HOST`	`http://localhost:11434`	Ollama API endpoint

If Ollama is not running, the report is generated without the AI summary.
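A rough sketch of the curl-based call and its graceful fallback follows. The helper names, the 60-second limit, and the exact curl flags are assumptions. The request shape (model, prompt, stream) and the response field follow Ollama's /api/generate endpoint.

```python
import json
import os
import subprocess
from typing import Optional


def build_curl_command(prompt: str) -> list:
    """Assemble the curl invocation from the environment variables above."""
    host = os.environ.get("OLLAMA_HOST", "http://localhost:11434").rstrip("/")
    model = os.environ.get("OLLAMA_MODEL", "llama3.2")
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return [
        "curl", "--silent", "--fail", "--max-time", "60",
        "-X", "POST", f"{host}/api/generate",
        "-H", "Content-Type: application/json",
        "-d", payload,
    ]


def try_summary(prompt: str) -> Optional[str]:
    """Return the model's text, or None if curl or Ollama is unavailable."""
    try:
        done = subprocess.run(
            build_curl_command(prompt), capture_output=True, text=True, check=True
        )
        return json.loads(done.stdout).get("response") or None
    except (OSError, subprocess.CalledProcessError, ValueError, AttributeError):
        return None
```

Returning None on any failure lets the report builder skip the summary section instead of aborting the build.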

CLI & Task Runner

CLI (`cli.py`)

Built with Typer. Key commands:

py-launch-lab matrix run          # Run all scenarios
py-launch-lab matrix list         # List all defined scenarios
py-launch-lab report build        # Build HTML report from JSON artifacts
py-launch-lab report build --force  # Rebuild even if up-to-date
py-launch-lab scenario run <id>   # Run a single scenario
py-launch-lab inspect exe <path>  # Inspect a PE executable
py-launch-lab probe <executable>  # Probe a binary interactively

The --force flag bypasses the freshness check, which compares the mtimes of the JSON artifacts against the mtime of the report file. Verbose logging is enabled automatically when running report build.
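A freshness check of this kind could look like the sketch below. The function names and the rule that a missing report always counts as stale are assumptions. The demo sets explicit mtimes with os.utime so the outcome does not depend on timing.

```python
import os
import tempfile
from pathlib import Path


def report_is_fresh(json_dir: Path, report: Path) -> bool:
    """True when the report exists and is at least as new as every JSON artifact."""
    if not report.exists():
        return False
    report_mtime = report.stat().st_mtime
    return all(p.stat().st_mtime <= report_mtime for p in json_dir.glob("*.json"))


def should_build(json_dir: Path, report: Path, force: bool = False) -> bool:
    """--force skips the mtime comparison entirely."""
    return force or not report_is_fresh(json_dir, report)


with tempfile.TemporaryDirectory() as tmp:
    artifact = Path(tmp, "a.json")
    artifact.write_text("{}")
    report = Path(tmp, "report.html")
    report.write_text("<html></html>")
    os.utime(artifact, (1_700_000_000, 1_700_000_000))
    os.utime(report, (1_700_000_100, 1_700_000_100))
    up_to_date = should_build(Path(tmp), report)          # report is newer
    forced = should_build(Path(tmp), report, force=True)  # force wins
    os.utime(artifact, (1_700_000_200, 1_700_000_200))
    stale = should_build(Path(tmp), report)               # artifact is newer
```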

Taskfile (`taskfile.yaml`)

The project includes a Taskfile for common operations. The report task forwards the FORCE variable:

report:
  cmds:
    - uv run py-launch-lab report build {{if .FORCE}}--force{{end}}
  vars:
    FORCE: '{{.FORCE | default ""}}'

Usage: task report FORCE=1

Data Model (`models.py`)

The ScenarioResult Pydantic model is the primary evidence unit. The model sets model_config = ConfigDict(use_enum_values=True), so enum fields are stored and serialised as their string values rather than as enum members.

Key enums:

Subsystem — GUI, CUI, UNKNOWN, NOT_PE
LauncherKind — python, pythonw, uv, uvw, uvx, venv-direct, pyshim-win, etc.

The ProcessInfo sub-model captures process tree snapshots with PID, name, executable path, PE subsystem, and command line.

Test Suite

75 unit tests (5 skipped on non-Windows) covering:

PE inspection with synthetic PE files
Matrix scenario definitions
HTML report generation (unified table, anomaly highlighting, filter row)
JSON schema validation
Detection module imports
Runner module logic