# Sprint 12 Contract: Minimal Windows UI Launcher

Status: Implemented with residual conversion-smoke risk
Last updated: 2026-05-11

## Objective

Build a minimal Windows desktop launcher for the existing `pdf2md` CLI and package the launcher itself as `dist/pdf2md-ui.exe`.

The UI must remain a thin local launcher. It must not become a second conversion engine, a hosted app, a manual review workflow, or a bundled redistribution of MinerU, CUDA PyTorch, model weights, Node.js, or MathJax.

## Research Basis

- Primary research document: `docs/UI_RESEARCH.md`.
- The recommended implementation path is `tkinter`/`ttk`, a subprocess runner around `pdf2md` or `uv run pdf2md`, and PyInstaller for the Windows executable.

## Current Precondition

- `pdf2md doctor`, `pdf2md convert`, and `pdf2md recheck` are implemented.
- Conversion remains strict-local and MinerU-only.
- Current CLI output is coarse during MinerU execution because the adapter captures MinerU subprocess output internally.
- UI research is complete.
- UI implementation exists under `src/pdf2md_ui/`.
- `dist\pdf2md-ui.exe` can be built with PyInstaller.

## Touched Surfaces

Allowed during implementation:

- `src/pdf2md_ui/__init__.py`
- `src/pdf2md_ui/app.py`
- `src/pdf2md_ui/runner.py`
- `tests/test_ui_runner.py`
- `pyproject.toml`
- `uv.lock`
- `README.md`
- `PLAN.md`
- `PROGRESS.md`
- `docs/WORKARCHIVE.md`
- `docs/V1IMPLEMENTATIONPLAN.md`

Generated but not committed unless explicitly requested:

- `build/`
- `dist/`
- `*.spec`
- generated conversion outputs under `outputs/`

Not allowed:

- Runtime document upload paths.
- Remote OCR, hosted LLM/VLM, hosted renderers, or remote document parsing APIs.
- `--api-url`, router mode, HTTP client backends, remote OpenAI-compatible endpoints, or runtime engine selection.
- Direct UI calls to `mineru`; the UI must call the project-owned `pdf2md` CLI.
- Bundling MinerU, CUDA PyTorch, local model weights, Node.js, or MathJax into the first UI executable.
- Batch queues, drag/drop, PDF preview, Markdown preview, Obsidian automation, installer generation, or code signing in this sprint.
- Mandatory default tests that require real MinerU, GPU, model files, network, Obsidian, or `samples/`.

## Product Behavior

The first UI is a single-window launcher:

- Select one input PDF.
- Select an output root, defaulting to `outputs`; the current CLI creates the final `<stem>\` folder inside it.
- Configure only existing CLI options:
  - overwrite
  - keep raw output
  - optional grouped pages with default `20`
  - GPU device with default `cuda:0`, including `auto` when supported by the CLI
  - MinerU profile `auto|safe|performance` with default `auto`
- Run `Doctor`.
- Run `Convert`.
- Run `Recheck` for an existing Markdown output.
- Cancel a running subprocess.
- Open the output directory after completion.
- Show a read-only log and indeterminate progress while a command is running.

Command resolution:

1. Use a configured command if present.
2. Else use `pdf2md` from `PATH`.
3. Else use `uv run pdf2md` from a configured project root containing `pyproject.toml`.
4. Else report a setup error and direct the user to run `pdf2md doctor`.

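The four-step command resolution order can be sketched as a small pure function. This is an illustrative sketch, not the code in `src/pdf2md_ui/runner.py`: the names `resolve_command` and `SetupError`, the `(argv, cwd)` return shape, and the injectable `which` parameter are all assumptions made so the logic can be tested without a real `pdf2md` install.

```python
import shutil
from pathlib import Path
from typing import Callable, List, Optional, Sequence, Tuple


class SetupError(RuntimeError):
    """Hypothetical error type: no usable pdf2md command was found."""


def resolve_command(
    configured: Optional[Sequence[str]] = None,
    project_root: Optional[Path] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Tuple[List[str], Optional[Path]]:
    """Return (argv prefix, working directory) for launching pdf2md."""
    # 1. An explicitly configured command wins.
    if configured:
        return list(configured), None
    # 2. Otherwise look for pdf2md on PATH.
    found = which("pdf2md")
    if found:
        return [found], None
    # 3. Otherwise run through uv from a project root that has pyproject.toml.
    if project_root is not None and (project_root / "pyproject.toml").is_file():
        return ["uv", "run", "pdf2md"], project_root
    # 4. Nothing usable: report a setup error that points at the doctor command.
    raise SetupError("pdf2md was not found; run `pdf2md doctor` to check the setup.")
```

Returning the working directory alongside the argument list keeps step 3 free of extra `uv` flags: the caller can pass the directory as `cwd` when it starts the subprocess.
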
## Architecture Plan

### WP12.1: CLI Runner

Actions:

- Add a runner module that builds fixed argument lists for `doctor`, `convert`, and `recheck`.
- Use `subprocess.Popen` with `shell=False`.
- Set `MINERU_MODEL_SOURCE=local` in the child environment unless already set.
- Merge stderr into stdout for a single UI log stream.
- Read subprocess output on a worker thread and report status events to the UI.
- Add a Windows process-tree cancellation helper that uses `taskkill /pid <pid> /t /f` only after normal termination does not finish promptly.

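A minimal sketch of the runner actions above, under stated assumptions. The function names, the `("log", line)` / `("exit", code)` event tuples, and the positional PDF argument are hypothetical. `--output` is a placeholder flag name, not a confirmed `pdf2md` option; only `--mineru-profile auto|safe|performance` is taken from this contract.

```python
import os
import queue
import subprocess
import threading
from pathlib import Path
from typing import Dict, List, Mapping, Optional, Sequence

PROFILES = ("auto", "safe", "performance")


def build_convert_args(
    base: Sequence[str], pdf: Path, output_root: Path, profile: str = "auto"
) -> List[str]:
    """Build a fixed argument list; no free-form shell text is accepted."""
    if profile not in PROFILES:
        raise ValueError(f"unknown MinerU profile: {profile!r}")
    # "--output" is a placeholder (assumption); "--mineru-profile" is from the contract.
    return [*base, "convert", str(pdf), "--output", str(output_root),
            "--mineru-profile", profile]


def child_env(parent: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Copy the environment and default MINERU_MODEL_SOURCE to local."""
    env = dict(os.environ if parent is None else parent)
    env.setdefault("MINERU_MODEL_SOURCE", "local")  # keeps an existing value
    return env


def start(
    argv: Sequence[str], events: "queue.Queue", cwd: Optional[Path] = None
) -> subprocess.Popen:
    """Start the command and stream its merged output to `events`."""
    proc = subprocess.Popen(
        list(argv),
        cwd=cwd,
        env=child_env(),
        stdin=subprocess.DEVNULL,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # single log stream for the UI
        text=True,
        errors="replace",
        shell=False,
    )

    def pump() -> None:
        # Runs on a worker thread so the UI thread never blocks on reads.
        assert proc.stdout is not None
        for line in proc.stdout:
            events.put(("log", line.rstrip("\n")))
        events.put(("exit", proc.wait()))

    threading.Thread(target=pump, daemon=True).start()
    return proc
```

Because the builders are plain functions over lists and mappings, they can be unit-tested without MinerU, a GPU, or the network.
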
Expected output:

- Testable command-construction and process-management code that never accepts arbitrary shell text from the UI.

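The cancellation helper described for WP12.1 might look like the sketch below. The name `cancel`, the five-second grace period, and the injectable `run` parameter are assumptions for illustration; the `taskkill /pid <pid> /t /f` arguments are the ones named in this contract.

```python
import subprocess
import sys
from typing import Callable, Optional


def cancel(
    proc: subprocess.Popen,
    grace_seconds: float = 5.0,
    run: Callable[..., object] = subprocess.run,
) -> Optional[int]:
    """Stop a running command; return its exit code, or None if still alive."""
    if proc.poll() is not None:
        return proc.returncode  # already finished, nothing to cancel

    # First try normal termination and give it a short grace period.
    proc.terminate()
    try:
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        pass

    # Escalate only after normal termination did not finish promptly.
    if sys.platform == "win32":
        # /t targets the process tree, /f forces termination.
        run(["taskkill", "/pid", str(proc.pid), "/t", "/f"],
            capture_output=True, check=False)
    else:
        proc.kill()  # non-Windows fallback so the sketch stays portable

    try:
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        return None  # caller should report that the process is still running
```

On Windows, `Popen.terminate()` ends only the direct child process, so it is worth confirming in a manual smoke that descendants started through `uv run pdf2md` are also gone after a cancel.
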
### WP12.2: Minimal Tk UI

Actions:

- Add a `tkinter`/`ttk` app with file and directory pickers, option controls, command buttons, progress indicator, and log pane.
- Keep long-running work off Tk's event handler thread.
- Disable conflicting controls while a command is running.
- Surface non-zero exit codes clearly.

Expected output:

- A simple local GUI for existing CLI workflows.

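One common way to keep long-running work off Tk's event handler thread is to let the worker thread write to a `queue.Queue` and have the UI poll that queue with `after()`. The sketch below assumes the hypothetical `("log", line)` / `("exit", code)` event tuples; `drain`, `describe_exit`, and `install_poller` are illustrative names, and the status strings are placeholders.

```python
import queue
from typing import Callable, Tuple

Event = Tuple[str, object]


def drain(events: "queue.Queue", handle: Callable[[Event], None], limit: int = 200) -> int:
    """Handle queued events without blocking; return how many were handled."""
    handled = 0
    while handled < limit:  # cap the work per tick so the UI stays responsive
        try:
            event = events.get_nowait()
        except queue.Empty:
            break
        handle(event)
        handled += 1
    return handled


def describe_exit(code: int) -> str:
    """Turn an exit code into a status line; non-zero is never shown as success."""
    return "Finished successfully" if code == 0 else f"Failed with exit code {code}"


def install_poller(root, events: "queue.Queue", handle: Callable[[Event], None],
                   interval_ms: int = 100) -> None:
    """Poll `events` from the Tk thread using root.after()."""

    def tick() -> None:
        drain(events, handle)
        root.after(interval_ms, tick)  # re-arm the timer for the next poll

    root.after(interval_ms, tick)
```

In a real app, `root` would be the `tkinter.Tk` instance and `handle` would append log lines to the read-only log pane, stop the progress bar, and re-enable controls when an `"exit"` event arrives.
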
### WP12.3: Build

Actions:

- Add PyInstaller only to a build dependency group such as `ui-build`.
- Build the executable with:

```powershell
uv run --group ui-build pyinstaller --clean --onefile --windowed --name pdf2md-ui src\pdf2md_ui\app.py
```

Expected output:

- `dist\pdf2md-ui.exe` exists after the build.

## Verification Checks

Default checks:

- `uv run pytest tests/test_ui_runner.py`
- `uv run pytest tests/test_cli.py` if shared CLI behavior changes
- `git diff --check`
- `git status --short --untracked-files=all`

Build check:

```powershell
uv run --group ui-build pyinstaller --clean --onefile --windowed --name pdf2md-ui src\pdf2md_ui\app.py
Test-Path dist\pdf2md-ui.exe
```

Manual smoke:

1. Launch `dist\pdf2md-ui.exe`.
2. Run Doctor from the UI.
3. Convert one small local sample into an ignored `outputs/` directory.
4. Confirm Markdown, report Markdown, and assets are produced as expected for the active output layout.

## Acceptance Criteria

- The UI invokes `pdf2md` or `uv run pdf2md`; it never invokes `mineru` directly.
- Commands are fixed argument lists and run with `shell=False`.
- The UI remains responsive while a conversion is running.
- Cancel attempts to stop the process tree on Windows.
- Doctor and conversion exit codes are visible in the UI.
- PyInstaller produces `dist\pdf2md-ui.exe`.
- Default tests stay independent of real MinerU, GPU, model files, network, Obsidian, and `samples/`.

## Hard Failure Criteria

- UI code exposes arbitrary shell command execution.
- UI exposes remote/API options or weakens strict-local policy.
- UI claims conversion success without checking the CLI exit code.
- UI freezes during a long conversion because the CLI runs on Tk's event handler thread.
- The first UI executable bundles MinerU, CUDA PyTorch, model weights, Node.js, or MathJax.
- Build outputs, generated conversion outputs, local models, or sample PDFs are committed.

## Handoff Requirements

After implementation:

- Update `PROGRESS.md` with files changed, commands run, test outcomes, build outcome, known failures, residual risks, and next action.
- Move completed implementation details to `docs/WORKARCHIVE.md` after verification.
- Keep sample PDFs and generated outputs out of the commit.

## Implementation Handoff

Files changed:

- `src/pdf2md_ui/__init__.py`
- `src/pdf2md_ui/app.py`
- `src/pdf2md_ui/runner.py`
- `tests/test_ui_runner.py`
- `pyproject.toml`
- `uv.lock`
- `README.md`
- `PLAN.md`
- `PROGRESS.md`
- `docs/WORKARCHIVE.md`
- `docs/V1IMPLEMENTATIONPLAN.md`

Verification:

- `uv run pytest tests\test_ui_runner.py`: passed 16 tests.
- `uv run pytest`: passed 188 tests with 1 optional skip.
- `uv run --group ui-build pyinstaller --clean --onefile --windowed --name pdf2md-ui src\pdf2md_ui\app.py`: passed.
- `Test-Path dist\pdf2md-ui.exe`: returned `True`.
- `uv run pdf2md doctor`: returned WARN only for the documented GTX 1070 Ti/Pascal compatibility risk.
- Launch smoke for `dist\pdf2md-ui.exe`: process started and was then terminated by the smoke script.

Follow-up refresh on 2026-05-12:

- Updated the UI command builder and form controls for the Sprint 15 `--mineru-profile auto|safe|performance` CLI option.
- Rebuilt `dist\pdf2md-ui.exe` after Sprint 16 simplified output layout and Sprint 15 profile changes.
- `uv run pytest tests\test_ui_runner.py`: passed 17 tests.
- Launch smoke for the rebuilt `dist\pdf2md-ui.exe`: process started and was then terminated by the smoke script.

Known failure:

- A CLI conversion smoke using `samples\FourNodeQuadrilateralShellElementMITC4.pdf` and the same command shape used by the UI did not finish within the 15-minute timeout. The spawned process tree was terminated with `taskkill`.

Residual risk:

- A hands-on UI Doctor click and UI conversion click should still be run when the local MinerU runtime is expected to complete within an acceptable time.