UI Folder Batch Conversion Design
Goal
Add a minimal UI workflow that lets the user select one folder and convert every PDF directly inside that folder to Markdown.
Scope
- Include only `*.pdf` files directly under the selected folder.
- Exclude PDFs in nested folders.
- Reuse the existing `pdf2md convert` CLI command for each PDF.
- Keep conversion sequential to avoid GPU and MinerU runtime contention.
- Apply the existing UI conversion options to every PDF in the batch: output directory, overwrite, keep raw, grouped pages, GPU, and MinerU profile.
Design
The runner layer owns folder discovery and batch command construction. It will expose a small helper that returns direct-child PDF paths in deterministic name order and another helper that builds one fixed-argument CommandSpec per PDF by calling the existing build_convert_command().
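The two runner helpers described above might look roughly like the following sketch. The helper names `list_direct_pdfs` and `build_batch_commands` are hypothetical, and the `CommandSpec` and `build_convert_command()` definitions here are simplified stand-ins so the example runs on its own; the real ones in the project may have different fields and parameters. Matching the `.pdf` suffix case-insensitively is also an assumption.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Sequence


@dataclass(frozen=True)
class CommandSpec:
    """Stand-in for the project's CommandSpec; the real fields may differ."""

    argv: tuple


def build_convert_command(pdf: Path, extra_args: Sequence[str] = ()) -> CommandSpec:
    """Stand-in for the existing builder; its real signature is not shown here.

    extra_args is a placeholder for whatever the UI options translate to.
    """
    return CommandSpec(argv=("pdf2md", "convert", str(pdf), *extra_args))


def list_direct_pdfs(folder: Path) -> list:
    """Return PDF files directly inside folder, sorted by file name.

    iterdir() does not descend into subfolders, so nested PDFs are excluded.
    Directories that happen to end in .pdf are skipped by the is_file() check.
    """
    return sorted(
        (p for p in folder.iterdir() if p.is_file() and p.suffix.lower() == ".pdf"),
        key=lambda p: p.name,
    )


def build_batch_commands(folder: Path, extra_args: Sequence[str] = ()) -> list:
    """Build one CommandSpec per direct-child PDF, in discovery order."""
    return [build_convert_command(pdf, extra_args) for pdf in list_direct_pdfs(folder)]
```

Sorting by file name rather than relying on `iterdir()` order keeps the batch order stable across platforms, which also makes the log output and the tests predictable.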
The Tk UI adds an input-folder row and a folder-convert button. When the user starts folder conversion, the UI validates the selected folder, builds the command list, and runs the commands one at a time, following the existing worker-thread pattern. It logs each PDF before it starts, stops on the first non-zero exit code, and honors Cancel by terminating the currently running process and not starting any later PDFs.
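The sequential run loop could be sketched as below. This is an illustration under assumptions, not the existing worker code: `run_batch`, its `(status, exit_code)` return value, and the polling interval are all hypothetical, and each command is taken to be a plain argv list (for example, extracted from a CommandSpec). Cancel is modeled as a `threading.Event` that the UI thread sets.

```python
import subprocess
import threading
from typing import Callable, Optional, Sequence, Tuple


def run_batch(
    commands: Sequence[Sequence[str]],
    cancel: threading.Event,
    log: Callable[[str], None] = print,
    poll_seconds: float = 0.1,
) -> Tuple[str, Optional[int]]:
    """Run commands one at a time; intended to be called from a worker thread.

    Returns ("ok", None), ("failed", exit_code), or ("cancelled", None).
    """
    total = len(commands)
    for index, argv in enumerate(commands, start=1):
        # Honor Cancel between PDFs: do not start any later command.
        if cancel.is_set():
            log("Cancelled before starting the next PDF.")
            return ("cancelled", None)

        # Log each command before it starts.
        log(f"[{index}/{total}] starting: {' '.join(argv)}")
        proc = subprocess.Popen(list(argv))

        # Poll so that a Cancel request is noticed while the process runs.
        while True:
            try:
                code = proc.wait(timeout=poll_seconds)
                break
            except subprocess.TimeoutExpired:
                if cancel.is_set():
                    proc.terminate()
                    proc.wait()
                    log("Cancelled; terminated the running conversion.")
                    return ("cancelled", None)

        # Stop the whole batch on the first non-zero exit code.
        if code != 0:
            log(f"Stopped: exit code {code}.")
            return ("failed", code)

    return ("ok", None)
```

In a Tk application the `log` callback would need to hand messages back to the main thread (for example through a queue polled with `after()`), since Tk widgets should not be updated from a worker thread.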
Non-Goals
- No recursive folder conversion.
- No parallel conversion.
- No new CLI command.
- No direct MinerU invocation from the UI.
- No remote/API options or arbitrary shell command execution.
Verification
- Add focused runner tests for direct-child PDF discovery, nested PDF exclusion, deterministic ordering, and batch command construction.
- Run `uv run pytest tests/test_ui_runner.py`.
- Rebuild the UI executable with PyInstaller and confirm `dist/pdf2md-ui.exe` exists.