# UI Folder Batch Conversion Design

## Goal

Add a minimal UI workflow that lets the user select one folder and convert every PDF directly inside that folder to Markdown.

## Scope

- Include only `*.pdf` files directly under the selected folder.
- Exclude PDFs in nested folders.
- Reuse the existing `pdf2md convert` CLI command for each PDF.
- Keep conversion sequential to avoid GPU and MinerU runtime contention.
- Apply the existing UI conversion options to every PDF in the batch: output directory, overwrite, keep raw, grouped pages, GPU, and MinerU profile.
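The direct-child rule above can be sketched as follows. This is a minimal illustration, not the project's actual helper: the function name `find_direct_child_pdfs` is hypothetical, and matching the `.pdf` suffix case-insensitively is an assumption that the scope list does not state.

```python
from pathlib import Path


def find_direct_child_pdfs(folder: Path) -> list[Path]:
    """Return PDFs directly inside `folder`, sorted by file name.

    Hypothetical helper name; the case-insensitive suffix check is an assumption.
    """
    # iterdir() yields only direct children, so nested folders are never searched.
    pdfs = [
        entry
        for entry in folder.iterdir()
        if entry.is_file() and entry.suffix.lower() == ".pdf"
    ]
    # Sorting by name keeps the batch order stable across runs and platforms.
    return sorted(pdfs, key=lambda path: path.name)
```

Using `iterdir()` rather than a recursive walk means nested PDFs are excluded by construction instead of being filtered out afterwards.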

## Design

The runner layer owns folder discovery and batch command construction. It will expose two small helpers: one returns the direct-child PDF paths in deterministic name order, and the other builds one fixed-argument `CommandSpec` per PDF by calling the existing `build_convert_command()`.

The Tk UI adds an input-folder row and a folder-convert button. When the user starts a folder conversion, the UI validates the selected folder, builds the command list, and runs the commands one at a time using the existing worker-thread pattern. It logs each PDF before its conversion starts, stops at the first non-zero exit code, and honors Cancel by terminating the currently running process and not starting any later PDFs.
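The sequential run, stop-on-failure, and Cancel behavior described above could look roughly like the sketch below. It is an illustration under assumptions: plain argument lists stand in for `CommandSpec`, and `run_batch`, `CANCELLED`, the `cancel_event` parameter, and the `log` callback are hypothetical names rather than the existing worker code.

```python
import subprocess
import threading
from collections.abc import Callable, Sequence

# Hypothetical sentinel returned when the batch is cancelled.
CANCELLED = -1


def run_batch(
    commands: Sequence[Sequence[str]],
    cancel_event: threading.Event,
    log: Callable[[str], None],
) -> int:
    """Run commands one at a time; return 0, the first non-zero exit code, or CANCELLED."""
    for index, argv in enumerate(commands, start=1):
        if cancel_event.is_set():
            log("Cancelled; remaining PDFs were not started.")
            return CANCELLED
        # Log each command before it starts.
        log(f"[{index}/{len(commands)}] {' '.join(argv)}")
        process = subprocess.Popen(list(argv))
        while True:
            try:
                # Wait in short slices so a cancel request is noticed promptly.
                returncode = process.wait(timeout=0.1)
                break
            except subprocess.TimeoutExpired:
                if cancel_event.is_set():
                    process.terminate()
                    process.wait()
                    log("Cancelled; running conversion was terminated.")
                    return CANCELLED
        if returncode != 0:
            log(f"Stopped after exit code {returncode}; later PDFs were not started.")
            return returncode
    return 0
```

In a Tk application this function would run on a worker thread, with the Cancel button calling `cancel_event.set()` and the `log` callback handing messages back to the UI thread.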

## Non-Goals

- No recursive folder conversion.
- No parallel conversion.
- No new CLI command.
- No direct MinerU invocation from the UI.
- No remote/API options or arbitrary shell command execution.

## Verification

- Add focused runner tests for direct-child PDF discovery, nested PDF exclusion, deterministic ordering, and batch command construction.
- Run `uv run pytest tests/test_ui_runner.py`.
- Rebuild the UI executable with PyInstaller and confirm `dist/pdf2md-ui.exe` exists.
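The discovery tests in the first bullet might be shaped like the sketch below. It is only a sketch: `find_direct_child_pdfs` is a local stand-in with a hypothetical name, where the real test module would import the helper from the runner layer, and `make_tree` is an illustrative fixture helper. `tmp_path` is pytest's built-in temporary-directory fixture.

```python
from pathlib import Path


def find_direct_child_pdfs(folder: Path) -> list[Path]:
    # Stand-in for the runner helper under test (hypothetical name).
    # glob("*.pdf") is non-recursive, so nested folders are not searched.
    return sorted(p for p in folder.glob("*.pdf") if p.is_file())


def make_tree(root: Path) -> None:
    # Two direct-child PDFs created out of order, one non-PDF, one nested PDF.
    (root / "b.pdf").write_bytes(b"")
    (root / "a.pdf").write_bytes(b"")
    (root / "notes.txt").write_text("not a pdf")
    nested = root / "nested"
    nested.mkdir()
    (nested / "c.pdf").write_bytes(b"")


def test_discovery_returns_direct_child_pdfs_in_name_order(tmp_path: Path) -> None:
    make_tree(tmp_path)
    found = find_direct_child_pdfs(tmp_path)
    assert [p.name for p in found] == ["a.pdf", "b.pdf"]


def test_discovery_excludes_nested_pdfs(tmp_path: Path) -> None:
    make_tree(tmp_path)
    assert all(p.parent == tmp_path for p in find_direct_child_pdfs(tmp_path))
```

Creating the files out of name order makes the ordering assertion meaningful, since a helper that returned files in creation order would fail it.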