modify pdftomd

2026-05-14 10:16:59 +09:00
parent 2232b51fc9
commit dc11880140
69 changed files with 7784 additions and 1150 deletions
@@ -0,0 +1,33 @@
+# UI Folder Batch Conversion Design
+
+## Goal
+
+Add a minimal UI workflow that lets the user select one folder and convert every PDF directly inside that folder to Markdown.
+
+## Scope
+
+- Include only `*.pdf` files directly under the selected folder.
+- Exclude PDFs in nested folders.
+- Reuse the existing `pdf2md convert` CLI command for each PDF.
+- Keep conversion sequential to avoid GPU and MinerU runtime contention.
+- Apply the existing UI conversion options to every PDF in the batch: output directory, overwrite, keep raw, grouped pages, GPU, and MinerU profile.
+
+## Design
+
+The runner layer owns folder discovery and batch command construction. It will expose two small helpers: one returns the direct-child PDF paths in deterministic name order, and the other builds one fixed-argument `CommandSpec` per PDF by calling the existing `build_convert_command()`.
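
A minimal sketch of the two runner helpers described above, under stated assumptions. The helper names `find_direct_child_pdfs` and `build_folder_convert_commands` are hypothetical, the `CommandSpec` class is a stand-in because its real fields are not shown here, and `build_one` stands in for `build_convert_command()` with the UI options already bound, since that function's signature is also not shown.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable


@dataclass(frozen=True)
class CommandSpec:
    """Stand-in for the project's CommandSpec (real fields are an assumption)."""

    args: tuple[str, ...]


def find_direct_child_pdfs(folder: Path) -> list[Path]:
    """Return the PDFs directly inside `folder`, sorted by file name.

    `Path.glob("*.pdf")` does not descend into subfolders, so nested PDFs
    are excluded. Directories that happen to end in `.pdf` are filtered out.
    """
    return sorted(
        (p for p in folder.glob("*.pdf") if p.is_file()),
        key=lambda p: p.name,
    )


def build_folder_convert_commands(
    folder: Path,
    build_one: Callable[[Path], CommandSpec],
) -> list[CommandSpec]:
    """Build one command per direct-child PDF, in discovery order.

    `build_one` is a hypothetical wrapper around the existing
    build_convert_command() with the shared UI options already applied.
    """
    return [build_one(pdf) for pdf in find_direct_child_pdfs(folder)]
```

Passing the per-PDF builder in as a callable keeps this sketch independent of the real option set; the actual helper may instead accept the options directly and call `build_convert_command()` itself.
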
+
+The Tk UI adds an input-folder row and a folder-convert button. When the user starts a folder conversion, the UI validates the selected folder, builds the command list, and runs the commands one at a time using the existing worker-thread pattern. It logs each PDF before its conversion starts and stops on the first non-zero exit code. Cancel terminates the currently running process and prevents any later PDFs from starting.
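
The sequential run, stop-on-failure, and Cancel behavior described above could look roughly like the following. This is a sketch, not the project's worker code: `SequentialRunner`, `CANCELLED`, and the plain argument lists are hypothetical, and the real UI presumably runs `CommandSpec` objects through its existing worker-thread pattern.

```python
import subprocess
import sys
import threading
from typing import Callable, Optional, Sequence

CANCELLED = -1  # hypothetical sentinel returned when the batch is cancelled


class SequentialRunner:
    """Run commands one at a time; stop on first failure or on cancel."""

    def __init__(self, log: Callable[[str], None] = print) -> None:
        self._log = log
        self._cancel = threading.Event()
        self._lock = threading.Lock()
        self._current: Optional[subprocess.Popen] = None

    def cancel(self) -> None:
        """Terminate the running process and block later commands."""
        self._cancel.set()
        with self._lock:
            if self._current is not None and self._current.poll() is None:
                self._current.terminate()

    def run(self, commands: Sequence[Sequence[str]]) -> int:
        """Return 0 on success, the first non-zero exit code, or CANCELLED."""
        for argv in commands:
            with self._lock:
                # Check under the lock so a cancel cannot slip in between
                # the check and the process start.
                if self._cancel.is_set():
                    self._log("Cancelled; remaining PDFs skipped.")
                    return CANCELLED
                self._log(f"Starting: {' '.join(argv)}")
                proc = subprocess.Popen(list(argv))
                self._current = proc
            code = proc.wait()
            with self._lock:
                self._current = None
            if self._cancel.is_set():
                self._log("Cancelled; remaining PDFs skipped.")
                return CANCELLED
            if code != 0:
                self._log(f"Stopped: exit code {code}")
                return code
        return 0
```

In a Tk application, `run` would be called from a background thread, for example `threading.Thread(target=runner.run, args=(commands,), daemon=True).start()`, with the Cancel button wired to `runner.cancel()`, so the UI event loop is never blocked.
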
+
+## Non-Goals
+
+- No recursive folder conversion.
+- No parallel conversion.
+- No new CLI command.
+- No direct MinerU invocation from the UI.
+- No remote/API options or arbitrary shell command execution.
+
+## Verification
+
+- Add focused runner tests for direct-child PDF discovery, nested PDF exclusion, deterministic ordering, and batch command construction.
+- Run `uv run pytest tests/test_ui_runner.py`.
+- Rebuild the UI executable with PyInstaller and confirm `dist/pdf2md-ui.exe` exists.
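
As an illustration of the first verification item, a focused discovery test might be shaped like this. The helper defined here is a hypothetical stand-in so the sketch runs on its own; the real test in `tests/test_ui_runner.py` would import the runner helper instead, and its name may differ.

```python
from pathlib import Path


def find_direct_child_pdfs(folder: Path) -> list[Path]:
    # Hypothetical stand-in for the runner helper under test.
    return sorted(
        (p for p in folder.glob("*.pdf") if p.is_file()),
        key=lambda p: p.name,
    )


def test_discovery_is_direct_child_only_and_sorted(tmp_path: Path) -> None:
    # Create files out of order, plus a non-PDF and a nested PDF.
    for name in ("b.pdf", "a.pdf", "notes.txt"):
        (tmp_path / name).write_bytes(b"")
    nested = tmp_path / "nested"
    nested.mkdir()
    (nested / "c.pdf").write_bytes(b"")

    names = [p.name for p in find_direct_child_pdfs(tmp_path)]

    # Only direct-child PDFs, in name order; the nested PDF is excluded.
    assert names == ["a.pdf", "b.pdf"]
```
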