PDFToMD/docs/superpowers/specs/2026-05-13-ui-folder-batch-conversion-design.md
2026-05-14 10:16:59 +09:00

UI Folder Batch Conversion Design

Goal

Add a minimal UI workflow that lets the user select one folder and convert every PDF directly inside that folder to Markdown.

Scope

  • Include only *.pdf files directly under the selected folder.
  • Exclude PDFs in nested folders.
  • Reuse the existing pdf2md convert CLI command for each PDF.
  • Keep conversion sequential to avoid GPU and MinerU runtime contention.
  • Apply the existing UI conversion options to every PDF in the batch: output directory, overwrite, keep raw, grouped pages, GPU, and MinerU profile.

Design

The runner layer owns folder discovery and batch command construction. It will expose a small helper that returns direct-child PDF paths in deterministic name order and another helper that builds one fixed-argument CommandSpec per PDF by calling the existing build_convert_command().
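The two runner helpers described above might look roughly like the following. This is a minimal sketch, not the project's code: the names `find_direct_pdfs` and `build_batch_commands` are hypothetical, the case-insensitive `.pdf` match is an assumption, and `build_one` is a stand-in for a wrapper around the existing build_convert_command(), whose real signature is not shown in this document.

```python
from pathlib import Path
from typing import Callable, List, TypeVar

SpecT = TypeVar("SpecT")  # stands in for the project's CommandSpec type


def find_direct_pdfs(folder: Path) -> List[Path]:
    """Return PDFs that sit directly inside `folder`, sorted by file name."""
    pdfs = [
        entry
        for entry in folder.iterdir()  # iterdir() never descends into subfolders
        # Assumption: ".PDF" and ".pdf" are both accepted.
        if entry.is_file() and entry.suffix.lower() == ".pdf"
    ]
    return sorted(pdfs, key=lambda p: p.name)


def build_batch_commands(
    pdfs: List[Path],
    build_one: Callable[[Path], SpecT],
) -> List[SpecT]:
    """Build one command spec per PDF, preserving the input order.

    `build_one` is hypothetical: in the project it would call the existing
    build_convert_command() with the options currently set in the UI.
    """
    return [build_one(pdf) for pdf in pdfs]
```

Passing the per-PDF builder in as a callable keeps the batch helper free of option handling, so every PDF in the batch gets exactly the options the single-file path would have used.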

The Tk UI adds an input-folder row and a folder-convert button. When the user starts folder conversion, the UI validates the selected folder, builds the command list, and runs the commands one at a time using the existing worker-thread pattern. It logs each PDF before it starts, stops the batch on the first non-zero exit code, and honors Cancel by terminating the currently running process and not starting any later PDFs.
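The sequential loop with stop-on-failure and Cancel handling could be sketched as follows. The assumptions here are significant: plain argument lists stand in for CommandSpec, a `threading.Event` stands in for however the UI signals Cancel, and `run_sequentially` is a hypothetical name. The function is meant to be called from a worker thread so the Tk main loop stays responsive.

```python
import subprocess
import threading
from typing import Callable, Sequence, Tuple


def run_sequentially(
    commands: Sequence[Sequence[str]],
    cancel: threading.Event,
    log: Callable[[str], None] = print,
) -> Tuple[str, int]:
    """Run commands one at a time.

    Returns (status, completed) where status is "done", "failed", or
    "cancelled" and completed counts commands that exited with code 0.
    """
    completed = 0
    for index, argv in enumerate(commands, start=1):
        if cancel.is_set():
            log("Cancelled; later PDFs were not started.")
            return ("cancelled", completed)
        log(f"[{index}/{len(commands)}] {' '.join(argv)}")
        proc = subprocess.Popen(list(argv))
        # Poll with a short timeout so a Cancel request can interrupt
        # a long-running conversion.
        while True:
            try:
                code = proc.wait(timeout=0.1)
                break
            except subprocess.TimeoutExpired:
                if cancel.is_set():
                    proc.terminate()
                    proc.wait()
                    log("Cancelled; terminated the running conversion.")
                    return ("cancelled", completed)
        if code != 0:
            log(f"Stopped on exit code {code}.")
            return ("failed", completed)
        completed += 1
    return ("done", completed)
```

Only one child process exists at any time, which is what keeps GPU and MinerU runtime usage from overlapping between PDFs.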

Non-Goals

  • No recursive folder conversion.
  • No parallel conversion.
  • No new CLI command.
  • No direct MinerU invocation from the UI.
  • No remote/API options or arbitrary shell command execution.

Verification

  • Add focused runner tests for direct-child PDF discovery, nested PDF exclusion, deterministic ordering, and batch command construction.
  • Run uv run pytest tests/test_ui_runner.py.
  • Rebuild the UI executable with PyInstaller and confirm dist/pdf2md-ui.exe exists.
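A focused discovery test might look like the sketch below. It uses pytest's built-in `tmp_path` fixture; `find_direct_pdfs` is a hypothetical stand-in defined inline so the example is self-contained, whereas the real test would import the helper from the runner module.

```python
from pathlib import Path
from typing import List


def find_direct_pdfs(folder: Path) -> List[Path]:
    # Stand-in for the runner helper under test (hypothetical name).
    return sorted(
        (p for p in folder.iterdir() if p.is_file() and p.suffix.lower() == ".pdf"),
        key=lambda p: p.name,
    )


def test_discovery_is_direct_and_ordered(tmp_path: Path) -> None:
    # Create files out of order to exercise the deterministic sort.
    (tmp_path / "b.pdf").write_bytes(b"")
    (tmp_path / "a.pdf").write_bytes(b"")
    (tmp_path / "notes.txt").write_text("not a pdf")
    # A PDF in a nested folder must be excluded.
    nested = tmp_path / "nested"
    nested.mkdir()
    (nested / "c.pdf").write_bytes(b"")

    found = find_direct_pdfs(tmp_path)

    assert [p.name for p in found] == ["a.pdf", "b.pdf"]
```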