# UI Folder Batch Conversion Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Add a minimal UI folder workflow that converts every direct-child PDF in a selected folder by sequentially invoking the existing `pdf2md convert` CLI.

**Architecture:** Keep the converter and CLI unchanged. Add deterministic folder discovery and batch command construction to `src/pdf2md_ui/runner.py`, then make `src/pdf2md_ui/app.py` run a list of `CommandSpec` objects sequentially on the existing worker-thread/event-queue pattern.

**Tech Stack:** Python 3.12, tkinter/ttk, pytest, PyInstaller, existing `pdf2md_ui.runner` subprocess wrapper.

---

### Task 1: Runner Batch Helpers

**Files:**
- Modify: `tests/test_ui_runner.py`
- Modify: `src/pdf2md_ui/runner.py`

- [x] **Step 1: Write failing tests**

```python
from pathlib import Path

from pdf2md_ui.runner import list_direct_pdf_files


def test_list_direct_pdf_files_returns_sorted_direct_children_only(tmp_path: Path) -> None:
    # Two direct-child PDFs with mixed-case suffixes, created out of order.
    (tmp_path / "b.PDF").write_text("", encoding="utf-8")
    (tmp_path / "a.pdf").write_text("", encoding="utf-8")
    # A nested PDF and a non-PDF file must both be ignored.
    nested = tmp_path / "nested"
    nested.mkdir()
    (nested / "c.pdf").write_text("", encoding="utf-8")
    (tmp_path / "notes.txt").write_text("", encoding="utf-8")

    assert [path.name for path in list_direct_pdf_files(tmp_path)] == ["a.pdf", "b.PDF"]
```

```python
from pathlib import Path

# Assumes ResolvedCommand is exported by pdf2md_ui.runner next to the new helper.
from pdf2md_ui.runner import ResolvedCommand, build_batch_convert_commands


def test_build_batch_convert_commands_reuses_convert_options(tmp_path: Path) -> None:
    resolved = ResolvedCommand(("pdf2md",), cwd=None, source="path")
    pdfs = [tmp_path / "a.pdf", tmp_path / "b.pdf"]

    commands = build_batch_convert_commands(
        resolved,
        pdfs,
        tmp_path / "out",
        overwrite=True,
        keep_raw=True,
        chunk_pages=5,
        gpu="auto",
        mineru_profile="safe",
    )

    # args[2] is expected to be the PDF path, right after the `pdf2md convert` prefix.
    assert [command.args[2] for command in commands] == [str(pdfs[0]), str(pdfs[1])]
    assert all("--chunk-pages" in command.args for command in commands)
    assert all("--mineru-profile" in command.args for command in commands)
```

- [x] **Step 2: Run tests to verify RED**

Run: `uv run pytest tests/test_ui_runner.py::test_list_direct_pdf_files_returns_sorted_direct_children_only tests/test_ui_runner.py::test_build_batch_convert_commands_reuses_convert_options -q`

Expected: both tests FAIL because `list_direct_pdf_files` and `build_batch_convert_commands` are not defined yet.

- [x] **Step 3: Implement minimal runner helpers**

Add `list_direct_pdf_files(folder)` using `Path.iterdir()` and case-insensitive `.pdf` suffix matching. Add `build_batch_convert_commands()` that loops over the provided PDF paths and delegates to `build_convert_command()`.

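The two helpers described in Step 3 could look roughly like the sketch below. It is a minimal illustration, not the project's actual code: `ResolvedCommand`, `CommandSpec`, and `build_convert_command` are simplified stand-ins (the field name `argv`, the stand-in signature, and the sort key are assumptions), and the stand-in builder emits only the two flags that the tests in Step 1 check for.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path
from typing import Any, Iterable


@dataclass(frozen=True)
class ResolvedCommand:
    """Stand-in: only the leading command tuple matters here (field name assumed)."""

    argv: tuple[str, ...]
    cwd: Path | None = None
    source: str = "path"


@dataclass(frozen=True)
class CommandSpec:
    """Stand-in: the real class may carry more than `args`."""

    args: tuple[str, ...]


def build_convert_command(
    resolved: ResolvedCommand, pdf: Path, output_dir: Path, **options: Any
) -> CommandSpec:
    """Placeholder for the existing single-file builder; the real signature may differ."""
    args = [*resolved.argv, "convert", str(pdf)]
    # Only the two flags named in the plan's tests are emitted. output_dir and
    # the remaining options are accepted but ignored by this placeholder.
    if "chunk_pages" in options:
        args += ["--chunk-pages", str(options["chunk_pages"])]
    if "mineru_profile" in options:
        args += ["--mineru-profile", str(options["mineru_profile"])]
    return CommandSpec(tuple(args))


def list_direct_pdf_files(folder: Path) -> list[Path]:
    """Return the folder's direct-child PDF files in a deterministic order."""
    pdfs = [
        path
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() == ".pdf"
    ]
    # iterdir() order is arbitrary, so sort explicitly. The key (lowercased
    # name, then exact name as a tie-break) is an assumption of this sketch.
    return sorted(pdfs, key=lambda path: (path.name.lower(), path.name))


def build_batch_convert_commands(
    resolved: ResolvedCommand,
    pdf_paths: Iterable[Path],
    output_dir: Path,
    **convert_options: Any,
) -> list[CommandSpec]:
    """Build one convert command per PDF by delegating to the single-file builder."""
    return [
        build_convert_command(resolved, pdf, output_dir, **convert_options)
        for pdf in pdf_paths
    ]
```

Delegating to the single-file builder keeps option handling in one place, so a folder run cannot drift from what the single-file `Convert` action passes to the CLI.
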
- [x] **Step 4: Run tests to verify GREEN**

Run: `uv run pytest tests/test_ui_runner.py -q`

Expected: all UI runner tests pass.

### Task 2: Tk UI Batch Execution

**Files:**
- Modify: `src/pdf2md_ui/app.py`

- [x] **Step 1: Add folder state and controls**

Add `input_folder_var`, a path row labeled `Input folder`, and a `Convert folder` button beside the existing action buttons.

- [x] **Step 2: Add batch command startup**

Implement `_choose_folder()`, `_run_folder_convert()`, and `_start_command_sequence()`. `_run_folder_convert()` validates the folder and output directory, parses `chunk_pages`, builds commands through the runner helper, and starts the sequence.

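The validation half of `_run_folder_convert()` can be kept as a pure function so it is testable without a Tk window. The sketch below is one possible shape under stated assumptions: the function name `validate_folder_inputs`, the error messages, the rule that `chunk_pages` must be a positive integer, and the choice not to require that the output directory already exists are all hypothetical and not taken from the plan.

```python
from __future__ import annotations

from pathlib import Path


def validate_folder_inputs(
    folder_text: str, output_text: str, chunk_pages_text: str
) -> tuple[Path, Path, int]:
    """Validate raw entry-field text and return typed values, or raise ValueError."""
    folder_text = folder_text.strip()
    if not folder_text:
        raise ValueError("Select an input folder.")
    folder = Path(folder_text)
    if not folder.is_dir():
        raise ValueError(f"Input folder does not exist: {folder}")

    output_text = output_text.strip()
    if not output_text:
        raise ValueError("Select an output directory.")

    try:
        chunk_pages = int(chunk_pages_text.strip())
    except ValueError:
        raise ValueError("Chunk pages must be a whole number.") from None
    # Assumption of this sketch: zero or negative chunk sizes are rejected.
    if chunk_pages < 1:
        raise ValueError("Chunk pages must be at least 1.")

    return folder, Path(output_text), chunk_pages
```

In the app, the method would read the three Tk variables, call this function, show any `ValueError` message in the log or a message box, and only then build the batch commands and start the sequence.
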
- [x] **Step 3: Add sequential worker behavior**

Run each command synchronously on the worker thread. Emit a log message before each file starts. Stop after the first non-zero exit code. If Cancel is requested, terminate the active command and do not start later commands.

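The sequential behavior in Step 3 can be illustrated with a small, self-contained runner. This is a sketch under assumptions rather than the app's real code: the class name `SequenceRunner`, the `("log", ...)` and `("done", ...)` event tuples, and the use of plain argument lists in place of `CommandSpec` objects are all hypothetical.

```python
from __future__ import annotations

import queue
import subprocess
import sys
import threading
from typing import Sequence


class SequenceRunner:
    """Run commands one at a time on a worker thread and report through a queue."""

    def __init__(self, commands: Sequence[Sequence[str]]) -> None:
        self.commands = [list(command) for command in commands]
        self.events: queue.Queue = queue.Queue()
        self._cancel = threading.Event()
        self._lock = threading.Lock()
        self._active: subprocess.Popen | None = None

    def start(self) -> threading.Thread:
        thread = threading.Thread(target=self._run, daemon=True)
        thread.start()
        return thread

    def cancel(self) -> None:
        """Request cancellation and terminate the active command, if any."""
        self._cancel.set()
        with self._lock:
            if self._active is not None and self._active.poll() is None:
                self._active.terminate()

    def _run(self) -> None:
        exit_code = 0
        total = len(self.commands)
        for index, command in enumerate(self.commands, start=1):
            if self._cancel.is_set():
                break
            # Log before each command starts, as the step requires.
            self.events.put(("log", f"[{index}/{total}] starting: {' '.join(command)}"))
            with self._lock:
                # Re-check under the lock so a cancel cannot slip in between
                # the check above and the process actually being spawned.
                if self._cancel.is_set():
                    break
                process = subprocess.Popen(
                    command,
                    stdout=subprocess.PIPE,
                    stderr=subprocess.STDOUT,
                    text=True,
                )
                self._active = process
            assert process.stdout is not None
            for line in process.stdout:
                self.events.put(("log", line.rstrip()))
            exit_code = process.wait()
            with self._lock:
                self._active = None
            if exit_code != 0:
                break  # Stop after the first failing command.
        self.events.put(
            ("done", {"exit_code": exit_code, "cancelled": self._cancel.is_set()})
        )


if __name__ == "__main__":
    demo = SequenceRunner([[sys.executable, "-c", "print('hello')"]])
    demo.start().join()
    while not demo.events.empty():
        print(demo.events.get())
```

In a Tk app, the main thread would typically drain `events` from a callback scheduled with the widget `after()` method, so that the worker thread never touches widgets directly.
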
- [x] **Step 4: Run focused tests**

Run: `uv run pytest tests/test_ui_runner.py -q`

Expected: all UI runner tests pass, and test collection imports the UI app without syntax errors.

### Task 3: Build and Handoff

**Files:**
- Modify: `PROGRESS.md`
- Generated ignored output: `dist/pdf2md-ui.exe`

- [x] **Step 1: Rebuild the UI executable**

Run: `uv run --group ui-build pyinstaller --clean --onefile --windowed --name pdf2md-ui src\pdf2md_ui\app.py`

Expected: exit code 0 and `dist\pdf2md-ui.exe` exists.

- [x] **Step 2: Update progress**

Record the new UI folder batch feature and verification commands in `PROGRESS.md`.

- [x] **Step 3: Check and commit**

Run: `git diff --check`, `git status --short`, then commit only the scoped source, test, and documentation changes.