2026-05-14 10:16:59 +09:00

PLAN.md

This file is the shared work plan for agents. Read it before starting work, then update it when the plan changes.

Current Goal

Completed work through Sprint 16, the Sprint 16 SolidElement validation, and the UI direct-folder batch conversion are archived in docs/WORKARCHIVE.md. Sprint 17 offline installer planning has been abandoned and is retained only as historical context.

Active Constraints

  • Do not implement additional program code beyond the active user-approved sprint.
  • Keep MinerU 3.1.0 as the only conversion engine.
  • Keep processing local-only.
  • Target Python 3.12.
  • Target GPU: GTX 1070 Ti 8GB.
  • Default conversion device: cuda:0.
  • Default MinerU profile: auto.
  • Run MinerU through direct local CLI execution only.
  • UI code must invoke the existing project-owned pdf2md CLI; it must not call MinerU directly.
  • The current UI executable is a thin launcher for the installed local runtime, not a self-contained bundle of MinerU, PyTorch, CUDA, local models, Node.js, or MathJax.
  • UI subprocess calls must use fixed argument lists with shell=False and must not expose arbitrary command execution.
  • UI folder batch conversion must run direct-child PDFs sequentially through existing pdf2md convert commands.
  • On MinerU failure, report a clear error/warning and do not silently fall back.
  • Current conversions write simplified Markdown/report outputs with no persisted metadata JSON; internal provenance still feeds warnings and reports.
  • pdf2md recheck remains legacy-only for outputs that still have adjacent metadata JSON.
  • Do not commit generated installer payloads, wheelhouses, Python installers, model files, Node binaries, generated installer executables, build/, dist/, outputs/, or samples/.
  • Use samples/ only as local fixture context; do not commit sample files unless explicitly requested.
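The UI subprocess and folder batch constraints above can be sketched as follows. This is a minimal illustration, not the project's actual launcher code: the function names are hypothetical, and the positional PDF argument to pdf2md convert is an assumption, since this plan does not spell out the CLI signature.

```python
from pathlib import Path
import subprocess


def direct_child_pdfs(folder: Path) -> list[Path]:
    """Return PDFs directly inside folder (no recursion), sorted by path."""
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


def build_convert_command(pdf: Path) -> list[str]:
    """Build a fixed argument list; the PDF path is a single argv item."""
    # Assumption: `pdf2md convert <pdf>` is the basic invocation shape.
    return ["pdf2md", "convert", str(pdf)]


def convert_folder(folder: Path) -> list[tuple[Path, int]]:
    """Run one convert command per direct-child PDF, one after another."""
    results = []
    for pdf in direct_child_pdfs(folder):
        # shell=False with a list means no shell ever parses the file name.
        completed = subprocess.run(
            build_convert_command(pdf), shell=False, check=False
        )
        results.append((pdf, completed.returncode))
    return results
```

Collecting return codes instead of raising on the first failure lets the caller report each failed PDF clearly without hiding the rest of the batch.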

Active References

  • Product requirements: PRD.md.
  • System design: ARCHITECTURE.md.
  • Agent workflow: AGENTS.md.
  • Current implementation sequence: docs/V1IMPLEMENTATIONPLAN.md.
  • Completed work archive: docs/WORKARCHIVE.md.
  • Release gates: docs/V1RELEASECHECKLIST.md.
  • Completed UI folder batch design and plan: docs/superpowers/specs/2026-05-13-ui-folder-batch-conversion-design.md and docs/superpowers/plans/2026-05-13-ui-folder-batch-conversion.md.
  • Abandoned Sprint 17 historical plan: docs/Sprints/SPRINT17CONTRACT.md and docs/superpowers/plans/2026-05-12-offline-installer.md.

Planned Work

  1. Keep completed sprint details out of PROGRESS.md; use docs/WORKARCHIVE.md and docs/Sprints/*.md for history.
  2. Preserve strict-local runtime behavior: use local model paths, direct CLI execution, and no user-specified API or remote backend.
  3. When practical, run hands-on UI smoke from dist\pdf2md-ui.exe: Doctor, then one small local conversion to ignored outputs/.
  4. On a stronger NVIDIA GPU PC, run uv run pdf2md doctor and one optional local conversion with --gpu auto --mineru-profile auto to validate the auto profile.
  5. Decide in a future sprint whether simplified outputs need metadata-free pdf2md recheck; current behavior intentionally remains legacy-only.
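Item 2 can be guarded with a small pre-flight check on the argument list before any subprocess is started. The sketch below is hypothetical: the function and constant names are illustrative, and it covers only --api-url, the one forbidden flag this plan names. Router mode and remote backends would need their own checks once their concrete options are known.

```python
# Only --api-url is named in this plan; other forbidden modes are not covered.
FORBIDDEN_FLAGS = ("--api-url",)


def assert_strict_local(args: list[str]) -> None:
    """Raise ValueError if any argument selects a forbidden remote option."""
    for arg in args:
        for flag in FORBIDDEN_FLAGS:
            # Catch both "--api-url URL" and "--api-url=URL" spellings.
            if arg == flag or arg.startswith(flag + "="):
                raise ValueError(f"strict-local mode forbids {flag}")
```

Raising instead of dropping the flag keeps the behavior consistent with the no-silent-fallback rule.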

Completed Work References

  • Completed sprint outcomes through Sprint 16 are summarized in docs/WORKARCHIVE.md.
  • Detailed historical contracts remain under docs/Sprints/SPRINT0CONTRACT.md through docs/Sprints/SPRINT16CONTRACT.md.
  • UI direct-folder batch conversion is archived in docs/WORKARCHIVE.md; its design and execution plan live under docs/superpowers/.
  • Abandoned Sprint 17 offline installer planning is archived in docs/WORKARCHIVE.md and must not be treated as active planned work.
  • Historical verification results and sample conversion evidence live in docs/WORKARCHIVE.md.

Open Questions

  • Whether metadata-free pdf2md recheck should be designed for simplified outputs.
  • Whether a stronger NVIDIA GPU PC changes the default practical MinerU profile recommendation after real conversion validation.

Decisions

  • Use PLAN.md for intended work and ownership.
  • Use PROGRESS.md for current status, blockers, and next actions.
  • Use docs/WORKARCHIVE.md for archived completed work, historical verification, runtime setup evidence, and sample conversion evidence.
  • MinerU default local CLI execution is the only v1 execution mode.
  • MinerU 3.1.0 may launch a temporary local mineru-api internally when mineru CLI runs without --api-url.
  • Strict-local mode forbids --api-url, remote APIs, router mode, HTTP client backends, and remote OpenAI-compatible backends.
  • No silent fallback after MinerU failure.
  • Current conversion output uses <stem>/<stem>_001.md, shared <stem>/images/, and one <stem>/<stem>_report.md; new conversions do not persist metadata JSON.
  • Local MathJax render checking is optional and nonfatal; missing Node.js or MathJax must produce a clear warning instead of blocking conversion.
  • Chunking remains opt-in through --chunk-pages; if the option is present without a value, each final grouped output covers 20 source pages.
  • In chunk mode, MinerU receives one source page per run and final Markdown parts are grouped by chunk_pages.
  • --gpu auto selects the visible NVIDIA GPU that reports the most VRAM in local nvidia-smi output.
  • --mineru-profile auto is the default and stays conservative on GTX 1070 Ti 8GB, low-VRAM, and pre-Turing GPUs.
  • The UI launcher can convert a direct folder by running one existing pdf2md convert command per direct-child PDF sequentially.
  • Sprint 17 offline installer planning is abandoned. Do not implement or extend offline installer work unless the user explicitly reopens that direction.