# V1 Implementation Plan: Local PDF-to-Markdown Converter Last updated: 2026-05-13 This document tracks the current v1 implementation state and open future decisions. It does not replace `PRD.md` or `ARCHITECTURE.md`; use those files as the source of product requirements and system design. Completed sprint details are archived in `docs/WORKARCHIVE.md`, and detailed acceptance criteria remain in `docs/Sprints/*.md`. ## 1. Current V1 State The core v1 converter is implemented through Sprint 16. The implemented system includes: - Python 3.12 package and `pdf2md` CLI. - Direct local MinerU 3.1.0 CLI adapter with strict-local enforcement. - Obsidian-friendly Markdown normalization. - Internal provenance, structured warnings, quality checks, and one human-readable report. - `pdf2md doctor`. - Optional grouped page conversion through `--chunk-pages`. - Local MathJax render checking and conservative failed-span repair. - pypdf-based text fidelity diagnostics. - NVIDIA GPU inventory, `--gpu auto`, and `--mineru-profile auto|safe|performance`. - Simplified output layout: `//_001.md`, shared `//images/`, and `//_report.md`. - No public metadata JSON for new conversions. - Minimal Windows UI launcher over the existing CLI, including direct-folder PDF batch conversion through sequential `pdf2md convert` subprocesses. Historical implementation evidence, verification commands, and sample conversion results are in `docs/WORKARCHIVE.md`. ## 2. V1 Outcome v1 is complete when a local user can run: ```bash uv run pdf2md doctor uv run pdf2md convert paper.pdf --out out uv run pdf2md convert pdfs --out out --recursive ``` and receive, for each PDF: - Obsidian-friendly Markdown parts under `//_001.md`, `_002.md`, and so on. - A stable shared image/media directory under `//images/`. - One human-readable report under `//_report.md`. - No persisted metadata JSON for new conversions. - Clear warnings when math, tables, assets, reading order, text fidelity, GPU availability, or MinerU execution are uncertain. Long PDFs can be chunked explicitly: ```bash uv run pdf2md convert paper.pdf --out out --chunk-pages uv run pdf2md convert paper.pdf --out out --chunk-pages 20 ``` When `--chunk-pages` is active, MinerU receives one-page temporary PDFs and final Markdown files are grouped by the configured page count. Temporary one-page PDFs and intermediate per-page outputs are deleted. The Windows UI launcher is a convenience wrapper over `pdf2md`; it is not a separate conversion pipeline. UI folder batch conversion runs direct-child PDFs sequentially through the same CLI conversion path. ## 3. Non-Negotiable Constraints - Python 3.12 and `uv`. - MinerU 3.1.0 is the only conversion engine. - Direct local MinerU CLI execution only. - MinerU 3.1.0 may launch a temporary local `mineru-api` internally when CLI runs without `--api-url`. - No cloud OCR, hosted LLM/VLM, remote document parser, `--api-url`, remote APIs, router mode, HTTP client backends, or remote OpenAI-compatible backends. - Target hardware: NVIDIA GTX 1070 Ti 8GB. - Digital PDFs with text layers are the v1 priority. - `samples/` is local fixture context and must not be committed unless explicitly requested. - UI launcher must invoke `pdf2md` or `uv run pdf2md`; it must not call MinerU directly or bundle the full conversion runtime. - Every substantial implementation chunk needs a sprint contract and independent evaluation. ## 4. Current Repository Layout ```text pyproject.toml README.md src/ pdf2md/ __init__.py cli.py conversion.py pdf_splitter.py paths.py mineru_adapter.py ir.py markdown.py metadata.py quality.py report.py doctor.py gpu.py mineru_profile.py math_render.py math_repair.py text_fidelity.py pdf2md_ui/ __init__.py app.py runner.py tests/ integration/ docs/ Sprints/ superpowers/ ``` Do not scaffold unused modules before a sprint needs them. ## 5. Active Next Sprint Status: - No active implementation sprint. Next implementation work should start from a new user-approved requirement and, if substantial, a new sprint contract. ## 6. Abandoned Planning ### Sprint 17: Offline Windows Installer Status: - Abandoned at the user's request on 2026-05-13. Historical references: - `docs/Sprints/SPRINT17CONTRACT.md`. - `docs/superpowers/plans/2026-05-12-offline-installer.md`. Do not implement or extend Sprint 17 unless the user explicitly reopens offline installer work. ## 7. Future Decisions - Decide whether simplified outputs need a metadata-free `pdf2md recheck`; current `recheck` remains legacy-only for outputs with adjacent metadata JSON. - Validate `--gpu auto --mineru-profile auto` on a stronger NVIDIA GPU PC. ## 8. Harness Operating Model Use the project long-running harness only for substantial implementation work. 1. `harness-planner-agent` turns the next user request into a sprint contract. 2. `evaluation-agent` reviews the contract before code changes start. 3. `feature-generator-agent` implements one approved contract at a time. 4. `feature-generator-agent` runs self-checks and records residual risks. 5. `evaluation-agent` independently verifies the result against the contract. 6. The parent agent updates `PROGRESS.md`, commits the completed change, and leaves a handoff. After a chunk is no longer active, archive completed-work details in `docs/WORKARCHIVE.md` and keep `PROGRESS.md` focused on current status, blockers, and next actions. ## 9. Completed Sprint Archive Completed sprint details have been moved out of this active implementation plan. - Summary and verification evidence: `docs/WORKARCHIVE.md`. - Detailed historical contracts: `docs/Sprints/SPRINT0CONTRACT.md` through `docs/Sprints/SPRINT16CONTRACT.md`. - UI folder batch design and execution record: `docs/superpowers/specs/2026-05-13-ui-folder-batch-conversion-design.md` and `docs/superpowers/plans/2026-05-13-ui-folder-batch-conversion.md`. - Abandoned Sprint 17 planning record: `docs/Sprints/SPRINT17CONTRACT.md` and `docs/superpowers/plans/2026-05-12-offline-installer.md`. Facts carried forward from completed work: - MinerU is fixed to version 3.1.0. - Direct local CLI command shape is `mineru -p -o `. - Python 3.12 is compatible with the pinned MinerU package range. - GTX 1070 Ti CUDA/PyTorch support needs explicit doctor validation. - Formula reconstruction remains best effort and must keep warnings/provenance visible. - MinerU/model license posture is acceptable for personal local use. Redistribution remains gated by license review.