add pdftomd

2026-05-08 16:42:19 +09:00
parent 551ab50735
commit 88d6b92283
99 changed files with 47332 additions and 0 deletions
@@ -0,0 +1,320 @@
+# Sprint 9 Contract: Local Fixture Evaluation And V1 Release Gate
+
+Status: Implemented
+Last updated: 2026-05-08
+
+## Objective
+
+Validate the v1 converter against local fixture workflows without committing sample PDFs or making the default test loop depend on MinerU models, GPU, CUDA, network access, Obsidian, or LaTeX tooling.
+
+Sprint 9 must establish:
+
+- A fast mocked integration suite that exercises the public conversion path end to end.
+- An optional, explicitly enabled local MinerU fixture evaluation path for `samples/`.
+- A fixture coverage manifest or checklist that records which local PDFs cover math, tables, figures/assets, reading order, Korean filenames, and metadata/report risks.
+- Release-gate documentation that distinguishes default automated checks from optional local MinerU/GPU checks.
+- Clear `PROGRESS.md` notes for local fixture coverage, skipped/blocked optional checks, known quality risks, and the v1 go/no-go recommendation.
+
+Sprint 9 is an evaluation and release-gate sprint. It may add tests, local-only evaluation helpers, fixture manifests, and narrow compatibility fixes only when needed to evaluate the current v1 behavior. It must not add alternate engines, cloud/API paths, runtime engine selection, or automatic model downloads.
+
+## Current Precondition
+
+Sprint 8 is complete:
+
+- `pdf2md doctor` exists and reports Python, `uv`, MinerU CLI/version, GPU, PyTorch, model/cache, and strict-local policy status.
+- Local `pdf2md doctor` currently fails because the `mineru` CLI is not installed on PATH.
+- `pdf2md convert` exists and writes Markdown, metadata JSON, and `<stem>.report.md`; this path is covered by fake-adapter tests.
+- Default tests pass without real MinerU, CUDA, GPU, model files, network, Obsidian, LaTeX tooling, or `samples/`.
+- `samples/` exists locally and is untracked. Observed local fixture files include:
+  - `samples/FourNodeQuadrilateralShellElementMITC4.pdf`
+  - `samples/MITC공부.pdf`
+  - `samples/2007쉘구조물의유한요소해석에대하여.pdf`
+  - `samples/유한요소해석법을이용한쉘구조물의동적좌굴해석.pdf`
+  - `samples/metadata.json`
+
+Sprint 9 must preserve the untracked status of `samples/` unless the user explicitly requests otherwise.
+
+## Touched Surfaces
+
+Allowed:
+
+- `tests/integration/`
+- `tests/test_conversion.py`
+- `tests/test_cli.py`
+- `tests/test_report.py`
+- `tests/test_metadata.py`
+- `tests/test_quality.py`
+- `tests/conftest.py` only for markers or opt-in fixture controls
+- `src/pdf2md/mineru_adapter.py` only for narrow compatibility fixes backed by mocked or optional local MinerU output evidence
+- `src/pdf2md/conversion.py` only for narrow release-gate defects found by integration tests
+- `src/pdf2md/quality.py` only for local quality metric defects found by integration tests
+- `src/pdf2md/report.py` only for report defects found by integration tests
+- `README.md`
+- `docs/V1RELEASECHECKLIST.md`
+- `docs/V1IMPLEMENTATIONPLAN.md`
+- `docs/Sprints/SPRINT9CONTRACT.md`
+- `PLAN.md`
+- `PROGRESS.md`
+
+Not allowed:
+
+- Committed files under `samples/`
+- Committed generated conversion outputs from local sample PDFs
+- Mandatory tests that require real MinerU, GPU, CUDA, PyTorch, model files, network, Obsidian, LaTeX tooling, or `samples/`
+- Automatic package installs or model downloads from tests, import time, doctor, convert, or helpers
+- Runtime engine selection or alternate conversion engines
+- Cloud OCR, remote LLM/VLM, hosted renderer, remote document parser, remote asset fetching, `--api-url`, router mode, HTTP client backends, remote APIs, or remote OpenAI-compatible backends
+- CLI/API options that disable strict-local policy
+- Claims that v1 perfectly reconstructs LaTeX, tables, or reading order
+
+## Expected Outputs
+
+1. Fast mocked integration suite
+   - Exercises `convert_pdf` and/or `pdf2md convert` with a fake MinerU adapter through the real orchestration path.
+   - Verifies Markdown, metadata JSON, and `<stem>.report.md` are all written.
+   - Verifies output paths, asset links, warning counts, and report status stay consistent.
+   - Verifies that failures produce metadata/report warnings when possible and do not silently fall back to another engine.
+   - Runs as part of `uv run pytest` without real MinerU, models, GPU, network, Obsidian, LaTeX tooling, or `samples/`.
+
+2. Optional local MinerU fixture evaluation
+   - Provides an explicit opt-in command or pytest marker/environment gate for real local MinerU sample evaluation.
+   - Skips or reports a clear local blocker when `pdf2md doctor` fails because MinerU, model/cache paths, or GPU/PyTorch acceleration are unavailable.
+   - Reads sample PDFs only from `samples/` or a user-provided local sample directory.
+   - Writes generated outputs to a temporary or ignored output directory, never to tracked fixture paths.
+   - Produces or records, for each attempted sample:
+     - source filename
+     - command run
+     - exit code
+     - generated Markdown path
+     - generated metadata JSON path
+     - generated `.report.md` path
+     - warning count
+     - math renderability or checker-unavailable count
+     - table fallback/degradation count when available
+     - missing or broken asset link count
+     - page coverage when available
+   - Does not mark optional evaluation as passed when MinerU is missing; it records the blocker.
+
+3. Fixture coverage manifest or checklist
+   - Maps local sample files to risk categories:
+     - simple digital PDF
+     - math-heavy PDF
+     - multi-column or complex reading order
+     - table with formulas
+     - figure/caption/assets
+     - Korean filename/path handling
+   - May store only relative sample names, categories, and notes; it must not embed sample PDFs or generated outputs.
+   - Records coverage gaps that need additional user-provided samples.
+
+4. V1 release checklist
+   - Defines default release gates:
+     - `uv sync`
+     - `uv run pytest`
+     - `uv run pdf2md --version`
+     - `uv run pdf2md doctor`
+     - `git diff --check`
+     - `git status --short --untracked-files=all`
+   - Defines optional local MinerU release gates separately from default gates.
+   - Requires Markdown, metadata JSON, and `.report.md` to exist before any sample conversion is considered successful.
+   - Requires warnings and residual risks to be recorded in `PROGRESS.md`.
+   - Makes local-only and no-sample-commit checks explicit.
+
+5. Documentation
+   - README or release checklist explains how to run default checks and optional local fixture checks.
+   - Documentation states that optional fixture checks may be skipped or blocked until MinerU 3.1.0 and model/cache setup are available.
+   - Documentation does not instruct users to use `--api-url`, router mode, HTTP client backends, remote APIs, or remote OpenAI-compatible backends.
+
+6. Handoff
+   - `PROGRESS.md` records changed files, commands run, tests passed or blocked, local fixture status, generated output location if any, known failures, residual risks, and next action.
+
+## Non-Goals
+
+- Do not install MinerU.
+- Do not download MinerU models.
+- Do not run model setup automatically.
+- Do not require the local GTX 1070 Ti to pass CUDA/PyTorch checks in the default test loop.
+- Do not improve OCR/model accuracy.
+- Do not introduce a manual review UI or web UI.
+- Do not add alternate conversion engines or fallback engines.
+- Do not benchmark against cloud OCR/API services.
+- Do not commit sample PDFs, sample-derived outputs, or large binary fixtures.
+- Do not make text edit distance the only quality criterion.
+- Do not claim v1 is release-ready if metadata JSON or `.report.md` generation is missing.
+
+## Work Packages
+
+### WP9.1: Fast Mocked Integration Checks
+
+Owner:
+
+- `feature-generator-agent`
+- `evaluation-agent`
+
+Actions:
+
+- Add integration-level tests that use fake adapter output but run the public conversion orchestration and CLI paths.
+- Assert generated Markdown, metadata JSON, `.report.md`, assets, warnings, and summaries are mutually consistent.
+- Keep tests deterministic and independent of real samples.
+
+Output:
+
+- `uv run pytest` covers v1 file-output behavior without model or GPU dependencies.
+
+### WP9.2: Optional MinerU Sample Evaluation Harness
+
+Owner:
+
+- `mineru-integration-agent`
+- `local-setup-agent`
+- `evaluation-agent`
+
+Actions:
+
+- Add an explicit opt-in local fixture command/test path.
+- Gate real MinerU execution behind an environment variable, marker, or explicit command documented in README/checklist.
+- Run `pdf2md doctor` or equivalent preflight before optional local MinerU evaluation.
+- Use temporary or ignored output directories.
+- Record blocked status clearly when MinerU/model/cache setup is missing.
+
+Output:
+
+- Local users can run real sample evaluation when setup is ready, while default tests stay fast and local.
+
+### WP9.3: Fixture Coverage And Metrics
+
+Owner:
+
+- `evaluation-agent`
+- `obsidian-markdown-agent`
+- `metadata-agent`
+
+Actions:
+
+- Define fixture categories and expected risk coverage.
+- Track math delimiter/renderability, tables, reading order, assets, page coverage, metadata fields, warning counts, and report usefulness.
+- Avoid scoring quality only by plain-text edit distance.
+
+Output:
+
+- Fixture coverage is explicit and gaps are visible.
+
+### WP9.4: V1 Release Gate Documentation
+
+Owner:
+
+- `requirements-guard-agent`
+- `evaluation-agent`
+
+Actions:
+
+- Add or update release checklist documentation.
+- Separate default release gates from optional local MinerU/GPU gates.
+- Keep strict-local wording consistent with `ARCHITECTURE.md`, `PRD.md`, and `README.md`.
+- Update `PLAN.md` and `PROGRESS.md` with the next action and release readiness state.
+
+Output:
+
+- A future agent can determine whether v1 is blocked, partial, or ready without relying on conversation history.
+
+### WP9.5: Independent Evaluation
+
+Owner:
+
+- `evaluation-agent`
+
+Actions:
+
+- Review completed Sprint 9 work against this contract.
+- Verify default tests do not require real MinerU, GPU, CUDA, PyTorch, model files, network, Obsidian, LaTeX tooling, or `samples/`.
+- Verify optional local MinerU evaluation is clearly gated.
+- Verify generated sample outputs and sample PDFs are not staged.
+- Verify the release checklist cannot pass without Markdown, metadata JSON, and `.report.md`.
+
+Output:
+
+- PASS/FAIL notes with actionable findings and residual risk.
+
+## Verification Checks
+
+Required:
+
+- `git status --short --untracked-files=all` before staging confirms `samples/` remains untracked and unstaged.
+- `uv --version` is run and its result is recorded.
+- `uv sync` passes.
+- `uv run pytest` passes.
+- Targeted integration tests pass.
+- `uv run pdf2md --version` passes.
+- `uv run pdf2md doctor` is run and its result is recorded as pass, warn, or blocked/fail.
+- `git diff --check` passes.
+- Default tests do not require real MinerU, CUDA, GPU, PyTorch, model files, network, Obsidian, LaTeX tooling, or `samples/`.
+- No model downloads occur.
+- No setup downloads occur from tests, import time, doctor, convert, or helper scripts.
+- No network calls are required in default tests.
+- No candidate engine comparison is reintroduced.
+- No alternate engine or runtime engine selection is added.
+- No CLI/API option disables strict-local policy.
+- No `--api-url`, router mode, HTTP client backend, remote API, or remote OpenAI-compatible backend support is added.
+- Optional local MinerU checks are clearly skipped or reported as blocked when setup is unavailable.
+- Sample PDFs and generated sample outputs are not staged or committed.
+- `PROGRESS.md` records local fixture coverage status and release readiness.
+
+Recommended:
+
+- Add a pytest marker or environment variable for optional local MinerU tests.
+- Keep optional output under a temporary directory or an ignored local output root.
+- Include at least one Korean filename/path check in fast mocked tests.
+- Include one fake output with math, one with a table warning, and one with an asset link.
+- Record source-to-output paths in release checklist examples.
+- Treat local doctor failure as a release blocker for real MinerU validation but not for the default fast test loop.
+
+## Hard Failure Criteria
+
+Sprint 9 fails and must stop for a user decision if any of these are true:
+
+- Default tests require real MinerU, GPU, CUDA, PyTorch, model files, network, Obsidian, LaTeX tooling, or `samples/`.
+- Sample PDFs or generated sample outputs are staged or committed.
+- Optional real MinerU evaluation runs without an explicit opt-in gate.
+- Optional real MinerU evaluation writes generated output into tracked fixture paths.
+- The v1 release checklist can pass when any of the generated Markdown, metadata JSON, or `.report.md` is missing.
+- Release status is marked ready when `pdf2md doctor` has a hard failure and no explicit user waiver is recorded.
+- The implementation adds runtime engine selection or alternate engines.
+- The implementation adds or permits `--api-url`, remote APIs, router mode, HTTP client backends, or remote OpenAI-compatible backends.
+- The implementation uses cloud/API fallback for any fixture evaluation.
+- The implementation hides MinerU failure or silently falls back to another engine.
+- Quality criteria ignore math, tables, reading order, assets, metadata, or report quality.
+
+## Acceptance Criteria
+
+Sprint 9 is complete when:
+
+- `docs/Sprints/SPRINT9CONTRACT.md` exists and is referenced by relevant agents.
+- Fast mocked integration tests exist and pass under `uv run pytest`.
+- Optional local MinerU fixture evaluation is documented and explicitly gated.
+- Local fixture coverage categories and gaps are recorded.
+- Release checklist documentation exists or is updated.
+- `PROGRESS.md` records optional local MinerU status, including skipped/blocked reasons when applicable.
+- Default tests do not require real MinerU, GPU, CUDA, PyTorch, model files, network, Obsidian, LaTeX tooling, or `samples/`.
+- No sample PDF or generated sample output is staged or committed.
+- `uv sync` passes.
+- `uv run pytest` passes.
+- `git diff --check` passes.
+- Independent evaluation is complete.
+- The completed change is committed.
+
+## Handoff Fields
+
+Use these fields when Sprint 9 completes:
+
+- Files changed:
+- Commands run:
+- Tests passed:
+- Tests blocked:
+- Optional local MinerU status:
+- Fixture coverage:
+- Generated output locations:
+- Known failures:
+- Residual risks:
+- User decisions needed:
+- V1 release recommendation:
+- Go/no-go recommendation for next sprint:
+- Next action:
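
Two rules in the contract above are easy to state but easy to get subtly wrong: real MinerU evaluation must sit behind an explicit opt-in gate (WP9.2), and a sample conversion only counts as successful when Markdown, metadata JSON, and `.report.md` all exist (release checklist). The following is a minimal sketch of both checks, not code from this commit. The environment variable `PDF2MD_RUN_MINERU_FIXTURES`, the helper names, and the `<stem>.md` / `<stem>.json` output names are assumptions for illustration; the contract fixes only the `<stem>.report.md` name and requires that some explicit gate exists.

```python
import os
from pathlib import Path

# Hypothetical opt-in variable; the contract does not name one.
OPT_IN_ENV = "PDF2MD_RUN_MINERU_FIXTURES"


def mineru_fixtures_enabled(environ=None):
    """Return True only when the user has explicitly opted in.

    An unset, empty, or unexpected value keeps the optional suite disabled,
    so the default test loop never touches real MinerU.
    """
    env = os.environ if environ is None else environ
    return env.get(OPT_IN_ENV, "") == "1"


def missing_outputs(output_dir, stem):
    """List the required per-sample outputs that are not present.

    Assumed naming: <stem>.md, <stem>.json, <stem>.report.md.
    """
    out = Path(output_dir)
    required = [f"{stem}.md", f"{stem}.json", f"{stem}.report.md"]
    return [name for name in required if not (out / name).is_file()]


def sample_conversion_ok(output_dir, stem, exit_code):
    """A sample passes only with exit code 0 and all three output files."""
    return exit_code == 0 and not missing_outputs(output_dir, stem)
```

A pytest skip condition or a standalone evaluation script could call `mineru_fixtures_enabled()` before doing any work, and record the result of `missing_outputs()` in `PROGRESS.md` instead of treating a partial output set as a pass.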