# Sprint 9 Contract: Local Fixture Evaluation And V1 Release Gate

Status: Implemented

Last updated: 2026-05-08

## Objective

Validate the v1 converter against local fixture workflows without committing sample PDFs or making the default test loop depend on MinerU models, GPU, CUDA, network access, Obsidian, or LaTeX tooling.

Sprint 9 must establish:

- A fast mocked integration suite that exercises the public conversion path end to end.
- An optional, explicitly enabled local MinerU fixture evaluation path for `samples/`.
- A fixture coverage manifest or checklist that records which local PDFs cover math, tables, figures/assets, reading order, Korean filenames, and metadata/report risks.
- Release-gate documentation that distinguishes default automated checks from optional local MinerU/GPU checks.
- Clear `PROGRESS.md` notes for local fixture coverage, skipped/blocked optional checks, known quality risks, and the v1 go/no-go recommendation.

Sprint 9 is an evaluation and release-gate sprint. It may add tests, local-only evaluation helpers, fixture manifests, and narrow compatibility fixes only when needed to evaluate the current v1 behavior. It must not add alternate engines, cloud/API paths, runtime engine selection, or automatic model downloads.

## Current Precondition

Sprint 8 is complete:

- `pdf2md doctor` exists and reports Python, `uv`, MinerU CLI/version, GPU, PyTorch, model/cache, and strict-local policy status.
- Local `pdf2md doctor` currently fails because the `mineru` CLI is not installed on PATH.
- `pdf2md convert` exists and writes Markdown, metadata JSON, and `.report.md` with fake-adapter test coverage.
- Default tests pass without real MinerU, CUDA, GPU, model files, network, Obsidian, LaTeX tooling, or `samples/`.
- `samples/` exists locally and is untracked.

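
Because the `mineru` CLI is currently missing from PATH, optional evaluation has to report a blocker rather than a pass. A minimal sketch of that decision follows, assuming a hypothetical opt-in variable `PDF2MD_RUN_LOCAL_MINERU` and a hypothetical helper `optional_eval_status`. Neither name comes from the project, and the PATH lookup is only a stand-in for the fuller `pdf2md doctor` preflight.

```python
import os
import shutil
from typing import Callable, Mapping, Optional, Tuple

# Hypothetical opt-in variable name; the project may use a different gate.
OPT_IN_ENV = "PDF2MD_RUN_LOCAL_MINERU"


def optional_eval_status(
    env: Optional[Mapping[str, str]] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Tuple[str, str]:
    """Return (status, reason) for the optional local MinerU evaluation."""
    env = os.environ if env is None else env
    # No explicit opt-in: real MinerU must never run by default.
    if env.get(OPT_IN_ENV) != "1":
        return ("skipped", f"{OPT_IN_ENV} is not set to 1")
    # Opted in, but the CLI is absent: record a blocker, not a pass.
    if which("mineru") is None:
        return ("blocked", "mineru CLI not found on PATH")
    return ("ready", "")
```

Passing `env` and `which` as parameters keeps the check testable without touching the real environment or requiring MinerU to be installed.
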

Observed local fixture files include:

- `samples/FourNodeQuadrilateralShellElementMITC4.pdf`
- `samples/MITC공부.pdf`
- `samples/2007쉘구조물의유한요소해석에대하여.pdf`
- `samples/유한요소해석법을이용한쉘구조물의동적좌굴해석.pdf`
- `samples/metadata.json`

Sprint 9 must preserve the untracked status of `samples/` unless the user explicitly requests otherwise.

## Touched Surfaces

Allowed:

- `tests/integration/`
- `tests/test_conversion.py`
- `tests/test_cli.py`
- `tests/test_report.py`
- `tests/test_metadata.py`
- `tests/test_quality.py`
- `tests/conftest.py` only for markers or opt-in fixture controls
- `src/pdf2md/mineru_adapter.py` only for narrow compatibility fixes backed by mocked or optional local MinerU output evidence
- `src/pdf2md/conversion.py` only for narrow release-gate defects found by integration tests
- `src/pdf2md/quality.py` only for local quality metric defects found by integration tests
- `src/pdf2md/report.py` only for report defects found by integration tests
- `README.md`
- `docs/V1RELEASECHECKLIST.md`
- `docs/V1IMPLEMENTATIONPLAN.md`
- `docs/Sprints/SPRINT9CONTRACT.md`
- `PLAN.md`
- `PROGRESS.md`

Not allowed:

- Committed files under `samples/`
- Committed generated conversion outputs from local sample PDFs
- Mandatory tests that require real MinerU, GPU, CUDA, PyTorch, model files, network, Obsidian, LaTeX tooling, or `samples/`
- Automatic package installs or model downloads from tests, import time, doctor, convert, or helpers
- Runtime engine selection or alternate conversion engines
- Cloud OCR, remote LLM/VLM, hosted renderer, remote document parser, remote asset fetching, `--api-url`, router mode, HTTP client backends, remote APIs, or remote OpenAI-compatible backends
- CLI/API options that disable strict-local policy
- Claims that v1 perfectly reconstructs LaTeX, tables, or reading order

## Expected Outputs

1. Fast mocked integration suite
   - Exercises `convert_pdf` and/or `pdf2md convert` with a fake MinerU adapter through the real orchestration path.
   - Verifies Markdown, metadata JSON, and `.report.md` are all written.
   - Verifies output paths, asset links, warning counts, and report status stay consistent.
   - Verifies failures produce metadata/report warnings when possible and do not silently fall back.
   - Runs as part of `uv run pytest` without real MinerU, models, GPU, network, Obsidian, LaTeX tooling, or `samples/`.
2. Optional local MinerU fixture evaluation
   - Provides an explicit opt-in command or pytest marker/environment gate for real local MinerU sample evaluation.
   - Skips or reports a clear local blocker when `pdf2md doctor` fails because MinerU, model/cache paths, or GPU/PyTorch acceleration are unavailable.
   - Reads sample PDFs only from `samples/` or a user-provided local sample directory.
   - Writes generated outputs to a temporary or ignored output directory, never to tracked fixture paths.
   - Produces or records, for each attempted sample:
     - source filename
     - command run
     - exit code
     - generated Markdown path
     - generated metadata JSON path
     - generated `.report.md` path
     - warning count
     - math renderability or checker-unavailable count
     - table fallback/degradation count when available
     - missing or broken asset link count
     - page coverage when available
   - Does not mark optional evaluation as passed when MinerU is missing; it records the blocker.
3. Fixture coverage manifest or checklist
   - Maps local sample files to risk categories:
     - simple digital PDF
     - math-heavy PDF
     - multi-column or complex reading order
     - table with formulas
     - figure/caption/assets
     - Korean filename/path handling
   - May store only relative sample names, categories, and notes; it must not embed sample PDFs or generated outputs.
   - Records coverage gaps that need additional user-provided samples.
4. V1 release checklist
   - Defines default release gates:
     - `uv sync`
     - `uv run pytest`
     - `uv run pdf2md --version`
     - `uv run pdf2md doctor`
     - `git diff --check`
     - `git status --short --untracked-files=all`
   - Defines optional local MinerU release gates separately from default gates.
   - Requires Markdown, metadata JSON, and `.report.md` to exist before any sample conversion is considered successful.
   - Requires warnings and residual risks to be recorded in `PROGRESS.md`.
   - Makes local-only and no-sample-commit checks explicit.
5. Documentation
   - README or release checklist explains how to run default checks and optional local fixture checks.
   - Documentation states that optional fixture checks may be skipped or blocked until MinerU 3.1.0 and model/cache setup are available.
   - Documentation does not instruct users to use `--api-url`, router mode, HTTP client backends, remote APIs, or remote OpenAI-compatible backends.
6. Handoff
   - `PROGRESS.md` records changed files, commands run, tests passed or blocked, local fixture status, generated output location if any, known failures, residual risks, and next action.

## Non-Goals

- Do not install MinerU.
- Do not download MinerU models.
- Do not run model setup automatically.
- Do not require the local GTX 1070 Ti to pass CUDA/PyTorch checks in the default test loop.
- Do not improve OCR/model accuracy.
- Do not introduce a manual review UI, hosted web UI, or local desktop launcher in Sprint 9.
- Do not add alternate conversion engines or fallback engines.
- Do not benchmark against cloud OCR/API services.
- Do not commit sample PDFs, sample-derived outputs, or large binary fixtures.
- Do not make text edit distance the only quality criterion.
- Do not claim v1 is release-ready if metadata JSON or `.report.md` generation is missing.

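
The per-sample record from Expected Outputs item 2 and the three-file rule from item 4 can be sketched together. The class, field, and file names below (`SampleResult`, `is_successful`, `doc.json`, and so on) are illustrative assumptions rather than the project's actual API, and only a subset of the listed fields is shown.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class SampleResult:
    """Hypothetical per-sample evaluation record (subset of fields)."""

    source_filename: str
    command: str
    exit_code: int
    markdown_path: Optional[Path] = None
    metadata_path: Optional[Path] = None
    report_path: Optional[Path] = None
    warning_count: int = 0
    blocker: Optional[str] = None  # e.g. "mineru CLI not found on PATH"

    def is_successful(self) -> bool:
        """Success requires a clean exit and all three output files."""
        if self.blocker is not None or self.exit_code != 0:
            return False
        outputs = (self.markdown_path, self.metadata_path, self.report_path)
        return all(p is not None and p.is_file() for p in outputs)


if __name__ == "__main__":
    # Placeholder file names and command string, written to a temp directory.
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp)
        paths = [out / "doc.md", out / "doc.json", out / "doc.report.md"]
        for p in paths[:2]:  # deliberately leave the report missing
            p.write_text("", encoding="utf-8")
        result = SampleResult("doc.pdf", "pdf2md convert doc.pdf", 0, *paths)
        print(result.is_successful())  # report file does not exist yet
        paths[2].write_text("", encoding="utf-8")
        print(result.is_successful())  # all three outputs now exist
```

Keeping `blocker` as a separate field means a missing MinerU install is recorded explicitly and can never be mistaken for a passing sample.
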

## Work Packages

### WP9.1: Fast Mocked Integration Checks

Owner:

- `feature-generator-agent`
- `evaluation-agent`

Actions:

- Add integration-level tests that use fake adapter output but run the public conversion orchestration and CLI paths.
- Assert generated Markdown, metadata JSON, `.report.md`, assets, warnings, and summaries are mutually consistent.
- Keep tests deterministic and independent of real samples.

Output:

- `uv run pytest` covers v1 file-output behavior without model or GPU dependencies.

### WP9.2: Optional MinerU Sample Evaluation Harness

Owner:

- `mineru-integration-agent`
- `local-setup-agent`
- `evaluation-agent`

Actions:

- Add an explicit opt-in local fixture command/test path.
- Gate real MinerU execution behind an environment variable, marker, or explicit command documented in README/checklist.
- Run `pdf2md doctor` or equivalent preflight before optional local MinerU evaluation.
- Use temporary or ignored output directories.
- Record blocked status clearly when MinerU/model/cache setup is missing.

Output:

- Local users can run real sample evaluation when setup is ready, while default tests stay fast and local.

### WP9.3: Fixture Coverage And Metrics

Owner:

- `evaluation-agent`
- `obsidian-markdown-agent`
- `metadata-agent`

Actions:

- Define fixture categories and expected risk coverage.
- Track math delimiter/renderability, tables, reading order, assets, page coverage, metadata fields, warning counts, and report usefulness.
- Avoid scoring quality only by plain-text edit distance.

Output:

- Fixture coverage is explicit and gaps are visible.

### WP9.4: V1 Release Gate Documentation

Owner:

- `requirements-guard-agent`
- `evaluation-agent`

Actions:

- Add or update release checklist documentation.
- Separate default release gates from optional local MinerU/GPU gates.
- Keep strict-local wording consistent with `ARCHITECTURE.md`, `PRD.md`, and `README.md`.
- Update `PLAN.md` and `PROGRESS.md` with the next action and release readiness state.

Output:

- A future agent can determine whether v1 is blocked, partial, or ready without relying on conversation history.

### WP9.5: Independent Evaluation

Owner:

- `evaluation-agent`

Actions:

- Review completed Sprint 9 work against this contract.
- Verify default tests do not require real MinerU, GPU, CUDA, PyTorch, model files, network, Obsidian, LaTeX tooling, or `samples/`.
- Verify optional local MinerU evaluation is clearly gated.
- Verify generated sample outputs and sample PDFs are not staged.
- Verify release checklist cannot pass without Markdown, metadata JSON, and `.report.md`.

Output:

- PASS/FAIL notes with actionable findings and residual risk.

## Verification Checks

Required:

- `git status --short --untracked-files=all` before staging confirms `samples/` remains untracked and unstaged.
- `uv --version` is run and result is recorded.
- `uv sync` passes.
- `uv run pytest` passes.
- Targeted integration tests pass.
- `uv run pdf2md --version` passes.
- `uv run pdf2md doctor` is run and its result is recorded as pass, warn, or blocked/fail.
- `git diff --check` passes.
- Default tests do not require real MinerU, CUDA, GPU, PyTorch, model files, network, Obsidian, LaTeX tooling, or `samples/`.
- No model downloads occur.
- No setup downloads occur from tests, import time, doctor, convert, or helper scripts.
- No network calls are required in default tests.
- No candidate engine comparison is reintroduced.
- No alternate engine or runtime engine selection is added.
- No CLI/API option disables strict-local policy.
- No `--api-url`, router mode, HTTP client backend, remote API, or remote OpenAI-compatible backend support is added.
- Optional local MinerU checks are skipped or blocked clearly when setup is unavailable.
- Sample PDFs and generated sample outputs are not staged or committed.
- `PROGRESS.md` records local fixture coverage status and release readiness.

Recommended:

- Add a pytest marker or environment variable for optional local MinerU tests.
- Keep optional output under a temporary directory or an ignored local output root.
- Include at least one Korean filename/path check in fast mocked tests.
- Include one fake output with math, one with a table warning, and one with an asset link.
- Record source-to-output paths in release checklist examples.
- Treat local doctor failure as a release blocker for real MinerU validation but not for the default fast test loop.

## Hard Failure Criteria

Sprint 9 fails and must stop for a user decision if any of these are true:

- Default tests require real MinerU, GPU, CUDA, PyTorch, model files, network, Obsidian, LaTeX tooling, or `samples/`.
- Sample PDFs or generated sample outputs are staged or committed.
- Optional real MinerU evaluation runs without an explicit opt-in gate.
- Optional real MinerU evaluation writes generated output into tracked fixture paths.
- V1 release checklist can pass without generated Markdown, metadata JSON, and `.report.md`.
- Release status is marked ready when `pdf2md doctor` has a hard failure and no explicit user waiver is recorded.
- The implementation adds runtime engine selection or alternate engines.
- The implementation adds or permits `--api-url`, remote APIs, router mode, HTTP client backends, or remote OpenAI-compatible backends.
- The implementation uses cloud/API fallback for any fixture evaluation.
- The implementation hides MinerU failure or silently falls back to another engine.
- Quality criteria ignore math, tables, reading order, assets, metadata, or report quality.

## Acceptance Criteria

Sprint 9 is complete when:

- `docs/Sprints/SPRINT9CONTRACT.md` exists and is referenced by relevant agents.
- Fast mocked integration tests exist and pass under `uv run pytest`.
- Optional local MinerU fixture evaluation is documented and explicitly gated.
- Local fixture coverage categories and gaps are recorded.
- Release checklist documentation exists or is updated.
- `PROGRESS.md` records optional local MinerU status, including skipped/blocked reasons when applicable.
- Default tests do not require real MinerU, GPU, CUDA, PyTorch, model files, network, Obsidian, LaTeX tooling, or `samples/`.
- No sample PDF or generated sample output is staged or committed.
- `uv sync` passes.
- `uv run pytest` passes.
- `git diff --check` passes.
- Independent evaluation is complete.
- The completed change is committed.

## Handoff Fields

Use these fields when Sprint 9 completes:

- Files changed:
- Commands run:
- Tests passed:
- Tests blocked:
- Optional local MinerU status:
- Fixture coverage:
- Generated output locations:
- Known failures:
- Residual risks:
- User decisions needed:
- V1 release recommendation:
- Go/no-go recommendation for next sprint:
- Next action:
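
The rule that nothing under `samples/` may be staged can be checked mechanically against the output of `git status --short --untracked-files=all` before committing. This is a minimal sketch: `staged_sample_paths` is a hypothetical helper, and it assumes Git's short status format (two status characters, a space, then the path). Generated outputs written elsewhere would need their own root passed in.

```python
from typing import List


def staged_sample_paths(status_output: str, root: str = "samples/") -> List[str]:
    """Return paths under `root` that have changes staged in the index."""
    staged = []
    for line in status_output.splitlines():
        if len(line) < 4:
            continue
        # First character is the index (staged) state, then worktree state.
        index_state, path_field = line[0], line[3:]
        # " " = nothing staged, "?" = untracked, "!" = ignored.
        if index_state in " ?!":
            continue
        # Renames are shown as "old -> new"; check both sides.
        for candidate in path_field.split(" -> "):
            # Git may wrap paths with unusual characters in double quotes.
            cleaned = candidate.strip('"')
            if cleaned.startswith(root):
                staged.append(cleaned)
                break
    return staged
```

An empty list is the expected result; any entry means a sample or sample-derived file has reached the index and the change should not be committed.
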