baram2584/PDFToMD

김경종 dc11880140 modify pdftomd

2026-05-14 10:16:59 +09:00

Sprint 9 Contract: Local Fixture Evaluation And V1 Release Gate

Status: Implemented
Last updated: 2026-05-08

Objective

Validate the v1 converter against local fixture workflows without committing sample PDFs or making the default test loop depend on MinerU models, GPU, CUDA, network access, Obsidian, or LaTeX tooling.

Sprint 9 must establish:

A fast mocked integration suite that exercises the public conversion path end to end.
An optional, explicitly enabled local MinerU fixture evaluation path for samples/.
A fixture coverage manifest or checklist that records which local PDFs cover math, tables, figures/assets, reading order, Korean filenames, and metadata/report risks.
Release-gate documentation that distinguishes default automated checks from optional local MinerU/GPU checks.
Clear PROGRESS.md notes for local fixture coverage, skipped/blocked optional checks, known quality risks, and the v1 go/no-go recommendation.

Sprint 9 is an evaluation and release-gate sprint. It may add tests, local-only evaluation helpers, fixture manifests, and narrow compatibility fixes only when needed to evaluate the current v1 behavior. It must not add alternate engines, cloud/API paths, runtime engine selection, or automatic model downloads.

Current Precondition

Sprint 8 is complete:

pdf2md doctor exists and reports Python, uv, MinerU CLI/version, GPU, PyTorch, model/cache, and strict-local policy status.
Local pdf2md doctor currently fails because the mineru CLI is not installed on PATH.
pdf2md convert exists and writes Markdown, metadata JSON, and <stem>.report.md with fake-adapter test coverage.
Default tests pass without real MinerU, CUDA, GPU, model files, network, Obsidian, LaTeX tooling, or samples/.
samples/ exists locally and is untracked. Observed local fixture files include:
- samples/FourNodeQuadrilateralShellElementMITC4.pdf
- samples/MITC공부.pdf
- samples/2007쉘구조물의유한요소해석에대하여.pdf
- samples/유한요소해석법을이용한쉘구조물의동적좌굴해석.pdf
- samples/metadata.json

Sprint 9 must preserve the untracked status of samples/ unless the user explicitly requests otherwise.

Touched Surfaces

Allowed:

tests/integration/
tests/test_conversion.py
tests/test_cli.py
tests/test_report.py
tests/test_metadata.py
tests/test_quality.py
tests/conftest.py only for markers or opt-in fixture controls
src/pdf2md/mineru_adapter.py only for narrow compatibility fixes backed by mocked or optional local MinerU output evidence
src/pdf2md/conversion.py only for narrow release-gate defects found by integration tests
src/pdf2md/quality.py only for local quality metric defects found by integration tests
src/pdf2md/report.py only for report defects found by integration tests
README.md
docs/V1RELEASECHECKLIST.md
docs/V1IMPLEMENTATIONPLAN.md
docs/Sprints/SPRINT9CONTRACT.md
PLAN.md
PROGRESS.md

Not allowed:

Committed files under samples/
Committed generated conversion outputs from local sample PDFs
Mandatory tests that require real MinerU, GPU, CUDA, PyTorch, model files, network, Obsidian, LaTeX tooling, or samples/
Automatic package installs or model downloads from tests, import time, doctor, convert, or helpers
Runtime engine selection or alternate conversion engines
Cloud OCR, remote LLM/VLM, hosted renderer, remote document parser, remote asset fetching, --api-url, router mode, HTTP client backends, remote APIs, or remote OpenAI-compatible backends
CLI/API options that disable strict-local policy
Claims that v1 perfectly reconstructs LaTeX, tables, or reading order

Expected Outputs

Fast mocked integration suite
- Exercises convert_pdf and/or pdf2md convert with a fake MinerU adapter through the real orchestration path.
- Verifies Markdown, metadata JSON, and <stem>.report.md are all written.
- Verifies output paths, asset links, warning counts, and report status stay consistent.
- Verifies that failures produce metadata/report warnings when possible and do not silently fall back.
- Runs as part of uv run pytest without real MinerU, models, GPU, network, Obsidian, LaTeX tooling, or samples/.
Optional local MinerU fixture evaluation
- Provides an explicit opt-in command or pytest marker/environment gate for real local MinerU sample evaluation.
- Skips or reports a clear local blocker when pdf2md doctor fails because MinerU, model/cache paths, or GPU/PyTorch acceleration are unavailable.
- Reads sample PDFs only from samples/ or a user-provided local sample directory.
- Writes generated outputs to a temporary or ignored output directory, never to tracked fixture paths.
- Produces or records, for each attempted sample:
  - source filename
  - command run
  - exit code
  - generated Markdown path
  - generated metadata JSON path
  - generated .report.md path
  - warning count
  - math renderability or checker-unavailable count
  - table fallback/degradation count when available
  - missing or broken asset link count
  - page coverage when available
- Does not mark optional evaluation as passed when MinerU is missing; it records the blocker.
Fixture coverage manifest or checklist
- Maps local sample files to risk categories:
  - simple digital PDF
  - math-heavy PDF
  - multi-column or complex reading order
  - table with formulas
  - figure/caption/assets
  - Korean filename/path handling
- May store only relative sample names, categories, and notes; it must not embed sample PDFs or generated outputs.
- Records coverage gaps that need additional user-provided samples.
V1 release checklist
- Defines default release gates:
  - uv sync
  - uv run pytest
  - uv run pdf2md --version
  - uv run pdf2md doctor
  - git diff --check
  - git status --short --untracked-files=all
- Defines optional local MinerU release gates separately from default gates.
- Requires Markdown, metadata JSON, and .report.md to exist before any sample conversion is considered successful.
- Requires warnings and residual risks to be recorded in PROGRESS.md.
- Makes local-only and no-sample-commit checks explicit.
Documentation
- README or release checklist explains how to run default checks and optional local fixture checks.
- Documentation states that optional fixture checks may be skipped or blocked until MinerU 3.1.0 and model/cache setup are available.
- Documentation does not instruct users to use --api-url, router mode, HTTP client backends, remote APIs, or remote OpenAI-compatible backends.
Handoff
- PROGRESS.md records changed files, commands run, tests passed or blocked, local fixture status, generated output location if any, known failures, residual risks, and next action.
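
The per-sample record listed under the optional local MinerU fixture evaluation, together with the rule that Markdown, metadata JSON, and .report.md must all exist, could be modeled roughly as follows. This is a minimal sketch: the `SampleEvaluation` class, its field names, and the extra requirement of a zero exit code are assumptions for illustration, not part of the existing code base.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class SampleEvaluation:
    """Hypothetical record for one attempted sample conversion."""

    source_filename: str
    command: str
    exit_code: int
    markdown_path: Path
    metadata_path: Path
    report_path: Path
    warning_count: int
    # None means "not available", which is different from a count of zero.
    math_issue_count: Optional[int] = None
    table_fallback_count: Optional[int] = None
    broken_asset_link_count: Optional[int] = None
    page_coverage: Optional[float] = None

    def succeeded(self) -> bool:
        """True only if all three outputs exist (and, by assumption, exit code is 0)."""
        outputs = (self.markdown_path, self.metadata_path, self.report_path)
        return self.exit_code == 0 and all(p.is_file() for p in outputs)
```

Keeping unavailable metrics as `None` avoids reporting a missing checker as a clean result.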

Non-Goals

Do not install MinerU.
Do not download MinerU models.
Do not run model setup automatically.
Do not require the local GTX 1070 Ti to pass CUDA/PyTorch checks in the default test loop.
Do not improve OCR/model accuracy.
Do not introduce a manual review UI, hosted web UI, or local desktop launcher in Sprint 9.
Do not add alternate conversion engines or fallback engines.
Do not benchmark against cloud OCR/API services.
Do not commit sample PDFs, sample-derived outputs, or large binary fixtures.
Do not make text edit distance the only quality criterion.
Do not claim v1 is release-ready if metadata JSON or .report.md generation is missing.

Work Packages

WP9.1: Fast Mocked Integration Checks

Owner:

feature-generator-agent
evaluation-agent

Actions:

Add integration-level tests that use fake adapter output but run the public conversion orchestration and CLI paths.
Assert generated Markdown, metadata JSON, .report.md, assets, warnings, and summaries are mutually consistent.
Keep tests deterministic and independent of real samples.

Output:

uv run pytest covers v1 file-output behavior without model or GPU dependencies.
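
A test in the spirit of WP9.1 might look like the sketch below. `FakeMinerUAdapter`, `run_conversion`, the `<stem>.json` metadata filename, and the metadata keys are hypothetical stand-ins so the example is self-contained; real tests would go through the public `convert_pdf` or `pdf2md convert` path instead.

```python
import json
import tempfile
from pathlib import Path


class FakeMinerUAdapter:
    """Hypothetical stand-in for the MinerU adapter; returns canned output."""

    def parse(self, pdf_path: Path) -> dict:
        return {
            "markdown": "# Title\n\n$$E = mc^2$$\n",
            "warnings": ["table degraded to plain text on page 2"],
        }


def run_conversion(pdf_path: Path, out_dir: Path, adapter) -> dict:
    """Hypothetical orchestration: writes the three v1 outputs for one PDF."""
    result = adapter.parse(pdf_path)
    stem = pdf_path.stem
    md_path = out_dir / f"{stem}.md"
    meta_path = out_dir / f"{stem}.json"  # assumed metadata filename
    report_path = out_dir / f"{stem}.report.md"

    count = len(result["warnings"])
    md_path.write_text(result["markdown"], encoding="utf-8")
    metadata = {"source": pdf_path.name, "warning_count": count}
    meta_path.write_text(json.dumps(metadata, ensure_ascii=False), encoding="utf-8")
    report_lines = [f"# Report: {pdf_path.name}", f"Warnings: {count}"]
    report_lines += [f"- {w}" for w in result["warnings"]]
    report_path.write_text("\n".join(report_lines) + "\n", encoding="utf-8")
    return {"markdown": md_path, "metadata": meta_path, "report": report_path}


def test_outputs_are_consistent_for_korean_filename():
    with tempfile.TemporaryDirectory() as tmp:
        # The PDF is never read; the fake adapter ignores its content.
        paths = run_conversion(Path("쉘구조물.pdf"), Path(tmp), FakeMinerUAdapter())
        assert all(p.exists() for p in paths.values())
        metadata = json.loads(paths["metadata"].read_text(encoding="utf-8"))
        report = paths["report"].read_text(encoding="utf-8")
        # The warning count in the metadata and in the report must agree.
        assert f"Warnings: {metadata['warning_count']}" in report
```

Because the adapter is injected, the test exercises file writing and cross-file consistency without MinerU, a GPU, or anything under samples/.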

WP9.2: Optional MinerU Sample Evaluation Harness

Owner:

mineru-integration-agent
local-setup-agent
evaluation-agent

Actions:

Add an explicit opt-in local fixture command/test path.
Gate real MinerU execution behind an environment variable, marker, or explicit command documented in README/checklist.
Run pdf2md doctor or equivalent preflight before optional local MinerU evaluation.
Use temporary or ignored output directories.
Record blocked status clearly when MinerU/model/cache setup is missing.

Output:

Local users can run real sample evaluation when setup is ready, while default tests stay fast and local.
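
One possible shape for the opt-in gate is sketched below. The environment variable name `PDF2MD_RUN_MINERU_FIXTURES` and the function `local_eval_status` are assumptions; the contract only requires that some explicit gate exists. The sketch checks only for the `mineru` executable on PATH, so a full preflight would still need `pdf2md doctor` or an equivalent.

```python
import os
import shutil

# Hypothetical variable name; any explicit, documented gate would do.
OPT_IN_ENV = "PDF2MD_RUN_MINERU_FIXTURES"


def local_eval_status(env=None, which=shutil.which):
    """Return (status, reason); status is 'skipped', 'blocked', or 'ready'."""
    env = os.environ if env is None else env
    if env.get(OPT_IN_ENV) != "1":
        # No opt-in: real MinerU must never run implicitly.
        return "skipped", f"{OPT_IN_ENV}=1 not set"
    if which("mineru") is None:
        # Opted in but setup is missing: record a blocker, not a pass.
        return "blocked", "mineru CLI not found on PATH"
    return "ready", "opt-in set and mineru CLI found"
```

A pytest fixture could call this function and use `pytest.skip(reason)` for any status other than `ready`, while still writing the blocked reason to the evaluation notes.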

WP9.3: Fixture Coverage And Metrics

Owner:

evaluation-agent
obsidian-markdown-agent
metadata-agent

Actions:

Define fixture categories and expected risk coverage.
Track math delimiter/renderability, tables, reading order, assets, page coverage, metadata fields, warning counts, and report usefulness.
Avoid scoring quality only by plain-text edit distance.

Output:

Fixture coverage is explicit and gaps are visible.
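
To make gaps visible, a manifest can be checked against the risk categories named in this contract. The sketch below assumes a simple mapping from relative sample names to category lists; `coverage_gaps` and the manifest entries are placeholders for illustration and do not describe the actual local samples.

```python
RISK_CATEGORIES = [
    "simple digital PDF",
    "math-heavy PDF",
    "multi-column or complex reading order",
    "table with formulas",
    "figure/caption/assets",
    "Korean filename/path handling",
]


def coverage_gaps(manifest):
    """Return the risk categories that no sample in the manifest covers."""
    covered = {cat for cats in manifest.values() for cat in cats}
    unknown = covered - set(RISK_CATEGORIES)
    if unknown:
        # A typo in a category name would otherwise hide a real gap.
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    return [cat for cat in RISK_CATEGORIES if cat not in covered]


# Placeholder entries only; real assignments come from reviewing each sample.
example_manifest = {
    "samples/example-a.pdf": ["math-heavy PDF", "figure/caption/assets"],
    "samples/예제.pdf": ["Korean filename/path handling"],
}
```

For the placeholder manifest, `coverage_gaps(example_manifest)` reports the three categories that still need user-provided samples: simple digital PDF, multi-column or complex reading order, and table with formulas.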

WP9.4: V1 Release Gate Documentation

Owner:

requirements-guard-agent
evaluation-agent

Actions:

Add or update release checklist documentation.
Separate default release gates from optional local MinerU/GPU gates.
Keep strict-local wording consistent with ARCHITECTURE.md, PRD.md, and README.md.
Update PLAN.md and PROGRESS.md with the next action and release readiness state.

Output:

A future agent can determine whether v1 is blocked, partial, or ready without relying on conversation history.
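
The blocked/partial/ready decision could be captured as a small function so the state does not depend on prose alone. The mapping below is one possible reading of this contract, not a rule it states: the function name, the status strings, and the choice to report `partial` when only the optional MinerU side is unresolved are all assumptions.

```python
def release_state(default_gates_passed, doctor_status, optional_eval_status,
                  waiver_recorded=False):
    """Classify v1 readiness as 'blocked', 'partial', or 'ready'.

    doctor_status: 'pass', 'warn', or 'fail' (assumed labels).
    optional_eval_status: 'passed', 'skipped', or 'blocked' (assumed labels).
    """
    if not default_gates_passed:
        # Default checks (sync, pytest, diff check) are never waivable here.
        return "blocked"
    if doctor_status == "fail" and not waiver_recorded:
        # A hard doctor failure without a user waiver must not yield 'ready'.
        return "partial"
    if optional_eval_status != "passed" and not waiver_recorded:
        return "partial"
    return "ready"
```

Under this reading, a machine without the `mineru` CLI can pass the default loop and still never be reported as ready unless the user records an explicit waiver.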

WP9.5: Independent Evaluation

Owner:

evaluation-agent

Actions:

Review completed Sprint 9 work against this contract.
Verify default tests do not require real MinerU, GPU, CUDA, PyTorch, model files, network, Obsidian, LaTeX tooling, or samples/.
Verify optional local MinerU evaluation is clearly gated.
Verify generated sample outputs and sample PDFs are not staged.
Verify release checklist cannot pass without Markdown, metadata JSON, and .report.md.

Output:

PASS/FAIL notes with actionable findings and residual risk.

Verification Checks

Required:

git status --short --untracked-files=all before staging confirms samples/ remains untracked and unstaged.
uv --version is run and the result is recorded.
uv sync passes.
uv run pytest passes.
Targeted integration tests pass.
uv run pdf2md --version passes.
uv run pdf2md doctor is run and its result is recorded as pass, warn, or blocked/fail.
git diff --check passes.
Default tests do not require real MinerU, CUDA, GPU, PyTorch, model files, network, Obsidian, LaTeX tooling, or samples/.
No model downloads occur.
No setup downloads occur from tests, import time, doctor, convert, or helper scripts.
No network calls are required in default tests.
No candidate engine comparison is reintroduced.
No alternate engine or runtime engine selection is added.
No CLI/API option disables strict-local policy.
No --api-url, router mode, HTTP client backend, remote API, or remote OpenAI-compatible backend support is added.
Optional local MinerU checks are skipped or blocked clearly when setup is unavailable.
Sample PDFs and generated sample outputs are not staged or committed.
PROGRESS.md records local fixture coverage status and release readiness.

Recommended:

Add a pytest marker or environment variable for optional local MinerU tests.
Keep optional output under a temporary directory or an ignored local output root.
Include at least one Korean filename/path check in fast mocked tests.
Include one fake output with math, one with a table warning, and one with an asset link.
Record source-to-output paths in release checklist examples.
Treat local doctor failure as a release blocker for real MinerU validation but not for the default fast test loop.
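
The staging check for samples/ can be automated by parsing the output of `git status --short --untracked-files=all`. The helper below is a sketch; `staged_sample_paths` is a hypothetical name, and it relies on the short format placing the index status in the first column and the path from the fourth column onward.

```python
def staged_sample_paths(status_output, prefix="samples/"):
    """Return paths under `prefix` that appear staged in short-format status."""
    staged = []
    for line in status_output.splitlines():
        if len(line) < 4:
            continue
        index_status, path = line[0], line[3:]
        if index_status in " ?!":
            continue  # unmodified in index, untracked, or ignored: not staged
        # Renames appear as "old -> new"; Git may also wrap paths in quotes.
        parts = [p.strip().strip('"') for p in path.split(" -> ")]
        if any(p.startswith(prefix) for p in parts):
            staged.append(path)
    return staged


example_status = (
    "?? samples/a.pdf\n"
    " M README.md\n"
    "A  samples/metadata.json\n"
    "M  src/pdf2md/report.py\n"
)
```

For `example_status`, only `samples/metadata.json` is returned: the untracked PDF is acceptable, while the staged metadata file would trip the hard failure criterion on staged samples.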

Hard Failure Criteria

Sprint 9 fails and must stop for a user decision if any of these are true:

Default tests require real MinerU, GPU, CUDA, PyTorch, model files, network, Obsidian, LaTeX tooling, or samples/.
Sample PDFs or generated sample outputs are staged or committed.
Optional real MinerU evaluation runs without an explicit opt-in gate.
Optional real MinerU evaluation writes generated output into tracked fixture paths.
V1 release checklist can pass without generated Markdown, metadata JSON, and .report.md.
Release status is marked ready when pdf2md doctor has a hard failure and no explicit user waiver is recorded.
The implementation adds runtime engine selection or alternate engines.
The implementation adds or permits --api-url, remote APIs, router mode, HTTP client backends, or remote OpenAI-compatible backends.
The implementation uses cloud/API fallback for any fixture evaluation.
The implementation hides MinerU failure or silently falls back to another engine.
Quality criteria ignore math, tables, reading order, assets, metadata, or report quality.

Acceptance Criteria

Sprint 9 is complete when:

docs/Sprints/SPRINT9CONTRACT.md exists and is referenced by relevant agents.
Fast mocked integration tests exist and pass under uv run pytest.
Optional local MinerU fixture evaluation is documented and explicitly gated.
Local fixture coverage categories and gaps are recorded.
Release checklist documentation exists or is updated.
PROGRESS.md records optional local MinerU status, including skipped/blocked reasons when applicable.
Default tests do not require real MinerU, GPU, CUDA, PyTorch, model files, network, Obsidian, LaTeX tooling, or samples/.
No sample PDF or generated sample output is staged or committed.
uv sync passes.
uv run pytest passes.
git diff --check passes.
Independent evaluation is complete.
The completed change is committed.

Handoff Fields

Use these fields when Sprint 9 completes:

Files changed:
Commands run:
Tests passed:
Tests blocked:
Optional local MinerU status:
Fixture coverage:
Generated output locations:
Known failures:
Residual risks:
User decisions needed:
V1 release recommendation:
Go/no-go recommendation for next sprint:
Next action:
