# Sprint 7 Contract: Conversion Orchestrator, CLI, And Python API Status: Implemented Last updated: 2026-05-08 ## Objective Connect the existing project-owned boundaries into a working conversion orchestration layer, public Python API, and `pdf2md convert` CLI path. Sprint 7 must establish: - A public `convert_pdf` API for one local PDF. - A batch conversion API or helper for directory inputs. - A `pdf2md convert INPUT --out OUTPUT_DIR` command. - Product behavior that writes Markdown, optional metadata JSON, and `.report.md`. - Local asset materialization for adapter-provided asset files. - CLI summaries that surface success, failure, and warning counts. - Fast tests that use fake adapter outputs and do not require real MinerU, model files, GPU, sample PDFs, network, Obsidian, or LaTeX tooling. Sprint 7 is an orchestration sprint. It may call the real `MinerUAdapter` in the normal production path, but the default test suite must use injected fake adapters and must not execute MinerU. ## Current Precondition Sprint 6 is complete: - `src/pdf2md/paths.py` owns input discovery and output path planning. - `src/pdf2md/ir.py` owns project records, block types, warning codes, and warning severities. - `src/pdf2md/metadata.py` builds JSON-serializable metadata and summary counts from project-owned records. - `src/pdf2md/mineru_adapter.py` owns the mocked direct local MinerU CLI adapter boundary. - `src/pdf2md/markdown.py` owns Obsidian Markdown normalization, asset link warnings, and table fallback warnings. - `src/pdf2md/quality.py` owns local quality checks over normalized Markdown and asset context. - `src/pdf2md/report.py` owns report content rendering and final status calculation. - `uv run pytest` passed 103 tests. Sprint 7 may compute source SHA-256, create conversion output directories, copy local adapter-provided asset files, write final Markdown, write metadata JSON when requested, and write `.report.md`. It must keep public return types project-owned and must not require raw MinerU-specific Python objects from callers. ## Touched Surfaces Allowed: - `src/pdf2md/conversion.py` - `src/pdf2md/cli.py` - `src/pdf2md/__init__.py` - `src/pdf2md/paths.py` only for narrowly required path/output helper compatibility - `src/pdf2md/mineru_adapter.py` only for narrowly required adapter protocol or result compatibility - `src/pdf2md/metadata.py` only for narrowly required output-location or summary compatibility - `src/pdf2md/markdown.py` only for narrowly required orchestration compatibility - `src/pdf2md/quality.py` only for narrowly required orchestration compatibility - `src/pdf2md/report.py` only for narrowly required orchestration compatibility - `tests/test_conversion.py` - `tests/test_cli.py` - Existing focused unit tests only if a touched module requires compatibility updates - `README.md` only if a short usage note is needed for `pdf2md convert` - `PLAN.md` - `PROGRESS.md` - `docs/V1IMPLEMENTATIONPLAN.md` - `docs/Sprints/SPRINT7CONTRACT.md` Not allowed: - `src/pdf2md/doctor.py` - Working `pdf2md doctor` behavior - `scripts/` - MinerU/model installation or download scripts - Real MinerU invocation in default tests - Real GPU/CUDA checks - Real PDF content parsing outside adapter output handling - Runtime engine selection or alternate engine support - Cloud OCR, remote LLM/VLM, hosted renderer, remote document parser, remote asset fetching, HTTP client backend, router mode, `--api-url`, remote APIs, or remote OpenAI-compatible backend support - A CLI flag that disables strict-local policy - Committed files under `samples/` ## Expected Outputs Sprint 7 should produce: 1. Public conversion records and API - A project-owned conversion result type containing at least: - source PDF path - Markdown output path - metadata JSON path when written - report path - assets directory - raw output directory when kept - engine name and version - final status - warning count - warnings - `convert_pdf(input_path, output_dir, metadata=True, keep_raw=False, overwrite=False, gpu=None, strict_local=True, adapter=None, clock=None)` or an equivalently small API. - The default API path uses the direct local MinerU adapter. - Tests can inject a fake adapter and deterministic clock. - The public return type must not expose raw MinerU-specific Python objects as required fields. 2. Single-PDF orchestration - Discover and plan the single PDF using existing path helpers. - Create required output directories only after preflight path checks pass. - Run the adapter into a planned temporary or raw work directory. - Stop the individual conversion on adapter hard failure and return explicit warnings/status. - Normalize adapter Markdown into Obsidian-friendly Markdown. - Copy local adapter-provided asset files into the planned assets directory when needed. - Compute source SHA-256 with local file reads. - Build metadata from project-owned records. - Run local quality checks over normalized Markdown and asset context. - Render report Markdown from metadata and quality results. - Write final Markdown, optional metadata JSON, and report Markdown. 3. Output writing and overwrite behavior - Never write outside the planned output root. - Respect existing-output conflicts unless `overwrite=True`. - Keep writes deterministic and UTF-8 encoded for text outputs. - Preserve `--metadata` behavior: metadata JSON is written when enabled and omitted when disabled. - Always write `.report.md`. - `--keep-raw` preserves raw MinerU output in the planned raw directory. - Without `--keep-raw`, temporary raw work must be cleaned up when the conversion completes, while preserving enough failure context in metadata/report/warnings. 4. Asset materialization - Copy only local files returned by the adapter. - Do not fetch remote assets. - Do not follow asset paths that escape the adapter work directory or the planned output root. - Handle missing or invalid adapter asset paths with project-owned warnings. - Normalize final Markdown asset links to stable relative paths. 5. CLI convert command - `pdf2md convert INPUT --out OUTPUT_DIR`. - Options: - `--metadata` - `--keep-raw` - `--recursive` - `--overwrite` - `--gpu GPU_DEVICE` - `--strict-local` - Strict-local must remain enabled in v1; the CLI must not add a supported way to disable it. - Single PDF conversion returns exit code `0` on success or partial success, and non-zero when a hard error prevents conversion. - Directory conversion handles multiple PDFs deterministically and prints a summary. - Batch conversion should continue to the next file when one PDF fails after planning, then return a non-zero exit code if any PDF failed. - CLI output must include converted count, failed count, and warning count. 6. Batch conversion - Directory inputs use existing non-recursive discovery by default. - Recursive discovery occurs only with `--recursive`. - Output paths preserve relative subdirectories from the input root. - Duplicate planned outputs and overwrite conflicts fail before conversion starts. - Results are deterministic and ordered by discovery/path planning order. 7. Failure and warning behavior - MinerU failure must be clear and must not trigger fallback to any other engine. - Strict-local violations must be hard failures. - Per-file failures must include project-owned warnings. - CLI summaries must not suppress warning counts. - Metadata/report content must reflect warnings emitted during adapter, normalization, asset, and quality steps. 8. Tests - API test for one successful conversion with a fake adapter. - API test for adapter failure with no fallback. - API test for output conflict and overwrite behavior. - API test for metadata disabled behavior. - API test for local asset copying and relative Markdown links. - API test for `keep_raw` behavior. - CLI test for single PDF conversion with a fake adapter. - CLI test for directory conversion with deterministic summary output. - CLI test for recursive behavior. - CLI test for failure summary and non-zero exit code. - Tests proving default checks do not require real MinerU, GPU, models, network, `samples/`, Obsidian, or LaTeX tooling. 9. Handoff - `PROGRESS.md` records changed files, commands run, tests passed or blocked, known failures, residual risks, and next action. ## Non-Goals - Do not implement `pdf2md doctor`. - Do not implement environment diagnostics. - Do not install MinerU 3.1.0. - Do not download MinerU models. - Do not probe real MinerU output with local sample PDFs. - Do not add setup scripts. - Do not implement runtime engine selection. - Do not add alternate engines. - Do not add cloud, remote API, router, HTTP client backend, remote OpenAI-compatible backend, hosted renderer, or remote asset-fetching support. - Do not add a CLI flag or API option that disables strict-local policy. - Do not require real MinerU, CUDA, GPU, model files, network, Obsidian, LaTeX tooling, or `samples/` in default tests. - Do not implement real math rendering; use the Sprint 6 local checker boundary. - Do not commit generated conversion outputs or sample PDFs. ## Work Packages ### WP7.1: Public API And Result Records Owner: - `feature-generator-agent` - `requirements-guard-agent` Actions: - Add `conversion.py`. - Define project-owned conversion result records. - Expose `convert_pdf` from the library surface. - Support fake adapter and deterministic clock injection for tests. Output: - Callers can run one conversion without depending on raw MinerU objects. ### WP7.2: Single-PDF Orchestration And Output Writing Owner: - `feature-generator-agent` - `mineru-integration-agent` - `metadata-agent` - `obsidian-markdown-agent` Actions: - Connect path planning, adapter execution, Markdown normalization, metadata building, quality checks, and report rendering. - Write final Markdown, optional metadata JSON, and report Markdown. - Compute source SHA-256. - Preserve strict-local behavior and no-fallback behavior. Output: - One PDF can be converted through mocked adapter outputs in tests and through the real adapter in normal use. ### WP7.3: Asset And Raw Output Handling Owner: - `feature-generator-agent` - `obsidian-markdown-agent` - `metadata-agent` Actions: - Copy local adapter-provided assets into the planned assets directory. - Normalize Markdown links relative to the final Markdown file. - Preserve raw output only when requested. - Clean temporary work when raw output is not requested. Output: - Markdown, assets, metadata, and report paths are stable and local-only. ### WP7.4: CLI Convert And Batch Summary Owner: - `feature-generator-agent` - `requirements-guard-agent` Actions: - Replace the placeholder CLI with `convert` while keeping `--version`. - Add only the agreed v1 options. - Print deterministic summaries with converted, failed, and warning counts. - Return non-zero exit code when hard failures occur. Output: - Users can run `pdf2md convert` for one PDF or a directory. ### WP7.5: Independent Evaluation Owner: - `evaluation-agent` Actions: - Review completed orchestration behavior against this contract. - Verify no default test executes real MinerU, uses GPU, downloads models, uses network, or requires `samples/`. - Verify no runtime remote/API path or alternate engine is introduced. - Verify `samples/` remains untracked and unstaged. Output: - PASS/FAIL notes with any missing acceptance criteria. ## Verification Checks Required: - `git status --short --untracked-files=all` before staging confirms `samples/` remains untracked and unstaged. - `uv --version` is run and result is recorded. - `uv sync` passes. - `uv run pytest tests/test_conversion.py tests/test_cli.py` passes. - `uv run pytest` passes. - `git diff --check` passes. - Default tests do not require real MinerU, CUDA, GPU, model files, network, Obsidian, LaTeX tooling, or `samples/`. - No model downloads occur. - No network calls are required. - No candidate engine comparison is reintroduced. - No alternate engine or runtime engine selection is added. - No CLI/API option disables strict-local policy. - No `--api-url`, router mode, HTTP client backend, remote API, or remote OpenAI-compatible backend support is added. - Adapter failures produce explicit failed results and no fallback conversion. - Output files are written only after path preflight succeeds. - Existing outputs are protected unless overwrite is enabled. - CLI summaries include warning counts. - Metadata/report paths and counts match the files written. Recommended: - Keep conversion orchestration small and dependency-injected. - Prefer local temporary directories from the standard library for raw work when `keep_raw` is disabled. - Keep batch conversion a thin loop over single-file conversion. - Keep CLI formatting simple and stable enough for tests. - Use fake adapter records in tests rather than monkeypatching subprocess behavior at the CLI layer. ## Hard Failure Criteria Sprint 7 fails and must stop for a user decision if any of these are true: - Public API requires or exposes raw MinerU-specific Python objects as required return fields. - The implementation silently falls back to another engine after MinerU failure. - A CLI/API option disables strict-local policy. - The implementation adds or permits `--api-url`, remote APIs, router mode, HTTP client backends, or remote OpenAI-compatible backends. - Default tests execute real MinerU, require GPU/CUDA, download models, use network, require Obsidian/LaTeX tooling, or require `samples/`. - Output writing can escape the planned output root. - Existing files are overwritten without explicit overwrite intent. - CLI writes final outputs after a preflight hard failure. - CLI summaries suppress warning counts or failed counts. - Metadata/report content omits warnings emitted during adapter, normalization, asset, or quality steps. - `samples/` is staged or committed. ## Acceptance Criteria Sprint 7 is complete when: - `src/pdf2md/conversion.py` exists and owns conversion orchestration. - `convert_pdf` is available from the public Python package. - `pdf2md convert INPUT --out OUTPUT_DIR` exists. - Single-PDF conversion is tested with a fake adapter. - Directory and recursive conversion behavior is tested with fake adapters. - Output conflict and overwrite behavior is tested. - Adapter failure produces a clear failed result and no fallback. - Final Markdown, metadata JSON when enabled, and report Markdown are written by product behavior. - Local adapter-provided assets are copied or warned about deterministically. - `--keep-raw` behavior is tested. - CLI summaries include converted, failed, and warning counts. - Default tests do not require real MinerU, GPU, model files, network, Obsidian, LaTeX tooling, or `samples/`. - `uv sync` passes. - Targeted conversion/CLI tests pass. - `uv run pytest` passes. - `PROGRESS.md` records checks performed and residual risks. - Independent evaluation is complete. - The completed change is committed. ## Handoff Fields Use these fields when Sprint 7 completes: - Files changed: - Commands run: - Tests passed: - Tests blocked: - Known failures: - Residual risks: - User decisions needed: - Go/no-go recommendation for Sprint 8: - Next action: