# Sprint 1 Contract: Project Scaffold And Fast Test Loop

Status: Completed

Last updated: 2026-05-07

## Objective

Create the minimal Python project scaffold and fast local test loop for the PDF-to-Markdown converter.

Sprint 1 must establish:

- A `uv`-managed Python 3.12 project.
- A source package importable as `pdf2md`.
- A reserved `pdf2md` CLI entry point that does not implement conversion yet.
- A fast test command that runs without MinerU, model downloads, GPU access, sample PDFs, or network access.

Sprint 1 is scaffolding only. It must not implement PDF conversion, MinerU execution, Markdown normalization, metadata generation, or report generation.

## Current Precondition

Sprint 0 found that `uv` was not available on PATH in the current local environment. Sprint 1 resolved this by installing `uv` per-user at `C:\Users\user\.local\bin`.

Before Sprint 1 can be accepted, one of these must happen:

- `uv` is installed and `uv --version` succeeds.
- The user explicitly approves including `uv` bootstrap documentation or setup handling as part of Sprint 1, and the contract result records that `uv sync` could not be run locally.

Do not silently replace `uv` with another package manager.
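
The `uv` precondition can be checked mechanically before attempting `uv sync`. The following is a minimal sketch, not part of the contract deliverables: the helper name `uv_version` is hypothetical, and it uses only the Python standard library together with the `uv --version` command named above.

```python
from __future__ import annotations

import shutil
import subprocess


def uv_version(executable: str = "uv") -> str | None:
    """Return the output of `<executable> --version`, or None if unavailable.

    `uv_version` is a hypothetical helper name used for illustration only.
    """
    # shutil.which resolves the executable on PATH (honouring PATHEXT on Windows).
    path = shutil.which(executable)
    if path is None:
        return None
    try:
        result = subprocess.run(
            [path, "--version"],
            capture_output=True,
            text=True,
            check=True,
            timeout=30,
        )
    except (OSError, subprocess.SubprocessError):
        # Found on PATH, but it could not be run successfully or timed out.
        return None
    return result.stdout.strip()
```

A `None` result corresponds to the blocked case in this contract: record the failure rather than switching to another package manager.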

## Touched Surfaces

Allowed:

- `pyproject.toml`
- `uv.lock`
- `.gitignore`
- `src/pdf2md/__init__.py`
- `src/pdf2md/cli.py` only for a minimal placeholder CLI if needed for entry point verification
- `tests/`
- `README.md` only for minimal setup/test instructions if needed
- `PLAN.md` only for current-goal coordination updates required by the shared agent workflow
- `PROGRESS.md`
- `docs/V1IMPLEMENTATIONPLAN.md` only if sequencing or constraints need adjustment
- `docs/Sprints/SPRINT1CONTRACT.md`

Not allowed:

- `src/pdf2md/conversion.py`
- `src/pdf2md/mineru_adapter.py`
- `src/pdf2md/paths.py`
- `src/pdf2md/ir.py`
- `src/pdf2md/markdown.py`
- `src/pdf2md/metadata.py`
- `src/pdf2md/quality.py`
- `src/pdf2md/report.py`
- `src/pdf2md/doctor.py`
- `scripts/`
- Any real MinerU invocation
- Any model download or install script
- Any committed file under `samples/`

## Expected Outputs

Sprint 1 should produce:

1. Project package scaffold
   - `pyproject.toml` with project metadata.
   - Python requirement constrained to Python 3.12.
   - Build configuration suitable for a `src/` layout.
   - `uv.lock` generated by `uv sync`.
   - `.gitignore` entries for local virtual environments, pytest cache, and Python bytecode.
   - Minimal test dependency configuration.
   - CLI entry point name reserved as `pdf2md`.
2. Minimal source package
   - `src/pdf2md/__init__.py`.
   - A stable package import surface.
   - Optional minimal `src/pdf2md/cli.py` placeholder that exits clearly and does not imply conversion is implemented.
3. Fast test loop
   - A minimal test suite that verifies the package imports.
   - If a CLI placeholder is added, a smoke test that verifies the CLI entry point is wired without invoking conversion.
   - Tests must not require MinerU, CUDA, GPU, model files, `samples/`, or network.
4. Developer workflow
   - `uv sync` should work when `uv` is installed.
   - `uv run pytest` should work when `uv` is installed.
   - If `uv` is still missing locally, record the failure explicitly in `PROGRESS.md` and do not mark Sprint 1 complete.
5. Handoff
   - `PROGRESS.md` records changed files, commands run, tests passed or blocked, known failures, residual risks, and next action.

## Non-Goals

- Do not implement PDF discovery.
- Do not implement conversion orchestration.
- Do not implement the MinerU adapter.
- Do not run MinerU.
- Do not install MinerU 3.1.0.
- Do not download MinerU models.
- Do not implement Markdown normalization.
- Do not implement metadata JSON or `.report.md` output.
- Do not implement `pdf2md doctor`; a CLI placeholder may mention future commands, but it must not create a doctor module.
- Do not add runtime engine selection.
- Do not add alternate conversion engines.
- Do not add cloud, remote API, router, HTTP client backend, or remote OpenAI-compatible backend support.

## Work Packages

### WP1.1: Scaffold Metadata

Owner:

- `feature-generator-agent`

Actions:

- Create the minimal `pyproject.toml`.
- Use Python 3.12 constraints.
- Configure a `src/` package layout.
- Configure pytest as the fast local test runner.
- Reserve the `pdf2md` console script.

Output:

- A minimal, maintainable scaffold without speculative dependencies.

### WP1.2: Package Import Surface

Owner:

- `feature-generator-agent`

Actions:

- Create `src/pdf2md/__init__.py`.
- Expose only a minimal version/import surface.
- Avoid public API promises beyond what Sprint 1 verifies.

Output:

- `import pdf2md` succeeds.

### WP1.3: CLI Placeholder

Owner:

- `feature-generator-agent`

Actions:

- If needed for console script verification, create `src/pdf2md/cli.py`.
- The placeholder may expose a help message or a clear "not implemented yet" command.
- It must not create conversion flags beyond the reserved command shape unless tests need them.

Output:

- `pdf2md` entry point is wired without implying conversion works.
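
One possible shape for the optional `src/pdf2md/cli.py` placeholder is sketched below. This is an illustration under stated assumptions, not the committed implementation: the `main` and `build_parser` names, the message wording, and the exit code are all assumed, since the contract fixes none of them. It defines no conversion flags and only reserves the `pdf2md` program name.

```python
from __future__ import annotations

import argparse
import sys
from collections.abc import Sequence

# Assumed value: the contract does not specify an exit code for the placeholder.
NOT_IMPLEMENTED_EXIT_CODE = 1


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that reserves the command name and offers only --help."""
    return argparse.ArgumentParser(
        prog="pdf2md",
        description="PDF-to-Markdown converter (placeholder; conversion is not implemented yet).",
    )


def main(argv: Sequence[str] | None = None) -> int:
    """Parse arguments, report that conversion is not implemented, and return a non-zero code."""
    parser = build_parser()
    # With argv=None, argparse reads sys.argv[1:]; tests can pass an explicit list.
    parser.parse_args(argv)
    print("pdf2md: conversion is not implemented yet.", file=sys.stderr)
    return NOT_IMPLEMENTED_EXIT_CODE
```

Assuming this module layout, the console script would be wired with a `[project.scripts]` entry of the form `pdf2md = "pdf2md.cli:main"` in `pyproject.toml`. A smoke test can call `main([])` directly and assert on the return value, which exercises the wiring without invoking any conversion.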

### WP1.4: Fast Tests

Owner:

- `feature-generator-agent`
- `evaluation-agent`

Actions:

- Add minimal tests for package import and optional CLI placeholder behavior.
- Ensure tests are local, fast, and independent of MinerU/model/GPU/network state.

Output:

- `uv run pytest` passes when `uv` is available.

### WP1.5: Independent Evaluation

Owner:

- `evaluation-agent`

Actions:

- Review the completed scaffold against this contract.
- Verify no converter implementation was added.
- Verify `samples/` remains untracked and unstaged.
- Verify no runtime remote path or alternate engine was introduced.

Output:

- PASS/FAIL notes with any missing acceptance criteria.

## Verification Checks

Required:

- `git status --short` before staging confirms `samples/` remains untracked.
- `uv --version` is run and result is recorded.
- `uv sync` passes if `uv` is available.
- `uv run pytest` passes if `uv` is available.
- If `uv` is unavailable, Sprint 1 is marked blocked rather than complete.
- Import test passes through the configured test command.
- No real MinerU dependency is required for default tests.
- No model downloads occur.
- No network calls are required.
- No candidate engine comparison is reintroduced.
- No conversion behavior is implemented.
- `git diff --check` passes.

Recommended:

- Keep `pyproject.toml` dependency list minimal.
- Avoid adding README content beyond setup/test instructions needed for the scaffold.
- Use `requirements-guard-agent` to check document consistency if the scaffold reveals a sequencing issue.

## Hard Failure Criteria

Sprint 1 fails and must stop for a user decision if any of these are true:

- `uv` remains unavailable and the user has not approved bootstrap handling.
- The project cannot be installed as a Python 3.12 package.
- The package cannot be imported as `pdf2md`.
- Default tests require MinerU, model downloads, GPU access, sample PDFs, or network access.
- The scaffold introduces conversion logic outside Sprint 1 scope.
- The scaffold introduces alternate engines or runtime engine selection.
- The scaffold introduces `--api-url`, remote APIs, router mode, HTTP client backends, or remote OpenAI-compatible backends.
- `samples/` is staged or committed.

## Acceptance Criteria

Sprint 1 is complete when:

- `pyproject.toml` exists and defines a minimal Python 3.12 `uv` project.
- `src/pdf2md/__init__.py` exists and `import pdf2md` works through the project environment.
- `uv sync` passes.
- `uv run pytest` passes.
- The `pdf2md` CLI entry point is reserved and does not imply conversion is implemented.
- No converter implementation code beyond the allowed placeholder exists.
- No default test depends on MinerU, GPU, model files, network, or `samples/`.
- `PROGRESS.md` records checks performed and residual risks.
- Independent evaluation is complete.
- The completed change is committed.

## Handoff Fields

Use these fields when Sprint 1 completes:

- Files changed:
- Commands run:
- Tests passed:
- Tests blocked:
- Known failures:
- Residual risks:
- User decisions needed:
- Go/no-go recommendation for Sprint 2:
- Next action: