# PLAN.md This file is the shared work plan for agents. Read it before starting work, then update it when the plan changes. ## Current Goal Completed work history is archived in `docs/WORKARCHIVE.md`. Sprint 10 pre-conversion PDF chunking is implemented. On this PC, base uv/Python project setup is complete in `.venv`; next runtime work is local MinerU/PyTorch/model/MathJax setup if requested. ## Active Constraints - Do not implement additional program code beyond the active user-approved sprint. - Keep MinerU 3.1.0 as the only conversion engine. - Keep processing local-only. - Target Python 3.12. - Target GPU: GTX 1070 Ti 8GB. - Default conversion device: `cuda:0`. - Run MinerU through direct local CLI execution only. - On MinerU failure, report a clear error/warning and do not silently fallback. - Write both metadata JSON and a human-readable `.report.md` quality report for conversions. - Use `samples/` only as local fixture context; do not commit sample files unless explicitly requested. ## Planned Work 1. Use `research-agent` for MinerU 3.1.0 source tracking and official-doc verification. 2. Use `requirements-guard-agent` for cross-document consistency reviews. 3. Use `mineru-integration-agent` for direct local MinerU CLI adapter planning. 4. Use `obsidian-markdown-agent` for math-heavy Obsidian Markdown output planning. 5. Use `metadata-agent` for provenance, warning, JSON metadata, and `.report.md` planning. 6. Use `evaluation-agent` for local fixture coverage and regression criteria. 7. Use `local-setup-agent` for Python 3.12, uv, CUDA, GTX 1070 Ti 8GB, and doctor-check planning. 8. Use `license-privacy-agent` for license and strict-local privacy review. 9. Use `harness-planner-agent` to turn substantial implementation requests into scoped contracts before code work starts. 10. Use `feature-generator-agent` to implement one approved contract at a time after the user explicitly requests implementation. 11. Use `evaluation-agent` as the independent contract reviewer and QA evaluator before and after each implementation chunk. 12. Follow `docs/V1IMPLEMENTATIONPLAN.md` for the v1 implementation sprint sequence. 13. Use `docs/Sprints/SPRINT10CONTRACT.md` for the implemented long-PDF pre-conversion chunking sprint. 14. Use `docs/WORKARCHIVE.md` for completed sprint history, prior verification, runtime setup evidence, and sample conversion evidence. ## Open Questions - None. ## Decisions - Use `PLAN.md` for intended work and ownership. - Use `PROGRESS.md` for completed work, current status, blockers, and next actions. - Use `docs/WORKARCHIVE.md` for archived completed work and historical handoff details. - MinerU default local CLI execution is the only v1 execution mode. - MinerU 3.1.0 may launch a temporary local `mineru-api` internally when `mineru` CLI runs without `--api-url`. - Strict-local mode forbids `--api-url`, remote APIs, router mode, HTTP client backends, and remote OpenAI-compatible backends. - No silent fallback after MinerU failure. - Conversion output includes both metadata JSON and `.report.md`. - Local MathJax render checking is optional and nonfatal; missing Node.js or MathJax must produce a clear warning instead of blocking conversion. - Project-scoped custom agents live in `.codex/agents/*.toml`. - Project prompt commands live in `.codex/commands/*.md`. - Project-specific skills live in `.codex/skills/*/SKILL.md`. - Project hooks live in `.codex/hooks.json` and `.codex/hooks/*.py`. - Agent, command, skill, and hook assets are written in English for Codex compatibility. - Long-running implementation should use a planner/generator/evaluator harness only when the task complexity justifies the overhead. - Each substantial implementation chunk should have a sprint contract with objective, scope, verification, failure thresholds, and handoff fields. - Generator agents may self-check, but independent evaluation is required before marking a chunk complete. - V1 implementation sequencing and sprint contracts live in `docs/V1IMPLEMENTATIONPLAN.md`. - Concrete sprint contract documents live under `docs/Sprints/`. - Sprint 2 path planning contract lives at `docs/Sprints/SPRINT2CONTRACT.md`. - Sprint 3 domain records and metadata contract lives at `docs/Sprints/SPRINT3CONTRACT.md`. - Sprint 4 MinerU adapter contract lives at `docs/Sprints/SPRINT4CONTRACT.md`. - Sprint 4 fixes the v1 adapter executable to the direct `mineru` CLI; user-specified alternate executables, including `mineru-api`, are prohibited. - Sprint 5 Obsidian Markdown normalization and asset link contract lives at `docs/Sprints/SPRINT5CONTRACT.md`. - Sprint 5 owns Markdown normalization only; it does not write final Markdown files, copy assets, run MinerU, or connect to conversion orchestration. - Sprint 6 quality checks and report generation contract lives at `docs/Sprints/SPRINT6CONTRACT.md`. - Sprint 6 owns quality/report boundaries only; it does not write final files, run MinerU, or connect to conversion orchestration. - Sprint 7 conversion orchestration, CLI, and Python API contract lives at `docs/Sprints/SPRINT7CONTRACT.md`. - Sprint 7 will be the first implementation sprint allowed to write final Markdown, metadata JSON, report Markdown, and local copied assets as product behavior. - Sprint 7 implemented conversion orchestration, `convert_pdf`, batch conversion, `pdf2md convert`, output writing, metadata/report writing, and fake-adapter CLI/API tests. - Sprint 8 should cover `pdf2md doctor` and setup documentation; Sprint 7 intentionally did not add doctor behavior. - Sprint 8 doctor and setup documentation contract lives at `docs/Sprints/SPRINT8CONTRACT.md`. - Sprint 8 owns doctor diagnostics and setup docs only; it must not run real MinerU, download models, run sample PDFs, or add runtime remote/API paths in default tests. - Sprint 8 implements `pdf2md doctor`, local setup diagnostics, and setup documentation without running real MinerU, downloading models, or touching `samples/` in default tests. - Sprint 9 local fixture evaluation and v1 release gate contract lives at `docs/Sprints/SPRINT9CONTRACT.md`. - Sprint 9 must keep default tests independent of real MinerU, GPU, models, network, Obsidian, LaTeX tooling, and `samples/`; real MinerU fixture checks must be explicit opt-in only. - Sprint 9 implements fast mocked integration tests, explicit opt-in local MinerU fixture evaluation, and `docs/V1RELEASECHECKLIST.md`. - `pdf2md convert` defaults to `--gpu cuda:0`. - The MinerU adapter maps CUDA device requests to local subprocess environment variables instead of adding speculative MinerU CLI flags. - GTX 1070 Ti local runtime uses PyTorch `2.6.0+cu126` and `torchvision 0.21.0+cu126` installed after `uv sync`, followed by `mineru[core]==3.1.0`. - MinerU models are downloaded with `mineru-models-download -s huggingface -m all`, and runtime model loading uses `MINERU_MODEL_SOURCE=local`. - Sprint 10 uses `pypdf` for local PDF page chunk planning and temporary chunk PDF writing. - Sprint 10 converts chunk PDFs independently and does not merge generated Markdown outputs. - Chunking is opt-in through `--chunk-pages`; if the option is present without a value, the CLI uses 20 pages per chunk. - `convert_pdf()` keeps returning `ConversionResult` without chunking and returns `BatchConversionResult` when `chunk_pages` is set. - Chunk PDFs are temporary local files and are deleted after conversion completes, including when raw MinerU output is retained.