7.3 KiB
7.3 KiB
PLAN.md
This file is the shared work plan for agents. Read it before starting work, then update it when the plan changes.
Current Goal
Completed work history is archived in docs/WORKARCHIVE.md. Sprint 10 pre-conversion PDF chunking is implemented. On this PC, full local runtime setup is complete in .venv; next work is optional manual Obsidian quality review or additional sample validation if requested.
Active Constraints
- Do not implement additional program code beyond the active user-approved sprint.
- Keep MinerU 3.1.0 as the only conversion engine.
- Keep processing local-only.
- Target Python 3.12.
- Target GPU: GTX 1070 Ti 8GB.
- Default conversion device:
cuda:0. - Run MinerU through direct local CLI execution only.
- On MinerU failure, report a clear error/warning and do not silently fallback.
- Write both metadata JSON and a human-readable
.report.mdquality report for conversions. - Use
samples/only as local fixture context; do not commit sample files unless explicitly requested.
Planned Work
- Use
research-agentfor MinerU 3.1.0 source tracking and official-doc verification. - Use
requirements-guard-agentfor cross-document consistency reviews. - Use
mineru-integration-agentfor direct local MinerU CLI adapter planning. - Use
obsidian-markdown-agentfor math-heavy Obsidian Markdown output planning. - Use
metadata-agentfor provenance, warning, JSON metadata, and.report.mdplanning. - Use
evaluation-agentfor local fixture coverage and regression criteria. - Use
local-setup-agentfor Python 3.12, uv, CUDA, GTX 1070 Ti 8GB, and doctor-check planning. - Use
license-privacy-agentfor license and strict-local privacy review. - Use
harness-planner-agentto turn substantial implementation requests into scoped contracts before code work starts. - Use
feature-generator-agentto implement one approved contract at a time after the user explicitly requests implementation. - Use
evaluation-agentas the independent contract reviewer and QA evaluator before and after each implementation chunk. - Follow
docs/V1IMPLEMENTATIONPLAN.mdfor the v1 implementation sprint sequence. - Use
docs/Sprints/SPRINT10CONTRACT.mdfor the implemented long-PDF pre-conversion chunking sprint. - Use
docs/WORKARCHIVE.mdfor completed sprint history, prior verification, runtime setup evidence, and sample conversion evidence.
Open Questions
- None.
Decisions
- Use
PLAN.mdfor intended work and ownership. - Use
PROGRESS.mdfor completed work, current status, blockers, and next actions. - Use
docs/WORKARCHIVE.mdfor archived completed work and historical handoff details. - MinerU default local CLI execution is the only v1 execution mode.
- MinerU 3.1.0 may launch a temporary local
mineru-apiinternally whenmineruCLI runs without--api-url. - Strict-local mode forbids
--api-url, remote APIs, router mode, HTTP client backends, and remote OpenAI-compatible backends. - No silent fallback after MinerU failure.
- Conversion output includes both metadata JSON and
<stem>.report.md. - Local MathJax render checking is optional and nonfatal; missing Node.js or MathJax must produce a clear warning instead of blocking conversion.
- Project-scoped custom agents live in
.codex/agents/*.toml. - Project prompt commands live in
.codex/commands/*.md. - Project-specific skills live in
.codex/skills/*/SKILL.md. - Project hooks live in
.codex/hooks.jsonand.codex/hooks/*.py. - Agent, command, skill, and hook assets are written in English for Codex compatibility.
- Long-running implementation should use a planner/generator/evaluator harness only when the task complexity justifies the overhead.
- Each substantial implementation chunk should have a sprint contract with objective, scope, verification, failure thresholds, and handoff fields.
- Generator agents may self-check, but independent evaluation is required before marking a chunk complete.
- V1 implementation sequencing and sprint contracts live in
docs/V1IMPLEMENTATIONPLAN.md. - Concrete sprint contract documents live under
docs/Sprints/. - Sprint 2 path planning contract lives at
docs/Sprints/SPRINT2CONTRACT.md. - Sprint 3 domain records and metadata contract lives at
docs/Sprints/SPRINT3CONTRACT.md. - Sprint 4 MinerU adapter contract lives at
docs/Sprints/SPRINT4CONTRACT.md. - Sprint 4 fixes the v1 adapter executable to the direct
mineruCLI; user-specified alternate executables, includingmineru-api, are prohibited. - Sprint 5 Obsidian Markdown normalization and asset link contract lives at
docs/Sprints/SPRINT5CONTRACT.md. - Sprint 5 owns Markdown normalization only; it does not write final Markdown files, copy assets, run MinerU, or connect to conversion orchestration.
- Sprint 6 quality checks and report generation contract lives at
docs/Sprints/SPRINT6CONTRACT.md. - Sprint 6 owns quality/report boundaries only; it does not write final files, run MinerU, or connect to conversion orchestration.
- Sprint 7 conversion orchestration, CLI, and Python API contract lives at
docs/Sprints/SPRINT7CONTRACT.md. - Sprint 7 will be the first implementation sprint allowed to write final Markdown, metadata JSON, report Markdown, and local copied assets as product behavior.
- Sprint 7 implemented conversion orchestration,
convert_pdf, batch conversion,pdf2md convert, output writing, metadata/report writing, and fake-adapter CLI/API tests. - Sprint 8 should cover
pdf2md doctorand setup documentation; Sprint 7 intentionally did not add doctor behavior. - Sprint 8 doctor and setup documentation contract lives at
docs/Sprints/SPRINT8CONTRACT.md. - Sprint 8 owns doctor diagnostics and setup docs only; it must not run real MinerU, download models, run sample PDFs, or add runtime remote/API paths in default tests.
- Sprint 8 implements
pdf2md doctor, local setup diagnostics, and setup documentation without running real MinerU, downloading models, or touchingsamples/in default tests. - Sprint 9 local fixture evaluation and v1 release gate contract lives at
docs/Sprints/SPRINT9CONTRACT.md. - Sprint 9 must keep default tests independent of real MinerU, GPU, models, network, Obsidian, LaTeX tooling, and
samples/; real MinerU fixture checks must be explicit opt-in only. - Sprint 9 implements fast mocked integration tests, explicit opt-in local MinerU fixture evaluation, and
docs/V1RELEASECHECKLIST.md. pdf2md convertdefaults to--gpu cuda:0.- The MinerU adapter maps CUDA device requests to local subprocess environment variables instead of adding speculative MinerU CLI flags.
- GTX 1070 Ti local runtime uses PyTorch
2.6.0+cu126andtorchvision 0.21.0+cu126installed afteruv sync, followed bymineru[core]==3.1.0. - MinerU models are downloaded with
mineru-models-download -s huggingface -m all, and runtime model loading usesMINERU_MODEL_SOURCE=local. - Sprint 10 uses
pypdffor local PDF page chunk planning and temporary chunk PDF writing. - Sprint 10 converts chunk PDFs independently and does not merge generated Markdown outputs.
- Chunking is opt-in through
--chunk-pages; if the option is present without a value, the CLI uses 20 pages per chunk. convert_pdf()keeps returningConversionResultwithout chunking and returnsBatchConversionResultwhenchunk_pagesis set.- Chunk PDFs are temporary local files and are deleted after conversion completes, including when raw MinerU output is retained.