Files
PDFToMD/PROGRESS.md
T
2026-05-08 17:03:40 +09:00

3.6 KiB

PROGRESS.md

This file records current progress for agents. Read it before starting work, then update it after meaningful changes. Completed historical work is archived in docs/WORKARCHIVE.md.

Current Status

  • Project direction is documented in PRD.md, ARCHITECTURE.md, AGENTS.md, and docs/KNOWLEDGEBASE.md.
  • MinerU 3.1.0 is fixed as the only conversion engine.
  • The converter currently includes path planning, project-owned records, metadata, direct local MinerU adapter boundary, Obsidian Markdown normalization, local quality checks, report rendering, conversion orchestration, pdf2md convert, pdf2md doctor, local MathJax render checking, release-gate tests, and opt-in pre-conversion PDF chunking.
  • docs/V1IMPLEMENTATIONPLAN.md defines the v1 implementation sequence.
  • docs/Sprints/ contains completed sprint contracts through Sprint 10.
  • docs/WORKARCHIVE.md contains completed sprint history, historical verification results, runtime setup notes, and sample conversion evidence.
  • samples/ exists locally and is untracked by git.
  • outputs/ is ignored and contains local generated conversion outputs.

Environment Notes

  • OS/workspace: Windows PowerShell in D:\Work\Repos\AICoding\ConvertPDFToMD.
  • Python target: 3.12.
  • Local Python observed: 3.12.7.
  • uv is installed per-user at C:\Users\user\.local\bin.
  • Target GPU: NVIDIA GTX 1070 Ti 8GB.
  • Default conversion device: cuda:0.
  • MinerU execution mode: direct local mineru CLI only.
  • Strict-local allows MinerU 3.1.0's CLI-internal temporary local mineru-api when the CLI runs without --api-url.
  • Strict-local prohibits --api-url, remote APIs, router mode, HTTP client backends, and remote OpenAI-compatible backends.
  • Current local runtime has CUDA-enabled PyTorch 2.6.0+cu126, torchvision 0.21.0+cu126, mineru[core]==3.1.0, local MinerU models, and MINERU_MODEL_SOURCE=local.
  • Current pdf2md doctor status is WARN only because GTX 1070 Ti is Pascal/pre-Turing; MinerU, CUDA PyTorch, local model config, MathJax, and strict-local checks pass.

Recent Completed Work

  • Archived completed sprint and setup history into docs/WORKARCHIVE.md.
  • Added docs/WORKARCHIVE.md references to AGENTS.md, PLAN.md, docs/V1IMPLEMENTATIONPLAN.md, relevant .codex/agents/*.toml, .codex/commands/*.md, and project skills.
  • Sprint 10 is implemented with pypdf>=6.10.2,<7, src/pdf2md/pdf_splitter.py, --chunk-pages [PAGES], chunk-aware conversion orchestration, temporary chunk cleanup, and chunk report context.
  • --chunk-pages is opt-in; when present without a value it uses 20 pages.
  • convert_pdf() returns BatchConversionResult when chunk_pages is set and keeps returning ConversionResult when chunking is unset.
  • Converted samples/FourNodeQuadrilateralShellElementMITC4.pdf with MINERU_MODEL_SOURCE=local and default --gpu cuda:0; output was written to ignored outputs/FourNodeQuadrilateralShellElementMITC4/.
  • The FourNode sample conversion report status was success: 7 pages, 22 assets, 38 inline formulas, 16 display formulas, 0 math render errors, and 0 warnings.

In Progress

  • No active implementation chunk.

Blockers

  • No active blocker.
  • GTX 1070 Ti remains an 8GB Pascal GPU; larger PDFs may still hit VRAM or model compatibility limits.

Next Actions

  1. Review generated sample Markdown outputs in Obsidian if visual quality needs manual assessment.
  2. Run optional real local chunked conversion on a long sample only if requested.
  3. Preserve strict-local runtime behavior: use local model paths, direct CLI execution, and no user-specified API or remote backend.