PROGRESS.md
This file records current progress for agents. Read it before starting work, then update it after meaningful changes. Completed historical work is archived in docs/WORKARCHIVE.md.
Current Status
- Project direction is documented in `PRD.md`, `ARCHITECTURE.md`, `AGENTS.md`, and `docs/KNOWLEDGEBASE.md`.
- MinerU 3.1.0 is fixed as the only conversion engine.
- The converter currently includes path planning, project-owned records, metadata, direct local MinerU adapter boundary, Obsidian Markdown normalization, local quality checks, report rendering, conversion orchestration, `pdf2md convert`, `pdf2md recheck`, `pdf2md doctor`, local MathJax render checking, conservative MathJax warning mitigation, release-gate tests, and opt-in pre-conversion PDF chunking.
- `docs/V1IMPLEMENTATIONPLAN.md` defines the v1 implementation sequence.
- `docs/Sprints/` contains completed sprint contracts through Sprint 11.
- `docs/WORKARCHIVE.md` contains completed sprint history, historical verification results, runtime setup notes, and sample conversion evidence.
- `samples/` exists locally as fixture context.
- `outputs/` is ignored and contains local generated conversion outputs.
Environment Notes
- OS/workspace: Windows PowerShell in `C:\git\PDFToMD`.
- Python target: 3.12.
- Local project Python observed: 3.12.13 in `.venv`.
- `uv` is installed per-user at `C:\Users\baram\.local\bin`.
- Target GPU documented for the original project setup: NVIDIA GTX 1070 Ti 8GB.
- Current PC GPU observed by `doctor`: NVIDIA GeForce RTX 4080 SUPER 16GB.
- Default conversion device: `cuda:0`.
- MinerU execution mode: direct local `mineru` CLI only.
- Strict-local allows MinerU 3.1.0's CLI-internal temporary local `mineru-api` when the CLI runs without `--api-url`.
- Strict-local prohibits `--api-url`, remote APIs, router mode, HTTP client backends, and remote OpenAI-compatible backends.
- Current `.venv` has project fast-test dependencies, CUDA-enabled PyTorch `2.6.0+cu126`, `torchvision 0.21.0+cu126`, and `mineru[core]==3.1.0`.
- Current `pdf2md doctor` status is PASS. MinerU, RTX 4080 SUPER CUDA PyTorch, local model config, MathJax, and strict-local checks pass.
- MinerU models were downloaded from Hugging Face by explicit setup command. Runtime model loading uses `MINERU_MODEL_SOURCE=local`.
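For orientation, the two strict-local conditions that are named concretely above (no `--api-url`, and `MINERU_MODEL_SOURCE=local`) could be checked roughly as follows. This is only a sketch: `strict_local_problems` is a hypothetical helper, not the project's actual `doctor` logic, and it does not cover router mode, HTTP client backends, or remote OpenAI-compatible backends because this file does not say how those are selected.

```python
from typing import Mapping, Sequence


def strict_local_problems(argv: Sequence[str], env: Mapping[str, str]) -> list[str]:
    """Return strict-local violations for a planned command line (hypothetical helper)."""
    problems: list[str] = []
    # The flag may be passed as "--api-url URL" or "--api-url=URL".
    if any(arg == "--api-url" or arg.startswith("--api-url=") for arg in argv):
        problems.append("--api-url is prohibited in strict-local mode")
    # Runtime model loading is expected to use local models only.
    if env.get("MINERU_MODEL_SOURCE") != "local":
        problems.append("MINERU_MODEL_SOURCE must be set to 'local'")
    return problems
```

An empty list means neither of the two checked conditions is violated; the argument list passed in is whatever the adapter intends to run.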
Recent Completed Work
- Archived completed sprint and setup history into `docs/WORKARCHIVE.md`.
- Added `docs/WORKARCHIVE.md` references to `AGENTS.md`, `PLAN.md`, `docs/V1IMPLEMENTATIONPLAN.md`, relevant `.codex/agents/*.toml`, `.codex/commands/*.md`, and project skills.
- Sprint 10 is implemented with `pypdf>=6.10.2,<7`, `src/pdf2md/pdf_splitter.py`, `--chunk-pages [PAGES]`, chunk-aware conversion orchestration, temporary chunk cleanup, and chunk report context.
- `--chunk-pages` is opt-in; when present without a value it uses 20 pages.
- `convert_pdf()` returns `BatchConversionResult` when `chunk_pages` is set and keeps returning `ConversionResult` when chunking is unset.
- Converted `samples/FourNodeQuadrilateralShellElementMITC4.pdf` with `MINERU_MODEL_SOURCE=local` and default `--gpu cuda:0`; output was written to ignored `outputs/FourNodeQuadrilateralShellElementMITC4/`.
- The FourNode sample conversion report status was `success`: 7 pages, 22 assets, 38 inline formulas, 16 display formulas, 0 math render errors, and 0 warnings.
- Installed uv `0.11.12` at `C:\Users\baram\.local\bin`, installed uv-managed CPython `3.12.13`, created `.venv`, and ran `uv sync`.
- Verified base project environment with `uv run pytest`: 163 passed, 1 skipped.
- Installed runtime dependencies on this PC: CUDA PyTorch `2.6.0+cu126`, `torchvision 0.21.0+cu126`, `mineru[core]==3.1.0`, local MathJax npm dependencies, and local MinerU models.
- Set user environment variable `MINERU_MODEL_SOURCE=local`.
- Verified full local runtime with `uv run pdf2md doctor`: PASS.
- Verified real local sample conversion: `samples/FourNodeQuadrilateralShellElementMITC4.pdf` to ignored `outputs/runtime-smoke/`, status `success`, 7 pages, 22 assets, 38 inline formulas, 16 display formulas, 0 math render errors, and 0 warnings.
- Converted `samples/MITC공부.pdf` to ignored `outputs/MITC공부/`; report status was `partial`: 13 pages, 107 assets, 23 inline formulas, 103 display formulas, 2 MathJax render warnings, and 0 missing or invalid asset links.
- Added `recheck_markdown()` and `pdf2md recheck <markdown.md>` to rerun local quality checks for an existing generated Markdown file and rewrite the adjacent metadata JSON and `.report.md` without rerunning MinerU.
- Verified `uv run pdf2md recheck outputs\MITC공부\MITC공부.md`; the command regenerated metadata/report and still reported 2 warnings because the current Markdown still contains the two MathJax-invalid expressions.
- Reconverted `samples/MITC공부.pdf` with `--overwrite` to ignored `outputs/MITC공부/`; report status remains `partial`: 13 pages, 107 assets, 23 inline formulas, 103 display formulas, 2 MathJax render warnings, and 0 missing or invalid asset links.
- Sprint 11 implemented conservative MathJax warning mitigation with failed-expression details, `src/pdf2md/math_repair.py`, shared `convert`/`recheck` repair integration, and `MATH_RENDER_REPAIRED` info warnings.
- Verified default fast suite: `uv run pytest` passed 172 tests with 1 skipped.
- Verified requested real sample: `uv run pdf2md convert samples\MITC공부.pdf --out outputs\sprint11-MITC공부 --overwrite` succeeded with 13 pages, 107 assets, 23 inline formulas, 103 display formulas, 0 MathJax render errors, and 2 `MATH_RENDER_REPAIRED` info warnings.
- Reconverted `samples/MITC공부.pdf` to ignored `outputs/MITC공부/` with Sprint 11 mitigation; report status is `partial` from 2 `MATH_RENDER_REPAIRED` info warnings, with 13 pages, 107 assets, 23 inline formulas, 103 display formulas, 0 MathJax render errors, and 0 missing or invalid asset links.
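As a reminder of the `--chunk-pages [PAGES]` semantics recorded above (opt-in, 20 pages when given without a value), an optional-value flag of this kind can be expressed with `argparse` roughly as below. This is a minimal sketch, assuming an `argparse`-style CLI; `build_parser` and the `pdf2md-sketch` program name are hypothetical and the real `pdf2md` parser may be built differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a tiny parser that mimics only the --chunk-pages option (hypothetical)."""
    parser = argparse.ArgumentParser(prog="pdf2md-sketch")
    parser.add_argument(
        "--chunk-pages",
        nargs="?",       # the value after the flag is optional
        type=int,
        const=20,        # used when the flag is present without a value
        default=None,    # flag absent: chunking stays off
        metavar="PAGES",
    )
    return parser


if __name__ == "__main__":
    for argv in ([], ["--chunk-pages"], ["--chunk-pages", "5"]):
        print(argv, build_parser().parse_args(argv).chunk_pages)
```

With this shape, `chunk_pages is None` is the natural signal for the unchunked path, which lines up with `convert_pdf()` returning `ConversionResult` when chunking is unset and `BatchConversionResult` otherwise.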
In Progress
- No active implementation chunk.
Blockers
- No active blocker.
Next Actions
- Review generated sample Markdown outputs in Obsidian if visual quality needs manual assessment.
- Run additional real local sample validation only if requested, especially for new MathJax failure messages not covered by Sprint 11's narrow repair rules.
- Run optional real local chunked conversion on a long sample only if requested.
- Preserve strict-local runtime behavior: use local model paths, direct CLI execution, and no user-specified API or remote backend.