65 lines
5.3 KiB
Markdown
65 lines
5.3 KiB
Markdown
# PROGRESS.md
|
|
|
|
This file records current progress for agents. Read it before starting work, then update it after meaningful changes. Completed historical work is archived in `docs/WORKARCHIVE.md`.
|
|
|
|
## Current Status
|
|
|
|
- Project direction is documented in `PRD.md`, `ARCHITECTURE.md`, `AGENTS.md`, and `docs/KNOWLEDGEBASE.md`.
|
|
- MinerU 3.1.0 is fixed as the only conversion engine.
|
|
- The converter currently includes path planning, project-owned records, metadata, direct local MinerU adapter boundary, Obsidian Markdown normalization, local quality checks, report rendering, conversion orchestration, `pdf2md convert`, `pdf2md recheck`, `pdf2md doctor`, local MathJax render checking, release-gate tests, and opt-in pre-conversion PDF chunking.
|
|
- `docs/V1IMPLEMENTATIONPLAN.md` defines the v1 implementation sequence.
|
|
- `docs/Sprints/` contains completed sprint contracts through Sprint 10.
|
|
- `docs/WORKARCHIVE.md` contains completed sprint history, historical verification results, runtime setup notes, and sample conversion evidence.
|
|
- `samples/` exists locally as fixture context.
|
|
- `outputs/` is ignored and contains local generated conversion outputs.
|
|
|
|
## Environment Notes
|
|
|
|
- OS/workspace: Windows PowerShell in `C:\git\PDFToMD`.
|
|
- Python target: 3.12.
|
|
- Local project Python observed: 3.12.13 in `.venv`.
|
|
- `uv` is installed per-user at `C:\Users\baram\.local\bin`.
|
|
- Target GPU documented for the original project setup: NVIDIA GTX 1070 Ti 8GB.
|
|
- Current PC GPU observed by `doctor`: NVIDIA GeForce RTX 4080 SUPER 16GB.
|
|
- Default conversion device: `cuda:0`.
|
|
- MinerU execution mode: direct local `mineru` CLI only.
|
|
- Strict-local allows MinerU 3.1.0's CLI-internal temporary local `mineru-api` when the CLI runs without `--api-url`.
|
|
- Strict-local prohibits `--api-url`, remote APIs, router mode, HTTP client backends, and remote OpenAI-compatible backends.
|
|
- Current `.venv` has project fast-test dependencies, CUDA-enabled PyTorch `2.6.0+cu126`, `torchvision 0.21.0+cu126`, and `mineru[core]==3.1.0`.
|
|
- Current `pdf2md doctor` status is PASS. MinerU, RTX 4080 SUPER CUDA PyTorch, local model config, MathJax, and strict-local checks pass.
|
|
- MinerU models were downloaded from Hugging Face by explicit setup command. Runtime model loading uses `MINERU_MODEL_SOURCE=local`.
|
|
|
|
## Recent Completed Work
|
|
|
|
- Archived completed sprint and setup history into `docs/WORKARCHIVE.md`.
|
|
- Added `docs/WORKARCHIVE.md` references to `AGENTS.md`, `PLAN.md`, `docs/V1IMPLEMENTATIONPLAN.md`, relevant `.codex/agents/*.toml`, `.codex/commands/*.md`, and project skills.
|
|
- Sprint 10 is implemented with `pypdf>=6.10.2,<7`, `src/pdf2md/pdf_splitter.py`, `--chunk-pages [PAGES]`, chunk-aware conversion orchestration, temporary chunk cleanup, and chunk report context.
|
|
- `--chunk-pages` is opt-in; when present without a value it uses 20 pages.
|
|
- `convert_pdf()` returns `BatchConversionResult` when `chunk_pages` is set and keeps returning `ConversionResult` when chunking is unset.
|
|
- Converted `samples/FourNodeQuadrilateralShellElementMITC4.pdf` with `MINERU_MODEL_SOURCE=local` and default `--gpu cuda:0`; output was written to ignored `outputs/FourNodeQuadrilateralShellElementMITC4/`.
|
|
- The FourNode sample conversion report status was `success`: 7 pages, 22 assets, 38 inline formulas, 16 display formulas, 0 math render errors, and 0 warnings.
|
|
- Installed uv `0.11.12` at `C:\Users\baram\.local\bin`, installed uv-managed CPython `3.12.13`, created `.venv`, and ran `uv sync`.
|
|
- Verified base project environment with `uv run pytest`: 163 passed, 1 skipped.
|
|
- Installed runtime dependencies on this PC: CUDA PyTorch `2.6.0+cu126`, `torchvision 0.21.0+cu126`, `mineru[core]==3.1.0`, local MathJax npm dependencies, and local MinerU models.
|
|
- Set user environment variable `MINERU_MODEL_SOURCE=local`.
|
|
- Verified full local runtime with `uv run pdf2md doctor`: PASS.
|
|
- Verified real local sample conversion: `samples/FourNodeQuadrilateralShellElementMITC4.pdf` to ignored `outputs/runtime-smoke/`, status `success`, 7 pages, 22 assets, 38 inline formulas, 16 display formulas, 0 math render errors, and 0 warnings.
|
|
- Converted `samples/MITC공부.pdf` to ignored `outputs/MITC공부/`; report status was `partial`: 13 pages, 107 assets, 23 inline formulas, 103 display formulas, 2 MathJax render warnings, and 0 missing or invalid asset links.
|
|
- Added `recheck_markdown()` and `pdf2md recheck <markdown.md>` to rerun local quality checks for an existing generated Markdown file and rewrite the adjacent metadata JSON and `.report.md` without rerunning MinerU.
|
|
- Verified `uv run pdf2md recheck outputs\MITC공부\MITC공부.md`; the command regenerated metadata/report and still reported 2 warnings because the current Markdown still contains the two MathJax-invalid expressions.
|
|
|
|
## In Progress
|
|
|
|
- No active implementation chunk.
|
|
|
|
## Blockers
|
|
|
|
- No active blocker.
|
|
|
|
## Next Actions
|
|
|
|
1. Manually fix the two MathJax-invalid expressions in `outputs/MITC공부/MITC공부.md` if a warning-free local report is desired, then run `uv run pdf2md recheck outputs\MITC공부\MITC공부.md`.
|
|
2. Review generated sample Markdown outputs in Obsidian if visual quality needs manual assessment.
|
|
3. Run optional real local chunked conversion on a long sample only if requested.
|
|
4. Preserve strict-local runtime behavior: use local model paths, direct CLI execution, and no user-specified API or remote backend.
|