Add Markdown recheck command
This commit is contained in:
+7
-4
@@ -6,7 +6,7 @@ This file records current progress for agents. Read it before starting work, the
|
||||
|
||||
- Project direction is documented in `PRD.md`, `ARCHITECTURE.md`, `AGENTS.md`, and `docs/KNOWLEDGEBASE.md`.
|
||||
- MinerU 3.1.0 is fixed as the only conversion engine.
|
||||
- The converter currently includes path planning, project-owned records, metadata, direct local MinerU adapter boundary, Obsidian Markdown normalization, local quality checks, report rendering, conversion orchestration, `pdf2md convert`, `pdf2md doctor`, local MathJax render checking, release-gate tests, and opt-in pre-conversion PDF chunking.
|
||||
- The converter currently includes path planning, project-owned records, metadata, direct local MinerU adapter boundary, Obsidian Markdown normalization, local quality checks, report rendering, conversion orchestration, `pdf2md convert`, `pdf2md recheck`, `pdf2md doctor`, local MathJax render checking, release-gate tests, and opt-in pre-conversion PDF chunking.
|
||||
- `docs/V1IMPLEMENTATIONPLAN.md` defines the v1 implementation sequence.
|
||||
- `docs/Sprints/` contains completed sprint contracts through Sprint 10.
|
||||
- `docs/WORKARCHIVE.md` contains completed sprint history, historical verification results, runtime setup notes, and sample conversion evidence.
|
||||
@@ -45,6 +45,8 @@ This file records current progress for agents. Read it before starting work, the
|
||||
- Verified full local runtime with `uv run pdf2md doctor`: PASS.
|
||||
- Verified real local sample conversion: `samples/FourNodeQuadrilateralShellElementMITC4.pdf` to ignored `outputs/runtime-smoke/`, status `success`, 7 pages, 22 assets, 38 inline formulas, 16 display formulas, 0 math render errors, and 0 warnings.
|
||||
- Converted `samples/MITC공부.pdf` to ignored `outputs/MITC공부/`; report status was `partial`: 13 pages, 107 assets, 23 inline formulas, 103 display formulas, 2 MathJax render warnings, and 0 missing or invalid asset links.
|
||||
- Added `recheck_markdown()` and `pdf2md recheck <markdown.md>` to rerun local quality checks for an existing generated Markdown file and rewrite the adjacent metadata JSON and `.report.md` without rerunning MinerU.
|
||||
- Verified `uv run pdf2md recheck outputs\MITC공부\MITC공부.md`; the command regenerated metadata/report and still reported 2 warnings because the current Markdown still contains the two MathJax-invalid expressions.
|
||||
|
||||
## In Progress
|
||||
|
||||
@@ -56,6 +58,7 @@ This file records current progress for agents. Read it before starting work, the
|
||||
|
||||
## Next Actions
|
||||
|
||||
1. Review generated sample Markdown outputs in Obsidian if visual quality needs manual assessment.
|
||||
2. Run optional real local chunked conversion on a long sample only if requested.
|
||||
3. Preserve strict-local runtime behavior: use local model paths, direct CLI execution, and no user-specified API or remote backend.
|
||||
1. Manually fix the two MathJax-invalid expressions in `outputs/MITC공부/MITC공부.md` if a warning-free local report is desired, then run `uv run pdf2md recheck outputs\MITC공부\MITC공부.md`.
|
||||
2. Review generated sample Markdown outputs in Obsidian if visual quality needs manual assessment.
|
||||
3. Run optional real local chunked conversion on a long sample only if requested.
|
||||
4. Preserve strict-local runtime behavior: use local model paths, direct CLI execution, and no user-specified API or remote backend.
|
||||
|
||||
Reference in New Issue
Block a user