Add Markdown recheck command
This commit is contained in:
@@ -4,7 +4,7 @@ Local-only PDF-to-Markdown converter for math-heavy digital documents.
|
||||
|
||||
## Status
|
||||
|
||||
The project currently provides a Python package, `pdf2md convert`, metadata/report output, mocked MinerU adapter tests, `pdf2md doctor` setup diagnostics, and Sprint 9 release-gate documentation. Real local MinerU sample validation remains optional and may be blocked until MinerU 3.1.0 and local model/cache setup are available.
|
||||
The project currently provides a Python package, `pdf2md convert`, Markdown recheck via `pdf2md recheck`, metadata/report output, mocked MinerU adapter tests, `pdf2md doctor` setup diagnostics, and Sprint 9 release-gate documentation. Real local MinerU sample validation remains optional and may be blocked until MinerU 3.1.0 and local model/cache setup are available.
|
||||
|
||||
## Setup
|
||||
|
||||
@@ -76,6 +76,16 @@ The model/cache check looks for these environment variables when present:
|
||||
|
||||
It also checks for `%USERPROFILE%\mineru.json`, which MinerU documents as its default user config location. Missing model/cache paths are warnings because model download and cache population must be explicit setup actions.
|
||||
|
||||
## Rechecking Markdown
|
||||
|
||||
After editing a generated Markdown file, rerun local quality checks and regenerate the adjacent metadata/report files:
|
||||
|
||||
```powershell
|
||||
uv run pdf2md recheck outputs/MITC공부/MITC공부.md
|
||||
```
|
||||
|
||||
`recheck` reads the existing `<stem>.metadata.json` for source PDF, engine, page, and asset provenance. It replaces quality warnings that can be recalculated from the current Markdown, including MathJax render failures and local asset-link warnings, then rewrites `<stem>.metadata.json` and `<stem>.report.md`.
|
||||
|
||||
## Runtime Policy
|
||||
|
||||
Runtime conversion is strict-local. Allowed: direct `mineru` CLI execution and the CLI-internal temporary local `mineru-api` that MinerU starts when `--api-url` is omitted. Prohibited: `--api-url`, remote APIs, router mode, HTTP client backends, remote OpenAI-compatible backends, hosted renderers, and cloud fallbacks.
|
||||
|
||||
Reference in New Issue
Block a user