modify pdftomd

김경종
2026-05-14 10:16:59 +09:00
parent 2232b51fc9
commit dc11880140
69 changed files with 7784 additions and 1150 deletions
+4 -4
@@ -1,6 +1,6 @@
---
name: fixture-evaluation
-description: Plan local fixture-based quality checks for this MinerU PDF-to-Markdown converter using samples/ without committing sample PDFs. Use when Codex needs to define sample coverage, quality metrics, regression checks, JSON metadata assertions, or human-readable .report.md expectations.
+description: Plan local fixture-based quality checks for this MinerU PDF-to-Markdown converter using samples/ without committing sample PDFs. Use when Codex needs to define sample coverage, quality metrics, regression checks, internal provenance assertions, or human-readable _report.md expectations.
---
# Fixture Evaluation
@@ -14,9 +14,9 @@ Use this skill to turn local sample PDFs into a small, repeatable quality plan.
1. Read `PLAN.md` and `PROGRESS.md` first.
2. Read `docs/WORKARCHIVE.md` when prior fixture coverage, verification, or sample conversion evidence is needed.
3. Inspect `samples/` only enough to understand fixture categories and filenames.
-4. Map each fixture to risks: math, tables, multi-column reading order, figures/assets, Korean filenames, and metadata coverage.
+4. Map each fixture to risks: math, tables, multi-column reading order, figures/assets, Korean filenames, and report/provenance coverage.
5. Separate fast checks using mocked MinerU outputs from optional checks that require MinerU models, GPU, or long execution.
-6. Define metrics for both JSON metadata and `<stem>.report.md`.
+6. Define metrics for internal provenance and `<stem>_report.md`.
7. Update `PROGRESS.md` with fixture coverage and gaps.
## Guardrails
@@ -24,7 +24,7 @@ Use this skill to turn local sample PDFs into a small, repeatable quality plan.
- Do not commit sample PDFs.
- Do not copy samples into tracked fixtures without explicit user permission.
- Do not make GPU/model-dependent checks mandatory for the default fast loop.
-- Do not grade only plain-text edit distance; include math, tables, reading order, assets, metadata, and renderability.
+- Do not grade only plain-text edit distance; include math, tables, reading order, assets, report provenance, and renderability.
## Reference