38 lines
1.3 KiB
Markdown
38 lines
1.3 KiB
Markdown
# Step 1: quality-metrics-report
|
|
|
|
## Read First
|
|
- /AGENTS.md
|
|
- /PLAN.md
|
|
- /PROGRESS.md
|
|
- /docs/HARNESS.md
|
|
- /docs/IMPLEMENTATION_PLAN.md
|
|
- /docs/PRD.md
|
|
- /phases/7-mvp-quality-hardening/step0.md
|
|
|
|
## Task
|
|
Add focused quality metrics for converted Markdown bundles.
|
|
|
|
Metrics should cover headings, math delimiter balance, LaTeX environment pairs, image links, captions, table parseability, chunk frontmatter, and no-exception conversion.
|
|
|
|
## Sprint Contract
|
|
- Done means: evaluator-friendly quality metrics can be run on sample outputs and produce actionable failure messages.
|
|
- Hard thresholds: metrics do not rely on full Markdown snapshots; failures identify file/chunk/block context; reports stay out of generated Markdown.
|
|
- Files owned: `src/pdftomd/quality.py`, `tests/`, optional scripts under `scripts/`, `PROGRESS.md`, phase index.
|
|
- Dependencies: Step 0 sample smoke conversions and quality gates.
|
|
|
|
## Acceptance Criteria
|
|
```powershell
|
|
python scripts\validate_workspace.py
|
|
.\venv\python.exe -m pytest tests
|
|
```
|
|
|
|
## Verification
|
|
1. Run the acceptance commands.
|
|
2. Confirm metrics can be used by `harness-review`.
|
|
3. Update `PROGRESS.md` and this phase index.
|
|
|
|
## Do Not
|
|
- Do not create broad snapshot baselines as the primary quality gate.
|
|
- Do not write quality reports inside Markdown chunks.
|
|
- Do not hide per-chunk failures.
|