add pdftomd
# Obsidian Output Checks

Use these checks when designing or reviewing Markdown output.

## Math

- Inline math: `$...$`, no line breaks inside the delimiter pair.
- Display math: `$$...$$`, with blank lines before and after the block.
- Preserve source provenance for formulas: page index, bbox if available, engine, confidence, and warning codes.
- Record render failures separately from extraction confidence.
- Avoid rewriting LaTeX semantics unless the rule is deterministic and tested.
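
The math checks above can be sketched in Python. This is a minimal illustration under stated assumptions, not the project's implementation: `FormulaRecord`, `check_math_delimiters`, and the warning codes are hypothetical names, and display math is assumed to use `$$` on its own line or a single-line `$$...$$`. An odd number of unescaped `$` on a line is treated as a hint that an inline span was broken across lines, so a literal currency sign such as `$5` would be a false positive.

```python
import re
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FormulaRecord:
    """Hypothetical provenance record for one extracted formula."""
    latex: str
    page_index: int
    engine: str
    confidence: float  # extraction confidence only
    bbox: Optional[tuple[float, float, float, float]] = None  # if available
    warning_codes: list[str] = field(default_factory=list)
    render_ok: Optional[bool] = None  # render result, kept separate from confidence


UNESCAPED_DOLLAR = re.compile(r"(?<!\\)\$")


def check_math_delimiters(markdown: str) -> list[tuple[int, str]]:
    """Return (1-based line number, warning code) pairs for delimiter problems."""
    lines = markdown.split("\n")
    warnings: list[tuple[int, str]] = []

    def is_blank(index: int) -> bool:
        # Lines outside the document count as blank.
        return index < 0 or index >= len(lines) or lines[index].strip() == ""

    in_display = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        if in_display:
            if stripped == "$$":  # closing fence of a multi-line block
                in_display = False
                if not is_blank(i + 1):
                    warnings.append((i + 1, "display-math-missing-blank-after"))
            continue
        if stripped == "$$":  # opening fence of a multi-line block
            in_display = True
            if not is_blank(i - 1):
                warnings.append((i + 1, "display-math-missing-blank-before"))
            continue
        if len(stripped) > 4 and stripped.startswith("$$") and stripped.endswith("$$"):
            # Single-line display block: check both neighbours.
            if not is_blank(i - 1):
                warnings.append((i + 1, "display-math-missing-blank-before"))
            if not is_blank(i + 1):
                warnings.append((i + 1, "display-math-missing-blank-after"))
            continue
        if len(UNESCAPED_DOLLAR.findall(line)) % 2 == 1:
            warnings.append((i + 1, "inline-math-odd-delimiters"))
    return warnings
```

Keeping `render_ok` as its own field, rather than folding it into `confidence`, lets a formula be extracted with high confidence and still be reported as non-renderable.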

## Assets

- Store images under a deterministic asset directory next to the Markdown output.
- Use relative Markdown links that remain valid when the output directory is moved as a unit.
- Record asset source page, bbox if available, generated file path, and missing-link warnings.
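
One possible way to meet the asset checks above is sketched below. The `<stem>.assets` directory convention, the `page-NNNN-img-NNN` file naming, and the function names are all assumptions made for illustration; only the standard library is used.

```python
import os
import re
from pathlib import Path

# Matches Markdown image links such as ![alt](path/to/file.png).
IMAGE_LINK = re.compile(r"!\[[^\]]*\]\(([^)\s]+)\)")


def asset_dir_for(markdown_path: Path) -> Path:
    """Deterministic asset directory next to the Markdown file (assumed naming)."""
    return markdown_path.with_name(markdown_path.stem + ".assets")


def asset_name(page_index: int, asset_index: int, suffix: str = ".png") -> str:
    """Deterministic file name that also encodes the source page."""
    return f"page-{page_index:04d}-img-{asset_index:03d}{suffix}"


def asset_link(markdown_path: Path, asset_path: Path) -> str:
    """Link target relative to the Markdown file, always with forward slashes."""
    relative = os.path.relpath(asset_path, start=markdown_path.parent)
    return Path(relative).as_posix()


def missing_links(markdown: str, markdown_path: Path) -> list[str]:
    """Return local image targets that do not exist on disk."""
    base = markdown_path.parent
    return [
        target
        for target in IMAGE_LINK.findall(markdown)
        if "://" not in target and not (base / target).exists()
    ]
```

Because every link is computed relative to the Markdown file's own directory, moving that directory as a whole leaves the links valid. The list returned by `missing_links` could feed the missing-link warnings.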

## Tables

- Prefer Markdown tables only when cell boundaries and reading order are reliable.
- If formulas or merged cells make Markdown tables misleading, use a readable fallback and emit a table warning.
- Keep table warnings visible in both JSON metadata and `.report.md`.
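
The table rules above could be expressed as a small decision function. In this sketch the `Table` fields, the warning codes, and the one-bullet-per-row fallback are hypothetical. It also simplifies the rule: any merged cell, formula, or unreliable boundary triggers the fallback, whereas a real pipeline might only fall back when the table would actually be misleading.

```python
from dataclasses import dataclass


@dataclass
class Table:
    """Hypothetical extracted table; the first row is treated as the header."""
    rows: list[list[str]]
    has_merged_cells: bool = False
    has_formulas: bool = False
    reliable_boundaries: bool = True


def _escape(cell: str) -> str:
    # A literal pipe would otherwise be read as a cell separator.
    return cell.replace("|", "\\|")


def render_table(table: Table) -> tuple[str, list[str]]:
    """Return (markdown text, warning codes). Assumes at least a header row."""
    warnings: list[str] = []
    if table.has_merged_cells:
        warnings.append("table-merged-cells")
    if table.has_formulas:
        warnings.append("table-contains-formulas")
    if not table.reliable_boundaries:
        warnings.append("table-unreliable-boundaries")

    if warnings:
        # Readable fallback: one bullet per row, cells separated by "; ".
        text = "\n".join("- " + "; ".join(row) for row in table.rows)
        return text, warnings

    header, *body = table.rows
    lines = [
        "| " + " | ".join(_escape(c) for c in header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    lines += ["| " + " | ".join(_escape(c) for c in row) + " |" for row in body]
    return "\n".join(lines), warnings
```

The returned warning list is meant to be written to both the JSON metadata and `.report.md`, so that a degraded table is visible in each.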

## Report Signals

- Total pages processed and pages with warnings.
- Math block count, inline math count, and non-renderable math count.
- Broken asset links and missing assets.
- Table degradation count.
- Reading-order uncertainty count.
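
As a rough sketch, the signals listed above could be collected in one record and written to both outputs. The class name, field names, and report layout here are assumptions, not an established schema.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ReportSignals:
    """Hypothetical counters mirroring the report signals listed above."""
    total_pages: int = 0
    pages_with_warnings: int = 0
    math_block_count: int = 0
    inline_math_count: int = 0
    non_renderable_math_count: int = 0
    broken_asset_links: int = 0
    missing_assets: int = 0
    table_degradation_count: int = 0
    reading_order_uncertainty_count: int = 0

    def to_json(self) -> str:
        """Serialize the counters for the JSON metadata."""
        return json.dumps(asdict(self), indent=2)

    def to_report_md(self) -> str:
        """Render the same counters as a Markdown list for `.report.md`."""
        lines = ["# Conversion Report", ""]
        for name, value in asdict(self).items():
            label = name.replace("_", " ").capitalize()
            lines.append(f"- {label}: {value}")
        return "\n".join(lines) + "\n"
```

Generating both outputs from the same record keeps the JSON metadata and the human-readable report from drifting apart.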