add pdftomd

김경종
2026-05-08 16:42:19 +09:00
parent 551ab50735
commit 88d6b92283
99 changed files with 47332 additions and 0 deletions
@@ -0,0 +1,31 @@
---
name: math-markdown-review
description: Review and design Obsidian-friendly Markdown normalization for math-heavy PDF conversion, including LaTeX delimiters, display math spacing, asset links, tables, and quality report warnings. Use when Codex needs to check Markdown output assumptions, design post-processing rules, or define renderability checks for formulas and assets.
---
# Math Markdown Review
## Overview
Use this skill when Markdown output quality matters more than raw text extraction. The goal is best-effort automatic conversion with explicit warnings and provenance for failures.
## Workflow
1. Read `PLAN.md` and `PROGRESS.md` first.
2. Read `PRD.md` and `ARCHITECTURE.md` when output behavior, metadata, or reporting is affected.
3. Preserve project delimiter policy: inline math uses `$...$`; display math uses `$$...$$`.
4. Check asset links, table fallback behavior, heading/list interactions, and page boundary markers against Obsidian rendering assumptions.
5. Define warnings for low-confidence math, non-renderable LaTeX, broken asset links, table degradation, and reading-order uncertainty.
6. Ensure `.report.md` content is derived from metadata, not separate manual state.
## Checks
- Inline math should not contain line breaks, or whitespace directly inside the `$` delimiters, either of which can break rendering.
- Display math should be separated from surrounding paragraphs by blank lines.
- Asset paths should be stable, relative to the Markdown file, and safe for Obsidian vaults.
- Tables with formulas should be emitted as readable Markdown tables when extraction is reliable, and should raise a warning whenever they are downgraded to a fallback.
- Every renderability failure should be countable in metadata and visible in `.report.md`.
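
The first two checks above can be sketched as small pure functions. This is a minimal illustration, not the project's implementation: the function names and warning codes are hypothetical, and the display-math scan assumes `$$` fences start at the beginning of a line and ignores fenced code blocks.

```python
def check_inline_math(body: str) -> list[str]:
    """Check the text found between one pair of `$` delimiters."""
    warnings = []
    if "\n" in body:
        warnings.append("inline-math-newline")   # hypothetical code
    if body != body.strip():
        warnings.append("inline-math-padding")   # hypothetical code
    return warnings


def check_display_spacing(markdown: str) -> list[tuple[int, str]]:
    """Return (1-based line number, side) for `$$` blocks missing a blank line."""
    lines = markdown.split("\n")
    problems = []
    i = 0
    while i < len(lines):
        stripped = lines[i].strip()
        if stripped.startswith("$$"):
            start = i
            # "$$ ... $$" on a single line opens and closes the block at once.
            one_line = len(stripped) >= 4 and stripped.endswith("$$")
            if not one_line:
                i += 1
                while i < len(lines) and not lines[i].strip().endswith("$$"):
                    i += 1
            end = min(i, len(lines) - 1)
            if start > 0 and lines[start - 1].strip():
                problems.append((start + 1, "before"))
            if end + 1 < len(lines) and lines[end + 1].strip():
                problems.append((end + 1, "after"))
        i += 1
    return problems
```

Returning plain lists keeps each failure countable, so a caller can fold the results into metadata rather than printing them directly.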
## Reference
Read `references/obsidian-output-checks.md` for concrete normalization and report-signal guidance.
@@ -0,0 +1,4 @@
interface:
display_name: "Math Markdown Review"
short_description: "Check Obsidian math Markdown output"
default_prompt: "Use $math-markdown-review to design or check Obsidian-friendly Markdown normalization, math delimiters, asset paths, tables, and quality report signals."
@@ -0,0 +1,31 @@
# Obsidian Output Checks
Use these checks when designing or reviewing Markdown output.
## Math
- Inline math: `$...$`, no line breaks inside the delimiter pair.
- Display math: `$$...$$`, with blank lines before and after the block.
- Preserve source provenance for formulas: page index, bbox if available, engine, confidence, and warning codes.
- Record render failures separately from extraction confidence.
- Avoid rewriting LaTeX semantics unless the rule is deterministic and tested.
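
One way to keep provenance and render status apart is a per-formula record like the sketch below. Every field name, the `FormulaRecord` class, and the `math-non-renderable` code are assumptions made for illustration; the source only lists which facts should be preserved.

```python
from __future__ import annotations

from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class FormulaRecord:
    """Provenance for one extracted formula (illustrative field names)."""
    latex: str
    page_index: int
    engine: str
    confidence: float                      # extraction confidence from the engine
    bbox: Optional[tuple[float, float, float, float]] = None  # only if available
    render_ok: Optional[bool] = None       # None means "not checked yet"
    warning_codes: list[str] = field(default_factory=list)


def mark_render_failure(record: FormulaRecord, code: str = "math-non-renderable") -> None:
    """Flag a render failure without touching the extraction confidence."""
    record.render_ok = False
    if code not in record.warning_codes:
        record.warning_codes.append(code)
```

A formula can then carry high extraction confidence and still be counted as non-renderable, and `asdict(record)` gives a plain dictionary suitable for JSON metadata.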
## Assets
- Store images under a deterministic asset directory next to the Markdown output.
- Use relative Markdown links that remain valid when the output directory is moved as a unit.
- Record asset source page, bbox if available, generated file path, and missing-link warnings.
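
A link check along these lines could look like the following sketch. It only covers standard `![alt](target)` image links without spaces or URL encoding, skips remote URLs, and uses hypothetical warning codes; Obsidian `![[...]]` embeds would need separate handling.

```python
import re
from pathlib import Path, PurePosixPath

# Standard Markdown image links only; the target may not contain spaces.
_IMAGE_LINK = re.compile(r"!\[[^\]]*\]\(([^)\s]+)\)")


def find_asset_problems(markdown: str, md_path: Path) -> list[tuple[str, str]]:
    """Return (target, warning code) for links that are broken or not relocatable."""
    problems = []
    for target in _IMAGE_LINK.findall(markdown):
        if "://" in target:
            continue  # remote URL, outside the scope of this check
        link = PurePosixPath(target)
        if link.is_absolute() or ".." in link.parts:
            # Would stop resolving once the output directory is moved as a unit.
            problems.append((target, "asset-link-not-relocatable"))
        elif not (md_path.parent / target).is_file():
            problems.append((target, "asset-link-broken"))
    return problems
```

Resolving each target against the Markdown file's own directory mirrors how a relative link is followed after the output is copied into a vault.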
## Tables
- Prefer Markdown tables only when cell boundaries and reading order are reliable.
- If formulas or merged cells make Markdown tables misleading, use a readable fallback and emit a table warning.
- Keep table warnings visible in both JSON metadata and `.report.md`.
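
The fallback decision might be sketched as below. The `reliable` flag, the bullet-list fallback format, and the `table-degraded` code are all assumptions. Cells containing a literal `|` are downgraded rather than escaped, so no LaTeX is rewritten.

```python
def render_table(rows: list[list[str]], reliable: bool) -> tuple[str, list[str]]:
    """Return (markdown, warning codes) for one extracted table."""
    rectangular = bool(rows) and all(len(r) == len(rows[0]) for r in rows)
    # A "|" or newline inside a cell would break the pipe-table layout.
    clean = all("|" not in c and "\n" not in c for r in rows for c in r)
    if reliable and rectangular and clean:
        header, *body = rows
        out = [
            "| " + " | ".join(header) + " |",
            "| " + " | ".join("---" for _ in header) + " |",
        ]
        out += ["| " + " | ".join(r) + " |" for r in body]
        return "\n".join(out), []
    # Fallback (an assumption): one bullet per row, cells joined with "; ".
    fallback = "\n".join("- " + "; ".join(r) for r in rows)
    return fallback, ["table-degraded"]
```

For example, a cell such as `$|x|$` triggers the fallback and a warning instead of producing a table whose columns silently shift.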
## Report Signals
- Total pages processed and pages with warnings.
- Math block count, inline math count, and non-renderable math count.
- Broken asset links and missing assets.
- Table degradation count.
- Reading-order uncertainty count.
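
As a rough sketch, the signals above can be computed purely from metadata, so the report never holds state of its own. The metadata shape (a `pages` list with `math_blocks`, `inline_math`, and `warnings` keys), the warning codes, and the report wording are hypothetical.

```python
from collections import Counter

# Hypothetical warning codes mapped to report labels.
SIGNALS = [
    ("math-non-renderable", "Non-renderable math"),
    ("asset-link-broken", "Broken asset links"),
    ("asset-missing", "Missing assets"),
    ("table-degraded", "Table degradations"),
    ("reading-order-uncertain", "Reading-order uncertainty"),
]


def build_report(metadata: dict) -> str:
    """Render `.report.md` text from conversion metadata only."""
    pages = metadata.get("pages", [])
    codes = Counter(code for page in pages for code in page.get("warnings", []))
    lines = [
        "# Conversion Report",
        "",
        f"- Pages processed: {len(pages)}",
        f"- Pages with warnings: {sum(1 for p in pages if p.get('warnings'))}",
        f"- Math blocks: {sum(p.get('math_blocks', 0) for p in pages)}",
        f"- Inline math: {sum(p.get('inline_math', 0) for p in pages)}",
    ]
    lines += [f"- {label}: {codes[code]}" for code, label in SIGNALS]
    return "\n".join(lines) + "\n"
```

Because the function takes only the metadata dictionary, regenerating the report after a re-run gives the same counts as the JSON output.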