add pdftomd

김경종
2026-05-08 16:42:19 +09:00
parent 551ab50735
commit 88d6b92283
99 changed files with 47332 additions and 0 deletions
@@ -0,0 +1,28 @@
---
description: Plan the direct local MinerU CLI adapter and failure/reporting behavior
argument-hint: [integration-scope]
allowed-tools: [Read, Glob, Grep, Bash, WebFetch, Edit]
---
# /plan-mineru-integration
Plan the future implementation shape for the MinerU adapter without writing converter code.
## Arguments
The user invoked this command with: $ARGUMENTS
## Workflow
1. Read `PLAN.md`, `PROGRESS.md`, `PRD.md`, and `ARCHITECTURE.md`.
2. Before editing docs, verify any MinerU CLI facts that may have changed.
3. Define the smallest adapter contract for command construction, working directories, outputs, stdout/stderr capture, exit handling, warnings, and provenance.
4. Ensure failure behavior is explicit: no silent fallback and no alternate engine route.
5. Identify mocked-output tests and optional MinerU-dependent checks.
6. Update `PLAN.md` only if implementation sequencing changes; update `PROGRESS.md` after the planning work.
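The adapter contract in step 3 and the failure rule in step 4 could be sketched roughly as below. This is a planning illustration under stated assumptions, not the project's implementation: `CliResult`, `CliFailure`, and `run_cli` are hypothetical names, and the command line is supplied by the caller, so no MinerU flags are assumed here.

```python
from __future__ import annotations

import subprocess
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class CliResult:
    """Provenance for one CLI run (hypothetical shape)."""

    argv: tuple[str, ...]
    workdir: str
    exit_code: int
    stdout: str
    stderr: str


class CliFailure(RuntimeError):
    """Raised on a non-zero exit; carries the full result for reporting."""

    def __init__(self, result: CliResult) -> None:
        super().__init__(f"command exited with code {result.exit_code}")
        self.result = result


def run_cli(argv: list[str], workdir: Path) -> CliResult:
    """Run argv inside workdir, capture both streams, and fail loudly."""
    completed = subprocess.run(
        argv, cwd=workdir, capture_output=True, text=True, check=False
    )
    result = CliResult(
        argv=tuple(argv),
        workdir=str(workdir),
        exit_code=completed.returncode,
        stdout=completed.stdout,
        stderr=completed.stderr,
    )
    if completed.returncode != 0:
        # No silent fallback and no alternate engine: surface the failure.
        raise CliFailure(result)
    return result
```

Because the caller passes `argv`, mocked-output tests can substitute any small local command for the real CLI, which keeps the fast checks independent of a MinerU install.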
## Guardrails
- Do not implement program code during planning.
- Do not introduce runtime engine selection or cloud-compatible endpoints.
- Keep GPU limitations and CPU messaging explicit for GTX 1070 Ti 8GB.
@@ -0,0 +1,27 @@
---
description: Plan fixture-based quality checks and conversion report requirements
argument-hint: [sample-or-quality-scope]
allowed-tools: [Read, Glob, Grep, Bash, Edit]
---
# /plan-quality-evaluation
Plan local fixture evaluation and report requirements for math-heavy PDF conversion.
## Arguments
The user invoked this command with: $ARGUMENTS
## Workflow
1. Read `PLAN.md`, `PROGRESS.md`, `PRD.md`, and `ARCHITECTURE.md`.
2. Inspect `samples/` only as local fixture context; do not stage or commit sample files.
3. Define checks for page coverage, reading order, math renderability, delimiter normalization, table handling, asset links, metadata completeness, and warning counts.
4. Define expectations for the `.json` metadata and the `.report.md` report, both derived from the same source data.
5. Separate fast mocked checks from optional MinerU/model/GPU-dependent checks.
6. Update `PROGRESS.md` with the planned coverage and remaining sample gaps.
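One way to keep the `.json` metadata and the `.report.md` report in agreement is to render both from a single record. The sketch below assumes a hypothetical `ConversionRecord` with illustrative fields for page coverage and warning counts; the field names and report layout are assumptions, not decided requirements.

```python
from __future__ import annotations

import json
from dataclasses import dataclass, field


@dataclass
class ConversionRecord:
    """Single source of truth for both outputs (hypothetical fields)."""

    source_pdf: str
    pages_expected: int
    pages_seen: list[int] = field(default_factory=list)
    warnings: list[str] = field(default_factory=list)

    def missing_pages(self) -> list[int]:
        # Page coverage check: every page 1..N should appear in the output.
        seen = set(self.pages_seen)
        return [p for p in range(1, self.pages_expected + 1) if p not in seen]

    def to_metadata(self) -> dict:
        return {
            "source_pdf": self.source_pdf,
            "pages_expected": self.pages_expected,
            "missing_pages": self.missing_pages(),
            "warning_count": len(self.warnings),
            "warnings": list(self.warnings),
        }


def render_json(record: ConversionRecord) -> str:
    """Serialize the metadata dictionary for the .json file."""
    return json.dumps(record.to_metadata(), indent=2, ensure_ascii=False)


def render_report(record: ConversionRecord) -> str:
    """Render the same metadata dictionary as Markdown for .report.md."""
    meta = record.to_metadata()
    missing = ", ".join(str(p) for p in meta["missing_pages"]) or "none"
    lines = [
        f"# Conversion report: {meta['source_pdf']}",
        f"- Pages expected: {meta['pages_expected']}",
        f"- Missing pages: {missing}",
        f"- Warnings: {meta['warning_count']}",
    ]
    lines += [f"  - {w}" for w in meta["warnings"]]
    return "\n".join(lines) + "\n"
```

Both renderers read from `to_metadata()`, so a fast mocked check can compare the two outputs without running MinerU, a model, or a GPU.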
## Guardrails
- Do not copy sample PDFs into tracked files.
- Do not require GPU or large model downloads for the default fast verification loop.
@@ -0,0 +1,26 @@
---
description: Review core project documents for consistency with fixed decisions
argument-hint: [scope]
allowed-tools: [Read, Glob, Grep, Bash, Edit]
---
# /review-project-docs
Review project documents for contradictions, stale decisions, and missing constraints.
## Arguments
The user invoked this command with: $ARGUMENTS
## Workflow
1. Read `PLAN.md` and `PROGRESS.md`.
2. Read the requested document scope, defaulting to `AGENTS.md`, `PRD.md`, `ARCHITECTURE.md`, and `docs/KNOWLEDGEBASE.md`.
3. Check for contradictions against fixed decisions: MinerU 3.1.0 only, local-only, direct CLI execution, CLI-internal temporary local `mineru-api` allowed, no `--api-url` or remote API path, Python 3.12, uv, Obsidian Markdown, metadata JSON, and `.report.md`.
4. Report findings first with file and line references.
5. If edits are requested, make only surgical documentation changes and update `PROGRESS.md`.
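Steps 3 and 4 could be supported by a small scan that reports candidate lines with file and line references. This is a hedged sketch: `WATCH_TERMS`, `Finding`, and `scan_text` are hypothetical names, and the watch list is only an illustrative subset of the terms in this command file. A hit is a candidate for human review, not a contradiction in itself, since a document may mention a term only to prohibit it.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical watch list; extend it from the project's fixed decisions.
WATCH_TERMS = ("--api-url", "remote API", "cloud OCR")


@dataclass(frozen=True)
class Finding:
    path: str
    line_no: int
    term: str
    text: str


def scan_text(
    path: str, text: str, terms: tuple[str, ...] = WATCH_TERMS
) -> list[Finding]:
    """Return one Finding per (line, term) match, case-insensitively."""
    findings: list[Finding] = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        lowered = line.lower()
        for term in terms:
            if term.lower() in lowered:
                findings.append(Finding(path, line_no, term, line.strip()))
    return findings
```

Formatting each finding as `path:line_no` keeps the review output aligned with the "findings first" requirement.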
## Guardrails
- Do not add speculative features, alternate engines, web UI, cloud OCR, or manual review queues.
- Do not rewrite unrelated prose while fixing one inconsistency.
@@ -0,0 +1,29 @@
---
description: Research current MinerU 3.1.0 facts for local integration planning
argument-hint: [research-question]
allowed-tools: [Read, Glob, Grep, Bash, WebFetch, Edit]
---
# /run-mineru-research
Research MinerU 3.1.0 facts that affect this project's documentation or future implementation.
## Arguments
The user invoked this command with: $ARGUMENTS
## Workflow
1. Read `PLAN.md`, `PROGRESS.md`, `ARCHITECTURE.md`, and `docs/KNOWLEDGEBASE.md`.
2. Use official MinerU documentation, the MinerU GitHub repository, primary papers, and official dependency documentation.
3. Verify facts that can change: install commands, supported Python/CUDA versions, CLI flags, output formats, model download behavior, and licenses.
4. Record sources with URLs and access dates when updating docs.
5. Keep findings scoped to MinerU 3.1.0; do not add candidate-engine comparisons.
6. Update `PROGRESS.md` with what was verified and what remains uncertain.
## Guardrails
- Allow only direct `mineru` CLI execution and the CLI-internal temporary local `mineru-api` process.
- Do not add cloud OCR, hosted LLM, `--api-url`, remote API, router, HTTP client backend, or remote OpenAI-compatible backend paths.
- Do not turn research notes into implementation code.
- If official sources conflict, stop and ask for a decision instead of guessing.
@@ -0,0 +1,32 @@
---
description: Start a project task by loading shared plan and progress context
argument-hint: [agent-or-task]
allowed-tools: [Read, Glob, Grep, Bash, Edit]
---
# /start-agent-work
Start work in this repository with the project coordination protocol.
## Arguments
The user invoked this command with: $ARGUMENTS
## Workflow
1. Read `PLAN.md` and `PROGRESS.md`.
2. State the current goal, the next action, and any blocker that matters for the task.
3. Read only the additional source documents needed for the requested work.
4. If subagents are useful and the user explicitly asked for delegated agent work, choose the smallest set of `.codex/agents/*.toml` roles that covers the task.
5. For substantial implementation work, use the harness sequence: `harness-planner-agent` drafts the plan and contract, `feature-generator-agent` implements one agreed chunk, and `evaluation-agent` reviews the contract and completed work.
6. Do not implement converter code unless the user explicitly requests implementation.
7. After meaningful changes, update `PROGRESS.md`; update `PLAN.md` only when sequencing, decisions, ownership, or blockers change.
8. Run the smallest useful verification, check git status, and commit project changes while excluding `samples/`.
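The "excluding `samples/`" part of step 8 could be checked before committing by inspecting the text printed by `git status --porcelain`. The helper below is a minimal sketch with a hypothetical name, `staged_sample_paths`. It assumes the version 1 porcelain layout (two status characters, a space, then the path) and does not handle quoted paths with special characters.

```python
from __future__ import annotations


def staged_sample_paths(porcelain: str, prefix: str = "samples/") -> list[str]:
    """Return staged paths under prefix from `git status --porcelain` text."""
    hits: list[str] = []
    for line in porcelain.splitlines():
        if len(line) < 4:
            continue
        index_status, path_part = line[0], line[3:]
        if index_status in (" ", "?", "!"):
            continue  # not staged: worktree-only change, untracked, or ignored
        # Renames appear as "old -> new"; check both sides.
        for path in path_part.split(" -> "):
            if path.startswith(prefix):
                hits.append(path)
    return hits
```

An empty result means nothing under `samples/` is staged; any other result is a reason to unstage those paths before committing.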
## Guardrails
- Keep MinerU 3.1.0 as the only conversion engine.
- Allow MinerU 3.1.0's CLI-internal temporary local `mineru-api`, but prohibit `--api-url`, remote APIs, router mode, HTTP client backends, and remote OpenAI-compatible backends.
- Keep runtime processing local-only.
- Keep `samples/` out of commits unless the user explicitly requests otherwise.
- Prefer official sources for changing facts about Codex, MinerU, Python, uv, CUDA, or licenses.