modify pdftomd
@@ -72,6 +72,16 @@ Strong success criteria let you loop independently. Weak criteria ("make it work

**These guidelines are working if:** fewer unnecessary changes in diffs, fewer rewrites due to overcomplication, and clarifying questions come before implementation rather than after mistakes.

## Commands

| Command | Description |
| --- | --- |
| `uv run pytest` | Run the default fast test suite. |
| `uv run pdf2md doctor` | Check local Python, uv, MinerU, GPU/PyTorch, model/cache, MathJax, and strict-local setup. |
| `uv run pytest tests/test_ui_runner.py` | Run focused UI command-resolution and subprocess tests. |
| `uv run --group ui-build pyinstaller --clean --onefile --windowed --name pdf2md-ui src\pdf2md_ui\app.py` | Rebuild the thin Windows UI executable. |
| `uv run pdf2md convert paper.pdf --out outputs --chunk-pages --gpu auto --mineru-profile auto --strict-local` | Optional local conversion smoke; keep generated output ignored. |

## Source Documents

- `PLAN.md`: shared plan, planned work, open questions, and ownership for agents.
@@ -80,8 +90,11 @@ Strong success criteria let you loop independently. Weak criteria ("make it work
- `ARCHITECTURE.md`: system layers, MinerU adapter contract, intermediate representation, metadata schema, and local-only enforcement.
- `docs/KNOWLEDGEBASE.md`: research basis and implementation background.
- `docs/V1IMPLEMENTATIONPLAN.md`: v1 implementation sequence, sprint contracts, verification gates, and agent ownership.
- `docs/UI_RESEARCH.md`: research basis for the implemented minimal Windows UI launcher.
- `docs/WORKARCHIVE.md`: archived completed work, historical sprint outcomes, setup results, verification history, and sample conversion evidence.
- `docs/Sprints/*.md`: active and historical sprint contracts.
- `docs/superpowers/specs/*.md`: design specs created for focused project workflows.
- `docs/superpowers/plans/*.md`: executable task plans created from specs, including completed UI folder batch work and abandoned historical plans.
- `.codex/agents/*.toml`: project-scoped custom subagent roles.
- `.codex/commands/*.md`: reusable project prompt commands.
- `.codex/skills/*/SKILL.md`: project-specific Codex skills.
@@ -155,7 +168,8 @@ Periodically re-evaluate the harness itself. Remove roles, contracts, or checks
- Input priority: digital PDFs with text layers.
- Quality workflow: fully automatic. Log warnings and continue when possible.
- MinerU execution: direct local `mineru` CLI only. MinerU 3.1.0 may launch a temporary local `mineru-api` internally when CLI runs without `--api-url`.
- Quality report: write both metadata JSON and `<stem>.report.md`.
- Output layout: write `<out>/<stem>/<stem>_001.md`, shared `<out>/<stem>/images/`, and `<out>/<stem>/<stem>_report.md`; new conversions do not persist public metadata JSON after Sprint 16.
- UI folder batch conversion: the UI may convert direct-child PDFs in a selected folder by sequentially invoking existing `pdf2md convert` commands.
- v1 use case: personal/research. MinerU and transitive model/package licenses must be documented before redistribution.
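
The output layout and folder batch rules above can be sketched in Python. This is a minimal illustration, not the project's implementation: `expected_outputs` and `batch_commands` are hypothetical helper names, the three-digit chunk suffix is assumed to continue as `_002`, `_003`, and the case-insensitive `.pdf` match and sorted order are assumptions. Only the `convert` subcommand and the `--out` flag are taken from the Commands table.

```python
from pathlib import Path


def expected_outputs(pdf: Path, out_dir: Path, chunk: int = 1) -> dict[str, Path]:
    """Return the paths one conversion is expected to write (hypothetical helper)."""
    stem = pdf.stem
    doc_dir = out_dir / stem  # <out>/<stem>/
    return {
        # Assumption: chunk numbers are zero-padded to three digits, as in `_001`.
        "markdown": doc_dir / f"{stem}_{chunk:03d}.md",
        "images": doc_dir / "images",  # shared by all chunks of this document
        "report": doc_dir / f"{stem}_report.md",
    }


def batch_commands(folder: Path, out_dir: Path) -> list[list[str]]:
    """Build one `pdf2md convert` argv per direct-child PDF (hypothetical helper)."""
    # iterdir() lists direct children only, so PDFs in subfolders are ignored.
    pdfs = sorted(
        p for p in folder.iterdir() if p.is_file() and p.suffix.lower() == ".pdf"
    )
    # The caller is expected to run these one at a time, in order.
    return [
        ["uv", "run", "pdf2md", "convert", str(p), "--out", str(out_dir)]
        for p in pdfs
    ]
```

For `paper.pdf` and an output directory of `outputs`, `expected_outputs` yields `outputs/paper/paper_001.md`, `outputs/paper/images`, and `outputs/paper/paper_report.md`. Returning argv lists instead of running them keeps the sketch testable without MinerU installed.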

## Architecture Guidance
@@ -217,6 +231,8 @@ After changing files:
- Check `git status --short`.
- Commit the completed change unless the user explicitly asks not to.
- Do not include unrelated user edits in the commit.
- Commit rollback requests - Verify the target commit and current status first, then use a direct non-interactive reset; leave untracked generated/local artifacts such as `build/`, `dist/`, `samples/`, and `*.spec` files untouched unless deletion is explicitly requested.
- Installed-runtime doctor debugging - Test both `uv run pdf2md doctor` and direct venv execution such as `.venv\Scripts\pdf2md.exe doctor`; direct execution may not inherit the same PATH behavior as `uv run`.
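
The doctor check in the last item can be scripted so that both launch paths are always exercised together. A minimal sketch, assuming a `.venv` directory at the repository root on Windows; `doctor_commands` and `run_doctor_both` are hypothetical names, and the `-1` value recorded for a missing executable is an illustrative convention, not pdf2md behavior.

```python
import subprocess
from pathlib import PureWindowsPath


def doctor_commands(venv_dir: str = ".venv") -> dict[str, list[str]]:
    """Return the two ways of launching `pdf2md doctor` (hypothetical helper)."""
    # Assumption: the venv uses the standard Windows layout with a Scripts folder.
    exe = PureWindowsPath(venv_dir) / "Scripts" / "pdf2md.exe"
    return {
        "uv run": ["uv", "run", "pdf2md", "doctor"],
        "direct venv": [str(exe), "doctor"],
    }


def run_doctor_both(venv_dir: str = ".venv") -> dict[str, int]:
    """Run both variants and collect exit codes so differences are visible."""
    results: dict[str, int] = {}
    for label, argv in doctor_commands(venv_dir).items():
        try:
            # check=False: a failing doctor run is a result to compare, not an error.
            results[label] = subprocess.run(argv, check=False).returncode
        except FileNotFoundError:
            # Illustrative sentinel: the launcher itself could not be found.
            results[label] = -1
    return results
```

If the two exit codes differ, the PATH seen by the direct executable is the first thing to inspect, since `uv run` may set up the environment differently.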

## Documentation Guidance