MinerU Research Source Checklist
Use this checklist before changing project docs or plans based on MinerU facts.
Sources
- MinerU GitHub repository for install instructions, CLI examples, output behavior, and license files.
- MinerU official documentation for current setup and execution modes.
- MinerU release notes or tags for version-specific changes.
- Primary papers for model capability claims.
- Official Python, uv, CUDA, PyTorch, or dependency docs for environment compatibility.
Facts To Verify
- Supported Python versions and package manager expectations.
- Whether MinerU 3.1.0 supports the required local CLI path on Windows.
- Whether MinerU 3.1.0's CLI-internal temporary local `mineru-api` behavior stays local and avoids `--api-url`.
- Required model download/cache behavior and offline reuse assumptions.
- GPU/CPU execution options and expected memory pressure on a GTX 1070 Ti (8 GB VRAM).
- Output directory structure, Markdown output, image asset output, JSON/intermediate output, and page/block metadata availability.
- Exit codes, error messages, logging behavior, and partial-output behavior.
- License obligations for MinerU, bundled models, and transitive runtime packages.
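The exit-code, logging, and partial-output items above can be checked empirically rather than read off documentation alone. The sketch below is one possible harness, written under stated assumptions: `run_and_capture` and `RunObservation` are hypothetical names, and the command in the demo is a placeholder Python one-liner, because the real MinerU command line has to be taken from the official docs. It runs a command, records the exit code and both output streams, and lists whatever files the run left in the output directory.

```python
import subprocess
import sys
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import List


@dataclass
class RunObservation:
    """What one CLI run actually did (hypothetical record shape)."""
    command: List[str]
    exit_code: int
    stdout: str
    stderr: str
    output_files: List[str]  # paths relative to the output directory


def run_and_capture(command: List[str], output_dir: Path) -> RunObservation:
    """Run a command without raising on failure and record what it left behind."""
    completed = subprocess.run(command, capture_output=True, text=True, check=False)
    # Files that exist after the run, including any partial output from a failed run.
    files = sorted(
        p.relative_to(output_dir).as_posix()
        for p in output_dir.rglob("*")
        if p.is_file()
    )
    return RunObservation(
        command, completed.returncode, completed.stdout, completed.stderr, files
    )


if __name__ == "__main__":
    # Placeholder command, NOT a MinerU invocation: writes one file, then fails.
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp)
        script = (
            "import sys, pathlib; "
            "pathlib.Path(sys.argv[1], 'partial.md').write_text('x'); "
            "sys.exit(3)"
        )
        obs = run_and_capture([sys.executable, "-c", script, str(out)], out)
        print(obs.exit_code, obs.output_files)
```

Passing `check=False` is deliberate: a non-zero exit is an observation to record here, not an error to raise, and listing the directory afterwards shows whether a failed run leaves partial output.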
Recording Rules
- Record source URL and access date for durable claims.
- Distinguish official fact from inference.
- Keep alternate engine names out of project docs unless the user explicitly asks for a separate historical note.
- If a source conflicts with a fixed product decision, record the conflict and ask for a user decision.
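One way to make the recording rules above concrete is a small structured record per claim. The following is a minimal sketch, not a prescribed format: the `ClaimRecord` class, its field names, and the `Kind` values are hypothetical, and the URL and date in the usage example are placeholders, not a real source.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class Kind(Enum):
    OFFICIAL_FACT = "official fact"
    INFERENCE = "inference"


@dataclass(frozen=True)
class ClaimRecord:
    """One recorded claim (hypothetical shape, assumed for illustration)."""
    claim: str
    kind: Kind
    source_url: Optional[str] = None
    accessed: Optional[date] = None
    conflicts_with_decision: bool = False

    def __post_init__(self) -> None:
        # An official fact is only durable if it carries a source and an access date.
        if self.kind is Kind.OFFICIAL_FACT and not (self.source_url and self.accessed):
            raise ValueError("official facts need a source URL and an access date")

    def to_line(self) -> str:
        """Render the record as a single checklist-style line."""
        source = ""
        if self.source_url and self.accessed:
            source = f" ({self.source_url}, accessed {self.accessed.isoformat()})"
        flag = " [CONFLICT: needs user decision]" if self.conflicts_with_decision else ""
        return f"- [{self.kind.value}] {self.claim}{source}{flag}"


if __name__ == "__main__":
    # Placeholder values only; replace with the real claim, URL, and access date.
    record = ClaimRecord(
        claim="Example claim text",
        kind=Kind.OFFICIAL_FACT,
        source_url="https://example.com/placeholder",
        accessed=date(2025, 1, 1),
    )
    print(record.to_line())
```

Rejecting an official fact that lacks a URL or date at construction time keeps unsourced claims from being recorded as facts by accident; inferences may omit both, and the conflict flag marks items that need a user decision.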