# MinerU Research Source Checklist
Work through this checklist before changing project docs or plans on the basis of facts about MinerU.
## Sources
- MinerU GitHub repository for install instructions, CLI examples, output behavior, and license files.
- MinerU official documentation for current setup and execution modes.
- MinerU release notes or tags for version-specific changes.
- Primary papers for model capability claims.
- Official Python, uv, CUDA, PyTorch, or dependency docs for environment compatibility.
## Facts To Verify
- Supported Python versions and package manager expectations.
- Whether MinerU 3.1.0 supports the required local CLI path on Windows.
- Whether the temporary local `mineru-api` used internally by the MinerU 3.1.0 CLI stays local and does not require `--api-url`.
- Required model download/cache behavior and offline reuse assumptions.
- GPU/CPU execution options and expected memory pressure on a GTX 1070 Ti (8 GB VRAM).
- Output directory structure, Markdown output, image asset output, JSON/intermediate output, and page/block metadata availability.
- Exit codes, error messages, logging behavior, and partial-output behavior.
- License obligations for MinerU, bundled models, and transitive runtime packages.
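
One way to collect the exit-code, logging, and partial-output facts listed above is to run the command under a small harness and record what happened. The sketch below is a minimal illustration, not MinerU's own tooling: `observe_run` and `RunObservation` are hypothetical names, and the command line is supplied by the caller, so no MinerU flags are assumed here.

```python
import subprocess
import sys
from collections import Counter
from dataclasses import dataclass
from pathlib import Path


@dataclass
class RunObservation:
    """What one run did, recorded without interpretation (hypothetical type)."""
    command: list[str]
    exit_code: int
    stdout_tail: str
    stderr_tail: str
    files_by_suffix: dict[str, int]


def observe_run(command: list[str], output_dir: Path, tail_chars: int = 2000) -> RunObservation:
    """Run `command`, then tally whatever files exist under `output_dir`.

    The command is passed in verbatim by the caller; this helper does not
    know or assume anything about the tool's flags.
    """
    # check=False: a non-zero exit code is data to record, not an exception.
    completed = subprocess.run(command, capture_output=True, text=True, check=False)

    # Count files by extension, so partial output after a failure is visible.
    counts: Counter = Counter()
    if output_dir.exists():
        for path in output_dir.rglob("*"):
            if path.is_file():
                counts[path.suffix.lower() or "(none)"] += 1

    return RunObservation(
        command=command,
        exit_code=completed.returncode,
        stdout_tail=completed.stdout[-tail_chars:],
        stderr_tail=completed.stderr[-tail_chars:],
        files_by_suffix=dict(sorted(counts.items())),
    )


if __name__ == "__main__":
    # Stand-in command for demonstration only; replace it with the real
    # invocation once the CLI syntax has been confirmed from an official source.
    demo = observe_run([sys.executable, "-c", "print('ok')"], Path("./output"))
    print(demo.exit_code, demo.files_by_suffix)
```

Running the harness once against a valid input and once against a deliberately broken one (for example, a missing file) gives a side-by-side record of exit codes and of any files left behind.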
## Recording Rules
- Record source URL and access date for durable claims.
- Distinguish official fact from inference.
- Keep alternate engine names out of project docs unless the user explicitly asks for a separate historical note.
- If a source conflicts with a fixed product decision, record the conflict and ask for a user decision.
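
The recording rules above could be captured in a small record type so that every claim carries its source, access date, and fact-or-inference label. This is a sketch under assumptions: `ClaimRecord`, `ClaimKind`, and the rendered line format are hypothetical, and the claim text and URL in the usage example are placeholders, not real MinerU findings.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class ClaimKind(Enum):
    OFFICIAL_FACT = "official fact"
    INFERENCE = "inference"


@dataclass(frozen=True)
class ClaimRecord:
    """One recorded claim (hypothetical structure, not a project standard)."""
    claim: str
    kind: ClaimKind
    source_url: str
    accessed: date
    # Set only when the source contradicts a fixed product decision.
    conflicting_decision: Optional[str] = None

    def __post_init__(self) -> None:
        # A durable claim without a source URL cannot be re-checked later.
        if not self.source_url.strip():
            raise ValueError("a durable claim needs a source URL")

    @property
    def needs_user_decision(self) -> bool:
        return self.conflicting_decision is not None

    def to_markdown(self) -> str:
        """Render the record as a single Markdown list item."""
        line = (
            f"- [{self.kind.value}] {self.claim} "
            f"(source: {self.source_url}, accessed {self.accessed.isoformat()})"
        )
        if self.needs_user_decision:
            line += (
                f" CONFLICT with fixed decision: {self.conflicting_decision};"
                " user decision needed."
            )
        return line


if __name__ == "__main__":
    # Placeholder values for illustration only.
    record = ClaimRecord(
        claim="Example claim text",
        kind=ClaimKind.INFERENCE,
        source_url="https://example.com/placeholder",
        accessed=date.today(),
    )
    print(record.to_markdown())
```

Keeping the conflict as an explicit field, rather than resolving it in the record, leaves the decision with the user as the last rule requires.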