PDFToMD/.codex/skills/mineru-research/references/source-checklist.md
2026-05-08 16:42:19 +09:00

MinerU Research Source Checklist

Use this checklist before changing project docs or plans based on MinerU facts.

Sources

  • MinerU GitHub repository for install instructions, CLI examples, output behavior, and license files.
  • MinerU official documentation for current setup and execution modes.
  • MinerU release notes or tags for version-specific changes.
  • Primary papers for model capability claims.
  • Official Python, uv, CUDA, PyTorch, or dependency docs for environment compatibility.

Facts To Verify

  • Supported Python versions and package manager expectations.
  • Whether MinerU 3.1.0 supports the required local CLI path on Windows.
  • Whether the temporary local mineru-api that the MinerU 3.1.0 CLI starts internally stays local and does not require --api-url.
  • Required model download/cache behavior and offline reuse assumptions.
  • GPU/CPU execution options and expected memory pressure on a GTX 1070 Ti (8 GB).
  • Output directory structure, Markdown output, image asset output, JSON/intermediate output, and page/block metadata availability.
  • Exit codes, error messages, logging behavior, and partial-output behavior.
  • License obligations for MinerU, bundled models, and transitive runtime packages.
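
The exit-code and output-structure items above can be checked empirically rather than only by reading docs. The following is a minimal sketch, not a MinerU-specific tool: the helper names `run_and_record` and `inventory_outputs` are hypothetical, and the sketch deliberately assumes nothing about MinerU's command line. The command to run is whatever invocation the official sources confirm.

```python
import subprocess
import sys
import tempfile
from collections import Counter
from pathlib import Path


def run_and_record(cmd, timeout=600):
    """Run a command and return what was observed: exit code, stdout, stderr.

    `cmd` is a list of arguments. No MinerU flags are assumed here; pass the
    invocation that the official documentation confirms.
    """
    proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    return {
        "cmd": list(cmd),
        "exit_code": proc.returncode,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
    }


def inventory_outputs(output_dir):
    """Count files under output_dir by lowercase suffix (e.g. .md, .json, .png)."""
    root = Path(output_dir)
    suffixes = (p.suffix.lower() or "<none>" for p in root.rglob("*") if p.is_file())
    return dict(Counter(suffixes))


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere: a Python child that exits with 3.
    record = run_and_record([sys.executable, "-c", "import sys; sys.exit(3)"])
    print(record["exit_code"])

    # Stand-in output directory; replace with the directory the real run wrote to.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "doc.md").write_text("# placeholder")
        print(inventory_outputs(tmp))
```

Running the same helpers after a failed or interrupted conversion is one way to observe partial-output behavior, since the inventory shows which files were written before the failure.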

Recording Rules

  • Record source URL and access date for durable claims.
  • Distinguish official fact from inference.
  • Keep alternate engine names out of project docs unless the user explicitly asks for a separate historical note.
  • If a source conflicts with a fixed product decision, record the conflict and ask for a user decision.
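
One possible shape for a recorded claim that follows the rules above is sketched here. The `ClaimRecord` and `Basis` names, the field layout, and the Markdown line format are all assumptions made for illustration; the project does not prescribe them, and the URL in the usage comment is a placeholder.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class Basis(Enum):
    OFFICIAL = "official"    # stated directly by a primary source
    INFERENCE = "inference"  # concluded by the researcher from one or more sources


@dataclass(frozen=True)
class ClaimRecord:
    claim: str
    source_url: str
    accessed: date
    basis: Basis
    conflicts_with_decision: str = ""  # non-empty means a user decision is needed

    def needs_user_decision(self) -> bool:
        """True when the claim conflicts with a fixed product decision."""
        return bool(self.conflicts_with_decision.strip())

    def to_markdown(self) -> str:
        """Render the record as one Markdown list item."""
        line = (
            f"- {self.claim} ({self.basis.value}; "
            f"{self.source_url}, accessed {self.accessed.isoformat()})"
        )
        if self.needs_user_decision():
            line += f" CONFLICT: {self.conflicts_with_decision}"
        return line


# Usage with placeholder values (not real findings):
# ClaimRecord("Example claim", "https://example.com/placeholder",
#             date(2026, 5, 8), Basis.INFERENCE).to_markdown()
```

Keeping the basis as an explicit field makes it hard to record an inference without labeling it as one, and the conflict field gives the "ask for a user decision" rule a concrete place to live.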