Sprint 15 Contract: NVIDIA GPU Detection And Auto MinerU Profile
Status: Implemented
Last updated: 2026-05-12
Objective
Add a strict-local runtime profiling layer that detects installed NVIDIA GPUs and applies conservative MinerU environment tuning by default.
The default runtime profile is auto. In auto, the converter should keep 8GB and pre-Turing GPUs conservative, while allowing a slightly more aggressive local MinerU configuration only when the selected NVIDIA GPU has at least 16GB VRAM and no pre-Turing compatibility warning.
This sprint is motivated by local evidence from samples\FourNodeQuadrilateralShellElementMITC4.pdf: Sprint 14's one-page conversion path used cuda:0 correctly, but the GTX 1070 Ti (8GB) stayed near full VRAM use and stalled on source page 2. The next useful test should run on a stronger NVIDIA GPU with explicit runtime diagnostics and reproducible MinerU environment settings.
Source Basis
Use these source-backed facts during implementation:
- MinerU CLI supports `mineru -p <input_path> -o <output_path>` and, without `--api-url`, launches a temporary local `mineru-api`: https://opendatalab.github.io/MinerU/usage/cli_tools/
- MinerU CLI documents `-b/--backend`, `-f/--formula`, `-t/--table`, `--api-url`, and related options, but this project must not expose remote/API or backend selection paths in v1: https://opendatalab.github.io/MinerU/usage/cli_tools/
- MinerU environment variables include `MINERU_PDF_RENDER_THREADS`, `MINERU_PROCESSING_WINDOW_SIZE`, `MINERU_API_MAX_CONCURRENT_REQUESTS`, and timeout settings: https://opendatalab.github.io/MinerU/usage/cli_tools/
- MinerU advanced CLI docs support selecting visible GPU devices with `CUDA_VISIBLE_DEVICES`: https://opendatalab.github.io/MinerU/usage/advanced_cli_parameters/
- MinerU local deployment docs list auto-engine GPU requirements around 8GB+ VRAM and GPU acceleration for Volta-or-later devices: https://opendatalab.github.io/MinerU/quick_start/
- MinerU extension docs say `vllm` and `lmdeploy` acceleration extras are alternatives and should not both be installed just for this sprint: https://opendatalab.github.io/MinerU/quick_start/extension_modules/
Access date for the source review: 2026-05-12.
Current Precondition
- MinerU 3.1.0 remains the only conversion engine.
- Conversion runs through direct local `mineru` CLI execution only.
- Strict-local allows only the direct CLI and MinerU CLI-internal temporary local `mineru-api`; remote API/backend paths remain prohibited.
- `pdf2md convert` defaults to `--gpu cuda:0`.
- `MinerUAdapter` currently maps `cuda:N` to `MINERU_DEVICE_MODE=cuda` and `CUDA_VISIBLE_DEVICES=N`.
- `pdf2md doctor` already reports NVIDIA GPU visibility, PyTorch CUDA visibility, GPU names, and Pascal/pre-Turing warnings.
- Sprint 14 chunk mode runs one source page per MinerU invocation when `--chunk-pages` is active.
Contract Assumptions
- Keep `--gpu cuda:0` as the default for backward compatibility with PRD and existing docs.
- Add `--gpu auto` as an opt-in GPU selection mode that chooses the visible NVIDIA GPU with the largest reported VRAM.
- Add `--mineru-profile {auto,safe,performance}` with default `auto`.
- Keep all conversion requests sequential in Sprint 15. Do not introduce parallel page conversion.
- Keep formula and table parsing enabled. Do not optimize by disabling required output quality features.
- Do not add `--backend`, `--api-url`, `--url`, router mode, HTTP client backend, remote OpenAI-compatible backend, or remote model server support.
- Treat MinerU environment tuning as best-effort. If GPU inventory cannot be read, continue with safe profile settings and a warning/provenance record rather than guessing aggressive values.
Touched Surfaces
Allowed during implementation:
- Create `src/pdf2md/gpu.py`
- Create `src/pdf2md/mineru_profile.py`
- Modify `src/pdf2md/mineru_adapter.py`
- Modify `src/pdf2md/conversion.py`
- Modify `src/pdf2md/cli.py`
- Modify `src/pdf2md/doctor.py`
- Modify `src/pdf2md_ui/runner.py` only if the UI command builder needs profile passthrough
- Modify `src/pdf2md_ui/app.py` only if a minimal profile control is necessary
- Add `tests/test_gpu.py`
- Add `tests/test_mineru_profile.py`
- Modify `tests/test_mineru_adapter.py`
- Modify `tests/test_conversion.py`
- Modify `tests/test_cli.py`
- Modify `tests/test_doctor.py`
- Modify `tests/test_ui_runner.py` only if UI command construction changes
- Modify `README.md`
- Modify `ARCHITECTURE.md`
- Modify `PRD.md` if CLI option documentation changes
- Modify `docs/V1IMPLEMENTATIONPLAN.md`
- Modify `PLAN.md`
- Modify `PROGRESS.md`
- Modify `docs/WORKARCHIVE.md` after implementation
Not allowed:
- Adding another conversion engine or runtime engine selector.
- Passing `--api-url`, `--url`, or any remote endpoint to MinerU.
- Adding `mineru-router`, HTTP client backend, or OpenAI-compatible backend usage.
- Installing `vllm`, `lmdeploy`, CUDA packages, models, or any runtime package automatically.
- Changing the default conversion engine or disabling formula/table recognition.
- Making default tests depend on real MinerU, GPU, CUDA, PyTorch, model files, network, Obsidian, MathJax, or `samples/`.
- Committing sample PDFs, generated `outputs/`, retained temporary page outputs, local model files, or `dist/pdf2md-ui.exe`.
Product Behavior
CLI
Existing behavior remains valid:
uv run pdf2md convert paper.pdf --out outputs
uv run pdf2md convert paper.pdf --out outputs --gpu cuda:0
New behavior:
uv run pdf2md convert paper.pdf --out outputs --mineru-profile auto
uv run pdf2md convert paper.pdf --out outputs --mineru-profile safe
uv run pdf2md convert paper.pdf --out outputs --mineru-profile performance
uv run pdf2md convert paper.pdf --out outputs --gpu auto --mineru-profile auto
Rules:
- `--mineru-profile` defaults to `auto`.
- `--gpu cuda:N` selects a concrete CUDA index and tunes MinerU for that selected GPU when inventory is available.
- `--gpu N` is still normalized to `cuda:N`.
- `--gpu auto` selects the visible NVIDIA GPU with the largest VRAM from local GPU inventory.
- If `--gpu auto` cannot find a visible NVIDIA GPU, fail clearly before conversion rather than silently switching to CPU.
- If `--mineru-profile performance` is requested on a selected GPU below 16GB VRAM or with pre-Turing risk, downgrade to safe settings with a warning in metadata/report. Do not fail solely because performance was unsafe.
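The `--gpu` normalization rule can be illustrated with a minimal sketch. The helper name `normalize_gpu_arg` is hypothetical and not taken from the project. Whether the real CLI accepts mixed case or surrounding whitespace is also an assumption made here for illustration.

```python
import re


def normalize_gpu_arg(value: str) -> str:
    """Normalize a --gpu value to 'auto' or 'cuda:N' (hypothetical helper)."""
    text = value.strip().lower()  # assumption: input is case-insensitive
    if text == "auto":
        return "auto"
    # Accept either a bare index ("1") or the explicit form ("cuda:1").
    match = re.fullmatch(r"(?:cuda:)?(\d+)", text)
    if match is None:
        raise ValueError(f"unsupported --gpu value: {value!r}")
    return f"cuda:{int(match.group(1))}"
```

Anything that is neither `auto`, `N`, nor `cuda:N` (for example `cpu`) raises `ValueError`, which fits the rule that the converter should fail clearly instead of silently switching devices.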
Doctor
pdf2md doctor should report:
- All visible NVIDIA GPUs with index, name, total VRAM, and driver version from `nvidia-smi`.
- PyTorch CUDA device names and compute capabilities when available.
- Selected default GPU recommendation for `--gpu auto`.
- Recommended MinerU profile for the detected primary GPU.
- Existing Pascal/pre-Turing warnings.
Doctor must not require a real conversion, model load, network access, or package download.
Auto Profile Policy
Use a small deterministic policy table. Values are intentionally conservative because the converter runs real PDFs and should prefer completion over peak throughput.
| Selected GPU | Auto policy | MinerU environment |
|---|---|---|
| No GPU inventory, CUDA requested | Safe fallback with warning | MINERU_PROCESSING_WINDOW_SIZE=1, MINERU_API_MAX_CONCURRENT_REQUESTS=1, MINERU_PDF_RENDER_THREADS=1 |
| Pre-Turing or VRAM < 12GB | Safe | MINERU_PROCESSING_WINDOW_SIZE=1, MINERU_API_MAX_CONCURRENT_REQUESTS=1, MINERU_PDF_RENDER_THREADS=1 |
| 12GB <= VRAM < 16GB | Auto conservative | MINERU_PROCESSING_WINDOW_SIZE=4, MINERU_API_MAX_CONCURRENT_REQUESTS=1, MINERU_PDF_RENDER_THREADS=2 |
| VRAM >= 16GB and Turing-or-newer | Auto moderately aggressive | MINERU_PROCESSING_WINDOW_SIZE=8, MINERU_API_MAX_CONCURRENT_REQUESTS=1, MINERU_PDF_RENDER_THREADS=4 |
| Explicit `safe` | Safe regardless of GPU | MINERU_PROCESSING_WINDOW_SIZE=1, MINERU_API_MAX_CONCURRENT_REQUESTS=1, MINERU_PDF_RENDER_THREADS=1 |
| Explicit `performance` on VRAM >= 16GB and Turing-or-newer | Performance | MINERU_PROCESSING_WINDOW_SIZE=16, MINERU_API_MAX_CONCURRENT_REQUESTS=1, MINERU_PDF_RENDER_THREADS=4 |
| Explicit `performance` on weaker GPU | Downgraded safe with warning | safe values |
Do not set MINERU_HYBRID_BATCH_RATIO in Sprint 15 because MinerU docs describe it as commonly used for hybrid-http-client, which this project prohibits in v1.
Do not set backend CLI flags in Sprint 15. The default MinerU backend remains MinerU-owned.
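The policy table can be expressed as one small pure function. The following is a sketch under stated assumptions, not the project's implementation: `ProfileResult`, `resolve_profile`, and the applied-profile labels `auto-conservative` and `auto-moderate` are hypothetical names. Treating 12GB and 16GB as 12288 and 16384 MiB is also an assumption, and some cards report slightly less than their nominal capacity, so a real threshold may need a margin.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

GIB = 1024  # MiB per GiB; assumes VRAM is given in MiB as reported by nvidia-smi


def _env(window: int, threads: int) -> Dict[str, str]:
    """Build the allowlisted MinerU environment for one policy row."""
    return {
        "MINERU_PROCESSING_WINDOW_SIZE": str(window),
        "MINERU_API_MAX_CONCURRENT_REQUESTS": "1",  # every row keeps requests sequential
        "MINERU_PDF_RENDER_THREADS": str(threads),
    }


@dataclass(frozen=True)
class ProfileResult:  # hypothetical record name
    requested: str
    applied: str
    env: Dict[str, str]
    warnings: Tuple[str, ...] = ()


def resolve_profile(requested: str, vram_mib: Optional[int], pre_turing: bool) -> ProfileResult:
    """Map (requested profile, selected GPU facts) to environment values."""
    if requested not in ("auto", "safe", "performance"):
        raise ValueError(f"unknown MinerU profile: {requested!r}")
    if requested == "safe":
        return ProfileResult(requested, "safe", _env(1, 1))
    if vram_mib is None:
        # No GPU inventory: fall back to safe values and record a warning.
        return ProfileResult(requested, "safe", _env(1, 1),
                             ("GPU inventory unavailable; using safe settings",))
    strong = vram_mib >= 16 * GIB and not pre_turing
    if requested == "performance":
        if strong:
            return ProfileResult(requested, "performance", _env(16, 4))
        # Downgrade instead of failing when performance is unsafe.
        return ProfileResult(requested, "safe", _env(1, 1),
                             ("performance profile downgraded to safe for this GPU",))
    # requested == "auto"
    if pre_turing or vram_mib < 12 * GIB:
        return ProfileResult(requested, "safe", _env(1, 1))
    if vram_mib < 16 * GIB:
        return ProfileResult(requested, "auto-conservative", _env(4, 2))
    return ProfileResult(requested, "auto-moderate", _env(8, 4))
```

Keeping the function free of I/O means each table row maps to one short test case, for example `resolve_profile("auto", 8192, True)` for the GTX 1070 Ti.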
Architecture Plan
WP15.1: GPU Inventory Boundary
Actions:
- Add `src/pdf2md/gpu.py`.
- Define immutable `GpuInfo` and `GpuInventory` records.
- Parse `nvidia-smi --query-gpu=index,name,memory.total,driver_version --format=csv,noheader,nounits`.
- Parse memory in MiB as an integer.
- Mark pre-Turing risk using the existing name-based heuristic for GTX 10xx and pre-Turing names.
- Optionally enrich compute capability through PyTorch when available, but keep PyTorch optional and mockable.
- Provide `select_gpu(gpus, requested)` for `cuda:N`, `N`, and `auto`.
Expected output:
- GPU detection is independently testable with captured command output strings.
- No real `nvidia-smi`, GPU, or PyTorch is needed in default tests.
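As an illustration of the inventory boundary, the sketch below parses captured `nvidia-smi` CSV text without invoking the tool. The field names on `GpuInfo` and the helper names `parse_nvidia_smi_csv` and `select_largest_vram` are assumptions, as is the choice to skip malformed rows rather than raise.

```python
from __future__ import annotations

import csv
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuInfo:  # field names are illustrative
    index: int
    name: str
    memory_total_mib: int
    driver_version: str


def parse_nvidia_smi_csv(output: str) -> list[GpuInfo]:
    """Parse 'index, name, memory.total, driver_version' rows (noheader, nounits)."""
    gpus: list[GpuInfo] = []
    for row in csv.reader(output.splitlines(), skipinitialspace=True):
        if len(row) != 4:
            continue  # blank or unexpected line
        index, name, memory, driver = (cell.strip() for cell in row)
        try:
            gpus.append(GpuInfo(int(index), name, int(memory), driver))
        except ValueError:
            continue  # assumption: skip rows with a malformed index or memory field
    return gpus


def select_largest_vram(gpus: list[GpuInfo]) -> GpuInfo:
    """Pick the GPU with the most VRAM; ties go to the lowest index."""
    if not gpus:
        raise ValueError("no visible NVIDIA GPU found")
    return max(gpus, key=lambda gpu: (gpu.memory_total_mib, -gpu.index))
```

Because the parser takes a plain string, default tests can feed it fixed text and never need a GPU or the `nvidia-smi` binary.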
WP15.2: MinerU Profile Policy
Actions:
- Add `src/pdf2md/mineru_profile.py`.
- Define supported profile names: `auto`, `safe`, `performance`.
- Define a result record containing:
  - requested profile,
  - applied profile,
  - selected GPU index if known,
  - selected GPU name if known,
  - selected GPU VRAM MiB if known,
  - environment variables to set,
  - warnings or info messages as project `WarningRecord` values.
- Implement the policy table above.
- Keep profile environment values in a small allowlist.
Expected output:
- The policy can be tested without running MinerU.
- Performance profile cannot silently overcommit weak GPUs.
WP15.3: Adapter Environment Integration
Actions:
- Extend `MinerUOptions` with `mineru_profile: str = "auto"` and optional resolved profile metadata.
- Keep strict-local validation for every option string.
- Update `_mineru_environment()` to merge:
  - `MINERU_DEVICE_MODE=cuda`,
  - `CUDA_VISIBLE_DEVICES=<selected index>`,
  - profile environment variables from `mineru_profile.py`.
- Preserve previous environment values after subprocess execution.
- Include profile details in `engine_options`.
Expected output:
- Real MinerU still receives only direct local CLI command shape:
mineru -p <input> -o <output>
- Tuning is done through local environment variables, not remote/API/backend flags.
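One way to set and later restore the tuning variables is a context manager around the subprocess call. This is a sketch only: `temporary_environment` and `ALLOWED_ENV` are hypothetical names, and the project's `_mineru_environment()` may be structured differently.

```python
import os
from contextlib import contextmanager
from typing import Dict, Iterator

# Allowlist of variables the adapter may touch (names taken from this contract).
ALLOWED_ENV = frozenset({
    "MINERU_DEVICE_MODE",
    "CUDA_VISIBLE_DEVICES",
    "MINERU_PROCESSING_WINDOW_SIZE",
    "MINERU_API_MAX_CONCURRENT_REQUESTS",
    "MINERU_PDF_RENDER_THREADS",
})


@contextmanager
def temporary_environment(overrides: Dict[str, str]) -> Iterator[None]:
    """Apply allowlisted overrides, then restore the previous values on exit."""
    unknown = set(overrides) - ALLOWED_ENV
    if unknown:
        raise ValueError(f"environment keys not allowlisted: {sorted(unknown)}")
    previous = {key: os.environ.get(key) for key in overrides}
    os.environ.update(overrides)
    try:
        yield
    finally:
        for key, old in previous.items():
            if old is None:
                os.environ.pop(key, None)  # variable was unset before
            else:
                os.environ[key] = old
```

The MinerU subprocess would run inside the `with` block, so the parent process environment returns to its prior state even if the subprocess raises.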
WP15.4: Conversion And CLI Wiring
Actions:
- Add `--mineru-profile` to `pdf2md convert`.
- Accept `--gpu auto`.
- Resolve selected GPU and profile before calling the adapter.
- Surface profile warnings in conversion metadata/report warnings.
- Preserve existing `--gpu cuda:0` default.
- Ensure `convert_pdf()` can receive the profile through the Python API.
Expected output:
- Default conversions use `mineru_profile=auto`.
- Existing calls with no new flags continue to work.
- Metadata explains which profile was applied.
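The CLI wiring can be sketched with `argparse`. This assumes nothing about the parser library the project actually uses. `build_parser` is a hypothetical name, and only the options named in this contract are shown.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Minimal 'pdf2md convert' parser showing the Sprint 15 defaults."""
    parser = argparse.ArgumentParser(prog="pdf2md")
    commands = parser.add_subparsers(dest="command", required=True)
    convert = commands.add_parser("convert")
    convert.add_argument("pdf")
    convert.add_argument("--out", required=True)
    # Default stays cuda:0 for backward compatibility; 'auto' is opt-in.
    convert.add_argument("--gpu", default="cuda:0")
    # choices= makes argparse reject unknown profile names on its own.
    convert.add_argument(
        "--mineru-profile",
        choices=("auto", "safe", "performance"),
        default="auto",
    )
    return parser
```

With this shape, `pdf2md convert paper.pdf --out outputs` yields `gpu="cuda:0"` and `mineru_profile="auto"`, so existing invocations keep working without new flags.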
WP15.5: Doctor Reporting
Actions:
- Reuse `gpu.py` inventory parsing in `doctor.py`.
- Keep the existing `gpu` and `pytorch` checks, but make GPU details more explicit.
gpuandpytorchchecks, but make GPU details more explicit. - Add a doctor detail line for auto-selected GPU and recommended profile.
- Keep warning-only behavior for Pascal/pre-Turing GPUs.
Expected output:
- On a stronger PC, `pdf2md doctor` shows enough evidence to decide whether `auto` or `performance` is appropriate.
- On the current GTX 1070 Ti, doctor still warns and recommends safe/conservative behavior.
WP15.6: Documentation
Actions:
- Update README setup and conversion docs with `--mineru-profile`.
- Update ARCHITECTURE to document that tuning uses strict-local environment variables only.
- Update PRD CLI section if the new public flag is added.
- Update `docs/V1IMPLEMENTATIONPLAN.md`, `PLAN.md`, and `PROGRESS.md`.
- Archive implementation details in `docs/WORKARCHIVE.md` only after implementation and verification.
Expected output:
- Users can move the repo to a stronger NVIDIA GPU PC, run `pdf2md doctor`, and understand the selected profile.
Tests
Default fast tests:
- GPU inventory parser handles one RTX GPU, multiple GPUs, no GPU lines, and malformed memory fields.
- `select_gpu(..., "auto")` selects the largest VRAM GPU.
- `select_gpu(..., "cuda:1")` selects index 1 and errors when absent.
- `select_gpu(..., "1")` normalizes to index 1.
- `auto` profile returns safe values for GTX 1070 Ti 8GB.
- `auto` profile returns moderately aggressive values for an RTX GPU with 16GB or more.
- `performance` profile returns performance values only for 16GB+ Turing-or-newer GPUs.
- `performance` profile on GTX 1070 Ti downgrades to safe and returns a warning.
- Adapter sets and restores `MINERU_DEVICE_MODE`, `CUDA_VISIBLE_DEVICES`, `MINERU_PROCESSING_WINDOW_SIZE`, `MINERU_API_MAX_CONCURRENT_REQUESTS`, and `MINERU_PDF_RENDER_THREADS`.
- Strict-local validation rejects remote/API/backend-like option strings in profile-related fields.
- CLI default passes `mineru_profile=auto`.
- CLI accepts `--mineru-profile safe` and `--mineru-profile performance`.
- CLI rejects invalid profile values.
- Doctor report includes visible GPU details and recommended profile with mocked command outputs.
- Existing conversion, chunking, metadata, report, and UI tests remain green.
Optional local validation on a stronger NVIDIA GPU PC:
uv run pdf2md doctor
$env:MINERU_MODEL_SOURCE='local'
uv run pdf2md convert samples\FourNodeQuadrilateralShellElementMITC4.pdf --out outputs\fournode-sprint15-auto --overwrite --chunk-pages --gpu auto --mineru-profile auto --strict-local
Expected optional validation:
- Doctor reports the stronger GPU name, VRAM, and recommended profile.
- Conversion metadata records `mineru_profile` and selected GPU information.
- Generated outputs stay ignored and uncommitted.
Acceptance Criteria
- `--mineru-profile auto` is the default conversion behavior.
- `auto` uses safe settings on the current GTX 1070 Ti 8GB and stronger settings only on 16GB+ Turing-or-newer NVIDIA GPUs.
- `--gpu auto` can choose the largest visible NVIDIA GPU without adding remote/runtime backend support.
- MinerU command shape remains direct local CLI only.
- Strict-local prohibitions remain enforced.
- `pdf2md doctor` provides actionable GPU/profile information.
- Metadata/report preserve the applied runtime profile.
- Default tests remain fast, mocked, local, and independent of real MinerU/GPU/model files/network/samples.
Hard Failure Criteria
- Implementation adds runtime backend selection or exposes `--backend`.
- Implementation passes `--api-url`, `--url`, router, HTTP client backend, or remote OpenAI-compatible backend values.
- `auto` profile applies aggressive settings to GTX 1070 Ti 8GB or other pre-Turing/low-VRAM GPUs.
- Existing `--gpu cuda:0` behavior breaks.
- Profile tuning disables formula or table parsing.
- Doctor or tests require real GPU, real MinerU execution, model files, network, Obsidian, MathJax, or `samples/`.
- Sample PDFs, generated outputs, local model files, or `dist/pdf2md-ui.exe` are committed.
Implementation Task Plan
Task 1: GPU Inventory
Files:
- Create `src/pdf2md/gpu.py`
- Create `tests/test_gpu.py`
Steps:
- Add failing tests for parsing `nvidia-smi` CSV output.
- Add failing tests for `auto`, `cuda:N`, and numeric GPU selection.
- Implement immutable GPU records and parser helpers.
- Implement selection errors as `ValueError` with clear messages.
- Run `uv run pytest tests/test_gpu.py`.
- Commit GPU inventory boundary.
Task 2: MinerU Profile Policy
Files:
- Create `src/pdf2md/mineru_profile.py`
- Create `tests/test_mineru_profile.py`
Steps:
- Add failing tests for safe, auto, and performance profile policy.
- Add tests proving 16GB+ Turing-or-newer GPUs get the moderately aggressive auto environment.
- Add tests proving GTX 1070 Ti 8GB stays safe.
- Implement the allowlisted environment mapping.
- Run `uv run pytest tests/test_mineru_profile.py tests/test_gpu.py`.
- Commit profile policy.
Task 3: Adapter And Conversion Wiring
Files:
- Modify `src/pdf2md/mineru_adapter.py`
- Modify `src/pdf2md/conversion.py`
- Modify `tests/test_mineru_adapter.py`
- Modify `tests/test_conversion.py`
Steps:
- Add failing adapter tests for profile environment variables and environment restoration.
- Add failing conversion tests that metadata receives applied profile information.
- Extend `MinerUOptions` and conversion options minimally.
- Merge GPU and profile environment variables before the MinerU subprocess.
- Run `uv run pytest tests/test_mineru_adapter.py tests/test_conversion.py tests/test_mineru_profile.py tests/test_gpu.py`.
- Commit adapter/conversion wiring.
Task 4: CLI And Doctor
Files:
- Modify `src/pdf2md/cli.py`
- Modify `src/pdf2md/doctor.py`
- Modify `tests/test_cli.py`
- Modify `tests/test_doctor.py`
Steps:
- Add failing CLI tests for default `auto`, explicit `safe`, explicit `performance`, invalid profile rejection, and `--gpu auto`.
- Add failing doctor tests for GPU inventory and recommended profile details.
- Implement CLI argument parsing and doctor report additions.
- Run `uv run pytest tests/test_cli.py tests/test_doctor.py tests/test_gpu.py tests/test_mineru_profile.py`.
- Commit CLI and doctor wiring.
Task 5: UI And Documentation
Files:
- Modify `src/pdf2md_ui/runner.py` only if explicit UI profile passthrough is needed
- Modify `src/pdf2md_ui/app.py` only if explicit UI profile control is needed
- Modify `tests/test_ui_runner.py` only if runner command construction changes
- Modify `README.md`
- Modify `ARCHITECTURE.md`
- Modify `PRD.md`
- Modify `docs/V1IMPLEMENTATIONPLAN.md`
- Modify `PLAN.md`
- Modify `PROGRESS.md`
- Modify `docs/WORKARCHIVE.md` after implementation
Steps:
- Keep UI unchanged if default CLI `auto` profile is enough for the first implementation pass.
- If UI exposes a profile control, add tests for fixed argument-list construction with `shell=False`.
- Document `--mineru-profile`, `--gpu auto`, profile policy, strict-local boundaries, and stronger-PC validation command.
- Run focused docs/UI tests if changed.
- Run final verification commands.
- Commit documentation and final coordination updates.
Verification Commands
uv run pytest tests/test_gpu.py tests/test_mineru_profile.py tests/test_mineru_adapter.py tests/test_conversion.py tests/test_cli.py tests/test_doctor.py
uv run pytest
git diff --check
git status --short --untracked-files=all
Optional stronger-PC validation is listed in the Tests section and must remain explicit opt-in.
Handoff Requirements
After implementation:
- Update `PROGRESS.md` with files changed, commands run, test outcomes, optional stronger-PC validation outcome, known failures, residual risks, and next action.
- Archive completed implementation details in `docs/WORKARCHIVE.md`.
- Keep generated outputs, sample PDFs, local model files, and UI build artifacts out of the commit.
- Record the detected GPU, applied profile, and whether `samples\FourNodeQuadrilateralShellElementMITC4.pdf` completed on the stronger PC.
Implementation handoff:
- Files changed: `src/pdf2md/gpu.py`, `src/pdf2md/mineru_profile.py`, `src/pdf2md/mineru_adapter.py`, `src/pdf2md/conversion.py`, `src/pdf2md/cli.py`, `src/pdf2md/doctor.py`, docs, and focused tests.
- Commands run: `uv run pytest tests/test_gpu.py tests/test_mineru_profile.py tests/test_mineru_adapter.py tests/test_conversion.py tests/test_cli.py tests/test_doctor.py`; `uv run pytest`; `uv run pdf2md doctor`.
- Tests passed: targeted Sprint 15 suite passed 101 tests; full default suite passed 225 tests with 1 optional skip; local doctor returned WARN with expected GTX 1070 Ti safe-profile recommendation.
- Known failures: optional stronger-PC real MinerU conversion validation was not run in this workspace.
- Residual risks: GTX 1070 Ti 8GB remains likely to stall on hard pages; stronger-PC behavior still needs local runtime validation.
- Next action: on a stronger NVIDIA GPU PC, run `pdf2md doctor` and an explicit local conversion with `--gpu auto --mineru-profile auto`.
Future Sprint Boundary
A later sprint may add page-level timeout handling, resumable page caches, or a performance mode that can run multiple page conversions concurrently on GPUs with enough VRAM. Those behaviors are intentionally out of Sprint 15 scope.