# Sprint 15 Contract: NVIDIA GPU Detection And Auto MinerU Profile

Status: Implemented

Last updated: 2026-05-12

## Objective

Add a strict-local runtime profiling layer that detects installed NVIDIA GPUs and applies conservative MinerU environment tuning by default.

The default runtime profile is `auto`. In `auto`, the converter should keep 8GB and pre-Turing GPUs conservative, while allowing a slightly more aggressive local MinerU configuration only when the selected NVIDIA GPU has at least 16GB VRAM and no pre-Turing compatibility warning.

This sprint is motivated by local evidence from `samples\FourNodeQuadrilateralShellElementMITC4.pdf`: Sprint 14's one-page conversion path used `cuda:0` correctly, but GTX 1070 Ti 8GB stayed near full VRAM use and stalled on source page 2. The next useful test should be on a stronger NVIDIA GPU with explicit runtime diagnostics and reproducible MinerU environment settings.

## Source Basis

Use these source-backed facts during implementation:

- MinerU CLI supports `mineru -p -o ` and, without `--api-url`, launches a temporary local `mineru-api`: https://opendatalab.github.io/MinerU/usage/cli_tools/
- MinerU CLI documents `-b/--backend`, `-f/--formula`, `-t/--table`, `--api-url`, and related options, but this project must not expose remote/API or backend selection paths in v1: https://opendatalab.github.io/MinerU/usage/cli_tools/
- MinerU environment variables include `MINERU_PDF_RENDER_THREADS`, `MINERU_PROCESSING_WINDOW_SIZE`, `MINERU_API_MAX_CONCURRENT_REQUESTS`, and timeout settings: https://opendatalab.github.io/MinerU/usage/cli_tools/
- MinerU advanced CLI docs support selecting visible GPU devices with `CUDA_VISIBLE_DEVICES`: https://opendatalab.github.io/MinerU/usage/advanced_cli_parameters/
- MinerU local deployment docs list auto-engine GPU requirements around 8GB+ VRAM and GPU acceleration for Volta-or-later devices: https://opendatalab.github.io/MinerU/quick_start/
- MinerU extension docs say `vllm` and `lmdeploy`
acceleration extras are alternatives and should not both be installed just for this sprint: https://opendatalab.github.io/MinerU/quick_start/extension_modules/

Access date for the source review: 2026-05-12.

## Current Precondition

- MinerU 3.1.0 remains the only conversion engine.
- Conversion runs through direct local `mineru` CLI execution only.
- Strict-local allows only the direct CLI and MinerU CLI-internal temporary local `mineru-api`; remote API/backend paths remain prohibited.
- `pdf2md convert` defaults to `--gpu cuda:0`.
- `MinerUAdapter` currently maps `cuda:N` to `MINERU_DEVICE_MODE=cuda` and `CUDA_VISIBLE_DEVICES=N`.
- `pdf2md doctor` already reports NVIDIA GPU visibility, PyTorch CUDA visibility, GPU names, and Pascal/pre-Turing warnings.
- Sprint 14 chunk mode runs one source page per MinerU invocation when `--chunk-pages` is active.

## Contract Assumptions

- Keep `--gpu cuda:0` as the default for backward compatibility with PRD and existing docs.
- Add `--gpu auto` as an opt-in GPU selection mode that chooses the visible NVIDIA GPU with the largest reported VRAM.
- Add `--mineru-profile {auto,safe,performance}` with default `auto`.
- Keep all conversion requests sequential in Sprint 15. Do not introduce parallel page conversion.
- Keep formula and table parsing enabled. Do not optimize by disabling required output quality features.
- Do not add `--backend`, `--api-url`, `--url`, router mode, HTTP client backend, remote OpenAI-compatible backend, or remote model server support.
- Treat MinerU environment tuning as best-effort. If GPU inventory cannot be read, continue with safe profile settings and a warning/provenance record rather than guessing aggressive values.
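
As a rough illustration of the flag assumptions above, the following Python sketch declares only the two options this sprint touches. The contract does not say which CLI framework `pdf2md` uses, so `argparse` and the names `gpu_selector` and `build_convert_parser` are assumptions made for this example; every other `convert` option is omitted. The bare-index normalization mirrors the `--gpu N` rule stated in the CLI section.

```python
import argparse
import re

# Profile names taken from the contract; everything else here is illustrative.
PROFILES = ("auto", "safe", "performance")


def gpu_selector(value: str) -> str:
    """Validate a --gpu value and normalize a bare index N to cuda:N."""
    text = value.strip().lower()
    if text == "auto":
        return "auto"
    match = re.fullmatch(r"(?:cuda:)?(\d+)", text)
    if match is None:
        raise argparse.ArgumentTypeError(
            f"expected auto, cuda:N, or N, got {value!r}"
        )
    return f"cuda:{int(match.group(1))}"


def build_convert_parser() -> argparse.ArgumentParser:
    """Hypothetical parser fragment: only --gpu and --mineru-profile."""
    parser = argparse.ArgumentParser(prog="pdf2md convert")
    # Default stays cuda:0 for backward compatibility; auto is opt-in.
    parser.add_argument("--gpu", type=gpu_selector, default="cuda:0")
    parser.add_argument("--mineru-profile", choices=PROFILES, default="auto")
    # No --backend, --api-url, or --url is declared, so argparse rejects them.
    return parser


if __name__ == "__main__":
    args = build_convert_parser().parse_args(["--gpu", "1"])
    print(args.gpu, args.mineru_profile)
```

Leaving the prohibited flags undeclared means the parser itself refuses them, which keeps the strict-local boundary independent of later validation code.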
## Touched Surfaces

Allowed during implementation:

- Create `src/pdf2md/gpu.py`
- Create `src/pdf2md/mineru_profile.py`
- Modify `src/pdf2md/mineru_adapter.py`
- Modify `src/pdf2md/conversion.py`
- Modify `src/pdf2md/cli.py`
- Modify `src/pdf2md/doctor.py`
- Modify `src/pdf2md_ui/runner.py` only if the UI command builder needs profile passthrough
- Modify `src/pdf2md_ui/app.py` only if a minimal profile control is necessary
- Add `tests/test_gpu.py`
- Add `tests/test_mineru_profile.py`
- Modify `tests/test_mineru_adapter.py`
- Modify `tests/test_conversion.py`
- Modify `tests/test_cli.py`
- Modify `tests/test_doctor.py`
- Modify `tests/test_ui_runner.py` only if UI command construction changes
- Modify `README.md`
- Modify `ARCHITECTURE.md`
- Modify `PRD.md` if CLI option documentation changes
- Modify `docs/V1IMPLEMENTATIONPLAN.md`
- Modify `PLAN.md`
- Modify `PROGRESS.md`
- Modify `docs/WORKARCHIVE.md` after implementation

Not allowed:

- Adding another conversion engine or runtime engine selector.
- Passing `--api-url`, `--url`, or any remote endpoint to MinerU.
- Adding `mineru-router`, HTTP client backend, or OpenAI-compatible backend usage.
- Installing `vllm`, `lmdeploy`, CUDA packages, models, or any runtime package automatically.
- Changing the default conversion engine or disabling formula/table recognition.
- Making default tests depend on real MinerU, GPU, CUDA, PyTorch, model files, network, Obsidian, MathJax, or `samples/`.
- Committing sample PDFs, generated `outputs/`, retained temporary page outputs, local model files, or `dist/pdf2md-ui.exe`.
## Product Behavior

### CLI

Existing behavior remains valid:

```powershell
uv run pdf2md convert paper.pdf --out outputs
uv run pdf2md convert paper.pdf --out outputs --gpu cuda:0
```

New behavior:

```powershell
uv run pdf2md convert paper.pdf --out outputs --mineru-profile auto
uv run pdf2md convert paper.pdf --out outputs --mineru-profile safe
uv run pdf2md convert paper.pdf --out outputs --mineru-profile performance
uv run pdf2md convert paper.pdf --out outputs --gpu auto --mineru-profile auto
```

Rules:

- `--mineru-profile` defaults to `auto`.
- `--gpu cuda:N` selects a concrete CUDA index and tunes MinerU for that selected GPU when inventory is available.
- `--gpu N` is still normalized to `cuda:N`.
- `--gpu auto` selects the visible NVIDIA GPU with the largest VRAM from local GPU inventory.
- If `--gpu auto` cannot find a visible NVIDIA GPU, fail clearly before conversion rather than silently switching to CPU.
- If `--mineru-profile performance` is requested on a selected GPU below 16GB VRAM or with pre-Turing risk, downgrade to safe settings with a warning in metadata/report. Do not fail solely because performance was unsafe.

### Doctor

`pdf2md doctor` should report:

- All visible NVIDIA GPUs with index, name, total VRAM, and driver version from `nvidia-smi`.
- PyTorch CUDA device names and compute capabilities when available.
- Selected default GPU recommendation for `--gpu auto`.
- Recommended MinerU profile for the detected primary GPU.
- Existing Pascal/pre-Turing warnings.

Doctor must not require a real conversion, model load, network access, or package download.

### Auto Profile Policy

Use a small deterministic policy table. Values are intentionally conservative because the converter runs real PDFs and should prefer completion over peak throughput.

| Selected GPU | Auto policy | MinerU environment |
| --- | --- | --- |
| No GPU inventory, CUDA requested | Safe fallback with warning | `MINERU_PROCESSING_WINDOW_SIZE=1`, `MINERU_API_MAX_CONCURRENT_REQUESTS=1`, `MINERU_PDF_RENDER_THREADS=1` |
| Pre-Turing or VRAM < 12GB | Safe | `MINERU_PROCESSING_WINDOW_SIZE=1`, `MINERU_API_MAX_CONCURRENT_REQUESTS=1`, `MINERU_PDF_RENDER_THREADS=1` |
| 12GB <= VRAM < 16GB | Auto conservative | `MINERU_PROCESSING_WINDOW_SIZE=4`, `MINERU_API_MAX_CONCURRENT_REQUESTS=1`, `MINERU_PDF_RENDER_THREADS=2` |
| VRAM >= 16GB and Turing-or-newer | Auto moderately aggressive | `MINERU_PROCESSING_WINDOW_SIZE=8`, `MINERU_API_MAX_CONCURRENT_REQUESTS=1`, `MINERU_PDF_RENDER_THREADS=4` |
| Explicit `safe` | Safe regardless of GPU | `MINERU_PROCESSING_WINDOW_SIZE=1`, `MINERU_API_MAX_CONCURRENT_REQUESTS=1`, `MINERU_PDF_RENDER_THREADS=1` |
| Explicit `performance` on VRAM >= 16GB and Turing-or-newer | Performance | `MINERU_PROCESSING_WINDOW_SIZE=16`, `MINERU_API_MAX_CONCURRENT_REQUESTS=1`, `MINERU_PDF_RENDER_THREADS=4` |
| Explicit `performance` on weaker GPU | Downgraded safe with warning | safe values |

Do not set `MINERU_HYBRID_BATCH_RATIO` in Sprint 15 because MinerU docs describe it as commonly used for `hybrid-http-client`, which this project prohibits in v1.

Do not set backend CLI flags in Sprint 15. The default MinerU backend remains MinerU-owned.

## Architecture Plan

### WP15.1: GPU Inventory Boundary

Actions:

- Add `src/pdf2md/gpu.py`.
- Define immutable `GpuInfo` and `GpuInventory` records.
- Parse `nvidia-smi --query-gpu=index,name,memory.total,driver_version --format=csv,noheader,nounits`.
- Parse memory in MiB as an integer.
- Mark pre-Turing risk using the existing name-based heuristic for GTX 10xx and pre-Turing names.
- Optionally enrich compute capability through PyTorch when available, but keep PyTorch optional and mockable.
- Provide `select_gpu(gpus, requested)` for `cuda:N`, `N`, and `auto`.
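
The parsing and selection actions above can be sketched as follows. This is a minimal illustration under stated assumptions, not the shipped `gpu.py`: the `GpuInfo` field names, the `parse_nvidia_smi_csv` helper, the regular expression standing in for the existing name-based heuristic, and the lowest-index tie-break for `auto` are all hypothetical. The sketch also passes a plain tuple where the contract names a `GpuInventory` record.

```python
from __future__ import annotations

import re
from dataclasses import dataclass
from typing import Sequence

# Stand-in for the project's name-based heuristic (assumption): GTX 10xx and
# GTX 9xx names are treated as pre-Turing. GTX 16xx (Turing) does not match.
_PRE_TURING = re.compile(r"\bGTX\s*(10\d0|9\d0)\b", re.IGNORECASE)


@dataclass(frozen=True)
class GpuInfo:
    index: int
    name: str
    memory_total_mib: int
    driver_version: str
    pre_turing_risk: bool


def parse_nvidia_smi_csv(output: str) -> tuple[GpuInfo, ...]:
    """Parse captured `--format=csv,noheader,nounits` output.

    Lines with the wrong field count or non-integer index/memory are skipped.
    """
    gpus = []
    for line in output.splitlines():
        fields = [part.strip() for part in line.split(",")]
        if len(fields) != 4:
            continue
        index_text, name, memory_text, driver = fields
        try:
            index = int(index_text)
            memory_mib = int(memory_text)
        except ValueError:
            continue
        risky = _PRE_TURING.search(name) is not None
        gpus.append(GpuInfo(index, name, memory_mib, driver, risky))
    return tuple(gpus)


def select_gpu(gpus: Sequence[GpuInfo], requested: str) -> GpuInfo:
    """Resolve `auto`, `cuda:N`, or `N` against the parsed inventory."""
    text = requested.strip().lower()
    if text == "auto":
        if not gpus:
            raise ValueError("--gpu auto found no visible NVIDIA GPU")
        # Largest VRAM wins; ties go to the lowest index (assumption).
        return max(gpus, key=lambda gpu: (gpu.memory_total_mib, -gpu.index))
    match = re.fullmatch(r"(?:cuda:)?(\d+)", text)
    if match is None:
        raise ValueError(f"unsupported GPU selector: {requested!r}")
    wanted = int(match.group(1))
    for gpu in gpus:
        if gpu.index == wanted:
            return gpu
    raise ValueError(f"cuda:{wanted} is not in the visible GPU inventory")
```

Because the parser takes a string rather than running `nvidia-smi` itself, tests can feed it captured output directly and never need a real GPU.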

Expected output:

- GPU detection is independently testable with captured command output strings.
- No real `nvidia-smi`, GPU, or PyTorch is needed in default tests.

### WP15.2: MinerU Profile Policy

Actions:

- Add `src/pdf2md/mineru_profile.py`.
- Define supported profile names: `auto`, `safe`, `performance`.
- Define a result record containing:
  - requested profile,
  - applied profile,
  - selected GPU index if known,
  - selected GPU name if known,
  - selected GPU VRAM MiB if known,
  - environment variables to set,
  - warnings or info messages as project `WarningRecord` values.
- Implement the policy table above.
- Keep profile environment values in a small allowlist.

Expected output:

- The policy can be tested without running MinerU.
- Performance profile cannot silently overcommit weak GPUs.

### WP15.3: Adapter Environment Integration

Actions:

- Extend `MinerUOptions` with `mineru_profile: str = "auto"` and optional resolved profile metadata.
- Keep strict-local validation for every option string.
- Update `_mineru_environment()` to merge:
  - `MINERU_DEVICE_MODE=cuda`,
  - `CUDA_VISIBLE_DEVICES=`,
  - profile environment variables from `mineru_profile.py`.
- Preserve previous environment values after subprocess execution.
- Include profile details in `engine_options`.

Expected output:

- Real MinerU still receives only direct local CLI command shape:

```text
mineru -p -o
```

- Tuning is done through local environment variables, not remote/API/backend flags.

### WP15.4: Conversion And CLI Wiring

Actions:

- Add `--mineru-profile` to `pdf2md convert`.
- Accept `--gpu auto`.
- Resolve selected GPU and profile before calling the adapter.
- Surface profile warnings in conversion metadata/report warnings.
- Preserve existing `--gpu cuda:0` default.
- Ensure `convert_pdf()` can receive the profile through the Python API.

Expected output:

- Default conversions use `mineru_profile=auto`.
- Existing calls with no new flags continue to work.
- Metadata explains which profile was applied.
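
A minimal sketch of how the policy table and the adapter environment merge from WP15.2 and WP15.3 might fit together. `ProfileResult`, `resolve_profile`, `mineru_environment`, and `applied_environment` are hypothetical names, plain strings stand in for project `WarningRecord` values, and the GB thresholds are compared as `N * 1024` MiB, which is an assumption the contract does not spell out.

```python
from __future__ import annotations

import os
from contextlib import contextmanager
from dataclasses import dataclass
from typing import Iterator, Mapping, Optional

# Allowlist of profile-controlled variables, taken from the policy table.
_KEYS = (
    "MINERU_PROCESSING_WINDOW_SIZE",
    "MINERU_API_MAX_CONCURRENT_REQUESTS",
    "MINERU_PDF_RENDER_THREADS",
)


def _env(window: int, requests: int, threads: int) -> dict[str, str]:
    return dict(zip(_KEYS, (str(window), str(requests), str(threads))))


SAFE = _env(1, 1, 1)
AUTO_CONSERVATIVE = _env(4, 1, 2)
AUTO_MODERATE = _env(8, 1, 4)
PERFORMANCE = _env(16, 1, 4)
MIB_PER_GB = 1024  # assumption: thresholds compared as N * 1024 MiB


@dataclass(frozen=True)
class ProfileResult:
    requested: str
    applied: str
    environment: Mapping[str, str]
    warnings: tuple[str, ...] = ()


def resolve_profile(
    requested: str, vram_mib: Optional[int], pre_turing_risk: bool
) -> ProfileResult:
    """Apply the policy table; vram_mib=None means no GPU inventory."""
    if requested not in ("auto", "safe", "performance"):
        raise ValueError(f"unknown MinerU profile: {requested!r}")
    if requested == "safe":
        return ProfileResult(requested, "safe", SAFE)
    if vram_mib is None:
        note = "GPU inventory unavailable; using safe settings"
        return ProfileResult(requested, "safe", SAFE, (note,))
    strong = vram_mib >= 16 * MIB_PER_GB and not pre_turing_risk
    if requested == "performance":
        if strong:
            return ProfileResult(requested, "performance", PERFORMANCE)
        note = "performance is unsafe on this GPU; downgraded to safe"
        return ProfileResult(requested, "safe", SAFE, (note,))
    if pre_turing_risk or vram_mib < 12 * MIB_PER_GB:
        return ProfileResult(requested, "safe", SAFE)
    if not strong:
        return ProfileResult(requested, "auto", AUTO_CONSERVATIVE)
    return ProfileResult(requested, "auto", AUTO_MODERATE)


def mineru_environment(gpu_index: int, profile: ProfileResult) -> dict[str, str]:
    """Merge device selection with allowlisted profile variables."""
    unexpected = set(profile.environment) - set(_KEYS)
    if unexpected:
        raise ValueError(f"profile variables not allowlisted: {sorted(unexpected)}")
    env = {"MINERU_DEVICE_MODE": "cuda", "CUDA_VISIBLE_DEVICES": str(gpu_index)}
    env.update(profile.environment)
    return env


@contextmanager
def applied_environment(overrides: Mapping[str, str]) -> Iterator[None]:
    """Set variables for the subprocess call, then restore prior values."""
    previous = {key: os.environ.get(key) for key in overrides}
    os.environ.update(overrides)
    try:
        yield
    finally:
        for key, old in previous.items():
            if old is None:
                os.environ.pop(key, None)
            else:
                os.environ[key] = old
```

A caller would wrap the `mineru` subprocess invocation in `with applied_environment(mineru_environment(index, result)):`. Reported `memory.total` can differ from a card's nominal size, so the exact cut-offs deserve a check against real `pdf2md doctor` output before they are relied on.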
### WP15.5: Doctor Reporting

Actions:

- Reuse `gpu.py` inventory parsing in `doctor.py`.
- Keep the existing `gpu` and `pytorch` checks, but make GPU details more explicit.
- Add a doctor detail line for auto-selected GPU and recommended profile.
- Keep warning-only behavior for Pascal/pre-Turing GPUs.

Expected output:

- On a stronger PC, `pdf2md doctor` shows enough evidence to decide whether `auto` or `performance` is appropriate.
- On the current GTX 1070 Ti, doctor still warns and recommends safe/conservative behavior.

### WP15.6: Documentation

Actions:

- Update README setup and conversion docs with `--mineru-profile`.
- Update ARCHITECTURE to document that tuning uses strict-local environment variables only.
- Update PRD CLI section if the new public flag is added.
- Update `docs/V1IMPLEMENTATIONPLAN.md`, `PLAN.md`, and `PROGRESS.md`.
- Archive implementation details in `docs/WORKARCHIVE.md` only after implementation and verification.

Expected output:

- Users can move the repo to a stronger NVIDIA GPU PC, run `pdf2md doctor`, and understand the selected profile.

## Tests

Default fast tests:

- GPU inventory parser handles one RTX GPU, multiple GPUs, no GPU lines, and malformed memory fields.
- `select_gpu(..., "auto")` selects the largest VRAM GPU.
- `select_gpu(..., "cuda:1")` selects index 1 and errors when absent.
- `select_gpu(..., "1")` normalizes to index 1.
- `auto` profile returns safe values for GTX 1070 Ti 8GB.
- `auto` profile returns moderately aggressive values for an RTX GPU with 16GB or more.
- `performance` profile returns performance values only for 16GB+ Turing-or-newer GPUs.
- `performance` profile on GTX 1070 Ti downgrades to safe and returns a warning.
- Adapter sets and restores `MINERU_DEVICE_MODE`, `CUDA_VISIBLE_DEVICES`, `MINERU_PROCESSING_WINDOW_SIZE`, `MINERU_API_MAX_CONCURRENT_REQUESTS`, and `MINERU_PDF_RENDER_THREADS`.
- Strict-local validation rejects remote/API/backend-like option strings in profile-related fields.
- CLI default passes `mineru_profile=auto`.
- CLI accepts `--mineru-profile safe` and `--mineru-profile performance`.
- CLI rejects invalid profile values.
- Doctor report includes visible GPU details and recommended profile with mocked command outputs.
- Existing conversion, chunking, metadata, report, and UI tests remain green.

Optional local validation on a stronger NVIDIA GPU PC:

```powershell
uv run pdf2md doctor
$env:MINERU_MODEL_SOURCE='local'
uv run pdf2md convert samples\FourNodeQuadrilateralShellElementMITC4.pdf --out outputs\fournode-sprint15-auto --overwrite --chunk-pages --gpu auto --mineru-profile auto --strict-local
```

Expected optional validation:

- Doctor reports the stronger GPU name, VRAM, and recommended profile.
- Conversion metadata records `mineru_profile` and selected GPU information.
- Generated outputs stay ignored and uncommitted.

## Acceptance Criteria

- `--mineru-profile auto` is the default conversion behavior.
- `auto` uses safe settings on the current GTX 1070 Ti 8GB and stronger settings only on 16GB+ Turing-or-newer NVIDIA GPUs.
- `--gpu auto` can choose the largest visible NVIDIA GPU without adding remote/runtime backend support.
- MinerU command shape remains direct local CLI only.
- Strict-local prohibitions remain enforced.
- `pdf2md doctor` provides actionable GPU/profile information.
- Metadata/report preserve the applied runtime profile.
- Default tests remain fast, mocked, local, and independent of real MinerU/GPU/model files/network/samples.

## Hard Failure Criteria

- Implementation adds runtime backend selection or exposes `--backend`.
- Implementation passes `--api-url`, `--url`, router, HTTP client backend, or remote OpenAI-compatible backend values.
- `auto` profile applies aggressive settings to GTX 1070 Ti 8GB or other pre-Turing/low-VRAM GPUs.
- Existing `--gpu cuda:0` behavior breaks.
- Profile tuning disables formula or table parsing.
- Doctor or tests require real GPU, real MinerU execution, model files, network, Obsidian, MathJax, or `samples/`.
- Sample PDFs, generated outputs, local model files, or `dist/pdf2md-ui.exe` are committed.

## Implementation Task Plan

### Task 1: GPU Inventory

Files:

- Create `src/pdf2md/gpu.py`
- Create `tests/test_gpu.py`

Steps:

- [x] Add failing tests for parsing `nvidia-smi` CSV output.
- [x] Add failing tests for `auto`, `cuda:N`, and numeric GPU selection.
- [x] Implement immutable GPU records and parser helpers.
- [x] Implement selection errors as `ValueError` with clear messages.
- [x] Run `uv run pytest tests/test_gpu.py`.
- [x] Commit GPU inventory boundary.

### Task 2: MinerU Profile Policy

Files:

- Create `src/pdf2md/mineru_profile.py`
- Create `tests/test_mineru_profile.py`

Steps:

- [x] Add failing tests for safe, auto, and performance profile policy.
- [x] Add tests proving 16GB+ Turing-or-newer GPUs get the moderately aggressive auto environment.
- [x] Add tests proving GTX 1070 Ti 8GB stays safe.
- [x] Implement the allowlisted environment mapping.
- [x] Run `uv run pytest tests/test_mineru_profile.py tests/test_gpu.py`.
- [x] Commit profile policy.

### Task 3: Adapter And Conversion Wiring

Files:

- Modify `src/pdf2md/mineru_adapter.py`
- Modify `src/pdf2md/conversion.py`
- Modify `tests/test_mineru_adapter.py`
- Modify `tests/test_conversion.py`

Steps:

- [x] Add failing adapter tests for profile environment variables and environment restoration.
- [x] Add failing conversion tests that metadata receives applied profile information.
- [x] Extend `MinerUOptions` and conversion options minimally.
- [x] Merge GPU and profile environment variables before the MinerU subprocess.
- [x] Run `uv run pytest tests/test_mineru_adapter.py tests/test_conversion.py tests/test_mineru_profile.py tests/test_gpu.py`.
- [x] Commit adapter/conversion wiring.
### Task 4: CLI And Doctor

Files:

- Modify `src/pdf2md/cli.py`
- Modify `src/pdf2md/doctor.py`
- Modify `tests/test_cli.py`
- Modify `tests/test_doctor.py`

Steps:

- [x] Add failing CLI tests for default `auto`, explicit `safe`, explicit `performance`, invalid profile rejection, and `--gpu auto`.
- [x] Add failing doctor tests for GPU inventory and recommended profile details.
- [x] Implement CLI argument parsing and doctor report additions.
- [x] Run `uv run pytest tests/test_cli.py tests/test_doctor.py tests/test_gpu.py tests/test_mineru_profile.py`.
- [x] Commit CLI and doctor wiring.

### Task 5: UI And Documentation

Files:

- Modify `src/pdf2md_ui/runner.py` only if explicit UI profile passthrough is needed
- Modify `src/pdf2md_ui/app.py` only if explicit UI profile control is needed
- Modify `tests/test_ui_runner.py` only if runner command construction changes
- Modify `README.md`
- Modify `ARCHITECTURE.md`
- Modify `PRD.md`
- Modify `docs/V1IMPLEMENTATIONPLAN.md`
- Modify `PLAN.md`
- Modify `PROGRESS.md`
- Modify `docs/WORKARCHIVE.md` after implementation

Steps:

- [x] Keep UI unchanged if default CLI `auto` profile is enough for the first implementation pass.
- [x] If UI exposes a profile control, add tests for fixed argument-list construction with `shell=False`.
- [x] Document `--mineru-profile`, `--gpu auto`, profile policy, strict-local boundaries, and stronger-PC validation command.
- [x] Run focused docs/UI tests if changed.
- [x] Run final verification commands.
- [x] Commit documentation and final coordination updates.

## Verification Commands

```powershell
uv run pytest tests/test_gpu.py tests/test_mineru_profile.py tests/test_mineru_adapter.py tests/test_conversion.py tests/test_cli.py tests/test_doctor.py
uv run pytest
git diff --check
git status --short --untracked-files=all
```

Optional stronger-PC validation is listed in the Tests section and must remain explicit opt-in.
## Handoff Requirements

After implementation:

- Update `PROGRESS.md` with files changed, commands run, test outcomes, optional stronger-PC validation outcome, known failures, residual risks, and next action.
- Archive completed implementation details in `docs/WORKARCHIVE.md`.
- Keep generated outputs, sample PDFs, local model files, and UI build artifacts out of the commit.
- Record the detected GPU, applied profile, and whether `samples\FourNodeQuadrilateralShellElementMITC4.pdf` completed on the stronger PC.

Implementation handoff:

- Files changed: `src/pdf2md/gpu.py`, `src/pdf2md/mineru_profile.py`, `src/pdf2md/mineru_adapter.py`, `src/pdf2md/conversion.py`, `src/pdf2md/cli.py`, `src/pdf2md/doctor.py`, docs, and focused tests.
- Commands run: `uv run pytest tests/test_gpu.py tests/test_mineru_profile.py tests/test_mineru_adapter.py tests/test_conversion.py tests/test_cli.py tests/test_doctor.py`; `uv run pytest`; `uv run pdf2md doctor`.
- Tests passed: targeted Sprint 15 suite passed 101 tests; full default suite passed 225 tests with 1 optional skip; local doctor returned WARN with expected GTX 1070 Ti safe-profile recommendation.
- Known failures: optional stronger-PC real MinerU conversion validation was not run in this workspace.
- Residual risks: GTX 1070 Ti 8GB remains likely to stall on hard pages; stronger-PC behavior still needs local runtime validation.
- Next action: on a stronger NVIDIA GPU PC, run `pdf2md doctor` and an explicit local conversion with `--gpu auto --mineru-profile auto`.

## Future Sprint Boundary

A later sprint may add page-level timeout handling, resumable page caches, or a performance mode that can run multiple page conversions concurrently on GPUs with enough VRAM. Those behaviors are intentionally out of Sprint 15 scope.