# Step 0: formula-block-detection

## Read First

- /AGENTS.md
- /PLAN.md
- /PROGRESS.md
- /docs/HARNESS.md
- /docs/IMPLEMENTATION_PLAN.md
- /docs/CONVERSION_POLICY.md
- /phases/2-marker-adapter/step2.md

## Task

Implement formula candidate detection from normalized Marker blocks. Detect Marker equation blocks and text-pattern candidates while classifying inline versus block formulas based on block role and layout hints.

## Sprint Contract

- Done means: formula candidates are represented as internal objects ready for Nougat or Marker fallback.
- Hard thresholds: ordinary currency-like dollar text is not blindly treated as math; inline/block distinction is tested; no Nougat invocation occurs yet.
- Files owned: `src/pdftomd/formulas.py`, tests, `PROGRESS.md`, `phases/3-formula-pipeline/index.json`.
- Dependencies: Phase 2 block normalization.

## Acceptance Criteria

```powershell
python scripts\validate_workspace.py
.\venv\python.exe -m pytest tests
```

## Verification

1. Run the acceptance commands.
2. Confirm tests include inline and block formula candidates.
3. Update `PROGRESS.md` and this phase index.

## Do Not

- Do not call Nougat.
- Do not render Markdown math.
- Do not make regex the only source when structured block role exists.
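
## Illustrative Sketch

The following is a minimal sketch of how the detection described in the Task and Sprint Contract might look, not the required implementation. The `Block` and `FormulaCandidate` classes, their field names, the `"equation"` role string, and the `detect_candidates` function are all hypothetical; the real normalized block shape comes from Phase 2. Layout hints are not modeled here. The sketch prefers the structured block role when present and only falls back to text patterns otherwise, and it does not call Nougat or render Markdown math.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Block:
    """Hypothetical stand-in for a normalized Marker block."""
    role: str  # assumed values: "equation", "text", ...
    text: str


@dataclass(frozen=True)
class FormulaCandidate:
    """Hypothetical internal object handed to a later Nougat/Marker fallback step."""
    text: str
    kind: str    # "inline" or "block"
    source: str  # "block_role" or "text_pattern"


# A whole text block wrapped in $$...$$ is treated as a display formula.
_DISPLAY = re.compile(r"\$\$(.+?)\$\$", re.DOTALL)

# $...$ span with no space just inside either dollar sign, and no digit
# directly after the closing dollar. This rejects text like "$5 and $10".
_INLINE = re.compile(r"(?<![\\$])\$(?!\s)([^$\n]+?)(?<!\s)\$(?![\d$])")

# The span must contain at least one letter or math-like character.
_MATH_HINT = re.compile(r"[A-Za-z\\^_=]")


def detect_candidates(blocks: List[Block]) -> List[FormulaCandidate]:
    candidates: List[FormulaCandidate] = []
    for block in blocks:
        # Structured role wins: regex is never consulted for equation blocks.
        if block.role == "equation":
            candidates.append(
                FormulaCandidate(block.text.strip(), "block", "block_role")
            )
            continue

        display = _DISPLAY.fullmatch(block.text.strip())
        if display:
            candidates.append(
                FormulaCandidate(display.group(1).strip(), "block", "text_pattern")
            )
            continue

        for match in _INLINE.finditer(block.text):
            body = match.group(1)
            if _MATH_HINT.search(body):
                candidates.append(FormulaCandidate(body, "inline", "text_pattern"))
    return candidates


sample = [
    Block("equation", "E = mc^2"),
    Block("text", "Let $x^2 + y^2 = r^2$ hold."),
    Block("text", "It costs $5 and $10 total."),
    Block("text", "Prices: $5,$10"),
]
found = detect_candidates(sample)
```

With this sample input, the equation block yields a block candidate from its role, the second block yields one inline candidate, and the two currency-like blocks yield none. A real test suite would cover the same inline, block, and currency cases against the actual Phase 2 block type.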