CSV Schema/Tolerance Comparison Implementation Plan
For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.
Goal: Add a no-Abaqus CSV comparison script that validates externally generated ODB-extracted actual CSV files against approved reference CSV artifacts by schema, row identity, units/coordinate metadata, and tolerance.
Architecture: Keep scripts/validate_reference_artifacts.py responsible for artifact completeness only. Add scripts/compare_extracted_csv.py as an explicit CLI tool that reads references/<feature-id>/<model-id>/metadata.json, validates the reference bundle, loads actual CSVs from a user-provided external result bundle, compares rows using declared schema/tolerance rules, and emits pass/fail plus optional JSON evidence. Do not integrate this into default scripts/validate_workspace.py because actual CSVs are generated outside this project and may not exist on every machine.
Tech Stack: Python standard library only (argparse, csv, json, math, dataclasses, pathlib, statistics or direct RMS math), existing unittest test style, existing reference artifact metadata contract.
File Structure
- Create: scripts/compare_extracted_csv.py - CLI and importable functions for loading metadata, resolving CSV paths, validating CSV schema, matching rows, computing tolerance metrics, classifying failures, and emitting text/JSON reports.
- Create: scripts/test_compare_extracted_csv.py - TDD coverage for pass, schema mismatch, missing actual output, ID mismatch, unit/coordinate mismatch, nonfinite values, and tolerance failure.
- Modify: docs/reference-verifications/README.md - Document CLI usage, metadata comparison contract, failure classifications, and expected report fields.
- Modify: docs/reference-models/README.md - Add optional comparisons metadata block to the artifact bundle example.
- Do not modify: scripts/validate_workspace.py - CSV comparison needs explicit actual output paths, so it stays outside default workspace validation.
Metadata Contract
Add an optional comparisons block to metadata.json. The comparison script requires this block for quantities it compares, but validate_reference_artifacts.py does not need to require it for all ready-for-comparison bundles.
{
  "comparisons": {
    "stresses": {
      "reference_csv": "extracted/stresses.csv",
      "actual_csv": "extracted/stresses.csv",
      "required_columns": [
        "step",
        "frame",
        "instance",
        "element_label",
        "integration_point",
        "section_point",
        "output_position",
        "component",
        "coordinate_system",
        "unit",
        "value"
      ],
      "key_columns": [
        "step",
        "frame",
        "instance",
        "element_label",
        "integration_point",
        "section_point",
        "output_position",
        "component"
      ],
      "value_column": "value",
      "unit_column": "unit",
      "coordinate_system_column": "coordinate_system",
      "tolerance": {
        "absolute": 1.0e-8,
        "relative": 1.0e-6,
        "relative_floor": 1.0e-12
      }
    }
  }
}
Tolerance rule:
absolute_error = abs(actual - reference)
relative_error = absolute_error / max(abs(reference), relative_floor)
allowed_error = absolute + relative * max(abs(reference), relative_floor)
row_pass = absolute_error <= allowed_error
quantity_pass = all rows pass and no schema/id/unit/coordinate/nonfinite errors exist
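The tolerance rule above can be sketched in Python as follows. This is a minimal illustration, not the planned implementation: the function name `row_metrics` and its keyword-only parameters are hypothetical, and the sketch assumes both values have already been checked for finiteness and that `relative_floor` is positive.

```python
def row_metrics(actual: float, reference: float, *,
                absolute: float, relative: float, relative_floor: float) -> dict:
    """Apply the plan's tolerance rule to one matched row (hypothetical helper)."""
    # The floor keeps the relative error defined when the reference is zero.
    scale = max(abs(reference), relative_floor)
    absolute_error = abs(actual - reference)
    relative_error = absolute_error / scale
    allowed_error = absolute + relative * scale
    return {
        "absolute_error": absolute_error,
        "relative_error": relative_error,
        "allowed_error": allowed_error,
        "row_pass": absolute_error <= allowed_error,
    }


tolerance = {"absolute": 1.0e-8, "relative": 1.0e-6, "relative_floor": 1.0e-12}
print(row_metrics(100.00000001, 100.0, **tolerance)["row_pass"])  # True
print(row_metrics(101.0, 100.0, **tolerance)["row_pass"])         # False
```

With the example tolerance, a reference of 100.0 allows an error of about 1.0e-4, so the combined absolute-plus-relative bound is dominated by the relative term for values of this size.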
CLI Contract
Primary command:
python scripts/compare_extracted_csv.py --metadata references/<feature-id>/<model-id>/metadata.json --actual-root external-results/<feature-id>/<model-id>
Optional filters and report output:
python scripts/compare_extracted_csv.py --metadata references/umat/single-element/metadata.json --actual-root external-results/umat/single-element --quantity stresses --report-json build/reference-verification/umat-single-element.json
Exit codes:
- 0: every requested quantity passed.
- 1: comparison completed and one or more quantities failed.
- 2: invalid CLI arguments, invalid metadata, missing files, or unreadable CSV.
Failure Classification
The script should produce one primary classification per failed quantity:
- missing-reference-artifact: declared reference CSV is absent after metadata validation.
- missing-generated-output: actual CSV under --actual-root is absent.
- schema-mismatch: required columns are missing, duplicate headers exist, or duplicate key rows exist.
- id-mismatch: missing or extra key rows exist.
- unit-or-coordinate-mismatch: matched rows disagree on unit or coordinate system.
- nonfinite-result: reference or actual value is NaN or infinite.
- tolerance-failure: schema, IDs, unit, and coordinate checks pass, but numeric error exceeds tolerance.
- upstream-contract: requested quantity has no comparisons.<quantity> contract.
- environment: file cannot be read due to encoding or OS errors.
Report Contract
Text output should be concise and machine-adjacent:
PASS stresses rows=8 max_abs_error=1.2e-10 max_rel_error=3.0e-9 rms_error=8.1e-11 worst_key=Step-1|1|PART-1-1|1|1||INTEGRATION_POINT|S11
Failed quantity example:
FAIL stresses classification=tolerance-failure rows=8 max_abs_error=2.4e-4 max_rel_error=1.2e-2 rms_error=8.5e-5 worst_key=Step-1|1|PART-1-1|1|1||INTEGRATION_POINT|S11
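One way to produce summary lines of this shape is sketched below. The helper name `format_summary_line` and the `n/a` placeholder for missing metrics are assumptions, not part of the plan. Note that Python's `.1e` format writes at least two exponent digits (`3.0e-09`), so the exact number formatting shown in the examples above would need a custom formatter if it must match character for character.

```python
def format_summary_line(entry: dict) -> str:
    """Render one quantity report entry as a single text line (hypothetical helper)."""
    passed = entry["result"] == "pass"
    parts = ["PASS" if passed else "FAIL", entry["quantity"]]
    if not passed:
        parts.append(f"classification={entry['classification']}")
    parts.append(f"rows={entry['compared_rows']}")
    for name in ("max_abs_error", "max_rel_error", "rms_error"):
        value = entry.get(name)
        # Metrics are None when the comparison stopped before numeric checks.
        parts.append(f"{name}={value:.1e}" if value is not None else f"{name}=n/a")
    parts.append(f"worst_key={entry.get('worst_key') or 'n/a'}")
    return " ".join(parts)


entry = {
    "quantity": "stresses",
    "result": "pass",
    "classification": "N/A",
    "compared_rows": 8,
    "max_abs_error": 1.2e-10,
    "max_rel_error": 3.0e-9,
    "rms_error": 8.1e-11,
    "worst_key": "Step-1|1|PART-1-1|1|1||INTEGRATION_POINT|S11",
}
print(format_summary_line(entry))
```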
JSON report should contain:
{
  "metadata": "references/umat/single-element/metadata.json",
  "actual_root": "external-results/umat/single-element",
  "overall_result": "pass",
  "quantities": [
    {
      "quantity": "stresses",
      "result": "pass",
      "classification": "N/A",
      "compared_rows": 8,
      "missing_rows": 0,
      "extra_rows": 0,
      "max_abs_error": 1.2e-10,
      "max_rel_error": 3.0e-9,
      "rms_error": 8.1e-11,
      "worst_key": "Step-1|1|PART-1-1|1|1||INTEGRATION_POINT|S11",
      "worst_component": "S11"
    }
  ]
}
Task 1: Write Pass-Case Test Fixture
Files:
- Create: scripts/test_compare_extracted_csv.py
- Step 1: Write dynamic import and fixture helpers
import csv
import importlib.util
import json
import tempfile
import unittest
from pathlib import Path


def load_compare_extracted_csv():
    # Load the script by file path so the tests do not depend on package layout.
    module_path = Path(__file__).resolve().parent / "compare_extracted_csv.py"
    spec = importlib.util.spec_from_file_location("compare_extracted_csv", module_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def write_json(path: Path, payload: dict):
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(payload, indent=2), encoding="utf-8")


def write_csv(path: Path, rows: list[dict[str, str]]):
    # The header is taken from the first row, so every row must share its keys.
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0]))
        writer.writeheader()
        writer.writerows(rows)


def metadata_payload() -> dict:
    return {
        "schema_version": "abaqus-user-subroutine-artifact-v1",
        "feature_id": "umat",
        "model_id": "single-element",
        "artifact_status": "ready-for-comparison",
        "abaqus": {"version": "2024", "precision": "double"},
        "compiler": {"vendor": "Intel oneAPI", "name": "ifx", "version": "2024"},
        "subroutine": {"entry_points": ["UMAT"], "source_files": []},
        "input_file": "model.inp",
        "outputs": {
            "tails": {
                "msg": "job.msg.tail.txt",
                "dat": "job.dat.tail.txt",
                "log": "job.log.tail.txt",
                "sta": "job.sta.tail.txt"
            },
            "csv": {"stresses": "extracted/stresses.csv"}
        },
        "extraction": {
            "source_odb": "job.odb",
            "tool": "Abaqus Python",
            "extracted_at": "2026-06-10T00:00:00+09:00",
            "csv_directory": "extracted"
        },
        "comparisons": {
            "stresses": {
                "reference_csv": "extracted/stresses.csv",
                "actual_csv": "extracted/stresses.csv",
                "required_columns": [
                    "step", "frame", "instance", "element_label", "integration_point",
                    "section_point", "output_position", "component",
                    "coordinate_system", "unit", "value"
                ],
                "key_columns": [
                    "step", "frame", "instance", "element_label", "integration_point",
                    "section_point", "output_position", "component"
                ],
                "value_column": "value",
                "unit_column": "unit",
                "coordinate_system_column": "coordinate_system",
                "tolerance": {"absolute": 1.0e-8, "relative": 1.0e-6, "relative_floor": 1.0e-12}
            }
        }
    }


def stress_rows(value: str = "100.0") -> list[dict[str, str]]:
    return [
        {
            "step": "Step-1",
            "frame": "1",
            "instance": "PART-1-1",
            "element_label": "1",
            "integration_point": "1",
            "section_point": "",
            "output_position": "INTEGRATION_POINT",
            "component": "S11",
            "coordinate_system": "GLOBAL",
            "unit": "MPa",
            "value": value
        }
    ]
- Step 2: Write passing comparison test
class CompareExtractedCsvTests(unittest.TestCase):
    def test_quantity_passes_when_schema_keys_units_and_values_match_within_tolerance(self):
        compare = load_compare_extracted_csv()
        with tempfile.TemporaryDirectory() as tmp:
            root = Path(tmp)
            reference = root / "references" / "umat" / "single-element"
            actual = root / "external-results" / "umat" / "single-element"
            write_json(reference / "metadata.json", metadata_payload())
            write_csv(reference / "extracted" / "stresses.csv", stress_rows("100.0"))
            write_csv(actual / "extracted" / "stresses.csv", stress_rows("100.00000001"))
            report = compare.compare_metadata(reference / "metadata.json", actual, quantities=["stresses"], validate_artifacts=False)
        self.assertEqual(report["overall_result"], "pass")
        self.assertEqual(report["quantities"][0]["result"], "pass")
        self.assertEqual(report["quantities"][0]["classification"], "N/A")
        self.assertEqual(report["quantities"][0]["compared_rows"], 1)
- Step 3: Run test to verify RED
Run:
python -m unittest scripts.test_compare_extracted_csv
Expected: FAIL because scripts/compare_extracted_csv.py does not exist.
Task 2: Implement Minimal Pass-Case Comparison
Files:
- Create: scripts/compare_extracted_csv.py
- Step 1: Add importable API skeleton and minimal comparison
Implement these functions:
def compare_metadata(metadata_path: Path, actual_root: Path, *, quantities: list[str] | None = None, validate_artifacts: bool = True) -> dict:
    ...


def load_csv_rows(path: Path) -> tuple[list[str], list[dict[str, str]]]:
    ...


def compare_quantity(quantity: str, contract: dict, reference_root: Path, actual_root: Path) -> dict:
    ...
Minimum behavior for GREEN:
- Load metadata JSON.
- Resolve comparisons.<quantity>.reference_csv under metadata_path.parent.
- Resolve comparisons.<quantity>.actual_csv under actual_root.
- Load both CSV files with csv.DictReader.
- Check required columns are present.
- Match rows by key_columns.
- Parse value_column as finite float.
- Compute max_abs_error, max_rel_error, rms_error, worst_key.
- Return overall_result=pass if no errors exceed tolerance.
- Step 2: Run pass-case test
Run:
python -m unittest scripts.test_compare_extracted_csv
Expected: PASS.
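The minimum GREEN behavior listed in Step 1 of this task can be sketched as below. This is an illustration under stated assumptions, not the planned module: `read_rows` and `compare_rows` are hypothetical names, the sketch reads from in-memory text instead of files, it handles only the pass path (every reference key is assumed to exist in the actual rows), and it assumes `rms_error` is the RMS of absolute errors, which the plan does not spell out.

```python
import csv
import io
import math


def read_rows(handle) -> tuple[list[str], list[dict[str, str]]]:
    """Return (headers, rows) from an open text handle using csv.DictReader."""
    reader = csv.DictReader(handle)
    rows = list(reader)
    return list(reader.fieldnames or []), rows


def compare_rows(reference_rows, actual_rows, contract: dict) -> dict:
    keys = contract["key_columns"]
    value_column = contract["value_column"]
    tol = contract["tolerance"]
    actual_by_key = {tuple(row[c] for c in keys): row for row in actual_rows}
    max_abs = max_rel = sum_sq = 0.0
    worst_key = None
    all_pass = True
    for ref_row in reference_rows:
        key = tuple(ref_row[c] for c in keys)
        reference = float(ref_row[value_column])
        actual = float(actual_by_key[key][value_column])  # pass path only
        if not (math.isfinite(reference) and math.isfinite(actual)):
            raise ValueError(f"nonfinite value at {key}")
        scale = max(abs(reference), tol["relative_floor"])
        abs_err = abs(actual - reference)
        if worst_key is None or abs_err > max_abs:
            worst_key = "|".join(key)
        max_abs = max(max_abs, abs_err)
        max_rel = max(max_rel, abs_err / scale)
        sum_sq += abs_err * abs_err
        all_pass = all_pass and abs_err <= tol["absolute"] + tol["relative"] * scale
    count = len(reference_rows)
    return {
        "result": "pass" if all_pass else "fail",
        "compared_rows": count,
        "max_abs_error": max_abs,
        "max_rel_error": max_rel,
        "rms_error": math.sqrt(sum_sq / count) if count else 0.0,
        "worst_key": worst_key,
    }


header = "step,frame,component,value\n"
_, ref_rows = read_rows(io.StringIO(header + "Step-1,1,S11,100.0\n"))
_, act_rows = read_rows(io.StringIO(header + "Step-1,1,S11,100.00000001\n"))
contract = {
    "key_columns": ["step", "frame", "component"],
    "value_column": "value",
    "tolerance": {"absolute": 1.0e-8, "relative": 1.0e-6, "relative_floor": 1.0e-12},
}
result = compare_rows(ref_rows, act_rows, contract)
print(result["result"], result["compared_rows"], result["worst_key"])  # pass 1 Step-1|1|S11
```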
Task 3: Add Schema and Contract Failure Tests
Files:
- Modify: scripts/test_compare_extracted_csv.py
- Modify: scripts/compare_extracted_csv.py
- Step 1: Add missing actual output test
    def test_missing_actual_csv_is_missing_generated_output(self):
        compare = load_compare_extracted_csv()
        with tempfile.TemporaryDirectory() as tmp:
            root = Path(tmp)
            reference = root / "references" / "umat" / "single-element"
            actual = root / "external-results" / "umat" / "single-element"
            write_json(reference / "metadata.json", metadata_payload())
            write_csv(reference / "extracted" / "stresses.csv", stress_rows("100.0"))
            # No actual CSV is written, so the comparison must report it as missing.
            report = compare.compare_metadata(reference / "metadata.json", actual, quantities=["stresses"], validate_artifacts=False)
        self.assertEqual(report["overall_result"], "fail")
        self.assertEqual(report["quantities"][0]["classification"], "missing-generated-output")
- Step 2: Add missing required column test
    def test_missing_required_column_is_schema_mismatch(self):
        compare = load_compare_extracted_csv()
        with tempfile.TemporaryDirectory() as tmp:
            root = Path(tmp)
            reference = root / "references" / "umat" / "single-element"
            actual = root / "external-results" / "umat" / "single-element"
            write_json(reference / "metadata.json", metadata_payload())
            row = stress_rows("100.0")[0]
            write_csv(reference / "extracted" / "stresses.csv", [row])
            # Drop one required column from the actual CSV only.
            actual_row = dict(row)
            actual_row.pop("coordinate_system")
            write_csv(actual / "extracted" / "stresses.csv", [actual_row])
            report = compare.compare_metadata(reference / "metadata.json", actual, quantities=["stresses"], validate_artifacts=False)
        self.assertEqual(report["quantities"][0]["classification"], "schema-mismatch")
- Step 3: Add missing comparison contract test
    def test_missing_quantity_contract_is_upstream_contract(self):
        compare = load_compare_extracted_csv()
        with tempfile.TemporaryDirectory() as tmp:
            root = Path(tmp)
            reference = root / "references" / "umat" / "single-element"
            actual = root / "external-results" / "umat" / "single-element"
            payload = metadata_payload()
            payload["comparisons"].pop("stresses")
            write_json(reference / "metadata.json", payload)
            report = compare.compare_metadata(reference / "metadata.json", actual, quantities=["stresses"], validate_artifacts=False)
        self.assertEqual(report["quantities"][0]["classification"], "upstream-contract")
- Step 4: Run tests to verify RED
Run:
python -m unittest scripts.test_compare_extracted_csv
Expected: FAIL on the new failure classifications.
- Step 5: Implement missing file, schema, and contract classification
Add helper functions:
def failed_quantity(quantity: str, classification: str, message: str) -> dict:
    ...


def validate_columns(headers: list[str], required_columns: list[str]) -> list[str]:
    ...
Return stable fields even on failure:
{
    "quantity": quantity,
    "result": "fail",
    "classification": classification,
    "message": message,
    "compared_rows": 0,
    "missing_rows": 0,
    "extra_rows": 0,
    "max_abs_error": None,
    "max_rel_error": None,
    "rms_error": None,
    "worst_key": None,
    "worst_component": None
}
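A possible shape for the two helpers is sketched below. The signatures come from this step; the bodies are assumptions. In particular, the plan only says `validate_columns` returns `list[str]`, and this sketch interprets that as a list of human-readable error messages (empty when the header is valid). It relies on `csv.DictReader.fieldnames` preserving repeated header names, so duplicates can be detected from the header list.

```python
def failed_quantity(quantity: str, classification: str, message: str) -> dict:
    """Build a failure entry with the same stable fields as a passing entry."""
    return {
        "quantity": quantity,
        "result": "fail",
        "classification": classification,
        "message": message,
        "compared_rows": 0,
        "missing_rows": 0,
        "extra_rows": 0,
        "max_abs_error": None,
        "max_rel_error": None,
        "rms_error": None,
        "worst_key": None,
        "worst_component": None,
    }


def validate_columns(headers: list[str], required_columns: list[str]) -> list[str]:
    """Return error messages for duplicate headers and missing required columns."""
    errors: list[str] = []
    seen: set[str] = set()
    reported: set[str] = set()
    for header in headers:
        if header in seen and header not in reported:
            errors.append(f"duplicate header: {header}")
            reported.add(header)
        seen.add(header)
    for column in required_columns:
        if column not in seen:
            errors.append(f"missing required column: {column}")
    return errors


errors = validate_columns(["step", "value", "value"], ["step", "unit", "value"])
print(errors)  # ['duplicate header: value', 'missing required column: unit']
if errors:
    entry = failed_quantity("stresses", "schema-mismatch", "; ".join(errors))
    print(entry["classification"])  # schema-mismatch
```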
- Step 6: Run tests to verify GREEN
Run:
python -m unittest scripts.test_compare_extracted_csv
Expected: PASS.
Task 4: Add Row Matching, Unit, Coordinate, Nonfinite, and Tolerance Tests
Files:
- Modify: scripts/test_compare_extracted_csv.py
- Modify: scripts/compare_extracted_csv.py
- Step 1: Add ID mismatch test
Change actual element_label from 1 to 2. Expected classification: id-mismatch, with missing_rows=1 and extra_rows=1.
- Step 2: Add unit mismatch test
Change actual unit from MPa to Pa. Expected classification: unit-or-coordinate-mismatch.
- Step 3: Add coordinate mismatch test
Change actual coordinate_system from GLOBAL to LOCAL-1. Expected classification: unit-or-coordinate-mismatch.
- Step 4: Add nonfinite test
Set actual value to nan. Expected classification: nonfinite-result.
- Step 5: Add tolerance failure test
Set actual value to 101.0 for reference 100.0. Expected classification: tolerance-failure, max_abs_error=1.0, and result=fail.
- Step 6: Run tests to verify RED
Run:
python -m unittest scripts.test_compare_extracted_csv
Expected: FAIL until these classifications are implemented.
- Step 7: Implement row matching and classification precedence
Use this precedence:
missing-reference-artifact
missing-generated-output
upstream-contract
schema-mismatch
id-mismatch
nonfinite-result
unit-or-coordinate-mismatch
tolerance-failure
N/A
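The precedence above can be applied by collecting every classification detected for a quantity and returning the first one in list order. The sketch below is one way to do that; `PRECEDENCE` and `primary_classification` are hypothetical names. The list above does not rank the `environment` classification, so this sketch raises on any name it does not know instead of guessing a position for it.

```python
PRECEDENCE = [
    "missing-reference-artifact",
    "missing-generated-output",
    "upstream-contract",
    "schema-mismatch",
    "id-mismatch",
    "nonfinite-result",
    "unit-or-coordinate-mismatch",
    "tolerance-failure",
]


def primary_classification(detected: set[str]) -> str:
    """Pick the highest-precedence classification, or "N/A" when nothing failed."""
    unknown = detected - set(PRECEDENCE)
    if unknown:
        # Unranked names (for example "environment") need an explicit decision.
        raise ValueError(f"unranked classification(s): {sorted(unknown)}")
    for name in PRECEDENCE:
        if name in detected:
            return name
    return "N/A"


print(primary_classification({"tolerance-failure", "id-mismatch"}))  # id-mismatch
print(primary_classification(set()))                                 # N/A
```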
Implement row keys as:
def make_key(row: dict[str, str], key_columns: list[str]) -> tuple[str, ...]:
    return tuple(row.get(column, "") for column in key_columns)
Detect duplicate keys in either CSV as schema-mismatch.
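Duplicate detection and missing/extra row counting can both be built on `make_key`. The sketch below is illustrative only: `index_rows` and `diff_keys` are hypothetical helpers, and it assumes "missing" means a key present in the reference but absent from the actual CSV, with "extra" being the reverse.

```python
from collections import Counter


def make_key(row: dict[str, str], key_columns: list[str]) -> tuple[str, ...]:
    return tuple(row.get(column, "") for column in key_columns)


def index_rows(rows: list[dict[str, str]], key_columns: list[str]):
    """Return (rows indexed by key, sorted list of keys that occur more than once)."""
    counts = Counter(make_key(row, key_columns) for row in rows)
    duplicates = sorted(key for key, n in counts.items() if n > 1)
    index = {make_key(row, key_columns): row for row in rows}
    return index, duplicates


def diff_keys(reference_index: dict, actual_index: dict):
    """Return (missing, extra) keys relative to the reference."""
    missing = sorted(set(reference_index) - set(actual_index))
    extra = sorted(set(actual_index) - set(reference_index))
    return missing, extra


key_columns = ["step", "element_label"]
reference_rows = [{"step": "Step-1", "element_label": "1", "value": "100.0"}]
actual_rows = [{"step": "Step-1", "element_label": "2", "value": "100.0"}]
reference_index, reference_duplicates = index_rows(reference_rows, key_columns)
actual_index, actual_duplicates = index_rows(actual_rows, key_columns)
missing, extra = diff_keys(reference_index, actual_index)
print(len(missing), len(extra))  # 1 1
```

Under this reading, any non-empty duplicate list maps to schema-mismatch and any non-empty missing or extra list maps to id-mismatch, which matches the ID mismatch test in Step 1.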
- Step 8: Run tests to verify GREEN
Run:
python -m unittest scripts.test_compare_extracted_csv
Expected: PASS.
Task 5: Add CLI and JSON Report Tests
Files:
- Modify: scripts/test_compare_extracted_csv.py
- Modify: scripts/compare_extracted_csv.py
- Step 1: Add CLI pass test using main(argv)
Test:
exit_code = compare.main([
    "--metadata", str(reference / "metadata.json"),
    "--actual-root", str(actual),
    "--quantity", "stresses",
    "--report-json", str(report_json)
])
self.assertEqual(exit_code, 0)
self.assertEqual(json.loads(report_json.read_text(encoding="utf-8"))["overall_result"], "pass")
- Step 2: Add CLI fail test
Use a tolerance failure fixture. Expected main(...) == 1 and JSON overall_result == "fail".
- Step 3: Add CLI invalid argument test
Call without --metadata or --actual-root. Expected main(...) == 2.
- Step 4: Run tests to verify RED
Run:
python -m unittest scripts.test_compare_extracted_csv
Expected: FAIL until CLI exists.
- Step 5: Implement CLI
Implement:
def build_arg_parser() -> argparse.ArgumentParser:
    ...


def main(argv: list[str] | None = None) -> int:
    ...
CLI behavior:
- --quantity may be repeated.
- If no --quantity is supplied, compare all keys under metadata["comparisons"].
- --report-json creates parent directories and writes UTF-8 JSON.
- Print one summary line per quantity.
- Return 0, 1, or 2 according to the CLI contract.
- Step 6: Run tests to verify GREEN
Run:
python -m unittest scripts.test_compare_extracted_csv
Expected: PASS.
Task 6: Optional Artifact Validator Integration
Files:
- Modify: scripts/compare_extracted_csv.py
- Modify: scripts/test_compare_extracted_csv.py
- Step 1: Add test that default comparison calls metadata validation
Use a metadata file missing required ready-for-comparison fields and call compare_metadata(..., validate_artifacts=True). Expected classification: missing-reference-artifact or exit code 2 with validation errors.
- Step 2: Implement reuse of validate_reference_artifacts.validate_metadata
Import safely:
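try:
    # Works when the script is run directly from the scripts/ directory.
    from validate_reference_artifacts import validate_metadata
except ImportError:
    # Works when the module is imported as part of the scripts package.
    from scripts.validate_reference_artifacts import validate_metadata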
Run validation before comparison when validate_artifacts=True.
- Step 3: Keep tests able to bypass validation
Continue supporting validate_artifacts=False in unit tests that only exercise comparison logic.
Task 7: Documentation Updates
Files:
- Modify: docs/reference-verifications/README.md
- Modify: docs/reference-models/README.md
- Step 1: Update reference verification README
Add a section:
## CSV Comparison Command
Run explicit external-result comparison with:
```bash
python scripts/compare_extracted_csv.py --metadata references/<feature-id>/<model-id>/metadata.json --actual-root external-results/<feature-id>/<model-id>
```
The command does not run Abaqus and does not parse ODB files. It compares approved `references/.../extracted/*.csv` files with externally generated actual CSV files under `--actual-root`.
- Step 2: Update reference model README metadata example
Add the comparisons JSON block shown in this plan.
- Step 3: Run documentation-sensitive search
Run:
rg -n "compare_extracted_csv|comparisons|extracted/.*\\.csv" docs scripts
Expected: The new script, tests, and docs mention the comparison contract.
Task 8: Full Verification
Files:
- No new edits unless failures reveal a bug in this task's changes.
- Step 1: Run targeted tests
python -m unittest scripts.test_compare_extracted_csv
Expected: PASS.
- Step 2: Run full script tests
python -m unittest discover -s scripts -p "test_*.py"
Expected: PASS.
- Step 3: Run reference artifact validation
python scripts/validate_reference_artifacts.py
Expected: Reference artifact metadata validation succeeded.
- Step 4: Run Fortran validation
python scripts/validate_fortran.py
Expected: PASS, or the message "No Fortran validation commands configured." when no manifest exists.
- Step 5: Run workspace validation
python scripts/validate_workspace.py
Expected: PASS. It should not require actual CSV outputs because compare_extracted_csv.py is explicit-use only.
Acceptance Criteria
- The project never runs Abaqus and never opens ODB files during CSV comparison.
- Reference bundle completeness remains checked by scripts/validate_reference_artifacts.py.
- CSV numeric validation is performed by explicit command only.
- Actual generated CSVs are read from a user-supplied --actual-root.
- Comparison requires declared schema, key columns, value column, unit/coordinate columns, and absolute/relative tolerance.
- Missing files, schema mismatch, ID mismatch, unit/coordinate mismatch, nonfinite results, and tolerance failures have distinct classifications.
- JSON report includes enough metrics for Reference Verification Agent handoff: compared rows, missing rows, extra rows, max absolute error, max relative error, RMS error, worst key, and pass/fail.
Open Decisions
- Whether actual result bundles should live under a conventional ignored path such as external-results/ or runs/. The script should accept any --actual-root, so this can remain a documentation convention.
- Whether comparison metadata should later move from metadata.json into feature-specific I/O definition documents. For the first implementation, keep executable comparison rules in metadata.json so the script has one deterministic contract source.