modify documents
# Harness Engineering

## Purpose
This document defines how FESA uses long-running agent harnesses for planning, implementation, and evaluation.

The goal is not to maximize agent count. The goal is to keep long solver work coherent, testable, and reference-verified across context resets and independent sessions.

## Default Harness Shape
Use the smallest harness that can safely handle the task.

For meaningful solver implementation or phase execution, use:

```text
Planner -> Generator -> Evaluator
```

Roles:
- `Planner`: turns project docs and `PLAN.md` tasks into a testable sprint contract or phase step.
- `Generator`: implements exactly one accepted contract using TDD.
- `Evaluator`: independently checks the result against the contract, docs, tests, reference artifacts, and validation commands.
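
The handoff between the three roles can be sketched as plain function composition. This is only an illustration under stated assumptions: the `Contract`, `Result`, and `Verdict` types, the stub role functions, and the file path are hypothetical and are not part of FESA.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Contract:
    objective: str
    allowed_files: list[str]


@dataclass
class Result:
    changed_files: list[str]
    tests_passed: bool


@dataclass
class Verdict:
    passed: bool
    findings: list[str] = field(default_factory=list)


def run_harness(
    task: str,
    planner: Callable[[str], Contract],
    generator: Callable[[Contract], Result],
    evaluator: Callable[[Contract, Result], Verdict],
) -> Verdict:
    """Planner -> Generator -> Evaluator, one contract per run."""
    contract = planner(task)
    result = generator(contract)
    # The evaluator sees only the contract and the result, not generator internals.
    return evaluator(contract, result)


# Hypothetical stub roles, used only to make the sketch runnable.
def plan(task: str) -> Contract:
    return Contract(objective=task, allowed_files=["src/parser.py"])


def generate(contract: Contract) -> Result:
    return Result(changed_files=list(contract.allowed_files), tests_passed=True)


def evaluate(contract: Contract, result: Result) -> Verdict:
    findings = [
        f"{path} is outside the allowed files"
        for path in result.changed_files
        if path not in contract.allowed_files
    ]
    if not result.tests_passed:
        findings.append("tests failed")
    return Verdict(passed=not findings, findings=findings)


verdict = run_harness("reject unsupported keyword", plan, generate, evaluate)
print(verdict.passed, verdict.findings)
```

Keeping the evaluator's inputs limited to the contract and the result mirrors the independence requirement: it cannot rely on anything the generator did not hand off explicitly.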

Do not use multi-agent ceremony for tiny documentation edits or obvious mechanical changes. Do use the full harness when a task touches solver behavior, numerical conventions, reference comparison, parser compatibility, result schema, or phase execution.

## Sprint Contract
Every implementation sprint must have a contract before code changes begin.

Recommended location:
- `phases/{phase}/stepN.md` for phase execution.
- `phases/{phase}/contracts/stepN-contract.md` only when a separate negotiation artifact is useful.

Required sections:

````markdown
# Sprint Contract: {name}

## Objective
{one concise outcome}

## Required Reading
- /AGENTS.md
- /PROGRESS.md
- /PLAN.md
- /docs/README.md
- /docs/HARNESS_ENGINEERING.md
- {topic docs}

## Scope
- {what may be changed}

## Allowed Files
- {paths or modules}

## Explicit Non-Goals
- {what must not be done}

## Tests To Write First
- {test files or test cases}

## Reference Artifacts
- {references/*.inp or references/*_displacements.csv, or "none"}

## Acceptance Commands
```bash
python scripts/validate_workspace.py
```

## Evaluator Checklist
- {contract-specific checks}

## Handoff Requirements
- Update PROGRESS.md for completed work.
- Update PLAN.md for future work or changed blockers.
````

Contract quality rules:
- The contract must be testable.
- The contract must identify unsupported Abaqus features rather than expanding support implicitly.
- The contract must state whether reference data is used.
- The contract must name file ownership boundaries to reduce conflicts.
- The contract must not prescribe formulas that are not present in `docs/MITC4_FORMULATION.md` or a cited source.
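
A contract's structure can be checked mechanically before any quality review. The sketch below assumes the contract is a Markdown file whose level-two headings match the required sections listed in the template; the function name `missing_sections` and the sample draft are hypothetical.

```python
import re

# Section names taken from the sprint contract template.
REQUIRED_SECTIONS = [
    "Objective",
    "Required Reading",
    "Scope",
    "Allowed Files",
    "Explicit Non-Goals",
    "Tests To Write First",
    "Reference Artifacts",
    "Acceptance Commands",
    "Evaluator Checklist",
    "Handoff Requirements",
]

# Matches "## Heading" lines only (not "#" or "###").
HEADING = re.compile(r"^##[ \t]+(.+?)[ \t]*$", re.MULTILINE)


def missing_sections(contract_text: str) -> list[str]:
    """Return required section names that have no level-two heading."""
    found = {match.group(1) for match in HEADING.finditer(contract_text)}
    return [name for name in REQUIRED_SECTIONS if name not in found]


draft = """# Sprint Contract: example

## Objective
Reject unsupported keywords with a diagnostic.

## Scope
- parser diagnostics
"""

missing = missing_sections(draft)
print(missing)
```

A check like this only confirms that the sections exist. Whether the contract is testable or names its unsupported features still needs a reader.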

## Generator Rules
The Generator implements one contract at a time.

Required behavior:
- Read the contract and required docs before editing.
- Write or update tests before implementation.
- Keep changes inside allowed files unless the contract is updated first.
- Preserve architecture boundaries from `docs/ARCHITECTURE.md` and `docs/ADR.md`.
- Preserve numerical conventions from `docs/NUMERICAL_CONVENTIONS.md`.
- Run acceptance commands.
- Update `PROGRESS.md` and `PLAN.md` only for factual state changes.
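
The "allowed files" rule lends itself to an automated check. The following is a minimal sketch, assuming the contract lists allowed paths as shell-style glob patterns; the helper name and all paths are hypothetical.

```python
from fnmatch import fnmatchcase


def files_outside_contract(changed_files, allowed_patterns):
    """Return changed paths that match none of the allowed glob patterns."""
    return [
        path
        for path in changed_files
        if not any(fnmatchcase(path, pattern) for pattern in allowed_patterns)
    ]


# Hypothetical contract entries and change set.
allowed = ["src/parser/*", "tests/parser/*"]
changed = [
    "src/parser/keywords.cpp",
    "tests/parser/test_keywords.cpp",
    "src/solver/assembly.cpp",
]

outside = files_outside_contract(changed, allowed)
print(outside)
```

The list of changed paths could come from `git diff --name-only`. A non-empty result means the contract needs updating before the change can proceed.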

Generator failure modes to avoid:
- Broad refactors outside the contract.
- Implementing parser support because a stored reference `.inp` contains unsupported Abaqus features.
- Comparing only reduced vectors when full-vector reaction recovery is required.
- Treating a passing compile as sufficient without tests or reference checks.
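
The reduced-versus-full-vector failure mode is easiest to see on a tiny system. The sketch below uses a single spring with two degrees of freedom and assumes the common relation `RF = K_full * u_full - f_ext`; the sign convention and variable names are assumptions for illustration, and `docs/NUMERICAL_CONVENTIONS.md` remains the authority for FESA.

```python
def matvec(matrix, vector):
    """Dense matrix-vector product on plain lists."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


# One spring of stiffness k: DOF 0 is constrained, DOF 1 is free and loaded.
k = 4.0
K_full = [[k, -k], [-k, k]]
f_ext = [0.0, 2.0]
constrained = [0]

# Reduced solve: only the free DOF appears, so this system has no row for DOF 0.
u_full = [0.0, 0.0]
u_full[1] = f_ext[1] / K_full[1][1]

# Full-vector recovery: the reaction at DOF 0 needs the full stiffness row.
internal = matvec(K_full, u_full)
reactions = [internal[i] - f_ext[i] for i in range(len(f_ext))]
rf_constrained = {i: reactions[i] for i in constrained}
print(rf_constrained)
```

The reduced system alone cannot produce the reaction at the constrained degree of freedom, because that row was eliminated before the solve. Recovering it requires the full stiffness matrix and the full displacement vector.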

## Evaluator Rules
The Evaluator is independent from the Generator.

Evaluation order:
1. Read the sprint contract.
2. Read `AGENTS.md`, `PROGRESS.md`, `PLAN.md`, and the topic docs.
3. Inspect the changed files.
4. Run or review the acceptance commands.
5. Check tests, reference artifacts, and documented conventions.
6. Return pass/fail findings with concrete file references.

The Evaluator must fail the sprint if any of these are true:
- Required tests were not written first or are missing.
- `python scripts/validate_workspace.py` fails without explanation.
- A CRITICAL rule in `AGENTS.md` is violated.
- A change drifts from `docs/ARCHITECTURE.md`, `docs/ADR.md`, or `docs/NUMERICAL_CONVENTIONS.md`.
- `references/*_displacements.csv` comparison is required but not implemented or not checked.
- `RF` is computed from reduced quantities when full-vector recovery is required.
- Unsupported Abaqus features are silently accepted.
- Completed work is not recorded in `PROGRESS.md`, or future tasks are not recorded in `PLAN.md`.
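
Running the acceptance commands and turning a non-zero exit code into a finding can be sketched as follows. The helper name is hypothetical, and the demo uses stand-in commands so that it runs without the FESA workspace.

```python
import subprocess
import sys


def failed_acceptance_commands(commands):
    """Run each command and describe every one that exits non-zero."""
    failures = []
    for command in commands:
        completed = subprocess.run(command, capture_output=True, text=True)
        if completed.returncode != 0:
            failures.append(
                f"{' '.join(command)} exited with {completed.returncode}"
            )
    return failures


# Stand-in commands so the sketch runs anywhere. A real sprint would list
# ["python", "scripts/validate_workspace.py"] from its contract instead.
demo_commands = [
    [sys.executable, "-c", "raise SystemExit(0)"],
    [sys.executable, "-c", "raise SystemExit(3)"],
]

failures = failed_acceptance_commands(demo_commands)
print("fail" if failures else "pass")
```

This covers only the command-based condition in the list above. The other conditions, such as test ordering or drift from the architecture docs, still require inspection of the change itself.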

If the sprint fails, the Evaluator should produce a concise feedback artifact:

```markdown
# Evaluation Feedback: {contract}

## Verdict
fail

## Findings
- {severity}: {file} - {risk}

## Required Fixes
- {minimal fix}

## Verification To Rerun
- {commands}
```
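
The feedback artifact can be rendered from structured findings so that its layout stays consistent across sprints. This is a sketch only: the `Finding` type, the `render_feedback` function, and the example file path are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    severity: str
    file: str
    risk: str


def render_feedback(contract, findings, fixes, commands):
    """Fill the failure feedback template with concrete entries."""
    lines = [f"# Evaluation Feedback: {contract}", "", "## Verdict", "fail", ""]
    lines.append("## Findings")
    lines += [f"- {f.severity}: {f.file} - {f.risk}" for f in findings]
    lines += ["", "## Required Fixes"]
    lines += [f"- {fix}" for fix in fixes]
    lines += ["", "## Verification To Rerun"]
    lines += [f"- {command}" for command in commands]
    return "\n".join(lines) + "\n"


feedback = render_feedback(
    contract="step3",
    findings=[
        Finding(
            severity="CRITICAL",
            file="src/solver/reactions.cpp",  # hypothetical path
            risk="RF computed from reduced quantities",
        )
    ],
    fixes=["Recover RF from full vectors"],
    commands=["python scripts/validate_workspace.py"],
)
print(feedback)
```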

## FESA Evaluation Rubric
Use this rubric for implementation review.

| Criterion | Pass Condition |
|---|---|
| Contract compliance | Changes stay within scope and allowed files |
| Architecture | Domain, AnalysisModel, AnalysisState, DofManager, adapters, and factories follow documented ownership |
| Numerical conventions | DOF order, units, signs, double precision, int64 ids, constrained/free mapping, and full-vector reactions are preserved |
| Reference verification | Stored `references/` artifacts are used when required; CSV column mapping is correct |
| Tests | Tests exist before implementation and cover failure modes, not only happy paths |
| Diagnostics | Unsupported input and singular systems produce actionable diagnostics |
| Results schema | Outputs follow step/frame/field/history and HDF5 schema rules |
| Handoff | `PLAN.md` and `PROGRESS.md` reflect the new state |

## Harness Complexity Policy
Add harness complexity only when it catches real risk.

Use a single agent for:
- small wording changes.
- mechanical docs updates.
- metadata-only corrections.

Use Planner -> Generator -> Evaluator for:
- C++ solver implementation.
- parser behavior changes.
- result schema or HDF5 writer changes.
- reference comparator changes.
- MITC4 formulation-dependent work.
- phase generation or execution.
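
The two lists above amount to a small decision rule, sketched here. The category labels and the function name are hypothetical, and raising an error for an unclassified task is an assumption of this sketch, not a documented FESA policy.

```python
# Hypothetical labels for the task types in the two lists above.
SINGLE_AGENT = {"wording", "docs_mechanical", "metadata"}
FULL_HARNESS = {
    "solver",
    "parser",
    "result_schema",
    "hdf5_writer",
    "reference_comparator",
    "mitc4",
    "phase",
}


def choose_harness(categories):
    """Pick the smallest harness that covers every category of a task."""
    categories = set(categories)
    if categories & FULL_HARNESS:
        # Any risky category escalates the whole task.
        return ["Planner", "Generator", "Evaluator"]
    if categories and categories <= SINGLE_AGENT:
        return ["single agent"]
    # Assumption: unclassified work is escalated to a human decision.
    raise ValueError(f"unclassified task categories: {sorted(categories)}")


print(choose_harness({"wording"}))
print(choose_harness({"wording", "parser"}))
```

A mixed task takes the full harness because one risky category is enough to justify independent evaluation.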

Review the harness periodically. If an agent role no longer adds value, simplify it.