1089 lines
32 KiB
Markdown
1089 lines
32 KiB
Markdown
# Harden Execute Runner Implementation Plan
|
|
|
|
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
|
|
|
|
**Goal:** Make `scripts/execute.py` safe enough to use by enforcing `codex/` branch names, clean starting state, explicit staging, per-step file allowlists, and validation before every runner-created commit.
|
|
|
|
**Architecture:** Keep the runner as a single Python module and add small helper methods to `StepExecutor` instead of introducing a new framework. Add focused `unittest` coverage in `scripts/test_execute.py` using temporary phase directories and mocked subprocess/git calls, then update stale docs only after behavior is covered.
|
|
|
|
**Tech Stack:** Python 3 standard library, `unittest`, `unittest.mock`, Git command-line interface, existing `python scripts/validate_workspace.py`.
|
|
|
|
---
|
|
|
|
## Scope
|
|
|
|
This plan does not use the `harness-workflow` skill. It only prepares `scripts/execute.py` so that using it later is less likely to damage unrelated work.
|
|
|
|
Required behavior:
|
|
- Branches created or used by the runner must use `codex/<phase-name>`.
|
|
- Runner startup must abort on a dirty worktree before branch checkout or Codex execution.
|
|
- Runner commits must never use broad git staging.
|
|
- Each phase step must define an allowlist of paths it may modify.
|
|
- Runner must refuse to commit disallowed file changes.
|
|
- Runner must run Python self-tests and workspace validation before every commit it creates.
|
|
- Commit failure and validation failure must stop the runner instead of only warning.
|
|
|
|
## File Map
|
|
|
|
- Modify `scripts/execute.py`: branch naming, dirty worktree check, allowlist parsing, explicit staging, validation-before-commit, fatal commit failures.
|
|
- Create `scripts/test_execute.py`: unit tests for runner safety behavior.
|
|
- Modify `AGENTS.md`: document `codex/` branch prefix and safer runner requirements.
|
|
- Modify `docs/ARCHITECTURE.md`: update Harness execution layer notes if they mention unsafe staging or old branch behavior.
|
|
- Modify `.codex/skills/harness-workflow/SKILL.md`: update stale documentation that says the runner creates the legacy feat task branch. Do not invoke the skill while making this edit.
|
|
|
|
## Data Contract
|
|
|
|
Extend each step object in `phases/<phase-dir>/index.json` with `allowed_paths`.
|
|
|
|
```json
|
|
{
|
|
"step": 1,
|
|
"name": "Update docs",
|
|
"status": "pending",
|
|
"summary": "",
|
|
"allowed_paths": [
|
|
"docs/*.md",
|
|
"docs/**/*.md",
|
|
"AGENTS.md"
|
|
]
|
|
}
|
|
```
|
|
|
|
Allowlist semantics:
|
|
- Paths are repository-relative and normalized to forward slashes.
|
|
- Exact file path matches are allowed.
|
|
- Directory prefix entries ending with `/` allow every path under that directory.
|
|
- Glob entries using `*`, `?`, or `[` are matched with `fnmatch.fnmatchcase`.
|
|
- Runner-managed housekeeping files are always allowed separately:
|
|
- `phases/<phase-dir>/index.json`
|
|
- `phases/<phase-dir>/step<step-num>-output.json`
|
|
- `phases/index.json`
|
|
|
|
## Task 1: Test Harness for `scripts/execute.py`
|
|
|
|
**Files:**
|
|
- Create: `scripts/test_execute.py`
|
|
- Read: `scripts/execute.py`
|
|
|
|
- [ ] **Step 1: Add loader and temporary phase helpers**
|
|
|
|
Create `scripts/test_execute.py` with this initial content:
|
|
|
|
```python
|
|
import importlib.util
|
|
import json
|
|
import subprocess
|
|
import tempfile
|
|
import unittest
|
|
from pathlib import Path
|
|
from unittest.mock import patch
|
|
|
|
|
|
def load_execute():
|
|
module_path = Path(__file__).resolve().parent / "execute.py"
|
|
spec = importlib.util.spec_from_file_location("execute", module_path)
|
|
module = importlib.util.module_from_spec(spec)
|
|
spec.loader.exec_module(module)
|
|
return module
|
|
|
|
|
|
def write_phase(root: Path, phase_dir: str = "0-mvp", phase_name: str = "0-mvp", steps=None):
|
|
phase_path = root / "phases" / phase_dir
|
|
phase_path.mkdir(parents=True)
|
|
if steps is None:
|
|
steps = [
|
|
{
|
|
"step": 1,
|
|
"name": "Docs",
|
|
"status": "pending",
|
|
"summary": "",
|
|
"allowed_paths": ["docs/*.md"],
|
|
}
|
|
]
|
|
(phase_path / "index.json").write_text(
|
|
json.dumps({"project": "FESA", "phase": phase_name, "steps": steps}, indent=2),
|
|
encoding="utf-8",
|
|
)
|
|
(phase_path / "step1.md").write_text("# Step 1\n", encoding="utf-8")
|
|
return phase_path
|
|
|
|
|
|
def make_executor(execute, root: Path, phase_dir: str = "0-mvp"):
|
|
with patch.object(execute, "ROOT", root):
|
|
return execute.StepExecutor(phase_dir)
|
|
|
|
|
|
class ExecuteRunnerSafetyTests(unittest.TestCase):
|
|
pass
|
|
|
|
|
|
if __name__ == "__main__":
|
|
unittest.main()
|
|
```
|
|
|
|
- [ ] **Step 2: Run the new empty test module**
|
|
|
|
Run:
|
|
|
|
```powershell
|
|
python -m unittest scripts.test_execute
|
|
```
|
|
|
|
Expected: pass with one empty test class loaded and no failures.
|
|
|
|
- [ ] **Step 3: Commit the test scaffold**
|
|
|
|
Run:
|
|
|
|
```powershell
|
|
git add scripts/test_execute.py
|
|
git commit -m "test: add execute runner safety test scaffold"
|
|
```
|
|
|
|
## Task 2: `codex/` Branch Prefix
|
|
|
|
**Files:**
|
|
- Modify: `scripts/test_execute.py`
|
|
- Modify: `scripts/execute.py`
|
|
|
|
- [ ] **Step 1: Add failing tests for branch naming and push branch**
|
|
|
|
Add these methods to `ExecuteRunnerSafetyTests`:
|
|
|
|
```python
|
|
def test_branch_name_uses_codex_prefix_and_sanitized_phase(self):
|
|
execute = load_execute()
|
|
with tempfile.TemporaryDirectory() as tmp:
|
|
root = Path(tmp)
|
|
write_phase(root, phase_name="linear truss/1d")
|
|
executor = make_executor(execute, root)
|
|
|
|
self.assertEqual(executor._branch_name(), "codex/linear-truss-1d")
|
|
|
|
def test_finalize_push_uses_codex_branch_name(self):
|
|
execute = load_execute()
|
|
with tempfile.TemporaryDirectory() as tmp:
|
|
root = Path(tmp)
|
|
write_phase(root, phase_name="0-mvp")
|
|
executor = make_executor(execute, root)
|
|
executor._auto_push = True
|
|
calls = []
|
|
|
|
def fake_git(*args):
|
|
calls.append(args)
|
|
if args == ("diff", "--cached", "--quiet"):
|
|
return subprocess.CompletedProcess(args, 0, "", "")
|
|
return subprocess.CompletedProcess(args, 0, "", "")
|
|
|
|
with patch.object(executor, "_run_git", side_effect=fake_git):
|
|
with patch.object(executor, "_validate_before_commit", create=True):
|
|
executor._finalize()
|
|
|
|
self.assertIn(("push", "-u", "origin", "codex/0-mvp"), calls)
|
|
```
|
|
|
|
- [ ] **Step 2: Run tests and verify failure**
|
|
|
|
Run:
|
|
|
|
```powershell
|
|
python -m unittest scripts.test_execute -v
|
|
```
|
|
|
|
Expected: fail because `StepExecutor` has no `_branch_name()` and `_finalize()` still uses `feat-`.
|
|
|
|
- [ ] **Step 3: Implement branch naming**
|
|
|
|
In `scripts/execute.py`, import `re` near the other imports:
|
|
|
|
```python
|
|
import re
|
|
```
|
|
|
|
Add this method to `StepExecutor` under the `# --- git ---` section:
|
|
|
|
```python
|
|
def _branch_name(self) -> str:
|
|
slug = re.sub(r"[^A-Za-z0-9._-]+", "-", self._phase_name.strip())
|
|
slug = slug.strip("/.-")
|
|
if not slug:
|
|
slug = self._phase_dir_name
|
|
return f"codex/{slug}"
|
|
```
|
|
|
|
Replace both direct branch constructions:
|
|
|
|
```python
|
|
branch = self._branch_name()
|
|
```
|
|
|
|
with:
|
|
|
|
```python
|
|
branch = self._branch_name()
|
|
```
|
|
|
|
- [ ] **Step 4: Run targeted tests**
|
|
|
|
Run:
|
|
|
|
```powershell
|
|
python -m unittest scripts.test_execute -v
|
|
```
|
|
|
|
Expected: pass.
|
|
|
|
- [ ] **Step 5: Commit**
|
|
|
|
Run:
|
|
|
|
```powershell
|
|
git add scripts/test_execute.py scripts/execute.py
|
|
git commit -m "fix: use codex branch prefix in execute runner"
|
|
```
|
|
|
|
## Task 3: Dirty Worktree Protection
|
|
|
|
**Files:**
|
|
- Modify: `scripts/test_execute.py`
|
|
- Modify: `scripts/execute.py`
|
|
|
|
- [ ] **Step 1: Add failing tests for dirty worktree guard**
|
|
|
|
Add these methods:
|
|
|
|
```python
|
|
def test_assert_clean_worktree_exits_when_git_status_has_changes(self):
|
|
execute = load_execute()
|
|
with tempfile.TemporaryDirectory() as tmp:
|
|
root = Path(tmp)
|
|
write_phase(root)
|
|
executor = make_executor(execute, root)
|
|
|
|
with patch.object(
|
|
executor,
|
|
"_run_git",
|
|
return_value=subprocess.CompletedProcess([], 0, " M AGENTS.md\n?? scratch.txt\n", ""),
|
|
):
|
|
with self.assertRaises(SystemExit) as cm:
|
|
executor._assert_clean_worktree("before checkout")
|
|
|
|
self.assertEqual(cm.exception.code, 1)
|
|
|
|
def test_run_checks_clean_worktree_before_checkout(self):
|
|
execute = load_execute()
|
|
with tempfile.TemporaryDirectory() as tmp:
|
|
root = Path(tmp)
|
|
write_phase(root)
|
|
executor = make_executor(execute, root)
|
|
calls = []
|
|
|
|
def record(name):
|
|
def inner(*args, **kwargs):
|
|
calls.append(name)
|
|
return inner
|
|
|
|
with patch.object(executor, "_assert_clean_worktree", side_effect=record("clean")):
|
|
with patch.object(executor, "_checkout_branch", side_effect=record("checkout")):
|
|
with patch.object(executor, "_print_header"):
|
|
with patch.object(executor, "_check_blockers"):
|
|
with patch.object(executor, "_load_guardrails", return_value=""):
|
|
with patch.object(executor, "_ensure_created_at"):
|
|
with patch.object(executor, "_execute_all_steps"):
|
|
with patch.object(executor, "_finalize"):
|
|
executor.run()
|
|
|
|
self.assertLess(calls.index("clean"), calls.index("checkout"))
|
|
```
|
|
|
|
- [ ] **Step 2: Run tests and verify failure**
|
|
|
|
Run:
|
|
|
|
```powershell
|
|
python -m unittest scripts.test_execute -v
|
|
```
|
|
|
|
Expected: fail because `_assert_clean_worktree()` does not exist and `run()` does not call it.
|
|
|
|
- [ ] **Step 3: Implement dirty worktree guard**
|
|
|
|
Add this method under `# --- git ---`:
|
|
|
|
```python
|
|
def _assert_clean_worktree(self, context: str):
|
|
r = self._run_git("status", "--porcelain")
|
|
if r.returncode != 0:
|
|
print(" ERROR: git status failed.")
|
|
print(f" {r.stderr.strip()}")
|
|
sys.exit(1)
|
|
dirty = r.stdout.strip()
|
|
if dirty:
|
|
print(f" ERROR: dirty worktree detected {context}.")
|
|
print(" Commit, stash, or remove these changes before running scripts/execute.py:")
|
|
for line in dirty.splitlines():
|
|
print(f" {line}")
|
|
sys.exit(1)
|
|
```
|
|
|
|
Change `run()` so the guard runs before branch checkout:
|
|
|
|
```python
|
|
self._print_header()
|
|
self._check_blockers()
|
|
self._assert_clean_worktree("before branch checkout")
|
|
self._checkout_branch()
|
|
```
|
|
|
|
- [ ] **Step 4: Run targeted tests**
|
|
|
|
Run:
|
|
|
|
```powershell
|
|
python -m unittest scripts.test_execute -v
|
|
```
|
|
|
|
Expected: pass.
|
|
|
|
- [ ] **Step 5: Commit**
|
|
|
|
Run:
|
|
|
|
```powershell
|
|
git add scripts/test_execute.py scripts/execute.py
|
|
git commit -m "fix: block execute runner on dirty worktree"
|
|
```
|
|
|
|
## Task 4: Per-Step File Allowlist
|
|
|
|
**Files:**
|
|
- Modify: `scripts/test_execute.py`
|
|
- Modify: `scripts/execute.py`
|
|
|
|
- [ ] **Step 1: Add failing tests for allowlist matching and missing allowlist**
|
|
|
|
Add these methods:
|
|
|
|
```python
|
|
def test_step_allowlist_accepts_exact_prefix_and_glob_paths(self):
|
|
execute = load_execute()
|
|
with tempfile.TemporaryDirectory() as tmp:
|
|
root = Path(tmp)
|
|
write_phase(root)
|
|
executor = make_executor(execute, root)
|
|
patterns = ["AGENTS.md", "docs/", "scripts/*.py"]
|
|
|
|
self.assertTrue(executor._path_allowed("AGENTS.md", patterns))
|
|
self.assertTrue(executor._path_allowed("docs/PRD.md", patterns))
|
|
self.assertTrue(executor._path_allowed("scripts/execute.py", patterns))
|
|
self.assertFalse(executor._path_allowed(".codex/hooks.json", patterns))
|
|
|
|
def test_step_without_allowed_paths_is_rejected_before_codex_invocation(self):
|
|
execute = load_execute()
|
|
steps = [{"step": 1, "name": "Unsafe", "status": "pending", "summary": ""}]
|
|
with tempfile.TemporaryDirectory() as tmp:
|
|
root = Path(tmp)
|
|
write_phase(root, steps=steps)
|
|
executor = make_executor(execute, root)
|
|
|
|
with self.assertRaises(SystemExit) as cm:
|
|
executor._validate_step_allowlist(steps[0])
|
|
|
|
self.assertEqual(cm.exception.code, 1)
|
|
```
|
|
|
|
- [ ] **Step 2: Run tests and verify failure**
|
|
|
|
Run:
|
|
|
|
```powershell
|
|
python -m unittest scripts.test_execute -v
|
|
```
|
|
|
|
Expected: fail because allowlist helpers do not exist.
|
|
|
|
- [ ] **Step 3: Implement allowlist helpers**
|
|
|
|
Add `fnmatch` import:
|
|
|
|
```python
|
|
import fnmatch
|
|
```
|
|
|
|
Add these methods under `# --- git ---` or a new `# --- file allowlist ---` section:
|
|
|
|
```python
|
|
@staticmethod
|
|
def _normalize_rel_path(path: str) -> str:
|
|
return path.replace("\\", "/").lstrip("./")
|
|
|
|
def _path_allowed(self, path: str, patterns: list[str]) -> bool:
|
|
rel = self._normalize_rel_path(path)
|
|
for raw in patterns:
|
|
pattern = self._normalize_rel_path(str(raw))
|
|
if not pattern:
|
|
continue
|
|
if pattern.endswith("/") and rel.startswith(pattern):
|
|
return True
|
|
if any(ch in pattern for ch in "*?[") and fnmatch.fnmatchcase(rel, pattern):
|
|
return True
|
|
if rel == pattern:
|
|
return True
|
|
return False
|
|
|
|
def _validate_step_allowlist(self, step: dict):
|
|
allowed = step.get("allowed_paths")
|
|
if not isinstance(allowed, list) or not allowed or not all(isinstance(p, str) and p.strip() for p in allowed):
|
|
print(f" ERROR: Step {step.get('step')} must define non-empty allowed_paths.")
|
|
sys.exit(1)
|
|
```
|
|
|
|
Call `_validate_step_allowlist(pending)` before `_execute_single_step(pending, guardrails)` in `_execute_all_steps()`.
|
|
|
|
- [ ] **Step 4: Add allowlist to prompt context**
|
|
|
|
In `_build_preamble`, add a parameter:
|
|
|
|
```python
|
|
def _build_preamble(self, guardrails: str, step_context: str,
|
|
allowed_paths: list[str],
|
|
prev_error: Optional[str] = None) -> str:
|
|
```
|
|
|
|
Add this block before `## 작업 규칙`:
|
|
|
|
```python
|
|
f"## Step 파일 allowlist\n\n"
|
|
f"이 step은 아래 repository-relative path만 수정할 수 있습니다.\n"
|
|
f"{chr(10).join(f'- {p}' for p in allowed_paths)}\n\n"
|
|
```
|
|
|
|
Update the caller in `_execute_single_step()`:
|
|
|
|
```python
|
|
preamble = self._build_preamble(guardrails, step_context, step.get("allowed_paths", []), prev_error)
|
|
```
|
|
|
|
- [ ] **Step 5: Run targeted tests**
|
|
|
|
Run:
|
|
|
|
```powershell
|
|
python -m unittest scripts.test_execute -v
|
|
```
|
|
|
|
Expected: pass.
|
|
|
|
- [ ] **Step 6: Commit**
|
|
|
|
Run:
|
|
|
|
```powershell
|
|
git add scripts/test_execute.py scripts/execute.py
|
|
git commit -m "fix: require execute step file allowlists"
|
|
```
|
|
|
|
## Task 5: Detect and Block Disallowed Step Changes
|
|
|
|
**Files:**
|
|
- Modify: `scripts/test_execute.py`
|
|
- Modify: `scripts/execute.py`
|
|
|
|
- [ ] **Step 1: Add failing tests for changed path classification**
|
|
|
|
Add this method:
|
|
|
|
```python
|
|
def test_classify_step_changes_splits_allowed_housekeeping_and_disallowed_paths(self):
|
|
execute = load_execute()
|
|
step = {
|
|
"step": 1,
|
|
"name": "Docs",
|
|
"status": "completed",
|
|
"summary": "",
|
|
"allowed_paths": ["docs/*.md"],
|
|
}
|
|
with tempfile.TemporaryDirectory() as tmp:
|
|
root = Path(tmp)
|
|
write_phase(root)
|
|
executor = make_executor(execute, root)
|
|
changed = [
|
|
"docs/PRD.md",
|
|
"phases/0-mvp/index.json",
|
|
"phases/0-mvp/step1-output.json",
|
|
"scripts/execute.py",
|
|
]
|
|
|
|
allowed, housekeeping, disallowed = executor._classify_step_changes(1, step, changed)
|
|
|
|
self.assertEqual(allowed, ["docs/PRD.md"])
|
|
self.assertEqual(housekeeping, ["phases/0-mvp/index.json", "phases/0-mvp/step1-output.json"])
|
|
self.assertEqual(disallowed, ["scripts/execute.py"])
|
|
```
|
|
|
|
- [ ] **Step 2: Run tests and verify failure**
|
|
|
|
Run:
|
|
|
|
```powershell
|
|
python -m unittest scripts.test_execute -v
|
|
```
|
|
|
|
Expected: fail because `_classify_step_changes()` does not exist.
|
|
|
|
- [ ] **Step 3: Implement changed path collection and classification**
|
|
|
|
Add these methods:
|
|
|
|
```python
|
|
def _changed_paths(self) -> list[str]:
|
|
paths: list[str] = []
|
|
tracked = self._run_git("diff", "--name-only")
|
|
if tracked.returncode != 0:
|
|
print(" ERROR: git diff --name-only failed.")
|
|
print(f" {tracked.stderr.strip()}")
|
|
sys.exit(1)
|
|
paths.extend(tracked.stdout.splitlines())
|
|
|
|
staged = self._run_git("diff", "--cached", "--name-only")
|
|
if staged.returncode != 0:
|
|
print(" ERROR: git diff --cached --name-only failed.")
|
|
print(f" {staged.stderr.strip()}")
|
|
sys.exit(1)
|
|
paths.extend(staged.stdout.splitlines())
|
|
|
|
untracked = self._run_git("ls-files", "--others", "--exclude-standard")
|
|
if untracked.returncode != 0:
|
|
print(" ERROR: git ls-files --others failed.")
|
|
print(f" {untracked.stderr.strip()}")
|
|
sys.exit(1)
|
|
paths.extend(untracked.stdout.splitlines())
|
|
|
|
return sorted({self._normalize_rel_path(p) for p in paths if p.strip()})
|
|
|
|
def _housekeeping_paths(self, step_num: int) -> set[str]:
|
|
return {
|
|
f"phases/{self._phase_dir_name}/index.json",
|
|
f"phases/{self._phase_dir_name}/step{step_num}-output.json",
|
|
"phases/index.json",
|
|
}
|
|
|
|
def _classify_step_changes(self, step_num: int, step: dict, changed_paths: list[str]) -> tuple[list[str], list[str], list[str]]:
|
|
allowed_patterns = step.get("allowed_paths", [])
|
|
housekeeping_set = self._housekeeping_paths(step_num)
|
|
allowed: list[str] = []
|
|
housekeeping: list[str] = []
|
|
disallowed: list[str] = []
|
|
for path in changed_paths:
|
|
rel = self._normalize_rel_path(path)
|
|
if rel in housekeeping_set:
|
|
housekeeping.append(rel)
|
|
elif self._path_allowed(rel, allowed_patterns):
|
|
allowed.append(rel)
|
|
else:
|
|
disallowed.append(rel)
|
|
return allowed, housekeeping, disallowed
|
|
```
|
|
|
|
- [ ] **Step 4: Run targeted tests**
|
|
|
|
Run:
|
|
|
|
```powershell
|
|
python -m unittest scripts.test_execute -v
|
|
```
|
|
|
|
Expected: pass.
|
|
|
|
- [ ] **Step 5: Commit**
|
|
|
|
Run:
|
|
|
|
```powershell
|
|
git add scripts/test_execute.py scripts/execute.py
|
|
git commit -m "fix: classify execute step file changes"
|
|
```
|
|
|
|
## Task 6: Remove Broad Staging and Stage Explicit Paths Only
|
|
|
|
**Files:**
|
|
- Modify: `scripts/test_execute.py`
|
|
- Modify: `scripts/execute.py`
|
|
|
|
- [ ] **Step 1: Add failing test that commit path never uses broad staging**
|
|
|
|
Add this method:
|
|
|
|
```python
|
|
def test_commit_step_stages_only_explicit_allowed_and_housekeeping_paths(self):
|
|
execute = load_execute()
|
|
step = {
|
|
"step": 1,
|
|
"name": "Docs",
|
|
"status": "completed",
|
|
"summary": "",
|
|
"allowed_paths": ["docs/*.md"],
|
|
}
|
|
with tempfile.TemporaryDirectory() as tmp:
|
|
root = Path(tmp)
|
|
write_phase(root)
|
|
executor = make_executor(execute, root)
|
|
calls = []
|
|
|
|
def fake_git(*args):
|
|
calls.append(args)
|
|
if args in {
|
|
("diff", "--quiet", "--cached", "--"),
|
|
("diff", "--cached", "--quiet"),
|
|
}:
|
|
return subprocess.CompletedProcess(args, 1, "", "")
|
|
return subprocess.CompletedProcess(args, 0, "", "")
|
|
|
|
with patch.object(executor, "_changed_paths", return_value=["docs/PRD.md", "phases/0-mvp/index.json", "phases/0-mvp/step1-output.json"]):
|
|
with patch.object(executor, "_run_git", side_effect=fake_git):
|
|
with patch.object(executor, "_validate_before_commit"):
|
|
executor._commit_step(step, "Docs")
|
|
|
|
self.assertNotIn(("add", "-A"), calls)
|
|
self.assertIn(("add", "--", "docs/PRD.md"), calls)
|
|
self.assertIn(("add", "--", "phases/0-mvp/index.json", "phases/0-mvp/step1-output.json"), calls)
|
|
```
|
|
|
|
- [ ] **Step 2: Run tests and verify failure**
|
|
|
|
Run:
|
|
|
|
```powershell
|
|
python -m unittest scripts.test_execute -v
|
|
```
|
|
|
|
Expected: fail because `_commit_step()` still accepts `(step_num, step_name)` and uses broad staging.
|
|
|
|
- [ ] **Step 3: Implement explicit staging helper**
|
|
|
|
Add this method:
|
|
|
|
```python
|
|
def _stage_paths(self, paths: list[str]):
|
|
if not paths:
|
|
return
|
|
r = self._run_git("add", "--", *paths)
|
|
if r.returncode != 0:
|
|
print(" ERROR: git add failed.")
|
|
print(f" {r.stderr.strip()}")
|
|
sys.exit(1)
|
|
```
|
|
|
|
Change `_commit_step` signature:
|
|
|
|
```python
|
|
def _commit_step(self, step: dict, step_name: str):
|
|
step_num = step["step"]
|
|
```
|
|
|
|
Replace the old broad staging logic with:
|
|
|
|
```python
|
|
changed = self._changed_paths()
|
|
allowed, housekeeping, disallowed = self._classify_step_changes(step_num, step, changed)
|
|
if disallowed:
|
|
print(f" ERROR: Step {step_num} modified files outside allowed_paths:")
|
|
for path in disallowed:
|
|
print(f" {path}")
|
|
sys.exit(1)
|
|
|
|
if allowed:
|
|
msg = self.FEAT_MSG.format(phase=self._phase_name, num=step_num, name=step_name)
|
|
self._validate_before_commit(msg)
|
|
self._stage_paths(allowed)
|
|
if self._run_git("diff", "--cached", "--quiet").returncode != 0:
|
|
r = self._run_git("commit", "-m", msg)
|
|
if r.returncode != 0:
|
|
print(f" ERROR: 코드 커밋 실패: {r.stderr.strip()}")
|
|
sys.exit(1)
|
|
print(f" Commit: {msg}")
|
|
|
|
if housekeeping:
|
|
msg = self.CHORE_MSG.format(phase=self._phase_name, num=step_num)
|
|
self._validate_before_commit(msg)
|
|
self._stage_paths(housekeeping)
|
|
if self._run_git("diff", "--cached", "--quiet").returncode != 0:
|
|
r = self._run_git("commit", "-m", msg)
|
|
if r.returncode != 0:
|
|
print(f" ERROR: housekeeping 커밋 실패: {r.stderr.strip()}")
|
|
sys.exit(1)
|
|
```
|
|
|
|
Update caller:
|
|
|
|
```python
|
|
self._commit_step(step, step_name)
|
|
```
|
|
|
|
- [ ] **Step 4: Run targeted tests**
|
|
|
|
Run:
|
|
|
|
```powershell
|
|
python -m unittest scripts.test_execute -v
|
|
```
|
|
|
|
Expected: fail only if `_validate_before_commit()` is not implemented yet. If so, continue to Task 7 before expecting full pass.
|
|
|
|
## Task 7: Validation Before Commit
|
|
|
|
**Files:**
|
|
- Modify: `scripts/test_execute.py`
|
|
- Modify: `scripts/execute.py`
|
|
|
|
- [ ] **Step 1: Add failing tests for validation command behavior**
|
|
|
|
Add these methods:
|
|
|
|
```python
|
|
def test_validate_before_commit_runs_python_selftest_then_workspace_validation(self):
|
|
execute = load_execute()
|
|
with tempfile.TemporaryDirectory() as tmp:
|
|
root = Path(tmp)
|
|
write_phase(root)
|
|
executor = make_executor(execute, root)
|
|
commands = []
|
|
|
|
def fake_run(cmd, **kwargs):
|
|
commands.append(cmd)
|
|
return subprocess.CompletedProcess(cmd, 0, "ok", "")
|
|
|
|
with patch.object(execute.subprocess, "run", side_effect=fake_run):
|
|
executor._validate_before_commit("feat(0-mvp): step 1")
|
|
|
|
self.assertEqual(
|
|
commands,
|
|
[
|
|
[sys.executable, "-m", "unittest", "discover", "-s", "scripts", "-p", "test_*.py"],
|
|
[sys.executable, "scripts/validate_workspace.py"],
|
|
],
|
|
)
|
|
|
|
def test_validate_before_commit_exits_before_commit_when_validation_fails(self):
|
|
execute = load_execute()
|
|
with tempfile.TemporaryDirectory() as tmp:
|
|
root = Path(tmp)
|
|
write_phase(root)
|
|
executor = make_executor(execute, root)
|
|
|
|
def fake_run(cmd, **kwargs):
|
|
return subprocess.CompletedProcess(cmd, 1, "bad", "failed")
|
|
|
|
with patch.object(execute.subprocess, "run", side_effect=fake_run):
|
|
with self.assertRaises(SystemExit) as cm:
|
|
executor._validate_before_commit("feat(0-mvp): step 1")
|
|
|
|
self.assertEqual(cm.exception.code, 1)
|
|
```
|
|
|
|
Add `import sys` to `scripts/test_execute.py`.
|
|
|
|
- [ ] **Step 2: Run tests and verify failure**
|
|
|
|
Run:
|
|
|
|
```powershell
|
|
python -m unittest scripts.test_execute -v
|
|
```
|
|
|
|
Expected: fail because `_validate_before_commit()` does not exist.
|
|
|
|
- [ ] **Step 3: Implement validation-before-commit**
|
|
|
|
Add this class constant near `MAX_RETRIES`:
|
|
|
|
```python
|
|
VALIDATION_COMMANDS = (
|
|
[sys.executable, "-m", "unittest", "discover", "-s", "scripts", "-p", "test_*.py"],
|
|
[sys.executable, "scripts/validate_workspace.py"],
|
|
)
|
|
```
|
|
|
|
Add this method:
|
|
|
|
```python
|
|
def _validate_before_commit(self, commit_message: str):
|
|
print(f" Validation before commit: {commit_message}")
|
|
for cmd in self.VALIDATION_COMMANDS:
|
|
r = subprocess.run(cmd, cwd=self._root, capture_output=True, text=True)
|
|
if r.returncode != 0:
|
|
print(f" ERROR: validation failed before commit: {' '.join(cmd)}")
|
|
if r.stdout:
|
|
print(r.stdout[-2000:])
|
|
if r.stderr:
|
|
print(r.stderr[-2000:])
|
|
sys.exit(1)
|
|
```
|
|
|
|
- [ ] **Step 4: Run targeted tests**
|
|
|
|
Run:
|
|
|
|
```powershell
|
|
python -m unittest scripts.test_execute -v
|
|
```
|
|
|
|
Expected: pass.
|
|
|
|
- [ ] **Step 5: Commit**
|
|
|
|
Run:
|
|
|
|
```powershell
|
|
git add scripts/test_execute.py scripts/execute.py
|
|
git commit -m "fix: validate execute runner before commits"
|
|
```
|
|
|
|
## Task 8: Finalize Without Broad Staging
|
|
|
|
**Files:**
|
|
- Modify: `scripts/test_execute.py`
|
|
- Modify: `scripts/execute.py`
|
|
|
|
- [ ] **Step 1: Add failing test for final commit explicit staging**
|
|
|
|
Add this method:
|
|
|
|
```python
|
|
def test_finalize_stages_only_phase_indexes(self):
|
|
execute = load_execute()
|
|
with tempfile.TemporaryDirectory() as tmp:
|
|
root = Path(tmp)
|
|
write_phase(root)
|
|
(root / "phases" / "index.json").write_text('{"phases":[]}', encoding="utf-8")
|
|
executor = make_executor(execute, root)
|
|
calls = []
|
|
|
|
def fake_git(*args):
|
|
calls.append(args)
|
|
if args == ("diff", "--cached", "--quiet"):
|
|
return subprocess.CompletedProcess(args, 1, "", "")
|
|
return subprocess.CompletedProcess(args, 0, "", "")
|
|
|
|
with patch.object(executor, "_run_git", side_effect=fake_git):
|
|
with patch.object(executor, "_validate_before_commit"):
|
|
executor._finalize()
|
|
|
|
self.assertNotIn(("add", "-A"), calls)
|
|
self.assertIn(("add", "--", "phases/0-mvp/index.json", "phases/index.json"), calls)
|
|
```
|
|
|
|
- [ ] **Step 2: Run tests and verify failure**
|
|
|
|
Run:
|
|
|
|
```powershell
|
|
python -m unittest scripts.test_execute -v
|
|
```
|
|
|
|
Expected: fail because `_finalize()` still uses broad staging.
|
|
|
|
- [ ] **Step 3: Replace final staging**
|
|
|
|
In `_finalize()`, replace:
|
|
|
|
```python
|
|
self._run_git("add", "-A")
|
|
```
|
|
|
|
with:
|
|
|
|
```python
|
|
final_paths = [f"phases/{self._phase_dir_name}/index.json"]
|
|
if self._top_index_file.exists():
|
|
final_paths.append("phases/index.json")
|
|
self._validate_before_commit(f"chore({self._phase_name}): mark phase completed")
|
|
self._stage_paths(final_paths)
|
|
```
|
|
|
|
Keep the existing cached diff check and commit message, but change commit failure from silent no-op to fatal:
|
|
|
|
```python
|
|
if r.returncode == 0:
|
|
print(f" ✓ {msg}")
|
|
else:
|
|
print(f" ERROR: phase completion commit failed: {r.stderr.strip()}")
|
|
sys.exit(1)
|
|
```
|
|
|
|
- [ ] **Step 4: Run targeted tests**
|
|
|
|
Run:
|
|
|
|
```powershell
|
|
python -m unittest scripts.test_execute -v
|
|
```
|
|
|
|
Expected: pass.
|
|
|
|
- [ ] **Step 5: Commit**
|
|
|
|
Run:
|
|
|
|
```powershell
|
|
git add scripts/test_execute.py scripts/execute.py
|
|
git commit -m "fix: stage execute finalization explicitly"
|
|
```
|
|
|
|
## Task 9: Documentation and Existing Contract Tests
|
|
|
|
**Files:**
|
|
- Modify: `AGENTS.md`
|
|
- Modify: `docs/ARCHITECTURE.md`
|
|
- Modify: `.codex/skills/harness-workflow/SKILL.md` only to remove stale legacy feat branch wording; do not use the skill.
|
|
- Modify: existing `scripts/test_*` files only if they assert stale branch or staging behavior.
|
|
|
|
- [ ] **Step 1: Update `AGENTS.md` runner rules**
|
|
|
|
Add these bullets under the Harness runner rules:
|
|
|
|
```markdown
|
|
- `scripts/execute.py`는 `codex/<phase-name>` branch prefix만 사용한다.
|
|
- `scripts/execute.py` 실행 전 worktree는 clean 상태여야 한다.
|
|
- phase step은 `allowed_paths`를 선언해야 하며 runner는 allowlist 밖 변경을 커밋하지 않는다.
|
|
- runner는 broad staging을 사용하지 않고 explicit path만 stage한다.
|
|
- runner가 만드는 모든 commit 전에는 Python self-test와 `python scripts/validate_workspace.py`가 통과해야 한다.
|
|
```
|
|
|
|
- [ ] **Step 2: Update architecture docs**
|
|
|
|
In `docs/ARCHITECTURE.md`, update the Harness execution layer section so it states:
|
|
|
|
```markdown
|
|
scripts/execute.py:
|
|
- creates/checks out `codex/<phase-name>`
|
|
- refuses to run on a dirty worktree
|
|
- requires per-step `allowed_paths`
|
|
- stages only explicit allowed paths and runner housekeeping files
|
|
- runs Python self-test and workspace validation before every runner-created commit
|
|
```
|
|
|
|
- [ ] **Step 3: Update stale harness workflow description without invoking the skill**
|
|
|
|
In `.codex/skills/harness-workflow/SKILL.md`, replace the sentence that says:
|
|
|
|
```markdown
|
|
`scripts/execute.py` creates or checks out the legacy feat task branch
|
|
```
|
|
|
|
with:
|
|
|
|
```markdown
|
|
`scripts/execute.py` creates or checks out `codex/{task-name}`
|
|
```
|
|
|
|
Also add one sentence that phase steps must provide `allowed_paths`.
|
|
|
|
- [ ] **Step 4: Run docs-related searches**
|
|
|
|
Run:
|
|
|
|
```powershell
|
|
rg -n "legacy-feat-branch|broad staging|allowed_paths|codex/" AGENTS.md docs .codex scripts
|
|
```
|
|
|
|
Expected:
|
|
- No stale `scripts/execute.py` branch docs saying the legacy feat task branch.
|
|
- No broad staging in `scripts/execute.py`.
|
|
- `allowed_paths` appears in runner docs and tests.
|
|
|
|
- [ ] **Step 5: Commit**
|
|
|
|
Run:
|
|
|
|
```powershell
|
|
git add AGENTS.md docs/ARCHITECTURE.md .codex/skills/harness-workflow/SKILL.md
|
|
git commit -m "docs: document safe execute runner contract"
|
|
```
|
|
|
|
## Task 10: Full Verification
|
|
|
|
**Files:**
|
|
- Read: all changed files
|
|
|
|
- [ ] **Step 1: Run Python self-tests**
|
|
|
|
Run:
|
|
|
|
```powershell
|
|
python -m unittest discover -s scripts -p "test_*.py"
|
|
```
|
|
|
|
Expected:
|
|
|
|
```text
|
|
OK
|
|
```
|
|
|
|
- [ ] **Step 2: Run workspace validation**
|
|
|
|
Run:
|
|
|
|
```powershell
|
|
python scripts/validate_workspace.py
|
|
```
|
|
|
|
Expected in current no-CMake workspace:
|
|
|
|
```text
|
|
No C++ validation commands configured.
|
|
Add CMakeLists.txt or set HARNESS_VALIDATION_COMMANDS.
|
|
```
|
|
|
|
Exit code must be `0`.
|
|
|
|
- [ ] **Step 3: Run whitespace check**
|
|
|
|
Run:
|
|
|
|
```powershell
|
|
git diff --check
|
|
```
|
|
|
|
Expected: exit code `0`. LF-to-CRLF warnings are acceptable if there are no whitespace error lines.
|
|
|
|
- [ ] **Step 4: Run safety searches**
|
|
|
|
Run:
|
|
|
|
```powershell
|
|
rg -n "legacy-feat-branch|broad staging" scripts/execute.py AGENTS.md docs .codex
|
|
```
|
|
|
|
Expected: no matches.
|
|
|
|
Run:
|
|
|
|
```powershell
|
|
rg -n "allowed_paths|codex/" scripts/execute.py scripts/test_execute.py AGENTS.md docs/ARCHITECTURE.md .codex/skills/harness-workflow/SKILL.md
|
|
```
|
|
|
|
Expected: matches showing the new branch prefix and allowlist contract.
|
|
|
|
- [ ] **Step 5: Final commit if verification-only edits were needed**
|
|
|
|
Run only if Task 10 caused additional edits:
|
|
|
|
```powershell
|
|
git add scripts/execute.py scripts/test_execute.py AGENTS.md docs/ARCHITECTURE.md .codex/skills/harness-workflow/SKILL.md
|
|
git commit -m "fix: harden execute runner safety checks"
|
|
```
|
|
|
|
## Acceptance Checklist
|
|
|
|
- `scripts/execute.py` contains no broad staging.
|
|
- `_branch_name()` returns a `codex/` branch name and `_checkout_branch()`/`_finalize()` both use it.
|
|
- `run()` calls `_assert_clean_worktree()` before `_checkout_branch()`.
|
|
- Every pending step must have a non-empty `allowed_paths`.
|
|
- Runner-managed phase index/output files are allowed as housekeeping, not mixed with code/doc step commits.
|
|
- Any path outside `allowed_paths` and housekeeping paths blocks commit and exits non-zero.
|
|
- `_validate_before_commit()` runs Python self-tests and `scripts/validate_workspace.py` before every runner-created commit.
|
|
- Validation failure exits before staging/committing.
|
|
- Commit failure exits non-zero.
|
|
- Documentation no longer describes `feat-` branches for `scripts/execute.py`.
|