# Harden Execute Runner Implementation Plan > **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. **Goal:** Make `scripts/execute.py` safe enough to use by enforcing `codex/` branch names, clean starting state, explicit staging, per-step file allowlists, and validation before every runner-created commit. **Architecture:** Keep the runner as a single Python module and add small helper methods to `StepExecutor` instead of introducing a new framework. Add focused `unittest` coverage in `scripts/test_execute.py` using temporary phase directories and mocked subprocess/git calls, then update stale docs only after behavior is covered. **Tech Stack:** Python 3 standard library, `unittest`, `unittest.mock`, Git command-line interface, existing `python scripts/validate_workspace.py`. --- ## Scope This plan does not use the `harness-workflow` skill. It only prepares `scripts/execute.py` so that using it later is less likely to damage unrelated work. Required behavior: - Branches created or used by the runner must use `codex/`. - Runner startup must abort on a dirty worktree before branch checkout or Codex execution. - Runner commits must never use broad git staging. - Each phase step must define an allowlist of paths it may modify. - Runner must refuse to commit disallowed file changes. - Runner must run Python self-tests and workspace validation before every commit it creates. - Commit failure and validation failure must stop the runner instead of only warning. ## File Map - Modify `scripts/execute.py`: branch naming, dirty worktree check, allowlist parsing, explicit staging, validation-before-commit, fatal commit failures. - Create `scripts/test_execute.py`: unit tests for runner safety behavior. - Modify `AGENTS.md`: document `codex/` branch prefix and safer runner requirements. - Modify `docs/ARCHITECTURE.md`: update Harness execution layer notes if they mention unsafe staging or old branch behavior. - Modify `.codex/skills/harness-workflow/SKILL.md`: update stale documentation that says the runner creates the legacy feat task branch. Do not invoke the skill while making this edit. ## Data Contract Extend each step object in `phases//index.json` with `allowed_paths`. ```json { "step": 1, "name": "Update docs", "status": "pending", "summary": "", "allowed_paths": [ "docs/*.md", "docs/**/*.md", "AGENTS.md" ] } ``` Allowlist semantics: - Paths are repository-relative and normalized to forward slashes. - Exact file path matches are allowed. - Directory prefix entries ending with `/` allow every path under that directory. - Glob entries using `*`, `?`, or `[` are matched with `fnmatch.fnmatchcase`. - Runner-managed housekeeping files are always allowed separately: - `phases//index.json` - `phases//step-output.json` - `phases/index.json` ## Task 1: Test Harness for `scripts/execute.py` **Files:** - Create: `scripts/test_execute.py` - Read: `scripts/execute.py` - [ ] **Step 1: Add loader and temporary phase helpers** Create `scripts/test_execute.py` with this initial content: ```python import importlib.util import json import subprocess import tempfile import unittest from pathlib import Path from unittest.mock import patch def load_execute(): module_path = Path(__file__).resolve().parent / "execute.py" spec = importlib.util.spec_from_file_location("execute", module_path) module = importlib.util.module_from_spec(spec) spec.loader.exec_module(module) return module def write_phase(root: Path, phase_dir: str = "0-mvp", phase_name: str = "0-mvp", steps=None): phase_path = root / "phases" / phase_dir phase_path.mkdir(parents=True) if steps is None: steps = [ { "step": 1, "name": "Docs", "status": "pending", "summary": "", "allowed_paths": ["docs/*.md"], } ] (phase_path / "index.json").write_text( json.dumps({"project": "FESA", "phase": phase_name, "steps": steps}, indent=2), encoding="utf-8", ) (phase_path / "step1.md").write_text("# Step 1\n", encoding="utf-8") return phase_path def make_executor(execute, root: Path, phase_dir: str = "0-mvp"): with patch.object(execute, "ROOT", root): return execute.StepExecutor(phase_dir) class ExecuteRunnerSafetyTests(unittest.TestCase): pass if __name__ == "__main__": unittest.main() ``` - [ ] **Step 2: Run the new empty test module** Run: ```powershell python -m unittest scripts.test_execute ``` Expected: pass with one empty test class loaded and no failures. - [ ] **Step 3: Commit the test scaffold** Run: ```powershell git add scripts/test_execute.py git commit -m "test: add execute runner safety test scaffold" ``` ## Task 2: `codex/` Branch Prefix **Files:** - Modify: `scripts/test_execute.py` - Modify: `scripts/execute.py` - [ ] **Step 1: Add failing tests for branch naming and push branch** Add these methods to `ExecuteRunnerSafetyTests`: ```python def test_branch_name_uses_codex_prefix_and_sanitized_phase(self): execute = load_execute() with tempfile.TemporaryDirectory() as tmp: root = Path(tmp) write_phase(root, phase_name="linear truss/1d") executor = make_executor(execute, root) self.assertEqual(executor._branch_name(), "codex/linear-truss-1d") def test_finalize_push_uses_codex_branch_name(self): execute = load_execute() with tempfile.TemporaryDirectory() as tmp: root = Path(tmp) write_phase(root, phase_name="0-mvp") executor = make_executor(execute, root) executor._auto_push = True calls = [] def fake_git(*args): calls.append(args) if args == ("diff", "--cached", "--quiet"): return subprocess.CompletedProcess(args, 0, "", "") return subprocess.CompletedProcess(args, 0, "", "") with patch.object(executor, "_run_git", side_effect=fake_git): with patch.object(executor, "_validate_before_commit", create=True): executor._finalize() self.assertIn(("push", "-u", "origin", "codex/0-mvp"), calls) ``` - [ ] **Step 2: Run tests and verify failure** Run: ```powershell python -m unittest scripts.test_execute -v ``` Expected: fail because `StepExecutor` has no `_branch_name()` and `_finalize()` still uses `feat-`. - [ ] **Step 3: Implement branch naming** In `scripts/execute.py`, import `re` near the other imports: ```python import re ``` Add this method to `StepExecutor` under the `# --- git ---` section: ```python def _branch_name(self) -> str: slug = re.sub(r"[^A-Za-z0-9._-]+", "-", self._phase_name.strip()) slug = slug.strip("/.-") if not slug: slug = self._phase_dir_name return f"codex/{slug}" ``` Replace both direct branch constructions: ```python branch = self._branch_name() ``` with: ```python branch = self._branch_name() ``` - [ ] **Step 4: Run targeted tests** Run: ```powershell python -m unittest scripts.test_execute -v ``` Expected: pass. - [ ] **Step 5: Commit** Run: ```powershell git add scripts/test_execute.py scripts/execute.py git commit -m "fix: use codex branch prefix in execute runner" ``` ## Task 3: Dirty Worktree Protection **Files:** - Modify: `scripts/test_execute.py` - Modify: `scripts/execute.py` - [ ] **Step 1: Add failing tests for dirty worktree guard** Add these methods: ```python def test_assert_clean_worktree_exits_when_git_status_has_changes(self): execute = load_execute() with tempfile.TemporaryDirectory() as tmp: root = Path(tmp) write_phase(root) executor = make_executor(execute, root) with patch.object( executor, "_run_git", return_value=subprocess.CompletedProcess([], 0, " M AGENTS.md\n?? scratch.txt\n", ""), ): with self.assertRaises(SystemExit) as cm: executor._assert_clean_worktree("before checkout") self.assertEqual(cm.exception.code, 1) def test_run_checks_clean_worktree_before_checkout(self): execute = load_execute() with tempfile.TemporaryDirectory() as tmp: root = Path(tmp) write_phase(root) executor = make_executor(execute, root) calls = [] def record(name): def inner(*args, **kwargs): calls.append(name) return inner with patch.object(executor, "_assert_clean_worktree", side_effect=record("clean")): with patch.object(executor, "_checkout_branch", side_effect=record("checkout")): with patch.object(executor, "_print_header"): with patch.object(executor, "_check_blockers"): with patch.object(executor, "_load_guardrails", return_value=""): with patch.object(executor, "_ensure_created_at"): with patch.object(executor, "_execute_all_steps"): with patch.object(executor, "_finalize"): executor.run() self.assertLess(calls.index("clean"), calls.index("checkout")) ``` - [ ] **Step 2: Run tests and verify failure** Run: ```powershell python -m unittest scripts.test_execute -v ``` Expected: fail because `_assert_clean_worktree()` does not exist and `run()` does not call it. - [ ] **Step 3: Implement dirty worktree guard** Add this method under `# --- git ---`: ```python def _assert_clean_worktree(self, context: str): r = self._run_git("status", "--porcelain") if r.returncode != 0: print(" ERROR: git status failed.") print(f" {r.stderr.strip()}") sys.exit(1) dirty = r.stdout.strip() if dirty: print(f" ERROR: dirty worktree detected {context}.") print(" Commit, stash, or remove these changes before running scripts/execute.py:") for line in dirty.splitlines(): print(f" {line}") sys.exit(1) ``` Change `run()` so the guard runs before branch checkout: ```python self._print_header() self._check_blockers() self._assert_clean_worktree("before branch checkout") self._checkout_branch() ``` - [ ] **Step 4: Run targeted tests** Run: ```powershell python -m unittest scripts.test_execute -v ``` Expected: pass. - [ ] **Step 5: Commit** Run: ```powershell git add scripts/test_execute.py scripts/execute.py git commit -m "fix: block execute runner on dirty worktree" ``` ## Task 4: Per-Step File Allowlist **Files:** - Modify: `scripts/test_execute.py` - Modify: `scripts/execute.py` - [ ] **Step 1: Add failing tests for allowlist matching and missing allowlist** Add these methods: ```python def test_step_allowlist_accepts_exact_prefix_and_glob_paths(self): execute = load_execute() with tempfile.TemporaryDirectory() as tmp: root = Path(tmp) write_phase(root) executor = make_executor(execute, root) patterns = ["AGENTS.md", "docs/", "scripts/*.py"] self.assertTrue(executor._path_allowed("AGENTS.md", patterns)) self.assertTrue(executor._path_allowed("docs/PRD.md", patterns)) self.assertTrue(executor._path_allowed("scripts/execute.py", patterns)) self.assertFalse(executor._path_allowed(".codex/hooks.json", patterns)) def test_step_without_allowed_paths_is_rejected_before_codex_invocation(self): execute = load_execute() steps = [{"step": 1, "name": "Unsafe", "status": "pending", "summary": ""}] with tempfile.TemporaryDirectory() as tmp: root = Path(tmp) write_phase(root, steps=steps) executor = make_executor(execute, root) with self.assertRaises(SystemExit) as cm: executor._validate_step_allowlist(steps[0]) self.assertEqual(cm.exception.code, 1) ``` - [ ] **Step 2: Run tests and verify failure** Run: ```powershell python -m unittest scripts.test_execute -v ``` Expected: fail because allowlist helpers do not exist. - [ ] **Step 3: Implement allowlist helpers** Add `fnmatch` import: ```python import fnmatch ``` Add these methods under `# --- git ---` or a new `# --- file allowlist ---` section: ```python @staticmethod def _normalize_rel_path(path: str) -> str: return path.replace("\\", "/").lstrip("./") def _path_allowed(self, path: str, patterns: list[str]) -> bool: rel = self._normalize_rel_path(path) for raw in patterns: pattern = self._normalize_rel_path(str(raw)) if not pattern: continue if pattern.endswith("/") and rel.startswith(pattern): return True if any(ch in pattern for ch in "*?[") and fnmatch.fnmatchcase(rel, pattern): return True if rel == pattern: return True return False def _validate_step_allowlist(self, step: dict): allowed = step.get("allowed_paths") if not isinstance(allowed, list) or not allowed or not all(isinstance(p, str) and p.strip() for p in allowed): print(f" ERROR: Step {step.get('step')} must define non-empty allowed_paths.") sys.exit(1) ``` Call `_validate_step_allowlist(pending)` before `_execute_single_step(pending, guardrails)` in `_execute_all_steps()`. - [ ] **Step 4: Add allowlist to prompt context** In `_build_preamble`, add a parameter: ```python def _build_preamble(self, guardrails: str, step_context: str, allowed_paths: list[str], prev_error: Optional[str] = None) -> str: ``` Add this block before `## 작업 규칙`: ```python f"## Step 파일 allowlist\n\n" f"이 step은 아래 repository-relative path만 수정할 수 있습니다.\n" f"{chr(10).join(f'- {p}' for p in allowed_paths)}\n\n" ``` Update the caller in `_execute_single_step()`: ```python preamble = self._build_preamble(guardrails, step_context, step.get("allowed_paths", []), prev_error) ``` - [ ] **Step 5: Run targeted tests** Run: ```powershell python -m unittest scripts.test_execute -v ``` Expected: pass. - [ ] **Step 6: Commit** Run: ```powershell git add scripts/test_execute.py scripts/execute.py git commit -m "fix: require execute step file allowlists" ``` ## Task 5: Detect and Block Disallowed Step Changes **Files:** - Modify: `scripts/test_execute.py` - Modify: `scripts/execute.py` - [ ] **Step 1: Add failing tests for changed path classification** Add this method: ```python def test_classify_step_changes_splits_allowed_housekeeping_and_disallowed_paths(self): execute = load_execute() step = { "step": 1, "name": "Docs", "status": "completed", "summary": "", "allowed_paths": ["docs/*.md"], } with tempfile.TemporaryDirectory() as tmp: root = Path(tmp) write_phase(root) executor = make_executor(execute, root) changed = [ "docs/PRD.md", "phases/0-mvp/index.json", "phases/0-mvp/step1-output.json", "scripts/execute.py", ] allowed, housekeeping, disallowed = executor._classify_step_changes(1, step, changed) self.assertEqual(allowed, ["docs/PRD.md"]) self.assertEqual(housekeeping, ["phases/0-mvp/index.json", "phases/0-mvp/step1-output.json"]) self.assertEqual(disallowed, ["scripts/execute.py"]) ``` - [ ] **Step 2: Run tests and verify failure** Run: ```powershell python -m unittest scripts.test_execute -v ``` Expected: fail because `_classify_step_changes()` does not exist. - [ ] **Step 3: Implement changed path collection and classification** Add these methods: ```python def _changed_paths(self) -> list[str]: paths: list[str] = [] tracked = self._run_git("diff", "--name-only") if tracked.returncode != 0: print(" ERROR: git diff --name-only failed.") print(f" {tracked.stderr.strip()}") sys.exit(1) paths.extend(tracked.stdout.splitlines()) staged = self._run_git("diff", "--cached", "--name-only") if staged.returncode != 0: print(" ERROR: git diff --cached --name-only failed.") print(f" {staged.stderr.strip()}") sys.exit(1) paths.extend(staged.stdout.splitlines()) untracked = self._run_git("ls-files", "--others", "--exclude-standard") if untracked.returncode != 0: print(" ERROR: git ls-files --others failed.") print(f" {untracked.stderr.strip()}") sys.exit(1) paths.extend(untracked.stdout.splitlines()) return sorted({self._normalize_rel_path(p) for p in paths if p.strip()}) def _housekeeping_paths(self, step_num: int) -> set[str]: return { f"phases/{self._phase_dir_name}/index.json", f"phases/{self._phase_dir_name}/step{step_num}-output.json", "phases/index.json", } def _classify_step_changes(self, step_num: int, step: dict, changed_paths: list[str]) -> tuple[list[str], list[str], list[str]]: allowed_patterns = step.get("allowed_paths", []) housekeeping_set = self._housekeeping_paths(step_num) allowed: list[str] = [] housekeeping: list[str] = [] disallowed: list[str] = [] for path in changed_paths: rel = self._normalize_rel_path(path) if rel in housekeeping_set: housekeeping.append(rel) elif self._path_allowed(rel, allowed_patterns): allowed.append(rel) else: disallowed.append(rel) return allowed, housekeeping, disallowed ``` - [ ] **Step 4: Run targeted tests** Run: ```powershell python -m unittest scripts.test_execute -v ``` Expected: pass. - [ ] **Step 5: Commit** Run: ```powershell git add scripts/test_execute.py scripts/execute.py git commit -m "fix: classify execute step file changes" ``` ## Task 6: Remove Broad Staging and Stage Explicit Paths Only **Files:** - Modify: `scripts/test_execute.py` - Modify: `scripts/execute.py` - [ ] **Step 1: Add failing test that commit path never uses broad staging** Add this method: ```python def test_commit_step_stages_only_explicit_allowed_and_housekeeping_paths(self): execute = load_execute() step = { "step": 1, "name": "Docs", "status": "completed", "summary": "", "allowed_paths": ["docs/*.md"], } with tempfile.TemporaryDirectory() as tmp: root = Path(tmp) write_phase(root) executor = make_executor(execute, root) calls = [] def fake_git(*args): calls.append(args) if args in { ("diff", "--quiet", "--cached", "--"), ("diff", "--cached", "--quiet"), }: return subprocess.CompletedProcess(args, 1, "", "") return subprocess.CompletedProcess(args, 0, "", "") with patch.object(executor, "_changed_paths", return_value=["docs/PRD.md", "phases/0-mvp/index.json", "phases/0-mvp/step1-output.json"]): with patch.object(executor, "_run_git", side_effect=fake_git): with patch.object(executor, "_validate_before_commit"): executor._commit_step(step, "Docs") self.assertNotIn(("add", "-A"), calls) self.assertIn(("add", "--", "docs/PRD.md"), calls) self.assertIn(("add", "--", "phases/0-mvp/index.json", "phases/0-mvp/step1-output.json"), calls) ``` - [ ] **Step 2: Run tests and verify failure** Run: ```powershell python -m unittest scripts.test_execute -v ``` Expected: fail because `_commit_step()` still accepts `(step_num, step_name)` and uses broad staging. - [ ] **Step 3: Implement explicit staging helper** Add this method: ```python def _stage_paths(self, paths: list[str]): if not paths: return r = self._run_git("add", "--", *paths) if r.returncode != 0: print(" ERROR: git add failed.") print(f" {r.stderr.strip()}") sys.exit(1) ``` Change `_commit_step` signature: ```python def _commit_step(self, step: dict, step_name: str): step_num = step["step"] ``` Replace the old broad staging logic with: ```python changed = self._changed_paths() allowed, housekeeping, disallowed = self._classify_step_changes(step_num, step, changed) if disallowed: print(f" ERROR: Step {step_num} modified files outside allowed_paths:") for path in disallowed: print(f" {path}") sys.exit(1) if allowed: msg = self.FEAT_MSG.format(phase=self._phase_name, num=step_num, name=step_name) self._validate_before_commit(msg) self._stage_paths(allowed) if self._run_git("diff", "--cached", "--quiet").returncode != 0: r = self._run_git("commit", "-m", msg) if r.returncode != 0: print(f" ERROR: 코드 커밋 실패: {r.stderr.strip()}") sys.exit(1) print(f" Commit: {msg}") if housekeeping: msg = self.CHORE_MSG.format(phase=self._phase_name, num=step_num) self._validate_before_commit(msg) self._stage_paths(housekeeping) if self._run_git("diff", "--cached", "--quiet").returncode != 0: r = self._run_git("commit", "-m", msg) if r.returncode != 0: print(f" ERROR: housekeeping 커밋 실패: {r.stderr.strip()}") sys.exit(1) ``` Update caller: ```python self._commit_step(step, step_name) ``` - [ ] **Step 4: Run targeted tests** Run: ```powershell python -m unittest scripts.test_execute -v ``` Expected: fail only if `_validate_before_commit()` is not implemented yet. If so, continue to Task 7 before expecting full pass. ## Task 7: Validation Before Commit **Files:** - Modify: `scripts/test_execute.py` - Modify: `scripts/execute.py` - [ ] **Step 1: Add failing tests for validation command behavior** Add these methods: ```python def test_validate_before_commit_runs_python_selftest_then_workspace_validation(self): execute = load_execute() with tempfile.TemporaryDirectory() as tmp: root = Path(tmp) write_phase(root) executor = make_executor(execute, root) commands = [] def fake_run(cmd, **kwargs): commands.append(cmd) return subprocess.CompletedProcess(cmd, 0, "ok", "") with patch.object(execute.subprocess, "run", side_effect=fake_run): executor._validate_before_commit("feat(0-mvp): step 1") self.assertEqual( commands, [ [sys.executable, "-m", "unittest", "discover", "-s", "scripts", "-p", "test_*.py"], [sys.executable, "scripts/validate_workspace.py"], ], ) def test_validate_before_commit_exits_before_commit_when_validation_fails(self): execute = load_execute() with tempfile.TemporaryDirectory() as tmp: root = Path(tmp) write_phase(root) executor = make_executor(execute, root) def fake_run(cmd, **kwargs): return subprocess.CompletedProcess(cmd, 1, "bad", "failed") with patch.object(execute.subprocess, "run", side_effect=fake_run): with self.assertRaises(SystemExit) as cm: executor._validate_before_commit("feat(0-mvp): step 1") self.assertEqual(cm.exception.code, 1) ``` Add `import sys` to `scripts/test_execute.py`. - [ ] **Step 2: Run tests and verify failure** Run: ```powershell python -m unittest scripts.test_execute -v ``` Expected: fail because `_validate_before_commit()` does not exist. - [ ] **Step 3: Implement validation-before-commit** Add this class constant near `MAX_RETRIES`: ```python VALIDATION_COMMANDS = ( [sys.executable, "-m", "unittest", "discover", "-s", "scripts", "-p", "test_*.py"], [sys.executable, "scripts/validate_workspace.py"], ) ``` Add this method: ```python def _validate_before_commit(self, commit_message: str): print(f" Validation before commit: {commit_message}") for cmd in self.VALIDATION_COMMANDS: r = subprocess.run(cmd, cwd=self._root, capture_output=True, text=True) if r.returncode != 0: print(f" ERROR: validation failed before commit: {' '.join(cmd)}") if r.stdout: print(r.stdout[-2000:]) if r.stderr: print(r.stderr[-2000:]) sys.exit(1) ``` - [ ] **Step 4: Run targeted tests** Run: ```powershell python -m unittest scripts.test_execute -v ``` Expected: pass. - [ ] **Step 5: Commit** Run: ```powershell git add scripts/test_execute.py scripts/execute.py git commit -m "fix: validate execute runner before commits" ``` ## Task 8: Finalize Without Broad Staging **Files:** - Modify: `scripts/test_execute.py` - Modify: `scripts/execute.py` - [ ] **Step 1: Add failing test for final commit explicit staging** Add this method: ```python def test_finalize_stages_only_phase_indexes(self): execute = load_execute() with tempfile.TemporaryDirectory() as tmp: root = Path(tmp) write_phase(root) (root / "phases" / "index.json").write_text('{"phases":[]}', encoding="utf-8") executor = make_executor(execute, root) calls = [] def fake_git(*args): calls.append(args) if args == ("diff", "--cached", "--quiet"): return subprocess.CompletedProcess(args, 1, "", "") return subprocess.CompletedProcess(args, 0, "", "") with patch.object(executor, "_run_git", side_effect=fake_git): with patch.object(executor, "_validate_before_commit"): executor._finalize() self.assertNotIn(("add", "-A"), calls) self.assertIn(("add", "--", "phases/0-mvp/index.json", "phases/index.json"), calls) ``` - [ ] **Step 2: Run tests and verify failure** Run: ```powershell python -m unittest scripts.test_execute -v ``` Expected: fail because `_finalize()` still uses broad staging. - [ ] **Step 3: Replace final staging** In `_finalize()`, replace: ```python self._run_git("add", "-A") ``` with: ```python final_paths = [f"phases/{self._phase_dir_name}/index.json"] if self._top_index_file.exists(): final_paths.append("phases/index.json") self._validate_before_commit(f"chore({self._phase_name}): mark phase completed") self._stage_paths(final_paths) ``` Keep the existing cached diff check and commit message, but change commit failure from silent no-op to fatal: ```python if r.returncode == 0: print(f" ✓ {msg}") else: print(f" ERROR: phase completion commit failed: {r.stderr.strip()}") sys.exit(1) ``` - [ ] **Step 4: Run targeted tests** Run: ```powershell python -m unittest scripts.test_execute -v ``` Expected: pass. - [ ] **Step 5: Commit** Run: ```powershell git add scripts/test_execute.py scripts/execute.py git commit -m "fix: stage execute finalization explicitly" ``` ## Task 9: Documentation and Existing Contract Tests **Files:** - Modify: `AGENTS.md` - Modify: `docs/ARCHITECTURE.md` - Modify: `.codex/skills/harness-workflow/SKILL.md` only to remove stale legacy feat branch wording; do not use the skill. - Modify: existing `scripts/test_*` files only if they assert stale branch or staging behavior. - [ ] **Step 1: Update `AGENTS.md` runner rules** Add these bullets under the Harness runner rules: ```markdown - `scripts/execute.py`는 `codex/` branch prefix만 사용한다. - `scripts/execute.py` 실행 전 worktree는 clean 상태여야 한다. - phase step은 `allowed_paths`를 선언해야 하며 runner는 allowlist 밖 변경을 커밋하지 않는다. - runner는 broad staging을 사용하지 않고 explicit path만 stage한다. - runner가 만드는 모든 commit 전에는 Python self-test와 `python scripts/validate_workspace.py`가 통과해야 한다. ``` - [ ] **Step 2: Update architecture docs** In `docs/ARCHITECTURE.md`, update the Harness execution layer section so it states: ```markdown scripts/execute.py: - creates/checks out `codex/` - refuses to run on a dirty worktree - requires per-step `allowed_paths` - stages only explicit allowed paths and runner housekeeping files - runs Python self-test and workspace validation before every runner-created commit ``` - [ ] **Step 3: Update stale harness workflow description without invoking the skill** In `.codex/skills/harness-workflow/SKILL.md`, replace the sentence that says: ```markdown `scripts/execute.py` creates or checks out the legacy feat task branch ``` with: ```markdown `scripts/execute.py` creates or checks out `codex/{task-name}` ``` Also add one sentence that phase steps must provide `allowed_paths`. - [ ] **Step 4: Run docs-related searches** Run: ```powershell rg -n "legacy-feat-branch|broad staging|allowed_paths|codex/" AGENTS.md docs .codex scripts ``` Expected: - No stale `scripts/execute.py` branch docs saying the legacy feat task branch. - No broad staging in `scripts/execute.py`. - `allowed_paths` appears in runner docs and tests. - [ ] **Step 5: Commit** Run: ```powershell git add AGENTS.md docs/ARCHITECTURE.md .codex/skills/harness-workflow/SKILL.md git commit -m "docs: document safe execute runner contract" ``` ## Task 10: Full Verification **Files:** - Read: all changed files - [ ] **Step 1: Run Python self-tests** Run: ```powershell python -m unittest discover -s scripts -p "test_*.py" ``` Expected: ```text OK ``` - [ ] **Step 2: Run workspace validation** Run: ```powershell python scripts/validate_workspace.py ``` Expected in current no-CMake workspace: ```text No C++ validation commands configured. Add CMakeLists.txt or set HARNESS_VALIDATION_COMMANDS. ``` Exit code must be `0`. - [ ] **Step 3: Run whitespace check** Run: ```powershell git diff --check ``` Expected: exit code `0`. LF-to-CRLF warnings are acceptable if there are no whitespace error lines. - [ ] **Step 4: Run safety searches** Run: ```powershell rg -n "legacy-feat-branch|broad staging" scripts/execute.py AGENTS.md docs .codex ``` Expected: no matches. Run: ```powershell rg -n "allowed_paths|codex/" scripts/execute.py scripts/test_execute.py AGENTS.md docs/ARCHITECTURE.md .codex/skills/harness-workflow/SKILL.md ``` Expected: matches showing the new branch prefix and allowlist contract. - [ ] **Step 5: Final commit if verification-only edits were needed** Run only if Task 10 caused additional edits: ```powershell git add scripts/execute.py scripts/test_execute.py AGENTS.md docs/ARCHITECTURE.md .codex/skills/harness-workflow/SKILL.md git commit -m "fix: harden execute runner safety checks" ``` ## Acceptance Checklist - `scripts/execute.py` contains no broad staging. - `_branch_name()` returns a `codex/` branch name and `_checkout_branch()`/`_finalize()` both use it. - `run()` calls `_assert_clean_worktree()` before `_checkout_branch()`. - Every pending step must have a non-empty `allowed_paths`. - Runner-managed phase index/output files are allowed as housekeeping, not mixed with code/doc step commits. - Any path outside `allowed_paths` and housekeeping paths blocks commit and exits non-zero. - `_validate_before_commit()` runs Python self-tests and `scripts/validate_workspace.py` before every runner-created commit. - Validation failure exits before staging/committing. - Commit failure exits non-zero. - Documentation no longer describes `feat-` branches for `scripts/execute.py`.