FESADev/docs/superpowers/plans/2026-06-11-harden-execute-runner.md

# Harden Execute Runner Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Make `scripts/execute.py` safe enough to use by enforcing `codex/` branch names, clean starting state, explicit staging, per-step file allowlists, and validation before every runner-created commit.

**Architecture:** Keep the runner as a single Python module and add small helper methods to `StepExecutor` instead of introducing a new framework. Add focused `unittest` coverage in `scripts/test_execute.py` using temporary phase directories and mocked subprocess/git calls, then update stale docs only after behavior is covered.

**Tech Stack:** Python 3 standard library, `unittest`, `unittest.mock`, Git command-line interface, existing `python scripts/validate_workspace.py`.

---

## Scope

This plan does not use the `harness-workflow` skill. It only prepares `scripts/execute.py` so that using it later is less likely to damage unrelated work.

Required behavior:
- Branches created or used by the runner must use `codex/<phase-name>`.
- Runner startup must abort on a dirty worktree before branch checkout or Codex execution.
- Runner commits must never use broad git staging.
- Each phase step must define an allowlist of paths it may modify.
- Runner must refuse to commit disallowed file changes.
- Runner must run Python self-tests and workspace validation before every commit it creates.
- Commit failure and validation failure must stop the runner instead of only warning.

## File Map

- Modify `scripts/execute.py`: branch naming, dirty worktree check, allowlist parsing, explicit staging, validation-before-commit, fatal commit failures.
- Create `scripts/test_execute.py`: unit tests for runner safety behavior.
- Modify `AGENTS.md`: document `codex/` branch prefix and safer runner requirements.
- Modify `docs/ARCHITECTURE.md`: update Harness execution layer notes if they mention unsafe staging or old branch behavior.
- Modify `.codex/skills/harness-workflow/SKILL.md`: update stale documentation that says the runner creates the legacy feat task branch. Do not invoke the skill while making this edit.

## Data Contract

Extend each step object in `phases/<phase-dir>/index.json` with `allowed_paths`.

```json
{
  "step": 1,
  "name": "Update docs",
  "status": "pending",
  "summary": "",
  "allowed_paths": [
    "docs/*.md",
    "docs/**/*.md",
    "AGENTS.md"
  ]
}
```

Allowlist semantics:
- Paths are repository-relative and normalized to forward slashes.
- Exact file path matches are allowed.
- Directory prefix entries ending with `/` allow every path under that directory.
- Glob entries using `*`, `?`, or `[` are matched with `fnmatch.fnmatchcase`.
- Runner-managed housekeeping files are always allowed separately:
  - `phases/<phase-dir>/index.json`
  - `phases/<phase-dir>/step<step-num>-output.json`
  - `phases/index.json`

## Task 1: Test Harness for `scripts/execute.py`

**Files:**
- Create: `scripts/test_execute.py`
- Read: `scripts/execute.py`

- [ ] **Step 1: Add loader and temporary phase helpers**

Create `scripts/test_execute.py` with this initial content:

```python
import importlib.util
import json
import subprocess
import tempfile
import unittest
from pathlib import Path
from unittest.mock import patch


def load_execute():
    module_path = Path(__file__).resolve().parent / "execute.py"
    spec = importlib.util.spec_from_file_location("execute", module_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def write_phase(root: Path, phase_dir: str = "0-mvp", phase_name: str = "0-mvp", steps=None):
    phase_path = root / "phases" / phase_dir
    phase_path.mkdir(parents=True)
    if steps is None:
        steps = [
            {
                "step": 1,
                "name": "Docs",
                "status": "pending",
                "summary": "",
                "allowed_paths": ["docs/*.md"],
            }
        ]
    (phase_path / "index.json").write_text(
        json.dumps({"project": "FESA", "phase": phase_name, "steps": steps}, indent=2),
        encoding="utf-8",
    )
    (phase_path / "step1.md").write_text("# Step 1\n", encoding="utf-8")
    return phase_path


def make_executor(execute, root: Path, phase_dir: str = "0-mvp"):
    with patch.object(execute, "ROOT", root):
        return execute.StepExecutor(phase_dir)


class ExecuteRunnerSafetyTests(unittest.TestCase):
    pass


if __name__ == "__main__":
    unittest.main()
```

- [ ] **Step 2: Run the new empty test module**

Run:

```powershell
python -m unittest scripts.test_execute
```

Expected: pass with one empty test class loaded and no failures.

- [ ] **Step 3: Commit the test scaffold**

Run:

```powershell
git add scripts/test_execute.py
git commit -m "test: add execute runner safety test scaffold"
```

## Task 2: `codex/` Branch Prefix

**Files:**
- Modify: `scripts/test_execute.py`
- Modify: `scripts/execute.py`

- [ ] **Step 1: Add failing tests for branch naming and push branch**

Add these methods to `ExecuteRunnerSafetyTests`:

```python
    def test_branch_name_uses_codex_prefix_and_sanitized_phase(self):
        execute = load_execute()
        with tempfile.TemporaryDirectory() as tmp:
            root = Path(tmp)
            write_phase(root, phase_name="linear truss/1d")
            executor = make_executor(execute, root)

            self.assertEqual(executor._branch_name(), "codex/linear-truss-1d")

    def test_finalize_push_uses_codex_branch_name(self):
        execute = load_execute()
        with tempfile.TemporaryDirectory() as tmp:
            root = Path(tmp)
            write_phase(root, phase_name="0-mvp")
            executor = make_executor(execute, root)
            executor._auto_push = True
            calls = []

            def fake_git(*args):
                calls.append(args)
                if args == ("diff", "--cached", "--quiet"):
                    return subprocess.CompletedProcess(args, 0, "", "")
                return subprocess.CompletedProcess(args, 0, "", "")

            with patch.object(executor, "_run_git", side_effect=fake_git):
                with patch.object(executor, "_validate_before_commit", create=True):
                    executor._finalize()

            self.assertIn(("push", "-u", "origin", "codex/0-mvp"), calls)
```

- [ ] **Step 2: Run tests and verify failure**

Run:

```powershell
python -m unittest scripts.test_execute -v
```

Expected: fail because `StepExecutor` has no `_branch_name()` and `_finalize()` still uses `feat-`.

- [ ] **Step 3: Implement branch naming**

In `scripts/execute.py`, import `re` near the other imports:

```python
import re
```

Add this method to `StepExecutor` under the `# --- git ---` section:

```python
    def _branch_name(self) -> str:
        slug = re.sub(r"[^A-Za-z0-9._-]+", "-", self._phase_name.strip())
        slug = slug.strip("/.-")
        if not slug:
            slug = self._phase_dir_name
        return f"codex/{slug}"
```

Replace both direct branch constructions:

```python
branch = self._branch_name()
```

with:

```python
branch = self._branch_name()
```

- [ ] **Step 4: Run targeted tests**

Run:

```powershell
python -m unittest scripts.test_execute -v
```

Expected: pass.

- [ ] **Step 5: Commit**

Run:

```powershell
git add scripts/test_execute.py scripts/execute.py
git commit -m "fix: use codex branch prefix in execute runner"
```

## Task 3: Dirty Worktree Protection

**Files:**
- Modify: `scripts/test_execute.py`
- Modify: `scripts/execute.py`

- [ ] **Step 1: Add failing tests for dirty worktree guard**

Add these methods:

```python
    def test_assert_clean_worktree_exits_when_git_status_has_changes(self):
        execute = load_execute()
        with tempfile.TemporaryDirectory() as tmp:
            root = Path(tmp)
            write_phase(root)
            executor = make_executor(execute, root)

            with patch.object(
                executor,
                "_run_git",
                return_value=subprocess.CompletedProcess([], 0, " M AGENTS.md\n?? scratch.txt\n", ""),
            ):
                with self.assertRaises(SystemExit) as cm:
                    executor._assert_clean_worktree("before checkout")

            self.assertEqual(cm.exception.code, 1)

    def test_run_checks_clean_worktree_before_checkout(self):
        execute = load_execute()
        with tempfile.TemporaryDirectory() as tmp:
            root = Path(tmp)
            write_phase(root)
            executor = make_executor(execute, root)
            calls = []

            def record(name):
                def inner(*args, **kwargs):
                    calls.append(name)
                return inner

            with patch.object(executor, "_assert_clean_worktree", side_effect=record("clean")):
                with patch.object(executor, "_checkout_branch", side_effect=record("checkout")):
                    with patch.object(executor, "_print_header"):
                        with patch.object(executor, "_check_blockers"):
                            with patch.object(executor, "_load_guardrails", return_value=""):
                                with patch.object(executor, "_ensure_created_at"):
                                    with patch.object(executor, "_execute_all_steps"):
                                        with patch.object(executor, "_finalize"):
                                            executor.run()

            self.assertLess(calls.index("clean"), calls.index("checkout"))
```

- [ ] **Step 2: Run tests and verify failure**

Run:

```powershell
python -m unittest scripts.test_execute -v
```

Expected: fail because `_assert_clean_worktree()` does not exist and `run()` does not call it.

- [ ] **Step 3: Implement dirty worktree guard**

Add this method under `# --- git ---`:

```python
    def _assert_clean_worktree(self, context: str):
        r = self._run_git("status", "--porcelain")
        if r.returncode != 0:
            print("  ERROR: git status failed.")
            print(f"  {r.stderr.strip()}")
            sys.exit(1)
        dirty = r.stdout.strip()
        if dirty:
            print(f"  ERROR: dirty worktree detected {context}.")
            print("  Commit, stash, or remove these changes before running scripts/execute.py:")
            for line in dirty.splitlines():
                print(f"    {line}")
            sys.exit(1)
```

Change `run()` so the guard runs before branch checkout:

```python
        self._print_header()
        self._check_blockers()
        self._assert_clean_worktree("before branch checkout")
        self._checkout_branch()
```

- [ ] **Step 4: Run targeted tests**

Run:

```powershell
python -m unittest scripts.test_execute -v
```

Expected: pass.

- [ ] **Step 5: Commit**

Run:

```powershell
git add scripts/test_execute.py scripts/execute.py
git commit -m "fix: block execute runner on dirty worktree"
```

## Task 4: Per-Step File Allowlist

**Files:**
- Modify: `scripts/test_execute.py`
- Modify: `scripts/execute.py`

- [ ] **Step 1: Add failing tests for allowlist matching and missing allowlist**

Add these methods:

```python
    def test_step_allowlist_accepts_exact_prefix_and_glob_paths(self):
        execute = load_execute()
        with tempfile.TemporaryDirectory() as tmp:
            root = Path(tmp)
            write_phase(root)
            executor = make_executor(execute, root)
            patterns = ["AGENTS.md", "docs/", "scripts/*.py"]

            self.assertTrue(executor._path_allowed("AGENTS.md", patterns))
            self.assertTrue(executor._path_allowed("docs/PRD.md", patterns))
            self.assertTrue(executor._path_allowed("scripts/execute.py", patterns))
            self.assertFalse(executor._path_allowed(".codex/hooks.json", patterns))

    def test_step_without_allowed_paths_is_rejected_before_codex_invocation(self):
        execute = load_execute()
        steps = [{"step": 1, "name": "Unsafe", "status": "pending", "summary": ""}]
        with tempfile.TemporaryDirectory() as tmp:
            root = Path(tmp)
            write_phase(root, steps=steps)
            executor = make_executor(execute, root)

            with self.assertRaises(SystemExit) as cm:
                executor._validate_step_allowlist(steps[0])

            self.assertEqual(cm.exception.code, 1)
```

- [ ] **Step 2: Run tests and verify failure**

Run:

```powershell
python -m unittest scripts.test_execute -v
```

Expected: fail because allowlist helpers do not exist.

- [ ] **Step 3: Implement allowlist helpers**

Add `fnmatch` import:

```python
import fnmatch
```

Add these methods under `# --- git ---` or a new `# --- file allowlist ---` section:

```python
    @staticmethod
    def _normalize_rel_path(path: str) -> str:
        return path.replace("\\", "/").lstrip("./")

    def _path_allowed(self, path: str, patterns: list[str]) -> bool:
        rel = self._normalize_rel_path(path)
        for raw in patterns:
            pattern = self._normalize_rel_path(str(raw))
            if not pattern:
                continue
            if pattern.endswith("/") and rel.startswith(pattern):
                return True
            if any(ch in pattern for ch in "*?[") and fnmatch.fnmatchcase(rel, pattern):
                return True
            if rel == pattern:
                return True
        return False

    def _validate_step_allowlist(self, step: dict):
        allowed = step.get("allowed_paths")
        if not isinstance(allowed, list) or not allowed or not all(isinstance(p, str) and p.strip() for p in allowed):
            print(f"  ERROR: Step {step.get('step')} must define non-empty allowed_paths.")
            sys.exit(1)
```

Call `_validate_step_allowlist(pending)` before `_execute_single_step(pending, guardrails)` in `_execute_all_steps()`.

- [ ] **Step 4: Add allowlist to prompt context**

In `_build_preamble`, add a parameter:

```python
    def _build_preamble(self, guardrails: str, step_context: str,
                        allowed_paths: list[str],
                        prev_error: Optional[str] = None) -> str:
```

Add this block before `## 작업 규칙`:

```python
            f"## Step 파일 allowlist\n\n"
            f"이 step은 아래 repository-relative path만 수정할 수 있습니다.\n"
            f"{chr(10).join(f'- {p}' for p in allowed_paths)}\n\n"
```

Update the caller in `_execute_single_step()`:

```python
preamble = self._build_preamble(guardrails, step_context, step.get("allowed_paths", []), prev_error)
```

- [ ] **Step 5: Run targeted tests**

Run:

```powershell
python -m unittest scripts.test_execute -v
```

Expected: pass.

- [ ] **Step 6: Commit**

Run:

```powershell
git add scripts/test_execute.py scripts/execute.py
git commit -m "fix: require execute step file allowlists"
```

## Task 5: Detect and Block Disallowed Step Changes

**Files:**
- Modify: `scripts/test_execute.py`
- Modify: `scripts/execute.py`

- [ ] **Step 1: Add failing tests for changed path classification**

Add this method:

```python
    def test_classify_step_changes_splits_allowed_housekeeping_and_disallowed_paths(self):
        execute = load_execute()
        step = {
            "step": 1,
            "name": "Docs",
            "status": "completed",
            "summary": "",
            "allowed_paths": ["docs/*.md"],
        }
        with tempfile.TemporaryDirectory() as tmp:
            root = Path(tmp)
            write_phase(root)
            executor = make_executor(execute, root)
            changed = [
                "docs/PRD.md",
                "phases/0-mvp/index.json",
                "phases/0-mvp/step1-output.json",
                "scripts/execute.py",
            ]

            allowed, housekeeping, disallowed = executor._classify_step_changes(1, step, changed)

            self.assertEqual(allowed, ["docs/PRD.md"])
            self.assertEqual(housekeeping, ["phases/0-mvp/index.json", "phases/0-mvp/step1-output.json"])
            self.assertEqual(disallowed, ["scripts/execute.py"])
```

- [ ] **Step 2: Run tests and verify failure**

Run:

```powershell
python -m unittest scripts.test_execute -v
```

Expected: fail because `_classify_step_changes()` does not exist.

- [ ] **Step 3: Implement changed path collection and classification**

Add these methods:

```python
    def _changed_paths(self) -> list[str]:
        paths: list[str] = []
        tracked = self._run_git("diff", "--name-only")
        if tracked.returncode != 0:
            print("  ERROR: git diff --name-only failed.")
            print(f"  {tracked.stderr.strip()}")
            sys.exit(1)
        paths.extend(tracked.stdout.splitlines())

        staged = self._run_git("diff", "--cached", "--name-only")
        if staged.returncode != 0:
            print("  ERROR: git diff --cached --name-only failed.")
            print(f"  {staged.stderr.strip()}")
            sys.exit(1)
        paths.extend(staged.stdout.splitlines())

        untracked = self._run_git("ls-files", "--others", "--exclude-standard")
        if untracked.returncode != 0:
            print("  ERROR: git ls-files --others failed.")
            print(f"  {untracked.stderr.strip()}")
            sys.exit(1)
        paths.extend(untracked.stdout.splitlines())

        return sorted({self._normalize_rel_path(p) for p in paths if p.strip()})

    def _housekeeping_paths(self, step_num: int) -> set[str]:
        return {
            f"phases/{self._phase_dir_name}/index.json",
            f"phases/{self._phase_dir_name}/step{step_num}-output.json",
            "phases/index.json",
        }

    def _classify_step_changes(self, step_num: int, step: dict, changed_paths: list[str]) -> tuple[list[str], list[str], list[str]]:
        allowed_patterns = step.get("allowed_paths", [])
        housekeeping_set = self._housekeeping_paths(step_num)
        allowed: list[str] = []
        housekeeping: list[str] = []
        disallowed: list[str] = []
        for path in changed_paths:
            rel = self._normalize_rel_path(path)
            if rel in housekeeping_set:
                housekeeping.append(rel)
            elif self._path_allowed(rel, allowed_patterns):
                allowed.append(rel)
            else:
                disallowed.append(rel)
        return allowed, housekeeping, disallowed
```

- [ ] **Step 4: Run targeted tests**

Run:

```powershell
python -m unittest scripts.test_execute -v
```

Expected: pass.

- [ ] **Step 5: Commit**

Run:

```powershell
git add scripts/test_execute.py scripts/execute.py
git commit -m "fix: classify execute step file changes"
```

## Task 6: Remove Broad Staging and Stage Explicit Paths Only

**Files:**
- Modify: `scripts/test_execute.py`
- Modify: `scripts/execute.py`

- [ ] **Step 1: Add failing test that commit path never uses broad staging**

Add this method:

```python
    def test_commit_step_stages_only_explicit_allowed_and_housekeeping_paths(self):
        execute = load_execute()
        step = {
            "step": 1,
            "name": "Docs",
            "status": "completed",
            "summary": "",
            "allowed_paths": ["docs/*.md"],
        }
        with tempfile.TemporaryDirectory() as tmp:
            root = Path(tmp)
            write_phase(root)
            executor = make_executor(execute, root)
            calls = []

            def fake_git(*args):
                calls.append(args)
                if args in {
                    ("diff", "--quiet", "--cached", "--"),
                    ("diff", "--cached", "--quiet"),
                }:
                    return subprocess.CompletedProcess(args, 1, "", "")
                return subprocess.CompletedProcess(args, 0, "", "")

            with patch.object(executor, "_changed_paths", return_value=["docs/PRD.md", "phases/0-mvp/index.json", "phases/0-mvp/step1-output.json"]):
                with patch.object(executor, "_run_git", side_effect=fake_git):
                    with patch.object(executor, "_validate_before_commit"):
                        executor._commit_step(step, "Docs")

            self.assertNotIn(("add", "-A"), calls)
            self.assertIn(("add", "--", "docs/PRD.md"), calls)
            self.assertIn(("add", "--", "phases/0-mvp/index.json", "phases/0-mvp/step1-output.json"), calls)
```

- [ ] **Step 2: Run tests and verify failure**

Run:

```powershell
python -m unittest scripts.test_execute -v
```

Expected: fail because `_commit_step()` still accepts `(step_num, step_name)` and uses broad staging.

- [ ] **Step 3: Implement explicit staging helper**

Add this method:

```python
    def _stage_paths(self, paths: list[str]):
        if not paths:
            return
        r = self._run_git("add", "--", *paths)
        if r.returncode != 0:
            print("  ERROR: git add failed.")
            print(f"  {r.stderr.strip()}")
            sys.exit(1)
```

Change `_commit_step` signature:

```python
    def _commit_step(self, step: dict, step_name: str):
        step_num = step["step"]
```

Replace the old broad staging logic with:

```python
        changed = self._changed_paths()
        allowed, housekeeping, disallowed = self._classify_step_changes(step_num, step, changed)
        if disallowed:
            print(f"  ERROR: Step {step_num} modified files outside allowed_paths:")
            for path in disallowed:
                print(f"    {path}")
            sys.exit(1)

        if allowed:
            msg = self.FEAT_MSG.format(phase=self._phase_name, num=step_num, name=step_name)
            self._validate_before_commit(msg)
            self._stage_paths(allowed)
            if self._run_git("diff", "--cached", "--quiet").returncode != 0:
                r = self._run_git("commit", "-m", msg)
                if r.returncode != 0:
                    print(f"  ERROR: 코드 커밋 실패: {r.stderr.strip()}")
                    sys.exit(1)
                print(f"  Commit: {msg}")

        if housekeeping:
            msg = self.CHORE_MSG.format(phase=self._phase_name, num=step_num)
            self._validate_before_commit(msg)
            self._stage_paths(housekeeping)
            if self._run_git("diff", "--cached", "--quiet").returncode != 0:
                r = self._run_git("commit", "-m", msg)
                if r.returncode != 0:
                    print(f"  ERROR: housekeeping 커밋 실패: {r.stderr.strip()}")
                    sys.exit(1)
```

Update caller:

```python
self._commit_step(step, step_name)
```

- [ ] **Step 4: Run targeted tests**

Run:

```powershell
python -m unittest scripts.test_execute -v
```

Expected: fail only if `_validate_before_commit()` is not implemented yet. If so, continue to Task 7 before expecting full pass.

## Task 7: Validation Before Commit

**Files:**
- Modify: `scripts/test_execute.py`
- Modify: `scripts/execute.py`

- [ ] **Step 1: Add failing tests for validation command behavior**

Add these methods:

```python
    def test_validate_before_commit_runs_python_selftest_then_workspace_validation(self):
        execute = load_execute()
        with tempfile.TemporaryDirectory() as tmp:
            root = Path(tmp)
            write_phase(root)
            executor = make_executor(execute, root)
            commands = []

            def fake_run(cmd, **kwargs):
                commands.append(cmd)
                return subprocess.CompletedProcess(cmd, 0, "ok", "")

            with patch.object(execute.subprocess, "run", side_effect=fake_run):
                executor._validate_before_commit("feat(0-mvp): step 1")

            self.assertEqual(
                commands,
                [
                    [sys.executable, "-m", "unittest", "discover", "-s", "scripts", "-p", "test_*.py"],
                    [sys.executable, "scripts/validate_workspace.py"],
                ],
            )

    def test_validate_before_commit_exits_before_commit_when_validation_fails(self):
        execute = load_execute()
        with tempfile.TemporaryDirectory() as tmp:
            root = Path(tmp)
            write_phase(root)
            executor = make_executor(execute, root)

            def fake_run(cmd, **kwargs):
                return subprocess.CompletedProcess(cmd, 1, "bad", "failed")

            with patch.object(execute.subprocess, "run", side_effect=fake_run):
                with self.assertRaises(SystemExit) as cm:
                    executor._validate_before_commit("feat(0-mvp): step 1")

            self.assertEqual(cm.exception.code, 1)
```

Add `import sys` to `scripts/test_execute.py`.

- [ ] **Step 2: Run tests and verify failure**

Run:

```powershell
python -m unittest scripts.test_execute -v
```

Expected: fail because `_validate_before_commit()` does not exist.

- [ ] **Step 3: Implement validation-before-commit**

Add this class constant near `MAX_RETRIES`:

```python
    VALIDATION_COMMANDS = (
        [sys.executable, "-m", "unittest", "discover", "-s", "scripts", "-p", "test_*.py"],
        [sys.executable, "scripts/validate_workspace.py"],
    )
```

Add this method:

```python
    def _validate_before_commit(self, commit_message: str):
        print(f"  Validation before commit: {commit_message}")
        for cmd in self.VALIDATION_COMMANDS:
            r = subprocess.run(cmd, cwd=self._root, capture_output=True, text=True)
            if r.returncode != 0:
                print(f"  ERROR: validation failed before commit: {' '.join(cmd)}")
                if r.stdout:
                    print(r.stdout[-2000:])
                if r.stderr:
                    print(r.stderr[-2000:])
                sys.exit(1)
```

- [ ] **Step 4: Run targeted tests**

Run:

```powershell
python -m unittest scripts.test_execute -v
```

Expected: pass.

- [ ] **Step 5: Commit**

Run:

```powershell
git add scripts/test_execute.py scripts/execute.py
git commit -m "fix: validate execute runner before commits"
```

## Task 8: Finalize Without Broad Staging

**Files:**
- Modify: `scripts/test_execute.py`
- Modify: `scripts/execute.py`

- [ ] **Step 1: Add failing test for final commit explicit staging**

Add this method:

```python
    def test_finalize_stages_only_phase_indexes(self):
        execute = load_execute()
        with tempfile.TemporaryDirectory() as tmp:
            root = Path(tmp)
            write_phase(root)
            (root / "phases" / "index.json").write_text('{"phases":[]}', encoding="utf-8")
            executor = make_executor(execute, root)
            calls = []

            def fake_git(*args):
                calls.append(args)
                if args == ("diff", "--cached", "--quiet"):
                    return subprocess.CompletedProcess(args, 1, "", "")
                return subprocess.CompletedProcess(args, 0, "", "")

            with patch.object(executor, "_run_git", side_effect=fake_git):
                with patch.object(executor, "_validate_before_commit"):
                    executor._finalize()

            self.assertNotIn(("add", "-A"), calls)
            self.assertIn(("add", "--", "phases/0-mvp/index.json", "phases/index.json"), calls)
```

- [ ] **Step 2: Run tests and verify failure**

Run:

```powershell
python -m unittest scripts.test_execute -v
```

Expected: fail because `_finalize()` still uses broad staging.

- [ ] **Step 3: Replace final staging**

In `_finalize()`, replace:

```python
        self._run_git("add", "-A")
```

with:

```python
        final_paths = [f"phases/{self._phase_dir_name}/index.json"]
        if self._top_index_file.exists():
            final_paths.append("phases/index.json")
        self._validate_before_commit(f"chore({self._phase_name}): mark phase completed")
        self._stage_paths(final_paths)
```

Keep the existing cached diff check and commit message, but change commit failure from silent no-op to fatal:

```python
            if r.returncode == 0:
                print(f"  ✓ {msg}")
            else:
                print(f"  ERROR: phase completion commit failed: {r.stderr.strip()}")
                sys.exit(1)
```

- [ ] **Step 4: Run targeted tests**

Run:

```powershell
python -m unittest scripts.test_execute -v
```

Expected: pass.

- [ ] **Step 5: Commit**

Run:

```powershell
git add scripts/test_execute.py scripts/execute.py
git commit -m "fix: stage execute finalization explicitly"
```

## Task 9: Documentation and Existing Contract Tests

**Files:**
- Modify: `AGENTS.md`
- Modify: `docs/ARCHITECTURE.md`
- Modify: `.codex/skills/harness-workflow/SKILL.md` only to remove stale legacy feat branch wording; do not use the skill.
- Modify: existing `scripts/test_*` files only if they assert stale branch or staging behavior.

- [ ] **Step 1: Update `AGENTS.md` runner rules**

Add these bullets under the Harness runner rules:

```markdown
- `scripts/execute.py`는 `codex/<phase-name>` branch prefix만 사용한다.
- `scripts/execute.py` 실행 전 worktree는 clean 상태여야 한다.
- phase step은 `allowed_paths`를 선언해야 하며 runner는 allowlist 밖 변경을 커밋하지 않는다.
- runner는 broad staging을 사용하지 않고 explicit path만 stage한다.
- runner가 만드는 모든 commit 전에는 Python self-test와 `python scripts/validate_workspace.py`가 통과해야 한다.
```

- [ ] **Step 2: Update architecture docs**

In `docs/ARCHITECTURE.md`, update the Harness execution layer section so it states:

```markdown
scripts/execute.py:
- creates/checks out `codex/<phase-name>`
- refuses to run on a dirty worktree
- requires per-step `allowed_paths`
- stages only explicit allowed paths and runner housekeeping files
- runs Python self-test and workspace validation before every runner-created commit
```

- [ ] **Step 3: Update stale harness workflow description without invoking the skill**

In `.codex/skills/harness-workflow/SKILL.md`, replace the sentence that says:

```markdown
`scripts/execute.py` creates or checks out the legacy feat task branch
```

with:

```markdown
`scripts/execute.py` creates or checks out `codex/{task-name}`
```

Also add one sentence that phase steps must provide `allowed_paths`.

- [ ] **Step 4: Run docs-related searches**

Run:

```powershell
rg -n "legacy-feat-branch|broad staging|allowed_paths|codex/" AGENTS.md docs .codex scripts
```

Expected:
- No stale `scripts/execute.py` branch docs saying the legacy feat task branch.
- No broad staging in `scripts/execute.py`.
- `allowed_paths` appears in runner docs and tests.

- [ ] **Step 5: Commit**

Run:

```powershell
git add AGENTS.md docs/ARCHITECTURE.md .codex/skills/harness-workflow/SKILL.md
git commit -m "docs: document safe execute runner contract"
```

## Task 10: Full Verification

**Files:**
- Read: all changed files

- [ ] **Step 1: Run Python self-tests**

Run:

```powershell
python -m unittest discover -s scripts -p "test_*.py"
```

Expected:

```text
OK
```

- [ ] **Step 2: Run workspace validation**

Run:

```powershell
python scripts/validate_workspace.py
```

Expected in current no-CMake workspace:

```text
No C++ validation commands configured.
Add CMakeLists.txt or set HARNESS_VALIDATION_COMMANDS.
```

Exit code must be `0`.

- [ ] **Step 3: Run whitespace check**

Run:

```powershell
git diff --check
```

Expected: exit code `0`. LF-to-CRLF warnings are acceptable if there are no whitespace error lines.

- [ ] **Step 4: Run safety searches**

Run:

```powershell
rg -n "legacy-feat-branch|broad staging" scripts/execute.py AGENTS.md docs .codex
```

Expected: no matches.

Run:

```powershell
rg -n "allowed_paths|codex/" scripts/execute.py scripts/test_execute.py AGENTS.md docs/ARCHITECTURE.md .codex/skills/harness-workflow/SKILL.md
```

Expected: matches showing the new branch prefix and allowlist contract.

- [ ] **Step 5: Final commit if verification-only edits were needed**

Run only if Task 10 caused additional edits:

```powershell
git add scripts/execute.py scripts/test_execute.py AGENTS.md docs/ARCHITECTURE.md .codex/skills/harness-workflow/SKILL.md
git commit -m "fix: harden execute runner safety checks"
```

## Acceptance Checklist

- `scripts/execute.py` contains no broad staging.
- `_branch_name()` returns a `codex/` branch name and `_checkout_branch()`/`_finalize()` both use it.
- `run()` calls `_assert_clean_worktree()` before `_checkout_branch()`.
- Every pending step must have a non-empty `allowed_paths`.
- Runner-managed phase index/output files are allowed as housekeeping, not mixed with code/doc step commits.
- Any path outside `allowed_paths` and housekeeping paths blocks commit and exits non-zero.
- `_validate_before_commit()` runs Python self-tests and `scripts/validate_workspace.py` before every runner-created commit.
- Validation failure exits before staging/committing.
- Commit failure exits non-zero.
- Documentation no longer describes `feat-` branches for `scripts/execute.py`.