# v1.7.1 Fixes Plan — Post-Audit Punch List **Status:** READY TO EXECUTE **Date:** 2026-05-17 **Source:** [v1.7.0 audit](v1.7.0-audit-2026-05-17.md) §10.1 + §10.2 **Branch:** continue on `v1.7.0-compound-vault` (local-only; no push) **Goal:** close the 1 BLOCKER + 6 HIGH findings; tag/push **`v1.7.1`** (not `v1.7.0`) when done **Estimated effort:** ~2.5 hours of focused work, 7 small commits --- ## Pre-flight (read once before starting) - Working tree state on resume: - Untracked: `docs/audits/v1.7.0-audit-2026-05-17.md`, `docs/audits/v1.7.1-fixes-plan.md` (this file), `scripts/baseline-v16.py`, `scripts/benchmark-runner.py` - Auto-committed during audit: `96a5505 wiki: auto-commit 2026-05-17 03:17` (the retrieval benchmark corpus at `wiki/meta/retrieval-benchmark-v1.7.md`) - Branch head: `4a362ed` (last v1.7 feat code commit) — wait, actually the auto-commit is on top, so head = `96a5505` - `make test` is green (7 suites, ~1162 assertions) - `bash bin/setup-retrieve.sh --no-llm` is already provisioned (`.vault-meta/chunks/`, `.vault-meta/bm25/`, `.vault-meta/embed-cache.json` exist locally; gitignored) - The PostToolUse hook will auto-commit any change under `wiki/`. Soft-reset after if you want the fix grouped with a chore commit. - After every fix: run `make test`. If green, commit. If red, do not commit until green. --- ## Fix 1 (BLOCKER) — `contextual-prefix.py` data-egress consent gap **Severity:** BLOCKER (must fix before any public push) **Source:** audit §3.2 commit `45a5bd3` cut #6; cross-confirmed by Agent 1 + Agent 3 **File:** [`scripts/contextual-prefix.py`](../../scripts/contextual-prefix.py) **Precedent to mirror:** [`scripts/tiling-check.py:351-352`](../../scripts/tiling-check.py) `--allow-remote-ollama` flag + localhost guard ### What's broken `pick_prefix_tier()` at line 252-258 reads: ```python def pick_prefix_tier(force_synthetic): if force_synthetic: return "synthetic" if os.environ.get("ANTHROPIC_API_KEY"): return "anthropic-api" # ← silently sends page bodies off-machine if shutil.which("claude"): return "claude-cli" # ← also off-machine, via subprocess return "synthetic" ``` A user with `ANTHROPIC_API_KEY` in their env (very common — CC users) runs `bash bin/setup-retrieve.sh` and gets their wiki page bodies streamed to `https://api.anthropic.com/v1/messages` with no prompt, no log warning, no opt-in. The existing precedent (tiling-check.py's `--allow-remote-ollama`) shows this codebase already has a pattern for explicit consent on data egress — contextual-prefix.py just didn't follow it. ### Changes 1. **`scripts/contextual-prefix.py`** — add `--allow-egress` CLI flag, default `False`. Without the flag, `pick_prefix_tier()` returns `"synthetic"` regardless of env vars or `claude` binary. Update the help text in the docstring (lines 11-19) to document the flag + the default. ```python # in main(), after the existing flags: parser.add_argument("--allow-egress", action="store_true", help="Allow tier-1 (Anthropic API) or tier-2 (claude CLI subprocess) " "prefix generation. Without this flag, page bodies stay on-machine " "and only the tier-3 synthetic prefix is used. Mirror of " "tiling-check.py's --allow-remote-ollama guard.") # in pick_prefix_tier(), add allow_egress parameter and guard early: def pick_prefix_tier(force_synthetic, allow_egress=False): if force_synthetic or not allow_egress: return "synthetic" if os.environ.get("ANTHROPIC_API_KEY"): return "anthropic-api" if shutil.which("claude"): return "claude-cli" return "synthetic" # at the process_page call site, pass through: tier = pick_prefix_tier(force_synthetic, allow_egress=args.allow_egress) ``` 2. **`bin/setup-retrieve.sh`** — when invoked WITHOUT `--no-llm`, prompt the user. Add right after the prefix-tier-picker block (around line 100): ```bash if ! $NO_LLM && ! $CHECK_ONLY; then case "$PREFIX_TIER" in *anthropic-api*|*claude-cli*) say "" say "⚠️ Stage 1 will send page bodies off-machine via the '$PREFIX_TIER' tier." say " Estimated egress: ~\$0 (claude-cli, free) to ~\$12 per 1,000 pages (Anthropic API)." say " Per-page bodies are POSTed to the LLM; check the Anthropic privacy policy." printf " Continue? [y/N]: " read -r reply case "$reply" in [yY]|[yY][eE][sS]) say "Proceeding with egress." ;; *) say "Aborted. Re-run with --no-llm for the synthetic-only path." ; exit 0 ;; esac ;; esac fi # Also: when contextual-prefix.py is invoked, pass --allow-egress through: ARGS=("--all") $NO_LLM && ARGS+=("--no-llm") ! $NO_LLM && ARGS+=("--allow-egress") # ← new line $REBUILD && ARGS+=("--rebuild") ``` 3. **`skills/wiki-retrieve/SKILL.md`** — add a Data Privacy callout near the top, right after the substrate note (bundle with H6 fix below): ```markdown ## Data privacy (v1.7.1+) Tier 1 (Anthropic API) and tier 2 (claude CLI subprocess) of the contextual-prefix generator send wiki page bodies off-machine. By default, both tiers are GUARDED behind: - `scripts/contextual-prefix.py --allow-egress` flag (default off → falls through to tier 3) - `bin/setup-retrieve.sh` consent prompt before any non-synthetic Stage 1 run To run fully on-machine (tier 3 synthetic prefix + local ollama rerank), use: `bash bin/setup-retrieve.sh --no-llm`. This is the default if you do not pass --allow-egress. The egress guard mirrors `scripts/tiling-check.py:351-352`'s `--allow-remote-ollama` precedent. ``` ### Verification ```bash # 1. Existing tests must still pass make test # 7 suites green # 2. New behavior: without --allow-egress, tier defaults to synthetic python3 scripts/contextual-prefix.py wiki/concepts/Hot\ Cache.md --peek # Expect log line: "tier=synthetic" # 3. With --allow-egress AND key set, tier picks api/cli ANTHROPIC_API_KEY=test python3 scripts/contextual-prefix.py wiki/concepts/Hot\ Cache.md --peek --allow-egress # Expect log line: "tier=anthropic-api" # 4. setup-retrieve.sh consent prompt fires when non-synthetic echo "" | bash bin/setup-retrieve.sh # press enter at the prompt # Expect: prompt appears, defaults to "no", script exits cleanly with "Aborted." ``` ### Commit message ``` fix(v1.7.1): contextual-prefix egress requires explicit consent Close the v1.7.0 audit BLOCKER (docs/audits/v1.7.0-audit-2026-05-17.md §3.2, §8.1 B1). pick_prefix_tier() was selecting tier-1 (Anthropic API) whenever ANTHROPIC_API_KEY was set in env — no flag, no prompt, no warning. Wiki page bodies streamed off-machine without user opt-in. Mirror the existing precedent at scripts/tiling-check.py:351-352 (--allow-remote-ollama + localhost-only default): - scripts/contextual-prefix.py: new --allow-egress flag (default off). pick_prefix_tier() now requires allow_egress=True to pick non-synthetic tiers. - bin/setup-retrieve.sh: prompts before any non-synthetic Stage 1 run; defaults to abort on empty/no input; passes --allow-egress through to the helper. - skills/wiki-retrieve/SKILL.md: Data Privacy callout at top of skill body. v1.6 vaults that never ran setup-retrieve.sh see no behavior change. v1.7 adopters who DID run it will be prompted on their next refresh. Co-Authored-By: Claude Opus 4.7 (1M context) ``` --- ## Fix 2 (HIGH H1) — `bin/setup-retrieve.sh` no rollback if Stage 1 fails partway **Severity:** HIGH **File:** [`bin/setup-retrieve.sh`](../../bin/setup-retrieve.sh) ### What's broken `bin/setup-retrieve.sh:128-140` runs `python3 scripts/contextual-prefix.py --all` and continues to Stage 2 regardless of return code. If Stage 1 fails on chunk 31 of 47, partial `.vault-meta/chunks/` is left, Stage 2 builds a stale index, the user has no documented recovery path. ### Changes Add explicit exit-code check + recovery hint: ```bash # Replace the existing Stage 1 invocation block with: say "" say "═══ Stage 1/2: chunking + contextual-prefix generation ═══" ARGS=("--all") $NO_LLM && ARGS+=("--no-llm") ! $NO_LLM && ARGS+=("--allow-egress") # from Fix 1 $REBUILD && ARGS+=("--rebuild") if ! python3 "$VAULT/scripts/contextual-prefix.py" "${ARGS[@]}"; then STAGE1_RC=$? warn "Stage 1 failed (rc=$STAGE1_RC). Partial chunks are at:" warn " $META/chunks/" warn "Recovery options:" warn " 1. Re-run setup-retrieve.sh — incremental skip will resume from body_hash" warn " 2. Wipe and start over: rm -rf $META/chunks/ && bash bin/setup-retrieve.sh" warn " 3. Provision the failed page only: python3 scripts/contextual-prefix.py wiki/.md --rebuild" exit 5 fi ``` ### Verification ```bash # Force a failure: invalid path that the script will try to read echo "" | python3 scripts/contextual-prefix.py /nonexistent.md # Then run setup, expect the failure-mode warning bash bin/setup-retrieve.sh --no-llm # Hand-verify recovery message appears with the three options ``` ### Commit message ``` fix(v1.7.1): setup-retrieve rollback path on Stage 1 failure Close audit H1. bin/setup-retrieve.sh ignored Stage 1's exit code and proceeded to Stage 2 with partial chunks. Now exits 5 on Stage 1 failure with a 3-option recovery hint (incremental resume, full wipe, single-page re-process). ``` --- ## Fix 3 (HIGH H2) — `make clean-test-state` doesn't remove v1.7 artifacts **Severity:** HIGH (hygiene; test artifacts persist) **File:** [`Makefile`](../../Makefile) ### What's broken `Makefile:55-61` `clean-test-state` target removes v1.6 lockfiles + tiling cache + bm25 lock + embed-cache, but does NOT remove the v1.7 artifacts that `bin/setup-retrieve.sh` provisions: `.vault-meta/chunks/`, `.vault-meta/bm25/`, `.vault-meta/locks/`, `.vault-meta/transport.json`. The gitignore (commit `753fc8a`) correctly lists these as runtime artifacts; the Makefile target should match. ### Changes ```makefile clean-test-state: @rm -f .vault-meta/.address.lock .vault-meta/.tiling.lock .vault-meta/.bm25.lock \ .vault-meta/.embed-cache.lock .vault-meta/.wiki-lock.meta \ .vault-meta/tiling-cache.json \ .vault-meta/tiling-cache.*.tmp .vault-meta/embed-cache.json \ .vault-meta/embed-cache.*.tmp .vault-meta/transport.json \ .vault-meta/transport.*.tmp @rm -rf .vault-meta/chunks/ .vault-meta/bm25/ .vault-meta/locks/ @echo "Runtime lockfiles, caches, and v1.7 retrieval/lock artifacts removed." ``` ### Verification ```bash bash bin/setup-retrieve.sh --no-llm # provision artifacts test -d .vault-meta/chunks && echo "chunks exist" # should print test -f .vault-meta/bm25/index.json && echo "bm25 exists" # should print make clean-test-state # remove test -d .vault-meta/chunks || echo "chunks gone" # should print test -f .vault-meta/bm25/index.json || echo "bm25 gone" # should print make test # still green ``` ### Commit message ``` fix(v1.7.1): clean-test-state removes v1.7 retrieval artifacts Close audit H2. The gitignore (753fc8a) listed chunks/, bm25/, locks/, transport.json as regenerable artifacts but make clean-test-state only removed v1.6 caches. Extend to match the gitignore set. ``` --- ## Fix 4 (HIGH H3) — PostToolUse hook `|| true` swallows lock-check errors **Severity:** HIGH (safety property silently degrades on any check failure) **File:** [`hooks/hooks.json`](../../hooks/hooks.json) ### What's broken `hooks/hooks.json:34-37` PostToolUse command is one long pipeline ending in `|| true`. If `bash scripts/wiki-lock.sh list` errors (permission denied on `.vault-meta/.wiki-lock.meta`, missing script, etc.), the `||true` swallows it and `git add/commit` still fires. The intended "defer commit while locks held" property silently degrades to "always commit." ### Changes Restructure to test the lock-list exit code explicitly: ```json { "matcher": "Write|Edit", "hooks": [ { "type": "command", "command": "[ -d .git ] || exit 0; if [ -x scripts/wiki-lock.sh ]; then LOCK_COUNT=$(bash scripts/wiki-lock.sh list 2>/dev/null | wc -l | tr -d ' '); LOCK_RC=$?; if [ \"$LOCK_RC\" != \"0\" ]; then exit 0; fi; if [ \"$LOCK_COUNT\" != \"0\" ]; then exit 0; fi; fi; git add wiki/ .raw/ .vault-meta/ 2>/dev/null && (git diff --cached --quiet || git commit -m \"wiki: auto-commit $(date '+%Y-%m-%d %H:%M')\" 2>/dev/null) || true" } ] } ``` Key changes: - `[ -d .git ] || exit 0` — early bail if no git - `LOCK_RC=$?` captures the wiki-lock script's exit code separately - If the lock check errored, EXIT 0 (defer commit; do not auto-commit on unknown state) - The final `|| true` now only covers the git commands themselves ### Verification ```bash # Manual: temporarily chmod -x scripts/wiki-lock.sh; trigger a Write tool; # verify the hook does NOT auto-commit (the `[ -x ... ]` test fails harmlessly) # Restore: chmod +x scripts/wiki-lock.sh # Manual: hold a lock, trigger a Write, verify NO commit fires bash scripts/wiki-lock.sh acquire wiki/concepts/Test.md # (then trigger a Write tool via Claude — verify no "wiki: auto-commit" in git log) bash scripts/wiki-lock.sh release wiki/concepts/Test.md ``` ### Commit message ``` fix(v1.7.1): PostToolUse hook tests lock-check exit code explicitly Close audit H3. The hook's || true terminator swallowed errors from `wiki-lock list`. If the lock check failed for any reason (permission, missing script, deleted meta-lock), the safety property silently degraded to "always commit." Restructure: capture LOCK_RC separately; on non-zero, exit 0 (defer commit on unknown state) instead of falling through. ``` --- ## Fix 5 (HIGH H5) — `detect-transport.sh` JSON escaping **Severity:** HIGH (theoretical; obsidian-cli unlikely to emit malicious version strings, but defense in depth) **File:** [`scripts/detect-transport.sh`](../../scripts/detect-transport.sh) ### What's broken `scripts/detect-transport.sh:79,86` substitutes `obsidian-cli --version` output directly into the JSON via shell variable expansion. Only `tr -d '"'` is applied. Newlines, backslashes, control chars would break the JSON. ### Changes Wrap the version capture in a Python json-escape pass. Add this helper at the top of the script (after the var defs): ```bash json_escape() { python3 -c 'import json,sys; print(json.dumps(sys.stdin.read().strip()), end="")' } ``` Then change lines 79 and 86 from: ```bash CLI_VERSION="$(obsidian-cli --version 2>/dev/null | head -1 | tr -d '"' || echo unknown)" ``` to: ```bash CLI_VERSION="$(obsidian-cli --version 2>/dev/null | head -1 | json_escape || echo '"unknown"')" ``` Note: `json_escape` outputs an already-quoted JSON string, so the `snapshot()` heredoc needs to drop the quotes around `${CLI_VERSION}`: ```bash # in snapshot(), change: "version_string": "${CLI_VERSION}", # to: "version_string": ${CLI_VERSION}, ``` ### Verification ```bash bash scripts/detect-transport.sh --peek --quiet | python3 -c 'import json,sys; json.load(sys.stdin); print("OK")' # Expect "OK" # Synthetic test: rig the function to receive a JSON-breaking value echo 'Obsidian "1.12"' | python3 -c 'import json,sys; print(json.dumps(sys.stdin.read().strip()))' # Expect: "Obsidian \"1.12\"" ``` ### Commit message ``` fix(v1.7.1): detect-transport.sh escapes version_string as JSON Close audit H5. CLI_VERSION came from `obsidian-cli --version` and was embedded in JSON via shell variable expansion with only `tr -d '"'` applied. Backslashes, newlines, control chars would break the output. Pipe through `python3 json.dumps` (json_escape() helper). ``` --- ## Fix 6 (HIGH H6) — `skills/wiki-retrieve/SKILL.md` no Data Privacy callout **Severity:** HIGH (transparency; documents the egress posture) **File:** [`skills/wiki-retrieve/SKILL.md`](../../skills/wiki-retrieve/SKILL.md) ### What's broken The skill frontmatter description doesn't mention that tier-1/tier-2 contextual prefix sends page bodies off-machine. The architecture section implies it; the user-facing description doesn't surface it. ### Changes **This fix is BUNDLED with Fix 1** — the Data Privacy callout in §"Data privacy (v1.7.1+)" added by Fix 1 closes this. No separate commit needed; the Fix 1 commit message references "closes H6" already. Alternatively, if the bundled callout from Fix 1 wasn't sufficient, expand the frontmatter description to add a privacy phrase: ```yaml description: "Hybrid retrieval primitive ... [existing text] ... Triggers on: retrieve, hybrid retrieval, BM25, rerank, contextual retrieval, search the chunks, ... DATA PRIVACY: tier-1/tier-2 contextual prefix sends page bodies to LLM (Anthropic API / claude CLI) and requires --allow-egress flag opt-in; tier-3 (default) keeps all data on-machine." ``` --- ## Fix 7 (HIGH H4) — Process gap: no verifier-agent pass **Severity:** HIGH (process; produced the v1.7 BLOCKER) **File:** new doc + agents/ folder update ### What's broken During v1.7 development, code went straight from worker to commit without a separate verifier-agent pass. This audit IS the missing verifier — but doing it post-commit means findings become patches instead of pre-merge fixes. The BLOCKER (B1) would have been caught by any security-focused review before §3.3 was committed. ### Changes Create `agents/verifier.md` as a dispatched-on-demand specialist that workstream owners can call before commit. Mirrors the `superpowers:verification-before-completion` skill's checklist. ```markdown # verifier — Pre-Commit Audit Specialist You are a verifier agent. Your job is to find issues a worker just missed, BEFORE they commit. Apply the /best-practices six-cut kernel to the staged diff. ## When invoked After a worker has staged changes for a workstream but BEFORE the commit. Workflow: 1. Worker: `git add ; git status` shows staged changes 2. Dispatch this agent with the workstream context 3. Agent reads every staged file + the precedent files it touches 4. Agent applies six-cut + agent-kernel checks 5. Agent returns findings in 4 tiers (BLOCKER / HIGH / MEDIUM / LOW) 6. Worker addresses BLOCKER + HIGH before commit; MEDIUM/LOW become follow-ups ## Six-cut checklist (verify EACH cut) [content matches the audit's §3 six-cut framework — copy from docs/audits/v1.7.0-audit-2026-05-17.md §3] ## Specifically check for in EVERY workstream - **Data egress**: any new outbound network call, subprocess, or file write outside the vault root. If yes: is there a user opt-in checkpoint? Compare to existing `--allow-remote-ollama` / `--allow-egress` precedents. - **Atomic operations**: any file write that could be interrupted mid-stream. If yes: is there a temp + rename, or other atomicity guarantee? - **Failure-mode rollback**: any multi-step operation. If yes: is there a documented recovery path for partial completion? - **Hermetic test coverage**: any new code path. If yes: is there a test that exercises it without network/LLM? ## Output Report under 800 words. Findings in tiered list with file:line citations + recommended fix. ``` ### Verification This is a process change, not a code change. Verify by: - Confirm `agents/verifier.md` exists - Reference it from `CLAUDE.md` "How to Use" section as the recommended pre-commit step - Document in next release notes (v1.7.1 changelog entry) so future workstreams know to use it ### Commit message ``` docs(v1.7.1): add verifier-agent specialist for pre-commit audits Close audit H4. The v1.7 development cycle had no verifier-agent pass at workstream gates — code went straight from worker to commit. The audit itself filled that role post-hoc, which is why the BLOCKER (B1) became a v1.7.1 patch instead of a pre-merge fix. New agents/verifier.md documents the on-demand specialist + the six-cut checklist a workstream owner should dispatch BEFORE commit. CLAUDE.md references it as the recommended pre-commit step. No code changes; process change only. ``` --- ## Sequencing recommendation Order the fixes by dependency + grouping: 1. **Fix 1 (BLOCKER)** + **Fix 6 (HIGH H6)** — bundle as ONE commit since the Data Privacy callout is part of Fix 1's changes anyway 2. **Fix 2 (HIGH H1)** — independent; small 3. **Fix 3 (HIGH H2)** — independent; tiny 4. **Fix 4 (HIGH H3)** — independent 5. **Fix 5 (HIGH H5)** — independent 6. **Fix 7 (HIGH H4)** — doc-only, last Total: **6 commits** (one per fix, Fix 1+6 combined). Each followed by `make test`. If anything red, stop and diagnose. --- ## Post-fix steps After all 6 commits land: 1. Run `make test` one final time. 7 suites green is the gate. 2. Run `bash bin/setup-retrieve.sh --no-llm` end-to-end to verify the retrieval pipeline still provisions cleanly post-fixes. 3. Update [`docs/audits/v1.7.0-audit-2026-05-17.md`](v1.7.0-audit-2026-05-17.md) §10.2 to mark each H1-H6 as "FIXED in v1.7.1 commit ``" with the actual SHAs. 4. Update [`wiki/hot.md`](../../wiki/hot.md) "Last Updated" section to reflect v1.7.1. 5. Update [`CHANGELOG.md`](../../CHANGELOG.md) — add a v1.7.1 entry that references the audit + lists the 7 fixes. 6. Update [`.claude-plugin/plugin.json`](../../.claude-plugin/plugin.json) + [`.claude-plugin/marketplace.json`](../../.claude-plugin/marketplace.json) — version bump 1.7.0 → 1.7.1. 7. **Then** ask the user whether to push + tag `v1.7.1`. Do not push without explicit go. --- ## What's NOT in this plan - MEDIUM findings (M1-M14) — file as v1.7.x issues; address opportunistically - LOW findings (L1-L10) — bundle into a "polish PR" before v1.8 - v1.8 (methodology modes), v1.9 (multimodal ingest), v2.0 (derive), v2.5+ (GUI shell) — separate work; see audit §9 verdict + §10 punch list --- ## Files this plan modifies (summary) | File | Reason | Fix # | |---|---|---| | `scripts/contextual-prefix.py` | --allow-egress flag, pick_prefix_tier guard | 1 | | `bin/setup-retrieve.sh` | egress consent prompt + flag pass-through + rollback on Stage 1 failure | 1, 2 | | `skills/wiki-retrieve/SKILL.md` | Data Privacy callout | 1 (H6 bundled) | | `Makefile` | clean-test-state extension | 3 | | `hooks/hooks.json` | PostToolUse explicit lock-check exit handling | 4 | | `scripts/detect-transport.sh` | json_escape helper + apply to CLI_VERSION | 5 | | `agents/verifier.md` | NEW — pre-commit specialist | 7 | | `CLAUDE.md` | reference verifier-agent in recommended workflow | 7 | | `docs/audits/v1.7.0-audit-2026-05-17.md` | mark fixes as FIXED with SHAs | post-fix step 3 | | `wiki/hot.md` | v1.7.1 state update | post-fix step 4 | | `CHANGELOG.md` | v1.7.1 entry | post-fix step 5 | | `.claude-plugin/{plugin,marketplace}.json` | 1.7.0 → 1.7.1 | post-fix step 6 | --- ## Resumption hint (for post-compact me) Quick state recovery on next session: ```bash cd ~/Desktop/claude-obsidian git log --oneline main..HEAD | head -10 # see all v1.7 commits + auto-commits git status --short # confirm working tree make test # 7 suites should be green cat docs/audits/v1.7.1-fixes-plan.md # this file — your roadmap cat docs/audits/v1.7.0-audit-2026-05-17.md | head -40 # exec verdict reminder ``` Then execute Fix 1 first. The plan is sequenced; just walk it top to bottom.