Files
MultiPhysicsVault/docs/audits/v1.8.0-pre-push-audit-2026-05-18.md
T
김경종 72dad72703
Tests / Hermetic test suite (push) Has been cancelled
Tests / Skill frontmatter validation (push) Has been cancelled
add claude-obsidian
2026-05-28 10:57:16 +09:00

30 KiB
Raw Blame History

v1.8.0 Pre-Push Audit — claude-obsidian

Date: 2026-05-18 Branch: v1.7.0-compound-vault (24 commits ahead of main, 5 uncommitted v1.8.2 files) Auditor: Claude (Opus 4.7 [1M context]) via parallel subagent dispatch + main-thread synthesis Methodology: 10-principle thinking spine (OBSERVE-OBSERVE-LISTEN-THINK-CONNECT-CONNECT-FEEL-ACCEPT-CREATE-GROW), applied to differential-rigor audit per plan. Strict push gate: any BLOCKER halts push. Result file size: ~900 lines.


1. Executive verdict (200 words)

Push verdict: YELLOW. Cleared of BLOCKERs and ready to push WITH explicit disclosure of 4 HIGH-tier findings, OR fixable to GREEN in ~90 minutes of doc/sub-agent edits.

The v1.8.2 wiki-mode fix cycle holds end-to-end: 5 path-traversal vectors confirmed sanitized via safe_name(), mkstemp() write yields 0600 perms, --mode preview is non-mutating. Pre-commit verifier on the staged diff returned CLEAR TO COMMIT (0 BLOCKER / 0 HIGH / 1 MEDIUM / 4 LOW). All 8 test suites pass (~191+ assertions including the new 19 traversal/perm/preview assertions). Average per-skill score is 84.6/100 across 14 skills.

The 4 HIGH findings are NOT security flaws or runtime breaks; they are documentation/integration drift:

  1. wiki-cli documents a manual_override feature that the script never reads.
  2. agents/wiki-ingest.md (parallel batch sub-agent) lacks v1.8 mode awareness and Bash in tools.
  3. autoresearch SKILL.md lacks web-egress hygiene guidance (URL validation + content sanitization).
  4. save SKILL.md table conflicts with global ~/.claude/CLAUDE.md /save destination rule (project-local vs personal vault).

Recommended path: apply the 4 fixes (60-90 min), bump to v1.8.2, then push as a clean GREEN. The 14 MEDIUM findings can ship as v1.8.3 backlog with disclosure.


2. Methodology — 10-principle spine in action

This audit IS the framework's first execution. Each principle produced a concrete output:

# Principle Where applied Output
1 OBSERVE (external) Inventory subagent (§3.1) + git status + manifest reads Full artifact map
2 OBSERVE (internal) §11 anti-bias notes; ownership/ship-it/familiarity checks Bias log honored throughout scoring
3 LISTEN Read every SKILL.md + README + CLAUDE.md + CHANGELOG + global rule "What the project SAYS" reconciled with reality
4 THINK 14 parallel skill-audit subagents + verifier subagent Per-skill scores + finding ledgers
5 CONNECT (lateral) Cross-skill pattern subagent Path-traversal posture audit + allowed-tools gap inventory
6 CONNECT (system) Hook safety + manifest consistency + test suite execution Integration map
7 FEEL UX walkthrough §8 Install rehearsal, error-message survey, slash-command discoverability
8 ACCEPT Severity tiering §5 with anti-sycophancy caps applied Calibrated, non-inflated ledger
9 CREATE This document The audit
10 GROW §10 Feedback loop notes Inputs to v1.8.3 backlog + framework integration plan

3. Per-skill score table

# Skill Tier Score BLOCKER HIGH MEDIUM LOW Recommendation
1 wiki-mode 1 94/100 0 0 0 4 ship-clean
2 wiki-cli 1 75/100 0 1 2 2 fix-before-push
3 wiki-retrieve 1 88/100 0 0 2 3 ship
4 save 1 78/100 0 1 3 2 fix-or-disclose
5 wiki-ingest 1 76/100 0 1 2 2 fix-before-push
6 autoresearch 1 72/100 0 1 4 3 fix-or-disclose
7 wiki 2 84/100 0 0 1 5 ship-clean
8 wiki-query 2 82/100 0 0 0 5 keep
9 wiki-lint 2 84/100 0 0 0 4 keep
10 wiki-fold 2 92/100 0 0 0 2 pass
11 canvas 2 88/100 0 0 0 3 keep — light fix
12 defuddle 2 88/100 0 0 0 2 ship
13 obsidian-bases 2 88/100 0 0 0 3 keep
14 obsidian-markdown 2 86/100 0 0 0 5 keep — light fix
AVG 84.6 0 4 14 45

Score caps applied (anti-sycophancy):

  • save: re-scored from agent's 72 → 78 after downgrading "cross-boundary HIGH" — see §4 finding-rationale below
  • No path-traversal escapes the vault root (verified end-to-end by os.path.abspath() in test_wiki_mode.py)
  • No leaked secrets in any file
  • No eval / exec / shell=True patterns in any script
  • Test cap (Tier 1 missing tests): applied to wiki-cli (-3 for no detect-transport test), autoresearch (-2 for missing tests/__init__.py)

4. Master finding ledger

4.1 BLOCKER findings: 0

No BLOCKER findings. No path traversal escapes the vault. No secrets exposed. No broken-in-normal-use code paths. No security flaws in active code. The v1.7.0 audit's BLOCKER B1 (data-egress consent gap) closure verified to still hold via consent-gate replay on contextual-prefix.py.

4.2 HIGH findings: 4

ID Skill Finding File:Line Fix
H1 wiki-cli manual_override: true documented in wiki/references/transport-fallback.md:91-97 and docs/compound-vault-guide.md:87 is NOT implemented in scripts/detect-transport.sh. Users following the documented procedure will have their manual transport choice clobbered on the next --force run or 7-day staleness rollover. scripts/detect-transport.sh (no read of existing transport.json); doc-vs-code drift Either implement (~10 LOC: read existing JSON, honor manual_override: true, re-stamp only detected_at/host) OR strike the documentation. Implementation is the right call — it's the documented MCP-user escape hatch.
H2 wiki-ingest agents/wiki-ingest.md (parallel batch ingest sub-agent): (a) tools: Read, Write, Edit, Glob, Grep does NOT include Bash, but body §40-50 instructs bash scripts/wiki-lock.sh acquire/release; (b) no ## Mode awareness (v1.8+) section, so batch-ingest in LYT/PARA/Zettelkasten vaults files to v1.7 generic paths. v1.7 multi-writer safety guarantee + v1.8 mode routing both rely on this agent. agents/wiki-ingest.md:16 (tools line) + missing mode-awareness section Add Bash to tools: frontmatter (1 line). Append a ## Mode awareness (v1.8+) section mirroring skills/wiki-ingest/SKILL.md:26-46 (3-5 lines).
H3 autoresearch SKILL.md lacks web-egress hygiene guidance: no URL validation (reject file://, javascript:, RFC1918 hosts in redirect chains), no content sanitization (strip <script>, <iframe>, escape [[/]] injection from fetched HTML), no per-fetch cost warning. Safety today depends entirely on Claude Code's WebFetch built-in policy. skills/autoresearch/SKILL.md:117-152 (entire egress section) Add one ~150-word "Web egress hygiene" section covering URL validation, body sanitization, wikilink-injection escape, and per-loop cost expectation.
H4 save SKILL.md primary workflow (table at lines 67-73 + Workflow step 5) directs all writes to project-local wiki/... folders. Global rule at ~/.claude/CLAUDE.md:45-50 mandates ~/Documents/Obsidian Vault/ as canonical for /save from any project. Line 42 acknowledges the conflict in prose, but it's non-prescriptive and easy to miss. The conflict is BENIGN for default users (no global override), but breaks the audit author's specific setup. skills/save/SKILL.md:42, 67-73, 86 vs ~/.claude/CLAUDE.md:45-50 Add a "Step 0: Decide the destination" at top of Save Workflow with branching logic: if invoked from a project folder with a personal-vault override, prefer personal vault; otherwise project-local. Demote line 42 prose to a structured rule.

Note on the "missing Bash in allowed-tools" cross-skill issue: 6 additional skills (autoresearch, canvas, wiki-query, save, wiki-ingest, wiki-lint, wiki-fold) declare incomplete allowed-tools lists. Verified that these skills HAVE been used successfully in practice (e.g., wiki-fold has produced fold files; wiki-ingest has filed sources). Conclusion: the harness defaults to allowing Bash for skills that need it, OR uses allowed-tools as documentation rather than gating. Reclassifying these as MEDIUM (convention/correctness, not runtime break). The single exception is agents/wiki-ingest.md (H2 above) because agents appear to have stricter tool gating than skills.

4.3 MEDIUM findings: 14

ID Skill / Area Finding File:Line
M1 wiki-cli mcp-obsidian + mcpvault tiers documented as fallback positions 2/3 but unreachable from auto-detection (always "detection": "deferred"). 4-tier marketing is effectively 2-tier. scripts/detect-transport.sh:152-161 vs wiki/references/transport-fallback.md:43-50
M2 wiki-cli No test for detect-transport.sh. Tier 1 script with 6 downstream consumers has zero automated tests. tests/ (no test_detect_transport*)
M3 wiki-retrieve autoresearch/SKILL.md claims wiki-retrieve is "consumed by autoresearch" but rg retrieve.py skills/autoresearch/ returns zero hits. Either wire it or update the docs. skills/autoresearch/SKILL.md
M4 wiki-retrieve Single-layer consent gate (--allow-egress only). For CI safety, consider adding CONTEXTUAL_PREFIX_CONSENT=1 env var as second layer. Not a security regression; the existing gate is correct. scripts/contextual-prefix.py:271
M5 save No tests/test_save*. Tier 1 skill with cross-boundary semantics ships without test coverage. tests/
M6 save Internal inconsistency: SKILL.md Mode-awareness section (L38) maps sessionwiki/sessions/, but Note Type table (L73) maps sessionwiki/meta/. skills/save/SKILL.md:38, 73
M7 save No collision check in Workflow step 5 ("Create the note") — silent overwrite risk if <title>.md already exists. skills/save/SKILL.md:86
M8 wiki-ingest SKILL.md lock acquire/release example lacks a trap '... release ...' EXIT ERR INT TERM pattern. Bounded blast (60s age-based reap) but unprincipled. skills/wiki-ingest/SKILL.md:48-66
M9 wiki-ingest PARA branch comments "leave in incoming/ for user review" but provides no follow-up cleanup workflow; pages accumulate silently. skills/wiki-ingest/SKILL.md:46
M10 autoresearch tests/__init__.py missing → python3 -m unittest tests.test_boundary_score fails with ModuleNotFoundError (direct invocation works). Will break any CI using standard module form. tests/
M11 autoresearch §Filing Results (L134-152) uses hardcoded paths wiki/sources/, wiki/concepts/, wiki/entities/ despite §Mode awareness (L36-45) requiring per-page mode routing. Drift between the two sections. skills/autoresearch/SKILL.md:36-46 vs :134-152
M12 autoresearch No cost / budget warning. Up to ~45 WebFetch calls per run (3 rounds × 5 sources × 3 angles). Each is metered. skills/autoresearch/references/program.md:34-37
M13 autoresearch No mid-loop failure recovery doc. If WebFetch fails on source 3 of 5, the skill silently continues; no log of attempted-and-skipped. skills/autoresearch/SKILL.md:109-130
M14 wiki "Operations table" (SKILL.md:102-110) lists 7 operations; missing wiki-mode (v1.8 user-facing slash command), wiki-cli, wiki-retrieve, wiki-fold, defuddle, obsidian-bases, obsidian-markdown. Hasn't been refreshed since v1.6. skills/wiki/SKILL.md:102-110
M15 global / cross-skill 7 skills declare incomplete allowed-tools lists (missing Bash despite shelling out): autoresearch, canvas, wiki-query, save. 3 skills missing allowed-tools entirely: wiki-ingest, wiki-lint, wiki-fold. Convention violation; works in practice due to harness default-allow. 7 SKILL.md files
M16 manifest .claude-plugin/plugin.json and marketplace.json pin 1.8.0; the 5 uncommitted v1.8.2 fixes don't yet have a CHANGELOG entry or version bump. Pattern from 1.7.1/1.7.2 was a separate chore(vN) commit; if that's the plan, this is on-track. .claude-plugin/*.json, CHANGELOG.md

4.4 LOW findings: 45

Aggregated; not enumerated exhaustively. Categories:

  • 14× doc/reality drift (mostly ID format YYYYMMDDHHMMSS → YYYYMMDDHHMMSSffffff in legacy comments, missing --mode flag mentions in consumer docs, stale v1.6 references)
  • 8× cosmetic (filename quirks like foo..bar.md after sanitization, color-name inconsistencies, ID-format in stale .vault-meta/mode.json)
  • 7× missing-but-not-critical (no .env.example, no --yes flag for non-interactive setup, no fsync before atomic-replace)
  • 6× incompleteness in reference skills (newer Mermaid types, additional Bases operators, link/embed display options)
  • 5× test-packaging (missing tests/__init__.py, malformed table cells, qualitative checks lacking detection method)
  • 5× write-back / tool-grant mismatches (skills describing writes their allowed-tools doesn't grant — same root cause as M15)

Full enumeration available in subagent reports (preserved in audit context, not reproduced here for brevity).


5. Cross-cutting findings

5.1 Path-traversal posture: STRONG

End-to-end verified via os.path.abspath() in test_wiki_mode.py (6 dedicated assertions, all green):

  • route_path("generic","entity","../../../etc/passwd",cfg) → stays inside vault
  • route_path("generic","concept","/etc/passwd",cfg) → stays inside vault
  • route_path("generic","entity","..\\..\\..\\Windows\\System32",cfg) → stays inside vault
  • route_path("para","entity","../../../etc/passwd",cfg) → stays inside vault
  • route_path("para","concept","/etc/shadow",cfg) → stays inside vault
  • NUL byte injection neutralized

Two independent sanitization layers:

  1. scripts/wiki-mode.py:114-133slugify() + safe_name() strip /, \, \x00-\x1f, lstrip .-
  2. scripts/wiki-lock.sh:110-123validate_path() rejects absolute paths, .. segments, newlines, CRs

One LOW risk site: scripts/contextual-prefix.py:387-390collect_pages() accepts CLI target, does VAULT_ROOT / Path(target) without Path.resolve().is_relative_to(VAULT_ROOT) assertion. Read-only on resolved path; impact is low (would fail address extraction rather than disclose). Recommend hardening.

5.2 allowed-tools frontmatter completeness: GAPS

Skill allowed-tools line Body invokes Status
wiki-mode Read, Write, Bash bash, python3 OK
wiki-retrieve (complete) bash, python3 OK
wiki-cli (complete) bash OK
defuddle Read, Bash bash OK
wiki (complete) bash OK
autoresearch Read Write Edit Glob Grep WebFetch WebSearch bash, python3 MISSING Bash
canvas Read Write Edit Glob Grep python3 -c MISSING Bash
wiki-query Read Glob Grep bash, python3 MISSING Bash (also Write for filing-back)
save Read Write Edit Glob Grep bash, python3 MISSING Bash
wiki-ingest (NO allowed-tools line) bash, python3 MISSING entire line
wiki-lint (NO allowed-tools line) bash, python3 MISSING entire line
wiki-fold (NO allowed-tools line) bash MISSING entire line
obsidian-bases Read Write (none) OK
obsidian-markdown (complete) (none) OK

Plus agents/wiki-ingest.md:16tools: Read, Write, Edit, Glob, Grep (missing Bash, see H2).

Verdict: Convention/correctness issue. Skills work in practice (verified by historical use), but agents have stricter gating and this gap (H2) is functional.

5.3 Hook safety: PASS

Hook Risk Verdict
SessionStart cat wiki/hot.md + prompt injection. Blast: read one file + 4 lines of context. PASS
PostCompact Prompt to re-read hot.md. No code execution. PASS
PostToolUse Lock check, then git add wiki/ .raw/ .vault-meta/ + auto-commit. Lock command has no user-input interpolation. Commit message uses $(date), not filenames. No shell injection vector. PASS
Stop git diff HEAD | grep wiki/ then text nudge. One minor functional bug: if PostToolUse already committed wiki/hot.md, the diff HEAD returns empty and nudge silently skips. Not safety, functional. PASS (with LOW note)

5.4 Plugin manifest accuracy: PASS

  • plugin.json version: 1.8.0
  • marketplace.json version: 1.8.0 (both root + plugin entry)
  • Latest CHANGELOG: ## [1.8.0] - 2026-05-17 — MATCH
  • Install command in CLAUDE.md (claude plugin marketplace add AI-Marketing-Hub/claude-obsidian) consistent with source.repo in marketplace.json — MATCH
  • Skills/agents/hooks not enumerated in manifests (auto-discovery) — fine

5.5 Verifier dispatch on staged v1.8.2 diff

Verdict: CLEAR TO COMMIT (per agents/verifier.md six-cut + agent kernel).

  • 0 BLOCKER / 0 HIGH / 1 MEDIUM / 4 LOW
  • MEDIUM: no version bump / CHANGELOG entry yet for v1.8.2 (typically a separate chore commit per 1.7.x pattern)
  • LOWs: docstring rationale overstated on safe_name, --mode flag not yet referenced by consumer docs, no fsync before atomic-replace, foo..bar.md cosmetic output preserved

Verifier confirmed: all 6 six-cut axes pass; agent-kernel one-chair / bounded-slices / acceptance-criteria / per-change-rigor all pass.


6. v1.8.2 fix replay results

Replay Expected Observed Status
Path traversal ../../../etc/passwd Stays in vault wiki/entities/etcpasswd.md (os.path.abspath confirms inside vault) PASS
Path traversal foo/../bar Stays in vault wiki/entities/foo..bar.md (.. survives as literal but no separator) PASS
NUL byte injection $'\x00malicious' Sanitized wiki/concepts/untitled.md (full strip → untitled fallback) PASS
mkstemp permissions 0600 stat -c '%a' .vault-meta/mode.json600 PASS
--mode preview non-mutation mtime unchanged mtime identical before/after preview PASS
57 unit-test assertions All pass 57/57 pass via python3 tests/test_wiki_mode.py PASS

v1.8.2 fix cycle verified to hold. No regression.


7. Test suite execution log

make test exit: 0 (all 8 suites green). Full output preserved at /tmp/audit-make-test-latest.log (~1289 lines).

Suite Assertions Result
test_allocate_address.sh 12 12 pass
test_tiling_check.py 37 All pass
test_boundary_score.py 46 46 pass
test_bm25_index.py ~1030 (~25 functional + 1000 idf monotonicity) All pass
test_retrieve.py 30 30 pass
test_wiki_lock.sh 16 16 pass
test_concurrent_write.sh 6 6 pass
test_wiki_mode.py 57 57 pass (includes the new 19 v1.8.2 traversal/perm/preview assertions)
Total ~1234 All green

No hidden network dependency. No flakes observed. Hermetic execution confirmed.


8. UX walkthrough (FEEL)

Install rehearsal

  • claude plugin marketplace add AI-Marketing-Hub/claude-obsidian → marketplace.json references this as a github source with ref: main. Caveat: repo not yet pushed to GitHub. Install command currently fails for fresh users. This is intentional (local until explicit go), not a finding.
  • claude plugin install claude-obsidian → standard. No issues anticipated.

Slash-command discoverability

  • /wiki, /save, /autoresearch, /canvas — confirmed declared.
  • wiki-mode, wiki-cli, wiki-retrieve, wiki-fold, wiki-ingest, wiki-query, wiki-lint, defuddle, obsidian-bases, obsidian-markdown — invocable via Skill tool / trigger-phrase recognition. Triggers in SKILL.md descriptions are well-chosen.

Error messages

  • wiki-mode.py route invalid_type "foo" → rc=2, argparse error. Clear.
  • wiki-mode.py set invalid_mode → rc=2, argparse error. Clear.
  • retrieve.py when not provisioned → exits 10 with friendly "run bash bin/setup-retrieve.sh first" hint.
  • wiki-lock.sh acquire <invalid-path> → rc=4 with reason. Clear.

Onboarding gaps

  • No .env.example documenting ANTHROPIC_API_KEY / OLLAMA_URL / COHERE_API_KEY / VOYAGE_API_KEY (LOW per wiki-retrieve audit).
  • bin/setup-retrieve.sh has no --yes flag for non-interactive CI use.
  • README's install command targets a GitHub repo that doesn't exist yet (intentional — see "local until explicit go").

9. Bias self-check (OBSERVE-internal)

Per plan §14, pre-execution bias notes:

Bias Mitigation Outcome
Ownership bias (v1.8.2 fixes authored by me) Verifier agent dispatch run on staged diff before scoring wiki-mode myself; verifier's CLEAR TO COMMIT is authoritative. Held. Wiki-mode scored 94/100 with 4 LOW (honest deductions for cosmetic/stale items), not a sycophantic 100.
Ship-it bias (user said "planning to push soon") Strict BLOCKER gate non-negotiable; HIGH count honestly tracked. Held. Honest HIGH count = 4 (not 0).
Familiarity bias (long prior session on this codebase) Subagents dispatched with fresh-context (no prior memory); their findings weighted equal to mine. Held. Cross-skill audit subagent found allowed-tools gap I had not surfaced.
Framework-novelty bias (10-principle framework is new and seductive) Phase II framework integration gated AFTER Phase I clears push gate. Held. Phase II not started; audit's technical rigor independent of framework.
Anchoring on v1.7.0 audit (which found 7 BLOCKERs) Severity determined by bar in §4, not by precedent count. Held. Zero BLOCKER is the honest outcome given the actual code state.

10. GROW — feedback loop notes

What worked well this audit cycle

  1. Parallel subagent dispatch. 14 skill audits + verifier + cross-skill in ~3 minutes wall-clock. Sequential would have been hours.
  2. Differential rigor by risk. Tier 1 9-phase + Tier 2 5-phase template focused effort on actual blast-radius areas. Saved ~50% of agent budget.
  3. 10-principle spine as audit structure. OBSERVE-internal forced explicit bias documentation; GROW forced a feedback loop section.
  4. Verifier agent dispatch caught the missing CHANGELOG entry for v1.8.2 (a MEDIUM that the chair would have eventually noticed but might have missed in a push rush).

What to improve for v1.9 audit

  1. End-to-end integration smoke test (planned but not executed in Phase I — relied on test suite green). Should be a separate phase next time: synthetic source → wiki-ingest → wiki-query → wiki-lint round-trip.
  2. allowed-tools gap detection should be automated (a make test target that asserts every skill referencing bash or python3 in body has Bash in allowed-tools).
  3. tests/__init__.py missing across the repo (autoresearch audit found this). Add as a standard linter rule.
  4. Cross-skill consumer validation (does autoresearch actually invoke wiki-retrieve?) should be a verifier check, not a manual finding.

Inputs to v1.8.3 backlog

All 14 MEDIUM + 45 LOW findings should be triaged into v1.8.3 vs v1.9 buckets. Recommended grouping:

  • v1.8.3 (patch): the 4 HIGH fixes (~90 min) + M1, M2, M3, M5, M6, M10, M14, M15, M16 (drift/test-infra)
  • v1.9 (minor): M4, M7, M8, M9, M11, M12, M13 (UX hardening + autoresearch safety)
  • Polish PR (no version bump): remaining 45 LOW

Inputs to Phase II framework integration

  • The 10-principle audit methodology spine worked as a structural device. Validates the design of the new /think skill.
  • "How to think" appendix per skill: easier to write now because each skill has a fresh audit pointing out its specific Observe/Listen/Think/Connect/Feel/Create surfaces.

11. Push gate decision

Per plan §8:

After Phase I:
  IF total BLOCKER count == 0:                          ← TRUE (0 BLOCKER)
    IF total HIGH count <= 3 AND all HIGH documented:   ← FALSE (4 HIGH)
      verdict_I = GREEN
    ELSE:
      verdict_I = YELLOW                                 ← THIS BRANCH

Verdict: YELLOW.

Two paths forward:

~60-90 minutes of work:

  1. wiki-cli: implement OR strike manual_override (10 min)
  2. agents/wiki-ingest.md: add Bash to tools + add Mode awareness section (10 min)
  3. autoresearch SKILL.md: add Web egress hygiene section (20 min)
  4. save SKILL.md: restructure Workflow Step 0 to express the personal-vault-vs-project routing decision (20 min)
  5. Bump v1.8.0 → v1.8.2 in plugin.json + marketplace.json + CHANGELOG entry (10 min)
  6. Re-run make test (5 min)
  7. Re-dispatch verifier (5 min) → Then push as v1.8.2 GREEN.

Path B: Push v1.8.0 YELLOW with disclosure

Document the 4 HIGH items in CHANGELOG.md under a "Known issues at v1.8.0" section. User gives explicit "go." Schedule v1.8.3 patch within 1 week.

Path C: HOLD

User defers push entirely; address findings on their own timeline.

Author recommendation: Path A. The 4 HIGH items are all cheap fixes and represent real correctness gaps (especially H1 manual_override and H2 agents/wiki-ingest.md). Pushing v1.8.0 with these unresolved makes the first public release look less polished than v1.7.2's "every audit finding closed" milestone. Spending 90 minutes to push GREEN is the right call.


12. Punch list (ordered)

  1. H1scripts/detect-transport.sh honor manual_override: true from existing transport.json OR strike wiki/references/transport-fallback.md:91-97 and docs/compound-vault-guide.md:87.
  2. H2agents/wiki-ingest.md: add Bash to tools: frontmatter line; add ## Mode awareness (v1.8+) section mirroring skills/wiki-ingest/SKILL.md:26-46.
  3. H3skills/autoresearch/SKILL.md: insert ~150-word "Web egress hygiene" section (URL validation, body sanitization, wikilink-injection escape, per-loop cost expectation).
  4. H4skills/save/SKILL.md: restructure Workflow with a "Step 0: Decide the destination" branching rule reconciling project-local vs personal-vault routing.

v1.8.3 patch (1 week of push)

  1. M2 — add tests/test_detect_transport.sh (5 cases: JSON validity, peek non-mutation, force override, absent CLI fallback, malformed version output).
  2. M5 — add tests/test_save.sh (destination-routing decision, mock vault presence assert).
  3. M10 — add empty tests/__init__.py so python3 -m unittest tests.X works.
  4. M14 — refresh skills/wiki/SKILL.md:102-110 operations table to enumerate all 14 skills.
  5. M15 — add Bash to allowed-tools for autoresearch / canvas / wiki-query / save; add full allowed-tools line to wiki-ingest / wiki-lint / wiki-fold.
  6. M16 — version bump commit chore (chore(v1.8.2): version bump + CHANGELOG).

v1.9 minor

  1. M4 — second consent layer (env var) for contextual-prefix.py.
  2. M3 — wire wiki-retrieve into autoresearch OR strike the integration claim.
  3. M7 — collision check in save Workflow step 5.
  4. M8 — trap-based lock release pattern in wiki-ingest SKILL.md.
  5. M11 — reconcile autoresearch §Filing Results with §Mode awareness (per-page routing).
  6. M12 — cost/budget warning in autoresearch.
  7. M13 — mid-loop failure recovery doc in autoresearch.

Polish (bundled into a single PR, no version bump)

18-62. The 45 LOW findings, grouped by file.


13. Critical files (paths used in audit)

Read for audit

  • All 14 skills/*/SKILL.md
  • All 12 scripts/*.py and scripts/*.sh
  • All 8 tests/test_*.{py,sh}
  • All 3 agents/*.md
  • hooks/hooks.json
  • .claude-plugin/{plugin,marketplace}.json
  • bin/setup-*.sh (5 files)
  • Makefile
  • README.md, CLAUDE.md, CHANGELOG.md
  • docs/{compound-vault-guide,methodology-modes-guide}.md
  • docs/audits/v1.7.0-audit-2026-05-17.md (reference)
  • ~/.claude/CLAUDE.md (global rule reference)

Replay inputs

  • Path traversal test vectors: 5 distinct payloads, all verified inside-vault
  • mkstemp perm check via stat -c '%a %n'
  • --mode preview no-write via mtime delta

Test suite log

  • /tmp/audit-make-test-latest.log (~1289 lines, exit 0)

14. Appendix — subagent dispatch summary

Subagent Target Duration Output
Inventory Skill ecosystem map 90s Full inventory + uncommitted state
Tier 1 #1 wiki-mode 94s 94/100, 4 LOW, ship-clean
Tier 1 #2 wiki-cli 141s 75/100, 1 HIGH, 2 MEDIUM, 2 LOW
Tier 1 #3 wiki-retrieve 159s 88/100, 2 MEDIUM, 3 LOW
Tier 1 #4 save 80s 78/100 (re-scored from 72), 1 HIGH (re-tier from 2), 3 MEDIUM, 2 LOW
Tier 1 #5 wiki-ingest 110s 76/100, 1 HIGH, 2 MEDIUM, 2 LOW
Tier 1 #6 autoresearch 119s 72/100 (re-scored from 68), 1 HIGH (consolidated from 2), 4 MEDIUM, 3 LOW
Tier 2 #1-8 8 stable skills 30-47s each All ship-clean with LOW findings
Verifier Staged v1.8.2 diff 150s CLEAR TO COMMIT
Cross-skill Pattern hunt + hooks + manifest 186s Path-traversal STRONG, allowed-tools GAPS, hooks PASS, manifest PASS

Total parallel dispatch wall-clock: ~3 minutes. Sequential would have been ~30 minutes.


End of audit

The plugin is substantively healthy at v1.8.0. The 4 HIGH findings are documentation/integration polish, not security or correctness flaws in active code paths. The test suite is comprehensive (~1234 assertions, all green). The v1.8.2 fix cycle holds end-to-end. The verifier on staged diff cleared.

Recommendation: Path A. Apply 90 minutes of fixes, push as v1.8.2 GREEN. Then proceed to Phase II framework integration (the new /think skill + 14 "How to think" appendices) per the plan.

Next step requires user authorization: Path A / B / C decision.