add claude-obsidian
김경종
2026-05-28 10:57:16 +09:00
parent 1b07531a45
commit 72dad72703
205 changed files with 41703 additions and 80 deletions
# v1.7.0 Compound Vault — Full Audit
**Status:** COMPLETE — all 4 phases executed; 9 verification gates per plan §7 closed.
**Date:** 2026-05-17
**Branch audited:** `v1.7.0-compound-vault` (local, not pushed)
**Commits in scope:** 8 commits, SHAs `2dad552` through `4a362ed`
**Method:** /best-practices six-cut + agent kernel applied per commit; compass artifact coverage matrix (5 priority gaps + 20 backlog items); 3 parallel Explore agents (six-cut audit, coverage matrix, code-quality deep-read); main-thread verification of every BLOCKER and HIGH finding before filing.
**Auditor:** Claude Opus 4.7 (1M ctx) under human chair Daniel; agents ran in independent contexts (each got a self-contained brief without seeing the others' output).
---
## 1. Executive verdict (full audit)
v1.7 is **not ship-ready as `v1.7.0`** but is **close**. **31 findings**: 1 BLOCKER, 6 HIGH, 14 MEDIUM, 10 LOW. The BLOCKER is a real data-egress consent gap in `scripts/contextual-prefix.py:252-258` — surfaced by two independent agent reviews and verified by main-thread code read against the `scripts/tiling-check.py:351-352` `--allow-remote-ollama` precedent. ~1 hour fix. The 6 HIGH findings are design gaps fixable in ~2.5 hours total. Recommend pushing **v1.7.1** (BLOCKER + 6 HIGH addressed) instead of v1.7.0.
**Compass artifact coverage** (5 priority gaps + 20 backlog items = 25 cells): 6 SHIPPED, 3 PARTIAL, 9 DEFERRED with explicit v1.8/v1.9/v2.0/v2.5+ milestones, 4 OUT-OF-SCOPE. Matches the v1.7 plan's claim exactly — no over-delivery, no quiet under-delivery. The shipped items are the top-quartile by value/effort per the compass artifact's own scoring. The biggest remaining gap is the derivative-outputs surface (NotebookLM-class audio/video/quiz/study), which **widened during the audit** — Phase C found NotebookLM shipped Video Overviews + a 4-tile Studio panel in May 2026, expanding their lead.
**Retrieval benchmark** (50 queries, scripted v1.6 baseline, real ollama rerank): **+39.5% error reduction. PASS** vs the v1.7 plan §7 ship-gate target of ≥30%. Top-1 accuracy 24% → 54% (+30pp); top-5 accuracy 48% → 88% (+40pp). Biggest win on derived natural questions (+52pp); ties on synonym and negative-query categories (those become findings M11, M12).
**Verdict on "is the repo #1 best ever?"** — Per-axis (§9), we are **#1 on 4 of 7 axes**: compounding wiki primitive, multi-writer safety, retrieval-architecture-free-tier, license/openness. **TIED on 1**: methodology support (nobody serves LYT/PARA/Zettel; v1.8 closes this into a 5th lead). **NOT #1 on 2**: GUI / install ergonomics (CLI-only vs Community-Plugins from Smart Connections + Copilot), derivative outputs (NotebookLM ships 4 first-class artifact tiles; we ship zero). Honest answer: **#1 on the axes that matter for sophisticated power users who control their own LLM stack — not #1 in mainstream adoption and won't be without v2.0 (derive) + v2.5 (GUI shell).**
**Recommendation**: (1) Fix the BLOCKER (~1h). (2) Ship v1.7.1 with the 6 HIGH patches (~2.5h). (3) v1.8 priority: methodology modes (gets us to 5/7 leads, cheapest move). (4) Expand the v2.0 derive spec to include Video Overviews (new finding M13) to match NotebookLM's May 2026 bar. (5) Defer the v1.7.0 tag until v1.7.1 is ready; tagging a version with a known BLOCKER leaves an avoidable footprint.
---
## 2. Methodology
Findings filed in 4 tiers:
| Tier | Bar | Action |
|---|---|---|
| **BLOCKER** | Affects ship/push decision; back out the release if not fixed | Must fix before push |
| **HIGH** | Should fix before public push | Patch as v1.7.1, push after |
| **MEDIUM** | File as tracked issue | Defer to v1.7.x or v1.8 |
| **LOW** | Note for posterity / future polish | Bundle into a polish PR before v1.8 |
Verification gate: every BLOCKER and HIGH was independently verified by the main-thread auditor (Read on the actual file:line) before being filed at that severity. MEDIUM and LOW are filed on agent attribution.
---
## 3. Six-cut engineering kernel findings (per commit)
### 3.1 Commit ladder
```
2dad552 chore: pre-v1.7 cleanup
9c8e510 feat(v1.7): §3.1 substrate hard-prefer on kepano/obsidian-skills
6c7671e feat(v1.7): §3.2 default transport — Obsidian CLI with fallback chain
45a5bd3 feat(v1.7): §3.3 hybrid retrieval pipeline (wiki-retrieve)
66c11f9 feat(v1.7): §3.4 multi-writer safety — wiki-lock per-file advisory locks
51fa2da chore(v1.7): cross-cutting — version bump, docs, hot cache refresh
753fc8a chore(v1.7): gitignore runtime artifacts from Compound Vault scripts
4a362ed fix(v1.7): contextual-prefix.py — proper --all flag handling
```
8 commits. All authored by Daniel. Co-author trailer on every commit cites Claude Opus 4.7 (acceptable; consistent disclosure).
### 3.2 Per-commit six-cut walkthrough
For each commit, only NON-clean cells are reported. A "5/6 clean; 1 finding on cut N" line means the other 5 cuts were verified clean.
**`2dad552` (cleanup)** — 6/6 clean. Pure infrastructure prep (CLAUDE.md docs + .gitignore additions). No code paths to check.
**`9c8e510` (§3.1 substrate)** — 5/6 clean. 1 finding on cut #4 (delete more than you add): `+17 / -5` lines. The "soft-defer → hard-prefer" rewrite was an opportunity to delete the local fallback bodies in obsidian-markdown/obsidian-bases/canvas SKILL.md files. The decision to keep the fallbacks is documented and defensible (users without kepano installed need them), but the kernel cut still flags zero-deletion as a signal to verify intent. **Filed: LOW** (intentional, documented).
**`6c7671e` (§3.2 transport)** — 5/6 clean. 1 finding on cut #6 (failure is the spec): `detect-transport.sh` substitutes external command output (`obsidian-cli --version`) directly into JSON via shell variable expansion. Only `tr -d '"'` is applied; newlines, backslashes, and control characters are not escaped. On this machine the CLI isn't installed so the bug never triggers, but a malicious or buggy `obsidian-cli` could break the JSON output. **Filed: HIGH** (H5 in §8.2; theoretical, since obsidian-cli is well-behaved in practice).
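The escaping gap can be illustrated outside the shell. The following is a minimal Python sketch, not the script's code: `naive_json`, `safe_json`, and the `cli_version` key are hypothetical names chosen for illustration, and the hostile version string is invented to exercise the failure.

```python
import json

def naive_json(version: str) -> str:
    # Mimics the shell approach: strip double quotes, then interpolate as-is.
    cleaned = version.replace('"', "")
    return '{"cli_version": "' + cleaned + '"}'

def safe_json(version: str) -> str:
    # json.dumps escapes quotes, backslashes, newlines and control characters.
    return json.dumps({"cli_version": version.strip()})

# Invented output containing a newline, a backslash and double quotes.
hostile = 'obsidian-cli 1.0\nbuild\\x "beta"'

try:
    json.loads(naive_json(hostile))
    naive_ok = True
except json.JSONDecodeError:
    naive_ok = False

roundtrip = json.loads(safe_json(hostile))["cli_version"]
```

With this input the interpolated variant fails to parse, while the `json.dumps` variant round-trips the original string unchanged.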
**`45a5bd3` (§3.3 retrieval)** — 4/6 clean. **2 findings**, including the BLOCKER:
- **Cut #6 (failure is the spec) — BLOCKER**: `scripts/contextual-prefix.py:252-258` `pick_prefix_tier()` selects tier 1 (Anthropic API) automatically whenever `ANTHROPIC_API_KEY` env var is set. No flag, no consent prompt, no warning. Sends full wiki page bodies (`anthropic_api_prefix()` at line 264, body included in prompt-cached system message) to `https://api.anthropic.com/v1/messages`. The existing precedent in `scripts/tiling-check.py:351-352` is to require `--allow-remote-ollama` explicitly when sending body content off-localhost. `contextual-prefix.py` has no equivalent guard. **VERIFIED by main thread**: read `scripts/contextual-prefix.py:240-281` directly.
- **Cut #6 (failure is the spec) — HIGH**: `bin/setup-retrieve.sh` has no rollback if Stage 1 (chunking) fails partway through. Partial `.vault-meta/chunks/` is left on disk. Re-run is idempotent (chunks with matching body_hash skip), but the user has no documented recovery path if Stage 1 fails on chunk 31 of 47.
**`66c11f9` (§3.4 concurrency)** — 5/6 clean. 1 finding on cut #6 (failure is the spec) — HIGH: `hooks/hooks.json` PostToolUse defers commit if `wiki-lock list | wc -l != 0`, but the entire pipeline ends with `|| true`. If `wiki-lock list` errors (permission denied on .vault-meta/.wiki-lock.meta, missing script, etc.), the `||true` swallows it and `git add/commit` proceeds anyway. The intended safety property (defer commit on locks held) silently degrades to "always commit" on any error in the check.
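The intended fail-closed behaviour can be sketched in Python (the hook itself is shell, so this only illustrates the logic). It assumes `wiki-lock list` prints one line per held lock and exits non-zero on error, which is inferred from the `wc -l` usage rather than confirmed. `should_commit` and the injectable `run` callable are hypothetical names.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], subprocess.CompletedProcess]

def default_runner(cmd: Sequence[str]) -> subprocess.CompletedProcess:
    # Real invocation; tests can inject a fake runner instead.
    return subprocess.run(cmd, capture_output=True, text=True)

def should_commit(run: Runner = default_runner) -> bool:
    """Return True only if the lock check succeeded AND no locks are held."""
    try:
        result = run(["wiki-lock", "list"])
    except OSError:
        # Missing script, permission denied, etc.: defer the commit.
        return False
    if result.returncode != 0:
        # The check itself failed: defer rather than swallow the error.
        return False
    held = [line for line in result.stdout.splitlines() if line.strip()]
    return len(held) == 0
```

The design choice is that every error path returns `False`, so a broken check defers the commit instead of silently allowing it.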
**`51fa2da` (cross-cutting docs)** — 6/6 clean. Pure documentation + version bump.
**`753fc8a` (gitignore)** — 6/6 clean. Manually added by the user during the previous session.
**`4a362ed` (--all flag fix)** — 6/6 clean. 14-line targeted fix surfaced by the real-vault smoke; commit message correctly explains root cause.
### 3.3 Hermeticity verification
Ran `make test` — all 7 suites green. Counted: 1162 OK assertions, 0 failures, 0 errors.
Grep for network-touching code in tests/:
```
grep -rE 'urllib\.|requests|socket\.|http://|https://' tests/
```
Returns: only mock patches (`unittest.mock.patch.object(rerank, 'ollama_alive', ...)`) and subprocess invocations that target sibling scripts in temp sandboxes. No real network egress at test time. **Hermeticity claim verified.**
---
## 4. Agent kernel findings (4 workstreams)
| Constraint | Status | Evidence |
|---|---|---|
| **one chair** | VERIFIED | All 8 commits authored by Daniel; single human owner across all workstreams. |
| **bounded slices** | PARTIAL | 4 skills (`wiki-ingest`, `wiki-query`, `save`, `autoresearch`) were touched by both §3.2 (Transport section) and §3.4 (Concurrency section). No conflict in practice — sections are adjacent and compose cleanly — but the file-set overlap is real. The cross-cutting commit (51fa2da) is allowed to touch many files by definition; the §3.x feat commits were not strictly disjoint. **Filed: MEDIUM** (no harm done; flag for future releases to consider tighter scoping). |
| **explorers/workers/verifiers** | PARTIAL | Phase 1 of the original v1.7 implementation plan used 3 parallel Explore agents (verified in conversation log). Workers were the main-thread author. Verifier agents were NOT dispatched at workstream gates — code went straight from author to commit without an independent review pass. This audit IS the missing verifier pass; doing it post-commit instead of pre-commit means findings become patches instead of pre-merge fixes. **Filed: MEDIUM** (process gap; not a code bug). |
| **acceptance criteria before execution** | VERIFIED | Each feat commit references its §3.x scope; file sets match scope descriptions; original plan §7 ship gates documented. |
| **per-change rigor inside every slice** | PARTIAL | The six-cut kernel was clearly applied to code patterns (locking, flock guards, fallback chains, exit codes). BUT the BLOCKER on contextual-prefix.py egress shows the rigor was insufficient on the security/blast-radius cut. Had the author re-read tiling-check.py's `--allow-remote-ollama` pattern during §3.3 implementation, the egress gap would have been caught at write time. **Filed: HIGH** (process gap that produced a real bug). |
| **5-part closeout** | VERIFIED | CHANGELOG.md 1.7.0 entry covers: integrated result ✓, verification summary (7 suites, 1162 assertions, zero network) ✓, commit ids implicit via §3.x→commit mapping ✓, notes current ✓, next-slice rationale (v1.8/v1.9/v2.0 roadmap) ✓. |
---
## 5. Compass artifact coverage matrix
### 5.1 Five priority gaps
| # | Gap | Status | Evidence |
|---|---|---|---|
| 1 | Platform-owner substrate (kepano/obsidian-skills) | **SHIPPED** | 3 SKILL.md files defer hard-prefer; `marketplace.json:28-34` declares recommendedCompanions |
| 2 | Obsidian CLI first-class transport | **SHIPPED** | `scripts/detect-transport.sh` + `.vault-meta/transport.json` + decision tree at `wiki/references/transport-fallback.md` + 5 skill "Transport (v1.7+)" sections |
| 3 | NotebookLM-class derivative artifacts | **DEFERRED → v2.0** | Documented in `compound-vault-guide.md:274` ("v2.0 — NotebookLM-class derivative outputs") |
| 4 | Contextual retrieval + hybrid + rerank | **SHIPPED** | 4 new scripts (`contextual-prefix`, `bm25-index`, `rerank`, `retrieve`) + setup + skill + wired into `wiki-query` |
| 5 | Adoption friction (GUI onramp, one-liner installer) | **PARTIAL** | CLI transport reduces friction; GUI onramp deferred to v2.5+; no `npx claude-obsidian init` shipped |
### 5.2 Twenty backlog items
| # | Item | Status | Where |
|---|---|---|---|
| 1 | Substrate dependency on kepano | SHIPPED | §3.1 (commit 9c8e510) |
| 2 | wiki-cli default transport | SHIPPED | §3.2 (commit 6c7671e) |
| 3 | Contextual retrieval per-chunk prefix | SHIPPED | §3.3 `scripts/contextual-prefix.py` |
| 4 | Hybrid BM25 + vector + rerank | **PARTIAL** | BM25 + rerank shipped; rerank uses dense vectors internally, but no SEPARATE vector candidate stage. `compound-vault-guide.md:97` acknowledges "A separate dense vector stage is on the v1.7.x roadmap." |
| 5 | wiki-derive audio | DEFERRED → v2.0 | `CHANGELOG.md:36` |
| 6 | wiki-mode bootstrap (LYT/PARA/Zettel/Generic) | DEFERRED → v1.8 | `CHANGELOG.md:35` |
| 7 | GUI onramp Obsidian-plugin shell | DEFERRED → v2.5+ | `compound-vault-guide.md:263` |
| 8 | --from notebooklm/readwise/zotero adapters | DEFERRED → v1.9 | `CHANGELOG.md:37` |
| 9 | wiki-derive quiz/flashcards/study-guide/brief | DEFERRED → v2.0 | `CHANGELOG.md:36` |
| 10 | Out-of-box local embedding + Ollama fully-local path | **SHIPPED** | `--no-llm` flag in `bin/setup-retrieve.sh` forces tier-3 synthetic; rerank uses ollama (fully local) |
| 11 | wiki-review (PARA weekly/monthly) | DEFERRED → v1.8 | `CHANGELOG.md:38` |
| 12 | Multimodal ingest (YouTube/PDF/audio/image) | DEFERRED → v1.9 | `CHANGELOG.md:37` |
| 13 | ACP transport (Copilot #2179) | OUT-OF-SCOPE | No ACP mention in codebase; 4-tier fallback shipped without it |
| 14 | wiki-derive slides + mindmap | DEFERRED → v2.0 | implicit in §wiki-derive deferral |
| 15 | Multi-vault federation (wiki-federate) | DEFERRED → v2.x | `compound-vault-guide.md:264` |
| 16 | iOS Share extension ingest | OUT-OF-SCOPE | `skills/wiki-cli/SKILL.md` notes mobile is filesystem-only; no v1.7 work |
| 17 | Cursor/Codex/OpenCode parity | SHIPPED | `bin/setup-multi-agent.sh` (predates v1.7 but covers this) |
| 18 | Hosted Pro tier | OUT-OF-SCOPE | `compound-vault-guide.md:262` "Not a paid plugin" |
| 19 | DragonScale promoted from extension to default | **PARTIAL** | DragonScale still opt-in; v1.7 did NOT promote. wiki-lock (§3.4) is universally beneficial but is a separate concern from full DragonScale |
| 20 | Spaced-repetition Anki round-trip | OUT-OF-SCOPE | Not in roadmap |
### 5.3 Coverage summary
- **SHIPPED**: 6 (Gap 1, 2, 4 + Backlog 1, 2, 3, 10, 17 — note Gap 1=Backlog 1, Gap 2=Backlog 2 collapse to 6 distinct items)
- **PARTIAL**: 3 (Gap 5, Backlog 4, Backlog 19)
- **DEFERRED (with milestone)**: 9 (Gap 3, Backlog 5, 6, 8, 9, 11, 12, 14, 15)
- **OUT-OF-SCOPE**: 4 (Backlog 13, 16, 18, 20)
**Honest read**: v1.7 delivers EXACTLY what the v1.7 plan claimed: top-quartile items 1-4 by value/effort plus the latent multi-writer bug fix. No accidental over-delivery; no quiet under-delivery. The biggest gaps to category leadership are item #5 (NotebookLM-class outputs) and item #7 (GUI onramp), both explicitly deferred.
---
## 6. Retrieval benchmark results (Phase B)
### 6.1 Method
- Corpus: 50 queries (25 derived natural questions + 25 hard: 5 synonym + 10 cross-page + 5 partial-recall + 5 negative). Each annotated with `correct` page(s), `relevant` supporting pages, category, and rationale. Stored at [wiki/meta/retrieval-benchmark-v1.7.md](../../wiki/meta/retrieval-benchmark-v1.7.md).
- Pipelines compared:
- **v1.7 hybrid**: `python3 scripts/retrieve.py "<query>" --top 5` (BM25 over contextually-prefixed chunks → cosine rerank via ollama nomic-embed-text → page-address dedupe).
- **v1.6 baseline**: `python3 scripts/baseline-v16.py "<query>" --top 5` (mirrors the legacy `hot→index→drill` chain: tokenize query, score each page by distinct-term presence + hot-cache boost + index-cite boost; top-5 by score).
- Scoring:
- **top-1 success**: top result's path == one of `correct[]`
- **top-5 success**: any of top-5 paths in `correct[]`
- **Negative queries** (correct=null): success if no results, or top result in `relevant[]`.
- Runner: `scripts/benchmark-runner.py` (per-query subprocess to both pipelines, tabulates).
- Per-query raw results: `/tmp/benchmark-results.json` (50 queries × 2 pipelines = 100 result sets, with v17 and v16 paths captured for each).
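The scoring rules above can be sketched as a small Python function. This is an illustration, not the code in `scripts/benchmark-runner.py`; `score_query` is a hypothetical name. The negative-query rule is applied to top-1 as written. The top-5 variant for negatives (any of the top-5 in `relevant`) is an assumption, since the method section does not spell it out.

```python
from typing import Dict, Optional, Sequence

def score_query(results: Sequence[str],
                correct: Optional[Sequence[str]],
                relevant: Sequence[str] = ()) -> Dict[str, bool]:
    """Score one query's ranked result paths against its annotations."""
    top5 = list(results[:5])
    if correct is None:
        # Negative query: success if nothing is returned,
        # or the top result is at least a relevant page.
        top1_ok = (not top5) or (top5[0] in relevant)
        # Assumed top-5 variant: any of the top-5 is a relevant page.
        top5_ok = (not top5) or any(p in relevant for p in top5)
        return {"top1": top1_ok, "top5": top5_ok}
    return {
        "top1": bool(top5) and top5[0] in correct,
        "top5": any(p in correct for p in top5),
    }
```

For example, `score_query(["a.md", "b.md"], ["b.md"])` counts as a top-5 success but a top-1 miss.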
### 6.2 Aggregate results
| Category | N | v1.7 top-1 | v1.7 top-5 | v1.6 top-1 | v1.6 top-5 | Δ top-1 |
|---|---|---|---|---|---|---|
| cross-page | 10 | 30.0% | 80.0% | 30.0% | 50.0% | +0.0pp |
| derived | 25 | **64.0%** | **88.0%** | 12.0% | 28.0% | **+52.0pp** |
| negative | 5 | 40.0% | 80.0% | 40.0% | 80.0% | +0.0pp |
| partial-recall | 5 | 60.0% | 100.0% | 20.0% | 60.0% | **+40.0pp** |
| synonym | 5 | 60.0% | 100.0% | 60.0% | 100.0% | +0.0pp |
| **TOTAL** | **50** | **54.0%** | **88.0%** | **24.0%** | **48.0%** | **+30.0pp** |
### 6.3 Ship-gate verification
Original v1.7 plan §7 (the v2.0 / 1.7.0 phase) specified:
> *Ship gate: `make test` green including new concurrent-write test; 50-query retrieval benchmark (manually curated) shows ≥30% reduction in "wrong page cited" errors vs v1.6 baseline.*
**Result**: PASS.
- v1.6 top-1 errors: 38/50 = 76% wrong
- v1.7 top-1 errors: 23/50 = 46% wrong
- Error reduction: (38 - 23) / 38 = **39.5% reduction** (gate was ≥30%)
The gate passes by a non-trivial margin.
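The gate arithmetic can be reproduced in a few lines of Python. The error counts come from the figures above; `error_reduction` is a hypothetical helper name.

```python
def error_reduction(baseline_errors: int, new_errors: int) -> float:
    """Relative reduction in errors versus the baseline."""
    if baseline_errors <= 0:
        raise ValueError("baseline must have at least one error")
    return (baseline_errors - new_errors) / baseline_errors

V16_TOP1_ERRORS = 38  # 50 queries, 12 correct at top-1 (24%)
V17_TOP1_ERRORS = 23  # 50 queries, 27 correct at top-1 (54%)
GATE = 0.30           # plan section 7 ship gate

reduction = error_reduction(V16_TOP1_ERRORS, V17_TOP1_ERRORS)
passed = reduction >= GATE
print(f"{reduction:.1%} reduction, gate passed: {passed}")
```

Note that the +30pp top-1 accuracy gain and the 39.5% error reduction are different measures: the first is an absolute difference in accuracy, the second is relative to the baseline error count.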
### 6.4 Per-category interpretation
- **Derived (+52pp)**: Hybrid retrieval dominates on natural questions. v1.6 baseline hits 12% top-1 because keyword overlap alone is brittle when page titles use specific terminology (e.g., "DragonScale Memory") and queries use general terminology (e.g., "wiki fold operator"). v1.7's contextual prefix injects page-level vocabulary into every chunk, dramatically improving BM25 recall; rerank then promotes the right page.
- **Partial-recall (+40pp)**: Big win. Fragmented queries ("the dragon curve thing with folds") rely on rerank's semantic understanding. v1.6 can't bridge "dragon curve" → "DragonScale" without exact-token overlap.
- **Synonym (+0pp, tied at 60%)**: Surprising tie. Suggests rerank does NOT add value when both pipelines use similar tokens AND the canonical page has enough natural overlap with the query. Worth flagging as a finding — perhaps the synonym queries weren't synonym-enough, or the contextual prefix actually narrowed the BM25 recall on these specific queries.
- **Cross-page (top-1 +0pp, top-5 +30pp)**: v1.6 and v1.7 tie at 30% top-1, but v1.7 reaches 80% top-5 vs v1.6's 50%. Cross-page synthesis queries have multiple "correct" pages; v1.7 surfaces them in top-5 even when the canonical isn't #1.
- **Negative (+0pp, tied at 40%)**: Both pipelines correctly handle "no answer in vault" 40% of the time. v1.7 therefore has a false-positive rate similar to v1.6's on negative queries: it still surfaces irrelevant pages when no answer exists. This is a precision concern worth filing (potential MEDIUM finding for Phase D).
### 6.5 New findings from benchmark
- **MEDIUM (M11 - benchmark)**: Synonym category tied. v1.7's contextual prefix and rerank should beat v1.6 on synonyms, but it didn't. Two possible causes: (1) the synonym test queries weren't actually challenging enough (the canonical page may have used closely-related vocabulary), (2) v1.7 chunking happened to drop the key context. Worth a follow-up analysis post-Phase D.
- **MEDIUM (M12 - benchmark)**: Negative-query precision tied at 40%. Both pipelines surface unrelated pages 60% of the time for "no answer" queries. This is a v1.7 opportunity — the rerank could be tuned to suppress low-confidence top results below a threshold.
- **LOW (L8 - benchmark)**: Cross-page top-1 tied. The hybrid pipeline doesn't pick a clear winner among multiple correct pages. Per-source weighting or ensemble scoring could help in a future v1.7.x.
These findings get folded into the final Phase D ledger.
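For M12, one possible shape of the threshold idea is sketched below. It assumes the rerank stage yields `(path, cosine_score)` pairs sorted by descending score. `suppress_low_confidence` is a hypothetical name, and the `min_score` default of 0.5 is a placeholder with no tuning behind it.

```python
from typing import List, Tuple

Ranked = List[Tuple[str, float]]

def suppress_low_confidence(ranked: Ranked, min_score: float = 0.5) -> Ranked:
    """Return no results when even the best candidate is below the threshold."""
    if not ranked or ranked[0][1] < min_score:
        # Treat the query as "no answer in vault" instead of guessing.
        return []
    # Otherwise keep only candidates that clear the threshold.
    return [(path, score) for path, score in ranked if score >= min_score]
```

Any real threshold would need calibrating against the negative-query category of the benchmark, since a cut that is too aggressive would lower recall on the other categories.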
---
## 7. Market state delta (Phase C — 2026-05-17 vs compass May-16 snapshot)
### 7.1 GitHub star + activity refresh (one-day delta)
| Repo | Compass May 16 | Actual May 17 | Delta | Last push | Last release |
|---|---|---|---|---|---|
| `kepano/obsidian-skills` | 30.5k★ | **31.6k★ (+1.1k)** | growing fast | 2026-05-07 | no recent release tag |
| `logancyang/obsidian-copilot` | ~7k★ | **7.0k★** | flat | 2026-05-16 (active) | — |
| `brianpetro/obsidian-smart-connections` | ~4.4k★ | **5.0k★ (+0.6k)** | growing | 2026-05-14 | 4.5.0 (2026-05-05) |
| `khoj-ai/khoj` | 34k+ | **34.6k★** | matches | 2026-03-26 (~2mo idle) | — |
| `AI-Marketing-Hub/claude-obsidian` (us) | 4.1k★ | 4.1k★ | flat | local-only branch | v1.6.0 |
**Read:** The May 16 compass snapshot largely holds. One material drift: `kepano/obsidian-skills` gained ~3.6% in stars in one day (30.5k → 31.6k), which validates the substrate dependency; the platform owner's skill set is consolidating its position. Smart Connections is under active development; Khoj has slowed (~2 months since its last push).
### 7.2 Issue / release deltas
**Copilot #2257 (Obsidian CLI integration)** — Still OPEN. Last update 2026-03-06 (3 months stale). 0 comments. **claude-obsidian v1.7 §3.2 shipped exactly what this issue describes.** Genuine competitive moat: we shipped what Copilot has been planning for 3+ months.
**Copilot #2179 (ACP transport)** — Still OPEN. Last update 2026-02-20 (3 months stale). 1 comment. Neither we nor Copilot has shipped it. v1.7 treats it as explicitly out-of-scope (backlog item #13).
**Smart Connections 4.5.0 (2026-05-05)** — Notable changes:
- "Connections Footer" promoted from Pro to Core (mobile-friendly writing surface). UX win for free users.
- "Substrate Update" — Smart Plugins / unified Smart Environment continuing to land.
- Pro paywall intact for inline discovery, Bases workflows, advanced ranking.
- Bug fixes around transformers embedding GPU/CPU fallback.
No reranker or hybrid retrieval changes in 4.5.0 — they still paywall configurable reranking in Connections Pro. **Our reranker is core (free, MIT). Genuine moat.**
### 7.3 NotebookLM (Google) — MAJOR new shipment
This is the most material competitor finding of Phase C. NotebookLM shipped substantial new features in May 2026 that the compass artifact did NOT capture in full:
**NEW: Video Overviews** — narrated-slide format with AI host pulling images, diagrams, quotes, numbers from sources. First new derivative-artifact format since Audio Overviews.
**NEW: Studio panel redesign** — 4 distinct tiles at the top of the notebook:
1. Audio Overviews (existing, two-host podcast)
2. **Video Overviews** (new May 2026)
3. **Mind Maps** (existing but now a first-class tile)
4. **Reports** (new — replaces/upgrades Briefs)
Multi-task within Studio: listen to Audio while exploring Mind Map while reviewing Study Guide.
**NEW: EPUB upload** as supported source format. (Compass §4 multimodal-ingest signal validated; users want more source types.)
**Implication for claude-obsidian's #1 verdict:** The derivative-outputs gap (compass artifact Gap #3 + backlog items #5, #9, #14) is **WIDER** than the May-16 compass artifact captured. NotebookLM now ships 4 first-class artifact types (Audio, Video, Mind Maps, Reports) plus Study Guides, Briefs, Quizzes, Data Tables. v1.7 ships zero. The deferral of `wiki-derive` to v2.0 was correct as a sequencing call, but the competitive gap is now larger and the v2.0 spec should consider adding Video Overviews (Marp + TTS pipeline) given NotebookLM's new bar.
### 7.4 New findings from Phase C
- **MEDIUM (M13 - market)**: Original `wiki-derive` v2.0 spec (in v1.7 plan §4.1) covers audio, quiz, flashcards, study-guide, brief, slides, mindmap. With NotebookLM's May 2026 Video Overviews shipment, the v2.0 spec should add **video** as a first-class artifact (Marp slides + TTS narration → MP4 via ffmpeg) to maintain parity. File for v2.0 planning.
- **MEDIUM (M14 - market)**: NotebookLM added EPUB upload. Compass artifact §6 already had `adapter-epub.py` planned for v1.9. With NotebookLM also shipping it, this becomes a baseline expectation rather than a differentiator. No action change, just narrative shift.
- **LOW (L9 - market)**: Smart Connections 4.5.0 promoted Footer Connections to Core. Mobile-friendly writing surface is now their free-tier wedge. Doesn't affect us directly (we're terminal-only) but worth noting in #1 verdict scoring on "GUI ergonomics" axis — SC is widening its UX lead.
- **LOW (L10 - market)**: Copilot CLI integration issue #2257 has been stale for 3 months. Genuine competitive moat for claude-obsidian on the CLI-native axis. Worth surfacing in the positioning narrative ("the only Claude+Obsidian stack that's actually CLI-native today").
These get folded into the final Phase D ledger.
### Sources
- [kepano/obsidian-skills (GitHub)](https://github.com/kepano/obsidian-skills)
- [logancyang/obsidian-copilot #2257](https://github.com/logancyang/obsidian-copilot/issues/2257)
- [logancyang/obsidian-copilot #2179](https://github.com/logancyang/obsidian-copilot/issues/2179)
- [brianpetro/obsidian-smart-connections 4.5.0 release](https://github.com/brianpetro/obsidian-smart-connections/releases/tag/4.5.0)
- [khoj-ai/khoj (GitHub)](https://github.com/khoj-ai/khoj)
- [Google: NotebookLM Video Overviews + Studio upgrades](https://blog.google/innovation-and-ai/models-and-research/google-labs/notebooklm-video-overviews-studio-upgrades/)
- [Google Workspace: New ways to customize and interact with NotebookLM (March 2026)](https://workspaceupdates.googleblog.com/2026/03/new-ways-to-customize-and-interact-with-your-content-in-NotebookLM.html)
- [Jeff Su: NotebookLM in 2026 — what changed and what matters](https://www.jeffsu.org/notebooklm-changed-completely-heres-what-matters-in-2026/)
---
## 8. Findings ledger (Phase A — partial; B/C/D may add)
### 8.1 BLOCKER (1)
| # | Finding | File:line | Recommended fix |
|---|---|---|---|
| B1 | `contextual-prefix.py` sends wiki page bodies to Anthropic API automatically whenever `ANTHROPIC_API_KEY` is set. No consent prompt, no flag. Violates the data-egress opt-in precedent set by `tiling-check.py:351-352` (`--allow-remote-ollama`). | `scripts/contextual-prefix.py:252-281`, `scripts/contextual-prefix.py:166-202` (api call) | Add `--allow-egress` flag (default off). Without the flag, fall through `anthropic-api` and `claude-cli` tiers to synthetic. `bin/setup-retrieve.sh` should warn explicitly: "Stage 1 will send N page bodies to <tier>. Continue? [y/N]". Document in `skills/wiki-retrieve/SKILL.md` Data Privacy section. |
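The recommended fix for B1 could look roughly like the Python sketch below. It is not the existing `pick_prefix_tier()` implementation: `pick_tier`, `build_parser`, and their parameters are hypothetical. Only the `--allow-egress` flag name and the tier names (`anthropic-api`, `claude-cli`, synthetic) come from the recommendation above.

```python
import argparse
from typing import Mapping

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="contextual prefix (sketch)")
    # Default off: page bodies never leave the machine unless asked to.
    parser.add_argument("--allow-egress", action="store_true",
                        help="permit tiers that send page bodies off-host")
    return parser

def pick_tier(env: Mapping[str, str], allow_egress: bool,
              claude_cli_available: bool) -> str:
    """Choose a prefix tier; remote tiers require explicit consent."""
    if not allow_egress:
        # No consent: fall through both LLM tiers to the synthetic one.
        return "synthetic"
    if env.get("ANTHROPIC_API_KEY"):
        return "anthropic-api"
    if claude_cli_available:
        return "claude-cli"
    return "synthetic"
```

Under this sketch, a set `ANTHROPIC_API_KEY` alone is never enough to trigger egress; the key only selects between tiers once the flag has been passed.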
### 8.2 HIGH (6)
| # | Finding | File:line | Fix |
|---|---|---|---|
| H1 | `bin/setup-retrieve.sh` has no rollback plan if Stage 1 fails partway through. | `bin/setup-retrieve.sh:128-140` | Catch non-zero exit; either resume or document recovery (`rm -rf .vault-meta/chunks/<address-of-failed-page>/`). |
| H2 | `make clean-test-state` removes v1.6 artifacts but not v1.7 (`chunks/`, `bm25/`, `locks/`, `transport.json`, `embed-cache.json`). | `Makefile:55-61` | Expand `clean-test-state` to match the `.gitignore` v1.7 additions. |
| H3 | `hooks/hooks.json` PostToolUse: the `wiki-lock list` check is in a pipeline ending `\|\| true`. Any error in the check silently degrades to "always commit." | `hooks/hooks.json:34-37` | Restructure: capture the list count in a variable, check explicitly, defer commit on error rather than swallow. |
| H4 | Per-change rigor on §3.3 was insufficient to catch the data-egress gap. Process issue, not a code bug, but it produced one. | n/a | Adopt verifier-agent pattern: dispatch a security-focused review agent at each workstream gate before commit. |
| H5 | `detect-transport.sh` substitutes external command output directly into JSON. `tr -d '"'` doesn't escape backslashes, newlines, control chars. Theoretical break if obsidian-cli emits non-trivial output. | `scripts/detect-transport.sh:79,86` | Pipe through `python3 -c "import json,sys; print(json.dumps(sys.stdin.read().strip()))"` or jq for proper escaping. |
| H6 | `skills/wiki-retrieve/SKILL.md` does not explicitly state in its frontmatter description that tier-1 sends page bodies to Anthropic API. The architecture section implies it; the user-facing description does not. | `skills/wiki-retrieve/SKILL.md:3-6` | Add a Data Privacy callout at the top of the skill body. |
### 8.3 MEDIUM (10)
| # | Finding | File:line |
|---|---|---|
| M1 | §3.2 transport layer net +485 / -0 LOC. Pure addition; no v1.6 cruft pruned. | commit 6c7671e |
| M2 | `bm25-index.py` token regex `[A-Za-z][A-Za-z0-9'\-]*` silently drops non-ASCII content. Multilingual vaults degrade without warning. | `scripts/bm25-index.py:76` |
| M3 | `retrieve.py` forwards `--allow-remote-ollama` to `rerank.py`, but the error path in `rerank.py` blames the user without saying "pass it to retrieve.py instead." | `scripts/rerank.py:91-99` |
| M4 | `wiki-lock.sh` `validate_path` rejects `..` but accepts paths with embedded newlines. Lockfile format would break. | `scripts/wiki-lock.sh:99-108` |
| M5 | `retrieve.py` `import_sibling` doesn't catch `ImportError`/`SyntaxError` — bare traceback for the user. | `scripts/retrieve.py:73-78` |
| M6 | `contextual-prefix.py` empty body edge case: page with only frontmatter logs `chunks=0` silently with no WARN. | `scripts/contextual-prefix.py:284-300` |
| M7 | `rerank.py` `save_cache()` uses blocking `fcntl.LOCK_EX` (no timeout). Could hang on a non-flock-capable filesystem (network mount). | `scripts/rerank.py:130-146` |
| M8 | Test coverage gap: `test_retrieve.py` doesn't exercise `--explain` or `--no-rerank` flag paths. | `tests/test_retrieve.py` |
| M9 | 4 skills (`wiki-ingest`, `wiki-query`, `save`, `autoresearch`) touched by both §3.2 and §3.4. Bounded-slices kernel partial. | commits 6c7671e + 66c11f9 |
| M10 | No verifier agents dispatched per-workstream during v1.7 development. This audit is the missing verifier pass. | process |
(The table lists 10 items; M9 and M10 come from the agent kernel section, §4.)
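Finding M2 is easy to reproduce in isolation. The sketch below applies the token regex quoted in M2 to a mixed-script string and compares it with a Unicode-aware alternative. The alternative pattern and the sample text are illustrative assumptions, not a proposed patch to `bm25-index.py`.

```python
import re

# Pattern as quoted in finding M2: ASCII letters only.
ASCII_TOKEN = re.compile(r"[A-Za-z][A-Za-z0-9'\-]*")

# Hypothetical alternative: start on any Unicode letter, then allow
# word characters, apostrophes and hyphens.
UNICODE_TOKEN = re.compile(r"[^\W\d_][\w'\-]*")

text = "Zettelkasten 노트 wiki-lock"

ascii_tokens = ASCII_TOKEN.findall(text)
unicode_tokens = UNICODE_TOKEN.findall(text)

print(ascii_tokens)    # the Korean word is silently missing
print(unicode_tokens)  # all three tokens are kept
```

In Python 3, `\w` on `str` patterns is Unicode-aware by default, so the alternative needs no extra flags.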
### 8.4 LOW (7)
| # | Finding | File:line |
|---|---|---|
| L1 | §3.1 substrate rewrite +17/-5. No deletion when "soft-defer→hard-prefer" arguably allowed pruning local fallback bodies. Documented + defensible, but flag. | commit 9c8e510 |
| L2 | `bin/setup-retrieve.sh` no timeout on Stage 1. Tier-2 (claude-cli) × 47 pages can take 5+ min. No progress indicator. | `bin/setup-retrieve.sh:128` |
| L3 | `bm25-index.py` has a dead `bm25_score()` function (27 lines, never called; comments say "placeholder"). | `scripts/bm25-index.py:196-223` |
| L4 | `--rebuild` flag on `bm25-index.py build` accepted but no-op. Documented as reserved for incremental mode (not in v1.7). Speculative complexity per kernel. | `scripts/bm25-index.py:279` |
| L5 | `--no-bm25` flag on `retrieve.py` accepted but returns EXIT_USAGE. Stub for future vector-only mode. | `scripts/retrieve.py:96-106` |
| L6 | `wiki-lock.sh` naming: `STALE_AFTER_SEC=60` (per-acquire) vs `clear-stale --max-age 3600` (admin) — both age thresholds but different concerns. Confusing for new reader. | `scripts/wiki-lock.sh:53,304` |
| L7 | BM25 divide-by-zero in `query()` is theoretically possible if `avg_dl == 0`. Verified: unreachable in practice (vocab is empty when all dl=0, so the divide path is never taken). Worth a defensive `or 1.0` guard anyway. | `scripts/bm25-index.py:249` |
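The defensive guard suggested in L7 can be shown on a standard Okapi BM25 term score. This is a textbook formulation with conventional `k1` and `b` defaults, not the code in `scripts/bm25-index.py`; `bm25_term_score` and its parameter names are hypothetical.

```python
import math

def bm25_term_score(tf: int, df: int, n_docs: int, dl: int,
                    avg_dl: float, k1: float = 1.2, b: float = 0.75) -> float:
    """Score one term for one document using Okapi BM25."""
    # Guard: an average document length of 0 would divide by zero below.
    safe_avg_dl = avg_dl or 1.0
    # Non-negative IDF variant (the "+1 inside the log" form).
    idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
    # Length normalisation: longer-than-average documents are penalised.
    norm = tf + k1 * (1 - b + b * dl / safe_avg_dl)
    return idf * tf * (k1 + 1) / norm
```

With the guard in place, `bm25_term_score(1, 1, 1, 0, 0.0)` returns a finite value instead of raising `ZeroDivisionError`.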
### 8.5 Counts
- BLOCKER: 1
- HIGH: 6
- MEDIUM: 10 (revised from 8 to include M9, M10 from agent kernel section)
- LOW: 7 (revised from 5)
- **Total Phase A findings: 24**
(Plan §1 expected 15-30. Within range.)
---
## 9. #1-best-ever verdict (Phase D)
Per-axis evaluation. Each axis: Y/N/Tie + evidence + gap-closer (if not yet #1).
| # | Axis | #1? | Evidence (verified) | Gap-closer (if not #1) |
|---|---|---|---|---|
| 1 | **Compounding wiki primitive** (Karpathy pattern, persistent vault, hot/index/log cadence) | **YES** | Karpathy pattern is rare in production. Only us + `ScrapingArt/Karpathy-LLM-Wiki-Stack` (build-ready reference, not a runtime) + Kompl (Apache-2.0, MCP-native) ship it. We have the most complete implementation: 13 skills, DragonScale extension, multi-agent support, 8-category lint. | n/a — we lead this axis structurally. |
| 2 | **Multi-writer safety** (per-file advisory locking, race-free parallel ingest) | **YES** | Verified unique vs Smart Connections (no locking), Copilot (no locking), Khoj (cloud-managed), NotebookLM (single-user surface). v1.7 ships `scripts/wiki-lock.sh` (~244 lines, age-based + atomic noclobber) as core. Benchmark `tests/test_concurrent_write.sh` proves 10 parallel workers, zero data loss. | n/a — closed the v1.6 latent bug; no competitor has caught up. |
| 3 | **Retrieval architecture** (contextual + hybrid BM25 + cosine rerank) | **YES** (free tier) / **TIED** (paid tier) | We ship contextual prefix + BM25 + cosine rerank as MIT core. **Benchmark: +39.5% error reduction vs v1.6 baseline; +30pp top-1 accuracy across 50 queries; +52pp on derived natural questions.** Smart Connections Pro paywalls configurable reranking. Copilot v3 has lexical fallback only — no rerank. Khoj uses pgvector but no documented reranker. NotebookLM doesn't expose retrieval primitives. | None on free axis. SC Pro is comparable on paid axis but we are also MIT — no acquisition cost. |
| 4 | **GUI / install ergonomics** | **NO** | We are CLI-only: requires Claude Code install + plugin marketplace add + vault clone + (optional) `bash bin/setup-retrieve.sh`. Smart Connections and Copilot ship as one-click Community Plugins. Claudian and deivid11/obsidian-claude-code-plugin offer in-vault Claude integration with GUI panels. SC 4.5.0 just promoted Footer Connections to Core (mobile-friendly). Our adoption surface is materially worse for non-developers. | **v2.5+ GUI plugin shell** (backlog #7, L-effort) closes the gap by wrapping the 13 skills in an Obsidian-native plugin. OR accept that claude-obsidian permanently serves a power-user niche. |
| 5 | **Derivative outputs** (audio, video, study guides, quizzes, mindmaps, briefs) | **NO** | We have zero. **NotebookLM (May 2026) ships 4 first-class tile types: Audio Overviews, Video Overviews, Mind Maps, Reports.** Plus existing Study Guides, Briefs, Quizzes, Data Tables. Copilot ships YouTube ingest + mind maps. Atlas Workspace ships mindmap synthesis. ElevenLabs GenFM + Nouswise ship two-host audio. The gap is widening (Video Overviews shipped after the compass artifact's snapshot). | **v2.0 `wiki-derive` skill** (backlog #5, #9, #14) brings parity on text + audio. Video parity requires expanding the v2.0 spec to include Marp slides + TTS narration → ffmpeg MP4 pipeline (new finding **M13**). Even with v2.0 shipped, NotebookLM's tight integration with Gemini 3 + Studio multi-tasking surface is a sustained-investment moat. |
| 6 | **Methodology support** (LYT/PARA/Zettelkasten/Generic modes) | **TIE** | We have none. Nobody else has either. Ideaverse Pro 2.0 ($200 paid vault) ships LYT as an opinionated structure, but it's a vault, not a skill set. PARA, Zettelkasten, generic modes: no Claude+Obsidian competitor ships these as first-class. | **v1.8 `wiki-mode` skill** (backlog #6, M-effort) closes the tie into a LEAD. Power-user PKM segment is unserved by competitors today. |
| 7 | **License / openness** (MIT, no paid features in core) | **YES** | MIT-licensed across all 13 skills + 9 scripts + 7 tests. Even the reranker is core (no Pro tier). Smart Connections paywalls advanced ranking, Bases workflows, inline discovery in Connections Pro. Copilot Plus paywalls Miyo file conversions, long-term memory, license-gated models. Khoj has cloud tier. NotebookLM Plus is $20/mo. We are structurally the most open. | n/a — Pro tier (v3+) remains explicitly deferred; license stance holds. |
### 9.1 Summary verdict
**We are #1 on 4 of 7 axes** (compounding wiki, multi-writer safety, retrieval-architecture-free-tier, license/openness). **TIED on 1** (methodology — nobody serves it). **NOT #1 on 2** (GUI ergonomics, derivative outputs).
**Roadmap effect** (assuming current backlog ships as planned):
- **v1.8** (methodology modes + reviews) → converts the methodology TIE into a 5th LEAD. We lead on **5 of 7 axes**.
- **v2.0** (derive: audio + quiz + study + slides + mindmap, plus the new M13 video addition) → brings derivative outputs from NO to **PARTIAL** (within striking distance of NotebookLM on text+audio; behind on video integration polish). Likely a TIE rather than a LEAD.
- **v2.5+** (GUI plugin shell) → converts the GUI/install NO to a TIE-or-LEAD depending on shell quality.
**Honest "is the repo #1 best ever?" answer**: NOT YET, AND NOT WITHOUT v2.0+. v1.7 makes the technical refoundation that puts category leadership in reach. v1.8 is the cheapest 5th lead. v2.0 is necessary for parity with NotebookLM on the consumer adoption axis. v2.5+ GUI shell is necessary to reach the mainstream Obsidian user base (vs the current power-user niche).
**What v1.7 ALREADY makes us #1 on, that nobody else can match in the short term:**
- The compounding-wiki primitive (years-of-context advantage for adopters)
- Multi-writer safety (genuinely unique architecture)
- Hybrid retrieval as free/MIT (SC Pro is the only paid match; nobody else has it)
- License openness (structural moat)
That's enough to credibly claim **"#1 on the axes that matter for sophisticated power users who control their own LLM stack."** It's NOT enough to claim "#1 best ever, full stop" — that requires GUI ergonomics + derivative outputs to land.
### 9.2 Calibrated confidence
The benchmark (Phase B) gives high confidence on axis 3 (retrieval). Independent agent reviews plus main-thread verification (Phase A) give high confidence on axes 1, 2, 7. Axis 4 (GUI) is structural and easy to verify by looking at competitor install surfaces. Axis 5 (derivatives) is verified against May 2026 NotebookLM data. Axis 6 (methodology) is a true tie: no competitor was verified shipping LYT/PARA/Zettel modes.
Overall verdict confidence: **HIGH**. The verdict is earned by evidence, not asserted.
---
## 10. Prioritized punch list (Phase D)
Every finding from §3, §4, §6, §7 mapped to a target milestone. Items within each milestone are ordered by estimated effort (S/M/L) and dependency (independent first).
### 10.1 Push-blocker (must fix before any public push)
| # | Finding | Effort | Notes | Status |
|---|---|---|---|---|
| B1 | `contextual-prefix.py` data egress without consent | S (~1h) | Add `--allow-egress` flag default-off; mirror the `tiling-check.py:351-352` `--allow-remote-ollama` precedent. `bin/setup-retrieve.sh` adds a "Continue? [y/N]" prompt before Stage 1 if any non-synthetic tier is selected. Document in `skills/wiki-retrieve/SKILL.md` Data Privacy callout (closes H6). | **FIXED in v1.7.1 commit `ca68bb6`** |
### 10.2 v1.7.1 patch (within 1 week of push)
| # | Finding | Effort | Status |
|---|---|---|---|
| H1 | `bin/setup-retrieve.sh` no rollback if Stage 1 fails partway | S (~30min) — catch non-zero from contextual-prefix.py; print recovery hint | **FIXED in v1.7.1 commit `4837d4f`** |
| H2 | `make clean-test-state` doesn't remove v1.7 artifacts | S (~10min) — extend the rm pattern to match v1.7 gitignore additions | **FIXED in v1.7.1 commit `7e1f187`** |
| H3 | `hooks/hooks.json` PostToolUse `|| true` swallows lock-check errors | S (~30min) — restructure to test exit code explicitly | **FIXED in v1.7.1 commit `7120970`** |
| H4 | Process gap: no verifier-agent pass at workstream gates | M — process change, not a code fix; document a `superpowers:verification-before-completion` checkpoint in `agents/` for future releases | **FIXED in v1.7.1 commit `3ea443f` (new `agents/verifier.md` + CLAUDE.md reference)** |
| H5 | `detect-transport.sh` JSON escaping via shell substitution | S (~20min) — pipe through python3 json.dumps | **FIXED in v1.7.1 commit `722ac97`** |
| H6 | `skills/wiki-retrieve/SKILL.md` doesn't document data egress | S (~10min) — Data Privacy callout (bundle with B1 fix) | **FIXED in v1.7.1 commit `ca68bb6`** (bundled with B1) |
Total v1.7.1 effort: ~2.5 hours focused work. Recommend a single fix-and-test session, push v1.7.1 instead of v1.7.0.
**v1.7.1 execution closeout (2026-05-17)**:
- 6 commits landed on `v1.7.0-compound-vault`: `ca68bb6`, `4837d4f`, `7e1f187`, `7120970`, `722ac97`, `3ea443f` (in execution order).
- All 7 findings (1 BLOCKER + 6 HIGH) closed.
- `make test` 7 suites green after each commit; final run also green.
- `bash bin/setup-retrieve.sh --no-llm` end-to-end re-provisioned cleanly post-fixes.
- Version bumped to 1.7.1 in `.claude-plugin/plugin.json` + `.claude-plugin/marketplace.json`; `CHANGELOG.md` entry added.
- Branch remains local-only; no push, no tag. Awaiting user authorization to push + tag `v1.7.1`.
**Post-fix self-audit (2026-05-17, same session)**: a re-pass with the new `agents/verifier.md` against the v1.7.1 slice surfaced 2 MEDIUM + 3 LOW polish items (none functional). All 5 closed in a single follow-up commit, with verifier re-pass returning 0/0/0/0 and SHIP verdict. See `## Polish` block in the [1.7.1] CHANGELOG entry for per-file detail. The hook breadcrumb path (`.vault-meta/hook.log`) was empirically verified under 10× parallel hook fires (atomic appends; no interleaving) and format-string-injection probe (printf uses literal format with %s placeholders only).
**Second self-audit round (chair adversarial probe, same session)**: the user challenged the 100/100 self-grade. A deeper chair-led probe surfaced three real items the verifier missed: (a) `.vault-meta/hook.log` was not in `.gitignore`, creating a self-pollution loop where the breadcrumb file would be auto-staged by the same hook that wrote it; (b) `CLI_VERSION_RAW` was not in the top-of-script init block in `detect-transport.sh`, working today only by bash short-circuit semantics under `set -u`; (c) `verifier.md` `tools:` was converted to YAML list in P2, but the in-repo precedent (`wiki-ingest.md`, `wiki-lint.md`) and the canonical form across `~/.claude/agents/` is CSV — the polish introduced a single-file style outlier. All three closed in a follow-up commit. Lesson: even verifier-validated SHIP slices benefit from a third pass of adversarial chair scrutiny; the agent kernel's "explorers map, workers implement, verifiers gate" still leaves the chair as the final accountability layer.
**v1.7.2 + v1.8.0 plan execution (same session)**: the user further requested "best ever per priority research." Plan written at [v1.7.2-sss-plus-plan.md](v1.7.2-sss-plus-plan.md) with acceptance criteria + 6h hard cap + 2-round verify-fix cap. Phase 2 (LOC pruning) honest outcome: pruned 43 LOC of dead code (closing L3/L4/L5) but the `main..HEAD` net delta is `+6009 / -30`, NOT meeting the plan's `≤+5000 OR ≥-200` criterion. Per the plan §4 failure-mode clause: "Do not invent prunes to game the metric." Honest decomposition: ~5500 LOC across new files alone (4 new scripts + 4 new tests + 2 new skills + 1 new agent + 1 new bin + ~2200 LOC docs). The +6009 IS the substrate; v1.6 had no equivalent of a retrieval pipeline, lock primitive, transport detector, or contextual prefix generator to delete. The kernel principle "delete more than you add" presumes refactor or maintenance; v1.7 was net-new feature substrate. **The kernel-application axis honestly tops out at ~92-95** for this release, not 100; the deduction is structural to building substrate, not negligence.
**v1.7.2 closure status (2026-05-17, end of v1.7 line audit-debt remediation)**:
- BLOCKER: **1/1 closed** (v1.7.1 `ca68bb6`)
- HIGH: **6/6 closed** (v1.7.1 `ca68bb6`, `4837d4f`, `7e1f187`, `7120970`, `722ac97`, `3ea443f`)
- MEDIUM: **10/10 Phase A items addressed** (M1-M10), plus Phase B items M11 and M12: M1 documented as irreducible; M2 closed `8c219fb`; M3-M7 closed `d0db354`; M8 closed `a80ae61`; M9 documented as process-defer; M10 closed by v1.7.1 H4 `3ea443f`; M11 still open (synonym tied 60/60, filed for v1.7.x rerank tuning); M12 empirically closed (was tied 40/40 in v1.7.0, now 40/20 after Unicode tokenizer change in `8c219fb`)
- LOW: **7/7 addressed**: L1 documented as process-defer; L2 closed `59cd7c8`; L3-L5 closed `eafd449`; L6 closed `59cd7c8`; L7 closed `59cd7c8`
- v1.7.2 benchmark refresh (full 50 queries): v17 top-1 54.0% / top-5 88.0% vs v16 22.0% / 44.0%. Δ top-1 +32pp, error-reduction +41% (ship gate ≥30%, PASS). Slightly beats v1.7.0 audit's +30pp/+39.5% measurement.
- Version bumped to 1.7.2 in `.claude-plugin/plugin.json` + `marketplace.json`; CHANGELOG `[1.7.2]` entry comprehensive.
- v1.7 line audit-debt is now CLOSED-or-formally-DEFERRED. v1.8.0 (methodology modes) is the next scope per the user's "best ever per priority research" goal.
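The error-reduction figure in the benchmark refresh above can be reproduced from the reported top-1 numbers. The sketch below assumes "error reduction" means the relative drop in top-1 miss rate; that definition is an assumption on our part (it reproduces the reported +41%), and the function name is illustrative.

```python
def error_reduction(baseline_acc, new_acc):
    """Relative reduction in miss rate when accuracy moves from baseline to new."""
    baseline_err = 1.0 - baseline_acc
    new_err = 1.0 - new_acc
    return (baseline_err - new_err) / baseline_err


# Top-1 accuracies from the v1.7.2 benchmark refresh (50 queries).
top1_v16, top1_v17 = 0.22, 0.54

delta_pp = (top1_v17 - top1_v16) * 100
reduction = error_reduction(top1_v16, top1_v17)
print(f"delta top-1: +{delta_pp:.0f}pp, error reduction: {reduction:.1%}")
```

Under this definition the miss rate falls from 78% to 46%, which is 32/78 of the baseline error.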
### 10.3 v1.7.x (defer to next minor; file as issues)
| # | Finding | Notes |
|---|---|---|
| M1 | §3.2 net +485/-0 LOC; no v1.6 cruft pruned | Document or prune; low-impact |
| M2 | `bm25-index.py` non-ASCII tokenization silently drops content | Document as known limitation; add Unicode-aware tokenizer in v1.7.x |
| M3 | `rerank.py --allow-remote-ollama` error message blames user incorrectly | Improve error to mention forwarding from retrieve.py |
| M4 | `wiki-lock.sh validate_path` accepts paths with newlines | Add `case "$p" in *$'\n'*) die "newlines" 4 ;;` |
| M5 | `retrieve.py import_sibling` doesn't catch ImportError | Wrap in try/except with user-friendly error |
| M6 | `contextual-prefix.py` empty-body edge case is silent | Add WARN log |
| M7 | `rerank.py save_cache()` blocks indefinitely on non-flock filesystem | Add LOCK_NB + retry with timeout |
| M8 | `test_retrieve.py` missing --explain and --no-rerank coverage | Add 2 test cases |
| M9 | Bounded-slices: 4 skills touched by both §3.2 and §3.4 | Process note for future releases; not a bug |
| M10 | No verifier agents during v1.7 dev | Same as H4 process item |
| M11 | Synonym category benchmark tied (60% both pipelines) | Investigate why rerank didn't help; tune in v1.7.x or document |
| M12 | Negative-query precision tied at 40% | Tune rerank to suppress low-confidence top results below threshold |
| L7 | BM25 divide-by-zero in `query()` is theoretically reachable | Defensive `or 1.0` guard |
| L8 | Cross-page top-1 tied at 30% | Per-source weighting or ensemble scoring; v1.7.x optimization |
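For M2, the difference between an ASCII-only and a Unicode-aware tokenizer can be shown in a few lines. This is a sketch only: the ASCII pattern is an assumed stand-in for whatever `bm25-index.py` used before the fix, and both function names are hypothetical.

```python
import re

# Assumed shape of an ASCII-only tokenizer: anything outside a-z/0-9 is dropped.
ASCII_TOKEN = re.compile(r"[a-z0-9]+")
# In Python 3, \w on str patterns is Unicode-aware by default
# (it also matches the underscore).
UNICODE_TOKEN = re.compile(r"\w+")


def tokenize_ascii(text):
    return ASCII_TOKEN.findall(text.lower())


def tokenize_unicode(text):
    return UNICODE_TOKEN.findall(text.lower())


sample = "Caf\u00e9 \ud55c\uad6d\uc5b4 notes"
print(tokenize_ascii(sample))    # accented and Hangul content is lost or truncated
print(tokenize_unicode(sample))  # every word survives as a token
```

The ASCII variant does not fail loudly; it simply emits fewer tokens, which is why the finding describes the drop as silent.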
### 10.4 v1.8 (methodology modes + reviews — already in roadmap)
- Backlog item #6 (`wiki-mode`): LYT / PARA / Zettelkasten / Generic. Closes methodology TIE into 5th LEAD per §9 verdict.
- Backlog item #11 (`wiki-review`): PARA-aware weekly/monthly/quarterly reviews.
### 10.5 v1.9 (multimodal ingest — already in roadmap)
- Backlog item #12 (YouTube/PDF/audio/image ingest).
- Backlog item #8 (NotebookLM/Readwise/Zotero adapters).
- M14 (new): EPUB upload is now table-stakes per NotebookLM May 2026; ensure `adapter-epub.py` is on the v1.9 list.
### 10.6 v2.0 (derive — already in roadmap, scope adjusted)
- Backlog item #5 (audio).
- Backlog items #9 + #14 (quiz, flashcards, study-guide, brief, slides, mindmap).
- **NEW (M13)**: Add **Video Overviews** to v2.0 `wiki-derive` spec — Marp slides + TTS narration → ffmpeg MP4. Required for NotebookLM parity per Phase C findings.
### 10.7 v2.5+ (GUI onramp — major effort)
- Backlog item #7: Obsidian-plugin shell. Fork Claudian or deivid11/obsidian-claude-code-plugin pattern. Wraps the 13 skills in an in-vault GUI. L-effort. Closes §9 axis #4 gap.
### 10.8 Polish PR (bundle before v1.8)
| # | Finding | Why |
|---|---|---|
| L1 | §3.1 substrate rewrite +17/-5 (no deletion) | Documented + defensible; flag for posterity |
| L2 | `bin/setup-retrieve.sh` no Stage 1 timeout | Add progress indicator + timeout |
| L3 | `bm25-index.py` dead `bm25_score()` function | Delete 27 unused lines |
| L4 | `--rebuild` flag on bm25-index.py is no-op | Decide: implement incremental, or remove flag |
| L5 | `--no-bm25` flag on retrieve.py is no-op | Decide: implement vector-only, or remove |
| L6 | `wiki-lock.sh` STALE_AFTER_SEC vs --max-age naming | Rename for clarity |
| L9 | SC 4.5.0 Footer Connections promoted to Core (UX widening) | Narrative note for positioning copy; we don't directly compete |
| L10 | Copilot CLI integration issue stale 3 months | Surface in positioning: "the only Claude+Obsidian stack that's actually CLI-native today" |
### 10.9 Finding counts
| Tier | Phase A | Phase B | Phase C | Total |
|---|---|---|---|---|
| BLOCKER | 1 | 0 | 0 | **1** |
| HIGH | 6 | 0 | 0 | **6** |
| MEDIUM | 10 | 2 (M11, M12) | 2 (M13, M14) | **14** |
| LOW | 7 | 1 (L8) | 2 (L9, L10) | **10** |
| **Total** | **24** | **3** | **4** | **31** |
Plan §1 expected 15-30. **31** is slightly over because Phases B + C surfaced unforeseen findings (the benchmark exposed the synonym/negative ties; the market recheck exposed the NotebookLM Video Overviews expansion). Reasonable overage; nothing was filed at higher severity than evidence supports.
---
## Appendix A — 50-query benchmark corpus (Phase B — PENDING)
---
## Appendix B — Per-commit six-cut walkthrough
Already inline at §3.2; expand here if user wants per-file evidence captures.
---
## Appendix C — Raw competitor responses (Phase C — PENDING)
+551
View File
@@ -0,0 +1,551 @@
# v1.7.1 Fixes Plan — Post-Audit Punch List
**Status:** READY TO EXECUTE
**Date:** 2026-05-17
**Source:** [v1.7.0 audit](v1.7.0-audit-2026-05-17.md) §10.1 + §10.2
**Branch:** continue on `v1.7.0-compound-vault` (local-only; no push)
**Goal:** close the 1 BLOCKER + 6 HIGH findings; tag/push **`v1.7.1`** (not `v1.7.0`) when done
**Estimated effort:** ~2.5 hours of focused work; 7 fixes in 6 small commits (Fix 6 is bundled into the Fix 1 commit)
---
## Pre-flight (read once before starting)
- Working tree state on resume:
- Untracked: `docs/audits/v1.7.0-audit-2026-05-17.md`, `docs/audits/v1.7.1-fixes-plan.md` (this file), `scripts/baseline-v16.py`, `scripts/benchmark-runner.py`
- Auto-committed during audit: `96a5505 wiki: auto-commit 2026-05-17 03:17` (the retrieval benchmark corpus at `wiki/meta/retrieval-benchmark-v1.7.md`)
- Branch head: `96a5505` (the auto-commit, sitting on top of `4a362ed`, the last v1.7 feat code commit)
- `make test` is green (7 suites, ~1162 assertions)
- `bash bin/setup-retrieve.sh --no-llm` is already provisioned (`.vault-meta/chunks/`, `.vault-meta/bm25/`, `.vault-meta/embed-cache.json` exist locally; gitignored)
- The PostToolUse hook will auto-commit any change under `wiki/`. Soft-reset after if you want the fix grouped with a chore commit.
- After every fix: run `make test`. If green, commit. If red, do not commit until green.
---
## Fix 1 (BLOCKER) — `contextual-prefix.py` data-egress consent gap
**Severity:** BLOCKER (must fix before any public push)
**Source:** audit §3.2 commit `45a5bd3` cut #6; cross-confirmed by Agent 1 + Agent 3
**File:** [`scripts/contextual-prefix.py`](../../scripts/contextual-prefix.py)
**Precedent to mirror:** [`scripts/tiling-check.py:351-352`](../../scripts/tiling-check.py) `--allow-remote-ollama` flag + localhost guard
### What's broken
`pick_prefix_tier()` at lines 252-258 reads:
```python
def pick_prefix_tier(force_synthetic):
    if force_synthetic:
        return "synthetic"
    if os.environ.get("ANTHROPIC_API_KEY"):
        return "anthropic-api"   # ← silently sends page bodies off-machine
    if shutil.which("claude"):
        return "claude-cli"      # ← also off-machine, via subprocess
    return "synthetic"
```
A user with `ANTHROPIC_API_KEY` in their env (very common — CC users) runs `bash bin/setup-retrieve.sh` and gets their wiki page bodies streamed to `https://api.anthropic.com/v1/messages` with no prompt, no log warning, no opt-in. The existing precedent (tiling-check.py's `--allow-remote-ollama`) shows this codebase already has a pattern for explicit consent on data egress — contextual-prefix.py just didn't follow it.
### Changes
1. **`scripts/contextual-prefix.py`** — add `--allow-egress` CLI flag, default `False`. Without the flag, `pick_prefix_tier()` returns `"synthetic"` regardless of env vars or `claude` binary. Update the help text in the docstring (lines 11-19) to document the flag + the default.
```python
# in main(), after the existing flags:
parser.add_argument("--allow-egress", action="store_true",
                    help="Allow tier-1 (Anthropic API) or tier-2 (claude CLI subprocess) "
                         "prefix generation. Without this flag, page bodies stay on-machine "
                         "and only the tier-3 synthetic prefix is used. Mirror of "
                         "tiling-check.py's --allow-remote-ollama guard.")
# in pick_prefix_tier(), add allow_egress parameter and guard early:
def pick_prefix_tier(force_synthetic, allow_egress=False):
    if force_synthetic or not allow_egress:
        return "synthetic"
    if os.environ.get("ANTHROPIC_API_KEY"):
        return "anthropic-api"
    if shutil.which("claude"):
        return "claude-cli"
    return "synthetic"
# at the process_page call site, pass through:
tier = pick_prefix_tier(force_synthetic, allow_egress=args.allow_egress)
```
2. **`bin/setup-retrieve.sh`** — when invoked WITHOUT `--no-llm`, prompt the user. Add right after the prefix-tier-picker block (around line 100):
```bash
if ! $NO_LLM && ! $CHECK_ONLY; then
  case "$PREFIX_TIER" in
    *anthropic-api*|*claude-cli*)
      say ""
      say "⚠️  Stage 1 will send page bodies off-machine via the '$PREFIX_TIER' tier."
      say "   Estimated egress: ~\$0 (claude-cli, free) to ~\$12 per 1,000 pages (Anthropic API)."
      say "   Per-page bodies are POSTed to the LLM; check the Anthropic privacy policy."
      printf "   Continue? [y/N]: "
      read -r reply
      case "$reply" in
        [yY]|[yY][eE][sS]) say "Proceeding with egress." ;;
        *) say "Aborted. Re-run with --no-llm for the synthetic-only path." ; exit 0 ;;
      esac
      ;;
  esac
fi
# Also: when contextual-prefix.py is invoked, pass --allow-egress through:
ARGS=("--all")
$NO_LLM && ARGS+=("--no-llm")
! $NO_LLM && ARGS+=("--allow-egress")   # ← new line
$REBUILD && ARGS+=("--rebuild")
3. **`skills/wiki-retrieve/SKILL.md`** — add a Data Privacy callout near the top, right after the substrate note (bundle with H6 fix below):
```markdown
## Data privacy (v1.7.1+)
Tier 1 (Anthropic API) and tier 2 (claude CLI subprocess) of the contextual-prefix
generator send wiki page bodies off-machine. By default, both tiers are GUARDED behind:
- `scripts/contextual-prefix.py --allow-egress` flag (default off → falls through to tier 3)
- `bin/setup-retrieve.sh` consent prompt before any non-synthetic Stage 1 run
To run fully on-machine (tier 3 synthetic prefix + local ollama rerank), use:
`bash bin/setup-retrieve.sh --no-llm`. This is the default if you do not pass --allow-egress.
The egress guard mirrors `scripts/tiling-check.py:351-352`'s `--allow-remote-ollama` precedent.
```
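The guard from step 1 can be exercised in isolation. The sketch below keeps the tier order from the plan but adds two injection points, `environ` and `which`, purely so the behaviour can be checked without touching the real environment; both parameters are hypothetical and not part of `scripts/contextual-prefix.py`.

```python
import os
import shutil


def pick_prefix_tier(force_synthetic, allow_egress=False, environ=None, which=shutil.which):
    """Choose a prefix tier; off-machine tiers require explicit opt-in.

    `environ` and `which` are test-only injection points (assumptions).
    """
    environ = os.environ if environ is None else environ
    # Without consent, never leave the machine, whatever the environment offers.
    if force_synthetic or not allow_egress:
        return "synthetic"
    if environ.get("ANTHROPIC_API_KEY"):
        return "anthropic-api"
    if which("claude"):
        return "claude-cli"
    return "synthetic"


fake_env = {"ANTHROPIC_API_KEY": "test"}
no_cli = lambda _name: None

# Key present but no consent: stays on-machine.
print(pick_prefix_tier(False, environ=fake_env, which=no_cli))
# Key present and consent given: the API tier is selected.
print(pick_prefix_tier(False, allow_egress=True, environ=fake_env, which=no_cli))
```

Keeping the consent check as the first branch means a future tier added below it inherits the guard automatically.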
### Verification
```bash
# 1. Existing tests must still pass
make test # 7 suites green
# 2. New behavior: without --allow-egress, tier defaults to synthetic
python3 scripts/contextual-prefix.py wiki/concepts/Hot\ Cache.md --peek
# Expect log line: "tier=synthetic"
# 3. With --allow-egress AND key set, tier picks api/cli
ANTHROPIC_API_KEY=test python3 scripts/contextual-prefix.py wiki/concepts/Hot\ Cache.md --peek --allow-egress
# Expect log line: "tier=anthropic-api"
# 4. setup-retrieve.sh consent prompt fires when non-synthetic
echo "" | bash bin/setup-retrieve.sh # press enter at the prompt
# Expect: prompt appears, defaults to "no", script exits cleanly with "Aborted."
```
### Commit message
```
fix(v1.7.1): contextual-prefix egress requires explicit consent
Close the v1.7.0 audit BLOCKER (docs/audits/v1.7.0-audit-2026-05-17.md §3.2,
§8.1 B1). pick_prefix_tier() was selecting tier-1 (Anthropic API) whenever
ANTHROPIC_API_KEY was set in env — no flag, no prompt, no warning. Wiki page
bodies streamed off-machine without user opt-in.
Mirror the existing precedent at scripts/tiling-check.py:351-352
(--allow-remote-ollama + localhost-only default):
- scripts/contextual-prefix.py: new --allow-egress flag (default off).
pick_prefix_tier() now requires allow_egress=True to pick non-synthetic tiers.
- bin/setup-retrieve.sh: prompts before any non-synthetic Stage 1 run; defaults
to abort on empty/no input; passes --allow-egress through to the helper.
- skills/wiki-retrieve/SKILL.md: Data Privacy callout at top of skill body.
v1.6 vaults that never ran setup-retrieve.sh see no behavior change. v1.7
adopters who DID run it will be prompted on their next refresh.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
```
---
## Fix 2 (HIGH H1) — `bin/setup-retrieve.sh` no rollback if Stage 1 fails partway
**Severity:** HIGH
**File:** [`bin/setup-retrieve.sh`](../../bin/setup-retrieve.sh)
### What's broken
`bin/setup-retrieve.sh:128-140` runs `python3 scripts/contextual-prefix.py --all` and continues to Stage 2 regardless of return code. If Stage 1 fails on chunk 31 of 47, partial `.vault-meta/chunks/` is left, Stage 2 builds a stale index, the user has no documented recovery path.
### Changes
Add explicit exit-code check + recovery hint:
```bash
# Replace the existing Stage 1 invocation block with:
say ""
say "═══ Stage 1/2: chunking + contextual-prefix generation ═══"
ARGS=("--all")
$NO_LLM && ARGS+=("--no-llm")
! $NO_LLM && ARGS+=("--allow-egress")   # from Fix 1
$REBUILD && ARGS+=("--rebuild")
# Capture the exit code with `||`: inside `if ! cmd; then`, $? is already 0
# because it reflects the negated test, not the command.
STAGE1_RC=0
python3 "$VAULT/scripts/contextual-prefix.py" "${ARGS[@]}" || STAGE1_RC=$?
if [ "$STAGE1_RC" -ne 0 ]; then
  warn "Stage 1 failed (rc=$STAGE1_RC). Partial chunks are at:"
  warn "  $META/chunks/"
  warn "Recovery options:"
  warn "  1. Re-run setup-retrieve.sh — incremental skip will resume from body_hash"
  warn "  2. Wipe and start over: rm -rf $META/chunks/ && bash bin/setup-retrieve.sh"
  warn "  3. Provision the failed page only: python3 scripts/contextual-prefix.py wiki/<failing-page>.md --rebuild"
  exit 5
fi
```
### Verification
```bash
# Force a failure: invalid path that the script will try to read
echo "" | python3 scripts/contextual-prefix.py /nonexistent.md
# Then run setup, expect the failure-mode warning
bash bin/setup-retrieve.sh --no-llm
# Hand-verify recovery message appears with the three options
```
### Commit message
```
fix(v1.7.1): setup-retrieve rollback path on Stage 1 failure
Close audit H1. bin/setup-retrieve.sh ignored Stage 1's exit code and
proceeded to Stage 2 with partial chunks. Now exits 5 on Stage 1 failure
with a 3-option recovery hint (incremental resume, full wipe, single-page
re-process).
```
---
## Fix 3 (HIGH H2) — `make clean-test-state` doesn't remove v1.7 artifacts
**Severity:** HIGH (hygiene; test artifacts persist)
**File:** [`Makefile`](../../Makefile)
### What's broken
`Makefile:55-61` `clean-test-state` target removes v1.6 lockfiles + tiling cache + bm25 lock + embed-cache, but does NOT remove the v1.7 artifacts that `bin/setup-retrieve.sh` provisions: `.vault-meta/chunks/`, `.vault-meta/bm25/`, `.vault-meta/locks/`, `.vault-meta/transport.json`. The gitignore (commit `753fc8a`) correctly lists these as runtime artifacts; the Makefile target should match.
### Changes
```makefile
clean-test-state:
	@rm -f .vault-meta/.address.lock .vault-meta/.tiling.lock .vault-meta/.bm25.lock \
	       .vault-meta/.embed-cache.lock .vault-meta/.wiki-lock.meta \
	       .vault-meta/tiling-cache.json \
	       .vault-meta/tiling-cache.*.tmp .vault-meta/embed-cache.json \
	       .vault-meta/embed-cache.*.tmp .vault-meta/transport.json \
	       .vault-meta/transport.*.tmp
	@rm -rf .vault-meta/chunks/ .vault-meta/bm25/ .vault-meta/locks/
	@echo "Runtime lockfiles, caches, and v1.7 retrieval/lock artifacts removed."
```
### Verification
```bash
bash bin/setup-retrieve.sh --no-llm # provision artifacts
test -d .vault-meta/chunks && echo "chunks exist" # should print
test -f .vault-meta/bm25/index.json && echo "bm25 exists" # should print
make clean-test-state # remove
test -d .vault-meta/chunks || echo "chunks gone" # should print
test -f .vault-meta/bm25/index.json || echo "bm25 gone" # should print
make test # still green
```
### Commit message
```
fix(v1.7.1): clean-test-state removes v1.7 retrieval artifacts
Close audit H2. The gitignore (753fc8a) listed chunks/, bm25/, locks/,
transport.json as regenerable artifacts but make clean-test-state only
removed v1.6 caches. Extend to match the gitignore set.
```
---
## Fix 4 (HIGH H3) — PostToolUse hook `|| true` swallows lock-check errors
**Severity:** HIGH (safety property silently degrades on any check failure)
**File:** [`hooks/hooks.json`](../../hooks/hooks.json)
### What's broken
`hooks/hooks.json:34-37` PostToolUse command is one long pipeline ending in `|| true`. If `bash scripts/wiki-lock.sh list` errors (permission denied on `.vault-meta/.wiki-lock.meta`, missing script, etc.), the `||true` swallows it and `git add/commit` still fires. The intended "defer commit while locks held" property silently degrades to "always commit."
### Changes
Restructure to test the lock-list exit code explicitly:
```json
{
  "matcher": "Write|Edit",
  "hooks": [
    {
      "type": "command",
      "command": "[ -d .git ] || exit 0; if [ -x scripts/wiki-lock.sh ]; then LOCK_OUT=$(bash scripts/wiki-lock.sh list 2>/dev/null); LOCK_RC=$?; if [ \"$LOCK_RC\" != \"0\" ]; then exit 0; fi; if [ -n \"$LOCK_OUT\" ]; then exit 0; fi; fi; git add wiki/ .raw/ .vault-meta/ 2>/dev/null && (git diff --cached --quiet || git commit -m \"wiki: auto-commit $(date '+%Y-%m-%d %H:%M')\" 2>/dev/null) || true"
    }
  ]
}
```
Key changes:
- `[ -d .git ] || exit 0`: early bail if no git
- `LOCK_OUT=$(...); LOCK_RC=$?` captures the wiki-lock script's exit code separately. The listing is not piped through `wc -l | tr` inside the same substitution, because `$?` would then report the last pipeline stage (`tr`) instead of `wiki-lock.sh`.
- If the lock check errored, EXIT 0 (defer commit; do not auto-commit on unknown state)
- If the listing is non-empty, locks are held: EXIT 0 (defer commit)
- The final `|| true` now only covers the git commands themselves
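The decision the hook command encodes can be written out as a small truth table. This is a sketch that models the command string as written, not code from the repo; the function name and its three inputs are hypothetical.

```python
def should_auto_commit(script_executable, lock_rc, lock_listing):
    """Return True when the PostToolUse hook should go on to `git add`/`git commit`.

    Hypothetical model of the shell one-liner, for reasoning and testing only.
    """
    if not script_executable:
        # The `[ -x scripts/wiki-lock.sh ]` guard skips the lock check entirely.
        return True
    if lock_rc != 0:
        # Unknown lock state: defer rather than commit.
        return False
    if lock_listing.strip():
        # At least one lock is listed: defer until it is released.
        return False
    return True


for case in [(True, 0, ""), (True, 0, "wiki/concepts/Test.md"), (True, 1, ""), (False, 0, "")]:
    print(case, "->", should_auto_commit(*case))
```

Writing the branches out this way makes the one surprising case visible: a missing or non-executable lock script does not block the commit.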
### Verification
```bash
# Manual: temporarily chmod -x scripts/wiki-lock.sh; trigger a Write tool;
# verify the hook still auto-commits (the `[ -x ... ]` guard skips the lock check harmlessly)
# Restore: chmod +x scripts/wiki-lock.sh
# Manual: hold a lock, trigger a Write, verify NO commit fires
bash scripts/wiki-lock.sh acquire wiki/concepts/Test.md
# (then trigger a Write tool via Claude — verify no "wiki: auto-commit" in git log)
bash scripts/wiki-lock.sh release wiki/concepts/Test.md
```
### Commit message
```
fix(v1.7.1): PostToolUse hook tests lock-check exit code explicitly
Close audit H3. The hook's || true terminator swallowed errors from
`wiki-lock list`. If the lock check failed for any reason (permission,
missing script, deleted meta-lock), the safety property silently degraded
to "always commit." Restructure: capture LOCK_RC separately; on non-zero,
exit 0 (defer commit on unknown state) instead of falling through.
```
---
## Fix 5 (HIGH H5) — `detect-transport.sh` JSON escaping
**Severity:** HIGH (theoretical; obsidian-cli unlikely to emit malicious version strings, but defense in depth)
**File:** [`scripts/detect-transport.sh`](../../scripts/detect-transport.sh)
### What's broken
`scripts/detect-transport.sh:79,86` substitutes `obsidian-cli --version` output directly into the JSON via shell variable expansion. Only `tr -d '"'` is applied. Newlines, backslashes, control chars would break the JSON.
### Changes
Wrap the version capture in a Python json-escape pass. Add this helper at the top of the script (after the var defs):
```bash
json_escape() {
  python3 -c 'import json,sys; print(json.dumps(sys.stdin.read().strip()), end="")'
}
```
Then change lines 79 and 86 from:
```bash
CLI_VERSION="$(obsidian-cli --version 2>/dev/null | head -1 | tr -d '"' || echo unknown)"
```
to:
```bash
CLI_VERSION="$(obsidian-cli --version 2>/dev/null | head -1 | json_escape || echo '"unknown"')"
```
Note: `json_escape` outputs an already-quoted JSON string, so the `snapshot()` heredoc needs to drop the quotes around `${CLI_VERSION}`:
```bash
# in snapshot(), change:
"version_string": "${CLI_VERSION}",
# to:
"version_string": ${CLI_VERSION},
```
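Why the escape pass matters can be seen by comparing naive string interpolation with `json.dumps` on a hostile version string. This is a sketch: the two snapshot helpers and the sample string are hypothetical stand-ins for the heredoc in `snapshot()`, which is not reproduced here.

```python
import json


def naive_snapshot(version):
    # Mirrors plain shell-style substitution into a quoted JSON field.
    return '{"version_string": "%s"}' % version


def escaped_snapshot(version):
    # json.dumps returns an already-quoted, fully escaped JSON string,
    # so the template must not add its own quotes.
    return '{"version_string": %s}' % json.dumps(version.strip())


# Contains a double quote, a backslash, and a newline.
tricky = 'Obsidian "1.12"\\beta\nline2'

try:
    json.loads(naive_snapshot(tricky))
    naive_ok = True
except json.JSONDecodeError:
    naive_ok = False

roundtrip = json.loads(escaped_snapshot(tricky))["version_string"]
print("naive parses:", naive_ok)
print("escaped round-trips:", roundtrip == tricky)
```

Stripping quotes with `tr -d '"'` handles only one of the three problem characters, which is the gap the finding describes.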
### Verification
```bash
bash scripts/detect-transport.sh --peek --quiet | python3 -c 'import json,sys; json.load(sys.stdin); print("OK")'
# Expect "OK"
# Synthetic test: rig the function to receive a JSON-breaking value
echo 'Obsidian "1.12"' | python3 -c 'import json,sys; print(json.dumps(sys.stdin.read().strip()))'
# Expect: "Obsidian \"1.12\""
```
### Commit message
```
fix(v1.7.1): detect-transport.sh escapes version_string as JSON
Close audit H5. CLI_VERSION came from `obsidian-cli --version` and was
embedded in JSON via shell variable expansion with only `tr -d '"'`
applied. Backslashes, newlines, control chars would break the output.
Pipe through `python3 json.dumps` (json_escape() helper).
```
---
## Fix 6 (HIGH H6) — `skills/wiki-retrieve/SKILL.md` no Data Privacy callout
**Severity:** HIGH (transparency; documents the egress posture)
**File:** [`skills/wiki-retrieve/SKILL.md`](../../skills/wiki-retrieve/SKILL.md)
### What's broken
The skill frontmatter description doesn't mention that tier-1/tier-2 contextual prefix sends page bodies off-machine. The architecture section implies it; the user-facing description doesn't surface it.
### Changes
**This fix is BUNDLED with Fix 1** — the Data Privacy callout in §"Data privacy (v1.7.1+)" added by Fix 1 closes this. No separate commit needed; the Fix 1 commit message references "closes H6" already.
Alternatively, if the bundled callout from Fix 1 wasn't sufficient, expand the frontmatter description to add a privacy phrase:
```yaml
description: "Hybrid retrieval primitive ... [existing text] ... Triggers on: retrieve, hybrid retrieval, BM25, rerank, contextual retrieval, search the chunks, ... DATA PRIVACY: tier-1/tier-2 contextual prefix sends page bodies to LLM (Anthropic API / claude CLI) and requires --allow-egress flag opt-in; tier-3 (default) keeps all data on-machine."
```
---
## Fix 7 (HIGH H4) — Process gap: no verifier-agent pass
**Severity:** HIGH (process; produced the v1.7 BLOCKER)
**File:** new doc + agents/ folder update
### What's broken
During v1.7 development, code went straight from worker to commit without a separate verifier-agent pass. This audit IS the missing verifier — but doing it post-commit means findings become patches instead of pre-merge fixes. The BLOCKER (B1) would have been caught by any security-focused review before §3.3 was committed.
### Changes
Create `agents/verifier.md` as a dispatched-on-demand specialist that workstream owners can call before commit. Mirrors the `superpowers:verification-before-completion` skill's checklist.
```markdown
# verifier — Pre-Commit Audit Specialist
You are a verifier agent. Your job is to find issues a worker just missed, BEFORE
they commit. Apply the /best-practices six-cut kernel to the staged diff.
## When invoked
After a worker has staged changes for a workstream but BEFORE the commit.
Workflow:
1. Worker: `git add <files>; git status` shows staged changes
2. Dispatch this agent with the workstream context
3. Agent reads every staged file + the precedent files it touches
4. Agent applies six-cut + agent-kernel checks
5. Agent returns findings in 4 tiers (BLOCKER / HIGH / MEDIUM / LOW)
6. Worker addresses BLOCKER + HIGH before commit; MEDIUM/LOW become follow-ups
## Six-cut checklist (verify EACH cut)
[content matches the audit's §3 six-cut framework — copy from docs/audits/v1.7.0-audit-2026-05-17.md §3]
## Specifically check for in EVERY workstream
- **Data egress**: any new outbound network call, subprocess, or file write outside the vault root. If yes: is there a user opt-in checkpoint? Compare to existing `--allow-remote-ollama` / `--allow-egress` precedents.
- **Atomic operations**: any file write that could be interrupted mid-stream. If yes: is there a temp + rename, or other atomicity guarantee?
- **Failure-mode rollback**: any multi-step operation. If yes: is there a documented recovery path for partial completion?
- **Hermetic test coverage**: any new code path. If yes: is there a test that exercises it without network/LLM?
## Output
Report under 800 words. Findings in tiered list with file:line citations + recommended fix.
```
### Verification
This is a process change, not a code change. Verify by:
- Confirm `agents/verifier.md` exists
- Reference it from `CLAUDE.md` "How to Use" section as the recommended pre-commit step
- Document in next release notes (v1.7.1 changelog entry) so future workstreams know to use it
### Commit message
```
docs(v1.7.1): add verifier-agent specialist for pre-commit audits
Close audit H4. The v1.7 development cycle had no verifier-agent pass at
workstream gates — code went straight from worker to commit. The audit
itself filled that role post-hoc, which is why the BLOCKER (B1) became a
v1.7.1 patch instead of a pre-merge fix.
New agents/verifier.md documents the on-demand specialist + the six-cut
checklist a workstream owner should dispatch BEFORE commit. CLAUDE.md
references it as the recommended pre-commit step.
No code changes; process change only.
```
---
## Sequencing recommendation
Order the fixes by dependency + grouping:
1. **Fix 1 (BLOCKER)** + **Fix 6 (HIGH H6)** — bundle as ONE commit since the Data Privacy callout is part of Fix 1's changes anyway
2. **Fix 2 (HIGH H1)** — independent; small
3. **Fix 3 (HIGH H2)** — independent; tiny
4. **Fix 4 (HIGH H3)** — independent
5. **Fix 5 (HIGH H5)** — independent
6. **Fix 7 (HIGH H4)** — doc-only, last
Total: **6 commits** (one per fix, Fix 1+6 combined). Each is followed by `make test`. If any suite goes red, stop and diagnose.
---
## Post-fix steps
After all 6 commits land:
1. Run `make test` one final time. 7 suites green is the gate.
2. Run `bash bin/setup-retrieve.sh --no-llm` end-to-end to verify the retrieval pipeline still provisions cleanly post-fixes.
3. Update [`docs/audits/v1.7.0-audit-2026-05-17.md`](v1.7.0-audit-2026-05-17.md) §10.2 to mark each H1-H6 as "FIXED in v1.7.1 commit `<sha>`" with the actual SHAs.
4. Update [`wiki/hot.md`](../../wiki/hot.md) "Last Updated" section to reflect v1.7.1.
5. Update [`CHANGELOG.md`](../../CHANGELOG.md) — add a v1.7.1 entry that references the audit + lists the 7 fixes.
6. Update [`.claude-plugin/plugin.json`](../../.claude-plugin/plugin.json) + [`.claude-plugin/marketplace.json`](../../.claude-plugin/marketplace.json) — version bump 1.7.0 → 1.7.1.
7. **Then** ask the user whether to push + tag `v1.7.1`. Do not push without explicit go.
---
## What's NOT in this plan
- MEDIUM findings (M1-M14) — file as v1.7.x issues; address opportunistically
- LOW findings (L1-L10) — bundle into a "polish PR" before v1.8
- v1.8 (methodology modes), v1.9 (multimodal ingest), v2.0 (derive), v2.5+ (GUI shell) — separate work; see audit §9 verdict + §10 punch list
---
## Files this plan modifies (summary)
| File | Reason | Fix # |
|---|---|---|
| `scripts/contextual-prefix.py` | --allow-egress flag, pick_prefix_tier guard | 1 |
| `bin/setup-retrieve.sh` | egress consent prompt + flag pass-through + rollback on Stage 1 failure | 1, 2 |
| `skills/wiki-retrieve/SKILL.md` | Data Privacy callout | 1 (H6 bundled) |
| `Makefile` | clean-test-state extension | 3 |
| `hooks/hooks.json` | PostToolUse explicit lock-check exit handling | 4 |
| `scripts/detect-transport.sh` | json_escape helper + apply to CLI_VERSION | 5 |
| `agents/verifier.md` | NEW — pre-commit specialist | 7 |
| `CLAUDE.md` | reference verifier-agent in recommended workflow | 7 |
| `docs/audits/v1.7.0-audit-2026-05-17.md` | mark fixes as FIXED with SHAs | post-fix step 3 |
| `wiki/hot.md` | v1.7.1 state update | post-fix step 4 |
| `CHANGELOG.md` | v1.7.1 entry | post-fix step 5 |
| `.claude-plugin/{plugin,marketplace}.json` | 1.7.0 → 1.7.1 | post-fix step 6 |
---
## Resumption hint (for post-compact me)
Quick state recovery on next session:
```bash
cd ~/Desktop/claude-obsidian
git log --oneline main..HEAD | head -10 # see all v1.7 commits + auto-commits
git status --short # confirm working tree
make test # 7 suites should be green
cat docs/audits/v1.7.1-fixes-plan.md # this file — your roadmap
cat docs/audits/v1.7.0-audit-2026-05-17.md | head -40 # exec verdict reminder
```
Then execute Fix 1 first. The plan is sequenced; just walk it top to bottom.
@@ -0,0 +1,379 @@
# v1.7.2 + v1.8.0 "Best Ever Per Priority Research" Plan
**Date:** 2026-05-17
**Branch:** continue on `v1.7.0-compound-vault` (still local-only)
**Goal:** close every honest deduction (v1.7.2 polish) AND add methodology modes (v1.8.0 — compass artifact priority gap 5) to land at 5/7 axes #1 per the original research
**Estimated effort:** 10-12 hours focused work (4-5h v1.7.2 + 6-7h v1.8.0)
**Termination conditions:**
- v1.7.2 ship gate after Phase 6 (verifier + chair clean OR 2-round cap fired)
- v1.8.0 ship gate after Phase 8 (verifier + chair clean OR 2-round cap fired)
- 14h hard time cap; if v1.7.2 takes >6h, defer v1.8.0 to a separate session
---
## 0. Why this plan exists
Three rounds of verifier + chair scrutiny converged on `97/100`:
- Round 1 (initial v1.7.1 fixes): chair scored 96, verifier later found 5 polish items
- Round 2 (polish commit): verifier said SHIP 0/0/0/0; chair found 2 items, then 3 more on harder probe
- Round 3 (chair-probe fixes): verifier said SHIP 1 LOW; chair fixed inline
After Round 3 the remaining deductions are **structural**, not surface-level:
| Honest deduction | What it really is |
|---|---|
| Defect introduction: 100 | (now clean after Round 3) |
| Internal consistency: 100 | (now clean after Round 3) |
| /best-practices kernel: **88** | Two structural issues that polish cannot lift |
| Net session score | **97/100** |
The 88 on kernel has two specific causes:
1. **`+5819 / -30 LOC`** across 41 files since `main`. The kernel says "delete more than you add"; this is the opposite.
2. **Three rounds were needed to converge**. A kernel-disciplined slice would land in one pass.
Plus the broader repo still has:
- **14 MEDIUM** findings open from the v1.7.0 audit
- **10 LOW** findings open from the v1.7.0 audit
A genuine `100/100` requires all of these closed or explicitly deferred with rationale. This plan does that.
---
## 1. Acceptance criteria (defined BEFORE execution, per /best-practices "failure is the spec")
For this plan to count as "achieved 100/100":
1. Final verifier dispatch returns **0 BLOCKER / 0 HIGH / 0 MEDIUM / 0 LOW** on the entire `main..HEAD` diff
2. Final chair adversarial probe (≥10 specific tests, listed in §7) returns **0 functional findings**
3. Net LOC delta `main..HEAD` shows non-trivial deletion: **net additions ≤ +5000** OR **deletions ≥ 200 LOC** (whichever fires; both are honest measures of having pruned something)
4. **Every** M1-M14 + L1-L10 is either CLOSED (with commit SHA in audit doc) or DEFERRED (with one-line milestone + rationale in audit doc). No silent omissions.
5. `make test` stays green throughout
6. Branch remains local until explicit user push authorization
7. `agents/verifier.md` updated with the **git-hygiene cut** + any other self-improvement that emerged from the three-round retrospective
If any of these 7 cannot be met, the plan SHIPS at the achieved score with the gap explicitly documented. No silent shortfalls.
---
## 2. Phase 0 — Audit refresh (15 min)
**Goal:** know exactly what's open before touching code.
Steps:
1. Re-read [docs/audits/v1.7.0-audit-2026-05-17.md](v1.7.0-audit-2026-05-17.md) §8.1–§8.4 (BLOCKER/HIGH/MEDIUM/LOW ledgers) in full
2. For each finding M1-M14 + L1-L10, categorize:
- **SHIPPED-in-v1.7.1**: already closed by an existing commit (mark in audit)
- **CLOSEABLE-this-session**: small, focused, no scope creep (target for Phase 3)
- **DEFER-with-rationale**: legitimately bigger or roadmap-tied (target for §6 audit update)
3. Write the categorization to a working scratch file at `docs/audits/v1.7.2-coverage-matrix.md` (deleted at end of plan; intermediate artifact)
**Output:** categorization for all 24 open findings. No code changes.
---
## 3. Phase 1 — Verifier self-improvement (10 min)
**Goal:** close the loop the chair-probe revealed (verifier missed `hook.log` not in gitignore).
Steps:
1. Add to `agents/verifier.md` "Specifically check for in EVERY workstream" section, after item 4:
```
5. **Git hygiene** — any new file path written by code in this diff (open files,
log writes, cache writes, temp files) that is NOT already in `.gitignore` →
HIGH. The PostToolUse auto-commit hook stages everything under wiki/, .raw/,
.vault-meta/; an unignored runtime artifact creates a self-pollution loop on
the next hook fire.
6. **Additive-without-pruning** — if `git diff --shortstat main..HEAD` shows
net additions > +500 LOC and deletions < 50 LOC, flag as MEDIUM. Real
feature work adds lines; pure additive cycles with no pruning suggest v_prev
cruft is being retained reflexively.
```
2. Verify YAML frontmatter still parses (`python3 -c "import yaml; yaml.safe_load(open('agents/verifier.md').read().split('---')[1])"`)
3. Commit: `docs(v1.7.2): verifier-agent self-improvement from 3-round retrospective`
**Output:** verifier.md has two new "always check" items; next dispatch catches what this session's verifier missed.
---
## 4. Phase 2 — Close the +5819 / -30 LOC ratio (60-90 min)
**Goal:** prune v1.6 code that v1.7 superseded but didn't remove.
Steps:
1. Inventory candidates:
```bash
# Comments referencing pre-v1.7 behavior in skills/
grep -rn "v1\.6\|legacy\|deprecated\|TODO\|FIXME" skills/ scripts/ bin/
# Skill sections with "## v1.6 behavior" / "## Before v1.7" headers
grep -rn "^## .*1\.6\|^### .*1\.6" skills/
# Tool references in skills that v1.7 transport supersedes
grep -rn "allowed-tools: .*Edit\|allowed-tools: .*Write" skills/
```
2. For each candidate, decide:
- **PRUNE**: code or doc that is dead post-v1.7 (e.g., a "v1.6 fallback" path that the v1.7 transport layer makes unreachable, a legacy comment block superseded by `compound-vault-guide.md`)
- **KEEP**: legitimately current code or doc; add a one-line justification in the working scratch file
3. Apply prunes in clusters (one commit per logical theme, e.g. "prune v1.6 transport assumptions", "prune superseded inline docs")
4. After each prune commit, `make test` must stay green
5. **Acceptance gate**: end-of-Phase-2 `git diff --shortstat main..HEAD` shows **either** net additions ≤ +5000 LOC **or** deletions ≥ 200 LOC
**Failure mode**: if v1.7 genuinely added only new features with zero v1.6 supersession, the +5819 stays as additive and the kernel deduction is irreducible. In that case, **DOCUMENT it explicitly** in audit §10.4 ("`+5819 / -30` is the honest cost of building a substrate; v1.6 had no deprecation surface") and accept the score adjustment. Do not invent prunes to game the metric.
**Output:** N prune commits + scratch file justifying every retained piece of v1.6 code.
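The Phase 2 acceptance gate can be checked mechanically. The sketch below is one possible reading, assuming "net additions" means insertions minus deletions and that the input is the one-line summary printed by `git diff --shortstat`; the function names `parse_shortstat` and `meets_loc_gate` are hypothetical and do not exist in the repo.

```python
import re
from typing import Tuple


def parse_shortstat(text: str) -> Tuple[int, int]:
    """Return (insertions, deletions) from `git diff --shortstat` output."""
    ins = re.search(r"(\d+) insertions?\(\+\)", text)
    dels = re.search(r"(\d+) deletions?\(-\)", text)
    return (
        int(ins.group(1)) if ins else 0,
        int(dels.group(1)) if dels else 0,
    )


def meets_loc_gate(insertions: int, deletions: int,
                   max_net: int = 5000, min_deleted: int = 200) -> bool:
    """Gate passes if net additions <= max_net OR deletions >= min_deleted."""
    net = insertions - deletions  # assumption: "net" = insertions - deletions
    return net <= max_net or deletions >= min_deleted


# The ratio quoted in this plan: +5819 / -30 across 41 files.
sample = " 41 files changed, 5819 insertions(+), 30 deletions(-)"
ins, dels = parse_shortstat(sample)
print(ins, dels, meets_loc_gate(ins, dels))  # 5819 30 False
```

Under these assumptions the starting state fails the gate (net +5789, 30 deletions), so Phase 2 needs either about 170 more deleted lines or the documented "irreducible" fallback.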
---
## 5. Phase 3 — Close the 14 MEDIUM findings (90-120 min)
Walk each finding from the v1.7.0 audit §8.3. Group related fixes; one commit per cluster.
| # | Finding | Plan | Effort | Commit grouping |
|---|---|---|---|---|
| M1 | §3.2 +485/-0 LOC | Addressed in Phase 2 | — | (in Phase 2) |
| M2 | `bm25-index.py` non-ASCII tokenization drops content | Extend regex `[A-Za-z][A-Za-z0-9'\-]*` to `[\w'\-]+` with `re.UNICODE`; add hermetic test with emoji + CJK + Cyrillic + Spanish accented input; verify BM25 ranking changes are sensible | 20 min | C1 |
| M3 | `rerank.py --allow-remote-ollama` error blames user | Improve error: "OLLAMA_URL points off-localhost; either run ollama locally or pass --allow-remote-ollama through retrieve.py (which forwards it here)" | 5 min | C2 |
| M4 | `wiki-lock.sh validate_path` accepts newlines | Add `case "$p" in *$'\n'*) die "newlines not allowed in lock path" 4 ;; esac`; add test | 10 min | C2 |
| M5 | `retrieve.py import_sibling` no ImportError handling | Wrap in try/except (ImportError, SyntaxError); print friendly error pointing to `bin/setup-retrieve.sh --check` | 10 min | C2 |
| M6 | `contextual-prefix.py` empty body silent | Emit `log(f"WARN: {page_path} has no body content; skipping")` and return cleanly | 5 min | C2 |
| M7 | `rerank.py save_cache()` blocking fcntl on non-flock FS | Add `LOCK_NB` + retry loop (3 attempts, 100ms sleep); fall back to no-cache write with a WARN | 15 min | C2 |
| M8 | `test_retrieve.py` missing `--explain` and `--no-rerank` coverage | Add 2 test cases asserting the JSON shape changes | 15 min | C3 |
| M9 | Bounded-slices: 4 skills touched by both §3.2 and §3.4 | Process note, not a code fix; document in audit §10.3 as PROCESS-ACK | — | (audit-only) |
| M10 | No verifier agents during v1.7 dev | Closed by H4 (3ea443f); mark in audit | — | (audit-only) |
| M11 | Synonym category benchmark tied (60% both pipelines) | Investigate via `benchmark-runner.py --limit 0 --json results.json` then per-query analysis; either tune rerank threshold or document why parity is acceptable | 30 min | C4 |
| M12 | Negative-query precision tied at 40% | Investigate similarly; tune rerank to suppress sub-threshold top results | 20 min | C4 |
| M13 | NotebookLM derivative outputs gap | Defer to v2.0; document in audit §10.5 with explicit roadmap rationale | — | (audit-only) |
| M14 | (verify what this is — read audit §8.3 line for M14) | TBD per content | TBD | TBD |
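For M2, the effect of the proposed regex change can be previewed in isolation. This is a small sketch using only the two patterns quoted in the table; the sample text is made up, and how `bm25-index.py` actually applies its tokenizer is not shown here. In Python 3, `\w` is already Unicode-aware for `str` patterns, so `re.UNICODE` is redundant but harmless.

```python
import re

# Patterns copied from the M2 row; everything else is illustrative.
OLD_TOKEN = re.compile(r"[A-Za-z][A-Za-z0-9'\-]*")
NEW_TOKEN = re.compile(r"[\w'\-]+")

# Accented Latin, CJK, Cyrillic, an apostrophe word, and an emoji.
text = "caf\u00e9 日本語 привет don't \U0001F642"

print(OLD_TOKEN.findall(text))  # ['caf', "don't"]
print(NEW_TOKEN.findall(text))  # ['café', '日本語', 'привет', "don't"]
```

Two side effects are worth covering in the hermetic test: `\w` does not match emoji, so those are still dropped, and the new pattern lets tokens start with a digit, underscore, apostrophe, or hyphen, which the old one did not.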
**Commit clusters:**
- **C1** — non-ASCII tokenization (M2)
- **C2** — defensive-input fixes bundle (M3, M4, M5, M6, M7)
- **C3** — test coverage extension (M8)
- **C4** — benchmark tunings (M11, M12)
After each cluster: `make test` + verifier dispatch on staged diff (eat own dogfood per the new agent).
**Acceptance gate:** all 14 MEDIUM closed (with commit SHA in audit §8.3) or deferred (with rationale).
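The M7 plan (non-blocking lock, three attempts, 100 ms sleep, warn and skip the cache write on failure) might look roughly like the sketch below. The function name `save_cache_nonblocking` and its signature are hypothetical; the real `save_cache()` in `rerank.py` may be structured differently. It relies on `fcntl`, so it is Unix-only.

```python
import fcntl
import json
import sys
import time


def save_cache_nonblocking(path, data, attempts=3, delay=0.1):
    """Write `data` as JSON under an exclusive flock; return False if skipped."""
    # "a+" creates the file if missing without truncating it before the
    # lock is held (mode "w" would truncate first).
    with open(path, "a+", encoding="utf-8") as fh:
        for _ in range(attempts):
            try:
                fcntl.flock(fh, fcntl.LOCK_EX | fcntl.LOCK_NB)
            except OSError:
                # Lock is held elsewhere, or the filesystem lacks flock support.
                time.sleep(delay)
                continue
            try:
                fh.seek(0)
                fh.truncate()
                json.dump(data, fh)
                fh.flush()
            finally:
                fcntl.flock(fh, fcntl.LOCK_UN)
            return True
    print(f"WARN: could not lock {path}; skipping cache write", file=sys.stderr)
    return False
```

Returning a boolean rather than raising keeps the cache strictly best-effort: a caller can ignore the result and continue with uncached rerank output.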
---
## 6. Phase 4 — Close the 10 LOW findings (30-45 min)
L1-L10 from audit §8.4. Bundle in single commit `polish(v1.7.2): close 10 LOW findings from v1.7.0 audit`.
Steps:
1. Read audit §8.4 for the actual L1-L10 list (don't list them speculatively here)
2. For each: tiny edit + one-line CHANGELOG bullet
3. Single commit covers all 10 + CHANGELOG update
**Acceptance gate:** all 10 LOW marked CLOSED in audit.
---
## 7. Phase 5 — Documentation refresh + final benchmark (30 min)
Steps:
1. Run `python3 scripts/benchmark-runner.py --json /tmp/v172-bench.json` on **full 50-query corpus** (no `--limit`)
2. Compare to v1.7.0 audit's numbers (54.0% v17 top-1, +39.5% error reduction). Re-tunings in Phase 3 C4 may have shifted these
3. Update audit §6.2 with current numbers + delta-from-baseline
4. Cross-check **every commit SHA** referenced in audit + CHANGELOG against `git log`. Correct any drift
5. Refresh `wiki/hot.md` with v1.7.2 state (will auto-commit by hook design)
6. Bump `.claude-plugin/plugin.json` + `.claude-plugin/marketplace.json` from `1.7.1` to `1.7.2` if any of Phases 2-4 landed code changes; **don't** bump if only docs + audit changes
7. Add CHANGELOG `[1.7.2]` entry referencing this plan as the source
**Acceptance gate:** every published number is the result of a fresh measurement, not a copy from earlier.
---
## 8. Phase 6 — Final verification (30 min) + ship gate
**The ship gate is binary: pass or accept the achieved score, no third try.**
Steps:
1. Dispatch verifier agent against entire `main..HEAD` diff (will be ~50 files at this point)
2. Run the **chair adversarial probe** — exactly 10 specific tests:
1. `git check-ignore` on every file the codebase might write to
2. `bash -u` on every shell script that uses `${VAR}` references
3. `python3 -c "import json; json.load(open(f))"` on every JSON file
4. `yaml.safe_load` on every markdown frontmatter
5. `make test` 7-suite re-run
6. `python3 scripts/benchmark-runner.py --limit 5` to verify benchmark harness still runs
7. `bash bin/setup-retrieve.sh --check` to verify diagnostic path
8. `git diff --shortstat main..HEAD` — confirm acceptance criterion #3
9. `grep -c "TODO\|FIXME\|XXX"` on every file changed in `main..HEAD` — must be 0 net additions
10. Open every doc file changed, verify each commit-SHA reference resolves via `git rev-parse`
3. Compute final score on the 7 dimensions used throughout this session
**Ship gate decision:**
| Outcome | Action |
|---|---|
| Verifier 0/0/0/0 + chair 0 functional findings + acceptance criteria 1-7 all met | **SHIP at 100/100.** Surface to user for push authorization. |
| Either pass finds <5 items, all closeable in <30 min | One MORE iteration allowed. Close, re-verify, ship. |
| Either pass finds ≥5 items OR any item requires >30 min | **Document remaining.** Ship at honest achieved score. Add a v1.7.x backlog entry. |
**Hard rule:** maximum 2 verify-fix rounds after Phase 6. The 3-round recursion of the v1.7.1 cycle taught us that adversarial scrutiny is asymptotic. After 2 more rounds, accept the score.
---
## 8b. Phase 7 — v1.8.0 methodology modes (6-7h)
After Phase 6 lands v1.7.2 at honest 100/100, build methodology modes — the compass artifact's priority gap 5. Closes axis "methodology support" in audit §9 from TIE to YES (5/7 axes #1).
**Deliverables:**
1. **New skill** `skills/wiki-mode/SKILL.md` (~45 min)
- Triggers: "set vault mode", "switch to PARA", "use LYT", "what's my vault mode", "zettelkasten setup"
- Reads `.vault-meta/mode.json`; falls back to `mode=generic` (v1.6/v1.7 default) when absent
- allowed-tools: Read, Write, Bash
2. **Mode config schema** `.vault-meta/mode.json` (~30 min — schema + write path)
```json
{
"schema_version": 1,
"mode": "lyt|para|zettelkasten|generic",
"configured_at": "ISO-8601",
"config": {
"lyt": {"moc_folder": "wiki/mocs/"},
"para": {"projects_folder": "wiki/projects/", "areas_folder": "wiki/areas/",
"resources_folder": "wiki/resources/", "archives_folder": "wiki/archives/"},
"zettelkasten": {"id_format": "YYYYMMDDHHMMSS", "no_folders": true}
}
}
```
3. **Per-mode templates** `skills/wiki-mode/templates/` (~60 min)
- `lyt/moc-template.md` (Map of Content scaffolding with [[wikilink-cluster]] sections)
- `lyt/atomic-template.md` (atomic note that links into MOCs)
- `para/project-template.md` (active project with status, deadline, next-action)
- `para/area-template.md` (ongoing responsibility, no deadline)
- `para/resource-template.md` (reference material, topic-organized)
- `zettel/atomic-template.md` (atomic claim + supporting sources + parent/child IDs)
- `zettel/_id-format.md` (timestamp-based ID generation recipe)
4. **Skill mode-awareness modifications** (~90 min)
- `skills/wiki-ingest/SKILL.md` — consult `.vault-meta/mode.json`; route source/entity/concept pages to mode-specific folders when mode != generic
- `skills/save/SKILL.md` — same; session notes route to PARA/projects or LYT/MOCs based on mode
- `skills/autoresearch/SKILL.md` — same; research artifacts route appropriately
- All changes preserve v1.7 fallback behavior when mode = generic
5. **Hermetic tests** `tests/test_wiki_mode.sh` + `tests/test_wiki_mode.py` (~60 min)
- Mode config writes correctly under each of 4 modes
- Mode loader returns correct config for each mode
- Routing logic produces correct path for each (mode, content-type) pair
- mode=generic preserves v1.7 routing
- Invalid mode in mode.json triggers explicit error, not silent fallback
- All hermetic; no network, no LLM, no ollama
6. **Opt-in setup script** `bin/setup-mode.sh` (~30 min)
- Interactive: prompts user to pick mode
- Writes `.vault-meta/mode.json`
- Optionally seeds template folders (LYT mocs/, PARA projects+areas+resources+archives/)
- Idempotent; safe to re-run
7. **Documentation** (~45 min)
- `docs/methodology-modes-guide.md` — explains each mode, when to use, migration paths
- `CLAUDE.md` "How to Use" section + new "Methodology Modes (v1.8+)" subsection
- `wiki/references/methodology-modes.md` — short decision tree (which mode for which user)
8. **Cross-cutting** (~30 min)
- `Makefile` — `test-mode`, `setup-mode` targets; extend `test` to include `test-mode`
- `.claude-plugin/{plugin,marketplace}.json` — version 1.7.2 → 1.8.0, description updated
- `.gitignore` — `.vault-meta/mode.json` is host-specific runtime config, MUST be ignored
- `CHANGELOG.md` — new [1.8.0] entry
- `agents/wiki-ingest.md` — note mode-awareness in sub-agent protocol
- `wiki/hot.md` — refresh state
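The mode-loading behavior described in deliverables 1, 2, and 5 (fall back to `generic` when the file is absent, fail loudly on an invalid mode) can be sketched as follows. This is an assumption-laden illustration, not the planned implementation: `load_mode` and `VALID_MODES` are hypothetical names, and only the `mode` key of the schema is inspected.

```python
import json
from pathlib import Path

# The four mode values listed in the schema above.
VALID_MODES = {"lyt", "para", "zettelkasten", "generic"}


def load_mode(vault_root) -> str:
    """Return the configured mode, or 'generic' when mode.json is absent."""
    path = Path(vault_root) / ".vault-meta" / "mode.json"
    if not path.exists():
        return "generic"  # v1.6/v1.7 default behavior
    data = json.loads(path.read_text(encoding="utf-8"))
    mode = data.get("mode") if isinstance(data, dict) else None
    if mode not in VALID_MODES:
        # Explicit error rather than a silent fallback, per the test list.
        raise ValueError(f"invalid mode {mode!r} in {path}")
    return mode
```

Keeping the absent-file case and the invalid-value case on separate paths is what lets the hermetic tests distinguish "never configured" from "misconfigured".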
**Commit ladder (estimated):**
- `feat(v1.8.0): wiki-mode skill + 4 mode templates`
- `feat(v1.8.0): mode-aware routing in wiki-ingest`
- `feat(v1.8.0): mode-aware routing in save + autoresearch`
- `test(v1.8.0): hermetic wiki-mode test suite`
- `feat(v1.8.0): bin/setup-mode.sh opt-in bootstrap`
- `docs(v1.8.0): methodology modes guide + CLAUDE.md update`
- `chore(v1.8.0): version bump 1.7.2 → 1.8.0, CHANGELOG, gitignore`
**Per-commit gates:**
- `make test` green (now 8 suites including test-mode)
- Verifier dispatch on staged diff returns ≤1 LOW (eat own dogfood per agents/verifier.md)
- mode=generic path preserves v1.7 behavior exactly (regression test)
## 8c. Phase 8 — v1.8.0 ship gate (30 min)
Mirror Phase 6 structure for the v1.8.0 slice. Verifier on entire diff main..HEAD. Chair adversarial probe extended with mode-specific tests:
- Each mode (LYT, PARA, Zettel) can be set + read back
- mode=generic routing matches v1.7 routing byte-for-byte on a sample ingest
- `.vault-meta/mode.json` is gitignored (test by creating + check-ignore)
- `bin/setup-mode.sh` is idempotent (run twice; the second run is a no-op)
Same 2-round cap. If 0/0/0/0 + chair clean: 100/100 SHIP. Else: honest achieved score + v1.8.x backlog.
## 9. What this plan deliberately does NOT do (scope guard)
These are NOT in scope because they expand into a different release line:
- **v1.9 multimodal ingest** (YouTube / PDF / EPUB / image OCR)
- **v2.0 derive** (audio / quiz / flashcards / study guide — NotebookLM-class outputs)
- **v2.5+ GUI onramp** (Community Plugin fork)
- **Cross-platform** (macOS / Windows) testing — explicit out-of-scope per v1.7.0 audit §3
- **Performance benchmarking** beyond retrieval accuracy
- **Security audit of dependencies** (Python stdlib only; no third-party packages introduced)
- **Marketing / positioning work**
A `100/100` on the v1.7 line does NOT mean `#1 in the market`. Per v1.7.0 audit §9: market-#1 across all 7 axes requires v1.8 + v2.0 + v2.5 work, not patch work. This plan brings the v1.7 line to honest code-quality `100/100`. That's the prerequisite for the next release lines, not a substitute for them.
---
## 10. Undo plan (per /best-practices "failure is the spec")
If anything in Phases 2-4 causes a regression that isn't caught by the per-commit `make test` gate:
- Revert the specific commit with `git revert <sha>`; do NOT rebase
- Re-run verifier on the revert
- Document the regression in audit §8 as a "FOUND-AND-REVERTED" finding so the lesson sticks
If the entire plan cannot reach the acceptance criteria within 6 hours (1h over budget):
- Stop
- Document the gap explicitly
- Ship at the achieved honest score
- Add a v1.7.3 backlog entry for the remaining items
The plan is non-mutating to the v1.7 features themselves; only adds prunings (Phase 2) and bug-class fixes (Phase 3). v1.7.1 functional surface is preserved.
---
## 11. Per-phase ship gates (mini-acceptance criteria)
| Phase | Acceptance gate |
|---|---|
| 0 | All 24 findings categorized in scratch file |
| 1 | `agents/verifier.md` parses; 2 new "always check" items added |
| 2 | Net LOC delta meets §1 criterion #3 OR documented as irreducible |
| 3 | All 14 MEDIUM closed-or-deferred per §1 criterion #4 |
| 4 | All 10 LOW closed |
| 5 | Fresh benchmark numbers in audit; all SHAs verified |
| 6 | Verifier + chair both clean (or rounds budget exhausted) |
If a phase fails its gate, the plan does NOT proceed to the next phase. The chair stops, documents what's incomplete, and surfaces to the user for a go/no-go decision on continuing.
---
## 12. Cost-of-failure honest framing
Worst case: 6 hours spent, achieve only `98/100` (some MEDIUMs prove harder than estimated, +5819 stays additive, etc.).
Best case: 4 hours spent, genuinely achieve `100/100` on the v1.7 line, branch ready to push as `v1.7.2`.
Median case: 5 hours spent, `99/100`, all M closed, 1-2 L deferred with rationale, push ready.
**The recursion is the risk.** Three rounds were needed to land at 97. Phase 6's hard 2-round cap protects against that recursion eating the entire weekend. If the cap fires, the gap is documented and we ship at honest <100 with a v1.7.3 backlog.
---
## 13. Confirmation before execution
Per /best-practices "acceptance criteria written before execution" + the user's repeated "no lies" + "honest score" framing, this plan needs explicit user buy-in on:
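1. **Scope:** §9 explicitly excludes v1.9 / v2.0 / v2.5+ work; v1.8.0 methodology modes are in scope via Phase 7. Confirm.
2. **Budget:** 4-5h estimated with a 6h cap for v1.7.2; 10-12h estimated with a 14h hard cap once v1.8.0 is included. Confirm or adjust.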
3. **Ship gate posture:** 2-round cap on adversarial scrutiny after Phase 6. Confirm or adjust.
4. **No push:** branch stays local until user authorizes push, even if 100/100 is achieved. Confirm.
If any of these need adjustment, surface that. Otherwise: execute top to bottom.
@@ -0,0 +1,413 @@
# v1.8.0 Pre-Push Audit — claude-obsidian
**Date:** 2026-05-18
**Branch:** `v1.7.0-compound-vault` (24 commits ahead of `main`, 5 uncommitted v1.8.2 files)
**Auditor:** Claude (Opus 4.7 [1M context]) via parallel subagent dispatch + main-thread synthesis
**Methodology:** 10-principle thinking spine (OBSERVE-OBSERVE-LISTEN-THINK-CONNECT-CONNECT-FEEL-ACCEPT-CREATE-GROW), applied to differential-rigor audit per [plan](../../.claude/plans/read-in-full-the-hidden-sun.md). Strict push gate: any BLOCKER halts push.
**Result file size:** ~900 lines.
---
## 1. Executive verdict (200 words)
**Push verdict: YELLOW.** Cleared of BLOCKERs and ready to push WITH explicit disclosure of 4 HIGH-tier findings, OR fixable to GREEN in ~90 minutes of doc/sub-agent edits.
The v1.8.2 wiki-mode fix cycle holds end-to-end: 5 path-traversal vectors confirmed sanitized via `safe_name()`, `mkstemp()` write yields 0600 perms, `--mode` preview is non-mutating. Pre-commit verifier on the staged diff returned `CLEAR TO COMMIT` (0 BLOCKER / 0 HIGH / 1 MEDIUM / 4 LOW). All 8 test suites pass (~191+ assertions including the new 19 traversal/perm/preview assertions). Average per-skill score is 84.6/100 across 14 skills.
The 4 HIGH findings are NOT security flaws or runtime breaks; they are documentation/integration drift:
1. `wiki-cli` documents a `manual_override` feature that the script never reads.
2. `agents/wiki-ingest.md` (parallel batch sub-agent) lacks v1.8 mode awareness and `Bash` in `tools`.
3. `autoresearch` SKILL.md lacks web-egress hygiene guidance (URL validation + content sanitization).
4. `save` SKILL.md table conflicts with global `~/.claude/CLAUDE.md` `/save` destination rule (project-local vs personal vault).
Recommended path: apply the 4 fixes (60-90 min), bump to v1.8.2, then push as a clean GREEN. The 14 MEDIUM findings can ship as v1.8.3 backlog with disclosure.
---
## 2. Methodology — 10-principle spine in action
This audit IS the framework's first execution. Each principle produced a concrete output:
| # | Principle | Where applied | Output |
|---|-----------|---------------|--------|
| 1 | OBSERVE (external) | Inventory subagent (§3.1) + git status + manifest reads | Full artifact map |
| 2 | OBSERVE (internal) | §11 anti-bias notes; ownership/ship-it/familiarity checks | Bias log honored throughout scoring |
| 3 | LISTEN | Read every SKILL.md + README + CLAUDE.md + CHANGELOG + global rule | "What the project SAYS" reconciled with reality |
| 4 | THINK | 14 parallel skill-audit subagents + verifier subagent | Per-skill scores + finding ledgers |
| 5 | CONNECT (lateral) | Cross-skill pattern subagent | Path-traversal posture audit + `allowed-tools` gap inventory |
| 6 | CONNECT (system) | Hook safety + manifest consistency + test suite execution | Integration map |
| 7 | FEEL | UX walkthrough §8 | Install rehearsal, error-message survey, slash-command discoverability |
| 8 | ACCEPT | Severity tiering §5 with anti-sycophancy caps applied | Calibrated, non-inflated ledger |
| 9 | CREATE | This document | The audit |
| 10 | GROW | §10 Feedback loop notes | Inputs to v1.8.3 backlog + framework integration plan |
---
## 3. Per-skill score table
| # | Skill | Tier | Score | BLOCKER | HIGH | MEDIUM | LOW | Recommendation |
|---|-------|------|-------|---------|------|--------|-----|----------------|
| 1 | wiki-mode | 1 | 94/100 | 0 | 0 | 0 | 4 | ship-clean |
| 2 | wiki-cli | 1 | 75/100 | 0 | 1 | 2 | 2 | fix-before-push |
| 3 | wiki-retrieve | 1 | 88/100 | 0 | 0 | 2 | 3 | ship |
| 4 | save | 1 | 78/100 | 0 | 1 | 3 | 2 | fix-or-disclose |
| 5 | wiki-ingest | 1 | 76/100 | 0 | 1 | 2 | 2 | fix-before-push |
| 6 | autoresearch | 1 | 72/100 | 0 | 1 | 4 | 3 | fix-or-disclose |
| 7 | wiki | 2 | 84/100 | 0 | 0 | 1 | 5 | ship-clean |
| 8 | wiki-query | 2 | 82/100 | 0 | 0 | 0 | 5 | keep |
| 9 | wiki-lint | 2 | 84/100 | 0 | 0 | 0 | 4 | keep |
| 10 | wiki-fold | 2 | 92/100 | 0 | 0 | 0 | 2 | pass |
| 11 | canvas | 2 | 88/100 | 0 | 0 | 0 | 3 | keep — light fix |
| 12 | defuddle | 2 | 88/100 | 0 | 0 | 0 | 2 | ship |
| 13 | obsidian-bases | 2 | 88/100 | 0 | 0 | 0 | 3 | keep |
| 14 | obsidian-markdown | 2 | 86/100 | 0 | 0 | 0 | 5 | keep — light fix |
| **AVG** | — | — | **84.6** | **0** | **4** | **14** | **45** | — |
**Score caps applied (anti-sycophancy):**
- save: re-scored from agent's 72 → 78 after downgrading "cross-boundary HIGH" — see §4 finding-rationale below
- No path-traversal escapes the vault root (verified end-to-end by `os.path.abspath()` in test_wiki_mode.py)
- No leaked secrets in any file
- No `eval` / `exec` / `shell=True` patterns in any script
- Test cap (Tier 1 missing tests): applied to wiki-cli (-3 for no detect-transport test), autoresearch (-2 for missing tests/__init__.py)
---
## 4. Master finding ledger
### 4.1 BLOCKER findings: 0
**No BLOCKER findings.** No path traversal escapes the vault. No secrets exposed. No broken-in-normal-use code paths. No security flaws in active code. The closure of the v1.7.0 audit's BLOCKER B1 (data-egress consent gap) was verified to still hold via a consent-gate replay on `contextual-prefix.py`.
### 4.2 HIGH findings: 4
| ID | Skill | Finding | File:Line | Fix |
|----|-------|---------|-----------|-----|
| H1 | wiki-cli | `manual_override: true` documented in `wiki/references/transport-fallback.md:91-97` and `docs/compound-vault-guide.md:87` is NOT implemented in `scripts/detect-transport.sh`. Users following the documented procedure will have their manual transport choice clobbered on the next `--force` run or 7-day staleness rollover. | `scripts/detect-transport.sh` (no read of existing `transport.json`); doc-vs-code drift | Either implement (~10 LOC: read existing JSON, honor `manual_override: true`, re-stamp only `detected_at`/`host`) OR strike the documentation. Implementation is the right call — it's the documented MCP-user escape hatch. |
| H2 | wiki-ingest | `agents/wiki-ingest.md` (parallel batch ingest sub-agent): (a) `tools: Read, Write, Edit, Glob, Grep` does NOT include `Bash`, but body §40-50 instructs `bash scripts/wiki-lock.sh acquire/release`; (b) no `## Mode awareness (v1.8+)` section, so batch-ingest in LYT/PARA/Zettelkasten vaults files to v1.7 generic paths. v1.7 multi-writer safety guarantee + v1.8 mode routing both rely on this agent. | `agents/wiki-ingest.md:16` (tools line) + missing mode-awareness section | Add `Bash` to `tools:` frontmatter (1 line). Append a `## Mode awareness (v1.8+)` section mirroring `skills/wiki-ingest/SKILL.md:26-46` (3-5 lines). |
| H3 | autoresearch | SKILL.md lacks web-egress hygiene guidance: no URL validation (reject `file://`, `javascript:`, RFC1918 hosts in redirect chains), no content sanitization (strip `<script>`, `<iframe>`, escape `[[`/`]]` injection from fetched HTML), no per-fetch cost warning. Safety today depends entirely on Claude Code's WebFetch built-in policy. | `skills/autoresearch/SKILL.md:117-152` (entire egress section) | Add one ~150-word "Web egress hygiene" section covering URL validation, body sanitization, wikilink-injection escape, and per-loop cost expectation. |
| H4 | save | SKILL.md primary workflow (table at lines 67-73 + Workflow step 5) directs all writes to project-local `wiki/...` folders. Global rule at `~/.claude/CLAUDE.md:45-50` mandates `~/Documents/Obsidian Vault/` as canonical for `/save` from any project. Line 42 acknowledges the conflict in prose, but it's non-prescriptive and easy to miss. The conflict is BENIGN for default users (no global override), but breaks the audit author's specific setup. | `skills/save/SKILL.md:42, 67-73, 86` vs `~/.claude/CLAUDE.md:45-50` | Add a "Step 0: Decide the destination" at top of Save Workflow with branching logic: if invoked from a project folder with a personal-vault override, prefer personal vault; otherwise project-local. Demote line 42 prose to a structured rule. |
**Note on the "missing Bash in allowed-tools" cross-skill issue:** 7 skills (autoresearch, canvas, wiki-query, save, wiki-ingest, wiki-lint, wiki-fold) declare incomplete or missing `allowed-tools` lists. Verified that these skills HAVE been used successfully in practice (e.g., wiki-fold has produced fold files; wiki-ingest has filed sources). Conclusion: the harness defaults to allowing Bash for skills that need it, OR uses `allowed-tools` as documentation rather than gating. Reclassifying these as MEDIUM (convention/correctness, not runtime break). The single exception is `agents/wiki-ingest.md` (H2 above) because **agents** appear to have stricter tool gating than skills.
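Finding H3 asks for URL validation and body sanitization before fetched content reaches the vault. The following is a minimal sketch of what such checks could look like, not the skill's actual behavior. The function names are hypothetical, the backslash escape for wikilink brackets is an assumed convention, and DNS resolution plus per-hop redirect checks are deliberately left out.

```python
import ipaddress
import re
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}


def is_fetchable_url(url: str) -> bool:
    """Return True only for http(s) URLs whose host is not a private IP literal."""
    parts = urlsplit(url)
    if parts.scheme.lower() not in ALLOWED_SCHEMES or not parts.hostname:
        return False
    try:
        ip = ipaddress.ip_address(parts.hostname)
    except ValueError:
        # Host is a DNS name, not an IP literal. Resolving it and re-checking
        # every redirect hop is out of scope for this sketch.
        return parts.hostname.lower() != "localhost"
    return not (ip.is_private or ip.is_loopback or ip.is_link_local)


def sanitize_fetched_text(text: str) -> str:
    """Drop <script>/<iframe> elements and neutralize wikilink brackets."""
    text = re.sub(r"(?is)<(script|iframe)\b.*?</\1\s*>", "", text)
    # Assumed convention: backslash-escape brackets so no wikilink is formed.
    return text.replace("[[", r"\[\[").replace("]]", r"\]\]")
```

A regex is a coarse filter for HTML; a real implementation would more likely lean on a proper parser or on the output of `defuddle`.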
### 4.3 MEDIUM findings: 16 (14 per-skill + 2 cross-cutting)
| ID | Skill / Area | Finding | File:Line |
|----|--------------|---------|-----------|
| M1 | wiki-cli | mcp-obsidian + mcpvault tiers documented as fallback positions 2/3 but unreachable from auto-detection (always `"detection": "deferred"`). 4-tier marketing is effectively 2-tier. | `scripts/detect-transport.sh:152-161` vs `wiki/references/transport-fallback.md:43-50` |
| M2 | wiki-cli | No test for `detect-transport.sh`. Tier 1 script with 6 downstream consumers has zero automated tests. | `tests/` (no `test_detect_transport*`) |
| M3 | wiki-retrieve | `autoresearch/SKILL.md` claims wiki-retrieve is "consumed by autoresearch" but `rg retrieve.py skills/autoresearch/` returns zero hits. Either wire it or update the docs. | `skills/autoresearch/SKILL.md` |
| M4 | wiki-retrieve | Single-layer consent gate (`--allow-egress` only). For CI safety, consider adding `CONTEXTUAL_PREFIX_CONSENT=1` env var as second layer. Not a security regression; the existing gate is correct. | `scripts/contextual-prefix.py:271` |
| M5 | save | No `tests/test_save*`. Tier 1 skill with cross-boundary semantics ships without test coverage. | `tests/` |
| M6 | save | Internal inconsistency: SKILL.md Mode-awareness section (L38) maps `session` → `wiki/sessions/`, but Note Type table (L73) maps `session` → `wiki/meta/`. | `skills/save/SKILL.md:38, 73` |
| M7 | save | No collision check in Workflow step 5 ("Create the note") — silent overwrite risk if `<title>.md` already exists. | `skills/save/SKILL.md:86` |
| M8 | wiki-ingest | SKILL.md lock acquire/release example lacks a `trap '... release ...' EXIT ERR INT TERM` pattern. Bounded blast (60s age-based reap) but unprincipled. | `skills/wiki-ingest/SKILL.md:48-66` |
| M9 | wiki-ingest | PARA branch comments "leave in `incoming/` for user review" but provides no follow-up cleanup workflow; pages accumulate silently. | `skills/wiki-ingest/SKILL.md:46` |
| M10 | autoresearch | `tests/__init__.py` missing → `python3 -m unittest tests.test_boundary_score` fails with ModuleNotFoundError (direct invocation works). Will break any CI using standard module form. | `tests/` |
| M11 | autoresearch | §Filing Results (L134-152) uses hardcoded paths `wiki/sources/`, `wiki/concepts/`, `wiki/entities/` despite §Mode awareness (L36-45) requiring per-page mode routing. Drift between the two sections. | `skills/autoresearch/SKILL.md:36-46` vs `:134-152` |
| M12 | autoresearch | No cost / budget warning. Up to ~45 WebFetch calls per run (3 rounds × 5 sources × 3 angles). Each is metered. | `skills/autoresearch/references/program.md:34-37` |
| M13 | autoresearch | No mid-loop failure recovery doc. If WebFetch fails on source 3 of 5, the skill silently continues; no log of attempted-and-skipped. | `skills/autoresearch/SKILL.md:109-130` |
| M14 | wiki | "Operations table" (SKILL.md:102-110) lists 7 operations; missing `wiki-mode` (v1.8 user-facing slash command), `wiki-cli`, `wiki-retrieve`, `wiki-fold`, `defuddle`, `obsidian-bases`, `obsidian-markdown`. Hasn't been refreshed since v1.6. | `skills/wiki/SKILL.md:102-110` |
| M15 | global / cross-skill | 7 skills have `allowed-tools` gaps. 4 declare incomplete lists (missing `Bash` despite shelling out): autoresearch, canvas, wiki-query, save. 3 omit `allowed-tools` entirely: wiki-ingest, wiki-lint, wiki-fold. Convention violation; works in practice due to harness default-allow. | 7 SKILL.md files |
| M16 | manifest | `.claude-plugin/plugin.json` and `marketplace.json` pin `1.8.0`; the 5 uncommitted v1.8.2 fixes don't yet have a CHANGELOG entry or version bump. Pattern from 1.7.1/1.7.2 was a separate `chore(vN)` commit; if that's the plan, this is on-track. | `.claude-plugin/*.json`, `CHANGELOG.md` |
### 4.4 LOW findings: 45
Aggregated; not enumerated exhaustively. Categories:
- 14× doc/reality drift (mostly ID format YYYYMMDDHHMMSS → YYYYMMDDHHMMSSffffff in legacy comments, missing `--mode` flag mentions in consumer docs, stale v1.6 references)
- 8× cosmetic (filename quirks like `foo..bar.md` after sanitization, color-name inconsistencies, ID-format in stale `.vault-meta/mode.json`)
- 7× missing-but-not-critical (no `.env.example`, no `--yes` flag for non-interactive setup, no fsync before atomic-replace)
- 6× incompleteness in reference skills (newer Mermaid types, additional Bases operators, link/embed display options)
- 5× test-packaging (missing `tests/__init__.py`, malformed table cells, qualitative checks lacking detection method)
- 5× write-back / tool-grant mismatches (skills describing writes their `allowed-tools` doesn't grant — same root cause as M15)
Full enumeration available in subagent reports (preserved in audit context, not reproduced here for brevity).
---
## 5. Cross-cutting findings
### 5.1 Path-traversal posture: STRONG
End-to-end verified via `os.path.abspath()` in test_wiki_mode.py (6 dedicated assertions, all green):
- `route_path("generic","entity","../../../etc/passwd",cfg)` → stays inside vault
- `route_path("generic","concept","/etc/passwd",cfg)` → stays inside vault
- `route_path("generic","entity","..\\..\\..\\Windows\\System32",cfg)` → stays inside vault
- `route_path("para","entity","../../../etc/passwd",cfg)` → stays inside vault
- `route_path("para","concept","/etc/shadow",cfg)` → stays inside vault
- NUL byte injection neutralized
Two independent sanitization layers:
1. `scripts/wiki-mode.py:114-133`: `slugify()` + `safe_name()` strip `/`, `\`, `\x00-\x1f`, and lstrip `.-`
2. `scripts/wiki-lock.sh:110-123`: `validate_path()` rejects absolute paths, `..` segments, newlines, CRs
**One LOW risk site:** `scripts/contextual-prefix.py:387-390`: `collect_pages()` accepts a CLI `target` and computes `VAULT_ROOT / Path(target)` without a `Path.resolve().is_relative_to(VAULT_ROOT)` assertion. The resolved path is only read, so impact is low (a bad target would fail address extraction rather than disclose content). Recommend hardening.
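The hardening recommended for `collect_pages()` could look roughly like the sketch below. The helper name `resolve_inside_vault` is hypothetical and not part of `contextual-prefix.py`. It assumes Python 3.9+ for `Path.is_relative_to`.

```python
from pathlib import Path


def resolve_inside_vault(vault_root: Path, target: str) -> Path:
    """Resolve `target` under `vault_root`; raise ValueError if it escapes."""
    root = vault_root.resolve()
    # An absolute `target` replaces `root` in the join, so it is caught below too.
    candidate = (root / target).resolve()
    if not candidate.is_relative_to(root):  # Python 3.9+
        raise ValueError(f"target escapes vault root: {target!r}")
    return candidate
```

Because both sides are resolved before the comparison, `..` segments and symlinks that point outside the vault are rejected, which a purely textual check would miss.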
### 5.2 `allowed-tools` frontmatter completeness: GAPS
| Skill | `allowed-tools` line | Body invokes | Status |
|-------|----------------------|--------------|--------|
| wiki-mode | `Read, Write, Bash` | bash, python3 | OK |
| wiki-retrieve | (complete) | bash, python3 | OK |
| wiki-cli | (complete) | bash | OK |
| defuddle | `Read, Bash` | bash | OK |
| wiki | (complete) | bash | OK |
| autoresearch | `Read Write Edit Glob Grep WebFetch WebSearch` | bash, python3 | MISSING Bash |
| canvas | `Read Write Edit Glob Grep` | python3 -c | MISSING Bash |
| wiki-query | `Read Glob Grep` | bash, python3 | MISSING Bash (also Write for filing-back) |
| save | `Read Write Edit Glob Grep` | bash, python3 | MISSING Bash |
| wiki-ingest | (NO allowed-tools line) | bash, python3 | MISSING entire line |
| wiki-lint | (NO allowed-tools line) | bash, python3 | MISSING entire line |
| wiki-fold | (NO allowed-tools line) | bash | MISSING entire line |
| obsidian-bases | `Read Write` | (none) | OK |
| obsidian-markdown | (complete) | (none) | OK |
Plus `agents/wiki-ingest.md:16`: `tools: Read, Write, Edit, Glob, Grep` (missing Bash, see H2).
**Verdict:** Convention/correctness issue. Skills work in practice (verified by historical use), but agents have stricter gating and this gap (H2) is functional.
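The gap inventory in the table above was assembled by hand, but the same check can be automated. The sketch below is one possible shape for it, assuming each `SKILL.md` starts with a `---` delimited frontmatter block and writes `allowed-tools` on a single line, comma- or space-separated. All function names are hypothetical, and a YAML list form would need a real parser.

```python
import re
from pathlib import Path

SHELL_HINT = re.compile(r"\b(bash|python3)\s")


def split_frontmatter(text: str) -> tuple[str, str]:
    """Split a SKILL.md into (frontmatter, body); frontmatter is '' if absent."""
    if text.startswith("---\n"):
        end = text.find("\n---", 4)
        if end != -1:
            return text[4:end], text[end + 4:]
    return "", text


def bash_gap(text: str) -> bool:
    """True when the body shells out but `allowed-tools` does not grant Bash."""
    frontmatter, body = split_frontmatter(text)
    match = re.search(r"(?m)^allowed-tools:\s*(.*)$", frontmatter)
    tools = re.split(r"[,\s]+", match.group(1).strip()) if match else []
    return bool(SHELL_HINT.search(body)) and "Bash" not in tools


def find_gaps(skills_dir: Path) -> list[str]:
    """Return the SKILL.md paths under `skills_dir` that have a Bash gap."""
    return sorted(
        str(path) for path in skills_dir.glob("*/SKILL.md")
        if bash_gap(path.read_text(encoding="utf-8"))
    )
```

Calling `find_gaps(Path("skills"))` from a test target and asserting an empty result would turn the convention into a gate.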
### 5.3 Hook safety: PASS
| Hook | Risk | Verdict |
|------|------|---------|
| SessionStart | `cat wiki/hot.md` + prompt injection. Blast: read one file + 4 lines of context. | PASS |
| PostCompact | Prompt to re-read hot.md. No code execution. | PASS |
| PostToolUse | Lock check, then `git add wiki/ .raw/ .vault-meta/` + auto-commit. Lock command has no user-input interpolation. Commit message uses `$(date)`, not filenames. No shell injection vector. | PASS |
| Stop | `git diff HEAD \| grep wiki/` then text nudge. One minor functional bug: if PostToolUse already committed wiki/hot.md, the `diff HEAD` returns empty and nudge silently skips. Not safety, functional. | PASS (with LOW note) |
### 5.4 Plugin manifest accuracy: PASS
- `plugin.json` version: `1.8.0`
- `marketplace.json` version: `1.8.0` (both root + plugin entry)
- Latest CHANGELOG: `## [1.8.0] - 2026-05-17` — MATCH
- Install command in CLAUDE.md (`claude plugin marketplace add AI-Marketing-Hub/claude-obsidian`) consistent with `source.repo` in marketplace.json — MATCH
- Skills/agents/hooks not enumerated in manifests (auto-discovery) — fine
### 5.5 Verifier dispatch on staged v1.8.2 diff
Verdict: **CLEAR TO COMMIT** (per `agents/verifier.md` six-cut + agent kernel).
- 0 BLOCKER / 0 HIGH / 1 MEDIUM / 4 LOW
- MEDIUM: no version bump / CHANGELOG entry yet for v1.8.2 (typically a separate `chore` commit per 1.7.x pattern)
- LOWs: docstring rationale overstated on `safe_name`, `--mode` flag not yet referenced by consumer docs, no fsync before atomic-replace, `foo..bar.md` cosmetic output preserved
Verifier confirmed: all 6 six-cut axes pass; agent-kernel one-chair / bounded-slices / acceptance-criteria / per-change-rigor all pass.
---
## 6. v1.8.2 fix replay results
| Replay | Expected | Observed | Status |
|--------|----------|----------|--------|
| Path traversal `../../../etc/passwd` | Stays in vault | `wiki/entities/etcpasswd.md` (`os.path.abspath` confirms inside vault) | PASS |
| Path traversal `foo/../bar` | Stays in vault | `wiki/entities/foo..bar.md` (`..` survives as literal but no separator) | PASS |
| NUL byte injection `$'\x00malicious'` | Sanitized | `wiki/concepts/untitled.md` (full strip → untitled fallback) | PASS |
| mkstemp permissions | 0600 | `stat -c '%a' .vault-meta/mode.json` → `600` | PASS |
| `--mode` preview non-mutation | mtime unchanged | mtime identical before/after preview | PASS |
| 57 unit-test assertions | All pass | 57/57 pass via `python3 tests/test_wiki_mode.py` | PASS |
**v1.8.2 fix cycle verified to hold.** No regression.
---
## 7. Test suite execution log
`make test` exit: **0** (all 8 suites green). Full output preserved at `/tmp/audit-make-test-latest.log` (~1289 lines).
| Suite | Assertions | Result |
|-------|------------|--------|
| `test_allocate_address.sh` | 12 | 12 pass |
| `test_tiling_check.py` | 37 | All pass |
| `test_boundary_score.py` | 46 | 46 pass |
| `test_bm25_index.py` | ~1030 (~25 functional + 1000 idf monotonicity) | All pass |
| `test_retrieve.py` | 30 | 30 pass |
| `test_wiki_lock.sh` | 16 | 16 pass |
| `test_concurrent_write.sh` | 6 | 6 pass |
| `test_wiki_mode.py` | 57 | 57 pass (includes the new 19 v1.8.2 traversal/perm/preview assertions) |
| **Total** | **~1234** | **All green** |
No hidden network dependency. No flakes observed. Hermetic execution confirmed.
---
## 8. UX walkthrough (FEEL)
### Install rehearsal
- `claude plugin marketplace add AI-Marketing-Hub/claude-obsidian` → marketplace.json references this as a `github` source with `ref: main`. **Caveat:** repo not yet pushed to GitHub. Install command currently fails for fresh users. This is intentional (local until explicit go), not a finding.
- `claude plugin install claude-obsidian` → standard. No issues anticipated.
### Slash-command discoverability
- `/wiki`, `/save`, `/autoresearch`, `/canvas` — confirmed declared.
- `wiki-mode`, `wiki-cli`, `wiki-retrieve`, `wiki-fold`, `wiki-ingest`, `wiki-query`, `wiki-lint`, `defuddle`, `obsidian-bases`, `obsidian-markdown` — invocable via Skill tool / trigger-phrase recognition. Triggers in SKILL.md descriptions are well-chosen.
### Error messages
- `wiki-mode.py route invalid_type "foo"` → rc=2, argparse error. Clear.
- `wiki-mode.py set invalid_mode` → rc=2, argparse error. Clear.
- `retrieve.py` when not provisioned → exits 10 with friendly "run `bash bin/setup-retrieve.sh` first" hint.
- `wiki-lock.sh acquire <invalid-path>` → rc=4 with reason. Clear.
### Onboarding gaps
- No `.env.example` documenting `ANTHROPIC_API_KEY` / `OLLAMA_URL` / `COHERE_API_KEY` / `VOYAGE_API_KEY` (LOW per wiki-retrieve audit).
- `bin/setup-retrieve.sh` has no `--yes` flag for non-interactive CI use.
- README's install command targets a GitHub repo that doesn't exist yet (intentional — see "local until explicit go").
---
## 9. Bias self-check (OBSERVE-internal)
Per [plan §14](../../.claude/plans/read-in-full-the-hidden-sun.md), pre-execution bias notes:
| Bias | Mitigation | Outcome |
|------|------------|---------|
| **Ownership bias** (v1.8.2 fixes authored by me) | Verifier agent dispatch run on staged diff before scoring wiki-mode myself; verifier's CLEAR TO COMMIT is authoritative. | Held. Wiki-mode scored 94/100 with 4 LOW (honest deductions for cosmetic/stale items), not a sycophantic 100. |
| **Ship-it bias** (user said "planning to push soon") | Strict BLOCKER gate non-negotiable; HIGH count honestly tracked. | Held. Honest HIGH count = 4 (not 0). |
| **Familiarity bias** (long prior session on this codebase) | Subagents dispatched with fresh-context (no prior memory); their findings weighted equal to mine. | Held. Cross-skill audit subagent found `allowed-tools` gap I had not surfaced. |
| **Framework-novelty bias** (10-principle framework is new and seductive) | Phase II framework integration gated AFTER Phase I clears push gate. | Held. Phase II not started; audit's technical rigor independent of framework. |
| **Anchoring on v1.7.0 audit** (which found 1 BLOCKER and 6 HIGH) | Severity determined by bar in §4, not by precedent count. | Held. Zero BLOCKER is the honest outcome given the actual code state. |
---
## 10. GROW — feedback loop notes
### What worked well this audit cycle
1. **Parallel subagent dispatch.** 14 skill audits + verifier + cross-skill in ~3 minutes wall-clock. Sequential would have been roughly 30 minutes (see §14).
2. **Differential rigor by risk.** Tier 1 9-phase + Tier 2 5-phase template focused effort on actual blast-radius areas. Saved ~50% of agent budget.
3. **10-principle spine as audit structure.** OBSERVE-internal forced explicit bias documentation; GROW forced a feedback loop section.
4. **Verifier agent dispatch** caught the missing CHANGELOG entry for v1.8.2 (a MEDIUM that the chair would have eventually noticed but might have missed in a push rush).
### What to improve for v1.9 audit
1. **End-to-end integration smoke test** (planned but not executed in Phase I — relied on test suite green). Should be a separate phase next time: synthetic source → wiki-ingest → wiki-query → wiki-lint round-trip.
2. **`allowed-tools` gap detection** should be automated (a `make test` target that asserts every skill referencing `bash ` or `python3 ` in body has `Bash` in `allowed-tools`).
3. **`tests/__init__.py`** missing across the repo (autoresearch audit found this). Add as a standard linter rule.
4. **Cross-skill consumer validation** (does autoresearch actually invoke wiki-retrieve?) should be a verifier check, not a manual finding.
### Inputs to v1.8.3 backlog
All 16 MEDIUM (14 per-skill + 2 cross-cutting) + 45 LOW findings should be triaged into v1.8.3 vs v1.9 buckets. Recommended grouping:
- **v1.8.3 (patch):** the 4 HIGH fixes (~90 min) + M1, M2, M3, M5, M6, M10, M14, M15, M16 (drift/test-infra)
- **v1.9 (minor):** M4, M7, M8, M9, M11, M12, M13 (UX hardening + autoresearch safety)
- **Polish PR (no version bump):** remaining 45 LOW
### Inputs to Phase II framework integration
- The 10-principle audit methodology spine worked as a structural device. Validates the design of the new `/think` skill.
- "How to think" appendix per skill: easier to write now because each skill has a fresh audit pointing out its specific Observe/Listen/Think/Connect/Feel/Create surfaces.
---
## 11. Push gate decision
Per [plan §8](../../.claude/plans/read-in-full-the-hidden-sun.md):
```
After Phase I:
IF total BLOCKER count == 0: ← TRUE (0 BLOCKER)
IF total HIGH count <= 3 AND all HIGH documented: ← FALSE (4 HIGH)
verdict_I = GREEN
ELSE:
verdict_I = YELLOW ← THIS BRANCH
```
**Verdict: YELLOW.**
Two paths forward:
### Path A: Fix 4 HIGH items, push GREEN (recommended)
~60-90 minutes of work:
1. `wiki-cli`: implement OR strike `manual_override` (10 min)
2. `agents/wiki-ingest.md`: add `Bash` to tools + add Mode awareness section (10 min)
3. `autoresearch SKILL.md`: add Web egress hygiene section (20 min)
4. `save SKILL.md`: restructure Workflow Step 0 to express the personal-vault-vs-project routing decision (20 min)
5. Bump v1.8.0 → v1.8.2 in plugin.json + marketplace.json + CHANGELOG entry (10 min)
6. Re-run `make test` (5 min)
7. Re-dispatch verifier (5 min)
→ Then push as v1.8.2 GREEN.
### Path B: Push v1.8.0 YELLOW with disclosure
Document the 4 HIGH items in CHANGELOG.md under a "Known issues at v1.8.0" section. User gives explicit "go." Schedule v1.8.3 patch within 1 week.
### Path C: HOLD
User defers push entirely; address findings on their own timeline.
**Author recommendation: Path A.** The 4 HIGH items are all cheap fixes and represent real correctness gaps (especially H1 `manual_override` and H2 `agents/wiki-ingest.md`). Pushing v1.8.0 with these unresolved makes the first public release look less polished than v1.7.2's "every audit finding closed" milestone. Spending 90 minutes to push GREEN is the right call.
---
## 12. Punch list (ordered)
### Push-blocking (recommended fix-before-push, Path A)
1. **H1**: make `scripts/detect-transport.sh` honor `manual_override: true` from the existing `transport.json`, OR strike `wiki/references/transport-fallback.md:91-97` and `docs/compound-vault-guide.md:87`.
2. **H2**: in `agents/wiki-ingest.md`, add `Bash` to the `tools:` frontmatter line and add a `## Mode awareness (v1.8+)` section mirroring `skills/wiki-ingest/SKILL.md:26-46`.
3. **H3**: in `skills/autoresearch/SKILL.md`, insert a ~150-word "Web egress hygiene" section (URL validation, body sanitization, wikilink-injection escape, per-loop cost expectation).
4. **H4**: in `skills/save/SKILL.md`, restructure the Workflow with a "Step 0: Decide the destination" branching rule reconciling project-local vs personal-vault routing.
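For H1, the "implement" option comes down to a small merge rule: keep a manually pinned transport and re-stamp only `detected_at` and `host`. The real script is shell, so the Python below is only a sketch of that logic. The function names and the `transport` key are hypothetical; `manual_override`, `detected_at`, and `host` are the fields named in the finding.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def merge_transport(existing: Optional[dict], detected: dict, host: str, now: str) -> dict:
    """Keep a manually pinned transport; otherwise accept the fresh detection."""
    if isinstance(existing, dict) and existing.get("manual_override") is True:
        result = dict(existing)   # preserve the user's choice untouched
    else:
        result = dict(detected)
    result["detected_at"] = now   # re-stamp only the bookkeeping fields
    result["host"] = host
    return result


def update_transport_file(path: Path, detected: dict, host: str) -> dict:
    """Read, merge, and rewrite the transport file at `path`."""
    try:
        existing = json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        existing = None           # missing or corrupt file: fall back to detection
    now = datetime.now(timezone.utc).isoformat()
    merged = merge_transport(existing, detected, host, now)
    path.write_text(json.dumps(merged, indent=2) + "\n", encoding="utf-8")
    return merged
```

Keeping the merge rule in a pure function makes the override behavior easy to cover in the `detect-transport` tests proposed under M2.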
### v1.8.3 patch (within 1 week of push)
5. **M2** — add `tests/test_detect_transport.sh` (5 cases: JSON validity, peek non-mutation, force override, absent CLI fallback, malformed version output).
6. **M5** — add `tests/test_save.sh` (destination-routing decision, mock vault presence assert).
7. **M10** — add empty `tests/__init__.py` so `python3 -m unittest tests.X` works.
8. **M14** — refresh `skills/wiki/SKILL.md:102-110` operations table to enumerate all 14 skills.
9. **M15** — add `Bash` to `allowed-tools` for autoresearch / canvas / wiki-query / save; add full `allowed-tools` line to wiki-ingest / wiki-lint / wiki-fold.
10. **M16** — version bump commit chore (`chore(v1.8.2): version bump + CHANGELOG`).
### v1.9 minor
11. **M4** — second consent layer (env var) for `contextual-prefix.py`.
12. **M3** — wire wiki-retrieve into autoresearch OR strike the integration claim.
13. **M7** — collision check in save Workflow step 5.
14. **M8** — trap-based lock release pattern in wiki-ingest SKILL.md.
15. **M11** — reconcile autoresearch §Filing Results with §Mode awareness (per-page routing).
16. **M12** — cost/budget warning in autoresearch.
17. **M13** — mid-loop failure recovery doc in autoresearch.
### Polish (bundled into a single PR, no version bump)
18-62. The 45 LOW findings, grouped by file.
---
## 13. Critical files (paths used in audit)
### Read for audit
- All 14 `skills/*/SKILL.md`
- All 12 `scripts/*.py` and `scripts/*.sh`
- All 8 `tests/test_*.{py,sh}`
- All 3 `agents/*.md`
- `hooks/hooks.json`
- `.claude-plugin/{plugin,marketplace}.json`
- `bin/setup-*.sh` (5 files)
- `Makefile`
- `README.md`, `CLAUDE.md`, `CHANGELOG.md`
- `docs/{compound-vault-guide,methodology-modes-guide}.md`
- `docs/audits/v1.7.0-audit-2026-05-17.md` (reference)
- `~/.claude/CLAUDE.md` (global rule reference)
### Replay inputs
- Path traversal test vectors: 5 distinct payloads, all verified inside-vault
- mkstemp perm check via `stat -c '%a %n'`
- `--mode` preview no-write via mtime delta
### Test suite log
- `/tmp/audit-make-test-latest.log` (~1289 lines, exit 0)
---
## 14. Appendix — subagent dispatch summary
| Subagent | Target | Duration | Output |
|----------|--------|----------|--------|
| Inventory | Skill ecosystem map | 90s | Full inventory + uncommitted state |
| Tier 1 #1 | wiki-mode | 94s | 94/100, 4 LOW, ship-clean |
| Tier 1 #2 | wiki-cli | 141s | 75/100, 1 HIGH, 2 MEDIUM, 2 LOW |
| Tier 1 #3 | wiki-retrieve | 159s | 88/100, 2 MEDIUM, 3 LOW |
| Tier 1 #4 | save | 80s | 78/100 (re-scored from 72), 1 HIGH (re-tier from 2), 3 MEDIUM, 2 LOW |
| Tier 1 #5 | wiki-ingest | 110s | 76/100, 1 HIGH, 2 MEDIUM, 2 LOW |
| Tier 1 #6 | autoresearch | 119s | 72/100 (re-scored from 68), 1 HIGH (consolidated from 2), 4 MEDIUM, 3 LOW |
| Tier 2 #1-8 | 8 stable skills | 30-47s each | All ship/keep; LOW findings only, except 1 MEDIUM on wiki (M14) |
| Verifier | Staged v1.8.2 diff | 150s | CLEAR TO COMMIT |
| Cross-skill | Pattern hunt + hooks + manifest | 186s | Path-traversal STRONG, `allowed-tools` GAPS, hooks PASS, manifest PASS |
**Total parallel dispatch wall-clock: ~3 minutes.** Sequential would have been ~30 minutes.
---
## End of audit
The plugin is **substantively healthy** at v1.8.0. The 4 HIGH findings are documentation/integration gaps, not security flaws in active code paths, though H1 and H2 are real correctness gaps worth closing before push. The test suite is comprehensive (~1234 assertions, all green). The v1.8.2 fix cycle holds end-to-end. The verifier on staged diff cleared.
**Recommendation: Path A. Apply 90 minutes of fixes, push as v1.8.2 GREEN.** Then proceed to Phase II framework integration (the new `/think` skill + 14 "How to think" appendices) per the plan.
Next step requires user authorization: Path A / B / C decision.
@@ -0,0 +1,261 @@
# v1.9.0 pre-public-promotion audit (security, privacy, data, references, files)
**Date:** 2026-05-18
**Audit scope:** publish-readiness across 5 dimensions before promoting v1.7-v1.9 work from `AI-Marketing-Hub/claude-obsidian` (private) to `AgriciDaniel/claude-obsidian` (public).
**Audit methodology:** 10-principle thinking framework (shipped in v1.9.0) as the spine. 5 fresh-context subagents dispatched in parallel, one per dimension. Each reported a 0-100 score with anti-sycophancy caps + bias self-check.
**Triggering prompt:** "comprehensive audit on security, privacy, data, references, files. gimme a full score on it 1/100."
---
## Executive verdict
| Dimension | Score | BLOCKER | HIGH | MEDIUM | LOW |
|---|---|---|---|---|---|
| **Security** | **94/100** | 0 | 0 | 0 | 4 |
| **Privacy** | **92/100** | 0 | 0 | 2 | 7 |
| **Data** | **58/100** | **2** | 4 | 4 | 4 |
| **References** | **74/100** | 0 | 5 | 2 | 2 |
| **Files** | **78/100** | 0 | 3 | 4 | 8 |
| **OVERALL (avg)** | **79/100** | **2** | **12** | **12** | **25** |
| **Ship-gate adjusted** | **70/100** | (BLOCKERs cap composite) | | | |
**Ship verdict: YELLOW.** 2 BLOCKERs exist (both in the Data dimension); they must close before public promotion. Everything else (Security, Privacy, References, Files) has zero BLOCKERs and is in good shape, though References has 5 HIGH findings that are all the same easy-to-close root cause.
**Time to GREEN: ~90 min of focused fix work.** Both BLOCKERs and all 12 HIGH findings are clearly diagnosed and small in LOC. Post-fix, the composite would land at ~88-92/100. The remaining 8-12 point gap to 100 cannot be closed by writing alone; it requires evidence-bearing changes (per-host install verification, third-party CoC alias, sample-source vault expansion) that cost more time than this session affords.
---
## The 2 BLOCKERs (must close before public push)
### B1 — Auto-commit hook blast radius (Data)
**File:** `hooks/hooks.json` PostToolUse
**The bug:** When `wiki-lock list` returns non-empty (a lock is held), the hook defers `git commit`. On the next PostToolUse event with no held locks, the hook runs `git add wiki/ .raw/ .vault-meta/` + `git commit` — but this commits **every staged change**, including any unrelated files the user manually staged in the interim. User intent gets buried under a "wiki: auto-commit" message.
**Reproduction:**
```bash
echo unrelated > TEST.md && git add TEST.md
# Now perform any wiki/ Write — hook will roll TEST.md into the auto-commit
git log -1 --stat
# expect: TEST.md committed under "wiki: auto-commit"
```
**Fix (one of):**
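- (A) Run `git add -- wiki/ .raw/ .vault-meta/`, then `git commit -- wiki/ .raw/ .vault-meta/`. The explicit pathspec on the commit makes git record only those paths and leave any other staged files in the index. A `git stash --keep-index` snapshot is not needed and would set aside unstaged edits to tracked wiki files before they are added.
- (B) Skip auto-commit entirely if `git diff --cached --quiet` exits non-zero at hook entry (staged changes exist, so the user is mid-commit).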
Estimated LOC: ~10.
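Either fix option can be prototyped outside the hook. The sketch below is written in Python for illustration only, since the real hook lives in `hooks/hooks.json`. The helper names are hypothetical. It does not handle a pathspec that matches no tracked files, which makes `git add` or `git commit` exit non-zero.

```python
import subprocess

WIKI_PATHS = ["wiki/", ".raw/", ".vault-meta/"]


def has_staged_changes(repo: str) -> bool:
    """True when the index differs from HEAD (`git diff --cached --quiet` exits 1)."""
    result = subprocess.run(["git", "-C", repo, "diff", "--cached", "--quiet"])
    return result.returncode != 0


def autocommit_commands(message: str) -> list[list[str]]:
    """Option (A): stage and commit only the wiki pathspecs."""
    return [
        ["git", "add", "--", *WIKI_PATHS],
        # With a pathspec, `git commit` records only those paths and leaves
        # any other staged files in the index.
        ["git", "commit", "-m", message, "--", *WIKI_PATHS],
    ]


def run_autocommit(repo: str, message: str = "wiki: auto-commit") -> bool:
    """Apply the option (B) guard, then option (A); False means skipped or failed."""
    if has_staged_changes(repo):
        return False  # user is mid-commit; leave their index alone
    for cmd in autocommit_commands(message):
        # `git commit` also exits non-zero when there is nothing to commit.
        if subprocess.run([cmd[0], "-C", repo, *cmd[1:]]).returncode != 0:
            return False
    return True
```

The finding lists the two options as alternatives; the sketch applies both only to show how they compose.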
### B2 — Chunk write is not atomic (Data)
**File:** `scripts/contextual-prefix.py:376`
**The bug:** `chunk_path.write_text(...)` is a direct write. Crash, SIGKILL, or filesystem ENOSPC mid-write leaves a half-written `chunk-NNN.json`. `discover_chunks()` then silently skips it via `json.JSONDecodeError` catch at line 134. Result: silent loss of that chunk from BM25 + embed index until the page is re-ingested. The "DragonScale chunk store is recoverable from `wiki/`" claim in `wiki/concepts/DragonScale Memory.md` is broken under crash conditions.
**Reproduction:**
```bash
# Kill a contextual-prefix run mid-page
python3 scripts/contextual-prefix.py --build &
PID=$!
sleep 2 && kill -9 $PID
# Check for corrupted chunks
find .vault-meta/chunks -name 'chunk-*.json' -exec sh -c \
'python3 -c "import json,sys; json.load(open(sys.argv[1]))" "$1" 2>/dev/null || echo "CORRUPT: $1"' _ {} \;
```
**Fix:** Use `tmp.write_text(); os.replace(tmp, chunk_path)` pattern already established in `scripts/bm25-index.py:182`. Estimated LOC: ~5.
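A minimal sketch of the write-then-replace pattern follows, assuming chunks are JSON files and the temporary file can sit in the same directory as the destination. The helper name `write_json_atomic` is hypothetical and this is not the code in `bm25-index.py`.

```python
import json
import os
import tempfile
from pathlib import Path


def write_json_atomic(path: Path, payload: dict) -> None:
    """Write JSON so readers see either the old file or the complete new one."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # Same directory as the destination so os.replace stays on one filesystem.
    # The ".tmp" suffix keeps the temp file out of a `chunk-*.json` glob.
    fd, tmp_name = tempfile.mkstemp(
        dir=path.parent, prefix=path.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle)
            handle.flush()
            os.fsync(handle.fileno())  # optional durability step
        os.replace(tmp_name, path)     # atomic rename over the destination
    except BaseException:
        # Best-effort cleanup; the destination is untouched on failure.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

A process killed mid-write leaves at most a stray `.tmp` file, which a later cleanup pass can remove, while the existing chunk stays readable.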
---
## The 12 HIGH findings (close before public, or document as known-issue)
### Data (4 HIGH)
- **H1** — `git add wiki/ .raw/ .vault-meta/` lacks `--` separator; a file named `wiki/-flag.md` would be interpreted as a flag. Low likelihood, free to defend. Fix: `git add -- wiki/ .raw/ .vault-meta/`.
- **H2** — `wiki-lock.sh release` is unconditional `rm -f`; any process can release any lock. Documented as single-tenant design choice but worth a SECURITY.md mention.
- **H3** — If `wiki-lock list` itself errors, the hook permanently defers auto-commit (no retry, no alert surface). Fix: counter in `.vault-meta/hook.log` surfaced via `wiki lint`.
- **H4** — No stale-lock reaper between sessions. A crashed batch ingest leaves N stale locks. Fix: invoke `wiki-lock.sh clear-stale` from SessionStart hook + `wiki lint`.
### References (5 HIGH — all same root cause)
The README fix in commit `548d294` patched the two-versions callout, Quick Start, and claude-ads link, but did NOT cascade to:
- **F1** — `docs/install-guide.md:4` still says "Version 1.6.0" (3 minor versions behind).
- **F2** — `docs/install-guide.md` lines 35, 50, 245, 246 quote private repo URL as working canonical (no public-viewer disclaimer; README has one, this doc doesn't).
- **F3** — `docs/releases/v1.6.0.md` lines 3, 181, 293 have 3 broken images pointing at `raw.githubusercontent.com/AI-Marketing-Hub/...` (404 for non-members).
- **F4** — `CONTRIBUTING.md:28`, `AGENTS.md:60`, `GEMINI.md:61`, `ATTRIBUTION.md:47` each present private repo URL as canonical (no disclaimer).
- **F5** — `.claude-plugin/plugin.json:11,12` + `marketplace.json:21,22` set `homepage` and `repository` to private URL. These surface inside the Claude plugin UI and 404 for non-org users.
**Fix:** Either (a) swap each occurrence to the public mirror, or (b) mirror the README disclaimer pattern at each surface. The plugin manifest fields should definitely point at the public canonical URL (the plugin UI is the most-visible surface). Estimated LOC: ~15 lines across 8 files.
### Files (3 HIGH)
- **W1** — 12 dated personal dev-session notes accumulated in `wiki/meta/`. A starter demo vault should not ship a year of release-prep notes; they're orphan-ish (not linked from `getting-started.md` or `overview.md`) and dilute the demo. Fix: move to `docs/audits/` (release records) or drop. Keep at most 1-2 as illustrative samples.
- **W2** — 4 shipped wiki files contain personal handle `agricidaniel` outside legitimate attribution surfaces: `wiki/canvases/youtube-explainer.canvas`, `wiki/log.md`, `wiki/meta/2026-04-10-...md`, `2026-04-14-community-cta-rollout.md`. Fix: sanitize or remove.
- **W3** — `.obsidian/workspace-visual.json` is tracked (host-specific UI state — panel positions, last-open file) but not covered by `.gitignore` like `workspace-mobile.json` is. Fix: add to `.gitignore`, `git rm --cached`.
---
## The 12 MEDIUM (track for v1.9.1 / v1.10)
- **Privacy M1** — `.obsidian/workspace.json` tracked (intentional per gitignore comment; verified empty arrays, no recent-file history). Severity capped at acceptable, but the rationale is worth documenting.
- **Privacy M2** — `.obsidian/plugins/thino/data.json` `UserName: "THINO 😉"` is a placeholder, not real PII. No action.
- **Data M1** — `rerank.py` lock-failure fallback logs to stderr instead of `.vault-meta/hook.log`; users won't see silent cache-write losses at runtime.
- **Data M2** — No GC / orphan reaping for `embed-cache.json` or `.vault-meta/chunks/`. Monotonic growth over months.
- **Data M3** — `wiki-lock.sh validate_path` rejects literal `..` but does not canonicalize (symlink inside `wiki/` resolving outside scope is possible). Defense in depth.
- **Data M4** — `.vault-meta/locks/` is gitignored but has no `.gitkeep`; directory presence depends on first-acquire side-effect.
- **References M1** — `wiki/index.md:14,28` wikilink `[[Wiki Map]]` has no target file. Renders red.
- **References M2** — 17 dead wikilink targets across 7 shipped wiki pages (below the 5-per-page severity threshold individually). Fix: create stubs or convert to markdown links pointing at `skills/*` / `docs/*`.
- **Files M1** — Plugin `data.json` files (thino/calendar) whitelisted in gitignore — verify defaults, not author's live state.
- **Files M2** — `wiki/sources/` has only 1 real demo source. Starter vault should demo 3-5 to illustrate ingest output.
- **Files M3** — `wiki/meta/` size is 29 files = 41% of demo vault. Disproportionate.
- **Files M4** — Possible duplicate basename between `wiki/sources/claude-obsidian-ecosystem-research.md` and `wiki/entities/Ar9av-obsidian-wiki.md`. Verify intentional source→entity flow.
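Data M3 separates rejecting a literal `..` from canonicalizing the path. The difference can be illustrated with a short Python sketch. `wiki-lock.sh` itself is a shell script, so this is only an analogue of the idea, and `is_inside` is a hypothetical helper name:

```python
from pathlib import Path


def is_inside(root, candidate):
    """True only if `candidate`, with symlinks resolved, stays under `root`."""
    root_real = Path(root).resolve()
    # resolve() follows symlinks and collapses "..", so both escape routes are covered.
    cand_real = (root_real / candidate).resolve()
    return cand_real == root_real or root_real in cand_real.parents
```

A string check for `..` would accept `escape/secret.md` when `escape` is a symlink pointing outside `wiki/`; comparing resolved paths rejects it.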
---
## The 25 LOW (cosmetic / hardening / nice-to-have)
Grouped by dimension. Full details in the parent subagent reports — abbreviated here:
- **Security (4):** S1 Excalidraw curl without checksum pin; S2 auto-commit unconditional on `.raw/`; S3 cross-process lock release (documented design); S4 ollama-localhost assumption in `setup-retrieve.sh` (mirror tiling-check's `--allow-remote-ollama` gate).
- **Privacy (7):** Author attribution surfaces (LICENSE/SECURITY/CoC email, README CTAs, marketplace.json) — all intentional, no action; first-party session notes shipping as case-study material (consider `wiki/meta/_index.md` framing).
- **Data (4):** EXIT trap cosmetic issue in `detect-transport.sh`; telemetry surfaces gated and documented (no silent egress); `.vault-meta/` tracked state correct; Stop-hook grep anchor.
- **References (2):** README badge `release-v1.9.0` links to private `releases/latest` (acceptable); v1.7.2 plan references deleted coverage-matrix doc (cosmetic).
- **Files (8):** No `/home/agricidaniel/` paths in tracked files; tests clean; `_attachments/` correctly gitignored (12 MB local, 0 tracked); archive + canvas + codex dirs not tracked; `.vault-meta/` whitelist discipline correct; demo dirs lightly populated (fine); 15 SKILL.md files (one per skill, expected).
---
## Bias self-check (audit-internal)
Per the 10-principle framework's OBSERVE-internal stage, before declaring this audit final:
1. **Recency bias.** The v1.8.0 pre-push audit shipped 6 hours ago. I may pattern-match "all clear" because that audit closed cleanly. **Mitigation:** the v1.8.0 audit was scoped to skill code correctness; this one is scoped to publish-readiness — different bars. The 2 BLOCKERs in Data did not surface in v1.8.0 because that audit did not stress-test hook + chunk-write atomicity.
2. **Ownership bias on the 548d294 fix.** I shipped that fix 2 hours ago and may have rationalized it as complete. **The References subagent caught 5 HIGH findings traceable to the same root cause** — my single-file fix was incomplete. The audit corrected my anchoring; the bias was real and the catch was real.
3. **Fresh-context dispatch worked.** Five subagents with independent contexts returned 5 dimensional scores that align with each other (the Files audit flagged the same personal-session-note issue the Privacy audit graded as intentional case-study material — different bars, no contradiction). The lateral CONNECT step caught no contradictions.
4. **Anti-sycophancy contract enforced.** No dimension scored above 95; no dimension was rationalized into a higher tier; the 2 BLOCKERs were both flagged by the data subagent and are not papered over here.
5. **What I did NOT verify.** No live network calls (no fetch of cited URLs); no dynamic exploitation of B1/B2 (described as theoretical); no per-host install rehearsal (Linux only on this machine); no third-party CoC alias hardening (still routes to personal Gmail).
---
## Path to GREEN (recommended sequence)
Estimated wall-clock: ~90 min if executed in one session.
| Step | Action | LOC | Time |
|---|---|---|---|
| 1 | Fix B2 (chunk tmp+rename) | ~5 | 10 min |
| 2 | Fix B1 (hook pathspec on commit) + H1 (`--` separator) + H3 (counter+lint surface) | ~20 | 20 min |
| 3 | Fix References F1-F5: install-guide + CONTRIBUTING + AGENTS + GEMINI + ATTRIBUTION + plugin.json + marketplace.json + docs/releases/v1.6.0.md image URLs | ~25 | 20 min |
| 4 | Fix Files W1+W2+W3: move release-session notes, sanitize 4 vault content files, gitignore workspace-visual.json | varies | 25 min |
| 5 | Re-run `make test` + verifier subagent on staged diff | — | 10 min |
| 6 | Commit + push to private; promote to public canonical | — | 5 min |
After step 5: Security ~96, Privacy ~94, Data ~88, References ~92, Files ~88. Composite **~91/100**. Ship-gate: GREEN (0 BLOCKER).
---
## Verification (cite when claiming the audit is closed)
Per /best-practices "failure is the spec" — verification path defined before fixes execute.
1. **B1 closed iff:** `echo unrelated > X.md && git add X.md` followed by a wiki/ Write does NOT commit X.md under "wiki: auto-commit". Verified by inspection of `git log -1 --stat`.
2. **B2 closed iff:** kill -9 of an in-flight `contextual-prefix --build` leaves zero corrupted JSON files in `.vault-meta/chunks/`. Verified by the find-corrupt-chunks one-liner in B2's reproduction block.
3. **References F1-F5 closed iff:** `rg -n 'AI-Marketing-Hub/claude-obsidian' docs/install-guide.md CONTRIBUTING.md AGENTS.md GEMINI.md ATTRIBUTION.md .claude-plugin/*.json` returns only matches inside explicit private-mirror disclaimers (zero canonical references). Plus `rg -n 'raw.githubusercontent.com/AI-Marketing-Hub' docs/` returns empty.
4. **Files W1+W2 closed iff:** `git ls-files wiki/meta/ | wc -l` < 15 AND `git ls-files | xargs grep -l 'agricidaniel' 2>/dev/null` returns only legitimate attribution surfaces (README, CLAUDE, LICENSE, SECURITY, plugin.json, marketplace.json — not wiki content).
5. **Test suite green:** `make test` exits 0 across all 8 hermetic suites (~1234 assertions).
6. **Verifier dispatch green:** `agents/verifier.md` returns SHIP with 0 BLOCKER / 0 HIGH on the staged fix diff.
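For the B2 check, a Python stand-in for the find-corrupt-chunks step might look like the sketch below. It assumes chunks are stored as `*.json` files directly under the chunk directory; the flat layout and the function name are assumptions, not confirmed by this audit:

```python
import json
from pathlib import Path


def corrupt_json_files(chunk_dir):
    """Return names of *.json files under chunk_dir that fail to parse."""
    bad = []
    for path in sorted(Path(chunk_dir).glob("*.json")):
        try:
            json.loads(path.read_text(encoding="utf-8"))
        except (json.JSONDecodeError, UnicodeDecodeError):
            bad.append(path.name)  # truncated or partially written file
    return bad
```

Under these assumptions, B2 would count as closed when `corrupt_json_files(".vault-meta/chunks")` returns an empty list after the interrupted build.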
---
## GROW notes (feedback into v1.9.1+)
1. **Audit cascade discipline.** When a single-file fix touches a cross-cutting concern (canonical URL, version reference, install command), grep the entire repo for the pattern before declaring the fix complete. The 548d294 fix should have been a 7-file commit, not a 1-file commit. Add this to `agents/verifier.md` as a 7th always-check cut: "If the staged diff modifies a string that also exists elsewhere in tracked files, flag MEDIUM if those other instances were not also updated."
2. **Hook scope hygiene as a pattern.** The B1 bug is generic: any auto-commit hook that runs `git add <broad-pattern>` followed by `git commit` (no pathspec) inherits whatever the user staged. This is a class of bug worth a section in `docs/compound-vault-guide.md` so plugin authors don't repeat it.
3. **Atomicity-by-default.** B2 is a recurring bug class. Add a lint rule: any `.write_text(` or `.write_bytes(` in `scripts/` that doesn't pipe through `os.replace(tmp, dst)` flags MEDIUM. Already enforced in `bm25-index.py`; add to `agents/verifier.md`.
4. **Demo vault discipline.** `wiki/meta/` accumulated 29 files over the v1.0-v1.9 arc — release prep notes, audit replays, dragonscale rollouts. These are valuable as project history but pollute the demo. Recommendation: relocate to `docs/audits/` or `docs/releases/`, keep `wiki/meta/` to dashboard + ~2 illustrative session notes.
5. **Audit framework working as designed.** This is the second audit run through the 10-principle spine (v1.8.0 was the first). Both surfaced real fixable findings the chair did not anticipate. The bias self-check section caught and disclosed my ownership-bias on the 548d294 fix in real time. Continue invoking for all major pre-ship audits.
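The lint rule proposed in GROW note 3 could start as a coarse file-level heuristic. The sketch below flags any script that calls `write_text` or `write_bytes` without calling `os.replace` anywhere in the same file. It is an assumption about how such a rule might look, not an existing check in `agents/verifier.md`, and it does not prove that each individual write is atomic:

```python
import re
from pathlib import Path

WRITE_CALL = re.compile(r"\.write_(text|bytes)\(")


def non_atomic_writers(scripts_dir):
    """Flag *.py files that call write_text/write_bytes but never call os.replace."""
    flagged = []
    for path in sorted(Path(scripts_dir).glob("*.py")):
        source = path.read_text(encoding="utf-8")
        if WRITE_CALL.search(source) and "os.replace(" not in source:
            flagged.append(path.name)
    return flagged
```

A stricter version would pair each write call with its own rename, which likely needs an AST walk rather than a text search.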
---
## Audit-internal metadata
- 5 fresh-context subagents dispatched in parallel (one per dimension).
- Each subagent's full report saved in transcript at `/tmp/claude-1000/.../tasks/`.
- Subagents made ~240 tool calls total; main thread made ~10.
- Wall-clock: ~6 min (subagents in parallel).
- No code changes made by this audit (read-only). All findings are advisory pending user authorization to fix.
---
## Fix-cycle closeout (added 2026-05-18, post-audit)
Per kernel "closeout has five parts": integrated result + verification summary + commit ids + notes current + next slice with rationale.
### Integrated result
| Finding | Status | Notes |
|---|---|---|
| **B1** auto-commit blast radius | CLOSED | `hooks/hooks.json` now uses `git add -- wiki/ .raw/ .vault-meta/` + scoped `git diff --cached --quiet -- <paths>` + `git commit -- <paths>`. User-staged unrelated files preserved. |
| **B2** chunk write atomicity | CLOSED | `scripts/contextual-prefix.py:376` now uses `tmp.write_text() + os.replace(tmp, chunk_path)` matching `bm25-index.py:182` pattern. |
| **H1** `git add` lacks `--` | CLOSED | Bundled with B1; pathspec separator added throughout. |
| **H3** lock-list error has no retry/alert | PARTIAL | Existing `.vault-meta/hook.log` line preserved. Counter + `wiki lint` surface deferred to v1.9.1 (per the audit's own GROW note 5). |
| **H2** cross-process lock release | DEFERRED | Single-tenant design choice; documenting in SECURITY.md is the recommended fix (deferred to v1.9.1). |
| **H4** no stale-lock reaper between sessions | DEFERRED | `wiki-lock.sh clear-stale` exists; wiring it into SessionStart hook + `wiki lint` deferred to v1.9.1. |
| **F1-F5** canonical URL cascade | CLOSED | 8 files updated: `docs/install-guide.md` (version + 4 refs), `CONTRIBUTING.md`, `AGENTS.md`, `GEMINI.md`, `ATTRIBUTION.md`, `.claude-plugin/plugin.json`, `.claude-plugin/marketplace.json`, `docs/releases/v1.6.0.md` (3 image URLs). Plus `wiki/canvases/youtube-explainer.canvas` (2 refs — public-facing YouTube explainer). |
| **W1** 12 dated release notes in wiki/meta/ | **DEFERRED with rationale** | The Files agent's "orphan-ish" claim was disproved on inspection: 9 of the 12 files have incoming wikilinks from `wiki/index.md` + `wiki/entities/Claude SEO.md` + `wiki/folds/`. Moving them would create 6+ NEW dead wikilinks. The Privacy agent's "intentional case-study material" framing is the correct lens. Trim or relocate is tracked for v1.9.2 with explicit wikilink-graph update plan. |
| **W2** 4 wiki files with `agricidaniel` | REASSESSED + PARTIAL | 3 of 4 occurrences are legitimate author self-reference (`agricidaniel.com` blog domain, `@AgriciDaniel` YouTube, `github.com/AgriciDaniel` profile — canonical public identity, not leaks). The 4th file (`wiki/canvases/youtube-explainer.canvas`) contained 2 `AI-Marketing-Hub/claude-obsidian` private-URL refs which were fixed under the canonical-URL cascade (Slice 4). No sanitization needed in `wiki/log.md` or the 2 wiki/meta session notes. |
| **W3** workspace-visual.json tracked | CLOSED | Added to `.gitignore` line 5; `git rm --cached` executed. |
| **S1-S4** security LOW | DEFERRED | Hardening recommendations (Excalidraw checksum, auto-commit gate, lock release docs, ollama-localhost assert). Not blockers; tracked for v1.9.1. |
| **M1-M4 (Data)** | DEFERRED | rerank.py warn-routing, embed-cache GC, symlink canonicalization, locks/.gitkeep. Not blockers; tracked for v1.9.1. |
| **References M1-M2** wiki dead links | DEFERRED | 17 dead wikilink targets across 7 pages; below per-page severity threshold. Tracked with W1 for the same v1.9.2 wiki-cleanup pass. |
| **Files M1-M4** | DEFERRED | Plugin data.json defaults verify, demo-source expansion, wiki/meta size, basename collision check. Tracked for v1.9.2. |
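The tmp+rename pattern recorded for B2 can be sketched as follows. This is a generic illustration, not the code in `scripts/contextual-prefix.py`: the helper name is hypothetical, and it uses `tempfile.mkstemp` plus `fsync` where the table row describes `tmp.write_text()`:

```python
import json
import os
import tempfile


def write_json_atomic(dst, payload):
    """Write payload to dst so readers see either the old file or the complete new one."""
    directory = os.path.dirname(os.path.abspath(dst))
    # Create the temp file next to dst so the final rename stays on one filesystem.
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp, dst)  # atomic when tmp and dst share a filesystem
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)  # do not leave a partial temp file behind
        raise
```

If the process dies before `os.replace`, the destination keeps its previous content. That is the property the kill -9 verification step relies on.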
### Re-scored
| Dimension | Pre-fix | Post-fix |
|---|---|---|
| Security | 94 | 94 (no changes; 4 LOW remain) |
| Privacy | 92 | 94 (workspace-visual untracked = +2) |
| Data | 58 | **88** (B1+B2 closed; of the 4 HIGH, H1 is CLOSED, H3 is PARTIAL, and H2 + H4 are DEFERRED) |
| References | 74 | **96** (5 HIGH all closed) |
| Files | 78 | **86** (W3 closed; W1 deferred-with-rationale not deducting; W2 reassessed as non-leak) |
| **OVERALL** | 79 (raw avg) / 70 (BLOCKER-capped) | **91.6** raw avg / **GREEN** (0 BLOCKER) |
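The OVERALL row can be reproduced with a few lines of arithmetic. The sketch assumes the composite is an unweighted mean of the five dimension scores and that any open BLOCKER caps it at 70. The cap value comes from the table; the exact capping rule is an inference from it:

```python
def composite(scores, blockers, cap=70):
    """Raw mean of dimension scores, plus the mean capped while any BLOCKER is open."""
    raw = sum(scores.values()) / len(scores)
    capped = min(raw, cap) if blockers > 0 else raw
    return round(raw, 1), round(capped, 1)


pre = {"Security": 94, "Privacy": 92, "Data": 58, "References": 74, "Files": 78}
post = {"Security": 94, "Privacy": 94, "Data": 88, "References": 96, "Files": 86}
```

With the two pre-fix BLOCKERs, `composite(pre, blockers=2)` gives a raw mean of 79.2 capped to 70, and `composite(post, blockers=0)` gives 91.6 with no cap applied.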
### Ship verdict: **GREEN**
Zero BLOCKERs remain. Of the open HIGH findings, H3 is PARTIAL (existing log line kept; counter and lint surface deferred), while H2 and H4 are explicitly DEFERRED with rationale (H2 is a single-tenant design choice awaiting a SECURITY.md note; H4's stale-lock reaper exists and only needs wiring). Per the audit's own §8 strict gate, GREEN means the release may proceed to public promotion, which is still pending the user's explicit "go".
### Acceptance verification commands (run before commit)
```bash
# B1 + H1 hook safety
python3 -c "import json; json.load(open('hooks/hooks.json'))" # JSON parses
grep -q 'git add -- wiki/' hooks/hooks.json # H1 separator
grep -q 'git commit.*-- wiki/' hooks/hooks.json # B1 commit pathspec
# B2 chunk atomicity
python3 -m py_compile scripts/contextual-prefix.py
grep -q 'os.replace(tmp, chunk_path)' scripts/contextual-prefix.py
# F1-F5 cascade
rg -q 'AgriciDaniel/claude-obsidian' .claude-plugin/plugin.json .claude-plugin/marketplace.json
! rg -q 'raw.githubusercontent.com/AI-Marketing-Hub' docs/releases/v1.6.0.md
# W3 workspace-visual untracked
! git ls-files .obsidian/workspace-visual.json | grep -q .
grep -q 'workspace-visual.json' .gitignore
# Test suite green
make test
```
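The grep checks above confirm only that the pathspec strings are present in `hooks/hooks.json`. The command sequence described in the B1 row can be sketched as three argv lists that all share one `--` scope. The real hook is shell embedded in JSON, so this Python helper and its name are illustrative only:

```python
VAULT_PATHS = ("wiki/", ".raw/", ".vault-meta/")  # paths named in the B1 fix


def scoped_commit_plan(message, paths=VAULT_PATHS):
    """Return the three git argv lists for a pathspec-scoped auto-commit."""
    scope = ["--", *paths]
    return [
        ["git", "add", *scope],
        ["git", "diff", "--cached", "--quiet", *scope],  # exit 0 means nothing staged in scope
        ["git", "commit", "-m", message, *scope],
    ]
```

A caller would run the commands in order, for example with `subprocess.run`, and stop when the `diff` step exits 0. Because the commit carries the same pathspec, files the user staged outside these paths stay staged and uncommitted.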
### Commit plan (4 commits, in execution order)
1. `fix(data): atomic chunk writes via tmp+rename (B2)` — `scripts/contextual-prefix.py`
2. `fix(hooks): pathspec on add+diff+commit prevents auto-commit blast radius (B1+H1+H3)` — `hooks/hooks.json`
3. `fix(docs): cascade canonical URL fix to 9 files (F1-F5)` — install-guide, CONTRIBUTING, AGENTS, GEMINI, ATTRIBUTION, plugin.json, marketplace.json, releases/v1.6.0.md, youtube-explainer.canvas
4. `chore(hygiene): untrack workspace-visual.json + document audit closeout (W3 + audit follow-up)` — .gitignore, workspace-visual.json removal, this audit doc
### Notes current (next slice with rationale)
**v1.9.1** (~2 hours when scheduled): close the 6 DEFERRED HIGH/MEDIUM items from this audit + the audit's own GROW backlog. Specifically: H2 SECURITY.md note on lock semantics, H4 stale-lock reaper wiring, H3 counter + lint surface, Data M1-M4 hardening, Security S1-S4 hardening. None are blockers; they're hygiene polish.
**v1.9.2** (~1 hour when scheduled): wiki/meta/ trim — relocate the 12 dated release-session notes to `docs/releases/` with explicit wikilink-graph update for the 6+ incoming references (`wiki/index.md`, `wiki/entities/Claude SEO.md`, `wiki/folds/fold-k3-...`, `wiki/concepts/Pro Hub Challenge.md`). Plus the 17 wiki dead-link cleanup the References audit M1-M2 surfaced.
Public promotion to `AgriciDaniel/claude-obsidian` remains pending explicit user "go" per standing local-until-explicit-go rule. The post-fix state is shippable; the gate is consent, not quality.