# v1.7.0 Compound Vault — Full Audit

**Status:** COMPLETE — all 4 phases executed; 9 verification gates per plan §7 closed.
**Date:** 2026-05-17
**Branch audited:** `v1.7.0-compound-vault` (local, not pushed)
**Commits in scope:** 8 commits, SHAs `2dad552` → `4a362ed`
**Method:** /best-practices six-cut + agent kernel applied per commit; compass artifact coverage matrix (5 priority gaps + 20 backlog items); 3 parallel Explore agents (six-cut audit, coverage matrix, code-quality deep-read); main-thread verification of every BLOCKER and HIGH finding before filing.
**Auditor:** Claude Opus 4.7 (1M ctx) under human chair Daniel; agents were independent context (each got a self-contained brief without seeing each other's output).

---

## 1. Executive verdict (full audit)

v1.7 is **not ship-ready as `v1.7.0`** but is **close**. **31 findings**: 1 BLOCKER, 6 HIGH, 14 MEDIUM, 10 LOW. The BLOCKER is a real data-egress consent gap in `scripts/contextual-prefix.py:252-258` — surfaced by two independent agent reviews and verified by main-thread code read against the `scripts/tiling-check.py:351-352` `--allow-remote-ollama` precedent. ~1 hour fix. The 6 HIGH findings are design gaps fixable in ~2.5 hours total. Recommend pushing **v1.7.1** (BLOCKER + 6 HIGH addressed) instead of v1.7.0.

**Compass artifact coverage** (5 priority gaps + 20 backlog items = 25 cells): 6 SHIPPED, 3 PARTIAL, 9 DEFERRED with explicit v1.8/v1.9/v2.0/v2.5+ milestones, 4 OUT-OF-SCOPE. Matches the v1.7 plan's claim exactly — no over-delivery, no quiet under-delivery. The shipped items are the top-quartile by value/effort per the compass artifact's own scoring. The biggest remaining gap is the derivative-outputs surface (NotebookLM-class audio/video/quiz/study), which **widened during the audit** — Phase C found NotebookLM shipped Video Overviews + a 4-tile Studio panel in May 2026, expanding their lead.

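The opt-in behaviour recommended for the BLOCKER (see B1 in §8.1) can be sketched as follows. This is a minimal illustration, not the code in `scripts/contextual-prefix.py`: the signature of `pick_prefix_tier()` shown here, the `build_parser()` helper, and the idea of passing in a `claude_cli_path` are all assumptions made for the example. The `--allow-egress` flag name and the `anthropic-api` / `claude-cli` / synthetic tier names come from the recommended fix in the ledger.

```python
import argparse
import os


def pick_prefix_tier(allow_egress, env=None, claude_cli_path=None):
    """Pick a prefix tier; remote tiers are only eligible with explicit opt-in.

    Hypothetical signature for illustration. `env` defaults to os.environ and
    `claude_cli_path` is whatever the caller found on PATH (or None).
    """
    env = os.environ if env is None else env
    if allow_egress:
        # Page bodies may leave the machine only when the user asked for it.
        if env.get("ANTHROPIC_API_KEY"):
            return "anthropic-api"
        if claude_cli_path:
            return "claude-cli"
    # Default: local synthetic prefix, no network egress.
    return "synthetic"


def build_parser():
    """Argument parser with the opt-in flag (off by default)."""
    parser = argparse.ArgumentParser(description="contextual prefix (sketch)")
    parser.add_argument(
        "--allow-egress",
        action="store_true",
        help="permit sending page bodies to a remote tier",
    )
    return parser
```

With this shape, having `ANTHROPIC_API_KEY` set no longer selects a remote tier by itself; a caller would parse the flag with `build_parser().parse_args()` and pass `args.allow_egress` through, mirroring how `--allow-remote-ollama` is described for `tiling-check.py`.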
**Retrieval benchmark** (50 queries, scripted v1.6 baseline, real ollama rerank): **+39.5% error reduction. PASS** vs the v1.7 plan §7 ship-gate target of ≥30%. Top-1 accuracy 24% → 54% (+30pp); top-5 accuracy 48% → 88% (+40pp). Biggest win on derived natural questions (+52pp); ties on synonym and negative-query categories (those become findings M11, M12).

**Verdict on "is the repo #1 best ever?"** — Per-axis (§9), we are **#1 on 4 of 7 axes**: compounding wiki primitive, multi-writer safety, retrieval-architecture-free-tier, license/openness. **TIED on 1**: methodology support (nobody serves LYT/PARA/Zettel; v1.8 closes this into a 5th lead). **NOT #1 on 2**: GUI / install ergonomics (CLI-only vs Community-Plugins from Smart Connections + Copilot), derivative outputs (NotebookLM ships 4 first-class artifact tiles; we ship zero). Honest answer: **#1 on the axes that matter for sophisticated power users who control their own LLM stack — not #1 in mainstream adoption and won't be without v2.0 (derive) + v2.5 (GUI shell).**

**Recommendation**: (1) Fix the BLOCKER (~1h). (2) Ship v1.7.1 with the 6 HIGH patches (~2.5h). (3) v1.8 priority: methodology modes (gets us to 5/7 leads, cheapest move). (4) v2.0 derive spec needs to expand to include Video Overviews (new finding M13) to match NotebookLM's May 2026 bar. (5) Defer v1.7.0 tag until v1.7.1 is ready — tagging the blocker version is avoidable footprint.

---

## 2. Methodology

Findings filed in 4 tiers:

| Tier | Bar | Action |
|---|---|---|
| **BLOCKER** | Affects ship/push decision; back out the release if not fixed | Must fix before push |
| **HIGH** | Should fix before public push | Patch as v1.7.1, push after |
| **MEDIUM** | File as tracked issue | Defer to v1.7.x or v1.8 |
| **LOW** | Note for posterity / future polish | Bundle into a polish PR before v1.8 |

Verification gate: every BLOCKER and HIGH was independently verified by the main-thread auditor (Read on the actual file:line) before being filed at that severity. MEDIUM and LOW are filed on agent attribution.

---

## 3. Six-cut engineering kernel findings (per commit)

### 3.1 Commit ladder

```
2dad552 chore: pre-v1.7 cleanup
9c8e510 feat(v1.7): §3.1 substrate hard-prefer on kepano/obsidian-skills
6c7671e feat(v1.7): §3.2 default transport — Obsidian CLI with fallback chain
45a5bd3 feat(v1.7): §3.3 hybrid retrieval pipeline (wiki-retrieve)
66c11f9 feat(v1.7): §3.4 multi-writer safety — wiki-lock per-file advisory locks
51fa2da chore(v1.7): cross-cutting — version bump, docs, hot cache refresh
753fc8a chore(v1.7): gitignore runtime artifacts from Compound Vault scripts
4a362ed fix(v1.7): contextual-prefix.py — proper --all flag handling
```

8 commits. All authored by Daniel. Co-author trailer on every commit cites Claude Opus 4.7 (acceptable; consistent disclosure).

### 3.2 Per-commit six-cut walkthrough

For each commit, only NON-clean cells are reported. A "5/6 clean; 1 finding on cut N" line means the other 5 cuts were verified clean.

**`2dad552` (cleanup)** — 6/6 clean. Pure infrastructure prep (CLAUDE.md docs + .gitignore additions). No code paths to check.

**`9c8e510` (§3.1 substrate)** — 5/6 clean. 1 finding on cut #4 (delete more than you add): `+17 / -5` lines. The "soft-defer → hard-prefer" rewrite was an opportunity to delete the local fallback bodies in obsidian-markdown/obsidian-bases/canvas SKILL.md files.
The decision to keep the fallbacks is documented and defensible (users without kepano installed need them), but the kernel cut still flags zero-deletion as a signal to verify intent. **Filed: LOW** (intentional, documented).

**`6c7671e` (§3.2 transport)** — 5/6 clean. 1 finding on cut #6 (failure is the spec): `detect-transport.sh` substitutes external command output (`obsidian-cli --version`) directly into JSON via shell variable expansion. Only `tr -d '"'` is applied; newlines, backslashes, control chars are not escaped. On this machine the CLI isn't installed so the bug never triggers, but a malicious or buggy `obsidian-cli` could break JSON output. **Filed: MEDIUM** (theoretical; obsidian-cli is well-behaved in practice).

**`45a5bd3` (§3.3 retrieval)** — 4/6 clean. **2 findings**, including the BLOCKER:

- **Cut #6 (failure is the spec) — BLOCKER**: `scripts/contextual-prefix.py:252-258` `pick_prefix_tier()` selects tier 1 (Anthropic API) automatically whenever `ANTHROPIC_API_KEY` env var is set. No flag, no consent prompt, no warning. Sends full wiki page bodies (`anthropic_api_prefix()` at line 264, body included in prompt-cached system message) to `https://api.anthropic.com/v1/messages`. The existing precedent in `scripts/tiling-check.py:351-352` is to require `--allow-remote-ollama` explicitly when sending body content off-localhost. `contextual-prefix.py` has no equivalent guard. **VERIFIED by main thread**: read `scripts/contextual-prefix.py:240-281` directly.
- **Cut #6 (failure is the spec) — HIGH**: `bin/setup-retrieve.sh` has no rollback if Stage 1 (chunking) fails partway through. Partial `.vault-meta/chunks/` is left on disk. Re-run is idempotent (chunks with matching body_hash skip), but the user has no documented recovery path if Stage 1 fails on chunk 31 of 47.

**`66c11f9` (§3.4 concurrency)** — 5/6 clean.
1 finding on cut #6 (failure is the spec) — HIGH: `hooks/hooks.json` PostToolUse defers commit if `wiki-lock list | wc -l != 0`, but the entire pipeline ends with `|| true`. If `wiki-lock list` errors (permission denied on .vault-meta/.wiki-lock.meta, missing script, etc.), the `|| true` swallows it and `git add/commit` proceeds anyway. The intended safety property (defer commit on locks held) silently degrades to "always commit" on any error in the check.

**`51fa2da` (cross-cutting docs)** — 6/6 clean. Pure documentation + version bump.

**`753fc8a` (gitignore)** — 6/6 clean. Manually added by the user during the previous session.

**`4a362ed` (--all flag fix)** — 6/6 clean. 14-line targeted fix surfaced by the real-vault smoke; commit message correctly explains root cause.

### 3.3 Hermeticity verification

Ran `make test` — all 7 suites green. Counted: 1162 OK assertions, 0 failures, 0 errors.

Grep for network-touching code in tests/:

```
grep -rE 'urllib\.|requests|socket\.|http://|https://' tests/
```

Returns: only mock patches (`unittest.mock.patch.object(rerank, 'ollama_alive', ...)`) and subprocess invocations that target sibling scripts in temp sandboxes. No real network egress at test time. **Hermeticity claim verified.**

---

## 4. Agent kernel findings (4 workstreams)

| Constraint | Status | Evidence |
|---|---|---|
| **one chair** | VERIFIED | All 8 commits authored by Daniel; single human owner across all workstreams. |
| **bounded slices** | PARTIAL | 4 skills (`wiki-ingest`, `wiki-query`, `save`, `autoresearch`) were touched by both §3.2 (Transport section) and §3.4 (Concurrency section). No conflict in practice — sections are adjacent and compose cleanly — but the file-set overlap is real. The cross-cutting commit (51fa2da) is allowed to touch many files by definition; the §3.x feat commits were not strictly disjoint. **Filed: MEDIUM** (no harm done; flag for future releases to consider tighter scoping). |
| **explorers/workers/verifiers** | PARTIAL | Phase 1 of the original v1.7 implementation plan used 3 parallel Explore agents (verified in conversation log). Workers were the main-thread author. Verifier agents were NOT dispatched at workstream gates — code went straight from author to commit without an independent review pass. This audit IS the missing verifier pass; doing it post-commit instead of pre-commit means findings become patches instead of pre-merge fixes. **Filed: MEDIUM** (process gap; not a code bug). |
| **acceptance criteria before execution** | VERIFIED | Each feat commit references its §3.x scope; file sets match scope descriptions; original plan §7 ship gates documented. |
| **per-change rigor inside every slice** | PARTIAL | The six-cut kernel was clearly applied to code patterns (locking, flock guards, fallback chains, exit codes). BUT the BLOCKER on contextual-prefix.py egress shows the rigor was insufficient on the security/blast-radius cut. Had the author re-read tiling-check.py's `--allow-remote-ollama` pattern during §3.3 implementation, the egress gap would have been caught at write time. **Filed: HIGH** (process gap that produced a real bug). |
| **5-part closeout** | VERIFIED | CHANGELOG.md 1.7.0 entry covers: integrated result ✓, verification summary (7 suites, 1162 assertions, zero network) ✓, commit ids implicit via §3.x→commit mapping ✓, notes current ✓, next-slice rationale (v1.8/v1.9/v2.0 roadmap) ✓. |

---

## 5. Compass artifact coverage matrix

### 5.1 Five priority gaps

| # | Gap | Status | Evidence |
|---|---|---|---|
| 1 | Platform-owner substrate (kepano/obsidian-skills) | **SHIPPED** | 3 SKILL.md files defer hard-prefer; `marketplace.json:28-34` declares recommendedCompanions |
| 2 | Obsidian CLI first-class transport | **SHIPPED** | `scripts/detect-transport.sh` + `.vault-meta/transport.json` + decision tree at `wiki/references/transport-fallback.md` + 5 skill "Transport (v1.7+)" sections |
| 3 | NotebookLM-class derivative artifacts | **DEFERRED → v2.0** | Documented in `compound-vault-guide.md:274` ("v2.0 — NotebookLM-class derivative outputs") |
| 4 | Contextual retrieval + hybrid + rerank | **SHIPPED** | 4 new scripts (`contextual-prefix`, `bm25-index`, `rerank`, `retrieve`) + setup + skill + wired into `wiki-query` |
| 5 | Adoption friction (GUI onramp, one-liner installer) | **PARTIAL** | CLI transport reduces friction; GUI onramp deferred to v2.5+; no `npx claude-obsidian init` shipped |

### 5.2 Twenty backlog items

| # | Item | Status | Where |
|---|---|---|---|
| 1 | Substrate dependency on kepano | SHIPPED | §3.1 (commit 9c8e510) |
| 2 | wiki-cli default transport | SHIPPED | §3.2 (commit 6c7671e) |
| 3 | Contextual retrieval per-chunk prefix | SHIPPED | §3.3 `scripts/contextual-prefix.py` |
| 4 | Hybrid BM25 + vector + rerank | **PARTIAL** | BM25 + rerank shipped; rerank uses dense vectors internally, but no SEPARATE vector candidate stage. `compound-vault-guide.md:97` acknowledges "A separate dense vector stage is on the v1.7.x roadmap." |
| 5 | wiki-derive audio | DEFERRED → v2.0 | `CHANGELOG.md:36` |
| 6 | wiki-mode bootstrap (LYT/PARA/Zettel/Generic) | DEFERRED → v1.8 | `CHANGELOG.md:35` |
| 7 | GUI onramp Obsidian-plugin shell | DEFERRED → v2.5+ | `compound-vault-guide.md:263` |
| 8 | --from notebooklm/readwise/zotero adapters | DEFERRED → v1.9 | `CHANGELOG.md:37` |
| 9 | wiki-derive quiz/flashcards/study-guide/brief | DEFERRED → v2.0 | `CHANGELOG.md:36` |
| 10 | Out-of-box local embedding + Ollama fully-local path | **SHIPPED** | `--no-llm` flag in `bin/setup-retrieve.sh` forces tier-3 synthetic; rerank uses ollama (fully local) |
| 11 | wiki-review (PARA weekly/monthly) | DEFERRED → v1.8 | `CHANGELOG.md:38` |
| 12 | Multimodal ingest (YouTube/PDF/audio/image) | DEFERRED → v1.9 | `CHANGELOG.md:37` |
| 13 | ACP transport (Copilot #2179) | OUT-OF-SCOPE | No ACP mention in codebase; 4-tier fallback shipped without it |
| 14 | wiki-derive slides + mindmap | DEFERRED → v2.0 | implicit in §wiki-derive deferral |
| 15 | Multi-vault federation (wiki-federate) | DEFERRED → v2.x | `compound-vault-guide.md:264` |
| 16 | iOS Share extension ingest | OUT-OF-SCOPE | `skills/wiki-cli/SKILL.md` notes mobile is filesystem-only; no v1.7 work |
| 17 | Cursor/Codex/OpenCode parity | SHIPPED | `bin/setup-multi-agent.sh` (predates v1.7 but covers this) |
| 18 | Hosted Pro tier | OUT-OF-SCOPE | `compound-vault-guide.md:262` "Not a paid plugin" |
| 19 | DragonScale promoted from extension to default | **PARTIAL** | DragonScale still opt-in; v1.7 did NOT promote. wiki-lock (§3.4) is universally beneficial but is a separate concern from full DragonScale |
| 20 | Spaced-repetition Anki round-trip | OUT-OF-SCOPE | Not in roadmap |

### 5.3 Coverage summary

- **SHIPPED**: 6 (Gap 1, 2, 4 + Backlog 1, 2, 3, 10, 17 — note Gap 1=Backlog 1, Gap 2=Backlog 2 collapse to 6 distinct items)
- **PARTIAL**: 3 (Gap 5, Backlog 4, Backlog 19)
- **DEFERRED (with milestone)**: 9 (Gap 3, Backlog 5, 6, 8, 9, 11, 12, 14, 15)
- **OUT-OF-SCOPE**: 4 (Backlog 13, 16, 18, 20)

**Honest read**: v1.7 delivers EXACTLY what the v1.7 plan claimed — top-quartile items 1-4 by value/effort + the latent multi-writer bug fix. No accidental over-delivery; no quiet under-delivery. The biggest gaps to category leadership are item #5 (NotebookLM-class outputs) and item #7 (GUI onramp), both explicitly deferred.

---

## 6. Retrieval benchmark results (Phase B)

### 6.1 Method

- Corpus: 50 queries (25 derived natural questions + 25 hard: 5 synonym + 10 cross-page + 5 partial-recall + 5 negative). Each annotated with `correct` page(s), `relevant` supporting pages, category, and rationale. Stored at [wiki/meta/retrieval-benchmark-v1.7.md](../../wiki/meta/retrieval-benchmark-v1.7.md).
- Pipelines compared:
  - **v1.7 hybrid**: `python3 scripts/retrieve.py "" --top 5` (BM25 over contextually-prefixed chunks → cosine rerank via ollama nomic-embed-text → page-address dedupe).
  - **v1.6 baseline**: `python3 scripts/baseline-v16.py "" --top 5` (mirrors the legacy `hot→index→drill` chain: tokenize query, score each page by distinct-term presence + hot-cache boost + index-cite boost; top-5 by score).
- Scoring:
  - **top-1 success**: top result's path == one of `correct[]`
  - **top-5 success**: any of top-5 paths in `correct[]`
  - **Negative queries** (correct=null): success if no results, or top result in `relevant[]`.
- Runner: `scripts/benchmark-runner.py` (per-query subprocess to both pipelines, tabulates).
- Per-query raw results: `/tmp/benchmark-results.json` (50 queries × 2 pipelines = 100 result sets, with v17 and v16 paths captured for each).

### 6.2 Aggregate results

| Category | N | v1.7 top-1 | v1.7 top-5 | v1.6 top-1 | v1.6 top-5 | Δ top-1 |
|---|---|---|---|---|---|---|
| cross-page | 10 | 30.0% | 80.0% | 30.0% | 50.0% | +0.0pp |
| derived | 25 | **64.0%** | **88.0%** | 12.0% | 28.0% | **+52.0pp** |
| negative | 5 | 40.0% | 80.0% | 40.0% | 80.0% | +0.0pp |
| partial-recall | 5 | 60.0% | 100.0% | 20.0% | 60.0% | **+40.0pp** |
| synonym | 5 | 60.0% | 100.0% | 60.0% | 100.0% | +0.0pp |
| **TOTAL** | **50** | **54.0%** | **88.0%** | **24.0%** | **48.0%** | **+30.0pp** |

### 6.3 Ship-gate verification

Original v1.7 plan §7 (the v2.0 / 1.7.0 phase) specified:

> *Ship gate: `make test` green including new concurrent-write test; 50-query retrieval benchmark (manually curated) shows ≥30% reduction in "wrong page cited" errors vs v1.6 baseline.*

**Result**: PASS.

- v1.6 top-1 errors: 38/50 = 76% wrong
- v1.7 top-1 errors: 23/50 = 46% wrong
- Error reduction: (38 − 23) / 38 = **39.5% reduction** (gate was ≥30%)

The gate passes by a non-trivial margin.

### 6.4 Per-category interpretation

- **Derived (+52pp)**: Hybrid retrieval dominates on natural questions. v1.6 baseline hits 12% top-1 because keyword overlap alone is brittle when page titles use specific terminology (e.g., "DragonScale Memory") and queries use general terminology (e.g., "wiki fold operator"). v1.7's contextual prefix injects page-level vocabulary into every chunk, dramatically improving BM25 recall; rerank then promotes the right page.
- **Partial-recall (+40pp)**: Big win. Fragmented queries ("the dragon curve thing with folds") rely on rerank's semantic understanding. v1.6 can't bridge "dragon curve" → "DragonScale" without exact-token overlap.
- **Synonym (+0pp, tied at 60%)**: Surprising tie. Suggests rerank does NOT add value when both pipelines use similar tokens AND the canonical page has enough natural overlap with the query. Worth flagging as a finding — perhaps the synonym queries weren't synonym-enough, or the contextual prefix actually narrowed the BM25 recall on these specific queries.
- **Cross-page (top-1 +0pp, top-5 +30pp)**: v1.6 and v1.7 tie at 30% top-1, but v1.7 reaches 80% top-5 vs v1.6's 50%. Cross-page synthesis queries have multiple "correct" pages; v1.7 surfaces them in top-5 even when the canonical isn't #1.
- **Negative (+0pp, tied at 40%)**: Both pipelines correctly handle "no answer in vault" 40% of the time. This means v1.7 has a similar false-positive rate to v1.6 on negative queries — it doesn't avoid surfacing irrelevant pages when no answer exists. This is a precision concern worth filing (potential MEDIUM finding for Phase D).

### 6.5 New findings from benchmark

- **MEDIUM (M11 - benchmark)**: Synonym category tied. v1.7's contextual prefix and rerank should beat v1.6 on synonyms, but they didn't. Two possible causes: (1) the synonym test queries weren't actually challenging enough (the canonical page may have used closely-related vocabulary), (2) v1.7 chunking happened to drop the key context. Worth a follow-up analysis post-Phase D.
- **MEDIUM (M12 - benchmark)**: Negative-query precision tied at 40%. Both pipelines surface unrelated pages 60% of the time for "no answer" queries. This is a v1.7 opportunity — the rerank could be tuned to suppress low-confidence top results below a threshold.
- **LOW (L8 - benchmark)**: Cross-page top-1 tied. The hybrid pipeline doesn't pick a clear winner among multiple correct pages. Per-source weighting or ensemble scoring could help in a future v1.7.x.

These findings get folded into the final Phase D ledger.

---

## 7. Market state delta (Phase C — 2026-05-17 vs compass May-16 snapshot)

### 7.1 GitHub star + activity refresh (one-day delta)

| Repo | Compass May 16 | Actual May 17 | Delta | Last push | Last release |
|---|---|---|---|---|---|
| `kepano/obsidian-skills` | 30.5k★ | **31.6k★ (+1.1k)** | growing fast | 2026-05-07 | no recent release tag |
| `logancyang/obsidian-copilot` | ~7k★ | **7.0k★** | flat | 2026-05-16 (active) | — |
| `brianpetro/obsidian-smart-connections` | ~4.4k★ | **5.0k★ (+0.6k)** | growing | 2026-05-14 | 4.5.0 (2026-05-05) |
| `khoj-ai/khoj` | 34k+ | **34.6k★** | matches | 2026-03-26 (~2mo idle) | — |
| `AI-Marketing-Hub/claude-obsidian` (us) | 4.1k★ | 4.1k★ | flat | local-only branch | v1.6.0 |

**Read:** The May 16 compass snapshot largely holds. One material drift: `kepano/obsidian-skills` is growing at ~3.6%/day star rate — substrate dependency validated; the platform-owner's skill set is consolidating its position. Smart Connections is under active development; Khoj has slowed (~2 months between pushes).

### 7.2 Issue / release deltas

**Copilot #2257 (Obsidian CLI integration)** — Still OPEN. Last update 2026-03-06 (3 months stale). 0 comments. **claude-obsidian v1.7 §3.2 shipped exactly what this issue describes.** Genuine competitive moat: we shipped what Copilot has been planning for 3+ months.

**Copilot #2179 (ACP transport)** — Still OPEN. Last update 2026-02-20 (3 months stale). 1 comment. Neither we nor Copilot has shipped. v1.7 explicitly out-of-scope (backlog item #13).

**Smart Connections 4.5.0 (2026-05-05)** — Notable changes:

- "Connections Footer" promoted from Pro to Core (mobile-friendly writing surface). UX win for free users.
- "Substrate Update" — Smart Plugins / unified Smart Environment continuing to land.
- Pro paywall intact for inline discovery, Bases workflows, advanced ranking.
- Bug fixes around transformers embedding GPU/CPU fallback.

No reranker or hybrid retrieval changes in 4.5.0 — they still paywall configurable reranking in Connections Pro. **Our reranker is core (free, MIT). Genuine moat.**

### 7.3 NotebookLM (Google) — MAJOR new shipment

This is the most material competitor finding of Phase C. NotebookLM shipped substantial new features in May 2026 that the compass artifact did NOT capture in full:

**NEW: Video Overviews** — narrated-slide format with AI host pulling images, diagrams, quotes, numbers from sources. First new derivative-artifact format since Audio Overviews.

**NEW: Studio panel redesign** — 4 distinct tiles at the top of the notebook:

1. Audio Overviews (existing, two-host podcast)
2. **Video Overviews** (new May 2026)
3. **Mind Maps** (existing but now a first-class tile)
4. **Reports** (new — replaces/upgrades Briefs)

Multi-task within Studio: listen to Audio while exploring Mind Map while reviewing Study Guide.

**NEW: EPUB upload** as supported source format. (Compass §4 multimodal-ingest signal validated; users want more source types.)

**Implication for claude-obsidian's #1 verdict:** The derivative-outputs gap (compass artifact Gap #3 + backlog items #5, #9, #14) is **WIDER** than the May-16 compass artifact captured. NotebookLM now ships 4 first-class artifact types (Audio, Video, Mind Maps, Reports) plus Study Guides, Briefs, Quizzes, Data Tables. v1.7 ships zero. The deferral of `wiki-derive` to v2.0 was correct as a sequencing call, but the competitive gap is now larger and the v2.0 spec should consider adding Video Overviews (Marp + TTS pipeline) given NotebookLM's new bar.

### 7.4 New findings from Phase C

- **MEDIUM (M13 - market)**: Original `wiki-derive` v2.0 spec (in v1.7 plan §4.1) covers audio, quiz, flashcards, study-guide, brief, slides, mindmap. With NotebookLM's May 2026 Video Overviews shipment, the v2.0 spec should add **video** as a first-class artifact (Marp slides + TTS narration → MP4 via ffmpeg) to maintain parity. File for v2.0 planning.
- **MEDIUM (M14 - market)**: NotebookLM added EPUB upload. Compass artifact §6 already had `adapter-epub.py` planned for v1.9. With NotebookLM also shipping it, this becomes a baseline expectation rather than a differentiator. No action change, just narrative shift.
- **LOW (L9 - market)**: Smart Connections 4.5.0 promoted Footer Connections to Core. Mobile-friendly writing surface is now their free-tier wedge. Doesn't affect us directly (we're terminal-only) but worth noting in #1 verdict scoring on "GUI ergonomics" axis — SC is widening its UX lead.
- **LOW (L10 - market)**: Copilot CLI integration issue #2257 has been stale for 3 months. Genuine competitive moat for claude-obsidian on the CLI-native axis. Worth surfacing in the positioning narrative ("the only Claude+Obsidian stack that's actually CLI-native today").

These get folded into the final Phase D ledger.

### Sources

- [kepano/obsidian-skills (GitHub)](https://github.com/kepano/obsidian-skills)
- [logancyang/obsidian-copilot #2257](https://github.com/logancyang/obsidian-copilot/issues/2257)
- [logancyang/obsidian-copilot #2179](https://github.com/logancyang/obsidian-copilot/issues/2179)
- [brianpetro/obsidian-smart-connections 4.5.0 release](https://github.com/brianpetro/obsidian-smart-connections/releases/tag/4.5.0)
- [khoj-ai/khoj (GitHub)](https://github.com/khoj-ai/khoj)
- [Google: NotebookLM Video Overviews + Studio upgrades](https://blog.google/innovation-and-ai/models-and-research/google-labs/notebooklm-video-overviews-studio-upgrades/)
- [Google Workspace: New ways to customize and interact with NotebookLM (March 2026)](https://workspaceupdates.googleblog.com/2026/03/new-ways-to-customize-and-interact-with-your-content-in-NotebookLM.html)
- [Jeff Su: NotebookLM in 2026 — what changed and what matters](https://www.jeffsu.org/notebooklm-changed-completely-heres-what-matters-in-2026/)

---

## 8. Findings ledger (Phase A — partial; B/C/D may add)

### 8.1 BLOCKER (1)

| # | Finding | File:line | Recommended fix |
|---|---|---|---|
| B1 | `contextual-prefix.py` sends wiki page bodies to Anthropic API automatically whenever `ANTHROPIC_API_KEY` is set. No consent prompt, no flag. Violates the data-egress opt-in precedent set by `tiling-check.py:351-352` (`--allow-remote-ollama`). | `scripts/contextual-prefix.py:252-281`, `scripts/contextual-prefix.py:166-202` (api call) | Add `--allow-egress` flag (default off). Without the flag, fall through `anthropic-api` and `claude-cli` tiers to synthetic. `bin/setup-retrieve.sh` should warn explicitly: "Stage 1 will send N page bodies to . Continue? [y/N]". Document in `skills/wiki-retrieve/SKILL.md` Data Privacy section. |

### 8.2 HIGH (6)

| # | Finding | File:line | Fix |
|---|---|---|---|
| H1 | `bin/setup-retrieve.sh` has no rollback plan if Stage 1 fails partway through. | `bin/setup-retrieve.sh:128-140` | Catch non-zero exit; either resume or document recovery (`rm -rf .vault-meta/chunks//`). |
| H2 | `make clean-test-state` removes v1.6 artifacts but not v1.7 (`chunks/`, `bm25/`, `locks/`, `transport.json`, `embed-cache.json`). | `Makefile:55-61` | Expand `clean-test-state` to match the `.gitignore` v1.7 additions. |
| H3 | `hooks/hooks.json` PostToolUse: the `wiki-lock list` check is in a pipeline ending `\|\| true`. Any error in the check silently degrades to "always commit." | `hooks/hooks.json:34-37` | Restructure: capture the list count in a variable, check explicitly, defer commit on error rather than swallow. |
| H4 | Per-change rigor on §3.3 was insufficient to catch the data-egress gap. Process issue, not a code bug, but it produced one. | n/a | Adopt verifier-agent pattern: dispatch a security-focused review agent at each workstream gate before commit. |
| H5 | `detect-transport.sh` substitutes external command output directly into JSON. `tr -d '"'` doesn't escape backslashes, newlines, control chars. Theoretical break if obsidian-cli emits non-trivial output. | `scripts/detect-transport.sh:79,86` | Pipe through `python3 -c "import json,sys; print(json.dumps(sys.stdin.read().strip()))"` or jq for proper escaping. |
| H6 | `skills/wiki-retrieve/SKILL.md` does not explicitly state in its frontmatter description that tier-1 sends page bodies to Anthropic API. The architecture section implies it; the user-facing description does not. | `skills/wiki-retrieve/SKILL.md:3-6` | Add a Data Privacy callout at the top of the skill body. |

### 8.3 MEDIUM (10)

| # | Finding | File:line |
|---|---|---|
| M1 | §3.2 transport layer net +485 / -0 LOC. Pure addition; no v1.6 cruft pruned. | commit 6c7671e |
| M2 | `bm25-index.py` token regex `[A-Za-z][A-Za-z0-9'\-]*` silently drops non-ASCII content. Multilingual vaults degrade without warning. | `scripts/bm25-index.py:76` |
| M3 | `rerank.py` `--allow-remote-ollama` is wired in `retrieve.py` via `--allow-remote-ollama` forward, but the error path in `rerank.py` blames the user without saying "pass it to retrieve.py instead." | `scripts/rerank.py:91-99` |
| M4 | `wiki-lock.sh` `validate_path` rejects `..` but accepts paths with embedded newlines. Lockfile format would break. | `scripts/wiki-lock.sh:99-108` |
| M5 | `retrieve.py` `import_sibling` doesn't catch `ImportError`/`SyntaxError` — bare traceback for the user. | `scripts/retrieve.py:73-78` |
| M6 | `contextual-prefix.py` empty body edge case: page with only frontmatter logs `chunks=0` silently with no WARN. | `scripts/contextual-prefix.py:284-300` |
| M7 | `rerank.py` `save_cache()` uses blocking `fcntl.LOCK_EX` (no timeout). Could hang on a non-flock-capable filesystem (network mount). | `scripts/rerank.py:130-146` |
| M8 | Test coverage gap: `test_retrieve.py` doesn't exercise `--explain` or `--no-rerank` flag paths. | `tests/test_retrieve.py` |
| M9 | 4 skills (`wiki-ingest`, `wiki-query`, `save`, `autoresearch`) touched by both §3.2 and §3.4. Bounded-slices kernel partial. | commits 6c7671e + 66c11f9 |
| M10 | No verifier agents dispatched per-workstream during v1.7 development. This audit is the missing verifier pass. | process |

### 8.4 LOW (7)

| # | Finding | File:line |
|---|---|---|
| L1 | §3.1 substrate rewrite +17/-5. No deletion when "soft-defer→hard-prefer" arguably allowed pruning local fallback bodies. Documented + defensible, but flag. | commit 9c8e510 |
| L2 | `bin/setup-retrieve.sh` no timeout on Stage 1. Tier-2 (claude-cli) × 47 pages can take 5+ min. No progress indicator. | `bin/setup-retrieve.sh:128` |
| L3 | `bm25-index.py` has a dead `bm25_score()` function (27 lines, never called; comments say "placeholder"). | `scripts/bm25-index.py:196-223` |
| L4 | `--rebuild` flag on `bm25-index.py build` accepted but no-op. Documented as reserved for incremental mode (not in v1.7). Speculative complexity per kernel. | `scripts/bm25-index.py:279` |
| L5 | `--no-bm25` flag on `retrieve.py` accepted but returns EXIT_USAGE. Stub for future vector-only mode. | `scripts/retrieve.py:96-106` |
| L6 | `wiki-lock.sh` naming: `STALE_AFTER_SEC=60` (per-acquire) vs `clear-stale --max-age 3600` (admin) — both age thresholds but different concerns. Confusing for new reader. | `scripts/wiki-lock.sh:53,304` |
| L7 | BM25 divide-by-zero in `query()` is theoretically possible if `avg_dl == 0`. Verified: unreachable in practice (vocab is empty when all dl=0, so the divide path is never taken). Worth a defensive `or 1.0` guard anyway. | `scripts/bm25-index.py:249` |

### 8.5 Counts

- BLOCKER: 1
- HIGH: 6
- MEDIUM: 10 (revised from 8 to include M9, M10 from agent kernel section)
- LOW: 7 (revised from 5)
- **Total Phase A findings: 24** (Plan §1 expected 15-30. Within range.)

---

## 9. #1-best-ever verdict (Phase D)

Per-axis evaluation. Each axis: Y/N/Tie + evidence + gap-closer (if not yet #1).

| # | Axis | #1? | Evidence (verified) | Gap-closer (if not #1) |
|---|---|---|---|---|
| 1 | **Compounding wiki primitive** (Karpathy pattern, persistent vault, hot/index/log cadence) | **YES** | Karpathy pattern is rare in production. Only us + `ScrapingArt/Karpathy-LLM-Wiki-Stack` (build-ready reference, not a runtime) + Kompl (Apache-2.0, MCP-native) ship it. We have the most complete implementation: 13 skills, DragonScale extension, multi-agent support, 8-category lint. | n/a — we lead this axis structurally. |
| 2 | **Multi-writer safety** (per-file advisory locking, race-free parallel ingest) | **YES** | Verified unique vs Smart Connections (no locking), Copilot (no locking), Khoj (cloud-managed), NotebookLM (single-user surface). v1.7 ships `scripts/wiki-lock.sh` (~244 lines, age-based + atomic noclobber) as core. Benchmark `tests/test_concurrent_write.sh` proves 10 parallel workers, zero data loss. | n/a — closed the v1.6 latent bug; no competitor has caught up. |
| 3 | **Retrieval architecture** (contextual + hybrid BM25 + cosine rerank) | **YES** (free tier) / **TIED** (paid tier) | We ship contextual prefix + BM25 + cosine rerank as MIT core. **Benchmark: +39.5% error reduction vs v1.6 baseline; +30pp top-1 accuracy across 50 queries; +52pp on derived natural questions.** Smart Connections Pro paywalls configurable reranking. Copilot v3 has lexical fallback only — no rerank. Khoj uses pgvector but no documented reranker. NotebookLM doesn't expose retrieval primitives. | None on free axis. SC Pro is comparable on paid axis but we are also MIT — no acquisition cost. |
| 4 | **GUI / install ergonomics** | **NO** | We are CLI-only: requires Claude Code install + plugin marketplace add + vault clone + (optional) `bash bin/setup-retrieve.sh`. Smart Connections and Copilot ship as one-click Community Plugins. Claudian and deivid11/obsidian-claude-code-plugin offer in-vault Claude integration with GUI panels. SC 4.5.0 just promoted Footer Connections to Core (mobile-friendly). Our adoption surface is materially worse for non-developers. | **v2.5+ GUI plugin shell** (backlog #7, L-effort) closes the gap by wrapping the 13 skills in an Obsidian-native plugin. OR accept that claude-obsidian permanently serves a power-user niche. |
| 5 | **Derivative outputs** (audio, video, study guides, quizzes, mindmaps, briefs) | **NO** | We have zero. **NotebookLM (May 2026) ships 4 first-class tile types: Audio Overviews, Video Overviews, Mind Maps, Reports.** Plus existing Study Guides, Briefs, Quizzes, Data Tables. Copilot ships YouTube ingest + mind maps. Atlas Workspace ships mindmap synthesis. ElevenLabs GenFM + Nouswise ship two-host audio. The gap is widening (Video Overviews shipped after the compass artifact's snapshot). | **v2.0 `wiki-derive` skill** (backlog #5, #9, #14) brings parity on text + audio. Video parity requires expanding the v2.0 spec to include Marp slides + TTS narration → ffmpeg MP4 pipeline (new finding **M13**). Even with v2.0 shipped, NotebookLM's tight integration with Gemini 3 + Studio multi-tasking surface is a sustained-investment moat. |
| 6 | **Methodology support** (LYT/PARA/Zettelkasten/Generic modes) | **TIE** | We have none. Nobody else has either. Ideaverse Pro 2.0 ($200 paid vault) ships LYT as an opinionated structure, but it's a vault, not a skill set. PARA, Zettelkasten, generic modes: no Claude+Obsidian competitor ships these as first-class. | **v1.8 `wiki-mode` skill** (backlog #6, M-effort) closes the tie into a LEAD. Power-user PKM segment is unserved by competitors today. |
| 7 | **License / openness** (MIT, no paid features in core) | **YES** | MIT-licensed across all 13 skills + 9 scripts + 7 tests. Even the reranker is core (no Pro tier). Smart Connections paywalls advanced ranking, Bases workflows, inline discovery in Connections Pro. Copilot Plus paywalls Miyo file conversions, long-term memory, license-gated models. Khoj has cloud tier. NotebookLM Plus is $20/mo. We are structurally the most open.
| n/a — Pro tier (v3+) remains explicitly deferred; license stance holds. | ### 9.1 Summary verdict **We are #1 on 4 of 7 axes** (compounding wiki, multi-writer safety, retrieval-architecture-free-tier, license/openness). **TIED on 1** (methodology — nobody serves it). **NOT #1 on 2** (GUI ergonomics, derivative outputs). **Roadmap effect** (assuming current backlog ships as planned): - **v1.8** (methodology modes + reviews) → converts the methodology TIE into a 5th LEAD. We lead on **5 of 7 axes**. - **v2.0** (derive: audio + quiz + study + slides + mindmap, plus the new M13 video addition) → brings derivative outputs from NO to **PARTIAL** (within striking distance of NotebookLM on text+audio; behind on video integration polish). Likely a TIE rather than a LEAD. - **v2.5+** (GUI plugin shell) → converts the GUI/install NO to a TIE-or-LEAD depending on shell quality. **Honest "is the repo #1 best ever?" answer**: NOT YET, AND NOT WITHOUT v2.0+. v1.7 makes the technical refoundation that puts category leadership in reach. v1.8 is the cheapest 5th lead. v2.0 is necessary for parity with NotebookLM on the consumer adoption axis. v2.5+ GUI shell is necessary to reach the mainstream Obsidian user base (vs the current power-user niche). **What v1.7 ALREADY makes us #1 on, that nobody else can match in the short term:** - The compounding-wiki primitive (years-of-context advantage for adopters) - Multi-writer safety (genuinely unique architecture) - Hybrid retrieval as free/MIT (SC Pro is the only paid match; nobody else has it) - License openness (structural moat) That's enough to credibly claim **"#1 on the axes that matter for sophisticated power users who control their own LLM stack."** It's NOT enough to claim "#1 best ever, full stop" — that requires GUI ergonomics + derivative outputs to land. ### 9.2 Calibrated confidence The benchmark (Phase B) gives high confidence on axis 3 (retrieval). 
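
The error-reduction figure behind the axis 3 claim can be reproduced from the top-1 accuracies reported in this audit. The sketch below assumes "error reduction" means the relative drop in top-1 error rate against the v1.6 baseline; that definition is an inference from the reported numbers (it yields the +39.5% and +41% figures), not something the benchmark script is shown to implement. The function name `error_reduction` is illustrative.

```python
def error_reduction(baseline_acc: float, new_acc: float) -> float:
    """Relative reduction in error rate, as a fraction of the baseline error.

    Assumed definition: (baseline_error - new_error) / baseline_error,
    where error = 1 - top-1 accuracy.
    """
    baseline_err = 1.0 - baseline_acc
    new_err = 1.0 - new_acc
    if baseline_err <= 0.0:
        raise ValueError("baseline has no error left to reduce")
    return (baseline_err - new_err) / baseline_err


if __name__ == "__main__":
    # v1.7.0 audit run: top-1 accuracy 24% -> 54%
    print(f"{error_reduction(0.24, 0.54):.1%}")  # 39.5%
    # v1.7.2 refresh: top-1 accuracy 22% -> 54%
    print(f"{error_reduction(0.22, 0.54):.1%}")  # 41.0%
```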
Independent agent reviews + main-thread verification (Phase A) give high confidence on axes 1, 2, 7. Axis 4 (GUI) is structural — easy to verify by looking at competitor install surfaces. Axis 5 (derivatives) is verified against May 2026 NotebookLM data. Axis 6 (methodology) is a true tie — no competitor is verified as shipping LYT/PARA/Zettel modes.

Overall verdict confidence: **HIGH**. The verdict is earned by evidence, not asserted.

---

## 10. Prioritized punch list (Phase D)

Every finding from §3, §4, §6, §7 mapped to a target milestone. Items within each milestone are ordered by estimated effort (S/M/L) and dependency (independent first).

### 10.1 Push-blocker (must fix before any public push)

| # | Finding | Effort | Notes | Status |
|---|---|---|---|---|
| B1 | `contextual-prefix.py` data egress without consent | S (~1h) | Add `--allow-egress` flag default-off; mirror the `tiling-check.py:351-352` `--allow-remote-ollama` precedent. `bin/setup-retrieve.sh` adds a "Continue? [y/N]" prompt before Stage 1 if any non-synthetic tier is selected. Document in `skills/wiki-retrieve/SKILL.md` Data Privacy callout (closes H6). | **FIXED in v1.7.1 commit `ca68bb6`** |

### 10.2 v1.7.1 patch (within 1 week of push)

| # | Finding | Effort | Status |
|---|---|---|---|
| H1 | `bin/setup-retrieve.sh` no rollback if Stage 1 fails partway | S (~30min) — catch non-zero from contextual-prefix.py; print recovery hint | **FIXED in v1.7.1 commit `4837d4f`** |
| H2 | `make clean-test-state` doesn't remove v1.7 artifacts | S (~10min) — extend the rm pattern to match v1.7 gitignore additions | **FIXED in v1.7.1 commit `7e1f187`** |
| H3 | `hooks/hooks.json` PostToolUse `\|\| true` swallows lock-check errors | S (~30min) — restructure to test exit code explicitly | **FIXED in v1.7.1 commit `7120970`** |
| H4 | Process gap: no verifier-agent pass at workstream gates | M — process change, not a code fix; document a `superpowers:verification-before-completion` checkpoint in `agents/` for future releases | **FIXED in v1.7.1 commit `3ea443f` (new `agents/verifier.md` + CLAUDE.md reference)** |
| H5 | `detect-transport.sh` JSON escaping via shell substitution | S (~20min) — pipe through python3 json.dumps | **FIXED in v1.7.1 commit `722ac97`** |
| H6 | `skills/wiki-retrieve/SKILL.md` doesn't document data egress | S (~10min) — Data Privacy callout (bundle with B1 fix) | **FIXED in v1.7.1 commit `ca68bb6`** (bundled with B1) |

Total v1.7.1 effort: ~2.5 hours focused work. Recommend a single fix-and-test session, push v1.7.1 instead of v1.7.0.

**v1.7.1 execution closeout (2026-05-17)**:

- 6 commits landed on `v1.7.0-compound-vault`: `ca68bb6`, `4837d4f`, `7e1f187`, `7120970`, `722ac97`, `3ea443f` (in execution order).
- All 7 findings (1 BLOCKER + 6 HIGH) closed.
- `make test` 7 suites green after each commit; final run also green.
- `bash bin/setup-retrieve.sh --no-llm` end-to-end re-provisioned cleanly post-fixes.
- Version bumped to 1.7.1 in `.claude-plugin/plugin.json` + `.claude-plugin/marketplace.json`; `CHANGELOG.md` entry added.
- Branch remains local-only; no push, no tag.
Awaiting user authorization to push + tag `v1.7.1`.

**Post-fix self-audit (2026-05-17, same session)**: a re-pass with the new `agents/verifier.md` against the v1.7.1 slice surfaced 2 MEDIUM + 3 LOW polish items (none functional). All 5 closed in a single follow-up commit, with verifier re-pass returning 0/0/0/0 and SHIP verdict. See `## Polish` block in the [1.7.1] CHANGELOG entry for per-file detail. The hook breadcrumb path (`.vault-meta/hook.log`) was empirically verified under 10× parallel hook fires (atomic appends; no interleaving) and format-string-injection probe (printf uses literal format with %s placeholders only).

**Second self-audit round (chair adversarial probe, same session)**: the user challenged the 100/100 self-grade. A deeper chair-led probe surfaced three real items the verifier missed: (a) `.vault-meta/hook.log` was not in `.gitignore`, creating a self-pollution loop where the breadcrumb file would be auto-staged by the same hook that wrote it; (b) `CLI_VERSION_RAW` was not in the top-of-script init block in `detect-transport.sh`, working today only by bash short-circuit semantics under `set -u`; (c) `verifier.md` `tools:` was converted to YAML list in P2, but the in-repo precedent (`wiki-ingest.md`, `wiki-lint.md`) and the canonical form across `~/.claude/agents/` is CSV — the polish introduced a single-file style outlier. All three closed in a follow-up commit. Lesson: even verifier-validated SHIP slices benefit from a third pass of adversarial chair scrutiny; the agent kernel's "explorers map, workers implement, verifiers gate" still leaves the chair as the final accountability layer.

**v1.7.2 + v1.8.0 plan execution (same session)**: the user further requested "best ever per priority research." Plan written at [v1.7.2-sss-plus-plan.md](v1.7.2-sss-plus-plan.md) with acceptance criteria + 6h hard cap + 2-round verify-fix cap.
Phase 2 (LOC pruning) honest outcome: pruned 43 LOC of dead code (closing L3/L4/L5) but the `main..HEAD` net delta is `+6009 / -30`, NOT meeting the plan's `≤+5000 OR ≥-200` criterion. Per the plan §4 failure-mode clause: "Do not invent prunes to game the metric." Honest decomposition: ~5500 LOC across new files alone (4 new scripts + 4 new tests + 2 new skills + 1 new agent + 1 new bin + ~2200 LOC docs). The +6009 IS the substrate; v1.6 had no equivalent of a retrieval pipeline, lock primitive, transport detector, or contextual prefix generator to delete. The kernel principle "delete more than you add" presumes refactor or maintenance; v1.7 was net-new feature substrate. **The kernel-application axis honestly tops out at ~92-95** for this release, not 100; the deduction is structural to building substrate, not negligence.

**v1.7.2 closure status (2026-05-17, end of v1.7 line audit-debt remediation)**:

- BLOCKER: **1/1 closed** (v1.7.1 `ca68bb6`)
- HIGH: **6/6 closed** (v1.7.1 `ca68bb6`, `4837d4f`, `7e1f187`, `7120970`, `722ac97`, `3ea443f`)
- MEDIUM: **10/10 addressed**: M1 documented as irreducible; M2 closed `8c219fb`; M3-M7 closed `d0db354`; M8 closed `a80ae61`; M9 documented as process-defer; M10 closed by v1.7.1 H4 `3ea443f`; M11 still open (synonym tied 60/60, filed for v1.7.x rerank tuning); M12 empirically closed (was tied 40/40 in v1.7.0, now 40/20 after Unicode tokenizer change in `8c219fb`)
- LOW: **7/7 addressed**: L1 documented as process-defer; L2 closed `59cd7c8`; L3-L5 closed `eafd449`; L6 closed `59cd7c8`; L7 closed `59cd7c8`
- v1.7.2 benchmark refresh (full 50 queries): v17 top-1 54.0% / top-5 88.0% vs v16 22.0% / 44.0%. Δ top-1 +32pp, error-reduction +41% (ship gate ≥30%, PASS). Slightly beats v1.7.0 audit's +30pp/+39.5% measurement.
- Version bumped to 1.7.2 in `.claude-plugin/plugin.json` + `marketplace.json`; CHANGELOG `[1.7.2]` entry comprehensive.
- v1.7 line audit-debt is now CLOSED-or-formally-DEFERRED. v1.8.0 (methodology modes) is the next scope per the user's "best ever per priority research" goal.

### 10.3 v1.7.x (defer to next minor; file as issues)

| # | Finding | Notes |
|---|---|---|
| M1 | §3.2 net +485/-0 LOC; no v1.6 cruft pruned | Document or prune; low-impact |
| M2 | `bm25-index.py` non-ASCII tokenization silently drops content | Document as known limitation; add Unicode-aware tokenizer in v1.7.x |
| M3 | `rerank.py --allow-remote-ollama` error message blames user incorrectly | Improve error to mention forwarding from retrieve.py |
| M4 | `wiki-lock.sh validate_path` accepts paths with newlines | Add `case "$p" in *$'\n'*) die "newlines" 4 ;;` |
| M5 | `retrieve.py import_sibling` doesn't catch ImportError | Wrap in try/except with user-friendly error |
| M6 | `contextual-prefix.py` empty-body edge case is silent | Add WARN log |
| M7 | `rerank.py save_cache()` blocks indefinitely on non-flock filesystem | Add LOCK_NB + retry with timeout |
| M8 | `test_retrieve.py` missing --explain and --no-rerank coverage | Add 2 test cases |
| M9 | Bounded-slices: 4 skills touched by both §3.2 and §3.4 | Process note for future releases; not a bug |
| M10 | No verifier agents during v1.7 dev | Same as H4 process item |
| M11 | Synonym category benchmark tied (60% both pipelines) | Investigate why rerank didn't help; tune in v1.7.x or document |
| M12 | Negative-query precision tied at 40% | Tune rerank to suppress low-confidence top results below threshold |
| L7 | BM25 divide-by-zero in `query()` is theoretically reachable | Defensive `or 1.0` guard |
| L8 | Cross-page top-1 tied at 30% | Per-source weighting or ensemble scoring; v1.7.x optimization |

### 10.4 v1.8 (methodology modes + reviews — already in roadmap)

- Backlog item #6 (`wiki-mode`): LYT / PARA / Zettelkasten / Generic. Closes methodology TIE into 5th LEAD per §9 verdict.
- Backlog item #11 (`wiki-review`): PARA-aware weekly/monthly/quarterly reviews.
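
As a side note on finding M2 in §10.3 (non-ASCII tokenization silently dropping content), the following is a minimal sketch of what a Unicode-aware tokenizer could look like. It assumes the content loss came from matching only ASCII letters and digits; the audit does not show the actual pattern used in `bm25-index.py`, so `tokenize_ascii` is a hypothetical stand-in for the old behaviour and `tokenize_unicode` is an illustrative replacement, not the code that landed in `8c219fb`.

```python
import re
import unicodedata

# Hypothetical stand-in for an ASCII-only tokenizer (assumed cause of M2).
ASCII_TOKEN = re.compile(r"[a-z0-9]+")

# Unicode-aware: runs of letters/digits in any script; "[^\W_]" is "\w" minus "_".
UNICODE_TOKEN = re.compile(r"[^\W_]+")


def tokenize_ascii(text: str) -> list[str]:
    """Lowercase and keep only ASCII alphanumeric runs (drops accented/CJK text)."""
    return ASCII_TOKEN.findall(text.lower())


def tokenize_unicode(text: str) -> list[str]:
    """Normalize to NFKC, casefold, then keep alphanumeric runs in any script."""
    normalized = unicodedata.normalize("NFKC", text).casefold()
    return UNICODE_TOKEN.findall(normalized)


if __name__ == "__main__":
    sample = "Café Zürich 日本語"
    print(tokenize_ascii(sample))    # ['caf', 'z', 'rich']
    print(tokenize_unicode(sample))  # ['café', 'zürich', '日本語']
```

Python's `re` module treats `\w` as Unicode-aware for `str` patterns by default, so no extra flag is needed. Note that a regex split like this keeps unsegmented scripts such as Japanese as one token per run; real word segmentation would need a dedicated library.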

### 10.5 v1.9 (multimodal ingest — already in roadmap)

- Backlog item #12 (YouTube/PDF/audio/image ingest).
- Backlog item #8 (NotebookLM/Readwise/Zotero adapters).
- M14 (new): EPUB upload is now table-stakes per NotebookLM May 2026; ensure `adapter-epub.py` is on the v1.9 list.

### 10.6 v2.0 (derive — already in roadmap, scope adjusted)

- Backlog item #5 (audio).
- Backlog items #9 + #14 (quiz, flashcards, study-guide, brief, slides, mindmap).
- **NEW (M13)**: Add **Video Overviews** to v2.0 `wiki-derive` spec — Marp slides + TTS narration → ffmpeg MP4. Required for NotebookLM parity per Phase C findings.

### 10.7 v2.5+ (GUI onramp — major effort)

- Backlog item #7: Obsidian-plugin shell. Fork Claudian or deivid11/obsidian-claude-code-plugin pattern. Wraps the 13 skills in an in-vault GUI. L-effort. Closes §9 axis #4 gap.

### 10.8 Polish PR (bundle before v1.8)

| # | Finding | Why |
|---|---|---|
| L1 | §3.1 substrate rewrite +17/-5 (no deletion) | Documented + defensible; flag for posterity |
| L2 | `bin/setup-retrieve.sh` no Stage 1 timeout | Add progress indicator + timeout |
| L3 | `bm25-index.py` dead `bm25_score()` function | Delete 27 unused lines |
| L4 | `--rebuild` flag on bm25-index.py is no-op | Decide: implement incremental, or remove flag |
| L5 | `--no-bm25` flag on retrieve.py is no-op | Decide: implement vector-only, or remove |
| L6 | `wiki-lock.sh` STALE_AFTER_SEC vs --max-age naming | Rename for clarity |
| L9 | SC 4.5.0 Footer Connections promoted to Core (UX widening) | Narrative note for positioning copy; we don't directly compete |
| L10 | Copilot CLI integration issue stale 3 months | Surface in positioning: "the only Claude+Obsidian stack that's actually CLI-native today" |

### 10.9 Finding counts

| Tier | Phase A | Phase B | Phase C | Total |
|---|---|---|---|---|
| BLOCKER | 1 | 0 | 0 | **1** |
| HIGH | 6 | 0 | 0 | **6** |
| MEDIUM | 10 | 2 (M11, M12) | 2 (M13, M14) | **14** |
| LOW | 7 | 1 (L8) | 2 (L9, L10) | **10** |
| **Total** | **24** | **3** | **4** | **31** |

Plan §1 expected 15-30. **31** is slightly over because Phases B + C surfaced unforeseen findings (the benchmark exposed the synonym/negative ties; the market recheck exposed the NotebookLM Video Overviews expansion). Reasonable overage; nothing was filed at higher severity than evidence supports.

---

## Appendix A — 50-query benchmark corpus (Phase B — PENDING)

---

## Appendix B — Per-commit six-cut walkthrough

Already inline at §3.2; expand here if user wants per-file evidence captures.

---

## Appendix C — Raw competitor responses (Phase C — PENDING)