Files
MultiPhysicsVault/wiki/concepts/DragonScale Memory.md
T
김경종 72dad72703
Tests / Hermetic test suite (push) Has been cancelled
Tests / Skill frontmatter validation (push) Has been cancelled
add claude-obsidian
2026-05-28 10:57:16 +09:00

296 lines
21 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
---
type: concept
title: "DragonScale Memory"
address: c-000001
complexity: advanced
domain: knowledge-management
aliases:
- "DragonScale"
- "DragonScale Architecture"
- "Fractal Memory"
created: 2026-04-23
updated: 2026-04-24
tags:
- concept
- knowledge-management
- memory
- architecture
- fractal
status: proposed
related:
- "[[LLM Wiki Pattern]]"
- "[[Compounding Knowledge]]"
- "[[Hot Cache]]"
- "[[concepts/_index]]"
sources:
---
# DragonScale Memory
A memory-layer design for LLM wiki vaults, inspired by the Heighway dragon curve. Four mechanisms (fold operator, deterministic page addresses, semantic tiling, boundary-first autoresearch) give an LLM-maintained wiki a principled way to grow, compact, and stay coherent. The dragon curve is a design-justification device, not a reasoning architecture.
> **Status: v0.4 2026-04-24.** All four mechanisms shipped as opt-in features. Phase 0 (spec) + Phase 1 (wiki-fold skill, dry-run verified) + Phase 2 (address MVP) + Phase 3 (semantic tiling) + Phase 3.5/3.6 (hardening) + Phase 4 (boundary-first autoresearch). See Review History for the progression.
---
## Scope
DragonScale is a **memory architecture**: it governs how a wiki grows, compacts, addresses its pages, and checks for duplicates. It is **not a search, planning, or reasoning algorithm.** Agent reasoning uses existing patterns (Tree of Thoughts with BFS/DFS/beam search; Yao et al. 2023).
**Honest disclaimer**: memory-layer choices are never neutral with respect to reasoning. What the vault surfaces, and in what order, shapes what the model sees. Long-context performance is position-sensitive (Liu et al. 2023, *Lost in the Middle*), and MemGPT's premise is that paging policy affects task success (Packer et al. 2023). One of the four mechanisms below (boundary-first autoresearch) explicitly crosses into agenda control; it is included deliberately and marked as such.
---
## The Core Analogy
Four dragon-curve properties map onto memory-system patterns already validated in adjacent fields. The word is *analogue*, not *identity*.
| Dragon curve property | Memory analogue | Strength of analogy |
|---|---|---|
| Paper-folding recursion: `D_{n+1} = D_n · R · swap(reverse(D_n))` | Hierarchical rollup / materialized summary with exponential fanout | Loose. Shares exponential batch structure, not compaction semantics. |
| Turn derivable from bits of `n` (regular paperfolding sequence, OEIS A014577) | Deterministic page addresses as organizational convention (MVP is a creation-order counter, not a true content hash) | Loose. Deterministic addressing is useful independent of the dragon. |
| Tiling / no self-intersection | Canonical-home coverage: one concept, one page | Medium. Dedup lint enforces this mechanically. |
| Boundary dim ≈ 1.523627 vs interior dim 2 | Agent attention weighted toward frontier pages | Aesthetic. The fractal dimension number does no load-bearing work. |
The curve is useful for deciding *which knobs to tighten and why*, not as a math proof that any given mechanism is optimal.
---
## Mechanism 1 — Fold Operator
After a batch of ingests, run a fold: produce a meta-page summarizing the batch, link children back, update the index. Folds stack: after enough level-`k` folds accumulate, a level-`k+1` fold produces a super-summary.
This is a **hierarchical rollup**, loosely similar to LSM-tree compaction but with important differences.
**What it shares with LSM compaction:**
- Exponential batch fanout across levels (like LevelDB's fixed level-size ratio, typically 10× per level in leveled mode)
- Periodic consolidation rather than per-write work
**What it does NOT inherit from LSM:**
- No sorted-key semantics (pages have semantic, not key-ordered, identity)
- No SSTable/memtable distinction, no tombstones, no Bloom filters
- No write-amplification arithmetic; no read-path acceleration
- **Folds are additive**: children remain in place. LSM compaction rewrites and deletes. A DragonScale fold is closer to a materialized view than a compaction.
**Trigger options:**
- `2^k` entry count (k=4 ⇒ every 16 log entries). Simple to implement; straightforward level math; ignores page size and novelty.
- **Adaptive trigger (preferred for production)**: token budget (e.g., fold when unfolded batch exceeds N tokens), novelty score (average embedding distance from existing summaries), or staleness age (last fold > T days). Phase 1 will implement entry-count for MVP; adaptive triggers are a follow-up.
**Invariants:**
- Idempotent on the same range (re-running is a no-op).
- Reversible (children stay; a fold is additive).
- Level-bounded: with entry-count trigger `2^k`, fold depth is at most `⌈log₂(N)⌉` above leaf pages. Derived, not empirical.
---
## Mechanism 2 — Deterministic Page Addresses
Every new page gets a stable `address` field in frontmatter. The Phase 2 MVP uses a simple creation-order counter:
```yaml
address: c-000042
```
Format: `c-<6-digit-counter>`. `c-` means "creation-order counter." Zero-padded.
**Future extension** (documented, not shipped in Phase 2):
- Fold-relative path: `f1.2/c-000042` once folds exist, where `f1.2` encodes the fold-tree lineage.
- Content hash suffix: `c-000042:h7f3c2` once the hash-rotation policy is decided.
**What Phase 2 MVP gives:**
- Uniqueness: counter is monotonically increasing; deleted pages' addresses are retired, never reused.
- Stability: never changes across content edits.
- Determinism: derivable from the counter state at `.vault-meta/address-counter.txt`.
- Ordering: preserves creation sequence.
**What this does NOT give (renamed "content-addressable paths" was misleading in v0.1):**
- **No content-addressability in the MVP.** The Phase 2 address is a sequence counter, not a content hash. Renaming this mechanism from "content-addressable paths" to "deterministic page addresses" is more honest about what actually ships.
- **No prompt cache benefit** (already corrected in v0.1 → v0.2). Per Anthropic docs, cache hits require byte-identical prefixes; an address field in frontmatter only helps if the frontmatter itself is inside a cached block AND stays byte-identical. Stable prefixes, not addresses, drive cache hits.
**Phase 2 exclusions** (all deferred):
- Backfill of legacy pre-Phase-2 pages (will use `l-` prefix with its own counter).
- Fold-ancestry bit prefix (requires committed folds from a future fold-of-folds skill).
- Content hash suffix (rotation policy unresolved; see limitations).
**Implementation** (Phase 2, shipped):
- `scripts/allocate-address.sh`: flock-guarded atomic allocator. All counter reads/writes go through this script; direct Write/Edit on `.vault-meta/address-counter.txt` is prohibited (would fire PostToolUse hook).
- `skills/wiki-ingest/SKILL.md` → Address Assignment section: opt-in feature detection; delegates allocation to the helper; records path-to-address mapping in `.raw/.manifest.json` `address_map` for re-ingest stability.
- `skills/wiki-lint/SKILL.md` → Address Validation section: format check, uniqueness check, counter-drift check, address-map consistency check.
**Lint severity model** (matches `skills/wiki-lint/SKILL.md` Address Validation behavior):
- Post-rollout pages (frontmatter `created:` >= 2026-04-23, or any page newly created after DragonScale adoption) that lack an address are **errors**. This is the silent-regression guard.
- Legacy pages (`created:` < 2026-04-23) without addresses are **informational**. The optional `.vault-meta/legacy-pages.txt` manifest can grandfather pages whose `created:` metadata is wrong or missing.
- Meta pages (`_index.md`, `index.md`, `log.md`, `hot.md`, etc.) and fold pages are excluded entirely.
---
## Mechanism 3 — Semantic Tiling Lint
The tiling property says the same concept should live in one canonical page. Enforce it with an embedding-based dedup check in `wiki-lint`.
**Procedure (calibrated, not a guess):**
1. Compute embeddings for every page. Default model: local `nomic-embed-text` via ollama on `http://127.0.0.1:11434`. Cost: local hardware time only (no API fees). The script supports a remote override under `--allow-remote-ollama`; remote endpoints may incur provider API fees.
2. Compute pairwise cosine similarities for all page pairs.
3. **Calibration** (one-time, before first use): label 50-100 in-vault page pairs as duplicate/near/distinct; find the thresholds that optimize target precision for each band.
4. **Default bands** (used before calibration, then refined):
- `≥ 0.90` — near-duplicate, lint error
- `0.80 0.90` — review bucket, lint warning
- `< 0.80` — distinct, no flag
5. Never auto-merge. Output a review list.
**Why not a fixed 0.85?** v0.1 used 0.85 with no justification. Published thresholds in the embeddings literature span a wide range (Sentence Transformers' `community_detection` defaults to 0.75; Quora-duplicate calibrations land around 0.770.83; sparse-model defaults differ again). Thresholds are model-, corpus-, and objective-dependent, so calibration is required.
---
## Mechanism 4 — Boundary-First Autoresearch
> **Status: shipped (Phase 4, opt-in)** as of 2026-04-24. Implementation: `scripts/boundary-score.py`. Integration: `skills/autoresearch/SKILL.md` Topic Selection section B. Tests: `tests/test_boundary_score.py`.
Boundary pages (high out-degree relative to in-degree, recency-weighted) are the vault's frontier. `/autoresearch` invoked without a topic reads the top-5 boundary pages and offers them as research candidates; the user selects one (or types a free-text topic, or declines all and falls back to the original ask-user mode).
**Formula (exact)**:
```
out_degree(p) = count of distinct filename-stem wikilinks in body of p that resolve to scoreable pages
in_degree(p) = count of distinct scoreable pages whose body contains a wikilink to p
recency_weight(p) = exp(-days_since_updated / 30) # no floor; old pages approach 0
boundary_score(p) = (out_degree - in_degree) * recency_weight
```
**Link resolution**: filename-stem only. `[[Foo]]` resolves to `Foo.md` anywhere in the vault. Aliases declared via frontmatter `aliases:` are NOT parsed. Folder-qualified links (e.g. `[[notes/Foo]]`) are resolved by stem alone. This matches Obsidian's default behavior for unique filenames but does not implement full alias resolution.
**Scoreable** = any page NOT excluded by any of:
- frontmatter `type: meta` or `type: fold`
- filename in `{_index.md, index.md, log.md, hot.md, overview.md, dashboard.md, Wiki Map.md, getting-started.md}`
- path prefix in `wiki/folds/` or `wiki/meta/`
- symlinks or paths whose resolved target escapes the vault root (rejected at scan time)
**Code-block filtering**: triple-backtick AND triple-tilde fenced code blocks are skipped, with CommonMark-like length tracking so a longer opening fence is not closed by a shorter inner fence. Indented code blocks (4+ spaces) are NOT filtered because Obsidian bullet lists commonly use 4-space indentation and contain real wikilinks. See `scripts/boundary-score.py:RECENCY_HALFLIFE_DAYS` for the sole tunable constant.
**Honest labeling**: this mechanism is **agenda control**, not pure memory. It shapes what the agent researches next. It is included in DragonScale because it is a direct consequence of the dragon-curve boundary analogy, and because it pairs naturally with folds (freshly folded pages have low out-degree; frontier pages are pre-fold). But the "memory only, not reasoning" framing does not cover it. Users who want a strict memory-layer subset should omit this mechanism (simply do not invoke `/autoresearch` without a topic, or do not set up `scripts/boundary-score.py`).
**What is NOT included**:
- No auto-triggering. `/autoresearch` is still user-invoked.
- No persistent boundary-score cache. Scoring is O(N * avg_links) and runs on every invocation from fresh wiki/ state.
- No integration with folds or addresses. Pure graph analysis on the wikilink graph.
- No automatic topic selection without user confirmation. The helper presents choices; the user picks.
---
## Operational Policies (required before implementation)
Adversarial review flagged these gaps in v0.1. Each must be decided before the corresponding phase ships.
| Policy | Phase 0 position | Decision point |
|---|---|---|
| **Retention / GC** | No automatic deletion. Pages are permanent. | Revisit if vault exceeds ~5000 pages. |
| **Tombstones** | None. Deleted pages are removed via git revert. | Revisit if delete events become common. |
| **Versioning** | Relied on git history, not in-vault versioning. | Address-hash rotation policy doubles as a coarse version signal. |
| **Conflict resolution for contradictory folds** | Meta-page must quote both sources with explicit "conflict" callout. No automatic resolution. | Phase 1 spec required. |
| **Concurrency / atomicity** | Single-writer assumption (one Claude session at a time). PostToolUse auto-commit serializes. | Multi-writer case deferred. |
| **Provenance for meta-pages** | Every fold page must include frontmatter listing children and fold level. | Phase 1 must enforce. |
| **Access control** | Out of scope. This is a single-user vault. | Revisit only if shared. |
---
## Mapping to Claude-Obsidian
| Mechanism | Status | New | Extends |
|---|---|---|---|
| Fold operator | shipped (Phase 1, dry-run verified) | `skills/wiki-fold/` | reads `log.md`, writes `wiki/folds/`, updates `index.md` on commit |
| Address anchors | shipped (Phase 2, opt-in) | `scripts/allocate-address.sh`, new frontmatter field | `wiki-ingest` (assignment), `wiki-lint` (validation) |
| Semantic tiling | shipped (Phase 2/3, opt-in) | `scripts/tiling-check.py`, `.vault-meta/tiling-thresholds.json` | `wiki-lint` with banded thresholds, calibration procedure documented |
| Boundary-first | shipped (Phase 4, opt-in) | `scripts/boundary-score.py`, `tests/test_boundary_score.py` | `skills/autoresearch/SKILL.md` Topic Selection section B; `commands/autoresearch.md` no-topic path |
The existing hot → index → domain → page hierarchy already implements self-similarity across scales. That's the one dragon-curve property this vault had before DragonScale.
---
## Why This Over Alternatives
| Pattern | What it gives | What DragonScale adds |
|---|---|---|
| MemGPT virtual context (two-tier paging) | Main context ↔ external context swap | More than two levels; explicit fold triggers; dedup lint |
| Pure LSM compaction | Exponential write-path throughput | Semantic-layer mechanisms (tiling, boundary); additive rollups over destructive merges |
| Ad-hoc `/save` | Human-triggered filing | Rule-based fold cadence |
| Vector-only RAG | Retrieval | Canonical-home structure; lineage addresses |
DragonScale composes patterns validated in adjacent systems: LSM *batching* (databases), MemGPT *paging* (agents), Anthropic *cache ordering* (prompt engineering), and embedding *dedup* (knowledge graphs).
---
## Known Limitations (v0.3)
- **Unvalidated at scale.** All four mechanisms are theoretical; none tested on a multi-thousand-page vault.
- **Fold cadence is a knob, not a theorem.** `k=4` is a starting guess. Adaptive triggers are likely better.
- **Address stability is unsolved.** Hash rotation on edits is a known issue; deferred.
- **Boundary-first crosses scope.** Included with a warning, not quietly.
- **Calibration load.** Tiling requires a one-time labeling pass; without it, only defaults apply.
---
## Primary Sources
Verified against primary sources on 2026-04-23. **Scope of tagging**: the specific numeric values, formulas, and named patterns below are tagged **[sourced]** when directly citable, **[derived]** when derivable from sourced material, or **[conjecture]** when based on reasoning without a specific source. **Not tagged** (and readers should treat as interpretive synthesis): framing sentences in the body such as "composes patterns validated," "self-similarity already exists," and the design rationale tying the four mechanisms together. These are editorial, not source-backed.
**Dragon curve math [sourced]**
- Boundary dimension `2·log₂(λ)` where `λ³ λ² 2 = 0`, giving 1.523627086: [Dragon curve, Wikipedia](https://en.wikipedia.org/wiki/Dragon_curve)
- Paper-folding construction and OEIS A014577: [Regular paperfolding sequence, Wikipedia](https://en.wikipedia.org/wiki/Regular_paperfolding_sequence); [OEIS A014577](https://oeis.org/A014577)
- Tiling and rep-tiles: [Wolfram Demonstrations: Tiling Dragons and Rep-tiles of Order Two](https://demonstrations.wolfram.com/TilingDragonsAndRepTilesOfOrderTwo/)
**LSM trees [sourced]**
- Level size ratios and compaction semantics: [RocksDB Compaction wiki](https://github.com/facebook/rocksdb/wiki/Compaction), [RocksDB Tuning Guide](https://github.com/facebook/rocksdb/wiki/RocksDB-Tuning-Guide), [How to Grow an LSM-tree? (2025)](https://arxiv.org/abs/2504.17178)
- LevelDB 10× level ratio: referenced in the arXiv paper above. Treat as *typical*, not required.
**LLM memory architectures [sourced]**
- OS-inspired paging: [MemGPT: Towards LLMs as Operating Systems (Packer et al. 2023)](https://arxiv.org/abs/2310.08560)
- Position sensitivity: [Lost in the Middle (Liu et al. 2023)](https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00638/119630/Lost-in-the-Middle-How-Language-Models-Use-Long)
- Note-based agentic memory: [A-Mem (2025)](https://arxiv.org/abs/2502.12110)
**Prompt caching [sourced]**
- Byte-identical prefix requirement, breakpoint mechanics, TTL options: [Anthropic Prompt Caching docs](https://platform.claude.com/docs/en/build-with-claude/prompt-caching)
**Embedding thresholds [sourced]**
- Sentence Transformers defaults and calibration examples: [Sentence Transformers util](https://sbert.net/docs/package_reference/util.html), [SBERT evaluation docs](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html)
**Reasoning search (out of scope, cited only to justify the scope boundary) [sourced]**
- [Tree of Thoughts (Yao et al. 2023)](https://arxiv.org/abs/2305.10601)
**Items marked [conjecture] in this doc:**
- `k=4`/`k=5` starting value for fold cadence (needs empirical tuning)
- `~30s` full-vault embedding-pass time (needs measurement)
- `boundary_score` formula exact weighting (a plausible starting form; not validated against retrieval metrics)
**Items marked [derived]:**
- `⌈log₂(N)⌉` fold-depth bound (trivially derivable from the entry-count trigger)
- Default tiling bands `{≥0.90, 0.80-0.90, <0.80}` before calibration (interpolated from cited ranges in Sentence Transformers examples; not optimal by construction)
---
## Review History
**v0.1 (2026-04-23, initial draft)** — written after a verification pass against Wikipedia, arXiv, and Anthropic docs. Four mechanisms proposed.
**v0.4 (2026-04-24, Phase 4 shipped)** — Mechanism 4 (boundary-first autoresearch) implemented as `scripts/boundary-score.py` with `tests/test_boundary_score.py` covering parsing, recency weight, wikilink extraction (with fence-length + tilde + indented-block tests), graph construction (self-loop/unresolved/meta-target exclusion), symlink rejection, and CLI surface (`--top`, `--page`, `--json`). Integrated into `skills/autoresearch/SKILL.md` as an opt-in Topic Selection mode with explicit helper-failure fallback. Spec's "NOT IMPLEMENTED" marker removed; exact scoring formula (no recency floor), filename-stem-only resolution disclosure, scope, and "what is NOT included" section added. Phase 3.6 pre-Phase-4 hardening shipped concurrently (5 fixes: `--report` path confinement, rollout baseline, AGENTS.md consistency, wiki-ingest .raw contradiction, install-guide version).
**v0.3 (2026-04-23, Phase 2 alignment)** — Mechanism 2 rewritten to match the actual Phase 2 MVP shipped in `wiki-ingest` and `wiki-lint`. Renamed from "Content-Addressable Paths" to "Deterministic Page Addresses" (the MVP is a creation-order counter, not a content hash). Documented the extension path for fold-ancestry bits and content-hash suffix, both explicitly deferred.
**v0.2 (2026-04-23, post-adversarial review)** — after `codex exec` adversarial review. All 7 critiques accepted:
1. *LSM "structurally identical"* → weakened to "loosely analogous to hierarchical rollup"; non-inherited properties listed explicitly.
2. *Prompt cache address benefit* → removed strong claim; narrowed to organizational convention.
3. *0.85 threshold* → replaced with calibration procedure and banded defaults.
4. *2^k cadence* → justified as implementation convenience; adaptive trigger flagged as preferred for production.
5. *Scope boundary contradiction* → acknowledged; boundary-first explicitly labeled as agenda control.
6. *Missing production mechanisms* → added Operational Policies section (retention, versioning, conflict resolution, concurrency, provenance).
7. *Unverified claims* → tagged specific numeric values, formulas, and named patterns as [sourced], [derived], or [conjecture]. Editorial synthesis in the body explicitly flagged as not tagged (see scope note under Primary Sources).
---
## Connections
See [[LLM Wiki Pattern]] for the broader pattern this extends.
See [[Compounding Knowledge]] for why persistent state is the precondition for DragonScale.
See [[Hot Cache]] for the existing 500-word session context, which is a level-0 manual fold.
See [[Andrej Karpathy]] for the intellectual lineage.