296 lines
21 KiB
Markdown
296 lines
21 KiB
Markdown
---
|
||
type: concept
|
||
title: "DragonScale Memory"
|
||
address: c-000001
|
||
complexity: advanced
|
||
domain: knowledge-management
|
||
aliases:
|
||
- "DragonScale"
|
||
- "DragonScale Architecture"
|
||
- "Fractal Memory"
|
||
created: 2026-04-23
|
||
updated: 2026-04-24
|
||
tags:
|
||
- concept
|
||
- knowledge-management
|
||
- memory
|
||
- architecture
|
||
- fractal
|
||
status: proposed
|
||
related:
|
||
- "[[LLM Wiki Pattern]]"
|
||
- "[[Compounding Knowledge]]"
|
||
- "[[Hot Cache]]"
|
||
- "[[concepts/_index]]"
|
||
sources:
|
||
---
|
||
|
||
# DragonScale Memory
|
||
|
||
A memory-layer design for LLM wiki vaults, inspired by the Heighway dragon curve. Four mechanisms (fold operator, deterministic page addresses, semantic tiling, boundary-first autoresearch) give an LLM-maintained wiki a principled way to grow, compact, and stay coherent. The dragon curve is a design-justification device, not a reasoning architecture.
|
||
|
||
> **Status: v0.4 2026-04-24.** All four mechanisms shipped as opt-in features. Phase 0 (spec) + Phase 1 (wiki-fold skill, dry-run verified) + Phase 2 (address MVP) + Phase 3 (semantic tiling) + Phase 3.5/3.6 (hardening) + Phase 4 (boundary-first autoresearch). See Review History for the progression.
|
||
|
||
---
|
||
|
||
## Scope
|
||
|
||
DragonScale is a **memory architecture**: it governs how a wiki grows, compacts, addresses its pages, and checks for duplicates. It is **not a search, planning, or reasoning algorithm.** Agent reasoning uses existing patterns (Tree of Thoughts with BFS/DFS/beam search; Yao et al. 2023).
|
||
|
||
**Honest disclaimer**: memory-layer choices are never neutral with respect to reasoning. What the vault surfaces, and in what order, shapes what the model sees. Long-context performance is position-sensitive (Liu et al. 2023, *Lost in the Middle*), and MemGPT's premise is that paging policy affects task success (Packer et al. 2023). One of the four mechanisms below (boundary-first autoresearch) explicitly crosses into agenda control; it is included deliberately and marked as such.
|
||
|
||
---
|
||
|
||
## The Core Analogy
|
||
|
||
Four dragon-curve properties map onto memory-system patterns already validated in adjacent fields. The word is *analogue*, not *identity*.
|
||
|
||
| Dragon curve property | Memory analogue | Strength of analogy |
|
||
|---|---|---|
|
||
| Paper-folding recursion: `D_{n+1} = D_n · R · swap(reverse(D_n))` | Hierarchical rollup / materialized summary with exponential fanout | Loose. Shares exponential batch structure, not compaction semantics. |
|
||
| Turn derivable from bits of `n` (regular paperfolding sequence, OEIS A014577) | Deterministic page addresses as organizational convention (MVP is a creation-order counter, not a true content hash) | Loose. Deterministic addressing is useful independent of the dragon. |
|
||
| Tiling / no self-intersection | Canonical-home coverage: one concept, one page | Medium. Dedup lint enforces this mechanically. |
|
||
| Boundary dim ≈ 1.523627 vs interior dim 2 | Agent attention weighted toward frontier pages | Aesthetic. The fractal dimension number does no load-bearing work. |
|
||
|
||
The curve is useful for deciding *which knobs to tighten and why*, not as a math proof that any given mechanism is optimal.
|
||
|
||
---
|
||
|
||
## Mechanism 1 — Fold Operator
|
||
|
||
After a batch of ingests, run a fold: produce a meta-page summarizing the batch, link children back, update the index. Folds stack: after enough level-`k` folds accumulate, a level-`k+1` fold produces a super-summary.
|
||
|
||
This is a **hierarchical rollup**, loosely similar to LSM-tree compaction but with important differences.
|
||
|
||
**What it shares with LSM compaction:**
|
||
- Exponential batch fanout across levels (like LevelDB's fixed level-size ratio, typically 10× per level in leveled mode)
|
||
- Periodic consolidation rather than per-write work
|
||
|
||
**What it does NOT inherit from LSM:**
|
||
- No sorted-key semantics (pages have semantic, not key-ordered, identity)
|
||
- No SSTable/memtable distinction, no tombstones, no Bloom filters
|
||
- No write-amplification arithmetic; no read-path acceleration
|
||
- **Folds are additive**: children remain in place. LSM compaction rewrites and deletes. A DragonScale fold is closer to a materialized view than a compaction.
|
||
|
||
**Trigger options:**
|
||
- `2^k` entry count (k=4 ⇒ every 16 log entries). Simple to implement; straightforward level math; ignores page size and novelty.
|
||
- **Adaptive trigger (preferred for production)**: token budget (e.g., fold when unfolded batch exceeds N tokens), novelty score (average embedding distance from existing summaries), or staleness age (last fold > T days). Phase 1 will implement entry-count for MVP; adaptive triggers are a follow-up.
|
||
|
||
**Invariants:**
|
||
- Idempotent on the same range (re-running is a no-op).
|
||
- Reversible (children stay; a fold is additive).
|
||
- Level-bounded: with entry-count trigger `2^k`, fold depth is at most `⌈log₂(N)⌉` above leaf pages. Derived, not empirical.
|
||
|
||
---
|
||
|
||
## Mechanism 2 — Deterministic Page Addresses
|
||
|
||
Every new page gets a stable `address` field in frontmatter. The Phase 2 MVP uses a simple creation-order counter:
|
||
|
||
```yaml
|
||
address: c-000042
|
||
```
|
||
|
||
Format: `c-<6-digit-counter>`. `c-` means "creation-order counter." Zero-padded.
|
||
|
||
**Future extension** (documented, not shipped in Phase 2):
|
||
- Fold-relative path: `f1.2/c-000042` once folds exist, where `f1.2` encodes the fold-tree lineage.
|
||
- Content hash suffix: `c-000042:h7f3c2` once the hash-rotation policy is decided.
|
||
|
||
**What Phase 2 MVP gives:**
|
||
- Uniqueness: counter is monotonically increasing; deleted pages' addresses are retired, never reused.
|
||
- Stability: never changes across content edits.
|
||
- Determinism: derivable from the counter state at `.vault-meta/address-counter.txt`.
|
||
- Ordering: preserves creation sequence.
|
||
|
||
**What this does NOT give (renamed "content-addressable paths" was misleading in v0.1):**
|
||
- **No content-addressability in the MVP.** The Phase 2 address is a sequence counter, not a content hash. Renaming this mechanism from "content-addressable paths" to "deterministic page addresses" is more honest about what actually ships.
|
||
- **No prompt cache benefit** (already corrected in v0.1 → v0.2). Per Anthropic docs, cache hits require byte-identical prefixes; an address field in frontmatter only helps if the frontmatter itself is inside a cached block AND stays byte-identical. Stable prefixes, not addresses, drive cache hits.
|
||
|
||
**Phase 2 exclusions** (all deferred):
|
||
- Backfill of legacy pre-Phase-2 pages (will use `l-` prefix with its own counter).
|
||
- Fold-ancestry bit prefix (requires committed folds from a future fold-of-folds skill).
|
||
- Content hash suffix (rotation policy unresolved; see limitations).
|
||
|
||
**Implementation** (Phase 2, shipped):
|
||
- `scripts/allocate-address.sh`: flock-guarded atomic allocator. All counter reads/writes go through this script; direct Write/Edit on `.vault-meta/address-counter.txt` is prohibited (would fire PostToolUse hook).
|
||
- `skills/wiki-ingest/SKILL.md` → Address Assignment section: opt-in feature detection; delegates allocation to the helper; records path-to-address mapping in `.raw/.manifest.json` `address_map` for re-ingest stability.
|
||
- `skills/wiki-lint/SKILL.md` → Address Validation section: format check, uniqueness check, counter-drift check, address-map consistency check.
|
||
|
||
**Lint severity model** (matches `skills/wiki-lint/SKILL.md` Address Validation behavior):
|
||
- Post-rollout pages (frontmatter `created:` >= 2026-04-23, or any page newly created after DragonScale adoption) that lack an address are **errors**. This is the silent-regression guard.
|
||
- Legacy pages (`created:` < 2026-04-23) without addresses are **informational**. The optional `.vault-meta/legacy-pages.txt` manifest can grandfather pages whose `created:` metadata is wrong or missing.
|
||
- Meta pages (`_index.md`, `index.md`, `log.md`, `hot.md`, etc.) and fold pages are excluded entirely.
|
||
|
||
---
|
||
|
||
## Mechanism 3 — Semantic Tiling Lint
|
||
|
||
The tiling property says the same concept should live in one canonical page. Enforce it with an embedding-based dedup check in `wiki-lint`.
|
||
|
||
**Procedure (calibrated, not a guess):**
|
||
1. Compute embeddings for every page. Default model: local `nomic-embed-text` via ollama on `http://127.0.0.1:11434`. Cost: local hardware time only (no API fees). The script supports a remote override under `--allow-remote-ollama`; remote endpoints may incur provider API fees.
|
||
2. Compute pairwise cosine similarities for all page pairs.
|
||
3. **Calibration** (one-time, before first use): label 50-100 in-vault page pairs as duplicate/near/distinct; find the thresholds that optimize target precision for each band.
|
||
4. **Default bands** (used before calibration, then refined):
|
||
- `≥ 0.90` — near-duplicate, lint error
|
||
- `0.80 – 0.90` — review bucket, lint warning
|
||
- `< 0.80` — distinct, no flag
|
||
5. Never auto-merge. Output a review list.
|
||
|
||
**Why not a fixed 0.85?** v0.1 used 0.85 with no justification. Published thresholds in the embeddings literature span a wide range (Sentence Transformers' `community_detection` defaults to 0.75; Quora-duplicate calibrations land around 0.77–0.83; sparse-model defaults differ again). Thresholds are model-, corpus-, and objective-dependent, so calibration is required.
|
||
|
||
---
|
||
|
||
## Mechanism 4 — Boundary-First Autoresearch
|
||
|
||
> **Status: shipped (Phase 4, opt-in)** as of 2026-04-24. Implementation: `scripts/boundary-score.py`. Integration: `skills/autoresearch/SKILL.md` Topic Selection section B. Tests: `tests/test_boundary_score.py`.
|
||
|
||
Boundary pages (high out-degree relative to in-degree, recency-weighted) are the vault's frontier. `/autoresearch` invoked without a topic reads the top-5 boundary pages and offers them as research candidates; the user selects one (or types a free-text topic, or declines all and falls back to the original ask-user mode).
|
||
|
||
**Formula (exact)**:
|
||
|
||
```
|
||
out_degree(p) = count of distinct filename-stem wikilinks in body of p that resolve to scoreable pages
|
||
in_degree(p) = count of distinct scoreable pages whose body contains a wikilink to p
|
||
recency_weight(p) = exp(-days_since_updated / 30) # no floor; old pages approach 0
|
||
boundary_score(p) = (out_degree - in_degree) * recency_weight
|
||
```
|
||
|
||
**Link resolution**: filename-stem only. `[[Foo]]` resolves to `Foo.md` anywhere in the vault. Aliases declared via frontmatter `aliases:` are NOT parsed. Folder-qualified links (e.g. `[[notes/Foo]]`) are resolved by stem alone. This matches Obsidian's default behavior for unique filenames but does not implement full alias resolution.
|
||
|
||
**Scoreable** = any page NOT excluded by any of:
|
||
- frontmatter `type: meta` or `type: fold`
|
||
- filename in `{_index.md, index.md, log.md, hot.md, overview.md, dashboard.md, Wiki Map.md, getting-started.md}`
|
||
- path prefix in `wiki/folds/` or `wiki/meta/`
|
||
- symlinks or paths whose resolved target escapes the vault root (rejected at scan time)
|
||
|
||
**Code-block filtering**: triple-backtick AND triple-tilde fenced code blocks are skipped, with CommonMark-like length tracking so a longer opening fence is not closed by a shorter inner fence. Indented code blocks (4+ spaces) are NOT filtered because Obsidian bullet lists commonly use 4-space indentation and contain real wikilinks. See `scripts/boundary-score.py:RECENCY_HALFLIFE_DAYS` for the sole tunable constant.
|
||
|
||
**Honest labeling**: this mechanism is **agenda control**, not pure memory. It shapes what the agent researches next. It is included in DragonScale because it is a direct consequence of the dragon-curve boundary analogy, and because it pairs naturally with folds (freshly folded pages have low out-degree; frontier pages are pre-fold). But the "memory only, not reasoning" framing does not cover it. Users who want a strict memory-layer subset should omit this mechanism (simply do not invoke `/autoresearch` without a topic, or do not set up `scripts/boundary-score.py`).
|
||
|
||
**What is NOT included**:
|
||
- No auto-triggering. `/autoresearch` is still user-invoked.
|
||
- No persistent boundary-score cache. Scoring is O(N * avg_links) and runs on every invocation from fresh wiki/ state.
|
||
- No integration with folds or addresses. Pure graph analysis on the wikilink graph.
|
||
- No automatic topic selection without user confirmation. The helper presents choices; the user picks.
|
||
|
||
---
|
||
|
||
## Operational Policies (required before implementation)
|
||
|
||
Adversarial review flagged these gaps in v0.1. Each must be decided before the corresponding phase ships.
|
||
|
||
| Policy | Phase 0 position | Decision point |
|
||
|---|---|---|
|
||
| **Retention / GC** | No automatic deletion. Pages are permanent. | Revisit if vault exceeds ~5000 pages. |
|
||
| **Tombstones** | None. Deleted pages are removed via git revert. | Revisit if delete events become common. |
|
||
| **Versioning** | Relied on git history, not in-vault versioning. | Address-hash rotation policy doubles as a coarse version signal. |
|
||
| **Conflict resolution for contradictory folds** | Meta-page must quote both sources with explicit "conflict" callout. No automatic resolution. | Phase 1 spec required. |
|
||
| **Concurrency / atomicity** | Single-writer assumption (one Claude session at a time). PostToolUse auto-commit serializes. | Multi-writer case deferred. |
|
||
| **Provenance for meta-pages** | Every fold page must include frontmatter listing children and fold level. | Phase 1 must enforce. |
|
||
| **Access control** | Out of scope. This is a single-user vault. | Revisit only if shared. |
|
||
|
||
---
|
||
|
||
## Mapping to Claude-Obsidian
|
||
|
||
| Mechanism | Status | New | Extends |
|
||
|---|---|---|---|
|
||
| Fold operator | shipped (Phase 1, dry-run verified) | `skills/wiki-fold/` | reads `log.md`, writes `wiki/folds/`, updates `index.md` on commit |
|
||
| Address anchors | shipped (Phase 2, opt-in) | `scripts/allocate-address.sh`, new frontmatter field | `wiki-ingest` (assignment), `wiki-lint` (validation) |
|
||
| Semantic tiling | shipped (Phase 2/3, opt-in) | `scripts/tiling-check.py`, `.vault-meta/tiling-thresholds.json` | `wiki-lint` with banded thresholds, calibration procedure documented |
|
||
| Boundary-first | shipped (Phase 4, opt-in) | `scripts/boundary-score.py`, `tests/test_boundary_score.py` | `skills/autoresearch/SKILL.md` Topic Selection section B; `commands/autoresearch.md` no-topic path |
|
||
|
||
The existing hot → index → domain → page hierarchy already implements self-similarity across scales. That's the one dragon-curve property this vault had before DragonScale.
|
||
|
||
---
|
||
|
||
## Why This Over Alternatives
|
||
|
||
| Pattern | What it gives | What DragonScale adds |
|
||
|---|---|---|
|
||
| MemGPT virtual context (two-tier paging) | Main context ↔ external context swap | More than two levels; explicit fold triggers; dedup lint |
|
||
| Pure LSM compaction | Exponential write-path throughput | Semantic-layer mechanisms (tiling, boundary); additive rollups over destructive merges |
|
||
| Ad-hoc `/save` | Human-triggered filing | Rule-based fold cadence |
|
||
| Vector-only RAG | Retrieval | Canonical-home structure; lineage addresses |
|
||
|
||
DragonScale composes patterns validated in adjacent systems: LSM *batching* (databases), MemGPT *paging* (agents), Anthropic *cache ordering* (prompt engineering), and embedding *dedup* (knowledge graphs).
|
||
|
||
---
|
||
|
||
## Known Limitations (v0.3)
|
||
|
||
- **Unvalidated at scale.** All four mechanisms are theoretical; none tested on a multi-thousand-page vault.
|
||
- **Fold cadence is a knob, not a theorem.** `k=4` is a starting guess. Adaptive triggers are likely better.
|
||
- **Address stability is unsolved.** Hash rotation on edits is a known issue; deferred.
|
||
- **Boundary-first crosses scope.** Included with a warning, not quietly.
|
||
- **Calibration load.** Tiling requires a one-time labeling pass; without it, only defaults apply.
|
||
|
||
---
|
||
|
||
## Primary Sources
|
||
|
||
Verified against primary sources on 2026-04-23. **Scope of tagging**: the specific numeric values, formulas, and named patterns below are tagged **[sourced]** when directly citable, **[derived]** when derivable from sourced material, or **[conjecture]** when based on reasoning without a specific source. **Not tagged** (and readers should treat as interpretive synthesis): framing sentences in the body such as "composes patterns validated," "self-similarity already exists," and the design rationale tying the four mechanisms together. These are editorial, not source-backed.
|
||
|
||
**Dragon curve math [sourced]**
|
||
- Boundary dimension `2·log₂(λ)` where `λ³ − λ² − 2 = 0`, giving 1.523627086: [Dragon curve, Wikipedia](https://en.wikipedia.org/wiki/Dragon_curve)
|
||
- Paper-folding construction and OEIS A014577: [Regular paperfolding sequence, Wikipedia](https://en.wikipedia.org/wiki/Regular_paperfolding_sequence); [OEIS A014577](https://oeis.org/A014577)
|
||
- Tiling and rep-tiles: [Wolfram Demonstrations: Tiling Dragons and Rep-tiles of Order Two](https://demonstrations.wolfram.com/TilingDragonsAndRepTilesOfOrderTwo/)
|
||
|
||
**LSM trees [sourced]**
|
||
- Level size ratios and compaction semantics: [RocksDB Compaction wiki](https://github.com/facebook/rocksdb/wiki/Compaction), [RocksDB Tuning Guide](https://github.com/facebook/rocksdb/wiki/RocksDB-Tuning-Guide), [How to Grow an LSM-tree? (2025)](https://arxiv.org/abs/2504.17178)
|
||
- LevelDB 10× level ratio: referenced in the arXiv paper above. Treat as *typical*, not required.
|
||
|
||
**LLM memory architectures [sourced]**
|
||
- OS-inspired paging: [MemGPT: Towards LLMs as Operating Systems (Packer et al. 2023)](https://arxiv.org/abs/2310.08560)
|
||
- Position sensitivity: [Lost in the Middle (Liu et al. 2023)](https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00638/119630/Lost-in-the-Middle-How-Language-Models-Use-Long)
|
||
- Note-based agentic memory: [A-Mem (2025)](https://arxiv.org/abs/2502.12110)
|
||
|
||
**Prompt caching [sourced]**
|
||
- Byte-identical prefix requirement, breakpoint mechanics, TTL options: [Anthropic Prompt Caching docs](https://platform.claude.com/docs/en/build-with-claude/prompt-caching)
|
||
|
||
**Embedding thresholds [sourced]**
|
||
- Sentence Transformers defaults and calibration examples: [Sentence Transformers util](https://sbert.net/docs/package_reference/util.html), [SBERT evaluation docs](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html)
|
||
|
||
**Reasoning search (out of scope, cited only to justify the scope boundary) [sourced]**
|
||
- [Tree of Thoughts (Yao et al. 2023)](https://arxiv.org/abs/2305.10601)
|
||
|
||
**Items marked [conjecture] in this doc:**
|
||
- `k=4`/`k=5` starting value for fold cadence (needs empirical tuning)
|
||
- `~30s` full-vault embedding-pass time (needs measurement)
|
||
- `boundary_score` formula exact weighting (a plausible starting form; not validated against retrieval metrics)
|
||
|
||
**Items marked [derived]:**
|
||
- `⌈log₂(N)⌉` fold-depth bound (trivially derivable from the entry-count trigger)
|
||
- Default tiling bands `{≥0.90, 0.80-0.90, <0.80}` before calibration (interpolated from cited ranges in Sentence Transformers examples; not optimal by construction)
|
||
|
||
---
|
||
|
||
## Review History
|
||
|
||
**v0.1 (2026-04-23, initial draft)** — written after a verification pass against Wikipedia, arXiv, and Anthropic docs. Four mechanisms proposed.
|
||
|
||
**v0.4 (2026-04-24, Phase 4 shipped)** — Mechanism 4 (boundary-first autoresearch) implemented as `scripts/boundary-score.py` with `tests/test_boundary_score.py` covering parsing, recency weight, wikilink extraction (with fence-length + tilde + indented-block tests), graph construction (self-loop/unresolved/meta-target exclusion), symlink rejection, and CLI surface (`--top`, `--page`, `--json`). Integrated into `skills/autoresearch/SKILL.md` as an opt-in Topic Selection mode with explicit helper-failure fallback. Spec's "NOT IMPLEMENTED" marker removed; exact scoring formula (no recency floor), filename-stem-only resolution disclosure, scope, and "what is NOT included" section added. Phase 3.6 pre-Phase-4 hardening shipped concurrently (5 fixes: `--report` path confinement, rollout baseline, AGENTS.md consistency, wiki-ingest .raw contradiction, install-guide version).
|
||
|
||
**v0.3 (2026-04-23, Phase 2 alignment)** — Mechanism 2 rewritten to match the actual Phase 2 MVP shipped in `wiki-ingest` and `wiki-lint`. Renamed from "Content-Addressable Paths" to "Deterministic Page Addresses" (the MVP is a creation-order counter, not a content hash). Documented the extension path for fold-ancestry bits and content-hash suffix, both explicitly deferred.
|
||
|
||
**v0.2 (2026-04-23, post-adversarial review)** — after `codex exec` adversarial review. All 7 critiques accepted:
|
||
|
||
1. *LSM "structurally identical"* → weakened to "loosely analogous to hierarchical rollup"; non-inherited properties listed explicitly.
|
||
2. *Prompt cache address benefit* → removed strong claim; narrowed to organizational convention.
|
||
3. *0.85 threshold* → replaced with calibration procedure and banded defaults.
|
||
4. *2^k cadence* → justified as implementation convenience; adaptive trigger flagged as preferred for production.
|
||
5. *Scope boundary contradiction* → acknowledged; boundary-first explicitly labeled as agenda control.
|
||
6. *Missing production mechanisms* → added Operational Policies section (retention, versioning, conflict resolution, concurrency, provenance).
|
||
7. *Unverified claims* → tagged specific numeric values, formulas, and named patterns as [sourced], [derived], or [conjecture]. Editorial synthesis in the body explicitly flagged as not tagged (see scope note under Primary Sources).
|
||
|
||
---
|
||
|
||
## Connections
|
||
|
||
See [[LLM Wiki Pattern]] for the broader pattern this extends.
|
||
See [[Compounding Knowledge]] for why persistent state is the precondition for DragonScale.
|
||
See [[Hot Cache]] for the existing 500-word session context, which is a level-0 manual fold.
|
||
See [[Andrej Karpathy]] for the intellectual lineage.
|