add skills
@@ -0,0 +1,236 @@
---
name: hermes-history-ingest
description: >
  Ingest Hermes agent history into the Obsidian wiki. Use this skill when the user wants to mine
  their past Hermes sessions for knowledge, import their ~/.hermes folder, extract insights from
  previous Hermes conversations, or says things like "process my Hermes history", "add my Hermes
  memories to the wiki", "ingest ~/.hermes", or "what have I worked on in Hermes". Also triggers
  when the user mentions Hermes memories, Hermes sessions, ~/.hermes/memories, or Hermes skill logs.
---

# Hermes History Ingest: Conversation & Memory Mining

You are extracting knowledge from the user's Hermes agent history and distilling it into the Obsidian wiki. Hermes stores both free-form memories and structured session transcripts; focus on durable knowledge, not operational telemetry.

This skill can be invoked directly or via the `wiki-history-ingest` router (`/wiki-history-ingest hermes`).

## Before You Start

1. **Resolve config**: follow the Config Resolution Protocol in `llm-wiki/SKILL.md` (walk up from the CWD looking for `.env` → `~/.obsidian-wiki/config` → prompt for setup). This yields `OBSIDIAN_VAULT_PATH` and `HERMES_HISTORY_PATH` (defaults to `~/.hermes`).
2. Read `.manifest.json` at the vault root to check what has already been ingested.
3. Read `index.md` at the vault root to understand what the wiki already contains.

## Ingest Modes

### Append Mode (default)

Check `.manifest.json` for each source file. Only process:

- Files not in the manifest (new memory files, new session logs)
- Files whose modification time is newer than `ingested_at` in the manifest

Use this mode for regular syncs.

### Full Mode

Process everything regardless of the manifest. Use after `wiki-rebuild` or if the user explicitly asks for a full re-ingest.

## Hermes Data Layout

Hermes stores all local artifacts under `~/.hermes/` (or `$HERMES_HOME` for non-default profiles).

```
~/.hermes/
├── memories/              # Persistent agent memories (markdown or JSON)
│   └── *.md / *.json
├── skills/                # Installed skills (read-only for ingest purposes)
│   └── <skill-name>/SKILL.md
├── sessions/              # Session transcripts (if session logging is enabled)
│   └── YYYY-MM-DD/
│       └── <session-id>.jsonl
├── config.yaml            # User config (model, theme, paths)
└── .hub/                  # Skills Hub state (lock.json, audit.log, quarantine/)
```

### Key data sources ranked by value

1. `memories/*.md` / `memories/*.json`: highest signal; curated persistent knowledge the agent accumulated
2. `sessions/**/*.jsonl`: structured turn-by-turn transcripts; rich but noisy
3. `config.yaml`: metadata only (model preferences, paths); rarely worth ingesting

Skip `.hub/` internals (audit/quarantine state) and the `skills/` directory (source material, not user knowledge).

## Step 1: Survey and Compute Delta

Scan these locations under `HERMES_HISTORY_PATH` and compare them against `.manifest.json`:

- `memories/`
- `sessions/**/` (if present)

Classify each file:

- **New**: not in the manifest
- **Modified**: in the manifest, but the file's modification time is newer than `ingested_at`
- **Unchanged**: in the manifest and not modified since `ingested_at`

Report a concise delta summary before deep parsing.
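
The classification above can be sketched in Python. This is a minimal sketch rather than the skill's actual implementation: it assumes the manifest maps each source path to an entry carrying an ISO 8601 `ingested_at` string, and both the `classify` helper and that manifest shape are hypothetical.

```python
from datetime import datetime, timezone


def classify(path: str, mtime: float, manifest: dict) -> str:
    """Return 'new', 'modified', or 'unchanged' for one source file.

    Assumption: manifest[path]["ingested_at"] is an ISO 8601 timestamp.
    """
    entry = manifest.get(path)
    if entry is None:
        return "new"
    # Accept a trailing 'Z' on Python versions whose fromisoformat rejects it.
    ingested = datetime.fromisoformat(entry["ingested_at"].replace("Z", "+00:00"))
    if ingested.tzinfo is None:
        ingested = ingested.replace(tzinfo=timezone.utc)  # treat naive stamps as UTC
    modified = datetime.fromtimestamp(mtime, tz=timezone.utc)
    return "modified" if modified > ingested else "unchanged"
```

In practice `mtime` would come from `os.stat(path).st_mtime`; in Full Mode the result would simply be ignored and every file processed.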

## Step 2: Parse Memories First

Memories are the highest-value source. Hermes writes them as either:

- **Markdown**: structured prose with optional frontmatter; ingest directly
- **JSON**: `{"content": "...", "created_at": "...", "tags": [...]}` records

For each memory:

- Extract the core knowledge claim
- Note any tags Hermes attached (they often map to wiki categories)
- Merge into the appropriate wiki page rather than creating one page per memory
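
As a rough illustration of handling both memory formats, the sketch below normalizes a memory file into one record shape. The `load_memory` helper and its return keys are hypothetical; it only splits Markdown frontmatter from the body and leaves the frontmatter text unparsed, on the assumption that a YAML parser would handle it in a real pipeline.

```python
import json


def load_memory(name: str, text: str) -> dict:
    """Normalize one memory file into {'content', 'tags', 'frontmatter'}."""
    if name.endswith(".json"):
        record = json.loads(text)
        return {
            "content": record.get("content", ""),
            "tags": record.get("tags", []),
            "frontmatter": "",
        }
    # Markdown: peel off an optional '---' delimited frontmatter block.
    frontmatter, body = "", text
    if text.startswith("---\n"):
        end = text.find("\n---", 3)
        if end != -1:
            frontmatter = text[4:end]
            body = text[end + 4:]
    # Tags for Markdown memories live in the unparsed frontmatter text.
    return {"content": body.strip(), "tags": [], "frontmatter": frontmatter.strip()}
```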

## Step 3: Parse Session JSONL Safely

Each session JSONL line is an event envelope. Common shapes:

```json
{"role": "user", "content": "..."}
{"role": "assistant", "content": "..."}
{"type": "tool_use", "name": "...", "input": {...}}
{"type": "tool_result", "content": "..."}
```

### Extraction rules

- Prioritize assistant responses that state conclusions, patterns, or decisions
- Extract user intent from high-signal turns; skip low-information follow-ups
- Treat `tool_use` / `tool_result` pairs as context, not primary content
- Skip token accounting, internal plumbing, and repeated plan echoes
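
A minimal sketch of the first filtering pass, assuming the line shapes shown above: it keeps user and assistant turns and drops tool events and unparseable lines. The `iter_turns` name is hypothetical, and judging which turns are high-signal still requires reading them.

```python
import json
from typing import Iterator


def iter_turns(lines: list[str]) -> Iterator[dict]:
    """Yield user/assistant turns; skip tool events and unparseable lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate truncated or corrupt lines
        if isinstance(event, dict) and event.get("role") in ("user", "assistant"):
            yield event
```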

### Critical privacy filter

Session logs can include injected instructions, tool payloads, and sensitive text. Do not ingest them verbatim.

- Remove API keys, tokens, passwords, and other credentials
- Redact private identifiers unless relevant and user-approved
- Summarize; do not quote raw transcripts
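
One way to apply the credential rule mechanically is a regex pass before summarizing. The patterns below are illustrative assumptions only, not an exhaustive or official list; a real filter would need a broader, reviewed set and should still default to redaction when unsure.

```python
import re

# Illustrative patterns only (assumptions); extend and review before real use.
KEY_VALUE_SECRET = re.compile(
    r"(?i)\b(api[_-]?key|token|password|secret)\b(\s*[:=]\s*)\S+"
)
BEARER_TOKEN = re.compile(r"\bBearer\s+[A-Za-z0-9._~+/=-]+")


def redact(text: str) -> str:
    """Replace likely credentials with a [REDACTED] placeholder."""
    text = KEY_VALUE_SECRET.sub(r"\1\2[REDACTED]", text)
    return BEARER_TOKEN.sub("Bearer [REDACTED]", text)
```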

## Step 4: Cluster by Topic

Do not create one wiki page per memory or session.

- Group memories by stable topic (concept, tool, project, technique)
- Split mixed sessions into separate themes
- Merge recurring patterns across dates and projects
- Use file paths or session `cwd` metadata to infer project scope when available
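
A mechanical first pass at grouping might look like the sketch below. It assumes each record is a dict with optional `project` and `tags` keys (as in the memory formats), and the `cluster_by_topic` helper is hypothetical; splitting mixed sessions and merging recurring patterns still takes judgment beyond this.

```python
from collections import defaultdict


def cluster_by_topic(records: list[dict]) -> dict[str, list[dict]]:
    """Group records by project when set, else by their first tag."""
    clusters: dict[str, list[dict]] = defaultdict(list)
    for record in records:
        tags = record.get("tags") or []
        key = record.get("project") or (tags[0] if tags else "uncategorized")
        clusters[key].append(record)
    return dict(clusters)
```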

## Step 5: Distill into Wiki Pages

Route extracted knowledge using existing wiki conventions:

- Project-specific architecture/process → `projects/<name>/...`
- General concepts → `concepts/`
- Recurring techniques/debug playbooks → `skills/`
- Tools/services/frameworks → `entities/`
- Cross-session patterns → `synthesis/`

For each impacted project, create/update `projects/<name>/<name>.md`.
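
The routing table above can be expressed as a small lookup. In this sketch the directory names come from the list, while the kind labels (`"concept"`, `"technique"`, and so on) and the `route` helper are hypothetical names chosen for illustration.

```python
from typing import Optional

# Directory names are from the routing list; the keys are hypothetical labels.
ROUTES = {
    "concept": "concepts/",
    "technique": "skills/",
    "entity": "entities/",
    "synthesis": "synthesis/",
}


def route(kind: str, project: Optional[str] = None) -> str:
    """Map a knowledge kind to its wiki directory."""
    if kind == "project":
        if not project:
            raise ValueError("project knowledge needs a project name")
        return f"projects/{project}/"
    return ROUTES[kind]
```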

### Writing rules

- Distill knowledge, not chronology
- Avoid "on date X we discussed..." unless date context is essential
- Add `summary:` frontmatter on each new/updated page (1–2 sentences, ≤ 200 chars)
- Add confidence and lifecycle fields to every new page:
  ```yaml
  base_confidence: 0.42
  lifecycle: draft
  lifecycle_changed: <ISO date today>
  ```
  Leave `lifecycle` unchanged on update.
- Add provenance markers:
  - `^[extracted]` when directly grounded in explicit memory/session content
  - `^[inferred]` when synthesizing patterns across multiple memories
  - `^[ambiguous]` when memories conflict
- Add/update `provenance:` frontmatter mix for each changed page
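
As a hedged sketch of the frontmatter rules for a new page, the helper below renders the fields named above and enforces the 200-character summary limit. The `new_page_frontmatter` name is hypothetical, and the default of `0.42` simply mirrors the example value; how `base_confidence` should actually be derived is not specified here.

```python
from datetime import date


def new_page_frontmatter(summary: str, today: date, base_confidence: float = 0.42) -> str:
    """Render frontmatter fields for a newly created page."""
    if len(summary) > 200:
        raise ValueError("summary must be at most 200 characters")
    # Escape backslashes and quotes so the summary stays a valid quoted scalar.
    escaped = summary.replace("\\", "\\\\").replace('"', '\\"')
    return "\n".join([
        "---",
        f'summary: "{escaped}"',
        f"base_confidence: {base_confidence}",
        "lifecycle: draft",
        f"lifecycle_changed: {today.isoformat()}",
        "---",
    ])
```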

## Step 6: Update Manifest, Log, and Index

### Update `.manifest.json`

For each processed source file, record:

- `ingested_at`, `size_bytes`, `modified_at`
- `source_type`: `hermes_memory` | `hermes_session`
- `project`: inferred project name (when applicable)
- `pages_created`, `pages_updated`
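
A per-file entry with these fields could be assembled as in the sketch below. The field names are the ones listed above; everything else is an assumption, including the `manifest_entry` helper, the use of ISO 8601 UTC strings for timestamps, and treating `pages_created` / `pages_updated` as lists of page paths.

```python
from datetime import datetime, timezone
from typing import Optional


def manifest_entry(size_bytes: int, mtime: float, source_type: str,
                   pages_created: list, pages_updated: list,
                   project: Optional[str] = None,
                   now: Optional[datetime] = None) -> dict:
    """Build one manifest entry for a processed source file."""
    if source_type not in ("hermes_memory", "hermes_session"):
        raise ValueError(f"unknown source_type: {source_type}")
    now = now or datetime.now(timezone.utc)
    entry = {
        "ingested_at": now.isoformat(),
        "size_bytes": size_bytes,
        "modified_at": datetime.fromtimestamp(mtime, tz=timezone.utc).isoformat(),
        "source_type": source_type,
        "pages_created": pages_created,
        "pages_updated": pages_updated,
    }
    if project:
        entry["project"] = project  # only when a project was inferred
    return entry
```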

Add/update a top-level summary block:

```json
{
  "hermes": {
    "source_path": "~/.hermes/",
    "last_ingested": "TIMESTAMP",
    "memories_ingested": 42,
    "sessions_ingested": 7,
    "pages_created": 5,
    "pages_updated": 12
  }
}
```

### Update special files

Update `index.md`, and append an entry to `log.md`:

```
- [TIMESTAMP] HERMES_HISTORY_INGEST memories=N sessions=M pages_updated=X pages_created=Y mode=append|full
```
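
The log entry format above can be produced with a small formatter. This is a sketch; the `log_line` helper name and its parameter names are hypothetical, while the output layout follows the template shown.

```python
def log_line(timestamp: str, memories: int, sessions: int,
             pages_updated: int, pages_created: int, mode: str) -> str:
    """Format one log.md entry for a Hermes history ingest run."""
    if mode not in ("append", "full"):
        raise ValueError("mode must be 'append' or 'full'")
    return (
        f"- [{timestamp}] HERMES_HISTORY_INGEST memories={memories} sessions={sessions} "
        f"pages_updated={pages_updated} pages_created={pages_created} mode={mode}"
    )
```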

**`hot.md`**: Read `$OBSIDIAN_VAULT_PATH/hot.md` (create from the template in `wiki-ingest` if missing). Update **Recent Activity** with a one-line summary, e.g. "Ingested 42 Hermes memories and 7 sessions; dominant themes: reasoning strategies, tool use patterns." Keep the last 3 operations. Update the `updated` timestamp.

## Privacy and Compliance

- Distill and synthesize; avoid raw memory or transcript dumps
- Default to redaction for anything that looks sensitive
- Ask the user before storing personal or sensitive details
- Keep references to other people minimal and purpose-bound

## Reference

See `references/hermes-data-format.md` for field-level notes and extraction guidance.

## QMD Refresh After Vault Writes

QMD is a search index, not the source of truth. If `$QMD_WIKI_COLLECTION` is empty or unset, skip this step. Run it only after this skill has written or rewritten vault markdown. If QMD refresh fails, do not roll back the vault changes; report the QMD status separately.

Use `$QMD_CLI` if set; otherwise use `qmd`.

```bash
${QMD_CLI:-qmd} update
```

If the output says vectors are needed or embeddings may be stale, run:

```bash
${QMD_CLI:-qmd} embed
```

Verify the collection with either:

```bash
${QMD_CLI:-qmd} ls "$QMD_WIKI_COLLECTION"
```

or, when a specific page path is known:

```bash
${QMD_CLI:-qmd} get "qmd://$QMD_WIKI_COLLECTION/<page>.md" -l 5
```

Record one of:

- `QMD refreshed: update + embed + verified`
- `QMD refreshed: update only + verified`
- `QMD skipped: QMD_WIKI_COLLECTION unset`
- `QMD skipped: qmd CLI unavailable`
- `QMD failed: <short error summary>`
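
The skip conditions above can be checked before running any QMD command. The sketch below is one possible pre-check, not part of QMD itself: the `qmd_precheck` helper is hypothetical, and it only decides whether to skip, using the status strings listed above.

```python
import shutil


def qmd_precheck(env: dict) -> tuple:
    """Return (cli, skip_status); skip_status is None when the refresh should run."""
    if not env.get("QMD_WIKI_COLLECTION"):
        return None, "QMD skipped: QMD_WIKI_COLLECTION unset"
    # Mirrors ${QMD_CLI:-qmd}: fall back when QMD_CLI is unset or empty.
    cli = env.get("QMD_CLI") or "qmd"
    if shutil.which(cli) is None:
        return None, "QMD skipped: qmd CLI unavailable"
    return cli, None
```

A caller would typically pass `dict(os.environ)` and, when `skip_status` is `None`, go on to run the `update`, `embed`, and verification commands shown above.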
@@ -0,0 +1,131 @@
# Hermes Agent: Data Format Reference

Field-level notes for parsing `~/.hermes/` artifacts during wiki ingest.

## Cache Root

`~/.hermes/`, or `$HERMES_HOME` for non-default profiles. All paths below are relative to this root.

## memories/

Each file is one discrete memory the agent persisted.

### Markdown memories (`*.md`)

Optional YAML frontmatter, then prose body:

```markdown
---
tags: [python, async, debugging]
created_at: 2026-03-10T14:22:00Z
project: my-api
---
When using `asyncio.gather` with return_exceptions=True, failed tasks return the exception
object rather than raising; check `isinstance(result, Exception)` on each item.
```

Fields of interest:

- `tags`: maps directly to wiki tags; normalize to kebab-case
- `created_at`: use for provenance / journal category decisions
- `project`: route to `projects/<project>/` when set
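
Tag normalization to kebab-case could look like the following sketch. The `to_kebab_case` helper is hypothetical, and the exact rules (splitting camelCase, collapsing any run of non-alphanumerics into one hyphen) are assumptions about what "normalize" should mean here.

```python
import re


def to_kebab_case(tag: str) -> str:
    """Normalize a tag such as 'Async IO' or 'snake_case' to kebab-case."""
    tag = re.sub(r"([a-z0-9])([A-Z])", r"\1-\2", tag.strip())  # split camelCase
    tag = re.sub(r"[^A-Za-z0-9]+", "-", tag)  # collapse separators into one hyphen
    return tag.strip("-").lower()
```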

### JSON memories (`*.json`)

```json
{
  "content": "...",
  "created_at": "2026-03-10T14:22:00Z",
  "tags": ["python", "async"],
  "project": "my-api",
  "source": "session:abc123"
}
```

Same field semantics as the markdown variant. `source` links back to the originating session.

## sessions/

Present only when session logging is enabled (`config.yaml: logging.sessions: true`).

### Directory layout

```
sessions/
└── 2026-03-10/
    └── abc123.jsonl
```

### JSONL line schemas

**User / assistant turns:**

```json
{"role": "user", "content": "How do I debounce a React input?"}
{"role": "assistant", "content": "Use useCallback + useEffect with a setTimeout..."}
```

**Tool use:**

```json
{
  "type": "tool_use",
  "id": "tu_abc",
  "name": "read_file",
  "input": {"path": "/home/ubuntu/project/src/App.tsx"}
}
```

**Tool result:**

```json
{
  "type": "tool_result",
  "tool_use_id": "tu_abc",
  "content": "..."
}
```
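
Since a `tool_result` refers back to its `tool_use` through `tool_use_id`, the two can be paired as in this sketch. It assumes the field names shown in the schemas above; the `pair_tool_events` helper is hypothetical.

```python
from typing import Optional


def pair_tool_events(events: list[dict]) -> list[tuple[dict, Optional[dict]]]:
    """Pair each tool_use event with its tool_result (None if no result was logged)."""
    results = {
        e["tool_use_id"]: e
        for e in events
        if e.get("type") == "tool_result" and "tool_use_id" in e
    }
    return [
        (e, results.get(e.get("id")))
        for e in events
        if e.get("type") == "tool_use"
    ]
```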

**Session metadata (first line):**

```json
{
  "type": "session_meta",
  "id": "abc123",
  "cwd": "/home/ubuntu/projects/my-app",
  "model": "claude-sonnet-4-6",
  "started_at": "2026-03-10T14:00:00Z"
}
```

`cwd` is the most reliable project inference signal. Use it to route knowledge to the right `projects/<name>/` page.
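
A simple heuristic for project inference is to take the last path component of `cwd` from the metadata line. The sketch below assumes the `session_meta` shape shown above; the `infer_project` helper is hypothetical, and the heuristic would misfire when a session starts in a subdirectory of the project.

```python
import json
from pathlib import PurePosixPath
from typing import Optional


def infer_project(first_line: str) -> Optional[str]:
    """Return the last component of `cwd` from a session_meta line, if any."""
    try:
        meta = json.loads(first_line)
    except json.JSONDecodeError:
        return None
    if not isinstance(meta, dict) or meta.get("type") != "session_meta":
        return None
    name = PurePosixPath(meta.get("cwd") or "").name
    return name or None
```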

## config.yaml

Rarely useful for ingest. Useful fields if needed:

```yaml
model: claude-sonnet-4-6
hermes_home: ~/.hermes   # resolved path, respects $HERMES_HOME
logging:
  sessions: true         # whether session JSONL files are written
  memories: true         # whether memories are persisted
```

## .hub/

Skills Hub state. **Skip entirely during ingest.** Contains:

- `lock.json`: installed skill manifest (not user knowledge)
- `audit.log`: install/update history
- `quarantine/`: flagged skills awaiting review

## Extraction Priority

| Source | Signal | Noise |
|---|---|---|
| `memories/*.md` | High (curated, stable) | Low |
| `memories/*.json` | High (structured) | Low |
| `sessions/**/*.jsonl` (assistant turns) | Medium | Medium |
| `sessions/**/*.jsonl` (tool pairs) | Low | High |
| `config.yaml` | Very low | n/a |
| `.hub/` | None | n/a |