add skills
This commit is contained in:
@@ -0,0 +1,82 @@
|
||||
# Codex Data Format — Detailed Reference
|
||||
|
||||
This reference describes practical, observed structures for Codex local history ingestion.
|
||||
|
||||
## Root Layout
|
||||
|
||||
`~/.codex/` usually contains:
|
||||
|
||||
- `sessions/YYYY/MM/DD/rollout-*.jsonl` — primary structured session logs
|
||||
- `archived_sessions/` — archived rollouts
|
||||
- `session_index.jsonl` — session id/name/update index
|
||||
- `history.jsonl` — transcript history (depends on config)
|
||||
- `config.toml` — history persistence controls
|
||||
|
||||
## Session Index
|
||||
|
||||
`~/.codex/session_index.jsonl` entries are one JSON object per line, commonly:
|
||||
|
||||
```json
|
||||
{"id":"<thread-id>","thread_name":"<title>","updated_at":"<timestamp>"}
|
||||
```
|
||||
|
||||
Use this as the inventory backbone for append/full mode deltas.
|
||||
|
||||
## Rollout JSONL
|
||||
|
||||
`rollout-*.jsonl` files are event streams with envelope fields:
|
||||
|
||||
```json
|
||||
{
|
||||
"timestamp": "2026-04-12T09:40:02.337Z",
|
||||
"type": "session_meta|turn_context|event_msg|response_item",
|
||||
"payload": { "...": "..." }
|
||||
}
|
||||
```
|
||||
|
||||
Common `type` values:
|
||||
|
||||
- `session_meta` — run metadata (id, cwd, model/provider, etc.)
|
||||
- `turn_context` — turn-scoped context envelope
|
||||
- `event_msg` — runtime events (task lifecycle, token/accounting, tool-call markers)
|
||||
- `response_item` — model response items (messages, tool calls, reasoning blocks)
|
||||
|
||||
### Typical payload subtypes
|
||||
|
||||
Observed examples include:
|
||||
|
||||
- `event_msg.payload.type`: `task_started`, `user_message`, `agent_message`, `mcp_tool_call_end`, `exec_command_end`, `token_count`
|
||||
- `response_item.payload.type`: `message`, `function_call`, `function_call_output`, `reasoning`
|
||||
|
||||
## Extraction Strategy
|
||||
|
||||
### Keep
|
||||
|
||||
- User intent from user-message records
|
||||
- Assistant conclusions/decisions from assistant message records
|
||||
- High-signal tool outputs that encode reusable knowledge
|
||||
|
||||
### Skip
|
||||
|
||||
- Pure telemetry (`token_count`, low-level plumbing events)
|
||||
- Internal reasoning traces unless user explicitly asks to retain them
|
||||
- Verbose execution dumps with no durable insight
|
||||
|
||||
## Privacy Notes
|
||||
|
||||
Rollouts can contain sensitive data:
|
||||
|
||||
- Injected instruction layers
|
||||
- Tool inputs/outputs
|
||||
- Potential secrets in command output
|
||||
|
||||
Always redact secrets and summarize instead of copying raw transcript content.
|
||||
|
||||
## Config Interaction
|
||||
|
||||
`~/.codex/config.toml` keys that affect ingestion completeness:
|
||||
|
||||
- `history.persistence = "save-all" | "none"`
|
||||
- `history.max_bytes = <int>` (truncation/compaction cap)
|
||||
|
||||
`codex exec --ephemeral` runs may not persist rollout files.
|
||||
Reference in New Issue
Block a user