Architecture
EchOS is structured as a pnpm monorepo where a central agent core handles all reasoning, and separate packages implement each interface and storage layer. This document covers how a message moves through the system, how packages depend on each other, and how the three-layer storage architecture keeps data consistent.
Data Flow
Package Dependencies
Scheduler & Notifications
The scheduler package (@echos/scheduler) runs background jobs via BullMQ + Redis. It is opt-in via ENABLE_SCHEDULER=true and requires a running Redis instance.
Notification delivery is decoupled via NotificationService (defined in @echos/shared). The Telegram package provides the concrete implementation; the scheduler receives it via dependency injection and never imports @echos/telegram directly. When Telegram is disabled, a log-only fallback is used.
Workers:
- Digest: Creates a throwaway AI agent to summarize recent notes and reminders, broadcasts the result
- Reminder check: Queries SQLite for overdue reminders and sends notifications
- Content processing: Processes article/YouTube URLs queued by the agent
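The decoupling described above can be sketched as follows. This is a minimal illustration, not the project's code: the `notify` method, the `Reminder` shape, and `createReminderCheck` are hypothetical names, and the real `NotificationService` in @echos/shared is most likely asynchronous.

```typescript
// Hypothetical shape: the real NotificationService in @echos/shared may differ.
interface NotificationService {
  notify(userId: number, text: string): void;
}

// Log-only fallback, standing in for the case where Telegram is disabled.
class LogOnlyNotificationService implements NotificationService {
  readonly lines: string[] = [];

  notify(userId: number, text: string): void {
    this.lines.push(`[notify] user=${userId} ${text}`);
  }
}

interface Reminder {
  userId: number;
  title: string;
  dueAt: number; // epoch milliseconds
}

// The worker depends only on the interface, never on a concrete transport.
function createReminderCheck(notifier: NotificationService) {
  return (reminders: Reminder[], now: number): number => {
    const overdue = reminders.filter((r) => r.dueAt <= now);
    for (const r of overdue) {
      notifier.notify(r.userId, `Reminder: ${r.title}`);
    }
    return overdue.length;
  };
}

const fallback = new LogOnlyNotificationService();
const checkReminders = createReminderCheck(fallback);
const sent = checkReminders(
  [
    { userId: 1, title: "Renew passport", dueAt: 1_000 },
    { userId: 1, title: "Call dentist", dueAt: 9_000 },
  ],
  5_000,
);
```

Because the worker only sees the interface, swapping the fallback for a Telegram-backed implementation requires no change to the scheduler code.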
Export Utility
Location: packages/core/src/export/index.ts
The export utility provides pure serialization functions for converting notes into downloadable file formats. It is used by the export_notes agent tool and is independent of any interface or storage layer.
Formats
| Format | Function | Output |
|---|---|---|
| markdown | exportToMarkdown(note) | Full markdown file with YAML frontmatter (reads raw file from disk when available; reconstructs from SQLite otherwise) |
| text | exportToText(note) | Plain text with markdown syntax stripped (headings, bold, links, list markers, etc. removed) |
| json | exportToJson(notes) | JSON array of { metadata, content } objects |
| zip | exportToZip(notes) | ZIP archive of .md files (one per note), deduplicated filenames |
export_notes Agent Tool
Location: packages/core/src/agent/tools/export-notes.ts
The tool selects notes (by ID or filter), serializes them, and returns an ExportFileResult JSON string in its tool result. Interfaces intercept this via the tool_execution_end agent event (which exposes event.result) and deliver the file:
- Single note, markdown/text → returned inline (no file written to disk)
- Multiple notes, or json/zip format → written to data/exports/export-{timestamp}.{ext}
When multiple notes are requested in markdown or text format, the tool automatically upgrades to zip.
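The delivery rules above can be expressed as a small decision function. The sketch below is illustrative only: `planExport`, `exportFileName`, and the extension mapping are hypothetical and not taken from export-notes.ts.

```typescript
type ExportFormat = "markdown" | "text" | "json" | "zip";

interface ExportPlan {
  format: ExportFormat; // format actually produced
  delivery: "inline" | "file";
}

// Hypothetical helper mirroring the rules listed above.
function planExport(noteCount: number, requested: ExportFormat): ExportPlan {
  const isTextual = requested === "markdown" || requested === "text";
  if (isTextual && noteCount === 1) {
    return { format: requested, delivery: "inline" };
  }
  if (isTextual) {
    // Several notes in a single-note format: upgrade to a zip archive.
    return { format: "zip", delivery: "file" };
  }
  return { format: requested, delivery: "file" };
}

// Assumed mapping from format to file extension.
const EXTENSIONS: Record<ExportFormat, string> = {
  markdown: "md",
  text: "txt",
  json: "json",
  zip: "zip",
};

// Follows the export-{timestamp}.{ext} pattern.
function exportFileName(format: ExportFormat, timestamp: number): string {
  return `export-${timestamp}.${EXTENSIONS[format]}`;
}
```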
Exports Directory
Export files are written to data/exports/ (configurable via exportsDir in AgentDeps). Files are cleaned up automatically after 1 hour by the export cleanup cron job in the scheduler (export-cleanup, runs hourly at 0 * * * *).
Interface Delivery
| Interface | Delivery mechanism |
|---|---|
| Telegram | ctx.replyWithDocument(new InputFile(buffer, fileName)) after agent completes; temp file deleted immediately |
| CLI | Inline content → stdout; file exports → --output path or ./fileName in CWD; path printed to stderr |
| Web | GET /api/export/:fileName download endpoint; agent includes the URL in its text response |
Plugin Architecture
Content processors live in plugins/ as separate workspace packages. Each plugin:
- Implements the EchosPlugin interface from @echos/core
- Returns agent tools from its setup(context) method
- Receives a PluginContext with storage, embeddings, logger, and config
- Is registered via PluginRegistry in the application entry point
@echos/core.
Domain-specific processors (YouTube, article, image, etc.) are plugins.
Plugins can optionally use the AI categorization service from @echos/core to automatically extract category, tags, gist, summary, and key points from content. See Categorization for details.
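To make the plugin contract concrete, here is a simplified sketch. The type shapes are assumptions: the real `EchosPlugin`, `PluginContext`, and `PluginRegistry` in @echos/core carry more fields (storage, embeddings, config) and may use different method names than `register` and `setupAll`. The `word-count` plugin is invented for illustration.

```typescript
// Hypothetical, simplified shapes; not the real @echos/core types.
interface AgentTool {
  name: string;
  execute(input: string): string;
}

interface PluginContext {
  logger: { info(message: string): void };
}

interface EchosPlugin {
  name: string;
  setup(context: PluginContext): AgentTool[];
}

class PluginRegistry {
  private readonly plugins: EchosPlugin[] = [];

  register(plugin: EchosPlugin): void {
    this.plugins.push(plugin);
  }

  // Calls setup() on every plugin and collects the tools they return.
  setupAll(context: PluginContext): AgentTool[] {
    const tools: AgentTool[] = [];
    for (const plugin of this.plugins) {
      tools.push(...plugin.setup(context));
    }
    return tools;
  }
}

// Invented example plugin exposing a single tool.
const wordCountPlugin: EchosPlugin = {
  name: "word-count",
  setup(context) {
    context.logger.info("word-count plugin ready");
    return [
      {
        name: "count_words",
        execute: (input) =>
          String(input.trim().split(/\s+/).filter(Boolean).length),
      },
    ];
  },
};

const registry = new PluginRegistry();
registry.register(wordCountPlugin);
const logLines: string[] = [];
const pluginTools = registry.setupAll({
  logger: { info: (message) => { logLines.push(message); } },
});
```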
Available Plugins
- article: Web article extraction using Readability
- youtube: YouTube video transcript extraction
- image: Image storage with metadata extraction (format, dimensions, EXIF)
- content-creation: Content generation tools
Storage Architecture
SQLite (better-sqlite3): Structured metadata index, FTS5 full-text search, memory store, reminders. The memory table stores long-term personal facts with a confidence score (0–1) and kind (fact, preference, person, project, expertise). Notes also store a content_hash (SHA-256) used to detect changes and skip unnecessary re-embedding. The status column tracks content lifecycle (saved, read, archived) and input_source records how content was captured (text, voice, url, file, image). For images, additional columns store image_path (local file path), image_url (source URL), image_metadata (JSON with dimensions, format, EXIF), and ocr_text (for future OCR support).
LanceDB (embedded): Vector embeddings for semantic search. No server process needed.
Markdown files: Source of truth. YAML frontmatter with structured metadata. Directory layout: knowledge/{type}/{category}/{date}-{slug}.md. Images are stored in knowledge/image/{category}/{hash}.{ext} and referenced from markdown notes.
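As a rough illustration of two details above, the sketch below computes a SHA-256 content hash and builds a path in the knowledge/{type}/{category}/{date}-{slug}.md layout. The slug rule, the date format, and all function names are assumptions; whether the real hash covers the whole file or only the body is not stated here.

```typescript
import { createHash } from "node:crypto";

// SHA-256 hex digest of the note content, as kept in content_hash.
function contentHash(content: string): string {
  return createHash("sha256").update(content, "utf8").digest("hex");
}

// Hypothetical slug rule: lowercase, runs of non-alphanumerics become "-".
function slugify(title: string): string {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Builds knowledge/{type}/{category}/{date}-{slug}.md
function notePath(
  type: string,
  category: string,
  date: string,
  title: string,
): string {
  return `knowledge/${type}/${category}/${date}-${slugify(title)}.md`;
}

// Re-embedding is only needed when the stored hash no longer matches.
function needsReembedding(
  storedHash: string | undefined,
  content: string,
): boolean {
  return storedHash !== contentHash(content);
}
```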
Storage Sync
EchOS keeps the three storage layers in sync automatically, even when markdown files are added or edited outside the application:
Startup reconciliation (reconcileStorage in packages/core/src/storage/reconciler.ts):
Runs once at boot. Scans all .md files in the knowledge directory and compares them against SQLite using the content_hash column:
- New file → full upsert to SQLite + generate embedding in LanceDB
- Content changed → update SQLite + re-embed (OpenAI called only when content hash differs)
- File moved (same hash, different path) → update file path in SQLite only, no re-embed
- No change → skipped entirely
- SQLite record with no file on disk → deleted from SQLite and LanceDB
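The five reconciliation cases can be sketched as a pure planning function. This is an assumption-laden illustration, not the code in reconciler.ts: the types and `planReconciliation` are hypothetical, and moved files are matched purely by content hash here.

```typescript
interface DiskFile {
  path: string;
  hash: string;
}

interface IndexedNote {
  id: string;
  path: string;
  hash: string;
}

type Action =
  | { kind: "create"; path: string } // upsert + embed
  | { kind: "update"; id: string } // update SQLite + re-embed
  | { kind: "move"; id: string; path: string } // path only, no re-embed
  | { kind: "skip"; id: string }
  | { kind: "delete"; id: string }; // remove from SQLite + LanceDB

function planReconciliation(files: DiskFile[], notes: IndexedNote[]): Action[] {
  const diskPaths = new Set(files.map((f) => f.path));
  const byPath = new Map<string, IndexedNote>();
  // Records whose file vanished are candidates for "moved" (matched by hash).
  const orphansByHash = new Map<string, IndexedNote>();
  for (const note of notes) {
    byPath.set(note.path, note);
    if (!diskPaths.has(note.path)) orphansByHash.set(note.hash, note);
  }

  const actions: Action[] = [];
  for (const file of files) {
    const existing = byPath.get(file.path);
    if (existing) {
      actions.push(
        existing.hash === file.hash
          ? { kind: "skip", id: existing.id }
          : { kind: "update", id: existing.id },
      );
      continue;
    }
    const moved = orphansByHash.get(file.hash);
    if (moved) {
      orphansByHash.delete(file.hash);
      actions.push({ kind: "move", id: moved.id, path: file.path });
    } else {
      actions.push({ kind: "create", path: file.path });
    }
  }
  // Anything still orphaned has no file on disk at all.
  orphansByHash.forEach((orphan) => {
    actions.push({ kind: "delete", id: orphan.id });
  });
  return actions;
}
```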
File watcher (createFileWatcher in packages/core/src/storage/watcher.ts):
Uses chokidar to watch knowledge/**/*.md while the app is running. Events are debounced (500 ms) and awaitWriteFinish is enabled to handle atomic saves from editors (VS Code, Obsidian, etc.):
- add/change → parse, compare content hash, upsert if changed (re-embed only on content change)
- unlink → look up note by file path in SQLite, delete from SQLite + LanceDB
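The debouncing step can be illustrated without chokidar itself. The sketch below assumes events are debounced per file path, which this document does not confirm; `createPathDebouncer` and its methods are hypothetical names.

```typescript
type FileHandler = (path: string) => void;

// Collapses a burst of events for the same path into one handler call.
function createPathDebouncer(handler: FileHandler, delayMs = 500) {
  const timers = new Map<string, ReturnType<typeof setTimeout>>();

  return {
    // Restarts the timer for this path; the handler fires after a quiet period.
    schedule(path: string): void {
      const existing = timers.get(path);
      if (existing !== undefined) clearTimeout(existing);
      timers.set(
        path,
        setTimeout(() => {
          timers.delete(path);
          handler(path);
        }, delayMs),
      );
    },

    // Number of paths with a handler call still pending.
    pendingCount(): number {
      return timers.size;
    },

    // Cancels all pending calls, for example on shutdown.
    cancelAll(): void {
      timers.forEach((timer) => clearTimeout(timer));
      timers.clear();
    },
  };
}
```

In a watcher, `schedule` would be called from the add and change event callbacks, so an editor that writes a file several times in quick succession triggers only one parse.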
Search
Search supports three strategies; the hybrid mode merges the keyword and semantic rankings via Reciprocal Rank Fusion (RRF):
- Keyword (FTS5): BM25-ranked full-text search across title, content, tags
- Semantic (LanceDB): Cosine similarity on OpenAI embeddings
- Hybrid: RRF fusion of keyword + semantic results
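For readers unfamiliar with RRF, here is a minimal sketch of the fusion step. Each document scores the sum of 1 / (k + rank) over the lists it appears in. The constant k = 60 is the value commonly used for RRF and is an assumption here; the function name and the sample IDs are illustrative.

```typescript
// Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank(d)).
// k = 60 is a common default; the value EchOS uses is not stated here.
function reciprocalRankFusion(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

const keywordHits = ["a", "b", "c"]; // e.g. FTS5 order
const semanticHits = ["b", "d", "a"]; // e.g. vector similarity order
const fused = reciprocalRankFusion([keywordHits, semanticHits]);
// fused is ["b", "a", "d", "c"]: items found by both strategies rise to the top.
```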
Memory System
Long-term memory (remember_about_me / recall_knowledge tools) uses a hybrid strategy to balance cost and recall:
- At agent creation (including after /reset): the top 15 memories ranked by confidence DESC, updated DESC are injected directly into the system prompt as “Known Facts About the User”. This ensures core personal facts are always available without an explicit tool call.
- On-demand retrieval: if more than 15 memories exist, recall_knowledge searches the full memory table using word-tokenised LIKE queries. The system prompt notes additional memories are available so the agent knows to use the tool.
/reset only clears the conversation history — all stored memories persist in SQLite and are reloaded into the next session automatically.
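Both halves of this strategy can be sketched in a few lines. The `MemoryRecord` shape, the `content` column name, and the choice to join the LIKE clauses with OR are assumptions made for illustration.

```typescript
interface MemoryRecord {
  content: string;
  confidence: number; // 0 to 1
  updated: number; // epoch milliseconds
}

// Top memories for the system prompt: confidence DESC, then updated DESC.
function selectPromptMemories(
  memories: MemoryRecord[],
  limit = 15,
): MemoryRecord[] {
  return [...memories]
    .sort((a, b) => b.confidence - a.confidence || b.updated - a.updated)
    .slice(0, limit);
}

// One LIKE clause per word of the query (joined with OR as an assumption).
function buildLikeQuery(query: string): { where: string; params: string[] } {
  const words = query
    .toLowerCase()
    .split(/\s+/)
    .filter((word) => word.length > 0);
  return {
    where: words.map(() => "content LIKE ?").join(" OR "),
    params: words.map((word) => `%${word}%`),
  };
}
```

Using bound parameters rather than string interpolation keeps the user's query text out of the SQL itself.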
Custom Agent Message Types
EchOS extends the AgentMessage union from @mariozechner/pi-agent-core via TypeScript declaration merging in packages/core/src/agent/messages.ts.
echos_context
The convertToLlm function (echosConvertToLlm) prepends the context content to the immediately following user message before the LLM call. Custom messages are preserved in agent.state.messages for debugging but never sent standalone to the LLM.
Helpers exported from @echos/core:
- createContextMessage(content): creates an echos_context message
- createUserMessage(content): creates a typed user message
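The merging behaviour can be illustrated with simplified message types. Everything below is a sketch under assumptions: the real messages from @mariozechner/pi-agent-core have richer shapes, the separator between context and user text is a guess, and dropping a context message that is not followed by a user message is this sketch's choice.

```typescript
// Simplified, hypothetical message shapes.
type EchosMessage =
  | { role: "echos_context"; content: string }
  | { role: "user"; content: string }
  | { role: "assistant"; content: string };

type LlmMessage = { role: "user" | "assistant"; content: string };

// Folds each echos_context message into the user message that follows it.
function convertToLlmSketch(messages: EchosMessage[]): LlmMessage[] {
  const out: LlmMessage[] = [];
  let pendingContext: string[] = [];

  for (const message of messages) {
    if (message.role === "echos_context") {
      pendingContext.push(message.content);
    } else if (message.role === "user") {
      const content = [...pendingContext, message.content].join("\n\n");
      out.push({ role: "user", content });
      pendingContext = [];
    } else {
      // Context not immediately followed by a user message is dropped here.
      pendingContext = [];
      out.push({ role: "assistant", content: message.content });
    }
  }
  return out;
}
```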
AI Categorization — Streaming with Progressive JSON
The categorization service (packages/core/src/agent/categorization.ts) uses streamSimple from @mariozechner/pi-ai instead of a blocking fetch. As the LLM streams its JSON response, parseStreamingJson parses each partial chunk; it never throws and returns {} on incomplete input.
When new fields become fully formed in the partial JSON, an optional onProgress callback fires:
- "Category: programming", as soon as category is resolved
- "Tags: typescript, api", updated each time a new tag appears
- "Gist: One sentence summary.", once the gist looks complete (>20 chars, ends with punctuation); full mode only
categorizeLightweight and processFull accept onProgress?: (message: string) => void. Callers that don’t need progressive updates (e.g. the scheduler digest worker) pass no callback and get the same blocking behaviour as before.
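One way to derive progress messages is to compare successive partial parses. The sketch below is a simplification: it treats a category as resolved as soon as its value changes, which is cruder than what the service likely does, and `progressMessages` and `gistLooksComplete` are hypothetical names.

```typescript
interface PartialCategorization {
  category?: string;
  tags?: string[];
  gist?: string;
}

// Heuristic from the text: longer than 20 characters, ends with punctuation.
function gistLooksComplete(gist: string): boolean {
  return gist.length > 20 && /[.!?]$/.test(gist);
}

// Compares two successive partial parses and returns the messages to emit.
function progressMessages(
  prev: PartialCategorization,
  next: PartialCategorization,
): string[] {
  const messages: string[] = [];

  if (next.category && next.category !== prev.category) {
    messages.push(`Category: ${next.category}`);
  }

  const prevTags = prev.tags ?? [];
  const nextTags = next.tags ?? [];
  if (nextTags.length > prevTags.length) {
    messages.push(`Tags: ${nextTags.join(", ")}`);
  }

  // Emit the gist once, the first time it passes the completeness check.
  const prevDone = prev.gist !== undefined && gistLooksComplete(prev.gist);
  if (next.gist !== undefined && gistLooksComplete(next.gist) && !prevDone) {
    messages.push(`Gist: ${next.gist}`);
  }
  return messages;
}
```

A caller would keep the previous partial object, run this after each streamed chunk, and pass every returned string to `onProgress`.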
Context Overflow Detection
The agent uses a two-layer approach to context window management:
Layer 1 — Proactive pruning (createContextWindow in context-manager.ts):
Runs before every LLM call via transformContext. Estimates token usage and slides the message window back to the nearest user-turn boundary until the budget fits. This should prevent overflows under normal operation.
Layer 2 — Reactive detection (isAgentMessageOverflow in context-manager.ts):
If a provider rejects the request despite pruning (e.g. single oversized message, model switch, token estimation drift), the last assistant message is checked against isContextOverflow from @mariozechner/pi-ai, which matches provider-specific error patterns for Anthropic, OpenAI, Gemini, Groq, Mistral, OpenRouter, and others.
On overflow detection:
- Telegram: Replies with “Conversation history is too long. Use /reset to start a new session.” instead of a raw provider error string.
- Web API: Returns HTTP 413 with a structured error body ({ error: "Conversation history is too long. Please reset your session." }).
isAgentMessageOverflow(message, contextWindow) is exported from @echos/core for use in any interface adapter.
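The proactive layer can be illustrated with a small pruning function. This is a sketch, not the logic in context-manager.ts: the four-characters-per-token estimate, the `ChatMessage` shape, and `pruneToBudget` are assumptions.

```typescript
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// Rough estimate: about 4 characters per token (not a real tokenizer).
function estimateTokens(messages: ChatMessage[]): number {
  return messages.reduce(
    (sum, message) => sum + Math.ceil(message.content.length / 4),
    0,
  );
}

// Drops the oldest turns, always cutting at a user message, until the
// estimated size fits the budget.
function pruneToBudget(messages: ChatMessage[], budget: number): ChatMessage[] {
  let kept = messages;
  while (estimateTokens(kept) > budget) {
    // Next user-turn boundary after the current start of the window.
    const next = kept.findIndex((m, i) => i > 0 && m.role === "user");
    if (next === -1) break; // one oversized turn left: the reactive layer's job
    kept = kept.slice(next);
  }
  return kept;
}
```

The `break` branch shows why a second, reactive layer is still needed: a single oversized turn cannot be pruned away at a user-turn boundary.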
Agent Session Caching
Each agent instance is assigned a sessionId at creation time, forwarded to LLM providers that support session-aware prompt caching:
| Interface | Session ID format |
|---|---|
| Telegram | telegram-{userId} |
| Web | web-{userId} |
| CLI (pnpm echos) | cli-local |
- Anthropic: Extends prompt cache TTL from the default 5 minutes to longer durations. Set PI_CACHE_RETENTION=long for 1-hour retention.
- OpenAI: Enables 24-hour in-memory cache reuse across calls.
sessionId is present.
Security
- User authentication via Telegram user ID whitelist
- SSRF prevention on all URL fetching
- HTML sanitization via DOMPurify
- Rate limiting (token bucket per user)
- Structured audit logging
- Secret redaction in Pino logs
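As an illustration of the per-user token bucket mentioned above, here is a minimal sketch. The capacity and refill rate are arbitrary, the clock is passed in explicitly to keep the example deterministic, and the class names are hypothetical rather than taken from the EchOS codebase.

```typescript
class TokenBucket {
  private tokens: number;
  private lastRefill: number;
  private readonly capacity: number;
  private readonly refillPerSecond: number;

  constructor(capacity: number, refillPerSecond: number, now: number) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity; // start full
    this.lastRefill = now;
  }

  // Refills based on elapsed time, then tries to take one token.
  tryConsume(now: number): boolean {
    const elapsedSeconds = Math.max(0, (now - this.lastRefill) / 1000);
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsedSeconds * this.refillPerSecond,
    );
    this.lastRefill = now;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

class PerUserRateLimiter {
  private readonly buckets = new Map<number, TokenBucket>();
  private readonly capacity: number;
  private readonly refillPerSecond: number;

  constructor(capacity: number, refillPerSecond: number) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
  }

  // Each user gets an independent bucket, created on first use.
  allow(userId: number, now: number): boolean {
    let bucket = this.buckets.get(userId);
    if (!bucket) {
      bucket = new TokenBucket(this.capacity, this.refillPerSecond, now);
      this.buckets.set(userId, bucket);
    }
    return bucket.tryConsume(now);
  }
}
```

A bucket allows short bursts up to its capacity while bounding the sustained request rate to the refill rate.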