AI-Powered Categorization
EchOS uses Claude AI to automatically categorize and summarize content. This feature helps organize your knowledge base with minimal manual effort.
Overview
The categorization system offers two processing modes:
| Mode | Speed | Output | Best For |
|---|---|---|---|
| Lightweight | ~1-2 seconds | Category + Tags | Quick organization, batch processing |
| Full | ~3-5 seconds | Category + Tags + Gist + Summary + Key Points | Important content, detailed analysis |
Usage
Auto-Categorize on Save
When saving articles or YouTube videos, use the autoCategorize parameter:
Categorize Existing Notes
Use the categorize_note tool to analyze content that’s already saved:
Processing Modes
Lightweight Mode
Speed: ~1-2 seconds
API Cost: ~500 tokens
Output:
- category: Single category (e.g., “programming”, “health”, “finance”)
- tags: 3-5 relevant tags for organization
Best for:
- Quick categorization of many items
- When you don’t need summaries
- Batch processing existing notes
- Speed is more important than detail
Full Mode
Speed: ~3-5 seconds
API Cost: ~2000 tokens
Output:
- category: Single category
- tags: 3-5 relevant tags
- gist: One-sentence summary (max 100 characters)
- summary: Comprehensive summary (2-3 paragraphs)
- keyPoints: 3-5 actionable takeaways
Best for:
- Important articles or videos
- Content you’ll reference frequently
- When you need quick summaries
- Research or learning materials
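The two output shapes above can be written down as types. The following is a minimal sketch, not the actual EchOS definitions: the field names come from the Output lists, while the type names `LightweightResult` and `FullResult`, the `validateFullResult` helper, and the sample values are hypothetical.

```typescript
// Hypothetical type names; field names follow the Output lists above.
interface LightweightResult {
  category: string;
  tags: string[];
}

interface FullResult extends LightweightResult {
  gist: string;
  summary: string;
  keyPoints: string[];
}

// Returns a list of problems; an empty list means the result
// matches the documented limits (3-5 tags, gist of at most 100 characters).
function validateFullResult(r: FullResult): string[] {
  const problems: string[] = [];
  if (r.category.trim() === "") problems.push("category is empty");
  if (r.tags.length < 3 || r.tags.length > 5) problems.push("expected 3-5 tags");
  if (r.gist.length > 100) problems.push("gist exceeds 100 characters");
  if (r.keyPoints.length < 3 || r.keyPoints.length > 5) {
    problems.push("expected 3-5 key points");
  }
  return problems;
}

// Illustrative values only.
const sample: FullResult = {
  category: "programming",
  tags: ["typescript", "validation", "tooling"],
  gist: "Checking categorization output against the documented limits.",
  summary: "A longer summary would go here.",
  keyPoints: ["Validate tags", "Validate gist length", "Validate key points"],
};
```

A check like this can be useful when results are parsed from model output, since a model may occasionally return more tags or a longer gist than requested.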
Categories
The AI automatically selects appropriate categories based on content. Common categories include:
- programming - Software development, coding
- machine-learning - AI, ML, data science
- health - Fitness, nutrition, medical
- finance - Money, investing, economics
- personal - Personal notes, reflections
- work - Professional, career
- productivity - Time management, tools
- science - Research, discoveries
- philosophy - Ideas, thinking
- uncategorized - Fallback when unclear
Technical Details
Architecture
Implementation
The categorization service is in packages/core/src/agent/categorization.ts:
Content Length Limits
- Lightweight mode: First 5,000 characters analyzed
- Full mode: First 10,000 characters analyzed
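The limits above amount to truncating the content before it is sent for analysis. A minimal sketch of that step, assuming a plain character cut; the names `CHAR_LIMITS` and `truncateForMode` are hypothetical and the real service may trim differently:

```typescript
type Mode = "lightweight" | "full";

// Limits taken from the list above.
const CHAR_LIMITS: Record<Mode, number> = {
  lightweight: 5000,
  full: 10000,
};

// Keeps only the leading part of the content for the given mode.
// Note: slice() counts UTF-16 code units, so this is an approximation
// of "characters" for text containing emoji or other astral symbols.
function truncateForMode(content: string, mode: Mode): string {
  return content.slice(0, CHAR_LIMITS[mode]);
}
```

Anything past the limit is not seen by the model, so for long articles the category reflects the opening of the text.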
Error Handling
The service includes robust error handling:
Configuration
Categorization requires an Anthropic API key in .env:
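As an illustration of failing early when the key is missing, here is a small sketch. The variable name ANTHROPIC_API_KEY is the one referenced in Troubleshooting; the `requireApiKey` helper is hypothetical, and how EchOS actually loads .env is not shown here.

```typescript
// Hypothetical helper: reads the key from an env-like object
// (for example process.env in Node.js after .env has been loaded).
function requireApiKey(env: Record<string, string | undefined>): string {
  const key = (env["ANTHROPIC_API_KEY"] ?? "").trim();
  if (key === "") {
    throw new Error("ANTHROPIC_API_KEY is not set; add it to .env before categorizing");
  }
  return key;
}
```

Checking once at startup gives a clearer message than a failed API call later on.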
API Reference
categorizeContent()
- title: Content title
- content: Full text content
- mode: 'lightweight' or 'full'
- apiKey: Anthropic API key
- logger: Pino logger instance
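To make the parameter list concrete, the sketch below models the parameters as a type and routes a call by mode. It is an assumption-heavy illustration: `CategorizeParams`, `Handlers`, and `dispatchByMode` are hypothetical names, the `Logger` interface is a minimal stand-in for a Pino logger, and the real signatures in categorization.ts may differ.

```typescript
type Mode = "lightweight" | "full";

// Minimal stand-in for the Pino logger (only the method used here).
interface Logger {
  info(msg: string): void;
}

// Mirrors the documented parameters; the type name is hypothetical.
interface CategorizeParams {
  title: string;
  content: string;
  mode: Mode;
  apiKey: string;
  logger: Logger;
}

// Handlers are injected so the sketch does not guess at the real
// categorizeLightweight() / processFull() signatures.
interface Handlers<L, F> {
  lightweight: (p: CategorizeParams) => Promise<L>;
  full: (p: CategorizeParams) => Promise<F>;
}

// Logs the request, then calls the handler that matches the mode.
function dispatchByMode<L, F>(
  p: CategorizeParams,
  h: Handlers<L, F>,
): Promise<L | F> {
  p.logger.info(`categorizing "${p.title}" in ${p.mode} mode`);
  return p.mode === "full" ? h.full(p) : h.lightweight(p);
}
```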
categorizeLightweight()
processFull()
Agent Tools
categorize_note
Manually categorize existing notes:
Plugin Parameters
Both save_article and save_youtube support:
Best Practices
When to Use Lightweight Mode
- Processing multiple items at once
- Don’t need summaries immediately
- Want to minimize API costs
- Content is straightforward to categorize
When to Use Full Mode
- Saving important reference material
- Need quick summaries for later
- Content is complex or technical
- Want structured key takeaways
Manual vs. Auto-Categorization
Manual (specify category/tags in the command):
- You know exactly how to categorize
- Content is personal or context-specific
- Want consistent naming
Auto-categorization:
- Discovering new topics
- Content is unfamiliar
- Want objective categorization
- Processing public/professional content
Batch Processing
For categorizing many existing notes:
Performance
Token Usage
| Mode | Input | Output | Total |
|---|---|---|---|
| Lightweight | ~300-400 | ~100-200 | ~500 |
| Full | ~1200-1500 | ~500-800 | ~2000 |
Response Times
Times measured with typical article content (~3000 words):
- Lightweight: 1-2 seconds
- Full: 3-5 seconds
Cost Estimation
Based on Anthropic Claude Haiku 4.5 pricing ($4/MTok output):
| Mode | Cost per Item |
|---|---|
| Lightweight | ~$0.0010 |
| Full | ~$0.0045 |
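The per-item figures follow from token counts multiplied by per-token prices. A rough sketch of that arithmetic is below. The output price of 4 per million tokens comes from the pricing line above; the input price is not stated on this page, so `ASSUMED_INPUT_PRICE` is a placeholder assumption chosen for illustration, and `estimateCostUsd` is a hypothetical helper. Token counts use the upper bounds from the Token Usage table.

```typescript
// Prices are in US dollars per million tokens (MTok).
function estimateCostUsd(
  inputTokens: number,
  outputTokens: number,
  inputPricePerMTok: number,
  outputPricePerMTok: number,
): number {
  return (
    (inputTokens * inputPricePerMTok + outputTokens * outputPricePerMTok) /
    1_000_000
  );
}

const OUTPUT_PRICE = 4; // from the pricing line above
const ASSUMED_INPUT_PRICE = 0.8; // ASSUMPTION: not stated on this page

// Upper-bound token counts from the Token Usage table.
const lightweightCost = estimateCostUsd(400, 200, ASSUMED_INPUT_PRICE, OUTPUT_PRICE);
const fullCost = estimateCostUsd(1500, 800, ASSUMED_INPUT_PRICE, OUTPUT_PRICE);

console.log(lightweightCost.toFixed(4), fullCost.toFixed(4));
```

Substitute current prices for your model before relying on the result, since pricing changes over time.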
Troubleshooting
“Categorization failed”
Cause: API error, rate limit, or invalid key
Fix:
- Check that ANTHROPIC_API_KEY is valid
- Verify that API rate limits have not been exceeded
- Check that the content isn’t too long (>10k chars)
- Review the logs for the specific error
Missing gist/summary in results
Cause: Using lightweight mode
Fix: Use full mode instead:
Incorrect categories
Cause: Content is ambiguous or AI misinterpreted
Fix:
- Provide more context in the content
- Manually specify category
- Add manual tags to guide categorization
Rate limiting
Cause: Too many API requests in short time
Fix:
- Use lightweight mode for batch processing
- Add delays between requests
- Check Anthropic rate limits for your tier
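Adding delays between requests can be sketched as a sequential loop with a pause between calls. This is an illustrative helper, not part of EchOS: `processSequentially` and its parameters are hypothetical, and `worker` stands in for whatever call categorizes a single note.

```typescript
// Resolves after the given number of milliseconds.
const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Runs `worker` on each item in order, pausing `delayMs` between calls.
// `pause` is injectable so the delay can be replaced in tests.
async function processSequentially<T, R>(
  items: T[],
  worker: (item: T) => Promise<R>,
  delayMs: number,
  pause: (ms: number) => Promise<void> = sleep,
): Promise<R[]> {
  const results: R[] = [];
  let first = true;
  for (const item of items) {
    if (!first) await pause(delayMs); // no pause before the first request
    first = false;
    results.push(await worker(item));
  }
  return results;
}
```

Running requests one at a time trades throughput for predictability; a suitable delay depends on the rate limits of your Anthropic tier.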
Examples
Example 1: Save Article with Auto-Categorization
Example 2: Categorize Existing Note (Full Mode)
Example 3: Lightweight Batch Processing
Content Taxonomy
Beyond categorization (category + tags), EchOS tracks content through a lifecycle:
ContentType
| Type | Description | Default Status |
|---|---|---|
| note | User-authored note | read |
| journal | Diary/reflection entry | read |
| conversation | Saved conversation summary | read |
| article | Saved web article | saved |
| youtube | Saved YouTube transcript | saved |
| reminder | Task reminder | n/a |
ContentStatus
| Status | Meaning | When set |
|---|---|---|
| saved | Captured, not yet consumed | Default for article/youtube saves |
| read | User has engaged with the content | Default for authored notes; set explicitly or auto-detected |
| archived | Hidden from normal search | User request or explicit archiving |
- Articles/YouTube start as saved: they’re in the reading list, not the knowledge base
- Authored notes (note, journal, conversation) start as read
- The agent auto-marks content as read when the user begins discussing it
- Use mark_content(id, 'read') to mark explicitly, or ask the agent “I’ve read that article”
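The default statuses from the ContentType table can be captured as a lookup. A minimal sketch, assuming string unions for the two types; the `DEFAULT_STATUS` map and `defaultStatus` helper are hypothetical, and `null` is used here to model the "n/a" entry for reminders.

```typescript
type ContentType =
  | "note"
  | "journal"
  | "conversation"
  | "article"
  | "youtube"
  | "reminder";

type ContentStatus = "saved" | "read" | "archived";

// Defaults copied from the ContentType table; null models "n/a".
const DEFAULT_STATUS: Record<ContentType, ContentStatus | null> = {
  note: "read",
  journal: "read",
  conversation: "read",
  article: "saved",
  youtube: "saved",
  reminder: null,
};

// Returns the status a newly created item of this type starts with.
function defaultStatus(type: ContentType): ContentStatus | null {
  return DEFAULT_STATUS[type];
}
```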
InputSource
Tracks how content was captured:
- text: typed by user (default for create_note)
- voice: from a transcribed voice message
- url: from a pasted URL (save_article, save_youtube)
- file: from a file
Filtering by Status
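As a rough illustration of status filtering, the sketch below filters an in-memory list. It is not the EchOS search API: `NoteMeta` and `filterByStatus` are hypothetical names, and leaving out archived items when no status is given is an assumption based on archived content being hidden from normal search.

```typescript
type ContentStatus = "saved" | "read" | "archived";

// Hypothetical minimal shape of a stored item.
interface NoteMeta {
  id: string;
  status: ContentStatus;
}

// With a status, returns only items in that status.
// Without one, returns everything except archived items.
function filterByStatus<T extends NoteMeta>(
  notes: T[],
  status?: ContentStatus,
): T[] {
  return notes.filter((n) =>
    status === undefined ? n.status !== "archived" : n.status === status,
  );
}
```

For example, filtering by "saved" would correspond to the reading list, while filtering by "read" would correspond to the knowledge base.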
See Also
- Building a Plugin - How to add categorization to custom plugins
- Architecture - System architecture overview
- Core source code - Implementation details