MCP Memoria - Unlimited Local AI Memory
Introducing MCP Memoria
I’m excited to announce the release of MCP Memoria, a Model Context Protocol (MCP) server that provides persistent, unlimited memory capabilities for Claude Code and Claude Desktop.
Why Memoria?
When working with AI assistants like Claude, context is everything. But conversations are ephemeral - once a session ends, all that context is lost. MCP Memoria solves this by giving Claude a persistent memory that survives across sessions.
Unlike cloud-based alternatives with storage limits and privacy concerns, Memoria is:
- 100% Local: All data stays on your machine
- Unlimited Storage: No 50 MB caps like those imposed by cloud services
- Zero Cost: Completely free and open source
- Private: Your memories never leave your computer
How It Works
Memoria uses Qdrant for vector storage and Ollama for local embeddings. This means semantic search - finding relevant memories by meaning, not just keywords.
┌────────────────────────────────────────┐
│           Claude Code/Desktop          │
└───────────────────┬────────────────────┘
                    │ MCP Protocol
                    ▼
┌────────────────────────────────────────┐
│           MCP Memoria Server           │
├────────────────────────────────────────┤
│   Tools: store, recall, search, etc.   │
├────────────────────────────────────────┤
│             Memory Manager             │
│  ┌─────────┐  ┌─────────┐  ┌─────────┐ │
│  │Episodic │  │Semantic │  │Procedure│ │
│  └─────────┘  └─────────┘  └─────────┘ │
├────────────────────────────────────────┤
│ Ollama (embeddings) │ Qdrant (vectors) │
└────────────────────────────────────────┘
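Under the hood, semantic search reduces to a nearest-neighbour lookup over embedding vectors. A minimal sketch of the idea, using toy hand-written vectors in place of real nomic-embed-text embeddings and plain cosine similarity in place of Qdrant:

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm

# Toy "embeddings": in Memoria these come from Ollama's nomic-embed-text.
memories = {
    "deploy with ./scripts/deploy.sh": [0.9, 0.1, 0.0],
    "users endpoint is /api/v1/users": [0.1, 0.9, 0.2],
}

def search(query_vec, top_k=1):
    """Return the top_k memory texts ranked by cosine similarity to the query."""
    ranked = sorted(memories, key=lambda m: cosine(query_vec, memories[m]), reverse=True)
    return ranked[:top_k]

# A query vector "about deployment" ranks the deployment memory first.
print(search([0.8, 0.2, 0.1]))
```

Qdrant does exactly this ranking, but over an index that scales to millions of vectors instead of a Python loop.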
Three Types of Memory
Memoria organizes knowledge into three cognitive categories:
- Episodic Memory: Events, conversations, decisions made, problems encountered
- Semantic Memory: Facts, knowledge, API endpoints, configurations, best practices
- Procedural Memory: Workflows, deployment steps, build commands, common patterns
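One way to picture this taxonomy is a simple tagged record. The field names below are illustrative, not Memoria's actual schema:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

MEMORY_TYPES = {"episodic", "semantic", "procedural"}

@dataclass
class Memory:
    """Illustrative memory record tagged with one of the three categories."""
    content: str
    memory_type: str
    importance: float = 0.5
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def __post_init__(self):
        # Reject anything outside the three cognitive categories.
        if self.memory_type not in MEMORY_TYPES:
            raise ValueError(f"unknown memory type: {self.memory_type}")

fact = Memory("The users endpoint is /api/v1/users", "semantic")
```

The type tag lets recall be scoped: "how do we deploy?" should hit procedural memories first, while "what happened last Tuesday?" is episodic.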
Key Features
- Semantic Search: Find relevant memories by meaning
- Memory Consolidation: Automatic merging of similar memories
- Forgetting Curve: Natural decay of unused, low-importance memories
- Project Context: Associate memories with specific projects
- Export/Import: Backup and share your memories
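The forgetting curve can be modelled as exponential decay of a memory's effective importance with time since last access. The exact formula Memoria uses may differ; this is a sketch with an assumed half-life parameter:

```python
def effective_importance(importance, days_since_access, half_life_days=30.0):
    """Exponentially decay importance: it halves every half_life_days of disuse."""
    return importance * 0.5 ** (days_since_access / half_life_days)

# A memory untouched for 60 days (two half-lives) falls to a quarter of its score.
print(effective_importance(0.8, 60))
```

Memories whose decayed score drops below a threshold become candidates for forgetting, while frequently recalled or high-importance memories persist.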
Example Usage
Once configured, just talk naturally to Claude:
# Store memories
Remember that the API endpoint for users is /api/v1/users
Save this procedure: To deploy, run ./scripts/deploy.sh --env prod
# Recall memories
What do you know about the database?
How do we handle authentication in this project?
# Manage memories
Show me the memoria stats
Consolidate memories to merge duplicates
Export all memories to backup.json
Get Started
The repository is available on GitHub: github.com/trapias/memoria
Prerequisites
- Python 3.11+
- Ollama with the nomic-embed-text model
- Docker (optional, for Qdrant server)
Option A: Local Storage (No Docker)
The simplest setup - everything runs locally without Docker:
git clone https://github.com/trapias/memoria.git
cd memoria
./scripts/install.sh
The install script will:
- Check and install Ollama if needed
- Pull the nomic-embed-text embedding model
- Create a Python virtual environment
- Set up local Qdrant storage in ~/.mcp-memoria/qdrant
- Generate the Claude Code configuration
Add to your Claude Code config (~/.claude/config.json):
{
  "mcp_servers": {
    "memoria": {
      "command": "/path/to/memoria/venv/bin/python",
      "args": ["-m", "mcp_memoria"],
      "env": {
        "MEMORIA_QDRANT_PATH": "~/.mcp-memoria/qdrant",
        "MEMORIA_OLLAMA_HOST": "http://localhost:11434"
      }
    }
  }
}
Option B: Docker with Qdrant Server
For better performance and scalability, run Qdrant as a Docker container:
git clone https://github.com/trapias/memoria.git
cd memoria
# Install Python package
pip install -e .
# Start Qdrant container (persistent storage)
cd docker
docker-compose -f docker-compose.qdrant-only.yml up -d
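If you want to see what the compose file does before running it, a Qdrant-only setup typically looks like the following. This is a sketch of the standard pattern, not necessarily the repository's exact docker-compose.qdrant-only.yml:

```yaml
services:
  qdrant:
    image: qdrant/qdrant:latest
    ports:
      - "6333:6333"   # HTTP API the MCP server connects to
    volumes:
      - qdrant_data:/qdrant/storage   # persists memories across restarts

volumes:
  qdrant_data:
```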
Add to your Claude Code config (~/.claude/config.json):
{
  "mcp_servers": {
    "memoria": {
      "command": "python",
      "args": ["-m", "mcp_memoria"],
      "env": {
        "MEMORIA_QDRANT_HOST": "localhost",
        "MEMORIA_QDRANT_PORT": "6333",
        "MEMORIA_OLLAMA_HOST": "http://localhost:11434"
      }
    }
  }
}
Qdrant data persists in a Docker volume - your memories survive container restarts.
Update — January 27, 2026
Since the initial release, Memoria has gained two significant features:
Content Chunking
Long memories are now automatically split into overlapping chunks, each with its own embedding. This dramatically improves semantic search quality for large content — instead of a single embedding trying to represent an entire document, each chunk captures a focused semantic region.
The process is fully transparent: when you store a long memory, Memoria splits it behind the scenes; when you recall or search, results are deduplicated by logical memory, so you always see complete content, never raw chunks. All operations — update, delete, consolidation, export/import — are chunk-aware.
Two new configuration options control this behavior:
| Variable | Default | Description |
|---|---|---|
| MEMORIA_CHUNK_SIZE | 500 | Max characters per chunk |
| MEMORIA_CHUNK_OVERLAP | 50 | Overlap between consecutive chunks |
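A minimal sketch of overlapping chunking under these defaults. Memoria's actual splitter may respect word or sentence boundaries; this one cuts at fixed character offsets:

```python
def chunk(text, size=500, overlap=50):
    """Split text into chunks of at most `size` chars,
    each sharing `overlap` chars with the previous chunk."""
    if size <= overlap:
        raise ValueError("size must exceed overlap")
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

# A 1200-character memory becomes three chunks, each embedded separately.
parts = chunk("a" * 1200)
```

The 50-character overlap means a sentence falling on a chunk boundary still appears intact in at least one chunk, so its meaning is never lost to the embedder.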
Full-Text Match Filtering
The memoria_recall and memoria_search tools now accept an optional text_match parameter. While semantic search finds memories by meaning, text_match adds an exact keyword filter — useful when you need a specific term to appear in the content. The two can be combined: semantic similarity narrows by meaning, text_match ensures the keyword is present.
# Semantic search + keyword filter
Recall memories about deployment that mention "staging"
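Conceptually, the filter is an exact-substring check applied on top of the semantically ranked results. A sketch of the combination (Memoria applies the filter through Qdrant's filtering, not in Python like this):

```python
def apply_text_match(ranked_memories, text_match=None):
    """Keep only semantically ranked results whose content contains the keyword."""
    if text_match is None:
        return ranked_memories
    needle = text_match.lower()
    return [m for m in ranked_memories if needle in m.lower()]

# Both results are "about deployment"; only one mentions "staging".
results = ["Deploy to staging with ./scripts/deploy.sh", "Production deploy checklist"]
filtered = apply_text_match(results, "staging")
```

Semantic similarity does the heavy lifting of relevance; the keyword filter then guarantees precision when a specific term must be present.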
Both features are available in the latest version on GitHub.
Feedback
This is an open-source project and your feedback is invaluable! Please open an issue on GitHub for bugs, feature requests, or suggestions.
License
MCP Memoria is released under the Apache 2.0 license.