MCP Memoria - Unlimited Local AI Memory
Introducing MCP Memoria
I’m excited to announce the release of MCP Memoria, a Model Context Protocol (MCP) server that provides persistent, unlimited memory capabilities for Claude Code and Claude Desktop.
Why Memoria?
When working with AI assistants like Claude, context is everything. But conversations are ephemeral - once a session ends, all that context is lost. MCP Memoria solves this by giving Claude a persistent memory that survives across sessions.
Unlike cloud-based alternatives with storage limits and privacy concerns, Memoria is:
- 100% Local: All data stays on your machine
- Unlimited Storage: No 50 MB caps like those imposed by cloud services
- Zero Cost: Completely free and open source
- Private: Your memories never leave your computer
How It Works
Memoria uses Qdrant for vector storage and Ollama for local embeddings. This means semantic search - finding relevant memories by meaning, not just keywords.
┌────────────────────────────────────────┐
│           Claude Code/Desktop          │
└───────────────────┬────────────────────┘
                    │ MCP Protocol
                    ▼
┌────────────────────────────────────────┐
│           MCP Memoria Server           │
├────────────────────────────────────────┤
│   Tools: store, recall, search, etc.   │
├────────────────────────────────────────┤
│             Memory Manager             │
│  ┌─────────┐  ┌─────────┐  ┌─────────┐ │
│  │Episodic │  │Semantic │  │Procedure│ │
│  └─────────┘  └─────────┘  └─────────┘ │
├────────────────────────────────────────┤
│ Ollama (embeddings) │ Qdrant (vectors) │
└────────────────────────────────────────┘
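Under the hood, semantic search reduces to a nearest-neighbour lookup over embedding vectors. A minimal sketch of the idea, using toy hand-written vectors in place of real nomic-embed-text embeddings and plain cosine similarity in place of Qdrant:

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm

# Toy "embeddings": in Memoria these come from Ollama's nomic-embed-text.
memories = {
    "deploy with ./scripts/deploy.sh": [0.9, 0.1, 0.0],
    "users endpoint is /api/v1/users": [0.1, 0.9, 0.2],
}

def search(query_vec, top_k=1):
    """Return the top_k memory texts ranked by cosine similarity to the query."""
    ranked = sorted(memories, key=lambda m: cosine(query_vec, memories[m]), reverse=True)
    return ranked[:top_k]

# A query vector "about deployment" ranks the deployment memory first.
print(search([0.8, 0.2, 0.1]))
```

Qdrant does exactly this ranking, but over an index that scales to millions of vectors instead of a Python loop.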
Three Types of Memory
Memoria organizes knowledge into three cognitive categories:
- Episodic Memory: Events, conversations, decisions made, problems encountered
- Semantic Memory: Facts, knowledge, API endpoints, configurations, best practices
- Procedural Memory: Workflows, deployment steps, build commands, common patterns
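One way to picture this taxonomy is a simple tagged record. The field names below are illustrative, not Memoria's actual schema:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

MEMORY_TYPES = {"episodic", "semantic", "procedural"}

@dataclass
class Memory:
    """Illustrative memory record tagged with one of the three categories."""
    content: str
    memory_type: str
    importance: float = 0.5
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def __post_init__(self):
        # Reject anything outside the three cognitive categories.
        if self.memory_type not in MEMORY_TYPES:
            raise ValueError(f"unknown memory type: {self.memory_type}")

fact = Memory("The users endpoint is /api/v1/users", "semantic")
```

The type tag lets recall be scoped: "how do we deploy?" should hit procedural memories first, while "what happened last Tuesday?" is episodic.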
Key Features
- Semantic Search: Find relevant memories by meaning
- Memory Consolidation: Automatic merging of similar memories
- Forgetting Curve: Natural decay of unused, low-importance memories
- Project Context: Associate memories with specific projects
- Export/Import: Backup and share your memories
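The forgetting curve can be modelled as exponential decay of a memory's effective importance with time since last access. The exact formula Memoria uses may differ; this is a sketch with an assumed half-life parameter:

```python
def effective_importance(importance, days_since_access, half_life_days=30.0):
    """Exponentially decay importance: it halves every half_life_days of disuse."""
    return importance * 0.5 ** (days_since_access / half_life_days)

# A memory untouched for 60 days (two half-lives) falls to a quarter of its score.
print(effective_importance(0.8, 60))
```

Memories whose decayed score drops below a threshold become candidates for forgetting, while frequently recalled or high-importance memories persist.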
Example Usage
Once configured, just talk naturally to Claude:
# Store memories
Remember that the API endpoint for users is /api/v1/users
Save this procedure: To deploy, run ./scripts/deploy.sh --env prod
# Recall memories
What do you know about the database?
How do we handle authentication in this project?
# Manage memories
Show me the memoria stats
Consolidate memories to merge duplicates
Export all memories to backup.json
Get Started
The repository is available on GitHub: github.com/trapias/memoria
Prerequisites
- Python 3.11+
- Ollama with the nomic-embed-text model
- Docker (optional, for Qdrant server)
Option A: Local Storage (No Docker)
The simplest setup - everything runs locally without Docker:
git clone https://github.com/trapias/memoria.git
cd memoria
./scripts/install.sh
The install script will:
- Check and install Ollama if needed
- Pull the nomic-embed-text embedding model
- Create a Python virtual environment
- Set up local Qdrant storage in ~/.mcp-memoria/qdrant
- Generate the Claude Code configuration
Add to your Claude Code config (~/.claude/config.json):
{
  "mcp_servers": {
    "memoria": {
      "command": "/path/to/memoria/venv/bin/python",
      "args": ["-m", "mcp_memoria"],
      "env": {
        "MEMORIA_QDRANT_PATH": "~/.mcp-memoria/qdrant",
        "MEMORIA_OLLAMA_HOST": "http://localhost:11434"
      }
    }
  }
}
Option B: Docker with Qdrant Server
For better performance and scalability, run Qdrant as a Docker container:
git clone https://github.com/trapias/memoria.git
cd memoria
# Install Python package
pip install -e .
# Start Qdrant container (persistent storage)
cd docker
docker-compose -f docker-compose.qdrant-only.yml up -d
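If you want to see what the compose file does before running it, a Qdrant-only setup typically looks like the following. This is a sketch of the standard pattern, not necessarily the repository's exact docker-compose.qdrant-only.yml:

```yaml
services:
  qdrant:
    image: qdrant/qdrant:latest
    ports:
      - "6333:6333"   # HTTP API the MCP server connects to
    volumes:
      - qdrant_data:/qdrant/storage   # persists memories across restarts

volumes:
  qdrant_data:
```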
Add to your Claude Code config (~/.claude/config.json):
{
  "mcp_servers": {
    "memoria": {
      "command": "python",
      "args": ["-m", "mcp_memoria"],
      "env": {
        "MEMORIA_QDRANT_HOST": "localhost",
        "MEMORIA_QDRANT_PORT": "6333",
        "MEMORIA_OLLAMA_HOST": "http://localhost:11434"
      }
    }
  }
}
Qdrant data persists in a Docker volume - your memories survive container restarts.
Update — January 27, 2026
Since the initial release, Memoria has gained two significant features:
Content Chunking
Long memories are now automatically split into overlapping chunks, each with its own embedding. This dramatically improves semantic search quality for large content — instead of a single embedding trying to represent an entire document, each chunk captures a focused semantic region.
The process is fully transparent: when you store a long memory, Memoria splits it behind the scenes; when you recall or search, results are deduplicated by logical memory, so you always see complete content, never raw chunks. All operations — update, delete, consolidation, export/import — are chunk-aware.
Two new configuration options control this behavior:
| Variable | Default | Description |
|---|---|---|
| MEMORIA_CHUNK_SIZE | 500 | Max characters per chunk |
| MEMORIA_CHUNK_OVERLAP | 50 | Overlap between consecutive chunks |
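A minimal sketch of overlapping chunking under these defaults. Memoria's actual splitter may respect word or sentence boundaries; this one cuts at fixed character offsets:

```python
def chunk(text, size=500, overlap=50):
    """Split text into chunks of at most `size` chars,
    each sharing `overlap` chars with the previous chunk."""
    if size <= overlap:
        raise ValueError("size must exceed overlap")
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

# A 1200-character memory becomes three chunks, each embedded separately.
parts = chunk("a" * 1200)
```

The 50-character overlap means a sentence falling on a chunk boundary still appears intact in at least one chunk, so its meaning is never lost to the embedder.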
Full-Text Match Filtering
The memoria_recall and memoria_search tools now accept an optional text_match parameter. While semantic search finds memories by meaning, text_match adds an exact keyword filter — useful when you need a specific term to appear in the content. The two can be combined: semantic similarity narrows by meaning, text_match ensures the keyword is present.
# Semantic search + keyword filter
Recall memories about deployment that mention "staging"
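Conceptually, the filter is an exact-substring check applied on top of the semantically ranked results. A sketch of the combination (Memoria applies the filter through Qdrant's filtering, not in Python like this):

```python
def apply_text_match(ranked_memories, text_match=None):
    """Keep only semantically ranked results whose content contains the keyword."""
    if text_match is None:
        return ranked_memories
    needle = text_match.lower()
    return [m for m in ranked_memories if needle in m.lower()]

# Both results are "about deployment"; only one mentions "staging".
results = ["Deploy to staging with ./scripts/deploy.sh", "Production deploy checklist"]
filtered = apply_text_match(results, "staging")
```

Semantic similarity does the heavy lifting of relevance; the keyword filter then guarantees precision when a specific term must be present.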
Both features are available in the latest version on GitHub.
Feedback
This is an open-source project and your feedback is invaluable! Please open an issue on GitHub for bugs, feature requests, or suggestions.
License
MCP Memoria is released under the Apache 2.0 license.