Explainer · Beginner · 8 min read

What Is MemPalace and How Does It Work?

A complete guide to the free, highest-scoring AI memory system — what it is, the problem it solves, how the palace architecture works (Wings, Rooms, Halls, Tunnels, Closets, Drawers), and how to get started in under 5 minutes.

What Is MemPalace?

MemPalace is a free, open-source AI memory system that gives large language models persistent, searchable memory across sessions. Created by actress Milla Jovovich and developer Ben Sigman using Claude Code, it is currently the highest-scoring free AI memory system on the LongMemEval benchmark — achieving 96.6% recall with zero API calls. Everything runs entirely on your machine.

Quick Facts: Version: 3.1.0  ·  License: MIT  ·  Python: 3.9+  ·  LongMemEval: 96.6% raw / 100% hybrid  ·  GitHub Stars: 26,900+

The Problem It Solves

Every conversation you have with an AI disappears when the session ends. Six months of daily AI use equals roughly 19.5 million tokens of decisions, reasoning, and context — all trapped in chat windows that evaporate. Existing memory tools like Mem0 and Zep try to fix this by letting an LLM decide what is "worth remembering," but they consistently discard the reasoning you need most.

MemPalace's answer is to store every word verbatim and make it findable — rather than letting any AI decide what to throw away.

The Palace Structure

Inspired by the ancient method of loci used by Greek orators, MemPalace organises your conversations into a navigable building. Each part of the structure has a specific role:

Wings

The top-level container — one per person or project. For example: wing_kai for a teammate, wing_driftwood for a project. Wings keep all memories cleanly separated by context.

Rooms

Specific topics within a wing — auth-migration, graphql-switch, ci-pipeline. Searching within a room gives 94.8% R@10 recall versus 60.9% for unfiltered search — a 34-point improvement from the structure alone.
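
As a sketch of why scoping helps, consider a toy search where a distractor from another room shares keywords with the query. Restricting the candidate pool to one room removes it before ranking ever happens. All helper names below are illustrative, not MemPalace's actual API:

```python
# Sketch: room-scoped vs. unfiltered search. Names are illustrative.

def score(query, text):
    """Naive relevance: number of query words that appear in the text."""
    return len(set(query.lower().split()) & set(text.lower().split()))

def search(memories, query, room=None):
    """Rank memories by score, optionally restricted to one room."""
    pool = [m for m in memories if room is None or m["room"] == room]
    return sorted(pool, key=lambda m: score(query, m["text"]), reverse=True)

memories = [
    {"room": "auth-migration", "text": "decided to rotate JWT signing keys"},
    {"room": "ci-pipeline",    "text": "JWT tests flaky on the CI migration runner"},
    {"room": "auth-migration", "text": "the JWT migration is complete"},
]

# Unfiltered search must out-rank the ci-pipeline distractor; scoping
# to the room excludes it from the candidate pool entirely.
hits = search(memories, "JWT migration", room="auth-migration")
print(hits[0]["text"])  # → the JWT migration is complete
```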

Halls

Corridors connecting rooms by memory type. Five hall types: hall_facts (decisions), hall_events (milestones), hall_discoveries (insights), hall_preferences (habits), hall_advice (recommendations).
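
In effect, the five hall types amount to a routing table from memory type to hall. A minimal sketch (the routing function is an assumption, not MemPalace's API):

```python
# Sketch: routing a memory to its hall by type, using the five hall
# types described above. The function itself is illustrative.
HALL_BY_TYPE = {
    "decision":       "hall_facts",
    "milestone":      "hall_events",
    "insight":        "hall_discoveries",
    "habit":          "hall_preferences",
    "recommendation": "hall_advice",
}

def hall_for(memory_type):
    """Pick the hall a memory belongs in; reject unknown types."""
    try:
        return HALL_BY_TYPE[memory_type]
    except KeyError:
        raise ValueError(f"unknown memory type: {memory_type}")

print(hall_for("insight"))  # → hall_discoveries
```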

Tunnels

When the same topic room appears in multiple wings, MemPalace automatically creates a tunnel linking them — so a search for "auth migration" surfaces memories from every person and project that touched it.

Closets and Drawers

Closets contain plain-text summaries pointing to source content. Drawers hold the original verbatim files — never summarised, never deleted. The Closet is an index; the Drawer is the ground truth.

  WING: wing_kai
    ├── hall_facts / auth-migration  →  Closet  →  Drawer (verbatim)
    ├── hall_events / auth-migration →  Closet  →  Drawer (verbatim)
    └── [tunnel] ──────────────────────────────────────────────────┐
                                                                   │
  WING: wing_driftwood                                             │
    ├── hall_facts / auth-migration  ←──────────────── [tunnel] ──┘
    └── hall_advice / auth-migration →  Closet  →  Drawer (verbatim)
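
The same structure can be held as plain nested data, and tunnel detection falls out of it naturally: any room name that appears under more than one wing gets linked. A minimal sketch, with directory and key names assumed for illustration:

```python
# Sketch of the palace hierarchy as nested data: wings contain halls,
# halls contain rooms, each room has a closet (index) and a drawer
# (verbatim ground truth). Key names are illustrative assumptions.
from collections import defaultdict

palace = {
    "wing_kai": {
        "hall_facts":  {"auth-migration": {"closet": "summary.txt", "drawer": ["2025-11-03.md"]}},
        "hall_events": {"auth-migration": {"closet": "summary.txt", "drawer": ["2025-11-04.md"]}},
    },
    "wing_driftwood": {
        "hall_facts":  {"auth-migration": {"closet": "summary.txt", "drawer": ["2025-10-21.md"]}},
        "hall_advice": {"auth-migration": {"closet": "summary.txt", "drawer": ["2025-10-22.md"]}},
    },
}

def find_tunnels(palace):
    """A room present in more than one wing gets a tunnel linking
    every wing that contains it."""
    rooms = defaultdict(set)
    for wing, halls in palace.items():
        for hall in halls.values():
            for room in hall:
                rooms[room].add(wing)
    return {room: sorted(wings) for room, wings in rooms.items() if len(wings) > 1}

print(find_tunnels(palace))
# → {'auth-migration': ['wing_driftwood', 'wing_kai']}
```

A cross-wing search for "auth migration" would then follow the tunnel and pull closets from both wings, exactly as in the diagram.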

The 4-Layer Memory Stack

| Layer | Contents | Size | When loaded |
| --- | --- | --- | --- |
| L0 | Identity — who is this AI, active project | ~50 tokens | Always |
| L1 | Critical facts — team, projects, preferences | ~120 tokens | Always |
| L2 | Room recall — recent sessions for current topic | On demand | When topic arises |
| L3 | Deep search — semantic query across all closets | On demand | When asked |

Your AI wakes up with L0 + L1 — just ~170 tokens — and already knows your entire world. This costs ~$0.70/year versus ~$507/year for LLM-summarisation approaches.
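
The loading policy is simple enough to sketch: only the always-on layers are assembled at session start, which is where the ~170-token figure comes from. Function and structure names below are illustrative, not MemPalace's API:

```python
# Sketch of the 4-layer loading policy, with token counts taken from
# the table above. Names here are illustrative assumptions.

LAYERS = {
    "L0": {"contents": "identity + active project", "tokens": 50,   "always": True},
    "L1": {"contents": "critical facts",            "tokens": 120,  "always": True},
    "L2": {"contents": "room recall",               "tokens": None, "always": False},
    "L3": {"contents": "deep closet search",        "tokens": None, "always": False},
}

def startup_context():
    """Load only the always-on layers at session start."""
    loaded = [name for name, layer in LAYERS.items() if layer["always"]]
    budget = sum(LAYERS[name]["tokens"] for name in loaded)
    return loaded, budget

layers, tokens = startup_context()
print(layers, tokens)  # → ['L0', 'L1'] 170
```

L2 and L3 stay out of the context window until a topic or an explicit question triggers them, which is what keeps the baseline cost so low.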

What Is AAAK?

AAAK is an experimental lossy abbreviation dialect that packs repeated entities into fewer tokens, and any LLM can read it without a decoder. The important caveat: AAAK compresses what gets loaded into context, not what gets stored — and it trades accuracy for density, scoring 84.2% on LongMemEval versus 96.6% for raw verbatim mode.

Important: Use raw mode (the default) for best accuracy. AAAK is for scenarios where token density matters more than perfect recall — such as loading a 50-agent summary into a constrained context window.
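
To make the idea concrete, here is a toy version of entity abbreviation: repeated names are declared once, then replaced with short tokens. The scheme below is invented for this sketch — real AAAK output looks different, and unlike real AAAK this toy keeps a legend, so it is not actually lossy:

```python
# Toy illustration of entity abbreviation (NOT the real AAAK dialect):
# repeated capitalised words are declared once in a legend, then
# replaced with a two-letter token throughout the text.

def aaak_pack(text, min_count=2):
    """Abbreviate every capitalised word that repeats at least
    min_count times to its first two letters plus a trailing '~'."""
    words = text.split()
    caps = [w for w in words if w.istitle()]
    legend = {}
    for w in caps:
        if caps.count(w) >= min_count and w not in legend:
            legend[w] = w[:2] + "~"
    packed = " ".join(legend.get(w, w) for w in words)
    header = " ".join(f"{abbr}={full}" for full, abbr in legend.items())
    return (header + " | " + packed) if legend else packed

print(aaak_pack("Driftwood uses GraphQL because Driftwood outgrew REST"))
# → Dr~=Driftwood | Dr~ uses GraphQL because Dr~ outgrew REST
```

The saving grows with repetition: a 50-agent summary that names the same services hundreds of times shrinks far more than this one-liner does.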

Getting Started in 60 Seconds

```bash
# 1. Install
pip install mempalace
# 2. Initialise your palace
mempalace init ~/projects/myapp
# 3. Mine your AI conversations
mempalace mine ~/chats/ --mode convos
# 4. Search six months of memory instantly
mempalace search "why did we switch to GraphQL"
# → "Chose GraphQL — concurrent writes, dataset exceeds 10GB. 2025-11-03"
```

For the complete walkthrough including Claude Code plugin, MCP server, and auto-save hooks, see the Setup Guide →
