Your AI forgets everything between sessions. Here's how Engram fixes that — and why compound context changes how you work with AI tools.
Every AI session starts from zero. You open Claude Code, Cursor, or Windsurf, and before you can get to the actual problem, you spend the first few turns re-explaining: the architecture you're using, the decision you made last week, the constraint that's been baked in since day one. It's the cognitive overhead nobody talks about.
Engram solves this. It's a persistent memory layer that sits between your sessions — extracting structured facts from conversations and storing them in a knowledge graph you can query at any time. Your AI editor pulls that context automatically, or you inject it explicitly. Either way, you stop explaining and start building.
Engram runs as a local CLI tool. When you finish a session — or even mid-session — you point it at a transcript or markdown file:
engram extract my-project transcript.txt
Under the hood, a fast LLM reads the content and pulls out structured nodes: decisions, constraints, open questions, and file references (such as middleware/auth.ts). These nodes live in a local knowledge graph. Duplicate or superseded facts are merged automatically; if you make a different decision later, the graph reflects the current state, not the history.
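One such node might look like this — an illustrative shape only; the field names here are assumptions for the sake of the example, not Engram's exact schema:

```json
{
  "type": "decision",
  "summary": "JWT validation lives in middleware/auth.ts, not in route handlers",
  "source": "transcript.txt",
  "status": "current"
}
```

The important property is the `status`-style notion of currency: when a later session supersedes this decision, the graph keeps only the up-to-date fact.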
Install Engram with pip:
pip install engram
engram --version
Create a project for your codebase:
engram init my-project
Configure the extraction model. Gemini Flash is fast and cheap (~$0.001 per session). Add your key to ~/.engram/config.yaml:
model: gemini/gemini-2.5-flash
api_key: YOUR_GEMINI_API_KEY
Extract your first session:
engram extract my-project session.txt
Query it:
engram query my-project "what decisions have we made about auth?"
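If your tooling writes transcripts to a predictable path, extraction doesn't have to be a manual step. An illustrative cron entry — the transcript path is an assumption; the command is the same extract shown above, and since duplicates are merged, re-running it is safe:

```
# Hypothetical: extract the day's transcript every evening at 6pm
0 18 * * * engram extract my-project $HOME/sessions/today.txt
```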
The fastest path to using Engram from inside your editor is the MCP server. Start it:
engram mcp-serve
Then add Engram to your Claude Code config at ~/.claude/settings.json:
{
"mcpServers": {
"engram": {
"command": "engram",
"args": ["mcp-serve"]
}
}
}
Now engram_query and engram_extract are available as tools in every Claude Code session. The AI can pull context when it needs it, or you can invoke the tools explicitly.
Cursor and Windsurf use the same MCP config format — see the MCP Server docs for editor-specific setup.
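For Cursor, for instance, the same block typically goes in ~/.cursor/mcp.json (confirm the exact path and key names against your version's documentation):

```json
{
  "mcpServers": {
    "engram": {
      "command": "engram",
      "args": ["mcp-serve"]
    }
  }
}
```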
The real value isn't any single session. It's what happens after 20 sessions. By that point, Engram knows your architecture, your constraints, your past mistakes, and your open questions. New sessions don't start from zero — they start from everything you've already figured out.
The AI stops asking what your stack is. It stops suggesting patterns you've already ruled out. It starts contributing at a higher level because it has the context to operate at a higher level.
That's the compounding effect. The longer you use Engram, the more leverage you get from every session.