Hermes

Use Engram as a drop-in MemoryProvider for Hermes. Persistent, graph-based memory that retrieves relevant context before every LLM call — automatically.

Why Engram + Hermes

Hermes has a clean MemoryProvider interface and leaves memory implementation up to you. Most implementations fall back to flat files or simple vector stores. Engram gives Hermes a structured knowledge graph with semantic retrieval, temporal decay, and automatic fact retirement:

  • Context injected before every LLM call — prefetch() runs a BFS retrieval against your project graph and surfaces the most relevant facts in under a second
  • Non-blocking extraction — sync_turn() buffers each turn and runs fact extraction in a background thread, so memory never slows the conversation
  • Crash recovery — the turn buffer is persisted to disk between turns; nothing is lost if the session ends unexpectedly
  • Scales across sessions — decisions from session 1 are just as retrievable in session 50

How Engram compares

There are several MemoryProvider options available for Hermes. Here's how they stack up:

|                      | Engram                           | Mem0                | Hindsight                                      | Honcho              |
|----------------------|----------------------------------|---------------------|------------------------------------------------|---------------------|
| Storage model        | Knowledge graph (SQLite)         | Vector store        | Extractive summarization (rolling compression) | Hosted API only     |
| Retrieval            | Semantic + graph traversal       | Semantic similarity | Full context dump                              | Semantic similarity |
| Superseded facts     | Retired automatically            | Manual deletion     | Stays in context forever                       | Manual deletion     |
| Extraction blocking? | Non-blocking (background thread) | Blocking            | Blocking                                       | Non-blocking        |
| Crash recovery       | Yes (disk-persisted buffer)      | No                  | No                                             | Yes (hosted)        |
| Local / offline      | Yes (default)                    | Hosted only         | Yes                                            | No                  |
| Cost                 | ~$0.001/session (LLM only)       | API subscription    | Free (no extraction LLM)                       | API subscription    |

1. Install

pip install "engram[hermes]"

This installs Engram plus the Hermes plugin adapter. Verify:

engram --version
python -c "from engram.hermes_plugin import EngramMemoryProvider; print('ok')"

2. Configure Hermes

In your Hermes config, set the memory provider to Engram:

memory:
  provider: engram
  project: my-project
  # Optional: point at hosted API for cross-machine sync
  # api_key: your-api-key
  # remote_url: https://api.engram.unbidden.ai

Or via the interactive setup wizard:

hermes memory setup

Select Engram when prompted for a provider. The wizard configures ENGRAM_PROJECT and optionally ENGRAM_API_KEY / ENGRAM_REMOTE_URL for hosted mode.
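
If you wire providers up in code instead of config, instantiating the provider directly looks roughly like this. The keyword arguments are an assumption inferred from the YAML keys above, not a documented signature; check the adapter's docstring for the real one:

from engram.hermes_plugin import EngramMemoryProvider

# Keyword names mirror the YAML config keys above; this is an
# assumed signature, not a documented one.
provider = EngramMemoryProvider(
    project="my-project",
    # api_key="your-api-key",                       # hosted mode only
    # remote_url="https://api.engram.unbidden.ai",  # hosted mode only
)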

How it works

The EngramMemoryProvider hooks into three points in the Hermes session lifecycle (a code sketch follows the list):

  1. prefetch(task) — called before each LLM invocation. Runs a BFS retrieval against the Engram project graph, using the current task description as the query, and returns the 10–25 most relevant facts as structured context injected into the system prompt. Completes in under a second, so it adds negligible latency to the call.
  2. sync_turn(turn) — called after each conversation turn. Buffers the turn to disk (crash-safe), then dispatches fact extraction in a background thread. The LLM never waits on extraction.
  3. on_session_end() — called when the session closes cleanly. Flushes any buffered turns that haven't been extracted yet. If a prior session crashed before this hook could fire, the recovered buffer is processed on the next startup instead.
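
To make the flow concrete, here is a minimal, self-contained sketch of a provider implementing the three hooks. Everything in it (the buffer file location, the retrieve and extract_facts stand-ins, the locking scheme) is illustrative rather than Engram's actual internals:

import json
import threading
from pathlib import Path

BUFFER = Path.home() / ".engram" / "turn_buffer.jsonl"  # hypothetical location

def retrieve(project: str, query: str, limit: int = 25) -> list[dict]:
    """Stand-in for Engram's BFS graph retrieval."""
    return []  # the real version walks the project graph

def extract_facts(project: str, turn: dict) -> None:
    """Stand-in for the background fact-extraction LLM call."""
    pass

class SketchMemoryProvider:
    def __init__(self, project: str):
        self.project = project
        self._lock = threading.Lock()
        BUFFER.parent.mkdir(parents=True, exist_ok=True)
        # A non-empty buffer at startup means a prior session didn't
        # end cleanly; drain it before accepting new turns.
        self._drain()

    def prefetch(self, task: str) -> list[dict]:
        # Runs before each LLM call; the returned facts are injected
        # into the system prompt.
        return retrieve(self.project, query=task)

    def sync_turn(self, turn: dict) -> None:
        # Persist first (crash-safe), then extract off the hot path.
        with self._lock, BUFFER.open("a") as f:
            f.write(json.dumps(turn) + "\n")
        threading.Thread(target=self._drain, daemon=True).start()

    def on_session_end(self) -> None:
        # Flush anything extraction hasn't processed yet.
        self._drain()

    def _drain(self) -> None:
        # Extract every buffered turn, then clear the buffer. The file
        # is deleted only after extraction succeeds, so a crash
        # mid-drain just replays the same turns next time.
        with self._lock:
            if not BUFFER.exists():
                return
            for line in BUFFER.read_text().splitlines():
                extract_facts(self.project, json.loads(line))
            BUFFER.unlink()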

LLM-accessible tools

Two tools are exposed directly to the Hermes LLM during a session (example calls below):

  • engram_query — lets the LLM explicitly query project memory mid-session for a specific topic. Useful when context shifts significantly mid-conversation.
  • engram_recall — retrieves a specific node by ID or tag. Useful for the LLM to confirm or expand on a specific fact surfaced by prefetch().
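
In a transcript, an invocation of each tool might look like the following. The argument names are assumptions for illustration; the authoritative shapes are the tool schemas your Hermes session advertises to the model:

# Hypothetical tool calls as the LLM would emit them. The argument
# names ("query", "limit", "node_id", "tag") are illustrative, not
# the published schemas.
engram_query_call = {
    "tool": "engram_query",
    "arguments": {"query": "database migration decisions", "limit": 5},
}

engram_recall_call = {
    "tool": "engram_recall",
    "arguments": {"node_id": "fact:2f91"},  # or {"tag": "auth"}
}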

Hosted API for cross-machine sync

By default, Engram stores memory in a local SQLite database under ~/.engram/. To sync memory across machines or share a project with a team, switch to the hosted API:

memory:
  provider: engram
  project: my-project
  api_key: your-api-key
  remote_url: https://api.engram.unbidden.ai

Get an API key by signing up for the Pro or Team plan. Local mode is free and has no rate limits.

Troubleshooting

ImportError: cannot import name 'EngramMemoryProvider'

Install the Hermes extra: pip install engram[hermes]. If you already have engram installed without the extra, run pip install --upgrade "engram[hermes]" to add it.

prefetch() returns empty results

The project is likely empty or doesn't exist yet. Run engram list to see current projects. If the project is missing, run engram init my-project and then engram extract my-project <transcript> to seed it. After a few sessions, the graph fills in on its own and prefetch() starts returning results.

Background extraction is failing silently

Check the Engram log at ~/.engram/engram.log. Background thread errors are written there. Common causes: LLM endpoint misconfigured, network issue, or extraction LLM quota exceeded. Fix the underlying issue and the buffered turns will be replayed on next session start.

Memory isn't persisting between sessions

Confirm that on_session_end() is being called — some Hermes integrations skip the teardown hook on forced exits. You can also run engram show my-project after a session to verify that nodes were written; if the count isn't increasing, extraction is not completing.
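
If the teardown hook is the culprit, you can trigger the flush yourself from a REPL. This assumes the constructor takes the project name, as in the earlier sketch; adjust to the real signature:

from engram.hermes_plugin import EngramMemoryProvider

# Assumed constructor signature; on_session_end() flushes any turns
# still sitting in the buffer.
provider = EngramMemoryProvider(project="my-project")
provider.on_session_end()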