# config.yaml Reference
All configuration options for Engram, with defaults and examples.
## Config file location
Engram looks for configuration in two places, in order:
1. `./config.yaml` — a project-local file in the current working directory
2. `~/.engram/config.yaml` — a global user config
Both files are deep-merged. Keys in the local file override the global config, which overrides built-in defaults. Missing keys fall back to defaults — you only need to specify the values you want to change.
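For example, a global config can pin the LLM while a project-local file tweaks a single query default; the merged result keeps the global `llm` block and takes `hops` from the local file (the values below are illustrative):

```yaml
# ~/.engram/config.yaml (global)
llm:
  model: gemini-2.5-flash
  api_key_env: GEMINI_API_KEY

# ./config.yaml (project-local; wins on conflicting keys)
defaults:
  hops: 5
```
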
You can also point to a specific config file at runtime:
```bash
engram --config /path/to/config.yaml query my-project "what auth approach?"
```
## Minimal configuration
The only setting most users need to change is their LLM. Here are working examples for the supported providers:
```yaml
# Gemini 2.5 Flash (recommended — fast, accurate, low cost)
llm:
  model: gemini-2.5-flash
  api_key: YOUR_GEMINI_API_KEY  # or set GEMINI_API_KEY env var

# OpenAI
# llm:
#   model: gpt-4o-mini
#   api_key: YOUR_OPENAI_API_KEY

# Local model via Ollama or LM Studio
# llm:
#   base_url: http://localhost:11434/v1
#   model: qwen2.5:32b
```
## llm — Language model settings
Controls the LLM used for extraction and (optionally) querying.
| Key | Default | Description |
|---|---|---|
| `base_url` | `http://localhost:1234/v1` | OpenAI-compatible API base URL. For Gemini, use `https://generativelanguage.googleapis.com/v1beta/openai`. Omit when using a hosted provider that doesn't need a custom base URL. |
| `model` | `qwen3.5-35b-a3b` | Model identifier. Must be a valid name for the endpoint specified in `base_url`. |
| `api_key` | none | API key string. As an alternative, set `api_key_env` to the name of an environment variable that holds the key (e.g. `GEMINI_API_KEY`). |
| `api_key_env` | none | Name of the environment variable that contains the API key. Preferred over `api_key` so secrets stay out of config files. |
| `temperature` | `0.1` | Sampling temperature. Lower values produce more deterministic extractions. |
| `max_tokens` | `4096` | Maximum tokens in the LLM response. Increase for very long transcripts (e.g. `65000` for Gemini Flash). |
| `timeout` | `30.0` | HTTP timeout in seconds for LLM API calls. Increase for slow local models. |
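As an illustration, a local model served through Ollama usually wants a longer timeout than a hosted API. The values below are rough starting points, not tested recommendations:

```yaml
llm:
  base_url: http://localhost:11434/v1
  model: qwen2.5:32b
  temperature: 0.1
  max_tokens: 8192    # raise if long transcripts get truncated
  timeout: 300.0      # local models can be slow to respond
```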
## defaults — Query defaults

Sets the default behavior when running `engram query`. All of these can be overridden per-query with CLI flags.
| Key | Default | Description |
|---|---|---|
| `hops` | `3` | BFS traversal depth. Higher values pull in more distant related nodes. Overridable with `--hops N`. |
| `top_k` | `10` | Maximum number of nodes to return. Overridable with `--top-k N`. |
| `format` | `markdown` | Output format. Currently only `markdown` is supported. |
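Either default can be overridden for a single query without editing the config file, for example:

```bash
# One-off deeper, wider traversal; config defaults are unchanged
engram query my-project "what auth approach?" --hops 5 --top-k 20
```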
## strategies — Retrieval strategies
Controls the filtering and scoring pipeline applied to query results. See Retrieval in the Concepts page for a full explanation of each strategy.
| Key | Default | Description |
|---|---|---|
| `superseded_pruning` | `true` | Hide nodes that have been superseded by newer nodes. Recommended on. |
| `confidence_threshold` | `0.0` | Minimum confidence score to include a node. `0.0` disables the filter. Set to e.g. `0.6` to exclude low-certainty extractions. |
| `recency_decay` | `false` | Apply exponential decay to node scores based on age. Older facts rank lower. Pair with `recency_half_life_days`. |
| `recency_half_life_days` | `30` | Half-life for recency decay. A node extracted 30 days ago will have its score halved (when decay is enabled). |
| `token_budget` | `0` | Maximum estimated tokens in query output. `0` is unlimited. Useful for fitting context into a fixed slot. |
| `relevance_scoring` | `true` | Rank BFS entry nodes by number of keyword tag matches before traversal. Recommended on. |
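With decay enabled, a node's score is multiplied by roughly `0.5^(age_days / recency_half_life_days)`, so a 30-day half-life halves a month-old node's score and quarters a two-month-old one. As an illustration, here is a strategy block tuned for short, recent, high-confidence context; the values are starting points, not recommendations:

```yaml
strategies:
  superseded_pruning: true
  confidence_threshold: 0.6     # drop low-certainty extractions
  recency_decay: true
  recency_half_life_days: 14    # halve scores every two weeks
  token_budget: 2000            # cap output for a fixed context slot
  relevance_scoring: true
```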
## projects_dir
Where Engram stores project data. Each project gets a subdirectory containing its SQLite database.
projects_dir: "~/.engram/projects" # default
# Move to a custom location
projects_dir: "/data/engram-projects"
## Remote API (hosted mode)
To use the hosted Engram API instead of a local database, add your API URL and key:
```yaml
api_url: https://api.unbidden.ai
api_key: YOUR_API_KEY  # or set ENGRAM_API_KEY env var
```
> **Hosted API:** The hosted API is coming soon. Until then, all data is stored locally. See REST API for how to self-host.
## Full example config
A complete `~/.engram/config.yaml` showing all common settings:

```yaml
# LLM used for extraction
llm:
  base_url: "https://generativelanguage.googleapis.com/v1beta/openai"
  model: "gemini-2.5-flash"
  api_key_env: "GEMINI_API_KEY"
  temperature: 0.0
  max_tokens: 65000
  timeout: 120.0

# Query defaults (all overridable per-query with CLI flags)
defaults:
  hops: 3
  top_k: 10
  format: "markdown"

# Retrieval strategy settings
strategies:
  superseded_pruning: true       # hide superseded nodes
  confidence_threshold: 0.0      # 0.0 = disabled; try 0.6 to filter noisy extractions
  recency_decay: false           # apply time-based score decay
  recency_half_life_days: 30     # half-life when decay is enabled
  token_budget: 0                # 0 = unlimited
  relevance_scoring: true        # rank entry nodes by tag overlap

# Where projects are stored
projects_dir: "~/.engram/projects"
```
## Environment variables
API keys can be passed as environment variables instead of putting them in config files:
| Variable | Description |
|---|---|
| `GEMINI_API_KEY` | Google Gemini API key. Used when `llm.api_key_env: "GEMINI_API_KEY"`. |
| `OPENAI_API_KEY` | OpenAI API key. Used when `llm.api_key_env: "OPENAI_API_KEY"`. |
| `ENGRAM_API_KEY` | Engram hosted API key. Automatically read when `api_url` is set. |
Engram also loads a `.env` file in the project root automatically if `python-dotenv` is installed. This is the recommended way to store keys in development without putting them in a config file.
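For example, a minimal `.env` at the project root (the value is a placeholder):

```bash
# Loaded automatically at startup when python-dotenv is installed
GEMINI_API_KEY=your-gemini-key-here
```
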
## Next steps
- How retrieval works — understand what each strategy does
- CLI reference — per-query overrides with `--enable`/`--disable`
- REST API — self-host the Engram API server