config.yaml Reference

All configuration options for Engram, with defaults and examples.

Config file location

Engram looks for configuration in two places, in order:

  1. ./config.yaml — a project-local file in the current working directory
  2. ~/.engram/config.yaml — a global user config

Both files are deep-merged. Keys in the local file override the global config, which overrides built-in defaults. Missing keys fall back to defaults — you only need to specify the values you want to change.
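For example, if the global config sets the model and a query default, a project-local file only needs to override what differs:

# ~/.engram/config.yaml (global)
llm:
  model: gemini-2.5-flash
defaults:
  hops: 3

# ./config.yaml (project-local)
llm:
  model: gpt-4o-mini

# Effective config: llm.model = gpt-4o-mini, defaults.hops = 3,
# everything else falls back to built-in defaults.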

You can also point to a specific config file at runtime:

engram --config /path/to/config.yaml query my-project "what auth approach?"

Minimal configuration

The only setting most users need to change is their LLM. Here are working examples for the supported providers:

# Gemini 2.5 Flash (recommended — fast, accurate, low cost)
llm:
  base_url: https://generativelanguage.googleapis.com/v1beta/openai
  model: gemini-2.5-flash
  api_key: YOUR_GEMINI_API_KEY  # or set api_key_env: GEMINI_API_KEY to read it from the environment

# OpenAI
# llm:
#   model: gpt-4o-mini
#   api_key: YOUR_OPENAI_API_KEY

# Local model via Ollama or LM Studio
# llm:
#   base_url: http://localhost:11434/v1
#   model: qwen2.5:32b

llm — Language model settings

Controls the LLM used for extraction and (optionally) querying.

| Key | Default | Description |
| --- | --- | --- |
| base_url | http://localhost:1234/v1 | OpenAI-compatible API base URL. For Gemini, use https://generativelanguage.googleapis.com/v1beta/openai. Omit when using a hosted provider that doesn't need a custom base URL. |
| model | qwen3.5-35b-a3b | Model identifier. Must be a valid name for the endpoint specified in base_url. |
| api_key | none | API key string. As an alternative, set api_key_env to the name of an environment variable that holds the key (e.g. GEMINI_API_KEY). |
| api_key_env | none | Name of the environment variable that contains the API key. Preferred over api_key so secrets stay out of config files. |
| temperature | 0.1 | Sampling temperature. Lower values produce more deterministic extractions. |
| max_tokens | 4096 | Maximum tokens in the LLM response. Increase for very long transcripts (e.g. 65000 for Gemini Flash). |
| timeout | 30.0 | HTTP timeout in seconds for LLM API calls. Increase for slow local models. |
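For example, a config for a large model served locally through Ollama might raise the timeout well above the 30-second default (model name and port taken from the Ollama example above):

llm:
  base_url: http://localhost:11434/v1
  model: qwen2.5:32b
  timeout: 120.0  # local models can take much longer than the 30s default to respond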

defaults — Query defaults

Sets the default behavior when running engram query. All of these can be overridden per-query with CLI flags.

| Key | Default | Description |
| --- | --- | --- |
| hops | 3 | BFS traversal depth. Higher values pull in more distant related nodes. Overridable with --hops N. |
| top_k | 10 | Maximum number of nodes to return. Overridable with --top-k N. |
| format | markdown | Output format. Currently only markdown is supported. |
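For example, to pull in more distant nodes and return a larger result set for a single query:

engram query my-project "what auth approach?" --hops 5 --top-k 20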

strategies — Retrieval strategies

Controls the filtering and scoring pipeline applied to query results. See Retrieval in the Concepts page for a full explanation of each strategy.

| Key | Default | Description |
| --- | --- | --- |
| superseded_pruning | true | Hide nodes that have been superseded by newer nodes. Recommended on. |
| confidence_threshold | 0.0 | Minimum confidence score to include a node. 0.0 disables the filter. Set to e.g. 0.6 to exclude low-certainty extractions. |
| recency_decay | false | Apply exponential decay to node scores based on age. Older facts rank lower. Pair with recency_half_life_days. |
| recency_half_life_days | 30 | Half-life in days for recency decay. A node extracted 30 days ago has its score halved (when decay is enabled). |
| token_budget | 0 | Maximum estimated tokens in query output. 0 is unlimited. Useful for fitting context into a fixed slot. |
| relevance_scoring | true | Rank BFS entry nodes by number of keyword tag matches before traversal. Recommended on. |
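For example, a setup that favors recent facts and caps output size for a fixed context slot (values illustrative):

strategies:
  recency_decay: true          # newer facts rank higher
  recency_half_life_days: 14   # scores halve every 14 days
  token_budget: 2000           # trim output to roughly 2000 estimated tokens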

projects_dir

Where Engram stores project data. Each project gets a subdirectory containing its SQLite database.

projects_dir: "~/.engram/projects"  # default

# Move to a custom location
projects_dir: "/data/engram-projects"

Remote API (hosted mode)

To use the hosted Engram API instead of a local database, add your API URL and key:

api_url: https://api.unbidden.ai
api_key: YOUR_API_KEY  # or set ENGRAM_API_KEY env var

Hosted API: Coming soon. Until then, all data is stored locally. See REST API for how to self-host.

Full example config

A complete ~/.engram/config.yaml showing all common settings:

# LLM used for extraction
llm:
  base_url: "https://generativelanguage.googleapis.com/v1beta/openai"
  model: "gemini-2.5-flash"
  api_key_env: "GEMINI_API_KEY"
  temperature: 0.0
  max_tokens: 65000
  timeout: 120.0

# Query defaults (all overridable per-query with CLI flags)
defaults:
  hops: 3
  top_k: 10
  format: "markdown"

# Retrieval strategy settings
strategies:
  superseded_pruning: true     # hide superseded nodes
  confidence_threshold: 0.0    # 0.0 = disabled; try 0.6 to filter noisy extractions
  recency_decay: false         # apply time-based score decay
  recency_half_life_days: 30   # half-life when decay is enabled
  token_budget: 0              # 0 = unlimited
  relevance_scoring: true      # rank entry nodes by tag overlap

# Where projects are stored
projects_dir: "~/.engram/projects"

Environment variables

API keys can be passed as environment variables instead of putting them in config files:

| Variable | Description |
| --- | --- |
| GEMINI_API_KEY | Google Gemini API key. Used when llm.api_key_env: "GEMINI_API_KEY". |
| OPENAI_API_KEY | OpenAI API key. Used when llm.api_key_env: "OPENAI_API_KEY". |
| ENGRAM_API_KEY | Engram hosted API key. Automatically read when api_url is set. |

Engram also loads a .env file in the project root automatically if python-dotenv is installed. This is the recommended way to store keys in development without putting them in a config file.
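A minimal .env for the Gemini setup above (placeholder value):

# .env in the project root, loaded automatically when python-dotenv is installed
GEMINI_API_KEY=your-key-here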

Next steps