config.yaml Reference

All configuration options for Engram, with defaults and examples.

Config file location

Engram looks for configuration in two places, in order:

  1. ./config.yaml — a project-local file in the current working directory
  2. ~/.engram/config.yaml — a global user config

Both files are deep-merged. Keys in the local file override the global config, which overrides built-in defaults. Missing keys fall back to defaults — you only need to specify the values you want to change.
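For example, if the global config sets the model and a query default, a project-local file only needs to override what differs:

# ~/.engram/config.yaml (global)
llm:
  model: gemini-2.5-flash
defaults:
  hops: 3

# ./config.yaml (project-local)
llm:
  model: gpt-4o-mini

# Effective config: llm.model = gpt-4o-mini, defaults.hops = 3,
# everything else falls back to built-in defaults.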

You can also point to a specific config file at runtime:

engram --config /path/to/config.yaml query my-project "what auth approach?"

Minimal configuration

The only setting most users need to change is their LLM. Here are working examples for the supported providers:

# Gemini 2.5 Flash (recommended — fast, accurate, low cost)
llm:
  base_url: https://generativelanguage.googleapis.com/v1beta/openai
  model: gemini-2.5-flash
  api_key: YOUR_GEMINI_API_KEY  # or set api_key_env: GEMINI_API_KEY to read it from the environment

# OpenAI
# llm:
#   model: gpt-4o-mini
#   api_key: YOUR_OPENAI_API_KEY

# Local model via Ollama or LM Studio
# llm:
#   base_url: http://localhost:11434/v1
#   model: qwen2.5:32b

llm — Language model settings

Controls the LLM used for extraction and (optionally) querying.

| Key | Default | Description |
| --- | --- | --- |
| base_url | http://localhost:1234/v1 | OpenAI-compatible API base URL. For Gemini, use https://generativelanguage.googleapis.com/v1beta/openai. Omit when using a hosted provider that doesn't need a custom base URL. |
| model | qwen3.5-35b-a3b | Model identifier. Must be a valid name for the endpoint specified in base_url. |
| api_key | none | API key string. As an alternative, set api_key_env to the name of an environment variable that holds the key (e.g. GEMINI_API_KEY). |
| api_key_env | none | Name of the environment variable that contains the API key. Preferred over api_key so secrets stay out of config files. |
| temperature | 0.1 | Sampling temperature. Lower values produce more deterministic extractions. |
| max_tokens | 4096 | Maximum tokens in the LLM response. Increase for very long transcripts (e.g. 65000 for Gemini Flash). |
| timeout | 30.0 | HTTP timeout in seconds for LLM API calls. Increase for slow local models. |
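For example, a config for a large model served locally through Ollama might raise the timeout well above the 30-second default (model name and port taken from the Ollama example above):

llm:
  base_url: http://localhost:11434/v1
  model: qwen2.5:32b
  timeout: 120.0  # local models can take much longer than the 30s default to respond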

defaults — Query defaults

Sets the default behavior when running engram query. All of these can be overridden per-query with CLI flags.

| Key | Default | Description |
| --- | --- | --- |
| hops | 3 | BFS traversal depth. Higher values pull in more distant related nodes. Overridable with --hops N. |
| top_k | 10 | Maximum number of nodes to return. Overridable with --top-k N. |
| format | markdown | Output format. Currently only markdown is supported. |
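For example, to pull in more distant nodes and return a larger result set for a single query:

engram query my-project "what auth approach?" --hops 5 --top-k 20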

strategies — Retrieval strategies

Controls the filtering and scoring pipeline applied to query results. See Retrieval in the Concepts page for a full explanation of each strategy.

| Key | Default | Description |
| --- | --- | --- |
| superseded_pruning | true | Hide nodes that have been superseded by newer nodes. Recommended on. |
| confidence_threshold | 0.0 | Minimum confidence score to include a node. 0.0 disables the filter. Set to e.g. 0.6 to exclude low-certainty extractions. |
| recency_decay | false | Apply exponential decay to node scores based on age. Older facts rank lower. Pair with recency_half_life_days. |
| recency_half_life_days | 30 | Half-life in days for recency decay. A node extracted 30 days ago has its score halved (when decay is enabled). |
| token_budget | 0 | Maximum estimated tokens in query output. 0 is unlimited. Useful for fitting context into a fixed slot. |
| relevance_scoring | true | Rank BFS entry nodes by number of keyword tag matches before traversal. Recommended on. |
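For example, a setup that favors recent facts and caps output size for a fixed context slot (values illustrative):

strategies:
  recency_decay: true          # newer facts rank higher
  recency_half_life_days: 14   # scores halve every 14 days
  token_budget: 2000           # trim output to roughly 2000 estimated tokens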

projects_dir

Where Engram stores project data. Each project gets a subdirectory containing its SQLite database.

projects_dir: "~/.engram/projects"  # default

# Move to a custom location
projects_dir: "/data/engram-projects"

Remote API (hosted mode)

To use the hosted Engram API instead of a local database, add your API URL and key:

api_url: https://api.unbidden.ai
api_key: YOUR_API_KEY  # or set ENGRAM_API_KEY env var

Hosted API: Coming soon. Until then, all data is stored locally. See REST API for how to self-host.

Full example config

A complete ~/.engram/config.yaml showing all common settings:

# LLM used for extraction
llm:
  base_url: "https://generativelanguage.googleapis.com/v1beta/openai"
  model: "gemini-2.5-flash"
  api_key_env: "GEMINI_API_KEY"
  temperature: 0.0
  max_tokens: 65000
  timeout: 120.0

# Query defaults (all overridable per-query with CLI flags)
defaults:
  hops: 3
  top_k: 10
  format: "markdown"

# Retrieval strategy settings
strategies:
  superseded_pruning: true     # hide superseded nodes
  confidence_threshold: 0.0    # 0.0 = disabled; try 0.6 to filter noisy extractions
  recency_decay: false         # apply time-based score decay
  recency_half_life_days: 30   # half-life when decay is enabled
  token_budget: 0              # 0 = unlimited
  relevance_scoring: true      # rank entry nodes by tag overlap

# Where projects are stored
projects_dir: "~/.engram/projects"

Environment variables

API keys can be passed as environment variables instead of putting them in config files:

| Variable | Description |
| --- | --- |
| GEMINI_API_KEY | Google Gemini API key. Used when llm.api_key_env: "GEMINI_API_KEY". |
| OPENAI_API_KEY | OpenAI API key. Used when llm.api_key_env: "OPENAI_API_KEY". |
| ENGRAM_API_KEY | Engram hosted API key. Automatically read when api_url is set. |

Engram also loads a .env file in the project root automatically if python-dotenv is installed. This is the recommended way to store keys in development without putting them in a config file.
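A minimal .env for the Gemini setup above (placeholder value):

# .env in the project root, loaded automatically when python-dotenv is installed
GEMINI_API_KEY=your-key-here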

Next steps