What exactly is a “memory”?

A memory is the smallest unit of information you store in Flumes—typically a chat message, document chunk, or JSON blob. Each memory is stored with:
  • id – deterministic UUID
  • memory – raw text content used for embedding & display
  • metadata – arbitrary JSON for filtering (e.g. {"source":"chat"})
  • user_id, agent_id, run_idscoping keys that isolate data across tenants/agents
  • Timestamps: created_at, updated_at

Embeddings & similarity

When you create a memory we automatically:
  1. Clean & truncate the text to fit the model context window.
  2. Generate an embedding using an in-house model (compatible with OpenAI text-embedding-3-small).
  3. Upsert the vector into a FAISS index (
    Postgres + pgvector coming soon
    ).
During retrieval we perform cosine similarity search with optional metadata filters and time decay.

Hot vs cold memory

Cold storage is on the roadmap. For now all memories live in the “hot” (immediate) index.
TierLatencyUse-case
Hot< 200 msChat history, agent state
Cold~1 sLong-term knowledge base

Decay & summarization (coming soon)

We’re adding built-in time decay and auto-summarization so that stale memories roll up into compact summaries, keeping your index lean and latency snappy.
Follow our changelog for upcoming memory operators.