Memory subsystem

#1 · Public · Created directly

Proposed
Proposal

Why

Hive is the central hub for agentic product orchestration. External agents (Claude Code today, in-process Hive workers later) connect over MCP, browse and modify specs, and reason about the product. Every conversation currently starts cold: there is no durable place for an agent to record “we decided to ship X with constraint Y” or “this user owns the iOS roadmap”, so the next thread has to rediscover it from scratch.

Atlas, Hive’s Slack-resident sibling, shipped a typed-memory-graph subsystem (Atlas.Memory) inspired by spacebot.sh that solves this for Slack threads. Hive needs the same primitives, adapted to its domain: no Slack channels, MCP-first surface, integration with the existing visibility / product / spec model, and reuse of the OpenData Vector service Hive already runs.

Goals

  1. A typed memory store agents can write to during a session and read back across sessions.
  2. Hybrid (vector + full-text) recall scoped to either the workspace or a single product, with the same supersede / contradict semantics Atlas uses.
  3. MCP tools (save_memory, recall_memory, forget_memory) so connected agents can curate memory directly.
  4. A /memory section in the dashboard so org members can audit, search, and forget memories.
  5. A path (deferred to phase 2 / 3) to add LLM-driven edge classification and bulletin synthesis once Hive grows an LLM abstraction.

Non-goals

  • Building an in-process conversational agent. Hive serves memory; the connected agent decides when to save and recall.
  • Per-spec scoping as a dimension. Memories link to a source spec via source_spec_id (an FK), but the scope axis is workspace or product, not spec.
  • Per-user scoping in phase 1. The schema captures created_by_user_id but retrieval is not partitioned by user.
  • Building an LLM runtime in phase 1. Phase 2 / 3 introduces one; phase 1 ships storage + search and stays useful without it.

Reference model (Atlas)

Atlas.Memory is built around three schemas:

  • memory_nodes: eight kinds (fact, preference, decision, identity, event, observation, goal, todo). Each carries a body, importance (0..1), access_count, last_accessed_at, scope, forgotten flag, and best-effort embedding metadata.
  • memory_edges: directed links (related_to, updates, contradicts) used at recall time to demote superseded or older nodes.
  • memory_bulletins: one synthesized prose briefing per scope, regenerated by a debounced Oban worker, prepended to the Slack agent’s system prompt.

Retrieval is a hybrid of vector similarity (via OpenData Vector) and Postgres full-text search, fused with reciprocal rank fusion (RRF, k=60) and weighted by importance. Access counters are bumped on every hit. A background LinkNode worker classifies relations between a new node and its nearest neighbours via a single structured-output LLM call; a RefreshBulletin worker (debounced via Oban unique) regenerates the per-scope bulletin.
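
The fusion step described above can be sketched in a few lines. This is an illustration in Python, not Atlas's Elixir code: the function name, the input shapes, and the choice to multiply the fused score by importance are assumptions drawn from the description, and only k=60 comes from the text.

```python
from collections import defaultdict

RRF_K = 60  # the k constant named in the text


def rrf_fuse(lanes, importance, k=RRF_K):
    """Fuse ranked id lists with reciprocal rank fusion, then weight by importance.

    lanes: list of ranked id lists, best hit first (e.g. vector and full-text).
    importance: dict of id -> importance in 0..1 (hypothetical input shape).
    Returns (id, weighted_score) pairs sorted from best to worst.
    """
    scores = defaultdict(float)
    for lane in lanes:
        for rank, node_id in enumerate(lane, start=1):
            # Each lane contributes 1 / (k + rank) for every id it returned.
            scores[node_id] += 1.0 / (k + rank)
    weighted = {node_id: s * importance[node_id] for node_id, s in scores.items()}
    return sorted(weighted.items(), key=lambda pair: pair[1], reverse=True)


vector_hits = ["a", "b", "c"]
fulltext_hits = ["b", "d"]
equal_importance = {"a": 1.0, "b": 1.0, "c": 1.0, "d": 1.0}
ranked = rrf_fuse([vector_hits, fulltext_hits], equal_importance)
```

With equal importance, "b" ranks first because it is the only id that appears in both lanes, so it collects a contribution from each.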

Tools come in two flavours: Condukt tools (memory_save, memory_recall) bound to the Slack channel and user, plus MCP tools (save_memory, recall_memory) for external clients.

Design for Hive

Scopes

scope enum on memory_nodes:

  • :global: workspace-wide knowledge (“we use Conventional Commits with explicit scopes”).
  • :product: tied to one product (“the iOS macros library targets Swift 6”).

Atlas’s slack_* foreign keys disappear. Each node carries:

  • created_by_user_id (FK to users, on_delete: nilify_all): whoever called save_memory (OIDC session or MCP-authenticated user).
  • product_id (FK to products, on_delete: delete_all): set only when scope == :product. The changeset validates that :product requires a non-nil product_id and :global forbids one.
  • source_spec_id (FK to specs, on_delete: nilify_all): optional pointer to the spec the memory was derived from, analogous to Atlas’s source_slack_message_id.
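
The scope and product_id pairing rule can be sketched as a pure validation function. This is Python for illustration only; in Hive it would live in the Ecto changeset. The NodeAttrs type and the error messages are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NodeAttrs:
    scope: str                        # "global" or "product"
    product_id: Optional[str] = None  # set only for product-scoped nodes


def validate_scope(attrs: NodeAttrs) -> list:
    """Return a list of error messages; an empty list means the pairing is valid."""
    if attrs.scope not in ("global", "product"):
        return [f"unknown scope: {attrs.scope}"]
    if attrs.scope == "product" and attrs.product_id is None:
        return ["product_id is required when scope is product"]
    if attrs.scope == "global" and attrs.product_id is not None:
        return ["product_id must be empty when scope is global"]
    return []
```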

Visibility

Memory nodes gain a visibility field (:public | :private, default :public), consistent with Spec, Product, and GithubRepository. Hive.Memory.effective_visibility/1 mirrors Hive.Specs.effective_visibility/1:

  • Returns the node’s explicit visibility when set.
  • For :product-scoped nodes with nil visibility, inherits from the linked product (private product => private memory).
  • For :global-scoped nodes, defaults to :public.

can_view?(node, user) parallels Hive.Specs.can_view?: it returns true when Auth.member?(user) holds or effective_visibility(node) == :public. Visitors to a public Hive instance who are not members can therefore recall memory whose effective visibility is :public, matching how public specs work today.
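
The visibility rules can be sketched as follows. This is Python for illustration, not Hive's Elixir. The Node and Product types are hypothetical, and modelling an unset visibility as None is an assumption: the inheritance rule only applies if the stored value can be absent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Product:
    visibility: str  # "public" or "private"


@dataclass
class Node:
    scope: str                        # "global" or "product"
    visibility: Optional[str] = None  # None models "not explicitly set" (assumption)
    product: Optional[Product] = None


def effective_visibility(node: Node) -> str:
    if node.visibility is not None:
        return node.visibility  # an explicit value wins
    if node.scope == "product":
        # Assumption: product-scoped nodes always have their product loaded.
        assert node.product is not None
        return node.product.visibility  # inherit from the linked product
    return "public"  # global nodes default to public


def can_view(node: Node, is_member: bool) -> bool:
    # Members see everything; everyone else sees only public memory.
    return is_member or effective_visibility(node) == "public"
```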

Schema sketch

memory_nodes
  id                  uuid pk
  kind                enum [fact, preference, decision, identity, event, observation, goal, todo]
  body                text (1..4_000 chars)
  importance          float (0..1, default per kind)
  access_count        int (default 0)
  last_accessed_at    utc_datetime
  forgotten           boolean default false
  scope               enum [global, product] default global
  visibility          enum [public, private] default public
  product_id          uuid fk(products) on_delete delete_all (required when scope=product)
  source_spec_id      uuid fk(specs) on_delete nilify_all
  created_by_user_id  uuid fk(users) on_delete nilify_all
  embedding_model     string
  embedded_at         utc_datetime
  inserted_at / updated_at
  index (scope, forgotten)
  index (kind, forgotten)
  index (product_id)
  index (source_spec_id)
  index (created_by_user_id)
  gin index (to_tsvector('english', body))

memory_edges
  id                  uuid pk
  src_id              uuid fk(memory_nodes) on_delete delete_all not null
  dst_id              uuid fk(memory_nodes) on_delete delete_all not null
  kind                enum [related_to, updates, contradicts] not null
  weight              float (0..1, default 1.0)
  inserted_at / updated_at
  unique index (src_id, dst_id, kind)
  check (src_id <> dst_id)

memory_bulletins (phase 3)
  id                  uuid pk
  scope               enum [global, product] default global not null
  product_id          uuid fk(products) on_delete delete_all (null for :global; required for :product)
  body                text not null
  generated_at        utc_datetime not null
  inserted_at / updated_at
  unique index (scope, product_id) nulls_distinct false

Layout

lib/hive/memory.ex                                    public API: create_node, get_node, list_nodes,
                                                      search_nodes, forget_node, create_edge,
                                                      list_edges_by_*, get_bulletin, upsert_bulletin,
                                                      can_view?, can_save?, effective_visibility
lib/hive/memory/node.ex                               schema + changeset, default-importance map
lib/hive/memory/edge.ex                               schema + changeset
lib/hive/memory/bulletin.ex                           schema + changeset (phase 3)
lib/hive/memory/policy.ex                             LetMe policy, mirrors lib/hive/forage/policy.ex
lib/hive/memory/workers/link_node.ex                  phase 2 Oban worker
lib/hive/memory/workers/refresh_bulletin.ex           phase 3 Oban worker
lib/hive/memory/edge_classifier.ex                    phase 2 LLM agent
lib/hive/memory/bulletin_synthesizer.ex               phase 3 LLM agent
lib/hive/mcp/components/tools/save_memory.ex
lib/hive/mcp/components/tools/recall_memory.ex
lib/hive/mcp/components/tools/forget_memory.ex
lib/hive/mcp/components/tools/get_memory_bulletin.ex  phase 3
lib/hive_web/live/memory_live/                        /memory dashboard section
priv/repo/migrations/*_create_memory_nodes.exs
priv/repo/migrations/*_create_memory_edges.exs
priv/repo/migrations/*_create_memory_bulletins.exs    phase 3
test/hive/memory_test.exs
test/hive_web/live/memory_live/...
test/hive/mcp/components/tools/{save,recall,forget}_memory_test.exs

Hive.Memory stays the single domain entry point. Per CLAUDE.md, modules are small and domain-focused.

Retrieval

Hive.Memory.search_nodes/2 mirrors Atlas.Memory.search_nodes/2:

  1. Run a vector search (when Hive.OpendataVector.enabled?()) and a Postgres full-text search in parallel via Task.async.
  2. Pull limit * 3 candidates from each lane and fuse them with RRF (k=60).
  3. Hydrate from Postgres, filter by scope, kind, forgotten == false, and can_view?.
  4. Apply :updates supersession (drop the dst when both endpoints are in the result set) and :contradicts resolution (keep the newer).
  5. Sort by score * importance, take limit, bump access_count and last_accessed_at in a single Repo.update_all.
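
Steps 4 and 5 can be sketched as a pure function over already-hydrated candidates. This is Python for illustration, not Hive's Elixir. The Candidate type and the edge tuple shape are hypothetical. Treating "newer" as the later inserted_at, and resolving :contradicts only when both endpoints are in the result set, are assumptions. The access-counter update is left out.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Candidate:
    id: str
    score: float           # fused RRF score
    importance: float      # 0..1
    inserted_at: datetime  # used to decide which node is newer (assumption)


def resolve(candidates, edges, limit):
    """Drop superseded and contradicted nodes, then rank and truncate.

    edges: iterable of (src_id, dst_id, kind) tuples.
    """
    by_id = {c.id: c for c in candidates}
    dropped = set()
    for src, dst, kind in edges:
        if src not in by_id or dst not in by_id:
            continue  # only act when both endpoints are in the result set
        if kind == "updates":
            dropped.add(dst)  # src updates dst, so dst is superseded
        elif kind == "contradicts":
            older = min((by_id[src], by_id[dst]), key=lambda c: c.inserted_at)
            dropped.add(older.id)  # keep the newer node
    kept = [c for c in candidates if c.id not in dropped]
    kept.sort(key=lambda c: c.score * c.importance, reverse=True)
    return kept[:limit]
```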

Vector keys follow Atlas’s convention: "memory_node:<uuid>", with source_type attribute "memory_node" so the OpenData Vector filter matches.

Indexing is best-effort: when OpenData Vector is not configured or the embedding call fails, the node is persisted without embedding_model / embedded_at, and lexical search still finds it.
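
The best-effort contract can be sketched like this. It is Python for illustration only: embed and upsert are hypothetical callables standing in for the OpenData Vector client, whose real interface is not described here. Only the key format and the source_type attribute come from the text.

```python
from datetime import datetime, timezone


def vector_key(node_id: str) -> str:
    # Key convention from the text: "memory_node:<uuid>".
    return f"memory_node:{node_id}"


def index_node(node: dict, embed, upsert, model_name: str) -> dict:
    """Try to embed and index a node; on any failure return it unchanged.

    embed(text) -> list of floats and upsert(key, vector, attributes) are
    hypothetical stand-ins for the vector service client.
    """
    try:
        vector = embed(node["body"])
        upsert(vector_key(node["id"]), vector, {"source_type": "memory_node"})
    except Exception:
        # No embedding metadata is recorded; full-text search still finds the node.
        return node
    return {
        **node,
        "embedding_model": model_name,
        "embedded_at": datetime.now(timezone.utc),
    }
```

Because the failure path returns the node untouched, the caller can persist the result either way without branching on whether the vector service was reachable.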

MCP tools

Register in Hive.MCP.Server alongside the existing spec tools:

  • save_memory: args { kind, body, importance?, scope?, product_id?, source_spec_id?, visibility? }. Member-gated via conn.assigns.current_user and Hive.Memory.can_save?. Validates the scope <=> product_id pairing. Enqueues the phase-2 LinkNode worker (a best-effort no-op until phase 2 lands) and, from phase 3 on, a debounced bulletin refresh.
  • recall_memory: args { query, kind?, scope?, product_id?, max_results? }. Anyone may call; the policy filter trims private memory for non-members. max_results is clamped to 1..20 and defaults to 6.
  • forget_memory: args { id }. Member-gated. Marks the node forgotten and deletes its vector entry.
  • get_memory_bulletin (phase 3): args { scope, product_id? }. Returns the current bulletin body and generated_at. Useful for an external agent that wants to prepend the bulletin to its own system prompt.

All four follow the existing Hive.MCP.Tool pattern (see lib/hive/mcp/components/tools/create_spec.ex) and serialize via Hive.MCP.Tool.json_response/1. Errors round-trip as %{error: "...", details: ...} like create_spec.
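
As an illustration of the argument handling, here is a sketch of how recall_memory might normalize its input. It is Python rather than Hive's Elixir. The function name and the error strings are hypothetical, and treating an empty query as an error is an assumption. Only the 1..20 clamp, the default of 6, and the error/details shape come from the text.

```python
DEFAULT_MAX_RESULTS = 6
MIN_RESULTS, MAX_RESULTS = 1, 20


def normalize_recall_args(args: dict) -> dict:
    """Return cleaned arguments, or an error map shaped like {error, details}."""
    query = (args.get("query") or "").strip()
    if not query:
        # Hypothetical error values; only the two-key shape is from the text.
        return {"error": "invalid_arguments", "details": {"query": "is required"}}
    requested = args.get("max_results")
    if requested is None:
        requested = DEFAULT_MAX_RESULTS
    max_results = max(MIN_RESULTS, min(MAX_RESULTS, int(requested)))
    return {
        "query": query,
        "kind": args.get("kind"),
        "scope": args.get("scope"),
        "product_id": args.get("product_id"),
        "max_results": max_results,
    }
```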

LiveView

lib/hive_web/live/memory_live/ mirrors forage_live/:

  • index_live.ex: searchable, filterable list (by kind, by scope, by product) of non-forgotten memory.
  • show_live.ex: a single node with its edges, recall stats, and a forget action gated by Hive.Memory.Policy.
  • Authorization via Hive.Memory.Policy (LetMe), same pattern as lib/hive/forage/policy.ex.
  • Page-specific OpenGraph metadata via OpenGraph.assigns/1 (CLAUDE.md convention).
  • CSS in assets/css/routes/memory_index.css and routes/memory_show.css, scoped under #memory-index / #memory-show, using the data-part anchor convention. No BEM, no utilities.
  • Dropdowns for kind / scope / product use Noora.Select, not native <select> (CLAUDE.md convention).

LLM-dependent pieces

Atlas’s edge classifier and bulletin synthesizer are single-turn Condukt agents that need an LLM. Hive does not currently have an LLM abstraction. Phase 1 ships everything except those two and the MCP tool that depends on the bulletin. Specifically:

  • Hive.Memory.create_node/1 returns successfully without enqueueing LinkNode when phase 2 is not present.
  • The recall pipeline still applies :updates and :contradicts semantics; edges just will not be auto-created. The agent can manually create edges before phase 2 lands via a small link_memories tool if useful.
  • get_memory_bulletin is omitted until phase 3.

When Hive grows a Hive.LLMs.Runner analog (or adopts Condukt directly), phase 2 wires up Hive.Memory.EdgeClassifier + Workers.LinkNode, and phase 3 wires up Hive.Memory.BulletinSynthesizer + Workers.RefreshBulletin. The synthesizer prompt is identical to Atlas’s modulo replacing “Slack assistant” with “Hive assistant” and dropping the channel reference.

Phasing

Each phase is mergeable and shippable on its own. Conventional Commit scope: memory.

Phase 1: storage + hybrid search + MCP curation (no LLM)

  • Migrations for memory_nodes and memory_edges.
  • Hive.Memory.Node, Hive.Memory.Edge, Hive.Memory (CRUD + hybrid search + supersession).
  • Hive.Memory.Policy (LetMe).
  • MCP tools save_memory, recall_memory, forget_memory registered in Hive.MCP.Server.
  • /memory LiveView (index + show).
  • Unit + LiveView tests, all async: true. Stub Hive.OpendataVector via Mimic; add the module to the Mimic.copy/1 calls in test/test_helper.exs if it is not already there.

Phase 2: edge classifier + auto-linking

Requires a Hive LLM abstraction. New work:

  • Introduce Hive.LLMs.Runner (or vendor Condukt) and document the HIVE_LLM_* env vars in config/runtime.exs and CLAUDE.md.
  • Hive.Memory.EdgeClassifier (structured output, identical schema to Atlas).
  • Hive.Memory.Workers.LinkNode (Oban worker; enqueued from create_node/1 when the runner is configured).
  • Best-effort: if the runner is not configured, LinkNode.enqueue/1 no-ops and create_node/1 still succeeds.
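
The best-effort enqueue contract can be sketched as follows. This is Python for illustration, not an Oban worker: the in-memory store and queue, the return values, and the runner_configured flag are all hypothetical stand-ins.

```python
def enqueue_link_node(node_id, runner_configured, queue):
    """Enqueue a LinkNode job, or do nothing when no LLM runner is configured."""
    if not runner_configured:
        return "noop"
    queue.append({"worker": "LinkNode", "node_id": node_id})
    return "enqueued"


def create_node(attrs, store, queue, runner_configured=False):
    """Persist the node first; linking is attempted afterwards and never blocks the save."""
    node = {"id": str(len(store) + 1), **attrs}
    store.append(node)
    enqueue_link_node(node["id"], runner_configured, queue)
    return node
```

Persisting before enqueueing means a missing runner changes only whether edges get created later, not whether the save succeeds.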

Phase 3: bulletin synthesizer + MCP bulletin tool

  • Migration for memory_bulletins.
  • Hive.Memory.BulletinSynthesizer (single-turn agent, prompt adapted from Atlas).
  • Hive.Memory.Workers.RefreshBulletin (debounced via Oban unique, daily cron fallback).
  • save_memory MCP tool schedules a debounced refresh.
  • get_memory_bulletin MCP tool surfaces the current bulletin for connecting agents.
  • /memory LiveView shows the latest bulletin at the top of each scope view.

Open questions

  • MCP authentication: save_memory and forget_memory need a current user. Confirm the existing MCP authentication path assigns current_user for MCP-token holders (it does for OAuth-via-Boruta clients per CreateSpec); recall_memory stays open like list_specs.
  • Visibility inheritance precedence: if a memory has an explicit :public value but its product is :private, the explicit value wins (matches Spec). Worth confirming before the migration lands so we do not have to change the precedence later.
  • Oban dependency: Atlas uses Oban for LinkNode and RefreshBulletin. Hive does not yet depend on Oban; phase 2 / 3 will need to add it. Alternative: a GenServer-based debounce inside the supervisor tree, but Oban gives us free retries plus cron for the daily refresh.
  • Embedding endpoint: phase 1 assumes Hive’s OpenData Vector exposes a compatible embedding endpoint the way Atlas uses Atlas.Documents.Embedding. Either confirm or add Hive.OpendataVector.Embedding as part of phase 1.

Done when

  • An MCP client can call save_memory to record a typed memory, then recall_memory from a fresh session and have the right memory surfaced, with supersession and contradiction handled.
  • Org members can browse memory at /memory, filter by kind / scope / product, and forget a node.
  • Phase 2 creates edges automatically on save once the LLM dependency is in place.
  • Phase 3 lands the bulletin and the MCP tool to retrieve it.

Draft history

Revision 2 · Proposed · Edited by pedro@tuist.dev
Revision 1 · Draft · Edited by pedro@tuist.dev