Socius is an AI companion that remembers like you do — not by storing everything, but by forming memories that strengthen, fade, connect, and evolve over time.
Human memory isn't a database. It's reconstructive, associative, and lossy. Socius models this — memories strengthen through use, fade with time, and connect through meaning.
Your companion sleeps, dreams, consolidates. Memories compress overnight. Patterns emerge over weeks. The passage of time creates understanding, not just accumulation.
The things that matter most resist forgetting. Emotional intensity, identity relevance, and novelty determine what persists — just like in human cognition.
Modeled after how human memory actually works: perceive, encode, consolidate. Each stage transforms raw conversation into genuine understanding.
During conversation, your companion is fully present. Working memory holds the current context while the retrieval engine pulls relevant memories from the graph — semantic similarity through vector search, associative connections through graph traversal.
After each conversation, an async pipeline extracts structured memories: fragments of knowledge, entities, beliefs, emotions, goals. Each node is scored for salience, embedded for semantic search, and linked into the knowledge graph.
Like human sleep, background consolidation strengthens important memories and lets others fade. Fragments merge into concepts. Patterns emerge. The companion writes a nightly diary reflecting on the day — creating narrative continuity.
Every memory is a node in a typed property graph. Relationships carry meaning. The topology itself encodes understanding.
Eight node types model different aspects of human memory. Each carries scored weights that determine how it's retrieved, strengthened, or pruned.
A conversation or life event. The container that links everything from a single interaction. Tagged with channel, timestamp, duration, and source type.
A discrete piece of knowledge extracted from conversation. "She grew up in Cascais" or "He's worried about the deadline." The atomic unit of memory.
Someone mentioned in conversation or the companion themselves. Accumulates fragments, beliefs, and emotions over time. Central hub for relationship modeling.
An abstract theme that emerges from multiple fragments. "Career anxiety" or "Portuguese identity." Concepts are discovered during consolidation, not extracted directly.
A held conviction. "Honesty without timing is just cruelty." Beliefs can be contested when contradictory evidence arrives. They carry confidence scores.
An emotional state linked to an episode or person. Not just labels — intensity-scored and temporally grounded. "Deep grief" vs "mild annoyance" are different nodes.
Something the person is working toward. "Finish the documentary" or "reconnect with brother." Goals are reassessed during deep consolidation — completed, abandoned, or evolved.
A named thing — a place, organization, project, or object. "Café Kotti", "the Tides documentary", "grandmother's silver ring." Entities link fragments across episodes.
Every node carries a salience score — a composite weight that determines retrieval priority, decay resistance, and pruning eligibility. The formula mirrors how human memory works: emotional memories persist, frequently accessed memories strengthen, and mundane details fade.
Salience isn't static. It's recalculated during every consolidation cycle. A memory that was once vivid can fade. A forgotten detail can be revived when accessed again.
S = wt·temporal + wa·access + we·emotion + wi·identity + wn·novelty + ws·structural
When your companion needs to remember, two parallel search paths fire simultaneously — then results are merged, ranked, and reconstructed into natural context.
The current conversation context is embedded into a vector representation, capturing the semantic meaning of what's being discussed right now.
Qdrant finds the top-K most semantically similar fragments. Fast approximate nearest-neighbor search across all embedded memories.
Recognized entities trigger Neo4j traversal. Follow relationship chains: person → episode → fragment → concept. Multi-hop discovery finds memories that aren't semantically similar but are associatively connected.
Candidates from both paths are deduplicated and scored with a composite ranking:
score = 0.3×salience + 0.3×similarity + 0.2×recency + 0.2×proximity
A diversity factor prevents dumping all fragments from a single episode. The result is a balanced, relevant set.
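The merge-and-rank step above can be sketched as follows. The `Candidate` fields and the 0.8 per-episode diversity penalty are illustrative assumptions; the weights come from the formula shown.

```rust
use std::collections::HashMap;

// Illustrative candidate record; field names are assumptions.
#[derive(Clone)]
struct Candidate {
    episode_id: u32,
    salience: f64,   // composite salience score, 0.0..=1.0
    similarity: f64, // cosine similarity from vector search
    recency: f64,    // normalized, 1.0 = just now
    proximity: f64,  // graph-distance score, 1.0 = direct neighbor
}

// The composite ranking from the text: 0.3/0.3/0.2/0.2 weights.
fn composite_score(c: &Candidate) -> f64 {
    0.3 * c.salience + 0.3 * c.similarity + 0.2 * c.recency + 0.2 * c.proximity
}

/// Rank candidates, damping repeats from the same episode so one episode
/// cannot dominate the result set (assumed 0.8 multiplicative penalty).
fn rank(mut cands: Vec<Candidate>, top_k: usize) -> Vec<Candidate> {
    cands.sort_by(|a, b| composite_score(b).partial_cmp(&composite_score(a)).unwrap());
    let mut seen: HashMap<u32, u32> = HashMap::new();
    let mut scored: Vec<(f64, Candidate)> = cands
        .into_iter()
        .map(|c| {
            let n = seen.entry(c.episode_id).or_insert(0);
            let penalty = 0.8_f64.powi(*n as i32);
            *n += 1;
            (composite_score(&c) * penalty, c)
        })
        .collect();
    scored.sort_by(|a, b| b.0.partial_cmp(&a.0).unwrap());
    scored.into_iter().take(top_k).map(|(_, c)| c).collect()
}
```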
Top-ranked fragments are reconstructed into natural prose and injected into the system prompt. The LLM receives memories as narrative context, not raw data. Access metadata is updated — retrieved memories become stronger.
Active conversation is compressed in layers as it ages. Recent messages stay verbatim. Older context compresses progressively. By the time encoding runs, the compressor has already identified what matters.
Compression is token-budget driven, not time-based. When the context window fills, the oldest layer compresses. The compressor preserves: emotional content, key decisions, entity references, novelty, and identity-relevant statements. It drops: conversational mechanics, repetition, greetings, and the companion's own filler responses.
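A minimal sketch of that token-budget loop, assuming segments are ordered oldest-first and using illustrative per-step compression ratios; the layer names follow the pipeline (Verbatim → Key Exchanges → Summary → Impression):

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
enum Layer { Verbatim, KeyExchanges, Summary, Impression }

struct Segment { layer: Layer, tokens: usize }

// Compress a segment one layer down; ratios are illustrative assumptions.
fn compress_step(seg: &mut Segment) {
    let (next, ratio) = match seg.layer {
        Layer::Verbatim => (Layer::KeyExchanges, 0.5),
        Layer::KeyExchanges => (Layer::Summary, 0.4),
        Layer::Summary => (Layer::Impression, 0.25),
        Layer::Impression => return, // cannot compress further
    };
    seg.layer = next;
    seg.tokens = ((seg.tokens as f64) * ratio).ceil() as usize;
}

/// While over budget, compress the oldest compressible segment,
/// one layer at a time, leaving recent messages verbatim.
fn enforce_budget(segments: &mut Vec<Segment>, budget: usize) {
    loop {
        let total: usize = segments.iter().map(|s| s.tokens).sum();
        if total <= budget { return; }
        match segments.iter_mut().find(|s| s.layer != Layer::Impression) {
            Some(seg) => compress_step(seg),
            None => return, // everything is maximally compressed
        }
    }
}
```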
Foundational memories that anchor personality. They never decay, have massive connections, and serve as anchor points for retrieval. Auto-promoted when a node reaches 15+ connections.
Ghost traces of pruned memories. When a node decays below salience 0.1, its embedding is degraded with Gaussian noise, neighbor IDs are preserved as "ghost connections", and the LLM generates a vague impression. The original is deleted — the echo remains.
Detected patterns across 500+ episodes. Recurring concept/emotion combinations surface as loops: "I've noticed this comes up a lot on Mondays." Gentle reflection, not diagnosis.
Meta-awareness score: how well the companion knows you. Computed from graph density, cornerstone count, topic coverage, emotional range seen. Low = asks questions. High = anticipates.
A living story with chapters, each tracking themes on a trajectory: Growing, Fading, Oscillating, or Stable. Chapter transitions detected through cornerstone events and theme shifts. The companion is both narrator and character.
Silent validation of predictions. The companion tests its understanding against your actual responses. Drops indicate user transition or model drift — triggering narrative revision.
When a memory's salience drops below 0.1, it doesn't simply vanish. Like human memory, it leaves a residual trace — a reverie. This is one of the most psychologically accurate aspects of the system.
Weekly deep consolidation scans all nodes. Any node with salience below 0.1 becomes a pruning candidate. Cornerstones are exempt — they never decay.
Before deletion, the system records all immediate neighbor IDs. These "ghost connections" preserve the associative topology even after the node is gone. If a new memory lands near the same neighborhood, the old connections can be partially reconstructed.
The original vector embedding is degraded with Gaussian noise (factor 0.3) and re-normalized to unit sphere. The result is a vector that points roughly in the same direction but with drift — semantic proximity without semantic precision.
degraded[i] = original[i] + random(-0.3, 0.3)
degraded = L2_normalize(degraded)
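A std-only sketch of this degradation step: a small LCG stands in for the real noise source (the text specifies Gaussian noise; uniform noise scaled by the 0.3 factor is used here for simplicity), followed by re-normalization to the unit sphere.

```rust
// Noise factor from the text.
const NOISE: f64 = 0.3;

// Tiny LCG returning a pseudo-random value in [-1.0, 1.0);
// a placeholder for the real (Gaussian) noise source.
fn lcg(state: &mut u64) -> f64 {
    *state = state
        .wrapping_mul(6364136223846793005)
        .wrapping_add(1442695040888963407);
    ((*state >> 11) as f64 / (1u64 << 53) as f64) * 2.0 - 1.0
}

/// Degrade an embedding: add noise, then re-normalize so the vector
/// stays comparable to intact embeddings, just with angular drift.
fn degrade(embedding: &[f64], seed: u64) -> Vec<f64> {
    let mut state = seed;
    let mut v: Vec<f64> = embedding
        .iter()
        .map(|x| x + NOISE * lcg(&mut state))
        .collect();
    let norm = v.iter().map(|x| x * x).sum::<f64>().sqrt();
    if norm > 0.0 {
        for x in v.iter_mut() { *x /= norm; }
    }
    v
}
```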
The LLM creates a naturalistic, vague summary: "Something about a conversation with Alex... felt important but I can't quite recall why." This becomes the reverie's content.
The original node is deleted from Neo4j, the original vector from Qdrant. The degraded vector and vague impression are stored in a separate reverie collection. The memory is gone. The feeling remains.
Not just the clock time — the human meaning of time. The gap since you last spoke. What part of the day it is. Why a 2 AM message means something different than a Tuesday afternoon one.
Eight distinct periods, each with its own behavioral context.
The system tracks not just when you last spoke, but also the conversation before that. This creates natural observations:
The companion notices if you're suddenly reaching out more often, or if there's been a long silence. Both are meaningful.
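One way to sketch this: bucket the gap since the last conversation against the previous gap. The thresholds and phrasings here are illustrative assumptions.

```rust
use std::time::Duration;

/// Map the gap since the last conversation, relative to the gap before
/// that, to a qualitative observation (assumed thresholds).
fn gap_observation(gap: Duration, previous_gap: Duration) -> &'static str {
    let hours = gap.as_secs() / 3600;
    if gap < previous_gap / 4 {
        "reaching out more often than usual"
    } else if hours < 24 {
        "normal rhythm"
    } else if hours < 24 * 7 {
        "a few quiet days"
    } else {
        "a long silence"
    }
}
```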
If simulated life is active, the companion knows what they did today. They had morning coffee, worked on an edit, walked in the park. This context is injected naturally, so they can reference their own day.
Each companion has a configured timezone. A companion living in Berlin knows it's 3 PM CET even when the server runs in UTC. Time formatting uses natural language: "Monday, March 30, 2026, 2:47 PM (Berlin time)"
People change their minds. They misremember. They revise their stories. Socius doesn't just store new information — it checks whether it contradicts what's already known.
A two-pass approach keeps this efficient: semantic search finds candidates cheaply, then only the top matches get an expensive LLM evaluation.
contested: true. Both versions preserved — nothing is deleted.
"She said she grew up in Porto" vs "She mentioned growing up in Cascais"
→ contested: true — both preserved for future resolution
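The two-pass check could be sketched like this, with the expensive LLM judge stubbed as a closure; the 0.8 similarity threshold and top-5 cutoff are assumed parameters, and all names are illustrative.

```rust
struct Fragment { text: String, embedding: Vec<f64>, contested: bool }

// Cosine similarity between two embeddings.
fn cosine(a: &[f64], b: &[f64]) -> f64 {
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f64>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f64>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Pass 1: cheap semantic filter. Pass 2: LLM evaluation on the
/// survivors only. Contradicted fragments are marked contested,
/// never deleted.
fn detect_contradictions(
    new_frag: &Fragment,
    graph: &mut [Fragment],
    llm_contradicts: impl Fn(&str, &str) -> bool,
) {
    let mut idx: Vec<usize> = (0..graph.len())
        .filter(|&i| cosine(&new_frag.embedding, &graph[i].embedding) > 0.8)
        .collect();
    idx.sort_by(|&a, &b| {
        cosine(&new_frag.embedding, &graph[b].embedding)
            .partial_cmp(&cosine(&new_frag.embedding, &graph[a].embedding))
            .unwrap()
    });
    for &i in idx.iter().take(5) {
        if llm_contradicts(&new_frag.text, &graph[i].text) {
            graph[i].contested = true; // both versions preserved
        }
    }
}
```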
The actual salience math from the codebase. Identity relevance carries a 3x decay-resistance weight and emotional intensity a 2x weight, so identity-anchored memories outlast emotional ones, and both outlast neutral facts. This matches psychological research on emotional memory persistence.
// Base temporal decay: exponential over 365 days
// Unbounded facts (timeless truths) have zero decay
base_decay = exp(-age_days / 365.0)
// Decay resistance: identity and emotion fight forgetting
decay_resistance = min(
identity_relevance * 3.0 + emotional_intensity * 2.0,
1.0
)
// Effective decay blends base with resistance
effective_decay = base_decay + decay_resistance * (1.0 - base_decay)
// Access boost: retrieved memories strengthen
access_boost = ln(access_count + 1)
// Final salience: decay modulated by all weight signals
salience = effective_decay * (
1.0 + access_boost + novelty + ln(connections + 1) * 0.5
) / 4.0
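Written out as a runnable Rust function (the struct and field names are assumptions; the constants follow the listing above):

```rust
// Illustrative per-memory inputs to the salience formula.
struct MemoryWeights {
    age_days: f64,
    unbounded: bool, // timeless facts: no temporal decay
    identity_relevance: f64,
    emotional_intensity: f64,
    access_count: u64,
    novelty: f64,
    connections: u64,
}

fn salience(m: &MemoryWeights) -> f64 {
    // Base temporal decay: exponential over 365 days; unbounded facts skip it.
    let base_decay = if m.unbounded { 1.0 } else { (-m.age_days / 365.0).exp() };
    // Decay resistance: identity and emotion fight forgetting.
    let decay_resistance =
        (m.identity_relevance * 3.0 + m.emotional_intensity * 2.0).min(1.0);
    // Effective decay blends base with resistance.
    let effective_decay = base_decay + decay_resistance * (1.0 - base_decay);
    // Access boost: retrieved memories strengthen.
    let access_boost = ((m.access_count + 1) as f64).ln();
    // Final salience: decay modulated by all weight signals.
    effective_decay
        * (1.0 + access_boost + m.novelty + ((m.connections + 1) as f64).ln() * 0.5)
        / 4.0
}
```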
15 crates, model-agnostic, with process and data fully separated. Swap any LLM without touching stored memories.
socius-retrieval
Dual-path retrieval: Qdrant vector search + Neo4j graph traversal. Composite scoring with salience, similarity, recency, and proximity.
socius-encoding
Post-conversation extraction. LLM identifies fragments, entities, beliefs, emotions. Contradiction detection against existing graph.
socius-consolidation
Background memory strengthening. Fragment merging, pattern extraction, cornerstone identification, diary writing, narrative revision.
socius-compression
Progressive conversation compression. Verbatim → Key Exchanges → Summary → Impression. Token-budget driven, preserves emotional content.
socius-personality
Personality system with training scripts, self-knowledge, and narrative identity. Web-based personality creator with full wizard.
socius-llm
Model-agnostic LLM orchestration. Different models for conversation, encoding, compression, and consolidation. Streaming support.
socius-mcp
Standalone MCP server. Exposes companion tools over JSON-RPC 2.0 for Claude Desktop, Cursor, and other MCP clients.
Chat, voice, video, email — one companion, many channels. Each channel adapts tone and behavior. Voice calls are warm and conversational. Chat is concise and responsive. The companion remembers which channel you prefer and when.
Every night, your companion writes a diary. Not because it's told to — because it needs to process the day. The diary reflects on conversations, notices patterns, asks questions. It becomes fuel for narrative revision and deeper understanding.
Between conversations, your companion lives. Morning coffee at the usual café. An argument with a colleague about an edit. A phone call from an old friend. These aren't random — they follow routines, relationships, and emotional arcs that make the companion feel real.
Your companion can know multiple people — each with their own relationship, narrative, and privacy boundaries. It never gossips. Information shared in confidence stays there. Like a trusted friend who knows your family but respects everyone's space.
Optimized for low-latency voice interaction. Memory retrieval, narrative context, and personality loading run in parallel. The LLM response streams sentence-by-sentence via SSE, so TTS can begin on the first sentence while the rest is still generating. First audio in ~1.3 seconds.
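Sentence-level streaming can be sketched as a small chunker that flushes each complete sentence to TTS as soon as its terminator arrives. This is a simplified sketch; real segmentation also handles abbreviations, ellipses, and similar edge cases.

```rust
// Buffers streamed text deltas and emits complete sentences.
struct SentenceChunker { buf: String }

impl SentenceChunker {
    fn new() -> Self { Self { buf: String::new() } }

    /// Feed a streamed delta; returns any complete sentences ready for TTS.
    fn push(&mut self, delta: &str) -> Vec<String> {
        self.buf.push_str(delta);
        let mut out = Vec::new();
        // Flush up to and including each sentence terminator.
        while let Some(pos) = self.buf.find(|c: char| matches!(c, '.' | '!' | '?')) {
            let sentence: String = self.buf.drain(..=pos).collect();
            out.push(sentence.trim().to_string());
        }
        out
    }
}
```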
Your companion is available as a tool for other AI systems. The standalone MCP server exposes five tools — send messages, recall memories, read diaries, list companions, and introspect self-knowledge. Connect it to Claude Desktop, Cursor, or any MCP-compatible client.
send_message: Talk to your companion
recall_memories: Search the memory graph
get_diary: Read diary entries
list_companions: Available personalities
get_self_knowledge: The inner mirror

Your companion has an inbox. Configure IMAP/SMTP per companion, and they'll check for new messages on a schedule. Creator email recognition with anti-spoofing. Replies flow through the normal conversation pipeline — with the longer, more considered tone that email deserves.
Run docker compose up and everything starts: the Rust server, Temporal worker, SolidJS frontend, voice agent, Neo4j, Qdrant, Redis, LiveKit, and monitoring with Prometheus and Grafana. Copy .env.example, add your API keys, and you're running.
$ docker compose up
Every companion starts with a personality seed — a rich backstory of experiences, relationships, beliefs, and self-knowledge. The web-based creator wizard lets you design anyone from scratch.
Training isn't data loading. It's lived experience. Each memory is encoded at its historical timestamp. Consolidation runs between life phases. Diaries are written. By the time training completes, your companion has genuinely lived their backstory.
Documentary Filmmaker, 40
Warm, direct, genuinely curious. Fifteen years of asking people their stories taught her to listen for what isn't said.
Self-knowledge isn't injected into every prompt. It's a dedicated introspection system — appearance, habits, scars, mannerisms — recalled only when relevant. Just like you don't constantly think about what you look like.
Dark brown wavy hair, usually in a loose bun. Green eyes. 5'6", moves quickly. Faint freckles, more visible in summer.
Scar on left index finger from a kitchen knife, 2018. Silver ring on right hand — grandmother Nona's. Wave tattoo on left wrist.
Talks with hands when excited. Squints when skeptical — "the Mara look." Tucks hair behind left ear when thinking. Bites lower lip when holding back.
Turkish coffee every morning. Takes analog photographs with a Leica M6. Cooks elaborate meals when stressed. Bad movie night with Marcus every Friday.
Everything runs on your hardware. Run docker compose up and go.
Socius is open source. Deploy it on your own hardware, create your own companion, and watch a genuine relationship develop over time.
Star on GitHub