Agent Memory System

GeniSpace agents use a dual-layer memory architecture to store and retrieve past interactions, which keeps conversations coherent and personalized. This guide covers the memory system's architecture, configuration, and best practices.

Dual-Layer Memory Architecture

GeniSpace uses an optimized dual-layer memory architecture that eliminates content overlap issues found in traditional three-layer architectures:

1. Recent Conversation Memory

  • Function: Preserves the current session and recent conversation history
  • Retrieval Method: Fetches all recent messages in a single API call
  • Intelligent Processing: Automatically formats into coherent conversational context
  • Capacity Configuration: 10–200 messages (default: 50)

2. Historical Memory

  • Function: Retrieves relevant historical memories based on vector similarity
  • Storage Method: Uses a vector database (Milvus) for semantic retrieval
  • Intelligent Matching: Finds the most relevant historical interactions based on semantic similarity
  • Retrieval Count: 1–20 results (default: 5)

Dual-Layer Architecture Advantages

Compared to traditional three-layer architectures:

  • Avoids Overlap: Eliminates content duplication between short-term and mid-term memory
  • Performance Optimization: Reduces API call count and improves response speed
  • Simplified Logic: The two-layer structure is clearer and easier to maintain
  • Intelligent Integration: Automatically merges both memory layers to form complete context
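
The merge of the two layers can be pictured with a minimal sketch. This is not GeniSpace's implementation: the function name `buildContext`, the `{ id, role, content }` record shape, and the section labels are all assumptions made for illustration.

```javascript
// Hypothetical sketch: merge the two memory layers into one prompt context.
// The record shape ({ id, role, content }) is an assumption, not a documented schema.
function buildContext(recentMessages, historicalMemories) {
  // Skip historical entries already present in the recent layer,
  // so the same content is never included twice.
  const recentIds = new Set(recentMessages.map((m) => m.id));
  const uniqueHistorical = historicalMemories.filter((m) => !recentIds.has(m.id));

  const historicalBlock = uniqueHistorical.map((m) => `- ${m.content}`).join("\n");
  const recentBlock = recentMessages.map((m) => `${m.role}: ${m.content}`).join("\n");

  // Only emit a section when its layer has content.
  const sections = [];
  if (historicalBlock) sections.push(`Relevant history:\n${historicalBlock}`);
  if (recentBlock) sections.push(`Recent conversation:\n${recentBlock}`);
  return sections.join("\n\n");
}
```

Filtering by `id` before formatting is one simple way to get the "avoids overlap" property; the real system may deduplicate differently.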

Multi-Level Memory Isolation

The agent memory system supports three isolation levels, adapting to different use cases and privacy requirements:

Session Isolation

  • Scope: Each conversation session has independent memory
  • Use Cases:
    • Conversations requiring strict privacy protection
    • Independent discussions on different topics
    • Temporary tasks and consultations
  • Characteristics:
    • Information is completely isolated between sessions
    • Memory can be optionally cleared after a session ends
    • Highest level of privacy protection

User Isolation

  • Scope: Memory is shared across all sessions for the same user
  • Use Cases:
    • Personal assistants and learning companions
    • Applications that need to remember user preferences
    • Long-term project tracking
  • Characteristics:
    • Remembers user habits and preferences across sessions
    • Provides a personalized continuous experience
    • Supports long-term relationship building

Team Isolation

  • Scope: Memory is shared across all team members
  • Use Cases:
    • Team collaboration projects
    • Knowledge sharing and preservation
    • Collective decision-making support
  • Characteristics:
    • Team knowledge accumulation and sharing
    • Collaborative memory and experience transfer
    • Improves overall team efficiency
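
One way to think about the three levels is as a choice of which identifier scopes a memory lookup. The sketch below is an illustration only: `memoryScopeKey`, the `sessionId`/`userId`/`teamId` field names, and the `level:id` key format are hypothetical and not part of the GeniSpace API.

```javascript
// Hypothetical mapping from isolation level to the identifier that scopes memory.
const ID_FIELD_BY_LEVEL = new Map([
  ["session", "sessionId"],
  ["user", "userId"],
  ["team", "teamId"],
]);

// Returns a scope key such as "user:u7". Memories stored under one key
// are invisible to lookups made under a different key.
function memoryScopeKey(isolation, ids) {
  const field = ID_FIELD_BY_LEVEL.get(isolation);
  if (!field) throw new Error(`Unknown isolation level: ${isolation}`);
  const value = ids[field];
  if (!value) throw new Error(`Missing ${field} for ${isolation} isolation`);
  return `${isolation}:${value}`;
}
```

With session isolation two sessions of the same user produce different keys, while with user isolation they share one, which is the behavior the sections above describe.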

Configuring the Memory System

Basic Configuration

In the "Memory Configuration" section of the agent configuration page, you can adjust the following settings:

1. Enable Memory

☑️ Enable Memory
  • Conversational Agents: Enable conversation memory to maintain context across sessions
  • Task Agents: Enable task memory to remember previous task executions

2. Memory Isolation Level Selection

○ Session Isolation - Each conversation session has independent memory
○ User Isolation - Memory is shared across all sessions for the same user
○ Team Isolation - Memory is shared across all team members

3. Fine-Tuned Memory Recall Parameters

Historical Memory Retrieval Count

  • Range: 1–20
  • Default: 5
  • Description: Maximum number of historical memories retrieved from the vector database

Maximum Recent Conversation Messages

  • Range: 10–200
  • Default: 50
  • Description: Maximum number of messages in recent conversation memory (includes current and historical sessions)

Important Turns Count

  • Range: 3–50
  • Default: 10
  • Description: Number of important conversation turns used for intelligent summarization
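
A client can check these ranges before submitting a configuration. The sketch below uses the field names from the configuration example later in this guide; matching each field to a setting above is inferred from the names, and whether the server clamps or rejects out-of-range values is not stated here, so the clamping is an assumption.

```javascript
// Ranges and defaults as listed in this guide; the field-to-setting mapping
// is inferred from the field names and should be treated as an assumption.
const MEMORY_LIMITS = {
  historical_max_results: { min: 1, max: 20, fallback: 5 },
  recent_conversation_max_messages: { min: 10, max: 200, fallback: 50 },
  recent_important_turns: { min: 3, max: 50, fallback: 10 },
};

// Returns a copy of config with each numeric field clamped into its range;
// missing or non-integer values fall back to the documented default.
function normalizeMemoryConfig(config = {}) {
  const normalized = { ...config };
  for (const [key, { min, max, fallback }] of Object.entries(MEMORY_LIMITS)) {
    const value = Number.isInteger(config[key]) ? config[key] : fallback;
    normalized[key] = Math.min(max, Math.max(min, value));
  }
  return normalized;
}
```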

Configuration Details

After configuration, the system works as follows:

Memory Workflow

  1. Recent Conversations: The system fetches the latest session messages from the database
  2. Historical Memory: The system finds relevant historical content via vector search
  3. Important Turns: The important turns count sets how many recent turns are kept in full; older messages are summarized

Intelligent Summarization Mechanism

When the message count exceeds the important turns count, the system will:

  • Retain the most recent important conversation turns (full content)
  • Intelligently summarize older historical messages
  • Automatically extract key information points and decisions
  • Save context space and improve processing efficiency
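
The split between turns kept in full and messages handed to the summarizer can be sketched as follows. This is a simplified illustration: it treats each message as one turn, and `splitForSummary` and the `summarize` callback are hypothetical names, standing in for whatever summarization the system actually performs.

```javascript
// Hypothetical sketch: keep the last `importantTurns` messages verbatim
// and pass everything older to a caller-supplied summarizer.
// Assumes one message equals one turn, which may not match the real system.
function splitForSummary(messages, importantTurns, summarize) {
  if (messages.length <= importantTurns) {
    // Nothing exceeds the threshold, so no summary is needed.
    return { summary: null, kept: messages };
  }
  const cut = messages.length - importantTurns;
  const older = messages.slice(0, cut);
  const kept = messages.slice(cut);
  return { summary: summarize(older), kept };
}
```

For example, with 60 messages and an important turns count of 10, the last 10 messages stay intact and the first 50 are replaced by a single summary.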

Vector Retrieval Principles

Historical memory is retrieved through a vector database (Milvus). For each request, the system:

  • Converts user input into vector representations
  • Performs semantic similarity searches across historical interactions
  • Returns the most relevant historical memory fragments
  • Supports cross-language and multimodal content retrieval
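
The retrieval step can be illustrated with a brute-force similarity search. This sketch assumes cosine similarity and in-memory vectors; the guide does not state which metric GeniSpace uses, and a vector database such as Milvus relies on indexes rather than a full scan. The names `cosineSimilarity` and `topKMemories` are hypothetical.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid division by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Scores every memory against the query vector and returns the k best,
// highest score first. k plays the role of the retrieval count setting.
function topKMemories(queryVector, memories, k = 5) {
  return memories
    .map((memory) => ({ ...memory, score: cosineSimilarity(queryVector, memory.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

Because ranking depends on vector direction rather than shared words, two differently phrased sentences with similar meaning can still match.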

Best Practices

Choosing a Memory Strategy Based on Isolation Level

Session Isolation Use Cases

Recommended Configuration:

  • Historical memory retrieval: 3–5 items
  • Recent conversation messages: 20–50
  • Important turns: 5–8

Suitable For:

  • Customer consultation services (high privacy requirements)
  • Temporary task processing
  • Sensitive information discussions

User Isolation Use Cases

Recommended Configuration:

  • Historical memory retrieval: 5–10 items
  • Recent conversation messages: 50–100
  • Important turns: 8–15

Suitable For:

  • Personal assistants and learning companions
  • Long-term project tracking
  • Personalized services

Team Isolation Use Cases

Recommended Configuration:

  • Historical memory retrieval: 8–15 items
  • Recent conversation messages: 100–200
  • Important turns: 15–30

Suitable For:

  • Team collaboration projects
  • Knowledge management systems
  • Collective decision-making support
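
The recommended ranges above can be encoded as data and used to flag settings that fall outside them. This is a convenience sketch, not a GeniSpace feature; the field names follow the configuration example later in this guide, and `outOfRecommendedRange` is a hypothetical helper.

```javascript
// Recommended [min, max] ranges per isolation level, copied from this guide.
const RECOMMENDED = {
  session: {
    historical_max_results: [3, 5],
    recent_conversation_max_messages: [20, 50],
    recent_important_turns: [5, 8],
  },
  user: {
    historical_max_results: [5, 10],
    recent_conversation_max_messages: [50, 100],
    recent_important_turns: [8, 15],
  },
  team: {
    historical_max_results: [8, 15],
    recent_conversation_max_messages: [100, 200],
    recent_important_turns: [15, 30],
  },
};

// Returns the names of the fields whose values fall outside the
// recommended range for config.isolation (an empty array means all fit).
function outOfRecommendedRange(config) {
  if (!Object.prototype.hasOwnProperty.call(RECOMMENDED, config.isolation)) {
    throw new Error(`Unknown isolation level: ${config.isolation}`);
  }
  return Object.entries(RECOMMENDED[config.isolation])
    .filter(([key, [min, max]]) => config[key] < min || config[key] > max)
    .map(([key]) => key);
}
```

The ranges are recommendations rather than hard limits, so a flagged field is a prompt to double-check the choice, not an error.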

Performance Optimization Tips

Improving Memory Retrieval Efficiency

  1. Set Retrieval Count Appropriately: Avoid retrieving too many historical memories, which can affect response speed
  2. Optimize Query Quality: Use specific, clear keywords to improve retrieval accuracy
  3. Regularly Clean Up Invalid Memories: Delete outdated or incorrect historical memories

Memory Usage Optimization

  1. Control Recent Conversation Count: Adjust the message count limit based on actual needs
  2. Enable Intelligent Summarization: Let the system automatically compress lengthy historical conversations
  3. Balance Performance and Memory Depth: Find the right balance between response speed and memory comprehensiveness

Memory Management API

Developers can manage the agent memory system through the API:

Get Memory Statistics

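// queryParams is a URL-encoded query string supplied by the caller;
// the supported parameters are not listed in this guide.
const response = await fetch(`/developer/api-agents/${agentId}/memory/stats?${queryParams}`, {
  headers: {
    "Authorization": "Bearer YOUR_API_KEY"
  }
});

const stats = await response.json();
// Example response: { total_memories: 150, agent_id: "agt_123", isolation_level: "session" }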

Clear Session Memory

// Clear memory for a specific session
await fetch(`/developer/api-agents/${agentId}/memory/session/${sessionId}`, {
  method: "DELETE",
  headers: {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    agentId: agentId,
    isolation_level: "session" // session, user, team
  })
});

Clear Agent Memory

// Clear all memory for an agent
await fetch(`/developer/api-agents/${agentId}/memory`, {
  method: "DELETE",
  headers: {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    isolation_level: "user" // Set the isolation level as needed
  })
});
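
Because both DELETE calls are destructive, it can help to build the request in one place and validate the isolation level before anything is sent. The helper below is a hypothetical sketch that reuses only the paths, headers, and body fields shown in the two examples above; `buildClearMemoryRequest` is not part of the GeniSpace API.

```javascript
// Hypothetical helper: builds the URL and fetch options for a clear-memory call.
// With a sessionId it targets the session endpoint; without one, the whole agent.
function buildClearMemoryRequest({ agentId, apiKey, isolationLevel, sessionId }) {
  const levels = ["session", "user", "team"];
  if (!levels.includes(isolationLevel)) {
    throw new Error(`Unknown isolation level: ${isolationLevel}`);
  }
  const base = `/developer/api-agents/${encodeURIComponent(agentId)}/memory`;
  const url = sessionId ? `${base}/session/${encodeURIComponent(sessionId)}` : base;
  // The session example above also sends agentId in the body, so mirror that.
  const body = sessionId
    ? { agentId, isolation_level: isolationLevel }
    : { isolation_level: isolationLevel };
  return {
    url,
    options: {
      method: "DELETE",
      headers: {
        "Authorization": `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    },
  };
}
```

The result can be passed straight to fetch, for example `const { url, options } = buildClearMemoryRequest({ ... }); const res = await fetch(url, options);`, after which checking `res.ok` catches failed deletions.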

Configure Memory Parameters

Agent memory configuration is managed through the agent configuration API:

const memoryConfig = {
  enabled: true,
  isolation: "user", // session, user, team
  recent_conversation_max_messages: 50,
  recent_important_turns: 10,
  historical_max_results: 5
};

await fetch(`/developer/api-agents/${agentId}`, {
  method: "PATCH",
  headers: {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    memoryConfig
  })
});

FAQ

What are the advantages of the dual-layer memory architecture over traditional architectures?
The dual-layer architecture eliminates content overlap between short-term and mid-term memory, reduces API call count, and improves response speed. The architecture is also more concise, easier to understand and maintain, and intelligently merges both memory layers to form complete context.

How do I choose the right memory isolation level?
  • Session Isolation: Suitable for scenarios requiring strict privacy protection, such as customer consultations
  • User Isolation: Suitable for personal assistants that need to remember user preferences and habits
  • Team Isolation: Suitable for team collaboration, sharing knowledge and accumulated experience

How does the memory system handle multimodal content?
The system stores and retrieves multimodal content, including text and images. Because retrieval runs as a semantic similarity search in the vector database, it can match content across languages and modalities.

How is memory retrieval accuracy ensured?
The system converts content into semantic vectors using vector embeddings and retrieves by semantic similarity rather than keyword matching. This lets it find relevant historical memories even when they are phrased differently from the query.

How can I optimize memory system performance?
  1. Set retrieval count appropriately to avoid speed impacts from excessive retrieval
  2. Adjust recent conversation message count based on use case
  3. Enable intelligent summarization to compress lengthy conversations
  4. Regularly clean up invalid or outdated memories

How is memory data security ensured?
Memory data employs multi-level isolation mechanisms and supports end-to-end encrypted storage. Based on the configured isolation level, data access permissions are strictly controlled to ensure the privacy and security of user and team data.
