Agent Memory System

GeniSpace agents use a dual-layer memory architecture that stores and retrieves past interactions, giving users a more coherent and personalized conversational experience. This guide covers the memory system's architecture, configuration, and best practices.

Dual-Layer Memory Architecture

GeniSpace uses an optimized dual-layer memory architecture that eliminates content overlap issues found in traditional three-layer architectures:

1. Recent Conversation Memory

Function: Preserves the current session and recent conversation history
Retrieval Method: Fetches all recent messages in a single API call
Intelligent Processing: Automatically formats into coherent conversational context
Capacity Configuration: 10–200 messages (default: 50)

2. Historical Memory

Function: Retrieves relevant historical memories based on vector similarity
Storage Method: Uses a vector database (Milvus) for semantic retrieval
Intelligent Matching: Finds the most relevant historical interactions based on semantic similarity
Retrieval Count: 1–20 results (default: 5)

Dual-Layer Architecture Advantages

Compared to traditional three-layer architectures:

Avoids Overlap: Eliminates content duplication between short-term and mid-term memory
Performance Optimization: Reduces API call count and improves response speed
Simplified Logic: The two-layer structure is clearer and easier to maintain
Intelligent Integration: Automatically merges both memory layers to form complete context
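
One way to picture how the two layers combine into a single context is the sketch below. The `buildContext` helper and the message shape (`id`, `role`, `content`) are hypothetical and chosen for illustration; GeniSpace's internal merge logic is not described in this guide.

```javascript
// Hypothetical sketch: merge historical memories and recent messages
// into one context list without duplicating content.
function buildContext(recentMessages, historicalMemories) {
  // IDs already present in the recent layer.
  const recentIds = new Set(recentMessages.map((m) => m.id));

  // Drop historical hits that are already in the recent window.
  const uniqueHistorical = historicalMemories.filter((m) => !recentIds.has(m.id));

  // Retrieved older memories go first; the live conversation goes last.
  return [...uniqueHistorical, ...recentMessages];
}

const recent = [
  { id: "m3", role: "user", content: "What about pricing?" },
  { id: "m4", role: "assistant", content: "Plans start at the basic tier." },
];
const historical = [
  { id: "m1", role: "user", content: "I run a five-person team." },
  { id: "m3", role: "user", content: "What about pricing?" }, // overlaps with recent
];

const context = buildContext(recent, historical);
console.log(context.map((m) => m.id)); // [ 'm1', 'm3', 'm4' ]
```

Filtering by ID before concatenating is what keeps the same message from appearing in both layers.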

Multi-Level Memory Isolation

The agent memory system supports three isolation levels, adapting to different use cases and privacy requirements:

Session Isolation

Scope: Each conversation session has independent memory
Use Cases:
- Conversations requiring strict privacy protection
- Independent discussions on different topics
- Temporary tasks and consultations
Characteristics:
- Information is completely isolated between sessions
- Memory can be optionally cleared after a session ends
- Highest level of privacy protection

User Isolation

Scope: Memory is shared across all sessions for the same user
Use Cases:
- Personal assistants and learning companions
- Applications that need to remember user preferences
- Long-term project tracking
Characteristics:
- Remembers user habits and preferences across sessions
- Provides a personalized continuous experience
- Supports long-term relationship building

Team Isolation

Scope: Memory is shared across all team members
Use Cases:
- Team collaboration projects
- Knowledge sharing and preservation
- Collective decision-making support
Characteristics:
- Team knowledge accumulation and sharing
- Collaborative memory and experience transfer
- Improves overall team efficiency
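
The three levels differ mainly in which identifier a memory is scoped to. A minimal sketch of that idea follows; the `memoryScopeKey` function and the key format are assumptions made for illustration, not the platform's actual storage scheme.

```javascript
// Hypothetical sketch: derive the memory scope key for each isolation level.
function memoryScopeKey(isolation, ids) {
  switch (isolation) {
    case "session":
      return `session:${ids.sessionId}`; // one memory space per conversation
    case "user":
      return `user:${ids.userId}`; // shared across a user's sessions
    case "team":
      return `team:${ids.teamId}`; // shared across team members
    default:
      throw new Error(`Unknown isolation level: ${isolation}`);
  }
}

const ids = { sessionId: "s-1", userId: "u-7", teamId: "t-2" };
console.log(memoryScopeKey("session", ids)); // session:s-1
console.log(memoryScopeKey("user", ids));    // user:u-7
console.log(memoryScopeKey("team", ids));    // team:t-2
```

Two sessions of the same user share a key only at the user or team level, which matches the scopes described above.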

Configuring the Memory System

Basic Configuration

In the "Memory Configuration" section of the agent configuration page, you can adjust the following settings:

1. Enable Memory

☑️ Enable Memory

Conversational Agents: Enable conversation memory to maintain context across sessions
Task Agents: Enable task memory to remember previous task executions

2. Memory Isolation Level Selection

○ Session Isolation - Each conversation session has independent memory
○ User Isolation - Memory is shared across all sessions for the same user
○ Team Isolation - Memory is shared across all team members

3. Fine-Tuned Memory Recall Parameters

Historical Memory Retrieval Count

Range: 1–20
Default: 5
Description: Maximum number of historical memories retrieved from the vector database

Maximum Recent Conversation Messages

Range: 10–200
Default: 50
Description: Maximum number of messages in recent conversation memory (includes current and historical sessions)

Important Turns Count

Range: 3–50
Default: 10
Description: Number of important conversation turns used for intelligent summarization
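
When setting these values programmatically, it can help to check them against the documented ranges first. The sketch below uses the field names from the "Configure Memory Parameters" example later in this guide; the `normalizeMemoryConfig` helper is hypothetical, and whether the server clamps or rejects out-of-range values is not stated here.

```javascript
// Ranges and defaults as listed above.
const MEMORY_LIMITS = {
  historical_max_results: { min: 1, max: 20, fallback: 5 },
  recent_conversation_max_messages: { min: 10, max: 200, fallback: 50 },
  recent_important_turns: { min: 3, max: 50, fallback: 10 },
};

// Hypothetical helper: fill in defaults and clamp values into range.
function normalizeMemoryConfig(config) {
  const result = { ...config };
  for (const [key, { min, max, fallback }] of Object.entries(MEMORY_LIMITS)) {
    const value = Number.isInteger(config[key]) ? config[key] : fallback;
    result[key] = Math.min(max, Math.max(min, value));
  }
  return result;
}

const normalized = normalizeMemoryConfig({
  historical_max_results: 50,          // above the documented maximum
  recent_conversation_max_messages: 5, // below the documented minimum
});
console.log(normalized);
```

Here the two out-of-range values are pulled back to 20 and 10, and the missing `recent_important_turns` falls back to its default of 10.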

Configuration Details

After configuration, the system works as follows:

Memory Workflow

Recent Conversations: The latest session messages fetched from the database
Historical Memory: Relevant historical content found via vector search
Important Turns: The number of most recent conversation turns kept in full; older messages are condensed into a summary

Intelligent Summarization Mechanism

When the message count exceeds the important turns count, the system will:

Retain the most recent important conversation turns (full content)
Intelligently summarize older historical messages
Automatically extract key information points and decisions
Save context space and improve processing efficiency
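
The split between full-content turns and summarized history can be sketched as follows. This is an illustration only: `compressHistory` and the placeholder summarizer are hypothetical, and the sketch treats one message as one turn, which may differ from how the platform counts turns.

```javascript
// Hypothetical sketch: keep the last N turns in full, summarize the rest.
// `summarize` stands in for whatever summarizer the platform uses.
function compressHistory(messages, importantTurns, summarize) {
  if (messages.length <= importantTurns) {
    // Nothing to compress yet.
    return { summary: null, recent: messages };
  }
  const cutoff = messages.length - importantTurns;
  return {
    summary: summarize(messages.slice(0, cutoff)), // older messages, condensed
    recent: messages.slice(cutoff),                // newest turns, full content
  };
}

// Placeholder summarizer: only reports how many messages were folded away.
const naiveSummarize = (older) => `Summary of ${older.length} earlier messages`;

const conversation = Array.from({ length: 12 }, (_, i) => ({
  role: i % 2 === 0 ? "user" : "assistant",
  content: `message ${i + 1}`,
}));

const compressed = compressHistory(conversation, 10, naiveSummarize);
console.log(compressed.summary);       // Summary of 2 earlier messages
console.log(compressed.recent.length); // 10
```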

Vector Retrieval Principles

Historical memory retrieval is built on a vector database:

Converts user input into vector representations
Performs semantic similarity searches across historical interactions
Returns the most relevant historical memory fragments
Supports cross-language and multimodal content retrieval
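
To make the retrieval step concrete, here is a brute-force sketch of similarity search over a handful of stored vectors. It assumes cosine similarity as the metric and uses toy three-dimensional vectors; the metric and index that GeniSpace configures in Milvus are not stated in this guide, and a real vector database uses an index rather than a full scan.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k stored memories most similar to the query vector.
function topKMemories(queryVector, memories, k) {
  return memories
    .map((m) => ({ ...m, score: cosineSimilarity(queryVector, m.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Toy embeddings; real embeddings typically have hundreds of dimensions.
const stored = [
  { text: "User prefers weekly reports", vector: [1, 0, 0] },
  { text: "User asked about billing", vector: [0, 1, 0] },
  { text: "User wants reports as PDF", vector: [0.9, 0.1, 0] },
];

const hits = topKMemories([1, 0, 0], stored, 2);
console.log(hits.map((h) => h.text));
```

The `k` argument plays the role of the Historical Memory Retrieval Count setting: it caps how many memories are returned regardless of how many are stored.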

Best Practices

Choosing a Memory Strategy Based on Isolation Level

Session Isolation Use Cases

Recommended Configuration:

Historical memory retrieval: 3–5 items
Recent conversation messages: 20–50
Important turns: 5–8

Suitable For:

Customer consultation services (high privacy requirements)
Temporary task processing
Sensitive information discussions

User Isolation Use Cases

Recommended Configuration:

Historical memory retrieval: 5–10 items
Recent conversation messages: 50–100
Important turns: 8–15

Suitable For:

Personal assistants and learning companions
Long-term project tracking
Personalized services

Team Isolation Use Cases

Recommended Configuration:

Historical memory retrieval: 8–15 items
Recent conversation messages: 100–200
Important turns: 15–30

Suitable For:

Team collaboration projects
Knowledge management systems
Collective decision-making support
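
The recommendations above can be captured as data so that a starting configuration is generated consistently. In this sketch the ranges come from this section and the field names from the "Configure Memory Parameters" example; the `startingConfig` helper and its choice of the low end of each range are assumptions, intended as a conservative starting point to tune from.

```javascript
// Recommended [low, high] ranges from this section, keyed by isolation level.
const RECOMMENDED = {
  session: {
    historical_max_results: [3, 5],
    recent_conversation_max_messages: [20, 50],
    recent_important_turns: [5, 8],
  },
  user: {
    historical_max_results: [5, 10],
    recent_conversation_max_messages: [50, 100],
    recent_important_turns: [8, 15],
  },
  team: {
    historical_max_results: [8, 15],
    recent_conversation_max_messages: [100, 200],
    recent_important_turns: [15, 30],
  },
};

// Hypothetical helper: build a starting config from the low end of each range.
function startingConfig(isolation) {
  const ranges = RECOMMENDED[isolation];
  if (!ranges) throw new Error(`Unknown isolation level: ${isolation}`);
  const config = { enabled: true, isolation };
  for (const [key, [low]] of Object.entries(ranges)) {
    config[key] = low;
  }
  return config;
}

console.log(startingConfig("team"));
```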

Performance Optimization Tips

Improving Memory Retrieval Efficiency

Set Retrieval Count Appropriately: Retrieving too many historical memories slows responses, so keep the count as low as the use case allows
Optimize Query Quality: Use specific, clear keywords to improve retrieval accuracy
Regularly Clean Up Invalid Memories: Delete outdated or incorrect historical memories

Memory Usage Optimization

Control Recent Conversation Count: Adjust the message count limit based on actual needs
Enable Intelligent Summarization: Let the system automatically compress lengthy historical conversations
Balance Performance and Memory Depth: Find the right balance between response speed and memory comprehensiveness

Memory Management API

Developers can manage the agent memory system through the API:

Get Memory Statistics

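// agentId is the ID of the agent, for example "agt_123".
// The query parameter below is an assumption; check the API reference for supported filters.
const queryParams = new URLSearchParams({ isolation_level: "session" });

const response = await fetch(`/developer/api-agents/${agentId}/memory/stats?${queryParams}`, {
  headers: {
    "Authorization": "Bearer YOUR_API_KEY"
  }
});

if (!response.ok) {
  throw new Error(`Failed to fetch memory stats: HTTP ${response.status}`);
}

const stats = await response.json();
// Returns: { total_memories: 150, agent_id: "agt_123", isolation_level: "session" }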

Clear Session Memory

// Clear memory for a specific session
await fetch(`/developer/api-agents/${agentId}/memory/session/${sessionId}`, {
  method: "DELETE",
  headers: {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    agentId: agentId,
    isolation_level: "session"  // session, user, team
  })
});

Clear Agent Memory

// Clear all memory for an agent
await fetch(`/developer/api-agents/${agentId}/memory`, {
  method: "DELETE",
  headers: {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    isolation_level: "user"  // Set the isolation level as needed
  })
});

Configure Memory Parameters

Agent memory configuration is managed through the agent configuration API:

const memoryConfig = {
  enabled: true,
  isolation: "user",  // session, user, team
  recent_conversation_max_messages: 50,
  recent_important_turns: 10,
  historical_max_results: 5
};

await fetch(`/developer/api-agents/${agentId}`, {
  method: "PATCH",
  headers: {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    memoryConfig: memoryConfig
  })
});
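
The calls in this section do not check the HTTP status. A small wrapper like the one below is one way to surface failures. It is a sketch under assumptions: `updateMemoryConfig` is a hypothetical helper, the response body is assumed to be JSON, and the injectable `fetchImpl` parameter exists only so the function can be tried without a network connection.

```javascript
// Hypothetical helper: send a memory-config update and fail loudly on HTTP errors.
async function updateMemoryConfig(agentId, memoryConfig, apiKey, fetchImpl = fetch) {
  const response = await fetchImpl(`/developer/api-agents/${encodeURIComponent(agentId)}`, {
    method: "PATCH",
    headers: {
      "Authorization": `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ memoryConfig }),
  });
  if (!response.ok) {
    throw new Error(`Memory config update failed with HTTP ${response.status}`);
  }
  return response.json();
}

// Stubbed transport for a dry run; the response body shape is made up.
const stubFetch = async (url, options) => ({
  ok: true,
  status: 200,
  json: async () => ({ url, method: options.method }),
});

updateMemoryConfig("agt_123", { enabled: true, isolation: "user" }, "YOUR_API_KEY", stubFetch)
  .then((result) => console.log(result));
```

In application code, omit the last argument so the global `fetch` is used.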

FAQ

What are the advantages of the dual-layer memory architecture over traditional architectures?

The dual-layer architecture eliminates content overlap between short-term and mid-term memory, reduces API call count, and improves response speed. The architecture is also more concise, easier to understand and maintain, and intelligently merges both memory layers to form complete context.

How do I choose the right memory isolation level?

Session Isolation: Suitable for scenarios requiring strict privacy protection, such as customer consultations
User Isolation: Suitable for personal assistants that need to remember user preferences and habits
Team Isolation: Suitable for team collaboration, sharing knowledge and accumulated experience

How does the memory system handle multimodal content?

The system can store and retrieve multimodal content, including text and images. Because retrieval relies on semantic similarity search in a vector database, it can match content across languages and modalities.

How is memory retrieval accuracy ensured?

The system converts content into semantic vectors with an embedding model and retrieves by semantic similarity rather than keyword matching. This lets it find relevant historical memories even when they are worded differently from the query.

How can I optimize memory system performance?

Keep the retrieval count modest, since retrieving too many memories slows responses
Adjust recent conversation message count based on use case
Enable intelligent summarization to compress lengthy conversations
Regularly clean up invalid or outdated memories

How is memory data security ensured?

Memory data employs multi-level isolation mechanisms and supports end-to-end encrypted storage. Based on the configured isolation level, data access permissions are strictly controlled to ensure the privacy and security of user and team data.

Next Steps

Explore more agent features:

Learn about the complete feature set in Agent Overview
Study the advanced reasoning capabilities in Agent Overview
Master the powerful extensibility of the Tool System
Explore API Integration to integrate agents into your applications

Related guides:

Workflow Engine: Learn how agents work with workflows
Tool System: Extend agent capabilities
Data & Knowledge Base: Provide professional knowledge support for agents
