This article is co-authored by Shawn Tsai of Trend Micro.
Providing relevant and contextual answers is important for customer satisfaction. For enterprise AI chatbots, it is essential to understand not only the current query but also the organizational context behind it. Per-enterprise memory in Amazon Bedrock, powered by Amazon Neptune and Mem0, gives AI agents persistent, enterprise-specific context, allowing them to learn, adapt, and respond intelligently across multiple interactions. Trend Micro, one of the world’s largest cybersecurity companies, developed its Trend Companion chatbot so that customers can explore information through natural conversational interactions.
Trend Micro aimed to improve its AI chatbot service to provide personalized, contextual assistance to business customers. The chatbot needed to maintain conversation history for continuity, reference business-specific knowledge at scale, and keep its memory accurate, secure, and up to date. The challenge was to integrate long-term memory for organizational knowledge with short-term memory for ongoing conversations, while fostering enterprise-wide knowledge sharing. Working with the AWS team, including the AWS Generative AI Innovation Center, Trend Micro addressed this challenge using Amazon Neptune, Amazon OpenSearch Service, and Amazon Bedrock, as we explain in this post.
Solution Overview
Trend Micro implemented enterprise-level memory in Amazon Bedrock by combining several AWS services. Amazon Neptune stores an enterprise-specific knowledge graph representing organizational relationships, processes, and data to enable precise, structured retrieval. Mem0 manages short-term conversational memory for immediate context and long-term memory for persistent knowledge across sessions. Amazon Bedrock orchestrates the AI agent workflows, integrating with both Neptune and Mem0 to retrieve and apply contextual knowledge during inference. This architecture allows the chatbot to recall relevant history, retrieve structured knowledge about the business, and deliver personalized, context-rich responses, helping to significantly improve the user experience.
Creating and Updating Memory
The architecture begins by capturing user messages and extracting candidate entities, relationships, and memories with Anthropic's Claude on Amazon Bedrock. These are then embedded with Amazon Titan Text Embeddings and searched against both Amazon OpenSearch Service and Amazon Neptune. Relevant entities and memories are retrieved and updated by the model before being re-embedded and indexed in OpenSearch Service and Neptune. This closed loop keeps entity-related memories continually refreshed and keeps Neptune’s knowledge graph consistent with conversational information.
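To make the closed loop concrete, here is a minimal, self-contained sketch of the create/update flow. The AWS pieces (Claude extraction on Amazon Bedrock, Titan embeddings, OpenSearch Service and Neptune upserts) are replaced with in-memory stubs; all function names and data shapes here are illustrative assumptions, not Trend Micro's production implementation.

```python
import hashlib

def extract_triples(message: str) -> list[tuple[str, str, str]]:
    # Stand-in for a Claude extraction prompt that returns
    # (subject, relation, object) triples. This toy version only
    # recognizes the single pattern "X recognized Y".
    words = message.split()
    if "recognized" in words:
        i = words.index("recognized")
        subject = " ".join(words[:i])
        obj = " ".join(words[i + 1:]).rstrip(".")
        return [(subject, "RECOGNIZED", obj)]
    return []

def embed(text: str) -> list[float]:
    # Stand-in for Amazon Titan Text Embeddings: a tiny
    # deterministic vector derived from a hash of the text.
    digest = hashlib.sha256(text.encode()).digest()
    return [b / 255 for b in digest[:4]]

vector_index: dict[str, list[float]] = {}           # plays the role of OpenSearch Service
knowledge_graph: set[tuple[str, str, str]] = set()  # plays the role of Neptune

def ingest(message: str) -> None:
    # Closed loop: extract entities/triples, update the graph,
    # embed the memory, and index it for semantic search.
    for triple in extract_triples(message):
        knowledge_graph.add(triple)
    vector_index[message] = embed(message)

ingest("The Ilkhans recognized Kublai.")
```

In the real pipeline, `ingest` would also retrieve existing memories for each extracted entity and let the model reconcile them before re-indexing, which is what keeps the graph consistent over time.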
Memory Retrieval
When processing user queries, the system applies the same embedding pipeline with Amazon Titan Text Embeddings to search both the vector embeddings in OpenSearch Service and the fact triples in Neptune. Relevant memories are then reranked with the Amazon Rerank or Cohere Rerank models on Amazon Bedrock to surface the most accurate contextual information. This dual retrieval strategy combines the semantic flexibility of OpenSearch Service with the structured precision of Neptune, allowing the chatbot to provide highly relevant and contextual responses.
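The dual retrieval step can be sketched as follows. Cosine similarity stands in for the OpenSearch Service k-NN search, a substring match on triple subjects/objects stands in for the Neptune lookup, and a simple score sort stands in for the Bedrock rerank models; these are hedged simplifications, not the production scoring.

```python
from math import sqrt

def cosine(a: list[float], b: list[float]) -> float:
    # Cosine similarity between two vectors (0.0 if either is zero).
    dot = sum(x * y for x, y in zip(a, b))
    na, nb = sqrt(sum(x * x for x in a)), sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, query_vec: list[float],
             vector_index: dict, graph: set, top_k: int = 3) -> list[str]:
    # Semantic candidates from the vector store (OpenSearch role).
    semantic = [(text, cosine(query_vec, vec))
                for text, vec in vector_index.items()]
    # Structured candidates: triples whose subject or object
    # appears in the query (Neptune role).
    structured = [(" ".join(t), 1.0) for t in graph
                  if t[0].lower() in query.lower() or t[2].lower() in query.lower()]
    # "Rerank": merge both candidate lists and keep the top-scoring memories.
    merged = sorted(semantic + structured, key=lambda p: p[1], reverse=True)
    return [text for text, _ in merged[:top_k]]
```

Merging both candidate pools before reranking is what lets a vague query still surface an exact graph fact alongside semantically similar memories.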
Response Memory Mapping and Human Feedback in the Loop
For each AI response, the system maps its sentences to the specific memories referenced, generating a memory assessment report. Users can then approve or reject these mappings: approved memories remain in the knowledge base, while rejected ones are removed from OpenSearch Service and Neptune, so only validated, reliable knowledge persists. This human-in-the-loop mechanism builds trust, continually improves memory accuracy, and gives enterprise customers direct influence over refining their AI’s knowledge.
Amazon Neptune in Action
To illustrate how Amazon Neptune enriches the chatbot’s memory, consider a customer asking, “Who recognized Kublai as a leader?” Without the knowledge graph, the AI might return a vague answer such as: “Kublai was a Mongolian ruler who was recognized by different groups.” This kind of response is generic and lacks precision.
When the same question is asked but the Neptune entity graph is queried and the results are included in the large language model (LLM) prompt, the model can ground its reasoning in structured triples such as (Ilkhans, Recognized, Kublai). The chatbot can then respond more precisely: “According to the organization’s knowledge base, Kublai was recognized by the Ilkhans as a leader.” This before-and-after example demonstrates how structured entity relationships in Neptune allow the model to produce answers that are both contextually relevant and verifiable.
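The grounding step above can be sketched like this. The openCypher query is an illustrative example of what a Neptune lookup for this question could look like (Neptune supports openCypher); here it is shown only as a string, and the retrieved triples are hard-coded so the snippet is self-contained.

```python
# Illustrative openCypher query a Neptune lookup for this question might use;
# the label RECOGNIZED and property `name` are assumptions about the graph schema.
OPENCYPHER_QUERY = """
MATCH (s)-[r:RECOGNIZED]->(o {name: 'Kublai'})
RETURN s.name AS subject, type(r) AS relation, o.name AS object
"""

def triples_to_context(triples: list[tuple[str, str, str]]) -> str:
    # Render (subject, relation, object) triples as grounding
    # lines to prepend to the LLM prompt.
    lines = [f"({s}, {r}, {o})" for s, r, o in triples]
    return "Known facts from the organization's knowledge base:\n" + "\n".join(lines)

# Pretend these rows came back from the query above.
context = triples_to_context([("Ilkhans", "Recognized", "Kublai")])
```

With this context in the prompt, the model no longer has to guess who recognized Kublai; it can cite the triple directly, which is what makes the answer verifiable.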
Conclusion and Next Steps
As described in the AWS Trend Micro case study, Trend Micro uses AWS to deliver more secure, scalable, and intelligent customer experiences. Building on this foundation, Trend Micro combines Amazon Bedrock, Amazon Neptune, Amazon OpenSearch Service, and Mem0 to create an AI chatbot with organization-specific persistent memory that delivers intelligent, contextual conversations at scale. By integrating graph-based insights with generative AI, Trend Micro expects to improve response quality, providing clearer, more accurate answers while establishing a foundation for AI systems that continually adapt to evolving organizational knowledge. This work remains under evaluation and tuning to further improve the end-user experience.
Looking ahead, Trend Micro is exploring enhancements such as wider graph coverage, additional update pipelines, and multi-language support. For readers who want to dig deeper, we recommend exploring the GitHub sample implementation, which includes the sample source code, and the Amazon Neptune documentation for more technical details and inspiration.
About the Authors
Shawn Tsai
Shawn Tsai is a Senior Architect at Trend Micro, specializing in large language model application development and security practices, cloud architecture design, large-scale software architecture design, and DevOps practices. He is currently responsible for developing Trend Micro’s large language model applications and security framework.
Ray Wang
Ray Wang is a Senior Solutions Architect at AWS. With over 12 years of backend and consulting experience, Ray is dedicated to building modern cloud solutions spanning NoSQL, big data, machine learning, and generative AI. As a hungry go-getter, he earned all 12 AWS Certifications to broaden and deepen his technical knowledge. He loves reading and watching science fiction movies in his free time.
Zhihao Lin
Zhihao Lin is an applied researcher at the AWS Generative AI Innovation Center. With a master’s degree from Peking University and publications in leading conferences such as CVPR and IJCAI, he brings extensive experience in AI/ML research to his role. At AWS, he focuses on developing generative AI solutions, leveraging cutting-edge technology for innovative applications. He specializes in solving complex computer vision and natural language processing problems and advancing the practical use of generative AI in business.