LLM-Powered Decentralized Generative Agents with Adaptive Hierarchical Knowledge Graph for Cooperative Planning

Hanqing Yang1, Jingdi Chen1, Marie Siew2, Tania Lorido-Botran3,4, Carlee Joe-Wong1

Six language-model-powered agents work together to mine a diamond in the Multi-Agent Crafter environment. The environment supports a customizable number of agents, which can respawn and interact with one another. The goal is to collect a diamond as quickly as possible; the episode terminates once a diamond is found. To achieve this, agents must craft the necessary tools in a hierarchical crafting order, and they must maintain their health to remain in the environment.
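
To make the setup concrete, below is a minimal sketch of how one MAC episode might be driven. The Gym-style reset/step interface, the action names, and the RandomAgent stand-in are assumptions for illustration; this page does not specify the environment's actual API.

    import random
    from typing import Dict

    class RandomAgent:
        """Stand-in policy; a DAMCS agent would instead query an LLM planner."""
        ACTIONS = ["move", "collect", "craft", "place", "eat"]

        def act(self, observation) -> str:
            return random.choice(self.ACTIONS)

    def run_episode(env, agents: Dict[int, RandomAgent], max_steps: int = 1000) -> int:
        """Run one episode; `env` is a hypothetical Gym-style MAC instance."""
        observations = env.reset()
        for step in range(max_steps):
            # Each agent chooses its own action in a decentralized manner.
            actions = {i: agents[i].act(obs) for i, obs in observations.items()}
            observations, rewards, done, info = env.step(actions)
            if done:  # the episode terminates once a diamond is found
                return step + 1
        return max_steps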

What does our framework do?

We propose Decentralized Adaptive Knowledge Graph Memory and Structured Communication System (DAMCS) in a novel Multi-agent Crafter environment (MAC). Our approach is built on three key innovations:

  • Multi-Agent Crafter (MAC) Benchmark: We introduce MAC, an open-world testbed for multi-agent cooperation, extending Crafter. It provides a realistic platform for structured agent communication, coordination, and task allocation.
  • Adaptive Knowledge Graph Memory System (A-KGMS): A hierarchical memory system that enables agents to store, retrieve, and adapt knowledge dynamically, enhancing long-term planning and execution (a minimal sketch follows this list).
  • Structured Communication System (S-CS): A structured messaging framework that reduces redundant communication while ensuring efficient information sharing among agents.
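
As referenced above, the sketch below illustrates what a hierarchical knowledge-graph memory could look like. The node schema and method names (KGNode, mark_achieved, next_goals) are illustrative assumptions, not the paper's exact design.

    from dataclasses import dataclass, field
    from typing import Dict, List

    @dataclass
    class KGNode:
        """One goal in the hierarchy, e.g. 'craft_wood_pickaxe'."""
        name: str
        level: int                     # depth in the crafting hierarchy
        achieved: bool = False
        children: List["KGNode"] = field(default_factory=list)  # prerequisites

    class KnowledgeGraphMemory:
        def __init__(self, root_goal: str):
            self.root = KGNode(root_goal, level=0)
            self.index: Dict[str, KGNode] = {root_goal: self.root}

        def add_subgoal(self, parent: str, child: str) -> None:
            node = KGNode(child, level=self.index[parent].level + 1)
            self.index[parent].children.append(node)
            self.index[child] = node

        def mark_achieved(self, name: str) -> None:
            """Adapt the graph as the agent observes progress."""
            self.index[name].achieved = True

        def next_goals(self) -> List[str]:
            """Unachieved goals whose prerequisites are all satisfied."""
            return [n.name for n in self.index.values()
                    if not n.achieved and all(c.achieved for c in n.children)]

    # Example: the diamond requires an iron pickaxe, which requires wood.
    kg = KnowledgeGraphMemory("collect_diamond")
    kg.add_subgoal("collect_diamond", "craft_iron_pickaxe")
    kg.add_subgoal("craft_iron_pickaxe", "collect_wood")
    print(kg.next_goals())  # ['collect_wood']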

DAMCS integrates A-KGMS and S-CS to improve long-term collaboration, reducing redundant actions and enhancing role allocation in cooperative tasks. Evaluations using MAC show that DAMCS outperforms baselines, cutting task completion steps by up to 74% compared to single-agent scenarios. Our framework builds upon recent advancements in LLM-powered agents, such as Generative Agents, to enhance decentralized multi-agent cooperation. By enabling agents to autonomously plan, coordinate, and optimize communication, DAMCS aims to advance scalable, decentralized LLM-powered multi-agent systems for real-world applications.
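
To show how structured messaging can cut redundant traffic, here is a hedged sketch of a message schema and a deduplication rule. The fields and the suppression logic are assumptions for illustration; the actual S-CS schema is defined in the paper.

    from dataclasses import dataclass
    from typing import Optional, Set, Tuple

    @dataclass(frozen=True)
    class StructuredMessage:
        sender: int
        recipient: int               # e.g. -1 could denote broadcast
        intent: str                  # e.g. "inform", "request", "handoff"
        goal: Optional[str]          # subgoal the message refers to
        content: str                 # short natural-language payload

    def should_send(msg: StructuredMessage,
                    sent: Set[Tuple[int, str, Optional[str]]]) -> bool:
        """Suppress redundant traffic: skip a message if the same
        (recipient, intent, goal) triple was already communicated."""
        key = (msg.recipient, msg.intent, msg.goal)
        if key in sent:
            return False
        sent.add(key)
        return True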

Abstract

Developing intelligent agents for long-term cooperation in dynamic open-world scenarios is a major challenge in multi-agent systems. Traditional Multi-agent Reinforcement Learning (MARL) frameworks such as centralized training with decentralized execution (CTDE) struggle with scalability and flexibility. They require centralized long-term planning, which is difficult without custom reward functions, and face challenges in processing multi-modal data. CTDE approaches also assume fixed cooperation strategies, making them impractical in dynamic environments where agents need to adapt and plan independently. To address decentralized multi-agent cooperation, we propose Decentralized Adaptive Knowledge Graph Memory and Structured Communication System (DAMCS) in a novel Multi-agent Crafter environment. Our generative agents, powered by Large Language Models (LLMs), are more scalable than traditional MARL agents by leveraging external knowledge and language for long-term planning and reasoning. Instead of fully sharing information from all past experiences, DAMCS introduces a multi-modal memory system organized as a hierarchical knowledge graph and a structured communication protocol to optimize agent cooperation. This allows agents to reason from past interactions and share relevant information efficiently. Experiments on novel multi-agent open-world tasks show that DAMCS outperforms both MARL and LLM baselines in task efficiency and collaboration. Compared to single-agent scenarios, the two-agent scenario achieves the same goal with 63% fewer steps, and the six-agent scenario with 74% fewer steps, highlighting the importance of adaptive memory and structured communication in achieving long-term goals.

Video Presentation

RL Training

The figures showcase the training performance of reinforcement learning (RL) agents in a complex environment. The first figure presents a single agent trained with Proximal Policy Optimization (PPO), while the second illustrates a multi-agent setup trained with Multi-Agent Deep Deterministic Policy Gradient (MADDPG). In both cases, the agents initially improve their rewards but soon hit a bottleneck, as further progress requires learning advanced skills in a hierarchical order. The RL agents struggle to acquire these skills efficiently, leading to slow and unstable learning in this complex environment.
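
For reference, a single-agent PPO baseline of this kind could be reproduced roughly as sketched below, using the original single-agent Crafter package as a stand-in (MAC's own code is not shown on this page). Depending on the installed gym/gymnasium versions, Stable-Baselines3 may require a compatibility shim; treat this as a sketch, not the paper's exact training configuration.

    import gym
    import crafter  # registers the 'CrafterReward-v1' environment with gym
    from stable_baselines3 import PPO

    env = gym.make("CrafterReward-v1")
    model = PPO("CnnPolicy", env, verbose=1)  # image observations -> CNN policy
    model.learn(total_timesteps=1_000_000)    # reward tends to plateau once
                                              # progress needs hierarchical skills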

[Figures: training reward curves for the single-agent PPO baseline and the multi-agent MADDPG baseline]

Task Completion Time

Two agents with communication complete tasks faster than two agents without communication, which complete tasks at about the same speed as a single agent. The basic agent, which lacks our memory system, is slower than agents equipped with it.

Six agents with communication complete tasks faster than six agents without communication. They are also faster than two agents with communication.

[Figure: task completion steps for single-agent, two-agent, and six-agent settings, with and without communication]

Memory of Each Agent in Gameplay

While each agent independently controls its own behavior and maintains its own memory, the Structured Communication System ensures they remain aware of others’ progress, enabling timely and adaptive cooperation.

Agent 0, responsible for tool crafting, follows a sequential memory structure, reflecting hierarchical goal progression. Agent 1, tasked with assisting Agent 0, develops clustered memories centered on crafting and resource gathering, helping Agent 0 with its needs. Similarly, Agent 2 supports Agent 1, with memory clusters focused on cooperative material collection and crafting tasks. These agents dynamically adjust their strategies based on shared information in a decentralized manner.

Agents 3 and 4, focused on resource sharing, exhibit simpler, less interconnected memory structures since their role is primarily to collect and distribute materials rather than craft tools. Agent 5, which monitors the overall team’s progress, integrates information from all agents and determines when to transition toward diamond collection. The S-CS plays a crucial role in shaping these memory patterns.
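
Continuing the earlier sketches (and reusing their hypothetical KnowledgeGraphMemory and StructuredMessage classes), a received message might reshape an agent's memory roughly as follows. This is an illustrative assumption about the mechanism, not the paper's exact update rule.

    def integrate_message(memory, msg) -> None:
        """Update an agent's knowledge graph from a structured message;
        `memory` and `msg` follow the hypothetical sketches above."""
        if msg.intent == "inform" and msg.goal in memory.index:
            # A teammate reports a completed subgoal: mark it achieved locally,
            # keeping this agent aware of others' progress without full logs.
            memory.mark_achieved(msg.goal)
        elif msg.intent == "request" and msg.goal is not None:
            # A teammate asks for help: attach a helper subgoal under the root,
            # which clusters this agent's memory around assisting tasks.
            if msg.goal not in memory.index:
                memory.add_subgoal(memory.root.name, msg.goal)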

[Figure: knowledge-graph memory of each agent (Agents 0-5) during gameplay]

BibTeX


        @misc{yang2025damcs,
          title={LLM-Powered Decentralized Generative Agents with Adaptive Hierarchical Knowledge Graph for Cooperative Planning}, 
          author={Hanqing Yang and Jingdi Chen and Marie Siew and Tania Lorido-Botran and Carlee Joe-Wong},
          year={2025},
          eprint={2502.05453},
          archivePrefix={arXiv},
          primaryClass={cs.AI},
          url={https://arxiv.org/abs/2502.05453}, 
        }