CUBE is a lightweight, portable, and scalable multi-agent environment that unifies
symbolic reasoning with embodied interaction. It exposes both primitive grid actions
(UP, DOWN, LEFT, RIGHT, STAY) and a symbolic action library (e.g., MOVETOBLOCK,
RENDEZVOUS, PUSH, WAIT), enabling LLM-based planners, RL agents, and
hybrid methods to operate within the same task.
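To make the two action levels concrete, here is a minimal sketch of how a symbolic action might expand into primitive grid moves. The enum values, the `move_to_block` helper, and the grid convention (y increases downward) are illustrative assumptions, not CUBE's actual API; the symbolic action names match those listed above.

```python
from enum import Enum

# Primitive grid actions named in the text. The (dx, dy) values assume
# a grid where y increases downward -- an illustrative convention only.
class Primitive(Enum):
    UP = (0, -1)
    DOWN = (0, 1)
    LEFT = (-1, 0)
    RIGHT = (1, 0)
    STAY = (0, 0)

def move_to_block(agent_pos, block_pos):
    """Hypothetical expansion of the symbolic MOVETOBLOCK action into a
    greedy sequence of primitive moves (obstacles ignored for brevity)."""
    steps = []
    ax, ay = agent_pos
    bx, by = block_pos
    while (ax, ay) != (bx, by):
        if ax < bx:
            steps.append(Primitive.RIGHT); ax += 1
        elif ax > bx:
            steps.append(Primitive.LEFT); ax -= 1
        elif ay < by:
            steps.append(Primitive.DOWN); ay += 1
        else:
            steps.append(Primitive.UP); ay -= 1
    return steps

print([p.name for p in move_to_block((0, 0), (2, 1))])
# → ['RIGHT', 'RIGHT', 'DOWN']
```

An LLM planner can emit the symbolic action while an RL agent consumes the primitive sequence, which is how both can operate within the same task.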
A single parameter, n, determines agent count, grid size, and block distribution,
yielding comparable cooperative difficulty for the same n and progressively harder coordination as n increases.
This creates a transparent, scalable curriculum for evaluating multi-agent cooperation.
We introduce CUBE (Collaborative Multi-Agent Block-Pushing Environment), a lightweight yet expressive testbed for studying embodied cooperation in multi-agent systems. While traditional reinforcement learning benchmarks emphasize low-level action spaces and scalar rewards, and symbolic planning domains emphasize logical reasoning under deterministic transitions, neither alone captures the combination of embodiment, uncertainty, and symbolic structure needed to evaluate emerging embodied LLM-based agents. CUBE addresses this gap by wrapping primitive block-pushing actions into a symbolic action vocabulary, enabling interpretable and compositional cooperation strategies. It also provides a library of symbolic concepts for customized feedback at both per-agent and collective levels. These features allow the same environment to support reinforcement learning agents, LLM-based agents, and hybrid architectures. To enable fair comparison across experiments, a single parameter n specifies the number of agents, the grid size, and the block weights, creating a transparent curriculum that scales difficulty and cooperation demands. CUBE thus offers a flexible platform for scalable evaluation of algorithms that integrate symbolic reasoning with embodied multi-agent interaction.
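The single-parameter curriculum can be sketched as a function from n to a scenario configuration. The exact scaling rules below (linear grid growth, block weights up to n) are assumptions for illustration; CUBE's real mapping may differ.

```python
def scenario_from_n(n):
    """Hypothetical instantiation of the difficulty parameter n.
    All scaling rules here are illustrative assumptions."""
    return {
        "num_agents": n,                          # one agent per unit of n
        "grid_size": (4 * n, 4 * n),              # assumed linear grid scaling
        "block_weights": list(range(1, n + 1)),   # heaviest block needs n pushers
    }

print(scenario_from_n(3))
# → {'num_agents': 3, 'grid_size': (12, 12), 'block_weights': [1, 2, 3]}
```

Because every quantity derives from the same n, two runs at equal n face comparable cooperative difficulty, and increasing n tightens coordination demands without hand-tuning.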
In CUBE, blocks can form chains—composite structures that require coordinated force from multiple agents to move. A pushing line forms when agents align along a block’s face, each contributing unit force in the push direction. Motion succeeds only if the total applied force meets or exceeds the block’s weight and the frontmost destination cell is free. The first row of figures illustrates these geometric outcomes: when agents align, the chain advances; when misaligned or obstructed, it fails.
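The push-resolution rule above reduces to a simple predicate: each aligned agent contributes unit force, and the chain advances only if total force meets or exceeds the block's weight and the frontmost destination cell is free. A minimal sketch (function name and signature are illustrative, not CUBE's API):

```python
def push_succeeds(num_aligned_agents, block_weight, destination_free):
    """Sketch of the push-resolution rule: unit force per aligned agent,
    motion iff total force >= block weight and the frontmost destination
    cell is unoccupied."""
    total_force = num_aligned_agents  # one unit of force per agent
    return total_force >= block_weight and destination_free

print(push_succeeds(3, 3, True))    # enough force, clear cell: chain advances
print(push_succeeds(2, 3, True))    # insufficient force: chain stalls
print(push_succeeds(3, 3, False))   # blocked destination: geometric failure
# → True, False, False
```

The two failure branches correspond directly to the geometric failure cases illustrated in the first row of figures.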
The second row shows failure cases due to agent and environment dynamics. Even when geometry permits motion, local heuristics or asynchronous policies may cause deadlock, blocking, or oscillation. Agents can obstruct one another’s approach or become trapped by congestion, revealing how spatial and temporal dependencies shape cooperative success and failure in embodied settings.
Successful chain: Aligned agents form a stable pushing line, transferring sufficient force to move the composite block.
Geometric failure: Misalignment or blocked destination cell prevents the chain from advancing.
Failure I: Agents cannot stage on the target face due to congestion—no free cells for coordinated pushing.
Failure II: Agents block one another or oscillate, preventing stable cooperation despite a feasible plan.
LLM agents successfully push progressively heavier blocks as n increases; heuristic baselines perform well on small layouts but struggle as coordination demands grow.
Average steps vs. n (cap 200).
Runtime vs. n by model.
Mean time per step vs. number of agents.
Memory usage vs. number of agents.
CPU utilization vs. number of agents.
Mean runtime per action vs. number of agents.
Heatmap of action runtimes (log-scale).
@inproceedings{yang2025cube,
title = {CUBE: Collaborative Multi-Agent Block-Pushing Environment for Collective Planning with LLM Agents},
author = {Hanqing Yang and Narjes Nourzad and Shiyu Chen and Carlee Joe-Wong},
booktitle = {NeurIPS 2025 Workshop on Scaling Environments for Agents (SEA)},
year = {2025},
url = {https://happyeureka.github.io/cube}
}