CUBE: Collaborative Multi-Agent Block-Pushing Environment for Collective Planning with LLM Agents

Embodied grid world where symbolic plans are grounded in physical interaction, designed to study cooperative intelligence at scale.

Hanqing Yang*1, Narjes Nourzad*†2, Shiyu Chen1, Carlee Joe-Wong1

1Carnegie Mellon University
2University of Southern California
NeurIPS 2025 Workshop: Scaling Environments for Agents (SEA)

*Equal Contribution    †Work done during an internship at Carnegie Mellon University
CUBE Teaser

CUBE is a grid world where teams push weighted blocks into a goal zone while respecting embodied constraints. A single scaling parameter n jointly sets team size, block weights, and grid size, creating a transparent curriculum from small- to large-scale cooperation. Each panel shows a snapshot of a cooperative block-pushing scenario at increasing scales (n from 2 to 256) under a simple always-move-right policy.

What is CUBE

CUBE is a lightweight and scalable multi-agent block-pushing environment for studying cooperation in embodied settings. Agents share a grid world and push weighted blocks toward a goal region while dealing with congestion, collisions, force requirements, and changing geometry.

The environment follows the PettingZoo parallel API and is parameterized by a single integer n, which sets the grid side length, the number of agents, and the distribution of block weights. This creates a transparent curriculum from small to large teams. Details about embodied constraints, symbolic actions, and the dual-layer interface appear in later sections.
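
As a minimal usage sketch of this interface: the module name cube_env and the constructor make_env(n=...) below are hypothetical placeholders for the released package, while the loop itself follows the standard PettingZoo parallel API.

# Minimal usage sketch. `cube_env` and `make_env(n=...)` are hypothetical names
# standing in for the released package; the loop follows the PettingZoo parallel API.
from cube_env import make_env  # hypothetical import

env = make_env(n=8)                      # n sets grid size, team size, and block weights
observations, infos = env.reset(seed=0)

while env.agents:
    # Replace random primitive actions with an LLM planner or RL policy.
    actions = {agent: env.action_space(agent).sample() for agent in env.agents}
    observations, rewards, terminations, truncations, infos = env.step(actions)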

Key Features

  • Scalable and lightweight. Supports hundreds of agents on a single CPU core.
  • Embodied cooperation. Movement, force, block size, agent chains, block chains, congestion, and collisions all determine whether plans succeed.
  • Dual-layer interface. A symbolic and a primitive interface over the same world, so LLM agents, RL agents, and hybrid systems can all use the same environment.
  • Single parameter curriculum. Difficulty grows with n in a predictable way while tasks at the same n remain comparable.

Why CUBE

We want to study cooperative intelligence in LLM agents at scale.

Traditional reinforcement learning benchmarks emphasize low level action spaces and scalar rewards. For LLM agents, emitting long sequences of primitive moves and waiting for a numerical reward is slow and unnatural.

Symbolic planning domains provide clean abstractions and logical structure but often assume deterministic transitions and ignore embodied dynamics.

CUBE sits between these two paradigms. It wraps primitive block pushing in a symbolic vocabulary while grounding every symbolic action in an embodied grid world. This makes it a single testbed where LLM planners, RL policies, and hybrid systems can be studied side by side.

Why Embodied Tasks

Embodied settings introduce structured dependencies. Moving one block can make another reachable, block off corridors, or create new chains that require multiple agents. A strategy that looks good at the symbolic level can fail once the physical layout and timing are taken into account.

This combination of spatial and temporal constraints creates rich cooperation challenges that purely symbolic or abstract domains do not capture. CUBE is therefore a natural platform for studying multi-agent cooperation, communication, task allocation, and emergent group behavior for LLM-based and learned agents alike.

Embodied Constraints in CUBE

  • Discrete grid occupancy. Agents occupy grid cells with at most one entity per cell. Each block has integer weight equal to the force needed to move it and occupies a square whose side length equals that weight.
  • Forces and weights. Each agent exerts one unit of force in its movement direction.
  • Block chains. When blocks sit in front of one another in a push direction, they form a chain. A chain moves as a composite object when the total force applied at the leading face meets or exceeds the sum of the weights of all blocks in the chain and all destination cells are free.
  • Agent chains. When agents line up behind a block and push in the same direction, the effective force equals the number of aligned agents. If this force meets the chain's total weight, the block chain and agents advance together; otherwise nothing moves (a feasibility check is sketched below the figure).
  • Collisions and races. Moves that target occupied cells fail. When several agents aim at the same free cell, a single agent succeeds and others remain in place, creating cell access races that affect how plans unfold.
Embodied constraints in CUBE
Successful chain
Failed chain

Agent chains, block chains, and block-pushing in CUBE.
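
To make the chain rules concrete, the following is a minimal sketch of the push feasibility check implied by the constraints above; the Block dataclass and function names are illustrative assumptions, not CUBE's internal implementation.

# Sketch of the chain-push feasibility rule; names are assumptions for illustration.
from dataclasses import dataclass

@dataclass
class Block:
    weight: int                       # integer weight == force needed == side length
    cells: set[tuple[int, int]]       # grid cells the block occupies

def push_succeeds(chain: list[Block], aligned_agents: int,
                  occupied: set[tuple[int, int]], direction: tuple[int, int]) -> bool:
    """A block chain advances only if the aligned agents supply enough force
    and every destination cell outside the chain itself is free."""
    total_weight = sum(block.weight for block in chain)
    if aligned_agents < total_weight:
        return False                  # not enough force at the leading face
    chain_cells = set().union(*(block.cells for block in chain))
    dx, dy = direction
    destination = {(x + dx, y + dy) for (x, y) in chain_cells}
    # Cells the chain itself vacates are fine; any other occupied cell blocks the move.
    return not ((destination - chain_cells) & occupied)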

Dual-Layer Environment for LLM Agents

Human reasoning blends symbolic planning with embodied feedback. We imagine abstract plans, act, compare outcomes with expectations, and adjust. CUBE lets LLM agents follow the same loop: use symbolic structure to reason about goals and relations, then update plans based on the concrete consequences of their actions as the environment evolves.

The environment exposes two aligned views of the same world: a primitive layer that handles low level grid dynamics, and a symbolic layer that provides compact, expressive actions and concepts for planning.

Action. At the primitive layer, each agent chooses from {STAY, UP, DOWN, LEFT, RIGHT}. These actions determine motion, contact, and force on blocks. At the symbolic layer, agents issue macro actions such as move, move_to_block, rendezvous, push_block, yield_block, idle, and wait_agents. Each symbolic action unfolds into a sequence of primitive moves at runtime.
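
As a toy illustration of how a macro might unroll: unroll_move_to is a hypothetical helper and the coordinate convention is an assumption; the real environment also replans around congestion and collisions.

# Toy sketch of macro unrolling into primitive moves; `unroll_move_to` is hypothetical.
def unroll_move_to(agent_pos: tuple[int, int], target: tuple[int, int]) -> list[str]:
    """Greedy Manhattan unroll of a move / move_to_block macro.
    Assumes RIGHT/LEFT change the x index and DOWN/UP change the y index."""
    (ax, ay), (tx, ty) = agent_pos, target
    moves = []
    moves += ["RIGHT"] * max(0, tx - ax) + ["LEFT"] * max(0, ax - tx)
    moves += ["DOWN"] * max(0, ty - ay) + ["UP"] * max(0, ay - ty)
    return moves

print(unroll_move_to((1, 4), (5, 2)))  # ['RIGHT', 'RIGHT', 'RIGHT', 'RIGHT', 'UP', 'UP']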

Symbolic action table

Symbolic actions available in CUBE.

Symbolic action unroll

How symbolic actions unroll into primitives at runtime.

Observation. The primitive view is a multi channel grid with agent locations, block weights, block ids, and the goal zone. The symbolic view is a structured dictionary that stores grid size, agent positions, block properties such as size, location, and distance to the goal, together with a history of symbolic actions and their status.
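
A rough sketch of what the symbolic observation might look like; the key names are illustrative assumptions rather than the exact schema returned by the environment.

# Illustrative shape of the symbolic observation; keys are assumptions.
symbolic_obs = {
    "grid_size": 20,
    "agents": {"agent_0": (3, 17), "agent_1": (8, 17)},
    "blocks": {
        2: {"size": 3, "weight": 3, "location": (6, 9), "dist_to_goal": 7},
        5: {"size": 1, "weight": 1, "location": (12, 4), "dist_to_goal": 2},
    },
    "goal_zone": {"rows": (0, 1)},
    "history": [
        {"agent": "agent_0", "action": "move_to_block", "block": 2, "status": "done"},
        {"agent": "agent_1", "action": "push_block", "block": 2, "status": "in_progress"},
    ],
}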

Feedback. CUBE provides a scalar reward with step cost and shared delivery reward, plus a library of symbolic concepts for shaping richer feedback. Examples include distance between entities, whether an agent is aligned with a block face, how many agents are aligned, block progress toward the goal, quorum status, and whether a block is currently blocked.
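
As a sketch of how these concepts could be combined into richer shaped feedback, the function below takes concept functions as stand-ins for CUBE's concept library; the weights are arbitrary.

# Sketch of shaping on top of the scalar reward; concept functions and weights are assumptions.
def shaped_feedback(obs, agent_id: str, block_id: int,
                    distance, is_aligned, block_progress) -> float:
    reward = -0.01                                        # step cost
    reward += 0.1 * block_progress(obs, block_id)         # block moved toward the goal
    if is_aligned(obs, agent_id, block_id):               # agent on a usable face
        reward += 0.05
    reward -= 0.001 * distance(obs, agent_id, block_id)   # stay near the target block
    return reward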

Symbolic concept examples

Symbolic concepts for feedback.

Symbolic plan

Symbolic plan.

Plan. Symbolic actions are compact but highly expressive. Their arguments select blocks, faces, directions, and time horizons, which creates a rich planning space for LLM agents. Plans become short programs written in this vocabulary, while the primitive layer ensures that every step is grounded in an uncertain, continuously changing environment.
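
For example, a plan might be the following short program; the dict-based encoding and argument names are illustrative assumptions.

# A plan as a short program in the symbolic vocabulary (encoding is illustrative).
plan = [
    {"action": "move_to_block", "block": 2, "face": "LEFT"},
    {"action": "wait_agents",   "block": 2, "quorum": 3, "horizon": 10},
    {"action": "push_block",    "block": 2, "direction": "RIGHT"},
    {"action": "yield_block",   "block": 2},
    {"action": "idle",          "horizon": 5},
]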

Embodied Cooperation Failure Modes

Embodied movement introduces uncertainty during execution. A plan that is feasible at one step can become infeasible a few steps later as blocks move, faces disappear, agents occlude each other, or key cells are taken by another agent. These effects do not appear in purely symbolic domains and are a central reason why cooperation in CUBE remains challenging.

Plan at step k
Plan at step k plus one

Embodied updates between step k and step k + 1 can remove useful faces and turn a previously feasible plan into an infeasible one.

Occlusion failure

Occlusion by agents. Moving agents can temporarily block the remaining staging cells, preventing others from reaching useful positions.

Corner block failure

Corner block. A block pushed into a corner has no usable faces, so it is not actionable even if agents are available nearby.

Cell access race

Cell access race. Several agents move toward the same key cell. Only one succeeds, leaving others misaligned and wasting cooperative effort.

Controllable Difficulty Curriculum

A single integer n controls the entire task family.

  • Grid side length set by k = max(20, n).
  • n agents placed along the wall opposite the goal region.
  • Blocks with weights from ⌊n/2⌋ + 1 down to one, with lighter blocks appearing more often.

Larger n increases both congestion and the quorum needed to move heavier blocks. Layouts at a fixed n differ but have similar cooperative complexity, which provides a clear curriculum from small groups to large teams.
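
A sketch of this curriculum in code; the exact block-weight frequencies beyond "lighter blocks appearing more often" are an assumption.

# Curriculum parameters derived from n; the weight-frequency rule is illustrative.
def curriculum(n: int) -> dict:
    grid_side = max(20, n)                    # k = max(20, n)
    max_weight = n // 2 + 1                   # weights run from floor(n/2) + 1 down to 1
    # Illustrative frequency rule: weight w appears roughly in proportion to 1 / w.
    weight_counts = {w: max(1, max_weight // w) for w in range(max_weight, 0, -1)}
    return {"grid_side": grid_side, "num_agents": n, "weight_counts": weight_counts}

print(curriculum(8))
# {'grid_side': 20, 'num_agents': 8, 'weight_counts': {5: 1, 4: 1, 3: 1, 2: 2, 1: 5}}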

Difficulty curriculum controlled by n

Grid size, agent count, and block distribution scale with the curriculum parameter n.

Baselines and Task Performance

Heuristic baseline

The heuristic baseline follows a greedy strategy. At each stage it selects the block closest to the goal zone and assigns agents to move it, issuing symbolic instructions such as move_to_block, rendezvous, and push_block until the block is delivered. The baseline produces consistent cooperative behavior without adaptive coordination strategies and serves as a reference point for more advanced agents.
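
A minimal sketch of one stage of this greedy policy, reusing the illustrative symbolic-observation layout from above; the accessors and action encoding are assumptions.

# One stage of the greedy heuristic; observation accessors are illustrative assumptions.
def heuristic_stage(symbolic_obs) -> dict[str, dict]:
    """Send every agent toward the undelivered block closest to the goal zone."""
    remaining = {bid: b for bid, b in symbolic_obs["blocks"].items() if b["dist_to_goal"] > 0}
    if not remaining:
        return {agent: {"action": "idle"} for agent in symbolic_obs["agents"]}
    target = min(remaining, key=lambda bid: remaining[bid]["dist_to_goal"])
    # Once enough agents are aligned on a pushing face, later stages issue
    # rendezvous and push_block until the block is delivered.
    return {agent: {"action": "move_to_block", "block": target}
            for agent in symbolic_obs["agents"]}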

Completed blocks vs number of agents

Number of completed blocks versus agent count n for language agents and the heuristic baseline.

Naive language agents

As language-based baselines, the paper evaluates LLM agents in a zero-shot setting. Each agent repeatedly receives a symbolic observation and generates short plans written in the CUBE action vocabulary, with a prompt that encodes a simple rule: always target the block closest to the goal zone.
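
A sketch of this zero-shot loop; the prompt wording and the generic llm callable are placeholders, not the paper's exact prompt.

# Zero-shot planning call per agent; PROMPT and `llm` are illustrative placeholders.
PROMPT = (
    "You control one agent in CUBE. Always target the block closest to the goal zone. "
    "Reply with a short plan using: move, move_to_block, rendezvous, push_block, "
    "yield_block, idle, wait_agents.\n\nObservation:\n{obs}"
)

def naive_llm_agent(symbolic_obs, llm) -> str:
    """One planning call per agent per replanning step; `llm` is any text-completion callable."""
    return llm(PROMPT.format(obs=symbolic_obs))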

These naive LLM agents can generate executable plans but perform inconsistently, particularly when they must rely on other agents. Smaller models show high variance and longer runtimes, suggesting frequent replanning and difficulty with coordination as n grows.

Average steps vs n by model

Average steps per episode as n increases (capped at 200 steps).

Runtime vs n by model

Average runtime at different difficulty levels for each baseline, highlighting that LLM inference dominates runtime.

Overall, the heuristic baseline consistently completes all blocks at the tested scales, whereas naive LLM agents reveal a cooperation gap. They can express nontrivial symbolic behavior yet fall short of robust cooperative performance as team size grows. This motivates richer designs for embodied LLM agents that combine symbolic world models, communication, and learning.

Scalability and Computational Overhead

Mean time per step vs agents

Environment step time versus agent count.

Process memory usage vs agents

Memory usage versus agent count.

CPU utilization vs agents

CPU utilization versus agent count.

Runtime vs agents by action type

Symbolic action overhead versus agent count.

Action runtime heatmap

Symbolic action overhead.

In contrast to the environment's overhead, LLM inference time is large. Generating even short plans takes hundreds of milliseconds to seconds, which makes environment overhead negligible in studies of embodied language agents.

Challenges and Opportunities

Dynamic scene and task

As agents move blocks they create new chains, tighten corridors, or close off some paths, which can make later deliveries easier or harder. Agents must reason about how current choices change future task difficulty and may need to revise decompositions on the fly.

Spatial reasoning

Successful teams need to understand how blocks interact, which sides can be used for approach, and how block chains create soft dependencies between tasks. Simple distance based rules are often not enough when heavy blocks and narrow passages interact.

Synchronization

Many pushes require several agents to reach specific faces of a block at similar times. Rendezvous and waiting behavior must be coordinated with path planning and congestion, and agents need to recover when timing assumptions fail.

Collective intelligence under uncertainty

Agents rarely know exactly what teammates will do. They must form expectations about others, infer goals from movement, and decide when to wait, yield, or reroute. Misjudgments can temporarily lock tasks, such as when a single agent stands in the only useful staging cell near a block.

Asynchrony

Agents may execute plans with different horizons and therefore fall out of sync. Centralized planners, decentralized policies, and communication protocols all have to cope with this asynchrony and design strategies that are robust to partial or delayed execution.

Research directions

CUBE opens space for methods that blend symbolic world models, model predictive control, multi-agent reinforcement learning, and LLM-based planning. It also supports comparisons between centralized planners and decentralized designs that rely on communication, emergent conventions, or shared tools for cooperation.

Poster

BibTeX

@inproceedings{yangcube,
  title     = {CUBE: Collaborative Multi-Agent Block-Pushing Environment for Collective Planning with LLM Agents},
  author    = {Yang, Hanqing and Nourzad, Narjes and Chen, Shiyu and Joe-Wong, Carlee},
  booktitle = {Workshop on Scaling Environments for Agents},
  year      = {2025}
}

Please also feel free to check out our related work, DR. WELL, where decentralized LLM agents cooperate in CUBE.

@inproceedings{nourzad2025dr,
  title     = {DR. WELL: Dynamic Reasoning and Learning with Symbolic World Model for Embodied LLM-Based Multi-Agent Collaboration},
  author    = {Nourzad, Narjes and Yang, Hanqing and Chen, Shiyu and Joe-Wong, Carlee},
  booktitle = {NeurIPS 2025 Workshop on Bridging Language, Agent, and World Models for Reasoning and Planning},
  year      = {2025}
}