CUBE is a lightweight, embodied grid world for studying cooperative multi-agent behavior with both RL and LLM agents. It combines primitive block-pushing dynamics with a symbolic action vocabulary and a library of symbolic concepts, enabling interpretable planning, synchronized pushing, and customized feedback at per-agent and team levels.
A single parameter n jointly controls agent count, block weights, and grid size, yielding a transparent difficulty curriculum.
We introduce CUBE, a cooperative block-pushing testbed that blends embodied dynamics with symbolic structure. Primitive pushes are wrapped into interpretable symbolic actions and paired with symbolic concepts for per-agent and team-level feedback. CUBE’s single parameter n scales agents, block weights, and grid size, creating a transparent curriculum from minimal to large-scale coordination. The environment supports both RL and LLM agents and runs efficiently on commodity CPUs.
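To illustrate how one parameter can drive a whole curriculum, here is a minimal sketch. The source states only that n controls agent count, block weights, and grid size; the specific scaling rules below (one agent per unit of n, a linear grid size, weights 1 through n) are assumptions chosen for illustration, and `CurriculumConfig` and `make_config` are hypothetical names, not part of CUBE's API.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CurriculumConfig:
    """Hypothetical container for the settings derived from n."""
    n: int
    num_agents: int
    grid_size: int
    block_weights: Tuple[int, ...]


def make_config(n: int) -> CurriculumConfig:
    """Derive all difficulty settings from a single integer n.

    The formulas here are illustrative assumptions, not CUBE's actual rules.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    num_agents = n                           # assumption: one agent per unit of n
    grid_size = 2 * n + 4                    # assumption: grid side grows linearly
    block_weights = tuple(range(1, n + 1))   # assumption: weights 1..n
    return CurriculumConfig(n, num_agents, grid_size, block_weights)


if __name__ == "__main__":
    for level in (1, 2, 4):
        print(make_config(level))
```

Because every setting is a function of n, two runs at the same n are directly comparable, which is what makes a curriculum of this kind transparent.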
Symbolic actions (MOVETOBLOCK, RENDEZVOUS, PUSH) compile to primitives for synchronized pushing (Table on p.6).
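The idea of compiling a symbolic action into primitives can be sketched as follows. This is a simplified illustration, not CUBE's compiler: the primitive names (`UP`, `DOWN`, `LEFT`, `RIGHT`, `STAY`), the coordinate convention, and both function names are assumptions, and obstacles and collisions are ignored.

```python
from typing import List, Tuple

Pos = Tuple[int, int]


def compile_move_to(start: Pos, goal: Pos) -> List[str]:
    """Expand a move-to-target action into primitive steps (x axis first, then y).

    Assumes an obstacle-free grid where y increases with UP.
    """
    (x, y), (gx, gy) = start, goal
    steps: List[str] = []
    steps += ["RIGHT" if gx > x else "LEFT"] * abs(gx - x)
    steps += ["UP" if gy > y else "DOWN"] * abs(gy - y)
    return steps


def compile_synchronized_push(
    approach_plans: List[List[str]], direction: str, distance: int
) -> List[List[str]]:
    """Pad shorter approach plans with STAY so all agents start pushing together."""
    if not approach_plans:
        return []
    longest = max(len(plan) for plan in approach_plans)
    return [
        plan + ["STAY"] * (longest - len(plan)) + [direction] * distance
        for plan in approach_plans
    ]


if __name__ == "__main__":
    a = compile_move_to((0, 0), (1, 0))
    b = compile_move_to((3, 2), (3, 0))
    for plan in compile_synchronized_push([a, b], "RIGHT", 2):
        print(plan)
```

Padding with a no-op is one simple way to align the first push across agents; a real environment would also need to handle agents blocking one another while they approach.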
A simple heuristic planner reliably completes blocks but can deadlock under congestion; naive zero-shot LLM agents (e.g., gpt-4o, gpt-4o-mini) can produce executable symbolic plans yet become less stable and less efficient as coordination scales (Figures 6–7).
Takeaway: symbolic structure helps LLM agents act, but robust cooperation benefits from synchronized actions and congestion-aware strategies.
@inproceedings{yang2025cube,
title={CUBE: Collaborative Multi-Agent Block-Pushing Environment for LLM Agents},
author={Yang, Hanqing and Nourzad, Narjes and Chen, Shiyu and Joe-Wong, Carlee},
booktitle={NeurIPS 2025 Workshop on Scaling Environments for Agents (SEA)},
note={Project page: https://happyeureka.github.io/cube}
}