Demo: Control the ARmada: LMM Coordination for Multi-Robot AR-HRC
Augmented reality (AR) has shown strong potential to enable more flexible interaction in human-robot collaboration (HRC) by conveying a robot’s state and intent [1, 3] and enabling intuitive control [1]. However, while many systems demonstrate AR’s ability to enhance robot control, they typically support control of only a single robotic collaborator, with interaction techniques that require direct manipulation of virtual content; in scenarios with many robots, these manipulations multiply, lowering task efficiency through repeated manual intervention. Moreover, increased focus on virtual content can reduce awareness of the real environment [2], posing safety risks when that environment contains hazards. Motivated by these challenges, we present ARmada, an AR-HRC system that leverages edge scene understanding and an LMM to enhance environmental awareness and enable AR control in multi-robot settings.

System Design: ARmada has four primary components: a Meta Quest 3 AR headset, a Unitree Go2 quadruped robot, an edge server, and a cloud LMM. Due to space constraints, the additional robot collaborators are six virtual drones; however, the system can include multiple physical robots of diverse form factors. By processing depth and image data from the Quest on the edge server and the LMM, ARmada detects and virtually marks key landmarks, such as environmental hazards, improving awareness for both the user and the robots. Following detection, the user can issue concise, high-level commands relative to landmarks in the environment. Using LMM-based robot control, the system can then autonomously direct robots to individual destinations or coordinate them into complex formations, such as geometric shapes, without further user input.
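To illustrate the LMM-based coordination step, the sketch below shows one way a concise user command and the detected landmarks could be packaged into a prompt and the model's reply parsed into per-robot goals. The function names (call_lmm, plan_robot_goals), the JSON schema, and the landmark format are illustrative assumptions, not the actual ARmada interface.

```python
import json

# Hypothetical landmark list produced by edge scene understanding
# (names, types, and positions in meters are illustrative only).
LANDMARKS = [
    {"name": "spill_hazard", "type": "hazard", "position": [2.1, 0.0, -1.3]},
    {"name": "loading_dock", "type": "goal", "position": [5.0, 0.0, 2.4]},
]

ROBOTS = ["go2", "drone_1", "drone_2", "drone_3", "drone_4", "drone_5", "drone_6"]


def call_lmm(prompt: str) -> str:
    """Placeholder for the cloud LMM call.

    A real implementation would send `prompt` to the hosted model and return
    its text reply; here a canned response is returned so the sketch runs.
    """
    return json.dumps({
        "goals": [{"robot": r, "target": [5.0, 0.0, 2.4]} for r in ROBOTS]
    })


def plan_robot_goals(command: str) -> dict:
    """Turn a high-level user command into per-robot goal positions via the LMM."""
    prompt = (
        "You coordinate a multi-robot team. Landmarks (meters, headset frame):\n"
        f"{json.dumps(LANDMARKS)}\n"
        f"Robots: {ROBOTS}\n"
        f"User command: {command}\n"
        'Reply with JSON: {"goals": [{"robot": str, "target": [x, y, z]}]}'
    )
    return json.loads(call_lmm(prompt))


if __name__ == "__main__":
    plan = plan_robot_goals("Everyone gather at the loading dock, avoid the spill.")
    for goal in plan["goals"]:
        print(goal["robot"], "->", goal["target"])
```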
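For formations such as geometric shapes, per-robot targets could also be computed locally once a shape and a center landmark are chosen; the sketch below spaces the six drones evenly on a circle around a landmark. The radius, altitude offset, coordinate frame, and function name are assumptions for illustration, not the system's actual formation planner.

```python
import math
from typing import List, Tuple


def circle_formation(center: Tuple[float, float, float],
                     n_robots: int,
                     radius: float = 1.5,
                     altitude: float = 1.0) -> List[Tuple[float, float, float]]:
    """Evenly space n_robots on a circle of `radius` around `center`.

    Coordinates are (x, y, z); `altitude` lifts the drones above the center
    point. Purely illustrative of the kind of formation geometry the system
    could generate for a "geometric shape" command.
    """
    cx, cy, cz = center
    goals = []
    for i in range(n_robots):
        theta = 2 * math.pi * i / n_robots
        goals.append((cx + radius * math.cos(theta),
                      cy + altitude,
                      cz + radius * math.sin(theta)))
    return goals


if __name__ == "__main__":
    for pos in circle_formation((2.1, 0.0, -1.3), n_robots=6):
        print([round(c, 2) for c in pos])
```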