Embodied Intelligent Mobile Operation Robots For Industrial Scenarios
DOI:
https://doi.org/10.70695/IAAI202504A13
Keywords:
Embodied Intelligence; Mobile Manipulation Robot; Large Industrial Model; Multimodal Perception; Command Fine-tuning; Generalization Ability
Abstract
In industrial scenarios with multiple workstations and processes, traditional rule-based control and small-scale policy networks tend to degrade in performance as the task set expands and the equipment varies. To address this, this study develops an embodied intelligent mobile manipulation platform that unifies the observation-action interface of the mobile chassis, robotic arm, and multimodal sensors. An industrial large model is also designed that integrates vision, point clouds, force sensing, and language commands, and is trained through pre-training followed by instruction fine-tuning. Experiments based on a real-world workshop task library show that the model's accuracy on multi-task operations in the source domain is close to that of manual operation. The model also adapts well to unseen task compositions and workstation layouts in zero-shot and few-shot settings. Ablation and engineering case studies further validate the benefits of multimodal fusion, hierarchical motion, and improved generalization in terms of cycle time, human labor time, and safety, demonstrating replicable engineering application value.