At the SIGGRAPH 2024 conference, NVIDIA unveiled a suite of tools and services designed to streamline and expedite the process of robot training for developers. This advancement has the potential to significantly reduce development time, facilitating a rapid transition to real-world robot deployment and testing.
A demonstration video showcased a robot being controlled using Apple Vision Pro. The user could see what the robot perceived but lacked any tactile sensation of the objects it grasped. Even without that feedback, the user was able to guide the robot’s arms and hands to pick up various items.
While users could likely estimate the necessary force to exert on the robot’s hand, a significant margin of error existed due to the absence of real-world tactile feedback. It seems that NVIDIA’s tools are poised to bridge this gap by estimating the appropriate force to apply to objects based on their type.
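One way to picture such a gap-bridging heuristic is a lookup from object category to a conservative grip force. The sketch below is purely illustrative: the categories, force values, and function names are assumptions for the sake of the example, not part of NVIDIA's tooling.

```python
# Hypothetical sketch: choosing a grip force from an object's type,
# compensating for the teleoperator's missing tactile feedback.
# All categories and values below are illustrative assumptions.

GRIP_FORCE_NEWTONS = {
    "rigid": 15.0,      # e.g., a metal tool
    "deformable": 8.0,  # e.g., a sponge
    "fragile": 4.0,     # e.g., an egg
}

def estimate_grip_force(object_type: str, safety_margin: float = 0.9) -> float:
    """Return a conservative grip force for a known object category."""
    base = GRIP_FORCE_NEWTONS.get(object_type)
    if base is None:
        raise ValueError(f"unknown object type: {object_type!r}")
    return base * safety_margin

print(estimate_grip_force("fragile"))  # prints 3.6
```

A real system would presumably infer the category from perception and refine the force from sensor feedback; the table stands in for that estimation step.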
Leveraging these tools and services, NVIDIA trains artificial intelligence models for the robot on data captured through the headset. These demonstrations are then multiplied into a large set of synthetic scenarios through a service called MimicGen NIM. A second service, Robocasa, generates the simulated tasks and scenarios the robot practices in. Through this simulated experience, the robot is trained to master tasks using reinforcement learning.
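To make the final stage concrete, here is a toy reinforcement-learning loop: a tabular Q-learning agent learns to traverse a five-state corridor to a goal. It is a stand-in for the far richer simulated tasks described above; every name and number in it is illustrative, not part of NVIDIA's services.

```python
import random

# Toy sketch of training by reinforcement learning: the agent repeatedly
# acts in a simulated task, receives rewards, and updates its value
# estimates until a good policy emerges.

N_STATES, GOAL = 5, 4
ACTIONS = (-1, +1)  # step left, step right

def step(state: int, action: int):
    """Advance the environment; reward 1.0 only on reaching the goal."""
    nxt = max(0, min(N_STATES - 1, state + action))
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward, nxt == GOAL

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    random.seed(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # Q[state][action index]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # Epsilon-greedy: mostly exploit, occasionally explore.
            a = random.randrange(2) if random.random() < epsilon \
                else max(range(2), key=lambda i: q[s][i])
            nxt, r, done = step(s, ACTIONS[a])
            # Standard Q-learning update toward the bootstrapped target.
            q[s][a] += alpha * (r + gamma * max(q[nxt]) - q[s][a])
            s = nxt
    return q

q = train()
policy = ["left" if qs[0] > qs[1] else "right" for qs in q]
print(policy[:4])  # greedy action in each non-goal state
```

After training, the greedy policy heads right from every state, i.e., the agent has mastered the task from simulated experience alone, which is the same principle applied to manipulation tasks at vastly larger scale.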
These simulations are created using Isaac Sim, which adheres to the laws of physics. Consequently, the robot’s responses and interactions with the environment within the simulation closely mirror what could occur in the real physical world.
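A minimal illustration of why physics fidelity matters: objects in the virtual scene should obey the same equations of motion as real ones. The toy integrator below drops a body from one meter and recovers the free-fall time predicted by t = sqrt(2h/g); Isaac Sim's solver is far more sophisticated, but the principle is the same.

```python
# Toy physics step: semi-implicit Euler integration of free fall.
# Isaac Sim handles rigid bodies, contacts, and friction; this sketch
# only shows that stepping the equations of motion reproduces reality.

G = 9.81   # gravitational acceleration, m/s^2
DT = 1e-4  # integration time step, s

def time_to_fall(height: float) -> float:
    """Integrate free fall from `height` meters; return elapsed time."""
    y, v, t = height, 0.0, 0.0
    while y > 0.0:
        v -= G * DT  # update velocity from gravity
        y += v * DT  # update position from velocity
        t += DT
    return t

print(round(time_to_fall(1.0), 3))  # analytic answer: sqrt(2/9.81) ~ 0.452 s
```

Because the simulated fall time matches the analytic result, behavior learned against the simulator transfers more reliably to the physical world.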
Once the training process is completed in the simulation, the robot systems are deployed on the physical robot for real-world testing.
In essence, NVIDIA streamlines the path from teleoperating a robot via Apple Vision Pro to building a model that enables the robot to master the required task on its own.
This capability is not exclusive to the Vision Pro: other headsets can be used with appropriate configuration. The robot can also be trained on data from other sources, such as video or real-world robot experience.