NVIDIA Senior Research Manager & Lead of Embodied AI (GEAR Group). Stanford Ph.D. Building Humanoid robot and gaming foundation models. OpenAI's first intern. Sharing insights on the bleeding edge of AI.
Going to SIGGRAPH next week! An emerging application of graphics is simulation: creating many worlds to train embodied AI, virtual or physical. The driving principle is very simple: if an AI agent can master 10,000 simulated realities with all kinds of variations, then it may very well generalize zero-shot to our real physical world, which is simply the 10,001st reality.

Our team’s research agenda operates on this first principle. In Mar. 2024, NVIDIA CEO Jensen unveiled Project GR00T, our moonshot initiative to create a general-purpose AI brain for humanoid robots. GR00T scales up training massively in simulation. It will enable a humanoid robot to understand multimodal instructions, such as language, video, and demonstration, and perform a variety of useful tasks.

GR00T is being baked on NVIDIA’s deep technology stack - what we call the “Three computer problem”:

1. DGX Systems for multimodal foundation model training. These systems are the powerhouse for handling vast amounts of visual, textual, and action data. We tokenize them all into sequences of integers, and DGX helps GR00T internalize these sequences on an enormous scale, powered by the Transformer architecture.

2. OVX Systems for simulation. NVIDIA’s simulation platforms like Omniverse and Isaac Sim will be able to generate an infinite supply of high-quality tokens for GR00T to learn from. In simulation, we train our humanoid robots to see, act, and react, in what we call a “sensorimotor loop”. We are also able to ensure a safe virtual training environment without any physical risks.

3. AGX Systems for hardware-in-the-loop validation. AGX provides the edge computing power, or “mobile brain”, to process a continuous stream of sensor input in real time.

Next week at SIGGRAPH, I’ll be presenting Project GR00T and embodied AI on a panel: https://lnkd.in/gnyyMacf

Additionally, Jensen will have livestreamed keynote discussions where you might see some exciting demos of GR00T’s tech stack at 2:30 p.m. MT.
He’ll also be chatting with Meta CEO Mark Zuckerberg afterwards at 4 p.m. MT. Tune in: https://lnkd.in/gx7jW5sn
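
For the curious: point 1 above mentions tokenizing action data into sequences of integers. Here is a minimal sketch of one common way to do that, uniform binning of continuous values. This is an illustration only, not GR00T's actual tokenizer; the function names, the [-1, 1] action range, and the 256-bin vocabulary are all assumptions made for the example.

```python
# Illustrative sketch only: uniform binning of continuous action values.
# The names, the [-1, 1] range, and num_bins=256 are assumptions for this
# example, not details of GR00T.

def tokenize_actions(actions, low=-1.0, high=1.0, num_bins=256):
    """Map each continuous value to an integer token in [0, num_bins - 1]."""
    tokens = []
    for value in actions:
        clipped = min(max(value, low), high)        # clamp into [low, high]
        scaled = (clipped - low) / (high - low)     # rescale to [0, 1]
        # scaled == 1.0 would land in bin num_bins, so cap at the last bin.
        tokens.append(min(int(scaled * num_bins), num_bins - 1))
    return tokens


def detokenize_actions(tokens, low=-1.0, high=1.0, num_bins=256):
    """Map integer tokens back to the center of their bin."""
    return [low + (t + 0.5) / num_bins * (high - low) for t in tokens]


if __name__ == "__main__":
    joint_targets = [-1.0, -0.25, 0.0, 0.6, 1.0]    # made-up example values
    tokens = tokenize_actions(joint_targets)
    print(tokens)
    print(detokenize_actions(tokens))
```

The round trip is lossy: each recovered value sits at the center of its bin, so with these settings the error is at most half a bin width, (high - low) / (2 * num_bins). More bins mean finer control at the cost of a larger vocabulary.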