Google's Gemini AI technology is enhancing the intelligence of its robots
Tech & AI | July 11, 2024, 10:43 a.m.
Google is training its robots with Gemini AI, specifically Gemini 1.5 Pro, to improve how they navigate and complete tasks. In a recent research paper, the DeepMind robotics team described how Gemini 1.5 Pro's long context window helps its RT-2 robots better understand natural language instructions.
Researchers first record a video tour of a designated area, and the robot learns about its environment by "watching" that video. The robot can then execute commands based on verbal or image inputs, such as guiding a user to a power outlet when asked where to charge a phone. DeepMind reported a 90 percent success rate across more than 50 user instructions in an area of more than 9,000 square feet.
Researchers also observed that Gemini 1.5 Pro lets the robots plan tasks that go beyond navigation, such as fetching a preferred beverage from the fridge. DeepMind plans to explore these promising results further.