Revolutionizing Robotics: How Google DeepMind's Chatbot-Powered Robot is Leading the Charge

Tech & AI | July 12, 2024, 12:43 a.m.

A tall and slender wheeled robot in a cluttered open-plan office in Mountain View, California has been upgraded with Google DeepMind's Gemini large language model, allowing it to act as a tour guide and office helper. The robot can understand commands, navigate the office, and perform tasks like finding a writing space. Gemini's ability to process video and text, as well as its knowledge of the office environment from recorded video tours, enables the robot to make logical decisions and respond to commands using commonsense reasoning. Demis Hassabis, CEO of Google DeepMind, believes Gemini's multimodal capabilities will revolutionize robot abilities. The success of DeepMind's system in navigating the office and interacting with humans demonstrates the power of large language models in physical applications. Research labs and startups are exploring ways to combine AI language models with robotics for enhanced problem-solving abilities and interactions in the real world.