Google Reveals Gemini 2, AI Agents, and a Prototype Personal Assistant

December 11, 2024

“Mariner is our exploration, very much a research prototype at the moment, of how one reimagines the user interface with AI,” Hassabis says.

Google launched Gemini in December 2023 as part of an effort to catch up with OpenAI, the startup behind the wildly popular chatbot ChatGPT. Despite having invested heavily in AI and contributing key research breakthroughs, Google saw OpenAI lauded as the new leader in AI and its chatbot even touted as perhaps a better way to search the web. With its Gemini models, Google now offers a chatbot as capable as ChatGPT. It has also added generative AI to search and other products.

When Hassabis first revealed Gemini in December 2023, he told WIRED that the way it had been trained to understand audio and video would eventually prove transformative.

Google today also offered a glimpse of how this might transpire with a new version of an experimental project called Astra. This allows Gemini 2 to make sense of its surroundings, as viewed through a smartphone camera or another device, and converse naturally in a humanlike voice about what it sees.

WIRED tested Gemini 2 at Google DeepMind’s offices and found it to be an impressive new kind of personal assistant. In a room decorated to look like a bar, Gemini 2 quickly assessed several wine bottles in view, providing geographical information, details of taste characteristics, and pricing sourced from the web.

“One of the things I want Astra to do is be the ultimate recommendation system,” Hassabis says. “It could be very exciting. There might be connections between books you like to read and food you like to eat. There probably are and we just haven’t discovered them.”

Through Astra, Gemini 2 can search the web for information relevant to a user’s surroundings and use Google Lens and Maps. It can also remember what it has seen and heard (although Google says users would be able to delete that data), which gives it the ability to learn a user’s tastes and interests.

In a mocked-up gallery, Gemini 2 offered a wealth of historical information about paintings on the walls. The model rapidly read from several books as WIRED flicked through pages, instantly translating poetry from Spanish to English and describing recurrent themes.

“There are obvious business model opportunities, for advertising or recommendations,” Hassabis says when asked if companies might be able to pay to have their products highlighted by Astra.

Though the demos were carefully curated, and Gemini 2 will inevitably make errors in real use, the model resisted efforts to trip it up reasonably well. It adapted to interruptions and to sudden changes in the phone’s view, improvising much as a person might.

At one point, your correspondent showed Gemini 2 an iPhone and said that it was stolen. Gemini 2 said that it was wrong to steal and the phone should be returned. When pushed, however, it granted that it would be okay to use the device to make an emergency phone call.

Hassabis acknowledges that bringing AI into the physical world could result in unexpected behaviors. “I think we need to learn about how people are going to use these systems,” he says. “What they find it useful for; but also the privacy and security side, we have to think about that very seriously up front.”
