Google advances AI vision with launch of Project Astra

WHY THIS MATTERS IN BRIEF

There is an AI arms race and Google got caught napping, so now they’re trying to research their way out of trouble.

 

Love the Exponential Future? Join our XPotential Community, future proof yourself with courses from XPotential University, read about exponential tech and trends, connect, watch a keynote, or browse my blog.

Google owner Alphabet has unveiled an Artificial Intelligence (AI) agent that can answer real-time queries across video, audio and text, as part of a number of initiatives designed to showcase its prowess in AI and quell criticism that it has fallen behind rivals.

 

RELATED
US military and DARPA team up to develop tech to uncover fake news

 

Chief executive Sundar Pichai demonstrated the Silicon Valley giant’s new “multi-modal” AI assistant, called Project Astra and powered by an upgraded version of its Gemini model, during the company’s annual developer conference on Tuesday. Astra was part of a series of announcements showcasing a new AI-centric vision for Google, and it follows product launches and upgraded AI models from Big Tech rivals including Meta, Microsoft and its partner OpenAI.

In a video demonstration, Google’s prototype AI assistant responded to voice commands based on an analysis of what it sees through a phone camera or a pair of smart glasses. It successfully identified sequences of code, suggested improvements to electrical circuit diagrams, recognised the King’s Cross area of London through the camera lens, and reminded the user where they had left their glasses.

 

See it in action.

 

Google plans to start adding Astra’s capabilities to its Gemini app and across its products this year, Pichai said. However, he said that while the ultimate “goal is to make Astra seamlessly available” across the company’s software, it would be rolled out cautiously and “the path to productisation will be quality driven.”
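Astra itself is not yet publicly available, but the Gemini models that underpin it can already be queried multimodally through Google’s public google-generativeai Python SDK. Below is a minimal sketch of the kind of image-plus-text query shown in the demo; the API key and image file name are placeholders, and this approximates Astra’s behaviour rather than calling Astra itself.

```python
# Minimal sketch: send a camera frame plus a question to Gemini,
# approximating the image-and-voice queries shown in the Astra demo.
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")  # placeholder key

model = genai.GenerativeModel("gemini-1.5-pro")

frame = Image.open("camera_frame.jpg")  # hypothetical example image
response = model.generate_content(
    [frame, "What neighbourhood of London is shown in this picture?"]
)
print(response.text)
```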

 

RELATED
China wants to shape and lead global AI standards

 

“Getting response time down to something conversational is a difficult engineering challenge,” said Sir Demis Hassabis, head of its AI research arm Google DeepMind. “It is amazing to see how far AI has come, especially when it comes to spatial understanding, video processing and memory.”

At the conference, Google also set out big changes to its core search engine. From this week, all US users will see an “AI Overview” – a brief AI-generated summary answer to the query – at the top of many common search results, followed by clickable links interspersed with advertisements lower down. The company said the search system would be able to answer complex questions with multi-step reasoning – meaning the AI agent can make several independent decisions in order to complete a task – and help customers generate search queries using voice and video.

Liz Reid, head of Google search, said the aim was to “remove some of the legwork in search” and that AI Overviews would be expanded to users in other parts of the world later this year.

The changes come as OpenAI threatens Google’s search business. The San Francisco-based start-up’s ChatGPT chatbot provides quick and complete answers to many questions, threatening to render obsolete search results that provide a traditional list of links alongside advertising. OpenAI has also signed deals with media organisations to include up-to-date information in its responses.

 

RELATED
Google has taught its DeepMind AI to dream

On Monday – in a move seen as an attempt to upstage Google’s announcements – OpenAI demonstrated a faster and cheaper version of the model that powers ChatGPT, which can similarly interpret voice, video, images and code in a single interface.

Google also revealed new or improved AI products including Veo, which generates video from text prompts; Imagen 3, which creates pictures; and Lyria, a model for AI music generation. Subscribers to Gemini Advanced will be able to create personalised chatbots called “Gems” to help with specific tasks. The company’s flagship Gemini 1.5 Pro model has also been upgraded. It now has a much larger context window of 2 million tokens – the amount of data, such as code or images, that it can draw on when generating a response – making it better at following nuanced instructions and referring back to earlier conversations.
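To put that window in perspective, the same public SDK exposes a count_tokens call that reports a prompt’s token cost before anything is sent. A minimal sketch, assuming the google-generativeai package as above; the input file name and API key are hypothetical placeholders:

```python
# Minimal sketch: check how much of Gemini 1.5 Pro's enlarged
# context window a large prompt would consume before sending it.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder key
model = genai.GenerativeModel("gemini-1.5-pro")

with open("large_codebase_dump.txt") as f:  # hypothetical large input
    prompt = f.read()

# count_tokens reports the token cost without making a generation call;
# the upgraded model accepts prompts of up to roughly 2 million tokens.
print(model.count_tokens(prompt).total_tokens)
```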
